CUDA techniques related to models
CUDA lazy loading
I ran a quick test:
export CUDA_MODULE_LOADING=LAZY
root@a484578ca5f8:/workspace# nvidia-smi
Thu Dec 22 12:29:51 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 520.56.06 Driver Version: 520.56.06 CUDA Version: 11.8 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA RTX A4000 Off | 00000000:01:00.0 Off | Off |
| 41% 34C P8 16W / 140W | 469MiB / 16376MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
GPU memory usage dropped from 670MB to 469MB, and after a few requests it stabilized at 471MB.
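To repeat this comparison from a script rather than by reading nvidia-smi by eye, something like the following could work. It is only a sketch: the helper names (enable_lazy_loading, parse_used_mib, gpu_memory_used_mib) are made up for illustration, and it assumes nvidia-smi is on the PATH. The environment variable has to be set before the process initializes CUDA, otherwise it has no effect.

```python
import os
import subprocess


def enable_lazy_loading(env=None):
    """Set CUDA_MODULE_LOADING=LAZY in the given environment mapping.

    Must be called before anything in the process initializes CUDA.
    """
    env = os.environ if env is None else env
    env["CUDA_MODULE_LOADING"] = "LAZY"
    return env


def parse_used_mib(query_output):
    """Parse nvidia-smi CSV output (one integer per GPU, in MiB)."""
    return [int(line.strip()) for line in query_output.splitlines() if line.strip()]


def gpu_memory_used_mib():
    """Return the used memory of each GPU in MiB, as reported by nvidia-smi."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.used", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_used_mib(result.stdout)
```

Calling gpu_memory_used_mib() once with the variable unset and once with it set to LAZY, after the model has loaded and served a few requests, gives the two numbers to compare. Note that this reports memory used on the whole device, so other processes on the same GPU will skew the result.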
Notes
- Lazy Loading is a CUDA Runtime and CUDA Driver feature.
- Lazy Loading was introduced in CUDA 11.7, and received a significant upgrade in CUDA 11.8.
- As CUDA Runtime is usually linked statically into programs and libraries, this means that you have to recompile your program with CUDA 11.7+ toolkit and use CUDA 11.7+ libraries.
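The version requirement above can be checked with a small comparison. This is a minimal sketch with hypothetical helper names; where the version string comes from is up to you. The "CUDA Version" in the nvidia-smi header describes the driver side, while the runtime version is whatever toolkit the program or framework was built with (in PyTorch, for instance, torch.version.cuda), and both need to be 11.7 or newer.

```python
def parse_cuda_version(version):
    """Turn a version string such as "11.8" or "11.7.1" into (major, minor)."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def supports_lazy_loading(version):
    """True if the given CUDA version is 11.7 or newer."""
    return parse_cuda_version(version) >= (11, 7)


# Example: the driver in the nvidia-smi output above reports CUDA 11.8.
driver_ok = supports_lazy_loading("11.8")
```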