CUDA toolkit install
Manjaro/Arch
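For the newest version, install the cuda package with pamac (on this machine it comes from the community repository rather than the AUR):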
pamac blabla
To install an older version, use NVIDIA's .run installer:
(textgen) [chris@bender-22 GPTQ-for-LLaMa]$ wget https://developer.download.nvidia.com/compute/cuda/11.7.0/local_installers/cuda_11.7.0_515.43.04_linux.run
--2023-05-11 15:42:38--  https://developer.download.nvidia.com/compute/cuda/11.7.0/local_installers/cuda_11.7.0_515.43.04_linux.run
Loaded CA certificate '/etc/ssl/certs/ca-certificates.crt'
Resolving developer.download.nvidia.com (developer.download.nvidia.com)... 152.199.20.126
Connecting to developer.download.nvidia.com (developer.download.nvidia.com)|152.199.20.126|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 3490450898 (3,2G) [application/octet-stream]
Saving to: ‘cuda_11.7.0_515.43.04_linux.run’
cuda_11.7.0_515.43.04_l 100%[=============================>]   3,25G  37,5MB/s    in 90s
2023-05-11 15:44:08 (37,2 MB/s) - ‘cuda_11.7.0_515.43.04_linux.run’ saved [3490450898/3490450898]
(textgen) [chris@bender-22 GPTQ-for-LLaMa]$ ls -l
total 3408820
-rw-r--r-- 1 chris chris       1110 11. Mai 15:34 convert_llama_weights_to_hf.py
-rw-r--r-- 1 chris chris 3490450898  4. Mai  2022 cuda_11.7.0_515.43.04_linux.run
-rw-r--r-- 1 chris chris       6280 11. Mai 15:34 datautils.py
-rw-r--r-- 1 chris chris       5450 11. Mai 15:34 gptq.py
-rw-r--r-- 1 chris chris      13368 11. Mai 15:34 llama_inference_offload.py
-rw-r--r-- 1 chris chris       3784 11. Mai 15:34 llama_inference.py
-rw-r--r-- 1 chris chris      16519 11. Mai 15:34 llama.py
-rw-r--r-- 1 chris chris        397 11. Mai 15:34 modelutils.py
-rw-r--r-- 1 chris chris      18554 11. Mai 15:34 opt.py
-rw-r--r-- 1 chris chris       4142 11. Mai 15:34 quant_cuda.cpp
drwxr-xr-x 2 chris chris       4096 11. Mai 15:35 quant_cuda.egg-info
-rw-r--r-- 1 chris chris      27359 11. Mai 15:34 quant_cuda_kernel.cu
-rw-r--r-- 1 chris chris      19738 11. Mai 15:34 quant.py
-rw-r--r-- 1 chris chris       8947 11. Mai 15:34 README.md
-rw-r--r-- 1 chris chris         90 11. Mai 15:34 requirements.txt
-rw-r--r-- 1 chris chris        287 11. Mai 15:34 setup_cuda.py
-rw-r--r-- 1 chris chris       5282 11. Mai 15:34 test_kernel.py
(textgen) [chris@bender-22 GPTQ-for-LLaMa]$ chmod +x cuda_11.7.0_515.43.04_linux.run
(textgen) [chris@bender-22 GPTQ-for-LLaMa]$ ./cuda_11.7.0_515.43.04_linux.run
Failed to verify gcc version. See log at /tmp/cuda-installer.log for details.
(textgen) [chris@bender-22 GPTQ-for-LLaMa]$ cat /tmp/cuda-installer.log
[INFO]: Checking compiler version...
[INFO]: gcc location: /usr/bin/gcc
[INFO]: gcc version: gcc version 12.2.1 20230201 (GCC)
[ERROR]: unsupported compiler version: 12.2.1. Use --override to override this check.
(textgen) [chris@bender-22 GPTQ-for-LLaMa]$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-pc-linux-gnu/12.2.1/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /build/gcc/src/gcc/configure --enable-languages=c,c++,ada,fortran,go,lto,objc,obj-c++,d --enable-bootstrap --prefix=/usr --libdir=/usr/lib --libexecdir=/usr/lib --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=https://bugs.archlinux.org/ --with-build-config=bootstrap-lto --with-linker-hash-style=gnu --with-system-zlib --enable-__cxa_atexit --enable-cet=auto --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-default-ssp --enable-gnu-indirect-function --enable-gnu-unique-object --enable-libstdcxx-backtrace --enable-link-serialization=1 --enable-linker-build-id --enable-lto --enable-multilib --enable-plugin --enable-shared --enable-threads=posix --disable-libssp --disable-libstdcxx-pch --disable-werror
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 12.2.1 20230201 (GCC)
(textgen) [chris@bender-22 GPTQ-for-LLaMa]$ ./cuda_11.7.0_515.43.04_linux.run --override
Signal caught, cleaning up
(textgen) [chris@bender-22 GPTQ-for-LLaMa]$ ./cuda_11.7.0_515.43.04_linux.run --override
Installation failed. See log at /tmp/cuda-installer.log for details.
(textgen) [chris@bender-22 GPTQ-for-LLaMa]$ sudo ./cuda_11.7.0_515.43.04_linux.run --override
[sudo] password for chris:
===========
= Summary =
===========
Driver:   Not Selected
Toolkit:  Installed in /usr/local/cuda-11.7/
Please make sure that
 -   PATH includes /usr/local/cuda-11.7/bin
 -   LD_LIBRARY_PATH includes /usr/local/cuda-11.7/lib64, or, add /usr/local/cuda-11.7/lib64 to /etc/ld.so.conf and run ldconfig as root
To uninstall the CUDA Toolkit, run cuda-uninstaller in /usr/local/cuda-11.7/bin
***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least 515.00 is required for CUDA 11.7 functionality to work.
To install the driver using this installer, run the following command, replacing <CudaInstaller> with the name of this run file:
    sudo <CudaInstaller>.run --silent --driver
Logfile is /var/log/cuda-installer.log
(textgen) [chris@bender-22 GPTQ-for-LLaMa]$ mhwd -li
> Installed PCI configs:
--------------------------------------------------------------------------------
                           NAME     VERSION   FREEDRIVER   TYPE
--------------------------------------------------------------------------------
video-hybrid-intel-nvidia-prime  2023.03.23        false    PCI
              video-modesetting  2020.01.13         true    PCI
Warning: No installed USB configs!
(textgen) [chris@bender-22 GPTQ-for-LLaMa]$ pamac list | grep -i nvidia
lib32-nvidia-utils  530.41.03-1   multilib  194,5 MB
linux61-nvidia      530.41.03-6   extra     53,8 MB
mhwd-nvidia         530.41.03-4   extra     1,6 kB
mhwd-nvidia-390xx   390.157-6     extra     1,9 kB
mhwd-nvidia-470xx   470.182.03-2  extra     1,8 kB
nvidia-prime        1.0-4         extra     112 bytes
nvidia-utils        530.41.03-4   extra     690,8 MB
opencl-nvidia       530.41.03-4   extra     80,5 MB
(textgen) [chris@bender-22 GPTQ-for-LLaMa]$ pamac list | grep -i cuda
cuda        12.1.1-1  community  4,7 GB
cuda-tools  12.1.1-1  community  2,5 GB
(textgen) [chris@bender-22 GPTQ-for-LLaMa]$
(textgen) [chris@bender-22 GPTQ-for-LLaMa]$ echo $PATH
/home/chris/miniconda3/envs/textgen/bin:/home/chris/miniconda3/condabin:/home/chris/.local/bin:/usr/local/bin:/usr/bin:/bin:/usr/local/sbin:/opt/cuda/bin:/opt/cuda/nsight_compute:/opt/cuda/nsight_systems/bin:/home/chris/.dotnet/tools:/usr/lib/jvm/default/bin:/usr/bin/site_perl:/usr/bin/vendor_perl:/usr/bin/core_perl:/var/lib/snapd/snap/bin
(textgen) [chris@bender-22 GPTQ-for-LLaMa]$ echo $LD_LIBRARY_PATH

(textgen) [chris@bender-22 GPTQ-for-LLaMa]$ echo $LD_LIBRARY_PATH

(textgen) [chris@bender-22 GPTQ-for-LLaMa]$ ls- l
^C
(textgen) [chris@bender-22 GPTQ-for-LLaMa]$ ls -l /etc/ld.so.conf
-rw-r--r-- 1 root root 117 21. Mär 15:24 /etc/ld.so.conf
(textgen) [chris@bender-22 GPTQ-for-LLaMa]$ more /etc/ld.so.conf
# Dynamic linker/loader configuration.
# See ld.so(8) and ldconfig(8) for details.
include /etc/ld.so.conf.d/*.conf
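Two things in the log above decide how the installer has to be run: gcc 12.2.1 is rejected unless --override is given, and the installed driver (530.41.03) is already newer than the 515.00 minimum, so the bundled driver can stay deselected. A minimal sketch of that pre-flight check follows. The helper names major, gcc_supported and driver_ok are hypothetical, and the gcc ceiling of 11 is my assumption; the log only proves that 12.2.1 is too new.

```shell
# Hypothetical helper: print the major component of a dotted version string.
major() { printf '%s\n' "${1%%.*}"; }

# Assumption: gcc 11 is the newest major this installer accepts
# (the log only proves that 12.2.1 is rejected).
MAX_GCC_MAJOR=11
# Minimum driver major, from the installer warning (515.00).
MIN_DRIVER_MAJOR=515

gcc_supported() { [ "$(major "$1")" -le "$MAX_GCC_MAJOR" ]; }
driver_ok() { [ "$(major "$1")" -ge "$MIN_DRIVER_MAJOR" ]; }

# Check the versions seen in the transcript.
gcc_supported 12.2.1 || echo "gcc 12.2.1: installer needs --override"
driver_ok 530.41.03 && echo "driver 530.41.03: new enough, skip the bundled driver"
```

To check the live system instead of the hard-coded values, pass "$(gcc -dumpversion)" to gcc_supported.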
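As the installer output says, PATH has to include /usr/local/cuda-11.7/bin and LD_LIBRARY_PATH has to include /usr/local/cuda-11.7/lib64. Add the following at the end of ~/.bashrc and, to be safe, also to ~/.zshrc: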
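A minimal sketch of those lines, reconstructed from the installer summary rather than copied from the actual dotfiles. CUDA_HOME is a convenience variable of my own, and prepending (instead of appending) is an assumption about intent: PATH already contains /opt/cuda/bin from the packaged 12.1.1, so the 11.7 directory has to come first to win.

```shell
# CUDA 11.7 toolkit, installed from the .run file.
# CUDA_HOME is my own convenience variable; the installer does not ask for it.
export CUDA_HOME=/usr/local/cuda-11.7
# Prepend so this nvcc is found before /opt/cuda/bin (the packaged 12.1.1).
export PATH="$CUDA_HOME/bin:$PATH"
# Put the CUDA libraries first; the :+ expansion avoids a stray ':'
# when LD_LIBRARY_PATH was empty, as it is in the transcript.
export LD_LIBRARY_PATH="$CUDA_HOME/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
```

In a new shell, nvcc --version should then report release 11.7 rather than 12.1.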
Then also add the library path to /etc/ld.so.conf, which has to be done as root:
(textgen) [chris@bender-22 GPTQ-for-LLaMa]$ su
Password:
[bender-22 GPTQ-for-LLaMa]# echo /usr/local/cuda-11.7/lib64 >> /etc/ld.so.conf
[bender-22 GPTQ-for-LLaMa]# ldconfig
[bender-22 GPTQ-for-LLaMa]# exit
exit
(textgen) [chris@bender-22 GPTQ-for-LLaMa]$ cat ~/.bashrc ~/.zshrc