===== CUDA toolkit install =====

==== Manjaro/Arch ====

For the newest version, install from the AUR:

  pamac blabla

To install an older version, use NVIDIA's .run file:

  (textgen) [chris@bender-22 GPTQ-for-LLaMa]$ wget https://developer.download.nvidia.com/compute/cuda/11.7.0/local_installers/cuda_11.7.0_515.43.04_linux.run
  --2023-05-11 15:42:38--  https://developer.download.nvidia.com/compute/cuda/11.7.0/local_installers/cuda_11.7.0_515.43.04_linux.run
  Loaded CA certificate '/etc/ssl/certs/ca-certificates.crt'
  Resolving developer.download.nvidia.com (developer.download.nvidia.com)... 152.199.20.126
  Connecting to developer.download.nvidia.com (developer.download.nvidia.com)|152.199.20.126|:443... connected.
  HTTP request sent, awaiting response... 200 OK
  Length: 3490450898 (3,2G) [application/octet-stream]
  Saving to: ‘cuda_11.7.0_515.43.04_linux.run’
  cuda_11.7.0_515.43.04_l 100%[=============================>]   3,25G  37,5MB/s    in 90s
  2023-05-11 15:44:08 (37,2 MB/s) - ‘cuda_11.7.0_515.43.04_linux.run’ saved [3490450898/3490450898]
  (textgen) [chris@bender-22 GPTQ-for-LLaMa]$ ls -l
  total 3408820
  -rw-r--r-- 1 chris chris       1110 11. Mai 15:34 convert_llama_weights_to_hf.py
  -rw-r--r-- 1 chris chris 3490450898  4. Mai  2022 cuda_11.7.0_515.43.04_linux.run
  -rw-r--r-- 1 chris chris       6280 11. Mai 15:34 datautils.py
  -rw-r--r-- 1 chris chris       5450 11. Mai 15:34 gptq.py
  -rw-r--r-- 1 chris chris      13368 11. Mai 15:34 llama_inference_offload.py
  -rw-r--r-- 1 chris chris       3784 11. Mai 15:34 llama_inference.py
  -rw-r--r-- 1 chris chris      16519 11. Mai 15:34 llama.py
  -rw-r--r-- 1 chris chris        397 11. Mai 15:34 modelutils.py
  -rw-r--r-- 1 chris chris      18554 11. Mai 15:34 opt.py
  -rw-r--r-- 1 chris chris       4142 11. Mai 15:34 quant_cuda.cpp
  drwxr-xr-x 2 chris chris       4096 11. Mai 15:35 quant_cuda.egg-info
  -rw-r--r-- 1 chris chris      27359 11. Mai 15:34 quant_cuda_kernel.cu
  -rw-r--r-- 1 chris chris      19738 11. Mai 15:34 quant.py
  -rw-r--r-- 1 chris chris       8947 11. Mai 15:34 README.md
  -rw-r--r-- 1 chris chris         90 11. Mai 15:34 requirements.txt
  -rw-r--r-- 1 chris chris        287 11. Mai 15:34 setup_cuda.py
  -rw-r--r-- 1 chris chris       5282 11. Mai 15:34 test_kernel.py
  (textgen) [chris@bender-22 GPTQ-for-LLaMa]$ chmod +x cuda_11.7.0_515.43.04_linux.run
  (textgen) [chris@bender-22 GPTQ-for-LLaMa]$ ./cuda_11.7.0_515.43.04_linux.run
  Failed to verify gcc version. See log at /tmp/cuda-installer.log for details.
  (textgen) [chris@bender-22 GPTQ-for-LLaMa]$ cat /tmp/cuda-installer.log
  [INFO]: Checking compiler version...
  [INFO]: gcc location: /usr/bin/gcc
  [INFO]: gcc version: gcc version 12.2.1 20230201 (GCC)
  [ERROR]: unsupported compiler version: 12.2.1. Use --override to override this check.
  (textgen) [chris@bender-22 GPTQ-for-LLaMa]$ gcc -v
  Using built-in specs.
  COLLECT_GCC=gcc
  COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-pc-linux-gnu/12.2.1/lto-wrapper
  Target: x86_64-pc-linux-gnu
  Configured with: /build/gcc/src/gcc/configure --enable-languages=c,c++,ada,fortran,go,lto,objc,obj-c++,d --enable-bootstrap --prefix=/usr --libdir=/usr/lib --libexecdir=/usr/lib --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=https://bugs.archlinux.org/ --with-build-config=bootstrap-lto --with-linker-hash-style=gnu --with-system-zlib --enable-__cxa_atexit --enable-cet=auto --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-default-ssp --enable-gnu-indirect-function --enable-gnu-unique-object --enable-libstdcxx-backtrace --enable-link-serialization=1 --enable-linker-build-id --enable-lto --enable-multilib --enable-plugin --enable-shared --enable-threads=posix --disable-libssp --disable-libstdcxx-pch --disable-werror
  Thread model: posix
  Supported LTO compression algorithms: zlib zstd
  gcc version 12.2.1 20230201 (GCC)
  (textgen) [chris@bender-22 GPTQ-for-LLaMa]$ ./cuda_11.7.0_515.43.04_linux.run --override
  Signal caught, cleaning up
  (textgen) [chris@bender-22 GPTQ-for-LLaMa]$ ./cuda_11.7.0_515.43.04_linux.run --override
  Installation failed. See log at /tmp/cuda-installer.log for details.
  (textgen) [chris@bender-22 GPTQ-for-LLaMa]$ sudo ./cuda_11.7.0_515.43.04_linux.run --override
  [sudo] password for chris:
  ===========
  = Summary =
  ===========
  Driver:   Not Selected
  Toolkit:  Installed in /usr/local/cuda-11.7/
  Please make sure that
   -   PATH includes /usr/local/cuda-11.7/bin
   -   LD_LIBRARY_PATH includes /usr/local/cuda-11.7/lib64, or, add /usr/local/cuda-11.7/lib64 to /etc/ld.so.conf and run ldconfig as root
  To uninstall the CUDA Toolkit, run cuda-uninstaller in /usr/local/cuda-11.7/bin
  ***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least 515.00 is required for CUDA 11.7 functionality to work.
  To install the driver using this installer, run the following command, replacing <CudaInstaller> with the name of this run file:
      sudo <CudaInstaller>.run --silent --driver
  Logfile is /var/log/cuda-installer.log
  (textgen) [chris@bender-22 GPTQ-for-LLaMa]$ mhwd -li
  > Installed PCI configs:
  --------------------------------------------------------------------------------
                    NAME               VERSION          FREEDRIVER           TYPE
  --------------------------------------------------------------------------------
  video-hybrid-intel-nvidia-prime            2023.03.23               false            PCI
             video-modesetting            2020.01.13                true            PCI
  Warning: No installed USB configs!
  (textgen) [chris@bender-22 GPTQ-for-LLaMa]$ pamac list | grep -i nvidia
  lib32-nvidia-utils  530.41.03-1   multilib  194,5 MB
  linux61-nvidia      530.41.03-6   extra     53,8 MB
  mhwd-nvidia         530.41.03-4   extra     1,6 kB
  mhwd-nvidia-390xx   390.157-6     extra     1,9 kB
  mhwd-nvidia-470xx   470.182.03-2  extra     1,8 kB
  nvidia-prime        1.0-4         extra     112 bytes
  nvidia-utils        530.41.03-4   extra     690,8 MB
  opencl-nvidia       530.41.03-4   extra     80,5 MB
  (textgen) [chris@bender-22 GPTQ-for-LLaMa]$ pamac list | grep -i cuda
  cuda        12.1.1-1  community  4,7 GB
  cuda-tools  12.1.1-1  community  2,5 GB
  (textgen) [chris@bender-22 GPTQ-for-LLaMa]$ echo $PATH
  /home/chris/miniconda3/envs/textgen/bin:/home/chris/miniconda3/condabin:/home/chris/.local/bin:/usr/local/bin:/usr/bin:/bin:/usr/local/sbin:/opt/cuda/bin:/opt/cuda/nsight_compute:/opt/cuda/nsight_systems/bin:/home/chris/.dotnet/tools:/usr/lib/jvm/default/bin:/usr/bin/site_perl:/usr/bin/vendor_perl:/usr/bin/core_perl:/var/lib/snapd/snap/bin
  (textgen) [chris@bender-22 GPTQ-for-LLaMa]$ echo $LD_LIBRARY_PATH
  (textgen) [chris@bender-22 GPTQ-for-LLaMa]$ ls -l /etc/ld.so.conf
  -rw-r--r-- 1 root root 117 21. Mär 15:24 /etc/ld.so.conf
  (textgen) [chris@bender-22 GPTQ-for-LLaMa]$ more /etc/ld.so.conf
  # Dynamic linker/loader configuration.
  # See ld.so(8) and ldconfig(8) for details.
  include /etc/ld.so.conf.d/*.conf

As mentioned in the installer output, PATH must include /usr/local/cuda-11.7/bin and LD_LIBRARY_PATH must include /usr/local/cuda-11.7/lib64. Set both at the end of ~/.bashrc and, to be safe, also in ~/.zshrc.

Then also add the library path to /etc/ld.so.conf, which has to be done as root:

  (textgen) [chris@bender-22 GPTQ-for-LLaMa]$ su
  Password:
  [bender-22 GPTQ-for-LLaMa]# echo /usr/local/cuda-11.7/lib64 >> /etc/ld.so.conf
  [bender-22 GPTQ-for-LLaMa]# ldconfig
  [bender-22 GPTQ-for-LLaMa]# exit
  exit
  (textgen) [chris@bender-22 GPTQ-for-LLaMa]$ cat ~/.bashrc ~/.zshrc
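
The exact lines added to ~/.bashrc and ~/.zshrc are not reproduced on this page. A minimal sketch of what they could look like, using only the two paths from the installer summary; the variable name CUDA_HOME and the guards against duplicate entries are assumptions of this sketch, not something the installer prescribes:

```shell
# Assumption: the toolkit sits in the default prefix reported by the installer.
CUDA_HOME=/usr/local/cuda-11.7
export CUDA_HOME

# Prepend the 11.7 binaries unless they are already somewhere on PATH.
case ":$PATH:" in
  *":$CUDA_HOME/bin:"*) ;;
  *) PATH="$CUDA_HOME/bin:$PATH" ;;
esac
export PATH

# LD_LIBRARY_PATH was empty in the session above; the ${VAR:+...} form
# only adds a separating colon when the variable already has a value.
case ":${LD_LIBRARY_PATH:-}:" in
  *":$CUDA_HOME/lib64:"*) ;;
  *) LD_LIBRARY_PATH="$CUDA_HOME/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}" ;;
esac
export LD_LIBRARY_PATH
```

Prepending matters here because PATH already contains /opt/cuda/bin from the repository package (CUDA 12.1.1). In a new shell, nvcc --version should then report release 11.7, provided the new directory comes before /opt/cuda/bin.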