===== CUDA toolkit install =====
==== Manjaro/Arch ====
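For the newest version, install the cuda package with pamac (on this machine it came from the community repository rather than the AUR):
pamac install cuda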
To install an older version, use NVIDIA's .run installer:
(textgen) [chris@bender-22 GPTQ-for-LLaMa]$ wget https://developer.download.nvidia.com/compute/cuda/11.7.0/local_installers/cuda_11.7.0_515.43.04_linux.run
--2023-05-11 15:42:38-- https://developer.download.nvidia.com/compute/cuda/11.7.0/local_installers/cuda_11.7.0_515.43.04_linux.run
Loaded CA certificate '/etc/ssl/certs/ca-certificates.crt'
Resolving developer.download.nvidia.com (developer.download.nvidia.com)... 152.199.20.126
Connecting to developer.download.nvidia.com (developer.download.nvidia.com)|152.199.20.126|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 3490450898 (3,2G) [application/octet-stream]
Saving to: ‘cuda_11.7.0_515.43.04_linux.run’
cuda_11.7.0_515.43.04_l 100%[=============================>] 3,25G 37,5MB/s in 90s
2023-05-11 15:44:08 (37,2 MB/s) - ‘cuda_11.7.0_515.43.04_linux.run’ saved [3490450898/3490450898]
(textgen) [chris@bender-22 GPTQ-for-LLaMa]$ ls -l
total 3408820
-rw-r--r-- 1 chris chris 1110 11. Mai 15:34 convert_llama_weights_to_hf.py
-rw-r--r-- 1 chris chris 3490450898 4. Mai 2022 cuda_11.7.0_515.43.04_linux.run
-rw-r--r-- 1 chris chris 6280 11. Mai 15:34 datautils.py
-rw-r--r-- 1 chris chris 5450 11. Mai 15:34 gptq.py
-rw-r--r-- 1 chris chris 13368 11. Mai 15:34 llama_inference_offload.py
-rw-r--r-- 1 chris chris 3784 11. Mai 15:34 llama_inference.py
-rw-r--r-- 1 chris chris 16519 11. Mai 15:34 llama.py
-rw-r--r-- 1 chris chris 397 11. Mai 15:34 modelutils.py
-rw-r--r-- 1 chris chris 18554 11. Mai 15:34 opt.py
-rw-r--r-- 1 chris chris 4142 11. Mai 15:34 quant_cuda.cpp
drwxr-xr-x 2 chris chris 4096 11. Mai 15:35 quant_cuda.egg-info
-rw-r--r-- 1 chris chris 27359 11. Mai 15:34 quant_cuda_kernel.cu
-rw-r--r-- 1 chris chris 19738 11. Mai 15:34 quant.py
-rw-r--r-- 1 chris chris 8947 11. Mai 15:34 README.md
-rw-r--r-- 1 chris chris 90 11. Mai 15:34 requirements.txt
-rw-r--r-- 1 chris chris 287 11. Mai 15:34 setup_cuda.py
-rw-r--r-- 1 chris chris 5282 11. Mai 15:34 test_kernel.py
(textgen) [chris@bender-22 GPTQ-for-LLaMa]$ chmod +x cuda_11.7.0_515.43.04_linux.run
(textgen) [chris@bender-22 GPTQ-for-LLaMa]$ ./cuda_11.7.0_515.43.04_linux.run
Failed to verify gcc version. See log at /tmp/cuda-installer.log for details.
(textgen) [chris@bender-22 GPTQ-for-LLaMa]$ cat /tmp/cuda-installer.log
[INFO]: Checking compiler version...
[INFO]: gcc location: /usr/bin/gcc
[INFO]: gcc version: gcc version 12.2.1 20230201 (GCC)
[ERROR]: unsupported compiler version: 12.2.1. Use --override to override this check.
(textgen) [chris@bender-22 GPTQ-for-LLaMa]$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-pc-linux-gnu/12.2.1/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /build/gcc/src/gcc/configure --enable-languages=c,c++,ada,fortran,go,lto,objc,obj-c++,d --enable-bootstrap --prefix=/usr --libdir=/usr/lib --libexecdir=/usr/lib --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=https://bugs.archlinux.org/ --with-build-config=bootstrap-lto --with-linker-hash-style=gnu --with-system-zlib --enable-__cxa_atexit --enable-cet=auto --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-default-ssp --enable-gnu-indirect-function --enable-gnu-unique-object --enable-libstdcxx-backtrace --enable-link-serialization=1 --enable-linker-build-id --enable-lto --enable-multilib --enable-plugin --enable-shared --enable-threads=posix --disable-libssp --disable-libstdcxx-pch --disable-werror
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 12.2.1 20230201 (GCC)
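The installer refuses to start because the system gcc is newer than it expects. A minimal sketch of that check follows, so the result can be predicted before downloading 3 GB. The function names are made up for illustration, and the limit of gcc 11 for CUDA 11.7 is an assumption taken from memory of NVIDIA's support matrix, not from the log above.

```shell
#!/bin/sh
# Extract the major version from a "gcc version X.Y.Z ..." banner line.
gcc_major() {
    printf '%s\n' "$1" | sed -n 's/^gcc version \([0-9][0-9]*\)\..*/\1/p'
}

# Assumption: 11 is the newest gcc major version the CUDA 11.7 installer accepts.
MAX_GCC_MAJOR=11

# Succeeds if the banner's major version is within the assumed limit.
gcc_supported() {
    [ "$(gcc_major "$1")" -le "$MAX_GCC_MAJOR" ]
}

# Banner line copied from the gcc -v output above.
banner='gcc version 12.2.1 20230201 (GCC)'
if gcc_supported "$banner"; then
    echo "gcc $(gcc_major "$banner") passes the check"
else
    echo "gcc $(gcc_major "$banner") is too new, --override needed"
fi
```

On a live system the banner could be taken from `gcc -v 2>&1 | tail -n 1`. Note that --override only skips the installer's check; whether nvcc later accepts the newer gcc is a separate question.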
(textgen) [chris@bender-22 GPTQ-for-LLaMa]$ ./cuda_11.7.0_515.43.04_linux.run --override
Signal caught, cleaning up
(textgen) [chris@bender-22 GPTQ-for-LLaMa]$ ./cuda_11.7.0_515.43.04_linux.run --override
Installation failed. See log at /tmp/cuda-installer.log for details.
(textgen) [chris@bender-22 GPTQ-for-LLaMa]$ sudo ./cuda_11.7.0_515.43.04_linux.run --override
[sudo] password for chris:
===========
= Summary =
===========
Driver: Not Selected
Toolkit: Installed in /usr/local/cuda-11.7/
Please make sure that
- PATH includes /usr/local/cuda-11.7/bin
- LD_LIBRARY_PATH includes /usr/local/cuda-11.7/lib64, or, add /usr/local/cuda-11.7/lib64 to /etc/ld.so.conf and run ldconfig as root
To uninstall the CUDA Toolkit, run cuda-uninstaller in /usr/local/cuda-11.7/bin
***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least 515.00 is required for CUDA 11.7 functionality to work.
To install the driver using this installer, run the following command, replacing <CudaInstaller> with the name of this run file:
sudo <CudaInstaller>.run --silent --driver
Logfile is /var/log/cuda-installer.log
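The run that finally worked needed root and --override, with only the toolkit selected in the interactive menu. The same result can probably be had without the menu. The sketch below only prints the command (a dry run); the helper name is hypothetical, and it assumes the installer's --silent and --toolkit options behave as described in NVIDIA's runfile documentation.

```shell
#!/bin/sh
# Dry-run helper: prints the installer command instead of executing it.
cuda_install_cmd() {
    runfile=$1
    # --silent   : no interactive menu
    # --toolkit  : install only the toolkit, not the driver
    # --override : skip the gcc version check
    printf 'sudo sh %s --silent --toolkit --override\n' "$runfile"
}

cuda_install_cmd ./cuda_11.7.0_515.43.04_linux.run
```

Printing first and pasting afterwards keeps a 3 GB installer from running with the wrong flags by accident.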
(textgen) [chris@bender-22 GPTQ-for-LLaMa]$ mhwd -li
> Installed PCI configs:
--------------------------------------------------------------------------------
NAME VERSION FREEDRIVER TYPE
--------------------------------------------------------------------------------
video-hybrid-intel-nvidia-prime 2023.03.23 false PCI
video-modesetting 2020.01.13 true PCI
Warning: No installed USB configs!
(textgen) [chris@bender-22 GPTQ-for-LLaMa]$ pamac list | grep -i nvidia
lib32-nvidia-utils 530.41.03-1 multilib 194,5 MB
linux61-nvidia 530.41.03-6 extra 53,8 MB
mhwd-nvidia 530.41.03-4 extra 1,6 kB
mhwd-nvidia-390xx 390.157-6 extra 1,9 kB
mhwd-nvidia-470xx 470.182.03-2 extra 1,8 kB
nvidia-prime 1.0-4 extra 112 bytes
nvidia-utils 530.41.03-4 extra 690,8 MB
opencl-nvidia 530.41.03-4 extra 80,5 MB
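The installer warning asks for a driver of at least 515.00, and the listing shows 530.41.03, so the existing driver should be new enough. A small sketch of that comparison follows. The function name is made up here, and it relies on the -V (version sort) option of GNU sort.

```shell
#!/bin/sh
# True if version $1 is greater than or equal to version $2.
# Relies on GNU sort's -V (version sort) option.
version_ge() {
    [ "$(printf '%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

required=515.00      # from the installer warning
installed=530.41.03  # nvidia-utils version in the pamac listing

if version_ge "$installed" "$required"; then
    echo "driver $installed satisfies >= $required"
else
    echo "driver $installed is older than $required"
fi
```

On a running system the installed version could instead be read with `nvidia-smi --query-gpu=driver_version --format=csv,noheader`.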
(textgen) [chris@bender-22 GPTQ-for-LLaMa]$ pamac list | grep -i cuda
cuda 12.1.1-1 community 4,7 GB
cuda-tools 12.1.1-1 community 2,5 GB
(textgen) [chris@bender-22 GPTQ-for-LLaMa]$ echo $PATH
/home/chris/miniconda3/envs/textgen/bin:/home/chris/miniconda3/condabin:/home/chris/.local/bin:/usr/local/bin:/usr/bin:/bin:/usr/local/sbin:/opt/cuda/bin:/opt/cuda/nsight_compute:/opt/cuda/nsight_systems/bin:/home/chris/.dotnet/tools:/usr/lib/jvm/default/bin:/usr/bin/site_perl:/usr/bin/vendor_perl:/usr/bin/core_perl:/var/lib/snapd/snap/bin
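PATH already contains /opt/cuda/bin, presumably from the cuda 12.1.1 package, but not the new /usr/local/cuda-11.7/bin. The sketch below makes that visible. Both helper names are hypothetical, and the sample value is a shortened copy of the PATH shown above.

```shell
#!/bin/sh
# List the entries of a PATH-style string that mention cuda, in lookup order.
cuda_path_entries() {
    printf '%s\n' "$1" | tr ':' '\n' | grep -i cuda
}

# Succeeds if directory $2 is an exact entry of the PATH-style string $1.
path_has_dir() {
    case ":$1:" in
        *":$2:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Shortened copy of the PATH shown above.
sample_path='/usr/local/bin:/usr/bin:/opt/cuda/bin:/opt/cuda/nsight_compute'

cuda_path_entries "$sample_path"
path_has_dir "$sample_path" /usr/local/cuda-11.7/bin \
    || echo "CUDA 11.7 bin directory is not on PATH"
```

Calling the helpers with "$PATH" instead of the sample string checks the current shell.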
(textgen) [chris@bender-22 GPTQ-for-LLaMa]$ echo $LD_LIBRARY_PATH
(textgen) [chris@bender-22 GPTQ-for-LLaMa]$ ls -l /etc/ld.so.conf
-rw-r--r-- 1 root root 117 21. Mär 15:24 /etc/ld.so.conf
(textgen) [chris@bender-22 GPTQ-for-LLaMa]$ more /etc/ld.so.conf
# Dynamic linker/loader configuration.
# See ld.so(8) and ldconfig(8) for details.
include /etc/ld.so.conf.d/*.conf
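As the installer output says, PATH must include /usr/local/cuda-11.7/bin and LD_LIBRARY_PATH must include /usr/local/cuda-11.7/lib64. Add the corresponding export lines at the end of ~/.bashrc and, to be safe, also to ~/.zshrc.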
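The two settings from the installer summary could be written as the lines below. This is a sketch, not a copy of what actually went into the files. It prepends the 11.7 directories so they are found before /opt/cuda/bin, which is an assumption about what is wanted here.

```shell
# CUDA 11.7 toolkit installed from the .run file
export PATH="/usr/local/cuda-11.7/bin:$PATH"
# Append the old value only if LD_LIBRARY_PATH was already set (avoids a trailing ':')
export LD_LIBRARY_PATH="/usr/local/cuda-11.7/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
```

The `${VAR:+...}` expansion works in both bash and zsh, so the same two lines can go into both files. After opening a new shell, `which nvcc` should point into /usr/local/cuda-11.7/bin.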
Then also add the library path to /etc/ld.so.conf, which has to be done as root:
(textgen) [chris@bender-22 GPTQ-for-LLaMa]$ su
Password:
[bender-22 GPTQ-for-LLaMa]# echo /usr/local/cuda-11.7/lib64 >> /etc/ld.so.conf
[bender-22 GPTQ-for-LLaMa]# ldconfig
[bender-22 GPTQ-for-LLaMa]# exit
exit
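Since /etc/ld.so.conf already includes /etc/ld.so.conf.d/*.conf, a drop-in file is an alternative to appending to the main file, and it is easier to remove again. The following is a sketch of that variant. The function and the file name cuda-11.7.conf are made up for illustration.

```shell
#!/bin/sh
# Write a one-line drop-in file naming the library directory.
# Does nothing if the file already contains exactly that line.
add_ld_dropin() {
    conf_dir=$1   # e.g. /etc/ld.so.conf.d
    lib_dir=$2    # e.g. /usr/local/cuda-11.7/lib64
    conf_file="$conf_dir/cuda-11.7.conf"   # hypothetical file name
    if [ -f "$conf_file" ] && grep -qxF "$lib_dir" "$conf_file"; then
        return 0
    fi
    printf '%s\n' "$lib_dir" > "$conf_file"
}
```

As root this would be called as `add_ld_dropin /etc/ld.so.conf.d /usr/local/cuda-11.7/lib64 && ldconfig`. Afterwards `ldconfig -p | grep libcudart` should list the 11.7 runtime if the linker cache picked it up.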
(textgen) [chris@bender-22 GPTQ-for-LLaMa]$ cat ~/.bashrc ~/.zshrc