===== CUDA Toolkit install =====

==== Manjaro/Arch ====

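For the newest version, install the cuda package with pamac (in the package list further down it comes from the community repository rather than the AUR):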
<cli>
pamac install cuda
</cli>

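To install an older version, use NVIDIA's .run installer. On this machine the installer rejected gcc 12.2.1 and needed --override, and it only completed when run with sudo: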
<cli>
(textgen) [chris@bender-22 GPTQ-for-LLaMa]$ wget https://developer.download.nvidia.com/compute/cuda/11.7.0/local_installers/cuda_11.7.0_515.43.04_linux.run
--2023-05-11 15:42:38--  https://developer.download.nvidia.com/compute/cuda/11.7.0/local_installers/cuda_11.7.0_515.43.04_linux.run
Loaded CA certificate '/etc/ssl/certs/ca-certificates.crt'
Resolving developer.download.nvidia.com (developer.download.nvidia.com)... 152.199.20.126
Connecting to developer.download.nvidia.com (developer.download.nvidia.com)|152.199.20.126|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 3490450898 (3,2G) [application/octet-stream]
Saving to: ‘cuda_11.7.0_515.43.04_linux.run’

cuda_11.7.0_515.43.04_l 100%[=============================>]   3,25G  37,5MB/s    in 90s     

2023-05-11 15:44:08 (37,2 MB/s) - ‘cuda_11.7.0_515.43.04_linux.run’ saved [3490450898/3490450898]

(textgen) [chris@bender-22 GPTQ-for-LLaMa]$ ls -l
total 3408820
-rw-r--r-- 1 chris chris       1110 11. Mai 15:34 convert_llama_weights_to_hf.py
-rw-r--r-- 1 chris chris 3490450898  4. Mai 2022  cuda_11.7.0_515.43.04_linux.run
-rw-r--r-- 1 chris chris       6280 11. Mai 15:34 datautils.py
-rw-r--r-- 1 chris chris       5450 11. Mai 15:34 gptq.py
-rw-r--r-- 1 chris chris      13368 11. Mai 15:34 llama_inference_offload.py
-rw-r--r-- 1 chris chris       3784 11. Mai 15:34 llama_inference.py
-rw-r--r-- 1 chris chris      16519 11. Mai 15:34 llama.py
-rw-r--r-- 1 chris chris        397 11. Mai 15:34 modelutils.py
-rw-r--r-- 1 chris chris      18554 11. Mai 15:34 opt.py
-rw-r--r-- 1 chris chris       4142 11. Mai 15:34 quant_cuda.cpp
drwxr-xr-x 2 chris chris       4096 11. Mai 15:35 quant_cuda.egg-info
-rw-r--r-- 1 chris chris      27359 11. Mai 15:34 quant_cuda_kernel.cu
-rw-r--r-- 1 chris chris      19738 11. Mai 15:34 quant.py
-rw-r--r-- 1 chris chris       8947 11. Mai 15:34 README.md
-rw-r--r-- 1 chris chris         90 11. Mai 15:34 requirements.txt
-rw-r--r-- 1 chris chris        287 11. Mai 15:34 setup_cuda.py
-rw-r--r-- 1 chris chris       5282 11. Mai 15:34 test_kernel.py
(textgen) [chris@bender-22 GPTQ-for-LLaMa]$ chmod +x cuda_11.7.0_515.43.04_linux.run 
(textgen) [chris@bender-22 GPTQ-for-LLaMa]$ ./cuda_11.7.0_515.43.04_linux.run 
 Failed to verify gcc version. See log at /tmp/cuda-installer.log for details.
(textgen) [chris@bender-22 GPTQ-for-LLaMa]$ cat /tmp/cuda-installer.log
[INFO]: Checking compiler version...
[INFO]: gcc location: /usr/bin/gcc

[INFO]: gcc version: gcc version 12.2.1 20230201 (GCC) 

[ERROR]: unsupported compiler version: 12.2.1. Use --override to override this check.
(textgen) [chris@bender-22 GPTQ-for-LLaMa]$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-pc-linux-gnu/12.2.1/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /build/gcc/src/gcc/configure --enable-languages=c,c++,ada,fortran,go,lto,objc,obj-c++,d --enable-bootstrap --prefix=/usr --libdir=/usr/lib --libexecdir=/usr/lib --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=https://bugs.archlinux.org/ --with-build-config=bootstrap-lto --with-linker-hash-style=gnu --with-system-zlib --enable-__cxa_atexit --enable-cet=auto --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-default-ssp --enable-gnu-indirect-function --enable-gnu-unique-object --enable-libstdcxx-backtrace --enable-link-serialization=1 --enable-linker-build-id --enable-lto --enable-multilib --enable-plugin --enable-shared --enable-threads=posix --disable-libssp --disable-libstdcxx-pch --disable-werror
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 12.2.1 20230201 (GCC) 
(textgen) [chris@bender-22 GPTQ-for-LLaMa]$ ./cuda_11.7.0_515.43.04_linux.run --override

Signal caught, cleaning up
(textgen) [chris@bender-22 GPTQ-for-LLaMa]$ ./cuda_11.7.0_515.43.04_linux.run --override
 Installation failed. See log at /tmp/cuda-installer.log for details.
(textgen) [chris@bender-22 GPTQ-for-LLaMa]$ sudo ./cuda_11.7.0_515.43.04_linux.run --override
[sudo] password for chris: 
===========
= Summary =
===========

Driver:   Not Selected
Toolkit:  Installed in /usr/local/cuda-11.7/

Please make sure that
 -   PATH includes /usr/local/cuda-11.7/bin
 -   LD_LIBRARY_PATH includes /usr/local/cuda-11.7/lib64, or, add /usr/local/cuda-11.7/lib64 to /etc/ld.so.conf and run ldconfig as root

To uninstall the CUDA Toolkit, run cuda-uninstaller in /usr/local/cuda-11.7/bin
***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least 515.00 is required for CUDA 11.7 functionality to work.
To install the driver using this installer, run the following command, replacing <CudaInstaller> with the name of this run file:
    sudo <CudaInstaller>.run --silent --driver

Logfile is /var/log/cuda-installer.log
(textgen) [chris@bender-22 GPTQ-for-LLaMa]$ mhwd -li
> Installed PCI configs:
--------------------------------------------------------------------------------
                  NAME               VERSION          FREEDRIVER           TYPE
--------------------------------------------------------------------------------
video-hybrid-intel-nvidia-prime            2023.03.23               false            PCI
     video-modesetting            2020.01.13                true            PCI


Warning: No installed USB configs!
(textgen) [chris@bender-22 GPTQ-for-LLaMa]$ pamac list | grep -i nvidia
lib32-nvidia-utils                530.41.03-1                   multilib   194,5 MB
linux61-nvidia                    530.41.03-6                   extra      53,8 MB
mhwd-nvidia                       530.41.03-4                   extra      1,6 kB
mhwd-nvidia-390xx                 390.157-6                     extra      1,9 kB
mhwd-nvidia-470xx                 470.182.03-2                  extra      1,8 kB
nvidia-prime                      1.0-4                         extra      112 bytes
nvidia-utils                      530.41.03-4                   extra      690,8 MB
opencl-nvidia                     530.41.03-4                   extra      80,5 MB
(textgen) [chris@bender-22 GPTQ-for-LLaMa]$ pamac list | grep -i cuda
cuda                              12.1.1-1                      community  4,7 GB
cuda-tools                        12.1.1-1                      community  2,5 GB
(textgen) [chris@bender-22 GPTQ-for-LLaMa]$ echo $PATH
/home/chris/miniconda3/envs/textgen/bin:/home/chris/miniconda3/condabin:/home/chris/.local/bin:/usr/local/bin:/usr/bin:/bin:/usr/local/sbin:/opt/cuda/bin:/opt/cuda/nsight_compute:/opt/cuda/nsight_systems/bin:/home/chris/.dotnet/tools:/usr/lib/jvm/default/bin:/usr/bin/site_perl:/usr/bin/vendor_perl:/usr/bin/core_perl:/var/lib/snapd/snap/bin
(textgen) [chris@bender-22 GPTQ-for-LLaMa]$ echo $LD_LIBRARY_PATH

(textgen) [chris@bender-22 GPTQ-for-LLaMa]$ ls -l /etc/ld.so.conf
-rw-r--r-- 1 root root 117 21. Mär 15:24 /etc/ld.so.conf
(textgen) [chris@bender-22 GPTQ-for-LLaMa]$ more /etc/ld.so.conf
# Dynamic linker/loader configuration.
# See ld.so(8) and ldconfig(8) for details.

include /etc/ld.so.conf.d/*.conf

</cli>
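The session above reached a working install by trial and error. As a rough sketch, the same result could probably be had in one non-interactive call. The session itself used the interactive menu, so the ''--silent'' and ''--toolkit'' options here are assumptions taken from NVIDIA's runfile options, and the helper name ''install_cmd'' is made up for illustration. The helper only prints the command so it can be reviewed before anything is run as root.

```shell
# Runfile name as downloaded in the session above.
runfile=cuda_11.7.0_515.43.04_linux.run

# install_cmd FILE: print (do not run) a toolkit-only install command.
#   --silent    no interactive menu (assumption: not used in the session above)
#   --toolkit   install only the toolkit, leaving the distro's driver alone
#   --override  skip the gcc version check that failed with gcc 12.2.1
install_cmd() {
  printf 'sudo sh %s --silent --toolkit --override\n' "$1"
}

# Show the command for review; copy and run it by hand if it looks right.
install_cmd "$runfile"
```

Installing only the toolkit matches the summary above ("Driver: Not Selected"), which keeps the 530.41.03 driver packages managed by pamac in place.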

As the installer output says, PATH must include /usr/local/cuda-11.7/bin and LD_LIBRARY_PATH must include /usr/local/cuda-11.7/lib64. Add the following at the end of ~/.bashrc and, to be safe, also to ~/.zshrc:
<cli>
</cli>
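The exact lines were not recorded in these notes. A minimal sketch of what would satisfy the installer's two requirements is below. The variable name ''cuda_root'' is only for illustration, and the duplicate guard is an addition so that sourcing the file repeatedly does not grow the paths.

```shell
# Install location reported in the installer summary.
cuda_root=/usr/local/cuda-11.7

# Prepend the toolkit's bin directory unless it is already on PATH.
case ":${PATH}:" in
  *":${cuda_root}/bin:"*) ;;
  *) export PATH="${cuda_root}/bin:${PATH}" ;;
esac

# Same for the library path, which may be empty (as it was in the session).
case ":${LD_LIBRARY_PATH:-}:" in
  *":${cuda_root}/lib64:"*) ;;
  *) export LD_LIBRARY_PATH="${cuda_root}/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}" ;;
esac
```

Prepending matters here because PATH already contains /opt/cuda/bin from the repository's cuda 12.1.1 package. With the 11.7 directory first, a new shell should pick up the 11.7 ''nvcc'', which can be checked with ''command -v nvcc''.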

Then also add the library path to /etc/ld.so.conf and refresh the linker cache with ldconfig. Both steps have to be done as root:
<cli>
(textgen) [chris@bender-22 GPTQ-for-LLaMa]$ su
Password: 
[bender-22 GPTQ-for-LLaMa]# echo /usr/local/cuda-11.7/lib64 >> /etc/ld.so.conf
[bender-22 GPTQ-for-LLaMa]# ldconfig 
[bender-22 GPTQ-for-LLaMa]# exit
exit
(textgen) [chris@bender-22 GPTQ-for-LLaMa]$ cat ~/.bashrc ~/.zshrc 
</cli>
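Since /etc/ld.so.conf on this system already includes /etc/ld.so.conf.d/*.conf, a possible alternative to appending to the main file is a separate drop-in file, which is easier to remove when the toolkit is uninstalled. This is only a sketch: the function name ''write_ld_conf'' and the file name ''cuda-11.7.conf'' are hypothetical, and the target directory is a parameter so the function can be tried on a scratch directory first.

```shell
# write_ld_conf CONF_DIR LIB_DIR
# Write LIB_DIR as the only line of CONF_DIR/cuda-11.7.conf (hypothetical name).
write_ld_conf() {
  conf_dir=$1
  lib_dir=$2
  printf '%s\n' "$lib_dir" > "${conf_dir}/cuda-11.7.conf"
}

# Dry run against a scratch directory; nothing under /etc is touched.
scratch=$(mktemp -d)
write_ld_conf "$scratch" /usr/local/cuda-11.7/lib64
cat "${scratch}/cuda-11.7.conf"

# Real use, as root:
#   write_ld_conf /etc/ld.so.conf.d /usr/local/cuda-11.7/lib64 && ldconfig
# Afterwards "ldconfig -p | grep cuda" should list the toolkit libraries.
```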