Notes on my experience deploying a large model.
[Work in progress]
git clone https://github.com/NVIDIA/FasterTransformer.git
See docs/bert_guide.md
https://github.com/NVIDIA/nvidia-docker
Installing and using NVIDIA Docker: a good article!
Watch out for this issue along the way; because of it, I changed the final command to `docker run --rm --gpus all nvidia/cuda:11.0.3-base-ubuntu20.04 nvidia-smi`
Problem encountered:
`permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Get "http://%2Fvar%2Frun%2Fdocker.sock/v1.24/version": dial unix /var/run/docker.sock: connect: permission denied`
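Solution:
When the Docker daemon starts, it grants the group named docker read/write access to its Unix socket by default. So once a docker group exists and the current user has been added to it, that user can access the Unix socket and can therefore run docker commands.
In practice, I ended up following this tutorial: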
    sudo groupadd docker          # create the docker group
    sudo gpasswd -a $USER docker  # add the logged-in user to the docker group
    newgrp docker                 # refresh group membership in the current shell
    docker images                 # check that docker commands now work without sudo
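To see whether the group change has actually taken effect in the current shell, a quick check along these lines can help. This is only a minimal sketch: `in_group` is a hypothetical helper name of my own, and the socket path is the one from the error message above.

```shell
#!/bin/sh
# in_group GROUP: succeed if the current process belongs to GROUP.
# `id -nG` lists the groups of the running shell, which is exactly what
# changes after `newgrp docker` or after logging out and back in.
in_group() {
    id -nG | tr ' ' '\n' | grep -qxF -- "$1"
}

sock=/var/run/docker.sock   # path taken from the error message

# Show the owner, group and mode of the socket, if it exists.
if [ -S "$sock" ]; then
    ls -l "$sock"
fi

if in_group docker; then
    echo "current shell is in the docker group"
else
    echo "current shell is NOT in the docker group (try newgrp docker or log in again)"
fi
```

If the second message is printed even after running `gpasswd`, the account has most likely been added to the group but the running session has not picked it up yet.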
    docker version
    Client: Docker Engine - Community
     Version:           24.0.3
     API version:       1.43
     Go version:        go1.20.5
     Git commit:        3713ee1
     Built:             Wed Jul  5 20:44:55 2023
     OS/Arch:           linux/amd64
     Context:           default

    Server: Docker Engine - Community
     Engine:
      Version:          24.0.3
      API version:      1.43 (minimum version 1.12)
      Go version:       go1.20.5
      Git commit:       1d9c861
      Built:            Wed Jul  5 20:44:55 2023
      OS/Arch:          linux/amd64
      Experimental:     false
     containerd:
      Version:          1.6.21
      GitCommit:        3dce8eb055cbb6872793272b4f20ed16117344f8
     runc:
      Version:          1.1.7
      GitCommit:        v1.1.7-0-g860f061
     docker-init:
      Version:          0.19.0
      GitCommit:        de40ad0
Check nvidia-smi:
    Fri Jul  7 14:04:28 2023
    +---------------------------------------------------------------------------------------+
    | NVIDIA-SMI 535.54.03              Driver Version: 535.54.03    CUDA Version: 12.2     |
    |-----------------------------------------+----------------------+----------------------+
    | GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
    |                                         |                      |               MIG M. |
    |=========================================+======================+======================|
    |   0  Quadro P620                    Off | 00000000:01:00.0  On |                  N/A |
    | 34%   41C    P8              N/A /  N/A |    191MiB /  2048MiB |      0%      Default |
    |                                         |                      |                  N/A |
    +-----------------------------------------+----------------------+----------------------+

    +---------------------------------------------------------------------------------------+
    | Processes:                                                                            |
    |  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
    |        ID   ID                                                             Usage      |
    |=======================================================================================|
    |    0   N/A  N/A      1833      G   /usr/lib/xorg/Xorg                           72MiB |
    |    0   N/A  N/A      1949    C+G   ...libexec/gnome-remote-desktop-daemon       38MiB |
    |    0   N/A  N/A      1990      G   /usr/bin/gnome-shell                         73MiB |
    |    0   N/A  N/A      4784      G   gnome-control-center                          1MiB |
    +---------------------------------------------------------------------------------------+
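The same information can also be pulled out in a script-friendly form with nvidia-smi's `--query-gpu` option. The sketch below is illustrative only: `gpu_summary` and `free_mib` are hypothetical helper names of my own, and the sample line in the comment simply reuses the values from the table above.

```shell
#!/bin/sh
# gpu_summary: print one CSV line per GPU (name, driver, used memory, total memory).
# Returns non-zero if nvidia-smi is not installed.
gpu_summary() {
    command -v nvidia-smi >/dev/null 2>&1 || { echo "nvidia-smi not found" >&2; return 1; }
    nvidia-smi --query-gpu=name,driver_version,memory.used,memory.total --format=csv,noheader
}

# free_mib LINE: given one line in the format produced by gpu_summary, e.g.
#   "Quadro P620, 535.54.03, 191 MiB, 2048 MiB"
# print total minus used memory in MiB.
free_mib() {
    echo "$1" | awk -F', ' '{ split($3, used, " "); split($4, total, " "); print total[1] - used[1] }'
}

# Usage (on a machine with the NVIDIA driver installed):
#   gpu_summary | while IFS= read -r line; do free_mib "$line"; done
```

On a 2 GiB card like the Quadro P620 here, the free-memory figure is a useful reminder of how little room there is for a large model.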
Search for the nvidia/cuda image: