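Not sure which large model to deploy on your local hardware? See: "Not sure which version of Qwen3 to deploy? A one-article guide to the configuration requirements for running Qwen3 locally" (Zhihu).
Environment: Windows 10 with WSL2 installed, running Ubuntu 24.01.
The NVIDIA CUDA driver and nvidia-cuda-toolkit are installed.
Verify the NVIDIA driver installation:
- # Check the driver
- nvidia-smi
- # Check the CUDA version
- nvcc -V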
If nvcc -V reports an error, install nvidia-cuda-toolkit:
- sudo apt install nvidia-cuda-toolkit
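As a quick sanity check before running the two commands above, a small helper can report whether each tool is on the PATH at all. This is only a sketch: `check_tool` is a hypothetical helper name, not part of any NVIDIA tooling, and it only checks that the binary exists, not that the driver works.

```shell
# check_tool is a hypothetical helper: it prints whether a command is on PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: missing"
    fi
}

# The two tools used in the verification step.
check_tool nvidia-smi
check_tool nvcc
```

If `nvcc` shows up as missing, that matches the case where nvidia-cuda-toolkit still needs to be installed.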
Download the model files:
- git clone https://www.modelscope.cn/Qwen/Qwen3-30B-A3B
The download stalled partway through:
- Cloning into 'Qwen3-30B-A3B'...
- remote: Enumerating objects: 79, done.
- remote: Counting objects: 100% (79/79), done.
- remote: Compressing objects: 100% (63/63), done.
- remote: Total 79 (delta 19), reused 71 (delta 15), pack-reused 0
- Receiving objects: 100% (79/79), 1.78 MiB | 11.56 MiB/s, done.
- Resolving deltas: 100% (19/19), done.
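Press Ctrl+C here to interrupt.
Install git-lfs:
cd into the download directory and check file integrity (the listing below has the format printed by git lfs ls-files):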
- 454e77b346 - model-00001-of-00016.safetensors
- 47f015d6e5 - model-00002-of-00016.safetensors
- ac0bf5990f - model-00003-of-00016.safetensors
- 89b01fd34a - model-00004-of-00016.safetensors
A "-" means the file has not been downloaded yet. Use the following commands to continue the download:
- git lfs install
- git lfs pull
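Another way to see which weight files are still missing is to look at the files themselves. A file that Git LFS has not fetched yet is a small text pointer whose first line starts with `version https://git-lfs.github.com/spec/v1`. The sketch below assumes it is run inside the cloned directory; `is_lfs_pointer` is a hypothetical helper name, and the `model-*.safetensors` pattern is taken from the listing above.

```shell
# is_lfs_pointer is a hypothetical helper: it succeeds when the file
# starts with the Git LFS pointer header instead of real content.
is_lfs_pointer() {
    head -c 200 "$1" 2>/dev/null | grep -q '^version https://git-lfs.github.com/spec/v1'
}

# Classify every shard in the current directory.
for f in model-*.safetensors; do
    [ -e "$f" ] || continue   # skip when the pattern matches nothing
    if is_lfs_pointer "$f"; then
        echo "$f: pointer only (not downloaded)"
    else
        echo "$f: content present"
    fi
done
```

Files reported as pointer only are the ones that `git lfs pull` still has to fetch.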
The download still ended with an error:
- $ git-lfs pull
- Error updating the Git index:
- error: model-00007-of-00016.safetensors: cannot add to the index - missing --add option?
- fatal: Unable to process path model-00007-of-00016.safetensors
- exit status 128
- Errors logged to '/mnt/c/LLM/Qwen3-30B-A3B/.git/lfs/logs/20250610T202649.063879086.log'.
- Use `git lfs logs last` to view the log.
Forcing a run anyway produced this error:
- torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 768.00 MiB. GPU 0 has a total capacity of 24.00 GiB of which 0 bytes is free. Including non-PyTorch memory, this process has 17179869184.00 GiB memory in use. Of the allocated memory 37.78 GiB is allocated by PyTorch, and 15.58 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
- [2025-06-10 22:23:15] Received sigquit from a child process. It usually means the child failed.
This is absurd. What on earth is 17179869184.00 GiB?
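One observation, offered as arithmetic rather than a diagnosis: 17179869184 is exactly 2^34, which is 16 GiB when read as a byte count. Whether the message really printed a byte value under a GiB label here is an assumption, not something the log confirms. The conversion can be checked with a hypothetical helper `bytes_to_gib`:

```shell
# bytes_to_gib is a hypothetical helper: integer conversion from bytes to GiB.
bytes_to_gib() {
    echo $(( $1 / 1024 / 1024 / 1024 ))
}

bytes_to_gib 17179869184   # prints 16
```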
ref: Qwen3-30B-A3B deployment (using vllm and sglang), CSDN blog