Deploying Qwen3 Locally

admin · Posted on 2025-6-10 17:33:27

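Not sure which model your local hardware can handle? See this article on Zhihu: "Not sure which version of Qwen3 to deploy? Qwen3 local deployment requirements explained in one article".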

Environment: Windows 10 with WSL2 running Ubuntu 24.04.
The NVIDIA CUDA driver and the nvidia-cuda-toolkit package are installed.
Verify the NVIDIA driver installation:
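
One way to run this check, as a rough sketch. It assumes nvidia-smi (the driver utility) and nvcc (the CUDA compiler) are the two tools that matter here; tool_status is a hypothetical helper name, not part of any NVIDIA package.

```shell
# tool_status is a hypothetical helper: it only reports whether a command
# is on PATH. It does not prove that the driver itself works.
tool_status() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found"
  else
    echo "$1: missing"
  fi
}

tool_status nvidia-smi   # driver utility
tool_status nvcc         # CUDA compiler from nvidia-cuda-toolkit

# If both are found, run them directly to see the real output:
#   nvidia-smi
#   nvcc -V
```

If nvidia-smi lists the GPU and nvcc -V prints a release number, the driver and toolkit are both visible from inside WSL2.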

If nvcc -V reports an error, install nvidia-cuda-toolkit:
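
A possible form of this step, sketched under the assumption of an apt-based Ubuntu inside WSL2 and an account with sudo rights. ensure_nvcc is a hypothetical wrapper name.

```shell
# ensure_nvcc is a hypothetical wrapper: it installs the toolkit only
# when "nvcc -V" fails, which mirrors the condition described above.
ensure_nvcc() {
  if nvcc -V >/dev/null 2>&1; then
    echo "nvcc already works"
    return 0
  fi
  # Assumption: the distribution package is good enough for this setup.
  sudo apt-get update && sudo apt-get install -y nvidia-cuda-toolkit
}
```

Call ensure_nvcc once, then run nvcc -V again to confirm the result.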

Download the model files:
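
A sketch of what such a download could look like with git. The repository URL in the comment is an assumption (the post does not say where the files came from); the target directory is the one that appears in the error log further down. clone_model is a hypothetical wrapper name.

```shell
# clone_model is a hypothetical wrapper around a plain git clone.
# With git-lfs set up, the clone also fetches the large weight files.
clone_model() {
  repo_url="$1"   # e.g. https://huggingface.co/Qwen/Qwen3-30B-A3B (assumed source)
  dest_dir="$2"   # e.g. /mnt/c/LLM/Qwen3-30B-A3B (path seen in the log)
  git clone "$repo_url" "$dest_dir"
}
```

Usage would be clone_model followed by the URL and the target directory. The weights are split into 16 safetensors shards, so expect a long transfer.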

The download stalled halfway through. Press Ctrl+C at that point to interrupt it.

Install git-lfs:
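
A minimal sketch of this step, again assuming an apt-based Ubuntu and sudo rights. ensure_git_lfs is a hypothetical wrapper name.

```shell
# ensure_git_lfs is a hypothetical wrapper: it installs the package only
# when "git lfs version" fails, then registers the LFS hooks for the user.
ensure_git_lfs() {
  if ! git lfs version >/dev/null 2>&1; then
    sudo apt-get update && sudo apt-get install -y git-lfs
  fi
  git lfs install
}
```

The final git lfs install is needed once per user account; without it git treats the large files as plain pointer files.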

cd into the download directory and check file integrity:
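
A sketch of one such check. It assumes the listing comes from git lfs ls-files, which prints a "*" after the object ID for files whose content is present and a "-" for files that are still only pointers. missing_lfs_files is a hypothetical helper name.

```shell
# missing_lfs_files is a hypothetical filter: it reads "git lfs ls-files"
# output on stdin and prints only the paths marked with "-".
missing_lfs_files() {
  awk '$2 == "-" { sub(/^[^ ]+ - /, ""); print }'
}

# Usage inside the cloned repository:
#   git lfs ls-files | missing_lfs_files
```

An empty result means every tracked file has its content on disk.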

A "-" in the listing means the file has not been downloaded. Use the following command to resume the download:
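
The log further down shows git-lfs pull being used for this. A sketch that narrows the pull to specific files, assuming --include is given one path at a time; pull_files is a hypothetical helper name.

```shell
# pull_files is a hypothetical helper: it asks git-lfs to fetch only the
# named files, one at a time, so a stall affects a single file.
pull_files() {
  for path in "$@"; do
    git lfs pull --include="$path" || return 1
  done
}

# Usage inside the cloned repository (file name taken from the error log):
#   pull_files model-00007-of-00016.safetensors
```

Running a plain git lfs pull with no arguments fetches everything that is still missing in one go.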

The download still ended with an error:

$ git-lfs pull
Error updating the Git index:
error: model-00007-of-00016.safetensors: cannot add to the index - missing --add option?
fatal: Unable to process path model-00007-of-00016.safetensors
exit status 128
Errors logged to '/mnt/c/LLM/Qwen3-30B-A3B/.git/lfs/logs/20250610T202649.063879086.log'.
Use `git lfs logs last` to view the log.

Running the model anyway, it even failed with this error:

torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 768.00 MiB. GPU 0 has a total capacity of 24.00 GiB of which 0 bytes is free. Including non-PyTorch memory, this process has 17179869184.00 GiB memory in use. Of the allocated memory 37.78 GiB is allocated by PyTorch, and 15.58 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
[2025-06-10 22:23:15] Received sigquit from a child process. It usually means the child failed.
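
A rough back-of-the-envelope check of why this fails on a 24 GiB card. The sketch assumes about 30.5 billion parameters for Qwen3-30B-A3B (a figure from the model card, not from this post) and counts only the weights, ignoring KV cache and activations. weights_gib is a hypothetical helper name.

```shell
# weights_gib is a hypothetical helper: it estimates the memory needed
# for the weights alone.
#   $1: parameter count in billions (assumed, see above)
#   $2: bytes per parameter (2 for 16-bit, 0.5 for 4-bit)
weights_gib() {
  LC_ALL=C awk -v p="$1" -v b="$2" \
    'BEGIN { printf "%.1f\n", p * 1e9 * b / (1024 * 1024 * 1024) }'
}

weights_gib 30.5 2     # 16-bit weights
weights_gib 30.5 0.5   # 4-bit quantized weights
```

Under these assumptions the 16-bit weights alone need roughly 57 GiB, far more than the 24 GiB the GPU reports, while a 4-bit quantization would need roughly 14 GiB.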


admin · Posted on 2025-9-12 06:00:34

ollama show qwen3:30b
ollama show qwen3

netstat -ano| findstr 11434
nvidia-smi
nvcc -V
