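Abstract:
Meta's SAM3 (Segment Anything Model 3) brings powerful image segmentation and video tracking capabilities. This article explains in detail how to deploy ComfyUI-SAM3 in a Docker environment, covering common pitfalls such as missing dependencies, compiling the CUDA acceleration extension, and model path configuration. It also provides ready-to-use Docker configuration files and a test workflow.
Meta's recently released SAM3 performs well at image segmentation and video object tracking. The ComfyUI community quickly followed up with the PozzettiAndrea/ComfyUI-SAM3 plugin, but when deploying it under Docker we ran into a series of dependency and environment problems.
This article shares a tested Docker deployment setup, including VRAM settings, compilation of the CUDA acceleration extension, and fixes for common errors.
1. Core configuration files
We use pytorch/pytorch:2.5.1-cuda12.4-cudnn9-devel as the base image so that SAM3's CUDA acceleration extension can be compiled.
Dockerfile
This Dockerfile addresses what the official image lacks: GitPython, the uv package manager, and libraries that SAM3 needs at runtime, such as ftfy.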
# ------------------------------------------------------------------------------
# Stage 1: base environment
# Use the devel variant (it includes nvcc) so Flash Attention and the SAM3 speedup extension can be compiled
# ------------------------------------------------------------------------------
FROM pytorch/pytorch:2.5.1-cuda12.4-cudnn9-devel AS base
# Environment variables
ENV DEBIAN_FRONTEND=noninteractive \
    PYTHONUNBUFFERED=1 \
    PIP_NO_CACHE_DIR=1 \
    HF_HOME="/root/cache/huggingface" \
    TORCH_HOME="/root/cache/torch" \
    # [Key] Force SAM3 initialization so a false-positive pytest-environment detection does not keep the nodes from loading
    SAM3_FORCE_INIT=1
# ------------------------------------------------------------------------------
# Stage 2: system dependencies
# ------------------------------------------------------------------------------
RUN apt-get update && apt-get install -y --no-install-recommends \
    build-essential git wget curl nano zip unzip \
    ninja-build \
    libgl1-mesa-glx libglib2.0-0 libsm6 libxext6 libxrender-dev ffmpeg \
    libpng-dev libjpeg-dev \
    && apt-get clean && rm -rf /var/lib/apt/lists/*
# Upgrade pip
RUN pip install --upgrade pip setuptools wheel
# ------------------------------------------------------------------------------
# Stage 3: Python dependencies
# ------------------------------------------------------------------------------
WORKDIR /opt
# 1. Fetch the ComfyUI dependency list
RUN wget https://raw.githubusercontent.com/comfyanonymous/ComfyUI/master/requirements.txt -O comfyui_requirements.txt
# 2. Remove torch-related lines so the base image's versions are not overwritten
#    (the pattern deletes every line containing "torch", so torchvision, torchaudio and torchsde go too)
RUN sed -i '/torch/d' comfyui_requirements.txt
# 3. Install the ComfyUI dependencies
RUN pip install -r comfyui_requirements.txt
# 4. Reinstall the core libraries (including torchsde, removed above) plus GitPython and uv, which ComfyUI-Manager requires
RUN pip install torchsde einops transformers safetensors GitPython uv
# 5. Install Flash Attention (this takes a long time)
ENV MAX_JOBS=4
RUN pip install flash-attn --no-build-isolation
# 6. Pre-install SAM 3 and the video-processing dependencies
# [Important] ftfy and regex are included to fix the text-prompt error
RUN pip install opencv-python pycocotools matplotlib onnxruntime-gpu scipy timm huggingface_hub \
    hydra-core iopath moviepy av ftfy regex
# ------------------------------------------------------------------------------
# Stage 4: finishing up and startup configuration
# ------------------------------------------------------------------------------
WORKDIR /opt/ComfyUI
EXPOSE 8188
COPY scripts/entrypoint.sh /usr/local/bin/entrypoint.sh
RUN chmod +x /usr/local/bin/entrypoint.sh
ENTRYPOINT ["/usr/local/bin/entrypoint.sh"]
CMD ["python", "main.py", "--listen", "0.0.0.0", "--port", "8188"]
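The `sed '/torch/d'` step in the Dockerfile deletes every line that contains the substring `torch`, so `torchsde` disappears along with the three packages the base image already provides. That is why step 4 installs `torchsde` again. The Python sketch below mimics that filter on a made-up sample list (the package list is illustrative and is not the real contents of ComfyUI's requirements.txt) and contrasts it with a hypothetical stricter filter that drops only the exact package names.

```python
import re

# Illustrative sample only; NOT the actual contents of ComfyUI's requirements.txt.
SAMPLE_REQUIREMENTS = [
    "torch",
    "torchsde",
    "torchvision",
    "torchaudio",
    "numpy>=1.25.0",
    "einops",
]


def drop_substring(lines, needle="torch"):
    """Mimic `sed '/torch/d'`: delete every line that contains the substring."""
    return [line for line in lines if needle not in line]


# Hypothetical stricter alternative: match only the three exact package names,
# optionally followed by a version specifier, extras, or an environment marker.
_BASE_IMAGE_PACKAGES = re.compile(
    r"^(torch|torchvision|torchaudio)\s*([<>=!~\[;].*)?$"
)


def drop_exact(lines):
    """Delete only torch, torchvision and torchaudio; keep e.g. torchsde."""
    return [line for line in lines if not _BASE_IMAGE_PACKAGES.match(line.strip())]


if __name__ == "__main__":
    print("substring filter:", drop_substring(SAMPLE_REQUIREMENTS))
    print("exact filter:    ", drop_exact(SAMPLE_REQUIREMENTS))
```

The substring filter is the simpler choice and works as long as the removed extras are installed again afterwards, which is what the Dockerfile does.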
docker-compose.yml
SAM3 uses a lot of shared memory when processing video, so shm_size must be set explicitly; Docker's default /dev/shm is only 64 MB.
version: '3.8'
services:
  comfyui-sam3:
    build: .
    container_name: comfyui-sam3
    restart: unless-stopped
    ports:
      - "8188:8188"
    environment:
      - NVIDIA_VISIBLE_DEVICES=all
      - NVIDIA_DRIVER_CAPABILITIES=compute,utility
      # Enable High VRAM mode
      - CLI_ARGS=--highvram --preview-method auto
      - HF_HOME=/root/ai-dock/cache/huggingface
      - SAM3_FORCE_INIT=1
    # [Key] Increase shared memory to fix crashes during video processing
    shm_size: 16gb
    volumes:
      - ./storage/comfyui_core:/opt/ComfyUI
      - ./storage/input:/opt/ComfyUI/input
      - ./storage/output:/opt/ComfyUI/output
      # Mount the model library
      - ./storage/models:/opt/ComfyUI/models
      # Mount the plugin directory
      - ./storage/custom_nodes:/opt/ComfyUI/custom_nodes
      # Persist the cache
      - ./storage/cache:/root/ai-dock/cache
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
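To confirm that the `shm_size` setting actually took effect, it can help to look at the size of `/dev/shm` from inside the running container. The following is a minimal sketch using only the Python standard library. It assumes that Docker interprets `16gb` as 16 GiB, and the helper names are made up for illustration.

```python
import shutil

GIB = 1024 ** 3


def shm_total_gib(path="/dev/shm"):
    """Return the total size of the filesystem mounted at `path`, in GiB."""
    return shutil.disk_usage(path).total / GIB


def shm_is_sufficient(total_gib, required_gib=16.0, tolerance=0.01):
    """True if the mount is at least `required_gib` (minus a small tolerance)."""
    return total_gib >= required_gib - tolerance


if __name__ == "__main__":
    try:
        total = shm_total_gib()
    except FileNotFoundError:
        print("/dev/shm not found; run this inside the container")
    else:
        verdict = "OK" if shm_is_sufficient(total) else "smaller than the configured 16 GB"
        print(f"/dev/shm total: {total:.2f} GiB ({verdict})")
```

One way to use it is to save the script somewhere under a mounted directory and run it with `docker exec comfyui-sam3 python <path-to-script>`.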
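entrypoint.sh (startup script, saved as scripts/entrypoint.sh)
This script pulls the plugins automatically and tries to compile the CUDA acceleration extension.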
#!/bin/bash
set -e
COMFY_DIR="/opt/ComfyUI"
CUSTOM_NODES_DIR="${COMFY_DIR}/custom_nodes"
echo "Checking ComfyUI installation..."
# 1. Install or restore ComfyUI itself
if [ ! -f "${COMFY_DIR}/main.py" ]; then
    echo "ComfyUI main.py not found. Installing..."
    git clone https://github.com/comfyanonymous/ComfyUI.git /tmp/comfyui_temp
    cp -rn /tmp/comfyui_temp/* "${COMFY_DIR}/"
    rm -rf /tmp/comfyui_temp
fi
# 2. Install ComfyUI Manager
if [ ! -d "${CUSTOM_NODES_DIR}/ComfyUI-Manager" ]; then
    git clone https://github.com/ltdrdata/ComfyUI-Manager.git "${CUSTOM_NODES_DIR}/ComfyUI-Manager"
fi
# 3. Install PozzettiAndrea/ComfyUI-SAM3
SAM3_NODE_DIR="${CUSTOM_NODES_DIR}/ComfyUI-SAM3"
if [ ! -d "${SAM3_NODE_DIR}" ]; then
    echo "Installing PozzettiAndrea/ComfyUI-SAM3..."
    git clone https://github.com/PozzettiAndrea/ComfyUI-SAM3.git "${SAM3_NODE_DIR}"
    cd "${SAM3_NODE_DIR}"
    echo "Running install.py..."
    python install.py
    # Try to compile the GPU acceleration extension (speeds up video tracking 5-10x)
    echo "Running speedup.py for GPU acceleration..."
    python speedup.py || echo "Warning: GPU speedup compilation failed. Falling back to standard mode."
    cd "${COMFY_DIR}"
fi
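# Make sure GitPython and uv are available
pip install GitPython uv > /dev/null 2>&1 || true
echo "Starting ComfyUI..."
# Append the optional flags from CLI_ARGS (set in docker-compose.yml); left unquoted on purpose so they split into separate arguments
exec "$@" ${CLI_ARGS}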
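2. Model preparation (key step)
The SAM3 model is large (about 3.2 GB). To avoid a failed download or a long wait at startup, we recommend downloading the model manually and placing it in the expected directory.
Steps:
- Download the model file sam3.pt.
- Place it at the following path on the host:
./storage/models/sam3/sam3.pt
This corresponds to /opt/ComfyUI/models/sam3/sam3.pt inside the container. If the directory does not exist, create it manually first.
Note: if you already have the model file locally, just copy it in. If you skip this step, the node downloads the model from HuggingFace automatically the first time it runs.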
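Before starting the container, a quick check that the model file is in place can save a slow or failed download later. The sketch below assumes the host path given above and uses the roughly 3.2 GB size quoted in this article only as a rough guide. The 1 GB threshold for flagging a truncated file is an arbitrary assumption, and the function name is made up for illustration.

```python
from pathlib import Path

# Host-side path from this article; adjust if your storage directory differs.
MODEL_PATH = "./storage/models/sam3/sam3.pt"

# Arbitrary threshold (an assumption): the article quotes about 3.2 GB,
# so anything below 1 GB is most likely a truncated download.
MIN_PLAUSIBLE_BYTES = 1_000_000_000


def model_status(path):
    """Return 'missing', 'too small', or 'ok' for the given model path."""
    p = Path(path)
    if not p.is_file():
        return "missing"
    if p.stat().st_size < MIN_PLAUSIBLE_BYTES:
        return "too small"
    return "ok"


if __name__ == "__main__":
    print(f"{MODEL_PATH}: {model_status(MODEL_PATH)}")
```

A result of `missing` is not fatal, since the node falls back to downloading from HuggingFace, but `too small` usually means the copy or download should be repeated.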
3. Running the demo workflow
Start the container:
docker-compose up --build -d
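The first start can take a while, because the entrypoint clones ComfyUI and the plugins and tries to compile the speedup extension before the web server comes up. A small polling helper like the one below can tell when the UI is reachable. It is a sketch that assumes ComfyUI's `/system_stats` HTTP endpoint is available in your version, and the function name and default timings are made up for illustration.

```python
import time
import urllib.error
import urllib.request


def wait_for_comfyui(base_url="http://localhost:8188", timeout=600.0, interval=5.0):
    """Poll ComfyUI until it answers with HTTP 200 or `timeout` seconds pass.

    Returns True once the server responds, False if the deadline is reached.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # /system_stats is assumed to exist; any cheap GET endpoint would do.
            with urllib.request.urlopen(f"{base_url}/system_stats", timeout=5) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            # Connection refused or reset: the server is not up yet.
            pass
        time.sleep(interval)
    return False
```

Calling `wait_for_comfyui()` right after `docker-compose up` blocks until the server is ready or ten minutes have passed. If it returns False, `docker logs comfyui-sam3` is the next place to look.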
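Once startup completes, open http://localhost:8188 in a browser. You can drag the JSON file below straight into the ComfyUI interface to load the workflow.
About the JSON:
- This is a basic "text-prompt segmentation" workflow.
- The LoadSAM3Model node is configured to read sam3.pt.
- The SAM3Grounding node's default prompt is "person".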
{
"id": "9e5b67d0-53dc-42aa-bdb7-541f1939e114",
"revision": 0,
"last_node_id": 14,
"last_link_id": 24,
"nodes": [
{
"id": 1,
"type": "LoadImage",
"pos": [
626.0189034779078,
251.07377362433684
],
"size": [
274.080078125,
314
],
"flags": {},
"order": 0,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "IMAGE",
"type": "IMAGE",
"links": [
22
]
},
{
"name": "MASK",
"type": "MASK",
"links": null
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.72",
"Node name for S&R": "LoadImage"
},
"widgets_values": [
"example_image.jpg",
"image"
]
},
{
"id": 4,
"type": "MaskPreview",
"pos": [
1589.7307778177144,
96.54681627376111
],
"size": [
210,
258
],
"flags": {},
"order": 3,
"mode": 0,
"inputs": [
{
"name": "mask",
"type": "MASK",
"link": 24
}
],
"outputs": [],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.72",
"Node name for S&R": "MaskPreview"
},
"widgets_values": []
},
{
"id": 3,
"type": "PreviewImage",
"pos": [
1233.971252285426,
223.48707505899935
],
"size": [
312.1604553633704,
258
],
"flags": {},
"order": 4,
"mode": 0,
"inputs": [
{
"name": "images",
"type": "IMAGE",
"link": 23
}
],
"outputs": [],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.72",
"Node name for S&R": "PreviewImage"
},
"widgets_values": []
},
{
"id": 13,
"type": "SAM3Grounding",
"pos": [
921.0206799155256,
127.5149432108375
],
"size": [
278.08203125,
190
],
"flags": {},
"order": 2,
"mode": 0,
"inputs": [
{
"name": "sam3_model",
"type": "SAM3_MODEL",
"link": 21
},
{
"name": "image",
"type": "IMAGE",
"link": 22
},
{
"name": "positive_boxes",
"shape": 7,
"type": "SAM3_BOXES_PROMPT",
"link": null
},
{
"name": "negative_boxes",
"shape": 7,
"type": "SAM3_BOXES_PROMPT",
"link": null
}
],
"outputs": [
{
"name": "masks",
"type": "MASK",
"links": [
24
]
},
{
"name": "visualization",
"type": "IMAGE",
"links": [
23
]
},
{
"name": "boxes",
"type": "STRING",
"links": null
},
{
"name": "scores",
"type": "STRING",
"links": null
}
],
"properties": {
"cnr_id": "comfyui-sam3",
"ver": "82c8e4f88a6c0a9242712b827e05f5e67c4a90a7",
"Node name for S&R": "SAM3Grounding"
},
"widgets_values": [
0.2,
"person",
-1,
false
]
},
{
"id": 12,
"type": "LoadSAM3Model",
"pos": [
619.424717299163,
120.03196674570798
],
"size": [
270,
82
],
"flags": {},
"order": 1,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "sam3_model",
"type": "SAM3_MODEL",
"links": [
21
]
}
],
"properties": {
"cnr_id": "comfyui-sam3",
"ver": "82c8e4f88a6c0a9242712b827e05f5e67c4a90a7",
"Node name for S&R": "LoadSAM3Model"
},
"widgets_values": [
"models/sam3/sam3.pt",
""
]
}
],
"links": [
[
21,
12,
0,
13,
0,
"SAM3_MODEL"
],
[
22,
1,
0,
13,
1,
"IMAGE"
],
[
23,
13,
1,
3,
0,
"IMAGE"
],
[
24,
13,
0,
4,
0,
"MASK"
]
],
"groups": [],
"config": {},
"extra": {
"ds": {
"scale": 1.1167815779424768,
"offset": [
-353.95100285588137,
87.4757595301123
]
},
"frontendVersion": "1.32.9",
"VHS_latentpreview": false,
"VHS_latentpreviewrate": 0,
"VHS_MetadataImage": true,
"VHS_KeepIntermediate": true,
"workflowRendererVersion": "LG"
},
"version": 0.4
}
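If the workflow fails to load after copying it from this page, a quick structural check can rule out a broken paste. In the JSON above, each entry in `links` has the form `[link_id, source_node, source_slot, target_node, target_slot, type]`. The sketch below verifies that every link points at existing nodes and is registered on both ends. The function names and the file name `workflow.json` are made up for illustration.

```python
import json


def check_links(workflow):
    """Return a list of human-readable problems found in the workflow's links."""
    nodes = {node["id"]: node for node in workflow["nodes"]}
    problems = []
    for link_id, src_id, src_slot, dst_id, dst_slot, _type in workflow["links"]:
        if src_id not in nodes or dst_id not in nodes:
            problems.append(f"link {link_id}: references a missing node")
            continue
        outputs = nodes[src_id].get("outputs", [])
        inputs = nodes[dst_id].get("inputs", [])
        # The source output slot must list this link id.
        if src_slot >= len(outputs) or link_id not in (outputs[src_slot].get("links") or []):
            problems.append(f"link {link_id}: not listed on output {src_slot} of node {src_id}")
        # The target input slot must point back at this link id.
        if dst_slot >= len(inputs) or inputs[dst_slot].get("link") != link_id:
            problems.append(f"link {link_id}: not attached to input {dst_slot} of node {dst_id}")
    return problems


def check_file(path):
    """Load a workflow JSON file (for example a saved copy of the one above) and check it."""
    with open(path, encoding="utf-8") as fh:
        return check_links(json.load(fh))
```

Saving the JSON as `workflow.json` and calling `check_file("workflow.json")` should return an empty list when all four links are consistent. A `json.JSONDecodeError` instead points to a truncated or mangled copy.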