
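BGE-M3 Embedding Service + Milvus Vector Database Deployment Guide

Target environment: Ubuntu 24.04 server with Docker and the BaoTa panel already installed
Final deployment: Milvus + Attu (connected to an external MinIO), plus a BGE-M3 CPU embedding service

I. Milvus Vector Database Deployment (External MinIO)

1. Preparation

- An Ubuntu 24 server with Docker and Docker Compose installed.
- An existing external MinIO service (here https://oss.tlplay.cn:8999) whose Access Key and Secret Key are known.
- A bucket created in MinIO in advance (this example uses milvus-data).

2. Create the deployment directory

```bash
mkdir -p /opt/milvus
cd /opt/milvus
mkdir -p volumes/{etcd,milvus}
```

3. Write docker-compose.yml

```yaml
services:
  etcd:
    container_name: milvus-etcd
    image: quay.io/coreos/etcd:v3.5.5
    environment:
      - ETCD_AUTO_COMPACTION_MODE=revision
      - ETCD_AUTO_COMPACTION_RETENTION=1000
      - ETCD_QUOTA_BACKEND_BYTES=4294967296
      - ETCD_SNAPSHOT_COUNT=50000
    volumes:
      - ./volumes/etcd:/etcd
    command: etcd -advertise-client-urls=http://127.0.0.1:2379 -listen-client-urls http://0.0.0.0:2379 --data-dir /etcd
    restart: always
    network_mode: host

  milvus:
    container_name: milvus-standalone
    image: milvusdb/milvus:v2.4.0
    command: ["milvus", "run", "standalone"]
    environment:
      ETCD_ENDPOINTS: localhost:2379
      MINIO_ADDRESS: oss.tlplay.cn:8999            # external MinIO address, without http/https
      MINIO_ACCESS_KEY_ID: YOUR_ACCESS_KEY         # replace with the real Access Key
      MINIO_SECRET_ACCESS_KEY: YOUR_SECRET_KEY     # replace with the real Secret Key
      MINIO_BUCKET_NAME: milvus-data
      MINIO_USE_SSL: "true"                        # use HTTPS
      MINIO_SSL_VERIFY: "false"                    # disabling verification is recommended for self-signed certificates
    volumes:
      - ./volumes/milvus:/var/lib/milvus
    depends_on:
      - etcd
    restart: always
    network_mode: host
```

4. Start the services

```bash
docker compose up -d
```

5. Check the status

```bash
docker compose ps
docker logs -f milvus-standalone
```

A log line such as `minio client init success` (or a similar message) means the connection to MinIO succeeded.

II. Attu GUI Deployment (Intranet Access)

1. Start the Attu container (host network mode)

```bash
docker run -d --name attu \
  --restart always \
  --network host \
  -e MILVUS_URL=127.0.0.1:19530 \
  -e PORT=8000 \
  zilliz/attu:latest
```

If the image pull fails, configure a Docker registry mirror first.

2. Open the firewall port

```bash
sudo ufw allow 8000/tcp
```

Also allow TCP port 8000 on the "Security" page of the BaoTa panel.

3. Access Attu

From a browser on the intranet, open http://<server-intranet-IP>:8000. When connecting, only the address is required: 127.0.0.1:19530 (for local access) or <server-IP>:19530. No username or password is needed.

III. BGE-M3 Embedding Service Deployment (Linux CPU)

1. Install Python 3.11 (if the system default is lower than 3.11)

```bash
sudo apt update
sudo apt install software-properties-common -y
sudo add-apt-repository ppa:deadsnakes/ppa -y
sudo apt update
sudo apt install python3.11 python3.11-venv python3.11-dev -y
```

2. Create a virtual environment and install the dependencies

```bash
mkdir -p /opt/bge-service
cd /opt/bge-service
python3.11 -m venv venv
source venv/bin/activate
pip install --upgrade pip
pip install FlagEmbedding fastapi "uvicorn[standard]" pydantic modelscope
```

3. Download the BGE-M3 model (via ModelScope)

```bash
python -c "from modelscope import snapshot_download; snapshot_download('BAAI/bge-m3', cache_dir='./models')"
```

The model is saved to /opt/bge-service/models/BAAI/bge-m3.

4. Create the service script embed_server.py

```python
import os
from typing import List

from fastapi import FastAPI, HTTPException
from FlagEmbedding import BGEM3FlagModel
from pydantic import BaseModel

app = FastAPI(title="BGE-M3 Embedding Service (CPU)")

model_path = "/opt/bge-service/models/BAAI/bge-m3"
if not os.path.exists(model_path):
    raise FileNotFoundError(f"Model not found: {model_path}")

# Load the model once at startup; fp16 is disabled because inference runs on CPU.
model = BGEM3FlagModel(model_path, use_fp16=False, device="cpu")


class EmbedRequest(BaseModel):
    texts: List[str]
    max_length: int = 512
    batch_size: int = 4


class EmbedResponse(BaseModel):
    embeddings: List[List[float]]


# Plain "def" so FastAPI runs the blocking inference in its thread pool
# instead of stalling the event loop (and /health) during encoding.
@app.post("/embed", response_model=EmbedResponse)
def embed(request: EmbedRequest):
    if not request.texts:
        raise HTTPException(status_code=400, detail="texts cannot be empty")
    output = model.encode(
        request.texts,
        batch_size=request.batch_size,
        max_length=request.max_length,
        return_dense=True,
        return_sparse=False,
    )
    dense_embs = output["dense_vecs"]
    return EmbedResponse(embeddings=dense_embs.tolist())


@app.get("/health")
async def health():
    return {"status": "ok", "device": "cpu"}
```

5. Register as a systemd service (start on boot, run in the background)

Create the service file:

```bash
sudo nano /etc/systemd/system/bge-embedding.service
```

Write the following content:

```ini
[Unit]
Description=BGE-M3 Embedding Service
After=network.target

[Service]
Type=simple
User=root
WorkingDirectory=/opt/bge-service
ExecStart=/opt/bge-service/venv/bin/uvicorn embed_server:app --host 0.0.0.0 --port 8081
Restart=always
RestartSec=10

[Install]
WantedBy=multi-user.target
```

Start the service:

```bash
sudo systemctl daemon-reload
sudo systemctl start bge-embedding
sudo systemctl enable bge-embedding
sudo systemctl status bge-embedding
```

6. Open port 8081 in the firewall

```bash
sudo ufw allow 8081/tcp
```

Allow TCP port 8081 in the BaoTa panel as well.

7. Test the service

```bash
curl -X POST http://127.0.0.1:8081/embed \
  -H "Content-Type: application/json" \
  -d '{"texts": ["测试文本"]}'
```

A JSON response containing a 1024-dimensional vector means the service works.

IV. Connecting BGE-M3 to a Milvus Knowledge Base (LangChain Example)

The service defined in embed_server.py accepts `{"texts": [...]}` and returns `{"embeddings": [...]}`, which differs from the Hugging Face inference API format that `HuggingFaceEndpointEmbeddings` sends and expects. The example therefore wraps the endpoint in a small class implementing LangChain's `Embeddings` interface.

```python
from typing import List

import requests
from langchain_core.embeddings import Embeddings
from langchain_community.vectorstores import Milvus


class BGEM3Embeddings(Embeddings):
    """Thin client for the /embed endpoint of embed_server.py."""

    def __init__(self, url: str = "http://127.0.0.1:8081/embed"):
        self.url = url

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        resp = requests.post(self.url, json={"texts": texts}, timeout=300)
        resp.raise_for_status()
        return resp.json()["embeddings"]

    def embed_query(self, text: str) -> List[float]:
        return self.embed_documents([text])[0]


embeddings = BGEM3Embeddings()
vector_store = Milvus(
    embedding_function=embeddings,
    connection_args={"host": "127.0.0.1", "port": "19530"},
    collection_name="product_knowledge",
)
# Documents can then be imported with vector_store.add_documents(splits)
```

V. Notes

- With `network_mode: host`, Milvus ignores port mappings; the container listens directly on host ports 19530 and 9091.
- Do not prefix the MinIO address with http:// or https://; the protocol is controlled by `MINIO_USE_SSL`.
- BGE-M3 inference on CPU is slow, roughly 200~500 ms per text. `max_length=512` and `batch_size=4` are recommended; production environments can move to a GPU.
- If the Milvus log shows "Your previous request to create the named bucket succeeded", this is a normal notice and needs no action.
- Service start order: start Milvus first, then the BGE-M3 embedding service.

VI. References

- Milvus official documentation: https://milvus.io/docs
- Attu GitHub: https://github.com/zilliztech/attu
- BGE-M3 model: https://huggingface.co/BAAI/bge-m3
- FlagEmbedding: https://github.com/FlagOpen/FlagEmbedding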
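
As a scripted counterpart to the curl test in section III (step 7), the following is a minimal sketch of a standard-library client that posts texts to the /embed endpoint and checks that each returned vector has the expected 1024 dimensions. The helper names `build_payload`, `check_embeddings`, and `embed` are illustrative and not part of the guide; the URL, field names, and defaults are taken from embed_server.py.

```python
import json
import urllib.request
from typing import List

EMBED_URL = "http://127.0.0.1:8081/embed"  # endpoint exposed by embed_server.py
EXPECTED_DIM = 1024  # dense vector size stated in the test step


def build_payload(texts: List[str], max_length: int = 512, batch_size: int = 4) -> bytes:
    """Serialize a request body matching the EmbedRequest model."""
    if not texts:
        raise ValueError("texts cannot be empty")
    body = {"texts": texts, "max_length": max_length, "batch_size": batch_size}
    return json.dumps(body, ensure_ascii=False).encode("utf-8")


def check_embeddings(response: dict, n_texts: int, dim: int = EXPECTED_DIM) -> List[List[float]]:
    """Return the vectors after verifying their count and dimension."""
    vectors = response["embeddings"]
    if len(vectors) != n_texts:
        raise ValueError(f"expected {n_texts} vectors, got {len(vectors)}")
    for i, vec in enumerate(vectors):
        if len(vec) != dim:
            raise ValueError(f"vector {i} has {len(vec)} dimensions, expected {dim}")
    return vectors


def embed(texts: List[str], url: str = EMBED_URL, timeout: float = 300.0) -> List[List[float]]:
    """POST texts to the embedding service and return validated vectors."""
    req = urllib.request.Request(
        url,
        data=build_payload(texts),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return check_embeddings(json.load(resp), n_texts=len(texts))


# Example call (requires the embedding service to be running):
#   vectors = embed(["测试文本"])
```

The generous default timeout reflects the slow CPU inference mentioned in the notes; it is an assumed value and can be lowered for short inputs.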