LocalAI Deployment
Introduction
LocalAI is a free, open-source alternative to OpenAI. It acts as a drop-in replacement REST API, compatible with the OpenAI API specification, for local inference. It requires no GPU, offers integrations for a variety of use cases, and lets you run LLMs, generate images, audio, and more, locally or on-premises, on consumer-grade hardware, with support for multiple model families.
Launch options
1. Linux AMD64 Docker launch (via the Helm chart)
```bash
helm repo add go-skynet https://go-skynet.github.io/helm-charts/
helm search repo go-skynet
helm pull go-skynet/local-ai
tar -xvf local-ai-3.1.0.tgz && cd local-ai
vim values.yaml   # uncomment the settings shown in the screenshot below
helm install --create-namespace local-ai . -n local-ai -f values.yaml
```
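After installation, you can check that the pods are running and forward the API port locally for a smoke test. A minimal sketch, assuming the chart installs a service named `local-ai` listening on port 8080 in the `local-ai` namespace (verify the actual names with `helm status` or `kubectl get svc`):

```bash
# Watch the pods come up (the first start may take a while if a model is downloaded)
kubectl get pods -n local-ai -w

# Check the service name and port actually created by the chart
kubectl get svc -n local-ai

# Forward the API port to localhost and hit the models endpoint
kubectl port-forward -n local-ai svc/local-ai 8080:8080
curl http://localhost:8080/v1/models
```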
2. Mac M2 manual launch
```bash
# install build dependencies
brew install abseil cmake go grpc protobuf wget

# clone the repo
git clone https://github.com/go-skynet/LocalAI.git
cd LocalAI

# build the binary
make build
# make BUILD_TYPE=metal build
## Set `gpu_layers: 1` to your YAML model config file and `f16: true`
## Note: only models quantized with q4_0 are supported!

# Download gpt4all-j to models/
wget https://gpt4all.io/models/ggml-gpt4all-j.bin -O models/ggml-gpt4all-j

# Use a template from the examples
cp -rf prompt-templates/ggml-gpt4all-j.tmpl models/

# Run LocalAI
./local-ai --models-path=./models/ --debug=true
```
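The build comments above mention setting `gpu_layers: 1` and `f16: true` in the model's YAML config when using the Metal build. A minimal sketch of such a config file; the file name `models/gpt4all-j.yaml` and the `name` field are illustrative:

```yaml
# models/gpt4all-j.yaml — hypothetical model config for the Metal build
name: gpt4all-j            # model name exposed through the API
parameters:
  model: ggml-gpt4all-j    # model file under --models-path
f16: true                  # enable 16-bit floats
gpu_layers: 1              # offload layers to the GPU (Metal)
```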
Usage
```bash
# Now the API is accessible at localhost:8080
curl http://localhost:8080/v1/models

curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
  "model": "ggml-gpt4all-j",
  "messages": [{"role": "user", "content": "How are you?"}],
  "temperature": 0.9
}'
```
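Because the API follows the OpenAI specification, other endpoints work the same way. For example, a plain text completion against `/v1/completions` (the prompt and temperature here are illustrative):

```bash
curl http://localhost:8080/v1/completions -H "Content-Type: application/json" -d '{
  "model": "ggml-gpt4all-j",
  "prompt": "A long time ago",
  "temperature": 0.7
}'
```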
Official build and launch documentation
FAQ
Q1: The build fails with the error log `sources/go-llama/llama.go:372:13: undefined: min`
```
binding.cpp:333:67: warning: format specifies type 'size_t' (aka 'unsigned long') but the argument has type 'int' [-Wformat]
binding.cpp:809:5: warning: deleting pointer to incomplete type 'llama_model' may cause undefined behavior [-Wdelete-incomplete]
sources/go-llama/llama.cpp/llama.h:60:12: note: forward declaration of 'llama_model'
# github.com/go-skynet/go-llama.cpp
sources/go-llama/llama.go:372:13: undefined: min
note: module requires Go 1.21
make: *** [backend-assets/grpc/llama] Error 1
```
A: You need Go 1.21. `min` became a language built-in in Go 1.21, so older toolchains cannot compile this code; the log itself says `note: module requires Go 1.21`. One way to get it is via gvm:
```bash
brew install mercurial   # gvm needs mercurial to fetch Go sources

# install gvm
bash < <(curl -s -S -L https://raw.githubusercontent.com/moovweb/gvm/master/binscripts/gvm-installer)
```
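With gvm installed, you can switch to a Go 1.21 toolchain and retry the build. A sketch, assuming the default gvm install location; the exact patch version is illustrative:

```bash
# load gvm into the current shell
source ~/.gvm/scripts/gvm

# install a Go 1.21 toolchain from a prebuilt binary and make it the default
gvm install go1.21.0 -B
gvm use go1.21.0 --default

go version   # should report go1.21.x
make build   # retry the LocalAI build
```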