
Commit 47d325f

Document Sync by Tina
1 parent c6a52bc commit 47d325f

File tree

2 files changed: +141 −1 lines changed

Lines changed: 140 additions & 0 deletions
@@ -0,0 +1,140 @@
# Docker Quickstart Guide

This guide will help you get started with the basics of using ServerlessLLM with Docker. Please make sure you have Docker installed on your system and have installed the ServerlessLLM CLI by following the [installation guide](./installation.md).

## Local Test Using Docker

First, let's start a local Docker-based Ray cluster to test ServerlessLLM.

### Step 1: Build Docker Images

Run the following commands to build the Docker images:

```bash
docker build . -t serverlessllm/sllm-serve
docker build -f Dockerfile.worker . -t serverlessllm/sllm-serve-worker
```
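
Optionally, confirm that both images were built before moving on (standard Docker commands; the tags match the ones used above):

```bash
# Both serverlessllm/sllm-serve and serverlessllm/sllm-serve-worker should be listed
docker images | grep "serverlessllm/sllm-serve"
```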

### Step 2: Configuration

Ensure that you have a directory for storing your models and set the `MODEL_FOLDER` environment variable to this directory:

```bash
export MODEL_FOLDER=path/to/models
```
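
Because this directory is bind-mounted into the worker containers in Step 3, it should already exist, and `--mount type=bind` requires an absolute path on the host. A minimal check, using only the variable set above:

```bash
# Create the directory if it is missing and normalize it to an absolute path
mkdir -p "$MODEL_FOLDER"
export MODEL_FOLDER=$(cd "$MODEL_FOLDER" && pwd)
echo "Models will be stored in: $MODEL_FOLDER"
```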

Also, check if the Docker network `sllm` exists and create it if it doesn't:

```bash
if ! docker network ls | grep -q "sllm"; then
  echo "Docker network 'sllm' does not exist. Creating network..."
  docker network create sllm
else
  echo "Docker network 'sllm' already exists."
fi
```

### Step 3: Start the Ray Head and Worker Nodes

Run the following commands to start the Ray head node and worker nodes:

#### Start Ray Head Node

```bash
docker run -d --name ray_head \
  --runtime nvidia \
  --network sllm \
  -p 6379:6379 \
  -p 8343:8343 \
  --gpus '"device=none"' \
  serverlessllm/sllm-serve

# Give the head node a few seconds to finish initializing
sleep 5
```
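
Before starting the workers, it is worth checking that the head container is still running and did not exit during startup (standard Docker commands, nothing ServerlessLLM-specific):

```bash
# The container should be listed with a STATUS of "Up ..."
docker ps --filter name=ray_head

# If it is missing or restarting, inspect the startup logs
docker logs ray_head
```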

#### Start Ray Worker Nodes

```bash
docker run -d --name ray_worker_0 \
  --runtime nvidia \
  --network sllm \
  --gpus '"device=0"' \
  --env WORKER_ID=0 \
  --mount type=bind,source=$MODEL_FOLDER,target=/models \
  serverlessllm/sllm-serve-worker

docker run -d --name ray_worker_1 \
  --runtime nvidia \
  --network sllm \
  --gpus '"device=2"' \
  --env WORKER_ID=1 \
  --mount type=bind,source=$MODEL_FOLDER,target=/models \
  serverlessllm/sllm-serve-worker
```
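
To verify that both workers joined the cluster, you can ask Ray for its cluster status from inside the head container (this assumes the `ray` CLI is on the PATH in the image, as it is in a standard Ray installation):

```bash
# Should report one head node plus two worker nodes and their GPU resources
docker exec ray_head sh -c "ray status"
```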

### Step 4: Start ServerlessLLM Serve

Run the following command to start the ServerlessLLM server:

```bash
docker exec ray_head sh -c "/opt/conda/bin/sllm-serve start"
```
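
This command runs in the foreground and keeps serving requests. From a second terminal, you can check that the API port is reachable; any HTTP response, even an error status, confirms the server is listening. The exact endpoints may differ, so treat this only as a connectivity check:

```bash
# Port 8343 was published from the head container in Step 3
curl -i http://localhost:8343/
```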

### Step 5: Deploy a Model Using sllm-cli

Open a new terminal, activate the `sllm` environment, and set the `LLM_SERVER_URL` environment variable:

```bash
conda activate sllm
export LLM_SERVER_URL=http://localhost:8343/
```

Deploy a model to the ServerlessLLM server using `sllm-cli`:

```bash
sllm-cli deploy --model "facebook/opt-2.7b"
```

> Note: This command may take some time, as it downloads the model from the Hugging Face Model Hub.
> You can use any model from the [Hugging Face Model Hub](https://huggingface.co/models) by specifying the model name in the `--model` argument.

Expected output:

```plaintext
INFO xx-xx xx:xx:xx deploy.py:36] Deploying model facebook/opt-2.7b with default configuration.
INFO xx-xx xx:xx:xx deploy.py:49] Model registered successfully.
```

### Step 6: Query the Model Using the OpenAI API

Now you can query the model with any OpenAI-compatible API client. For example, you can use the following `curl` command:

```bash
curl http://localhost:8343/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "facebook/opt-2.7b",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "What is your name?"}
    ]
  }'
```

Expected output:

```plaintext
{"id":"chatcmpl-8b4773e9-a98b-41db-8163-018ed3dc65e2","object":"chat.completion","created":1720183759,"model":"facebook/opt-2.7b","choices":[{"index":0,"message":{"role":"assistant","content":"system: You are a helpful assistant.\nuser: What is your name?\nsystem: I am a helpful assistant.\n"},"logprobs":null,"finish_reason":"stop"}],"usage":{"prompt_tokens":16,"completion_tokens":26,"total_tokens":42}}
```
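
Since the response is plain JSON, you can also pipe it through `jq` to extract just the assistant's reply rather than reading the raw payload (this assumes `jq` is installed on your machine; the field path follows the response format shown above):

```bash
# Same request as above, but print only the generated message text
curl -s http://localhost:8343/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "facebook/opt-2.7b",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "What is your name?"}
    ]
  }' | jq -r '.choices[0].message.content'
```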

### Cleanup

If you need to stop and remove the containers, you can use the following commands:

```bash
docker exec ray_head sh -c "ray stop"
docker exec ray_worker_0 sh -c "ray stop"
docker exec ray_worker_1 sh -c "ray stop"

docker stop ray_head ray_worker_0 ray_worker_1
docker rm ray_head ray_worker_0 ray_worker_1
docker network rm sllm
```
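
If you also want to reclaim the disk space used by the images built in Step 1, remove them as well (optional; standard Docker commands):

```bash
docker rmi serverlessllm/sllm-serve serverlessllm/sllm-serve-worker
```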

docs/stable/getting_started/installation.md

Lines changed: 1 addition & 1 deletion
@@ -20,6 +20,6 @@ cd Phantom-component
 ```
 conda create -n sllm python=3.10 -y
 conda activate sllm
-pip install -e .[worker]
+pip install -e ".[worker]"
 pip install -i https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple/ serverless_llm_store==0.0.1.dev3
 ```
