
Commit 9b94ac9

Document Sync by Tina
1 parent 864819d commit 9b94ac9

4 files changed: +29 −28 lines

docs/stable/getting_started/multi_machine_setup.md

Lines changed: 14 additions & 13 deletions
````diff
@@ -4,7 +4,7 @@ sidebar_position: 3
 
 # Multi-Machine Setup Guide
 
-This guide will help you get started with running ServerlessLLM on multiple machines by adding worker nodes on different machines, connecting them to the head node, and starting the `sllm-store-server` on the worker nodes. You can extend this setup to use as many nodes as you need. Please make sure you have installed the ServerlessLLM following the [installation guide](./installation.md) on all machines.
+This guide will help you get started with running ServerlessLLM on multiple machines by adding worker nodes on different machines, connecting them to the head node, and starting the `sllm-store` on the worker nodes. You can extend this setup to use as many nodes as you need. Please make sure you have installed the ServerlessLLM following the [installation guide](./installation.md) on all machines.
 
 ## Multi-Machine Setup
 
@@ -64,24 +64,25 @@ You can continue adding more worker nodes by repeating the above steps on additi
 
 ```bash
 conda activate sllm-worker
-sllm-store-server
+sllm-store start
 ```
 
 Expected output:
 
 ```bash
-TODO Run server...
+INFO 12-31 17:09:35 cli.py:58] Starting gRPC server
+INFO 12-31 17:09:35 server.py:34] StorageServicer: storage_path=./models, mem_pool_size=4294967296, num_thread=4, chunk_size=33554432, registration_required=False
 WARNING: Logging before InitGoogleLogging() is written to STDERR
-I20240724 06:46:25.054241 1337444 server.cpp:290] Log directory already exists.
-I20240724 06:46:25.199916 1337444 checkpoint_store.cpp:29] Number of GPUs: 4
-I20240724 06:46:25.200362 1337444 checkpoint_store.cpp:31] I/O threads: 4, chunk size: 32MB
-I20240724 06:46:25.326860 1337444 checkpoint_store.cpp:52] GPU 0 UUID: c9938b31-33b0-e02f-24c5-88bd6fbe19ad
-I20240724 06:46:25.472143 1337444 checkpoint_store.cpp:52] GPU 1 UUID: 3f4f72ef-ed7f-2ddb-e454-abcc6c0330b0
-I20240724 06:46:25.637110 1337444 checkpoint_store.cpp:52] GPU 2 UUID: 99b39a1b-5fdd-1acb-398a-426672ebc1a8
-I20240724 06:46:25.795079 1337444 checkpoint_store.cpp:52] GPU 3 UUID: c164f9d9-f157-daeb-d7be-5c98029c2a2b
-I20240724 06:46:25.795164 1337444 pinned_memory_pool.cpp:12] Creating PinnedMemoryPool with 1024 buffers of 33554432 bytes
-I20240724 06:46:40.843920 1337444 checkpoint_store.cpp:63] Memory pool created with 32GB
-I20240724 06:46:40.845937 1337444 server.cpp:262] Server listening on 0.0.0.0:8073
+I20241231 17:09:35.480175 2164266 checkpoint_store.cpp:41] Number of GPUs: 4
+I20241231 17:09:35.480214 2164266 checkpoint_store.cpp:43] I/O threads: 4, chunk size: 32MB
+I20241231 17:09:35.480228 2164266 checkpoint_store.cpp:45] Storage path: "./models"
+I20241231 17:09:35.662346 2164266 checkpoint_store.cpp:71] GPU 0 UUID: c9938b31-33b0-e02f-24c5-88bd6fbe19ad
+I20241231 17:09:35.838738 2164266 checkpoint_store.cpp:71] GPU 1 UUID: 3f4f72ef-ed7f-2ddb-e454-abcc6c0330b0
+I20241231 17:09:36.020437 2164266 checkpoint_store.cpp:71] GPU 2 UUID: 99b39a1b-5fdd-1acb-398a-426672ebc1a8
+I20241231 17:09:36.262537 2164266 checkpoint_store.cpp:71] GPU 3 UUID: c164f9d9-f157-daeb-d7be-5c98029c2a2b
+I20241231 17:09:36.262609 2164266 pinned_memory_pool.cpp:29] Creating PinnedMemoryPool with 128 buffers of 33554432 bytes
+I20241231 17:09:38.241055 2164266 checkpoint_store.cpp:83] Memory pool created with 4GB
+INFO 12-31 17:09:38 server.py:243] Starting gRPC server on 0.0.0.0:8073
 ```
 
 ### Step 4: Start ServerlessLLM Serve on the Head Node
````
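For reference, the defaults visible in the new expected output above (storage path `./models`, 4 GB memory pool) can also be set explicitly when starting the store on a worker. This is a minimal sketch, not part of the diff; it assumes the `--storage-path` and `--mem-pool-size` flags used in the `docs/stable/store/quickstart.md` change further down.

```bash
# Sketch only: start the store on a worker with the logged defaults made explicit.
# Flag spellings follow the store quickstart updated in this same commit.
conda activate sllm-worker
sllm-store start --storage-path ./models --mem-pool-size 4GB
```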

docs/stable/getting_started/quickstart.md

Lines changed: 11 additions & 11 deletions
````diff
@@ -29,22 +29,22 @@ And start ServerlessLLM Store server. This server will use `./models` as the sto
 ```bash
 conda activate sllm-worker
 export CUDA_VISIBLE_DEVICES=0
-sllm-store-server
+sllm-store start
 ```
 
 Expected output:
 ```bash
-$ sllm-store-server
-Run server...
+$ sllm-store start
+INFO 12-31 17:13:23 cli.py:58] Starting gRPC server
+INFO 12-31 17:13:23 server.py:34] StorageServicer: storage_path=./models, mem_pool_size=4294967296, num_thread=4, chunk_size=33554432, registration_required=False
 WARNING: Logging before InitGoogleLogging() is written to STDERR
-I20241111 16:34:14.856642 467195 server.cpp:333] Log directory already exists.
-I20241111 16:34:14.897728 467195 checkpoint_store.cpp:41] Number of GPUs: 1
-I20241111 16:34:14.897949 467195 checkpoint_store.cpp:43] I/O threads: 4, chunk size: 32MB
-I20241111 16:34:14.897960 467195 checkpoint_store.cpp:45] Storage path: "./models/"
-I20241111 16:34:14.972811 467195 checkpoint_store.cpp:71] GPU 0 UUID: c9938b31-33b0-e02f-24c5-88bd6fbe19ad
-I20241111 16:34:14.972856 467195 pinned_memory_pool.cpp:29] Creating PinnedMemoryPool with 128 buffers of 33554432 bytes
-I20241111 16:34:16.449775 467195 checkpoint_store.cpp:83] Memory pool created with 4GB
-I20241111 16:34:16.462957 467195 server.cpp:306] Server listening on 0.0.0.0:8073
+I20241231 17:13:23.947276 2165054 checkpoint_store.cpp:41] Number of GPUs: 1
+I20241231 17:13:23.947299 2165054 checkpoint_store.cpp:43] I/O threads: 4, chunk size: 32MB
+I20241231 17:13:23.947309 2165054 checkpoint_store.cpp:45] Storage path: "./models"
+I20241231 17:13:24.038651 2165054 checkpoint_store.cpp:71] GPU 0 UUID: c9938b31-33b0-e02f-24c5-88bd6fbe19ad
+I20241231 17:13:24.038700 2165054 pinned_memory_pool.cpp:29] Creating PinnedMemoryPool with 128 buffers of 33554432 bytes
+I20241231 17:13:25.557906 2165054 checkpoint_store.cpp:83] Memory pool created with 4GB
+INFO 12-31 17:13:25 server.py:243] Starting gRPC server on 0.0.0.0:8073
 ```
 
 Now, let’s start ServerlessLLM.
````
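The new expected output ends with the store's gRPC server listening on 0.0.0.0:8073. As an optional sanity check before starting ServerlessLLM (a sketch, not part of the documented steps), one can confirm the port is open:

```bash
# Sketch only: verify the store's gRPC port (8073, per the log above) is reachable locally.
nc -zv 127.0.0.1 8073
# or list listening sockets:
ss -tln | grep 8073
```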

docs/stable/store/installation_with_rocm.md

Lines changed: 2 additions & 2 deletions
````diff
@@ -61,7 +61,7 @@ python3 examples/sllm_store/save_transformers_model.py --model_name facebook/opt
 2. Start the `sllm-store` server
 
 ``` bash
-sllm-store-server
+sllm-store start
 ```
 
 3. Load the model and run the inference in another terminal
@@ -107,7 +107,7 @@ python3 examples/sllm_store/save_vllm_model.py --model_name facebook/opt-1.3b --
 2. Start the `sllm-store` server
 
 ``` bash
-sllm-store-server
+sllm-store start
 ```
 
 3. Load the model and run the inference in another terminal
````
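The surrounding steps in `installation_with_rocm.md` save a model with `examples/sllm_store/save_transformers_model.py`, start the store, and then load it from another terminal. A rough two-terminal sketch of that flow; the load script name and its arguments are illustrative assumptions, not taken from the diff:

```bash
# Terminal 1: start the store server (serves checkpoints saved under ./models by default)
sllm-store start

# Terminal 2: load the saved checkpoint and run inference.
# NOTE: script name and arguments below are illustrative assumptions; use the
# matching load example shipped under examples/sllm_store/ in the repository.
python3 examples/sllm_store/load_transformers_model.py --model_name facebook/opt-1.3b --storage_path ./models
```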

docs/stable/store/quickstart.md

Lines changed: 2 additions & 2 deletions
````diff
@@ -70,7 +70,7 @@ save_model(model, './models/facebook/opt-1.3b')
 2. Launch the checkpoint store server in a separate process:
 ```bash
 # 'mem_pool_size' is the maximum size of the memory pool in GB. It should be larger than the model size.
-sllm-store-server --storage_path $PWD/models --mem_pool_size 4
+sllm-store start --storage-path $PWD/models --mem-pool-size 4GB
 ```
 
 <!-- Running the server using a container:
@@ -145,7 +145,7 @@ After downloading the model, you can launch the checkpoint store server and load
 2. Launch the checkpoint store server in a separate process:
 ```bash
 # 'mem_pool_size' is the maximum size of the memory pool in GB. It should be larger than the model size.
-sllm-store-server --storage_path $PWD/models --mem_pool_size 4
+sllm-store start --storage-path $PWD/models --mem-pool-size 4GB
 ```
 
 3. Load the model in vLLM:
````
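The comment in the hunks above notes that the memory pool must be larger than the model being loaded: facebook/opt-1.3b is roughly 2.6 GB in fp16, so the 4GB pool suffices, while a larger checkpoint needs a larger value. A hedged sketch (the 16GB figure is purely illustrative, not from the docs):

```bash
# Sketch only: size the pool to exceed the largest checkpoint you will load.
# 16GB here is an illustrative value for a bigger model, not taken from the docs.
sllm-store start --storage-path $PWD/models --mem-pool-size 16GB
```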
