examples/deployment/tgi/README.md
25 additions & 21 deletions
@@ -1,11 +1,11 @@
 ---
 title: HuggingFace TGI
-description: "This example shows how to deploy Llama 3.1 to any cloud or on-premises environment using HuggingFace TGI and dstack."
+description: "This example shows how to deploy Llama 4 Scout to any cloud or on-premises environment using HuggingFace TGI and dstack."
 ---
 
 # HuggingFace TGI
 
-This example shows how to deploy Llama 3.1 8B with `dstack` using [HuggingFace TGI :material-arrow-top-right-thin:{ .external }](https://huggingface.co/docs/text-generation-inference/en/index){:target="_blank"}.
+This example shows how to deploy Llama 4 Scout with `dstack` using [HuggingFace TGI :material-arrow-top-right-thin:{ .external }](https://huggingface.co/docs/text-generation-inference/en/index){:target="_blank"}.
 
 ??? info "Prerequisites"
 Once `dstack` is [installed](https://dstack.ai/docs/installation), go ahead clone the repo, and run `dstack init`.
@@ -22,37 +22,43 @@ This example shows how to deploy Llama 3.1 8B with `dstack` using [HuggingFace T
 
 ## Deployment
 
-Here's an example of a service that deploys Llama 3.1 8B using TGI.
+Here's an example of a service that deploys [`Llama-4-Scout-17B-16E-Instruct` :material-arrow-top-right-thin:{ .external }](https://huggingface.co/meta-llama/Llama-4-Scout-17B-16E-Instruct){:target="_blank"} using TGI.

-2. Browse the [Llama 3.1](https://dstack.ai/examples/llms/llama31/), [vLLM](https://dstack.ai/examples/deployment/vllm/),
-and [NIM](https://dstack.ai/examples/deployment/nim/) examples
+2. Browse the [Llama](https://dstack.ai/examples/llms/llama/), [vLLM](https://dstack.ai/examples/deployment/vllm/), [SgLang](https://dstack.ai/examples/deployment/sglang/) and [NIM](https://dstack.ai/examples/deployment/nim/) examples
 3. See also [AMD](https://dstack.ai/examples/accelerators/amd/) and