Commit 120b8fd

Merge pull request #2 from google/kaz-04041903
add: Added audio and video support to the streaming quickstart
2 parents db6d62a + b89190b commit 120b8fd

1 file changed

+87
-32
lines changed

docs/get-started/quickstart-streaming.md

Lines changed: 87 additions & 32 deletions
@@ -1,6 +1,6 @@
11
# ADK Streaming Quickstart {#adk-streaming-quickstart}
22

3-
This Quickstart will guide you through installing ADK, setting up a basic "Google Search" agent, and building a simple asynchronous web app that uses the Streaming API and [FastAPI](https://fastapi.tiangolo.com/).
3+
With this quickstart, you'll learn to create a simple agent and use ADK Streaming to enable audio and video communication with it. We will install ADK, set up a basic "Google Search" agent, try running the agent with ADK Streaming using the `adk web` tool, and then explain how to build a simple asynchronous web app yourself using ADK Streaming and [FastAPI](https://fastapi.tiangolo.com/).
44

55
**Note:** This guide assumes you have experience using a terminal in a Windows, macOS, or Linux environment.
66

@@ -23,27 +23,22 @@ Install ADK:
2323
pip install google-adk
2424
```
2525

26-
**Note:** We recommend using a Python virtual environment.
27-
28-
## 2\. Project Structure {#2.-project-structure}
26+
## 2. Project Structure {#2.-project-structure}
2927

3028
Create the following folder structure with empty files:
3129

3230
```console
3331
adk-streaming/ # Project folder
34-
└── app/ # FastAPI web app folder
35-
|── main.py # FastAPI web app
36-
|── .env # Gemini API key
37-
├── static/ # Static content folder
38-
| └── index.html # The web client page
32+
└── app/ # the web app folder
33+
├── .env # Gemini API key
3934
└── google_search_agent/ # Agent folder
4035
├── __init__.py # Python package
4136
└── agent.py # Agent definition
4237
```
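
The folder structure above can also be created with a short script. This is a minimal sketch rather than part of the quickstart: the `scaffold` helper is a hypothetical name, and it only creates the empty files listed in the tree.

```python
from pathlib import Path


def scaffold(root: Path) -> list[Path]:
    """Create the quickstart layout (empty files) under `root`.

    `scaffold` is a hypothetical helper for illustration; the paths mirror
    the folder tree shown in the quickstart.
    """
    app = root / "adk-streaming" / "app"
    files = [
        app / ".env",                               # Gemini API key
        app / "google_search_agent" / "__init__.py",  # Python package
        app / "google_search_agent" / "agent.py",     # Agent definition
    ]
    for file in files:
        # Create missing parent folders, then an empty file if none exists.
        file.parent.mkdir(parents=True, exist_ok=True)
        file.touch(exist_ok=True)
    return files
```

Calling `scaffold(Path.cwd())` creates `adk-streaming/` in the current directory. Existing files are left untouched.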
4338

4439
### agent.py
4540

46-
Copy-paste the following code block to the [`agent.py`](http://agent.py). This is exactly the same code as the Quickstart guide earlier, except for the model name.
41+
Copy-paste the following code block into `agent.py`. Please note that ADK Streaming works only with the `gemini-2.0-flash-exp` model.
4742

4843
```py
4944
from google.adk.agents import Agent
@@ -73,7 +68,79 @@ Copy-paste the following code block to `__init__.py` and `main.py` files.
7368
from . import agent
7469
```
7570

76-
```py title="main.py"
71+
## 3\. Set Up Gemini API Key {#3.-setup-gemini-api-key}
72+
73+
To run your agent, you'll need to set up a Gemini API Key.
74+
75+
1. Get an API key from [Google AI Studio](https://aistudio.google.com/apikey).
76+
2. Inside your `app` directory, create a `.env` file.
77+
3. Add these lines to `.env`, replacing `YOUR_API_KEY_HERE` with your key:
78+
79+
**.env**
80+
81+
```
82+
GOOGLE_API_KEY=YOUR_API_KEY_HERE # Replace with your API Key
83+
GOOGLE_GENAI_USE_VERTEXAI=0
84+
```
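
Before launching the agent, it can help to confirm that `.env` contains the two settings above and that the placeholder was replaced. The following is a minimal sketch using only the standard library. `read_env` and `check_env` are hypothetical helpers, not part of ADK, and the parser assumes simple `KEY=VALUE` lines whose values contain no `#` character.

```python
from pathlib import Path


def read_env(path: Path) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, ignoring blank lines and # comments."""
    values: dict[str, str] = {}
    for raw in path.read_text().splitlines():
        line = raw.split("#", 1)[0].strip()  # drop a trailing comment
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        values[key.strip()] = value.strip()
    return values


def check_env(values: dict[str, str]) -> list[str]:
    """Return a list of problems found; an empty list means the file looks fine."""
    problems = []
    api_key = values.get("GOOGLE_API_KEY", "")
    if not api_key or api_key == "YOUR_API_KEY_HERE":
        problems.append("GOOGLE_API_KEY is missing or still the placeholder")
    if values.get("GOOGLE_GENAI_USE_VERTEXAI") != "0":
        problems.append("GOOGLE_GENAI_USE_VERTEXAI is not 0 as in this quickstart")
    return problems
```

For example, `check_env(read_env(Path(".env")))`, run from the `app` directory, returns an empty list when both settings are present.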
85+
86+
## 4. Try the agent with `adk web` {#4.-try-it-adk-web}
87+
88+
You are now ready to try the agent with the **dev UI**. First, set the current directory to `app`:
89+
90+
```
91+
cd app
92+
```
93+
94+
Then, run the dev UI:
95+
96+
```
97+
adk web
98+
```
99+
100+
Open the URL provided (usually `http://localhost:8000` or
101+
`http://127.0.0.1:8000`) **directly in your browser**. This connection stays
102+
entirely on your local machine. Select `basic_search_agent`.
103+
104+
### 📝 Try with text
105+
106+
Try the following prompts by typing them in the UI.
107+
108+
* What is the weather in New York?
109+
* What is the time in New York?
110+
* What is the weather in Paris?
111+
* What is the time in Paris?
112+
113+
The agent will use the google_search tool to get the latest information to answer those questions.
114+
115+
### 📝 Try with voice and video
116+
117+
Now, click the microphone button to enable voice input, and ask the same questions by voice. You will hear the answer spoken in real time.
118+
119+
Also, click the camera button to enable video input, and ask questions like "What do you see?". The agent will describe what it sees in the video input.
120+
121+
### Stop the tool
122+
123+
Stop `adk web` by pressing `Ctrl-C` on the console.
124+
125+
## 5. Building a Custom Streaming App (Optional) {#5.-build-custom-app}
126+
127+
We have confirmed that our basic search agent works with ADK Streaming. In the following sections, you will learn how to build your own web application capable of streaming communication using [FastAPI](https://fastapi.tiangolo.com/).
128+
129+
Add a `static` directory under `app`, and add `main.py` and `index.html` as empty files, as in the following structure:
130+
131+
```
132+
adk-streaming/ # Project folder
133+
└── app/ # the web app folder
134+
├── main.py # FastAPI web app
135+
└── static/ # Static content folder
136+
└── index.html # The web client page
137+
```
138+
139+
**main.py**
140+
141+
Copy-paste the following code block into the `main.py` file.
142+
143+
```py
77144
import os
78145
import json
79146
import asyncio
@@ -355,38 +422,26 @@ This HTML file sets up a basic webpage with:
355422
* Sends the text entered in the input field to the WebSocket server when the form is submitted.
356423
* Attempts to reconnect if the WebSocket connection closes.
357424

358-
## 3\. Setup Gemini API Key {#3.-setup-gemini-api-key}
425+
## 6\. Interact with Your Streaming App {#4.-interact-with-your-streaming-app}
359426

360-
To interact with your agent, you'll need to set up a Gemini API Key.
427+
1\. **Navigate to the Correct Directory:**
361428

362-
1. Get an API key from [Google AI Studio](https://aistudio.google.com/apikey).
363-
2. Inside your `app` directory, create a `.env` file.
364-
3. Add these lines to `.env`, replacing `YOUR_API_KEY_HERE` with your key:
429+
To run your agent, you need to be in the **app folder (`adk-streaming/app`)**.
365430

366-
```shell title=".env"
367-
GOOGLE_API_KEY=YOUR_API_KEY_HERE # Replace with your API Key
368-
GOOGLE_GENAI_USE_VERTEXAI=0
369-
```
370-
371-
## 4. Interact with Your Agent (FastAPI web app) {#4.-interact-with-your-agent-(fastapi-web-app)}
372-
373-
1. **Navigate to the Correct Directory:**
431+
2\. **Start the FastAPI app**: Run the following command to start the web server with uvicorn:
374432

375-
To run your agent effectively, you need to be in the **app folder (`adk-streaming/app`)**
376-
377-
1. Start the Fast API: Run the following command to start CLI interface with
378-
379-
```shell
433+
```
380434
uvicorn main:app --reload
381435
```
382436

383-
2. **Access the UI:** Once the UI server starts, the terminal will display a local URL (e.g., [http://localhost:8000](http://localhost:8501)). Click this link to open the UI in your browser. **\[hover-link\]** [[Ref](https://screenshot.googleplex.com/4vxZejAZ4hpa4Rx)\]
437+
3\. **Access the UI:** Once the server starts, the terminal will display a local URL (e.g., [http://localhost:8000](http://localhost:8000)). Click this link to open the UI in your browser.
438+
384439

385-
Now you should see the ADK dev UI like this:
440+
Now you should see the UI like this:
386441

387442
<img src="../../assets/adk-streaming.png" alt="ADK Streaming Test">
388443

389-
The agent will use Google Search to respond to your queries. You can send messages to the agent at any time, even while the agent is still responding. The agent's responses will appear incrementally, demonstrating the bidirectional communication capability of the Streaming API.
444+
Try asking a question such as `What is Gemini?`. The agent will use Google Search to respond to your queries. You will notice that the UI shows the agent's response as streaming text. You can also send messages to the agent at any time, even while the agent is still responding. This demonstrates the bidirectional communication capability of ADK Streaming.
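
The bidirectional pattern described above can be illustrated with plain `asyncio`, independent of ADK. The following is a toy sketch, not the ADK Streaming API: `toy_agent`, `client`, and `demo` are hypothetical names, and two in-memory queues stand in for the WebSocket connection. The point is that the client's sending and receiving run concurrently, so new messages can be queued while earlier replies are still being streamed.

```python
import asyncio


async def toy_agent(inbox: asyncio.Queue, outbox: asyncio.Queue) -> None:
    """Toy stand-in for an agent: streams each reply word by word."""
    while True:
        message = await inbox.get()
        if message is None:          # None marks the end of the conversation
            await outbox.put(None)
            return
        for word in f"echo {message}".split():
            await outbox.put(word)
            await asyncio.sleep(0)   # yield so the client can run in between


async def client(messages, inbox: asyncio.Queue, outbox: asyncio.Queue) -> list:
    """Send messages and collect streamed chunks at the same time."""

    async def send() -> None:
        for message in messages:
            await inbox.put(message)
        await inbox.put(None)

    async def receive() -> list:
        chunks = []
        while (chunk := await outbox.get()) is not None:
            chunks.append(chunk)
        return chunks

    # Sending and receiving run concurrently rather than in lockstep.
    _, chunks = await asyncio.gather(send(), receive())
    return chunks


async def demo(messages) -> list:
    inbox, outbox = asyncio.Queue(), asyncio.Queue()
    agent_task = asyncio.create_task(toy_agent(inbox, outbox))
    chunks = await client(messages, inbox, outbox)
    await agent_task
    return chunks
```

Running `asyncio.run(demo(["hi there", "bye"]))` returns the streamed chunks in arrival order. In the real app, the WebSocket connection plays the role of the two queues.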
390445

391446
Benefits over conventional synchronous web apps:
392447
