llama-stack-operator

This repo hosts a Kubernetes operator responsible for creating and managing Llama Stack servers.

Features

  • Automated deployment of Llama Stack servers
  • Support for multiple distributions (including Ollama, vLLM, and others)
  • Customizable server configurations
  • Volume management for model storage
  • Kubernetes-native resource management

Table of Contents

  • Features
  • Quick Start
    • Installation
    • Deploying the Llama Stack Server
  • Developer Guide
    • Prerequisites
    • Building the Operator
    • Deployment
    • Running E2E Tests
  • API Overview

Quick Start

Installation

You can install the operator directly from a released version or the latest main branch using kubectl apply -f.

To install the latest version from the main branch:

kubectl apply -f https://raw.githubusercontent.com/llamastack/llama-stack-k8s-operator/main/release/operator.yaml

To install a specific released version (e.g., v1.0.0), replace main with the desired tag:

kubectl apply -f https://raw.githubusercontent.com/llamastack/llama-stack-k8s-operator/v1.0.0/release/operator.yaml
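
Either command installs the CRDs and the operator itself. As a quick check, you can confirm the CRD was registered and the controller pod is running. Both names below are assumptions: the CRD name is inferred from the llamastack.io API group used in the sample CR later in this README, and the namespace follows common operator-sdk naming, so adjust them to whatever the release manifest actually creates:

# Confirm the CRD was registered (name inferred from the llamastack.io API group)
kubectl get crd llamastackdistributions.llamastack.io

# Confirm the controller pod is running (namespace name is an assumption)
kubectl get pods -n llama-stack-k8s-operator-system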

Deploying the Llama Stack Server

  1. Deploy the inference provider server (Ollama, vLLM, etc.). For example, to deploy a new Ollama server:
bash hack/deploy-ollama.sh
  2. Create a LlamaStackDistribution CR to get the server running. Example:
apiVersion: llamastack.io/v1alpha1
kind: LlamaStackDistribution
metadata:
  name: llamastackdistribution-sample
spec:
  replicas: 1
  server:
    distribution:
      name: ollama
    containerSpec:
      port: 8321
      env:
      - name: INFERENCE_MODEL
        value: "llama3.2:1b"
      - name: OLLAMA_URL
        value: "http://ollama-server-service.ollama-dist.svc.cluster.local:11434"
    storage:
      size: "20Gi"
      mountPath: "/home/lls/.lls"
  3. Verify the server pod is running in the user-defined namespace, as shown in the sketch below.
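
Assuming the sample above is saved to a file such as llamastackdistribution-sample.yaml (the filename is illustrative), a minimal sketch of applying and verifying it with standard kubectl commands:

# Apply the sample CR in the target namespace
kubectl apply -f llamastackdistribution-sample.yaml

# Check the distribution resource and the server pod it creates
# (the resource name follows from the kind shown in the sample)
kubectl get llamastackdistribution
kubectl get pods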

Developer Guide

Prerequisites

  • Kubernetes cluster (v1.20 or later)
  • Go version go1.23
  • operator-sdk v1.39.2 (v4 layout) or newer
  • kubectl configured to access your cluster
  • A running inference server:
    • For local development, you can use the provided script: /hack/deploy-ollama.sh
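
A quick way to confirm the tooling is available and the cluster is reachable before building:

    go version
    operator-sdk version
    kubectl version --client
    kubectl cluster-info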

Building the Operator

  • A custom operator image can be built from your local repository:

    make image IMG=quay.io/<username>/llama-stack-k8s-operator:<custom-tag>
    

    The default image, quay.io/llamastack/llama-stack-k8s-operator:latest, is used when no IMG argument is supplied to make image. The built image must also be reachable by the cluster; see the push note after this list.

  • Once the image is built, the operator can be deployed directly. For each deployment method, export a kubeconfig first:

    export KUBECONFIG=<path to kubeconfig>
    
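
If make image only builds the image locally (this depends on the repository's Makefile), push it to a registry the cluster can pull from before deploying, for example:

    # Push the custom image (assumes you are already logged in to the registry)
    docker push quay.io/<username>/llama-stack-k8s-operator:<custom-tag>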

Deployment

Deploying operator locally

  • Deploy the built image to your cluster using the following command:

    make deploy IMG=quay.io/<username>/llama-stack-k8s-operator:<custom-tag>
    
  • To remove the resources created during installation, use:

    make undeploy
    

Running E2E Tests

The operator includes end-to-end (E2E) tests to verify the complete functionality of the operator. To run the E2E tests:

  1. Ensure you have a running Kubernetes cluster
  2. Run the E2E tests using one of the following commands:
    • If you want to deploy the operator and run tests:
      make deploy test-e2e
      
    • If the operator is already deployed:
      make test-e2e
      

The make target handles prerequisites, including deploying the Ollama server.

API Overview

Please refer to the API documentation.
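
With the operator installed, the LlamaStackDistribution spec can also be inspected directly from a cluster using kubectl explain; the resource name below is inferred from the kind shown in the sample CR:

kubectl explain llamastackdistribution.spec
kubectl explain llamastackdistribution.spec.server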