Reference architecture and CI for Llama Stack on Red Hat OpenShift AI (RHOAI).
Work in Progress - This repository is actively evolving toward a production-ready reference architecture for Llama Stack on RHOAI. While core functionality is operational (deployment, authentication, RAG demos), we're continuously expanding components, refining kustomize overlays, and adding demo scripts to showcase Llama Stack capabilities in action.
- Reference Architecture: Production-ready deployment of Llama Stack using RHOAI components (VLLM, PostgreSQL, Milvus, Keycloak)
- Automated Testing: CI that validates deployments with example client scripts
- Integration Testing: Test RHOAI/ODH/upstream Llama Stack images through GitHub Actions
- Demo Scripts: Reusable examples (RAG, authentication) for downstream projects
Note: Documentation is intentionally kept minimal during early development to avoid rapid obsolescence. Use LLMs to explore the codebase and understand usage patterns.
┌─────────────────────────────────────────────────────┐
│ Llama Stack Distribution (CRD) │
│ ├─ Inference: VLLM (llama-3-2-3b) │
│ ├─ Embeddings: VLLM (nomic-embed-text-v1.5) │
│ ├─ Auth: Keycloak OAuth2 (RBAC + Team-based) │
│ ├─ Vector Store: Milvus (50Gi) │
│ └─ Storage: PostgreSQL (20Gi) │
└─────────────────────────────────────────────────────┘
CI/CD: GitHub Actions workflow tests full deployment lifecycle on ROSA with configurable image overrides for testing ODH/upstream builds.
- OpenShift CLI (oc)
- Container tool (podman or docker)
- Python 3.12+
- uv
Create the environment file (see config.sh.example for details):
cp config.sh.example ~/.lls_showroom
# Edit ~/.lls_showroom and set required values./setup.sh # Install RHOAI operator and dependencies
./provision.sh # Deploy Llama Stack distributionAfter provisioning, URLs and credentials are automatically saved to ~/.lls_showroom_generated:
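As a sketch, the generated file exports shell variables you can source directly; the key names and hostnames below are illustrative assumptions, not the real contents:

```shell
# Illustrative ~/.lls_showroom_generated (keys and hostnames are assumptions;
# inspect the real file for the actual names)
export LLAMASTACK_URL="https://llamastack-distribution.apps.example.com"
export KEYCLOAK_URL="https://keycloak.apps.example.com"
export SHOWROOM_USERNAME="demo-user"
export SHOWROOM_PASSWORD="changeme"
```

Run `source ~/.lls_showroom_generated` to load these into your shell before invoking demos by hand.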
# Run demos by tags (see demos/manifest.yaml for available tags)
./test.sh # Run all demos
./test.sh simple # Run simple demos only
./test.sh complex # Run complex demos (requires OpenAI API key)
./test.sh rag,api # Run demos tagged with 'rag' OR 'api'
# Available tags: simple, complex, rag, api, agents, storage, embeddings, openai-required

Or run individual demos directly:
uv run demos/rag/demo.py # RAG with S3 file storage and vector search
uv run demos/responses/demo.py # Multi-turn conversations with response tracking
uv run demos/responses/demo.py --prompt "What is RAG?" # Single-turn with custom question
./demos/tests/restarttest/restarttest.sh # Test response persistence across server restarts (requires `oc` cluster access)
uv run demos/multi_agent/demo.py # Multi-agent research assistant

With explicit parameters:
uv run demos/rag/demo.py <LLAMASTACK_URL> <KEYCLOAK_URL> <USERNAME> <PASSWORD>
uv run demos/responses/demo.py <LLAMASTACK_URL> <KEYCLOAK_URL> <USERNAME> <PASSWORD>

Note: The multi-agent demo requires SHOWROOM_OPENAI_API_KEY to be set in ~/.lls_showroom.
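The Keycloak parameters map onto a standard OAuth2 password grant. A minimal sketch of extracting the bearer token from the token-endpoint response (the realm and client_id in the comment are hypothetical, not taken from this repo):

```shell
# Parse access_token out of a Keycloak token-endpoint JSON response
get_access_token() {
  python3 -c 'import json, sys; print(json.load(sys.stdin)["access_token"])'
}

# In practice the JSON comes from something like (realm/client_id assumed):
#   curl -s "$KEYCLOAK_URL/realms/llama-stack/protocol/openid-connect/token" \
#     -d grant_type=password -d client_id=llama-stack \
#     -d "username=$USERNAME" -d "password=$PASSWORD"
echo '{"access_token":"abc123","token_type":"Bearer"}' | get_access_token  # → abc123
```

The resulting token is then sent as an `Authorization: Bearer <token>` header on requests to the Llama Stack URL.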
Test local LlamaStack code changes on the cluster for rapid iteration.
# 1. Clone llama-stack locally
git clone https://github.com/meta-llama/llama-stack ~/llama-stack
# 2. Configure
echo "export LLAMA_STACK_SOURCE_PATH=~/llama-stack" >> ~/.lls_showroom
# 3. Deploy your changes
./deploy-local.sh
# → Builds image, pushes to in-cluster registry, restarts pod, shows logs
# 4. Test your changes
curl https://$(oc get route llamastack-distribution -o jsonpath='{.spec.host}')/v1/health
# 5. Revert to official image when done
./provision.sh

Features: Uses in-cluster registry (no external accounts needed), auto-detects base image and handles authentication.
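A rough sketch of the loop deploy-local.sh automates, under the assumption that it builds from LLAMA_STACK_SOURCE_PATH, pushes to the in-cluster registry, and restarts the pod; image, namespace, and label names below are illustrative:

```shell
# Illustrative only; the real script also handles auth and base-image detection
REGISTRY=$(oc get route default-route -n openshift-image-registry \
  -o jsonpath='{.spec.host}')
IMG="$REGISTRY/redhat-ods-applications/llama-stack-dev:dev-$(date +%Y%m%d-%H%M%S)"

podman build -t "$IMG" "$LLAMA_STACK_SOURCE_PATH"
podman push --tls-verify=false "$IMG"

# Restart so the dev image is picked up (pod label/namespace as in troubleshooting below)
oc delete pod -l app=llama-stack -n redhat-ods-applications
```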
Add to ~/.lls_showroom:
| Variable | Default | Description |
|---|---|---|
| LLAMA_STACK_SOURCE_PATH | (required) | Path to local llama-stack repository |
| DEV_IMAGE_NAMESPACE | redhat-ods-applications | Namespace for images |
| DEV_IMAGE_NAME | llama-stack-dev | Image name |
| DEV_IMAGE_TAG | dev-YYYYMMDD-HHMMSS | Image tag (auto-generated) |
| DEV_BASE_IMAGE | (auto-detected) | Base image to use |
| CONTAINER_TOOL | podman | Container tool (podman/docker) |
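Only LLAMA_STACK_SOURCE_PATH is mandatory; the rest have defaults. A typical override block might look like this (values illustrative):

```shell
# Required
export LLAMA_STACK_SOURCE_PATH=~/llama-stack
# Optional overrides
export CONTAINER_TOOL=docker               # default: podman
export DEV_IMAGE_NAME=my-llama-stack-dev   # default: llama-stack-dev
```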
Registry authentication fails:
REGISTRY=$(oc get route default-route -n openshift-image-registry -o jsonpath='{.spec.host}')
podman login -u $(oc whoami) -p $(oc whoami -t) --tls-verify=false $REGISTRY

Registry route not available (requires cluster-admin):
oc patch configs.imageregistry.operator.openshift.io/cluster \
--type=merge -p '{"spec":{"defaultRoute":true}}'

Pod not using dev image:
# Check Kyverno policy exists
oc get clusterpolicy replace-rhoai-llama-stack-images
# Check pod image
oc get pod -l app=llama-stack -n redhat-ods-applications \
-o jsonpath='{.items[0].spec.containers[0].image}'

./unprovision.sh # Remove Llama Stack distribution
./cleanup.sh # Remove RHOAI operator and dependencies

CI workflow (.github/workflows/provision.yml) runs on PRs and supports image overrides:
- catalog_image: Custom RHOAI catalog source
- llama_stack_image: Custom Llama Stack distro image
- llama_stack_operator_image: Custom operator image
This enables testing ODH/upstream builds before they're released.
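If the workflow exposes these overrides as workflow_dispatch inputs (an assumption; check provision.yml), a manual run with custom images might look like:

```shell
# Trigger the CI workflow with image overrides via the GitHub CLI
# (input names assumed to match the override names above; requires repo access)
gh workflow run provision.yml \
  -f llama_stack_image=quay.io/example/llama-stack:pr-test \
  -f catalog_image=quay.io/example/rhoai-catalog:dev
```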
Contributions welcome in:
- Additional demo scripts (reuse from llama-stack-demos)
- Kustomize overlays to work towards a single refarch
- CI/CD improvements and test coverage