
Fix prow pipeline: in-cluster image build, RAG config, port-forward fix #2

Closed
are-ces wants to merge 17 commits into radofuchs:LCORE-1496-rhoai-version-bump from are-ces:LCORE-1496-rhoai-version-bump


Conversation


@are-ces are-ces commented Mar 29, 2026

Summary

  • Build llama-stack image in OpenShift internal registry (via oc new-build/oc start-build) instead of building from source in init containers
  • Add FAISS_VECTOR_STORE_ID and KV_RAG_PATH env vars to lightspeed-stack pod manifest
  • Add inference, byok_rag, and rag sections to all prow lightspeed-stack configs (using vllm provider)
  • Fix free_local_tcp_port to kill only LISTEN sockets — it was killing the behave test process via lsof -ti:8080 | kill -9 (Error 137)
  • Use scoped envsubst '${LLAMA_STACK_IMAGE}' in pipeline-services.sh to avoid clobbering other ${} patterns
  • Add image-puller role grant for default SA to pull from internal registry
  • Add MCP token secrets and empty OpenAI secret creation to pipeline.sh
  • Simplify llama-stack.yaml manifest to use pre-built image
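The scoped `envsubst '${LLAMA_STACK_IMAGE}'` behavior can be illustrated in Python (a sketch with a hypothetical manifest fragment, not the actual pipeline script; `string.Template.safe_substitute` replaces only the keys it is given and leaves every other `${…}` placeholder intact, which is the property the scoped envsubst call relies on):

```python
from string import Template

# Hypothetical manifest fragment: only LLAMA_STACK_IMAGE should be substituted,
# the ${CONFIG_PATH} placeholder must survive untouched.
manifest = 'image: ${LLAMA_STACK_IMAGE}\nargs: ["--config", "${CONFIG_PATH}"]'

rendered = Template(manifest).safe_substitute(
    LLAMA_STACK_IMAGE="image-registry.openshift-image-registry.svc:5000/my-ns/llama-stack:latest"
)

print(rendered)  # ${CONFIG_PATH} is left as-is for the pod to resolve later
```

An unscoped `envsubst` would have expanded every `${…}` it recognized, clobbering placeholders meant for later stages — the same failure mode the scoped variable list avoids.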

Test plan

  • Run full prow pipeline end-to-end on OpenShift cluster
  • Verify faiss disruption test completes without killing behave (no more Error 137)
  • Verify llama-stack pod pulls from internal registry successfully
  • Verify lightspeed-stack starts with RAG config (no FAISS_VECTOR_STORE_ID error)
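The port-freeing fix above can be sketched as a filter over `lsof`-style output lines (illustrative only — the real fix lives in a shell helper, and the sample lines below are made up):

```python
def pids_to_kill(lsof_lines: list[str]) -> list[int]:
    """Return PIDs only for sockets in LISTEN state, sparing clients of the port."""
    return [
        int(line.split()[1])  # PID is the second lsof column
        for line in lsof_lines
        if line.endswith("(LISTEN)")
    ]

sample = [
    "python3  4242 user  5u IPv4 ... TCP *:8080 (LISTEN)",
    "behave   1337 user  9u IPv4 ... TCP 127.0.0.1:51000->127.0.0.1:8080 (ESTABLISHED)",
]
print(pids_to_kill(sample))  # → [4242]; the behave client connection is left alone
```

The broken variant matched every PID touching port 8080, so the behave process holding a client connection was killed too, surfacing as exit code 137 (SIGKILL).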

🤖 Generated with Claude Code

radofuchs and others added 17 commits March 29, 2026 19:45
…re#1403)

* Add tekton file

---------

Co-authored-by: Radovan Fuchs <rfuchs@rfuchs-thinkpadp1gen7.tpb.csb>
)

* LCORE-1270: Added e2e tests for responses endpoint

* fix confusing variable names

---------

Co-authored-by: Radovan Fuchs <rfuchs@rfuchs-thinkpadp1gen7.tpb.csb>
* LCORE-1422: Inline RAG (BYOK) e2e tests

- Add 6 e2e scenarios: query, streaming query, responses API, streaming responses API, referenced documents, RAG source registration
- Add inline RAG configs for server, library, and prow modes
- Add reusable "The service uses the {config} configuration" Background step
- Fix BYOK RAG enrichment duplicate detection with env var references
- Add after_feature safety net to restore config backups

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Add comprehensive end-to-end tests verifying that Llama Stack's
NetworkConfig (proxy, TLS) works correctly through the Lightspeed
Stack pipeline.

Test infrastructure:
- TunnelProxy: Async HTTP CONNECT tunnel proxy that creates TCP
  tunnels for HTTPS traffic. Tracks CONNECT count and target hosts.
- InterceptionProxy: Async TLS-intercepting (MITM) proxy using
  trustme CA to generate per-target server certificates. Simulates
  corporate SSL inspection proxies.

Behave scenarios (tests/e2e/features/proxy.feature):
- Tunnel proxy: Configures run.yaml with NetworkConfig proxy pointing
  to a local tunnel proxy. Verifies CONNECT to api.openai.com:443
  is observed and the LLM query succeeds through the proxy.
- Interception proxy: Configures run.yaml with proxy and custom CA
  cert (trustme). Verifies TLS interception of api.openai.com traffic
  and successful LLM query through the MITM proxy.
- TLS version: Configures run.yaml with min_version TLSv1.2 and
  verifies the LLM query succeeds with the TLS constraint.

Each scenario dynamically generates a modified run-ci.yaml with the
appropriate NetworkConfig, restarts Llama Stack with the new config,
restarts the Lightspeed Stack, and sends a query to verify the full
pipeline.

Added trustme>=1.2.1 to dev dependencies.
Expand proxy e2e test coverage to fully address all acceptance criteria:

AC1 (tunnel proxy):
- Add negative test: LLM query fails gracefully when proxy is unreachable

AC2 (interception proxy with CA):
- Add negative test: LLM query fails when interception proxy CA is not
  provided (verifies "only successful when correct CA is provided")

AC3 (TLS version and ciphers):
- Add TLSv1.3 minimum version scenario
- Add custom cipher suite configuration scenario (ECDHE+AESGCM:DHE+AESGCM)

Test infrastructure:
- Add after_scenario cleanup hook in environment.py that restores
  original Llama Stack and Lightspeed Stack configs after @Proxy
  scenarios. Prevents config leaks between scenarios.
- Use different ports for each interception proxy instance to avoid
  address-already-in-use errors in sequential scenarios.

Documentation:
- Update docs/e2e_scenarios.md with all 7 proxy test scenarios.
- Update docs/e2e_testing.md with proxy-related Behave tags
  (@Proxy, @tunnelproxy, @InterceptionProxy, @TLSVersion, @tlscipher).
Changes requested by reviewer (tisnik) and CodeRabbit:

- Detect Docker mode once in before_all and store as
  context.is_docker_mode. All proxy step functions now use the
  context attribute instead of calling _is_docker_mode() repeatedly.
- Log exception in _restore_original_services instead of silently
  swallowing it.
- Only clear context.services_modified on successful restoration,
  not when restoration fails (prevents leaking modified state).
- Add 10-second timeout to tunnel proxy open_connection to prevent
  stalls on unreachable targets.
- Handle malformed CONNECT port with ValueError catch and 400
  response.
Move config restoration from @Proxy after_scenario hook to an
explicit Background Given step. This follows the team convention
that tags are used only for test selection (filtering), not for
triggering behavior.

The Background step "The original Llama Stack config is restored
if modified" runs before every scenario. If a previous scenario
left a modified run.yaml (detected by backup file existence), it
restores the original and restarts services. This handles cleanup
even when the previous scenario failed mid-way.

Removed:
- @Proxy tag from feature file (was triggering after_scenario hook)
- after_scenario hook for @Proxy in environment.py
- _restore_original_services function (replaced by Background step)
- context.services_modified tracking (no hook reads it)

Updated docs/e2e_testing.md: tags documented as selection-only,
not behavior-triggering.
Rewrite proxy e2e tests to follow project conventions:

- Reuse existing step definitions: use "I use query to ask question"
  from llm_query_response.py and "The status code of the response is"
  from common_http.py instead of custom query/response steps.
- Split service restart into two explicit Given steps: "Llama Stack
  is restarted" and "Lightspeed Stack is restarted" so restart
  ordering is visible in the feature file.
- Remove local (non-Docker) mode code path. Proxy tests use
  restart_container() exclusively, consistent with the rest of the
  e2e test suite.
- Check specific status code 500 for error scenarios instead of the
  broad >= 400 range.
- Remove custom send_query, verify_llm_response, and
  verify_error_response steps that duplicated existing functionality.

Net reduction: -183 lines from step definitions.
Stop proxy servers and their event loops explicitly in the Background
restore step. Previously, proxy daemon threads were left running after
each scenario, causing asyncio "Task was destroyed but it is pending"
warnings at process exit.

The _stop_proxy helper schedules an async stop on the proxy's event
loop, waits for it to complete, then stops the loop. Context
references are cleared so the next scenario starts clean.
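The stop sequence — schedule an async stop on the proxy's loop, wait for it to complete, then stop the loop — can be sketched with a bare event loop in a daemon thread (a minimal standalone sketch; the real `_stop_proxy` helper also closes the proxy server inside the coroutine):

```python
import asyncio
import threading

loop = asyncio.new_event_loop()
thread = threading.Thread(target=loop.run_forever, daemon=True)
thread.start()

async def async_stop() -> None:
    # In the real helper this closes the proxy server and awaits its shutdown.
    await asyncio.sleep(0)

# Schedule the async stop on the proxy's loop and block until it finishes.
future = asyncio.run_coroutine_threadsafe(async_stop(), loop)
future.result(timeout=5)

# Only then stop the loop itself, letting the daemon thread exit cleanly.
loop.call_soon_threadsafe(loop.stop)
thread.join(timeout=5)
```

Stopping the loop before the async shutdown completes is what produces the "Task was destroyed but it is pending" warnings; the `future.result()` barrier guarantees the ordering.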
Add proxy cleanup in after_feature to stop proxy servers left running
from the last scenario. The Background restore step handles cleanup
between scenarios, but the last scenario's proxies persist until
process exit, causing asyncio "Task was destroyed" warnings.

The cleanup checks for proxy objects on context (no tag check needed)
and calls _stop_proxy to gracefully shut down the event loops.
Fixes ruff I001 linter error by reordering imports in `src/app/endpoints/responses.py`.

Signed-off-by: Anik Bhattacharjee <anbhatta@redhat.com>
Signed-off-by: red-hat-konflux-kflux-prd-rh02 <190377777+red-hat-konflux-kflux-prd-rh02[bot]@users.noreply.github.com>
- Add -n $NAMESPACE to all oc commands in pipeline-services.sh, pipeline-test-pod.sh
- Export NAMESPACE from pipeline.sh so child scripts inherit it
- Add MCP token secrets and empty OpenAI secret creation to pipeline.sh
- Replace pre-built llama-stack image with build-from-source manifest (llama-stack.yaml)
- Use envsubst with explicit variable list for lightspeed-stack image substitution
- Fix e2e-ops.sh restart-llama-stack: delete+recreate pod instead of oc apply (immutable spec)
- Separate kv_default and kv_rag db paths in prow run.yaml config

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…, fix port-forward kill

- Build llama-stack image in OpenShift internal registry instead of from source
- Add image-puller role for default SA to pull from internal registry
- Add FAISS_VECTOR_STORE_ID and KV_RAG_PATH env vars to lightspeed-stack pod
- Add inference, byok_rag, and rag sections to prow lightspeed-stack configs
- Use envsubst with specific variable scoping in pipeline-services.sh
- Fix free_local_tcp_port to only kill LISTEN sockets (was killing behave process)
- Add MCP token secrets and empty OpenAI secret to pipeline.sh
- Add rlsapi_v1_infer action to prow RBAC config
- Simplify llama-stack.yaml to use pre-built image with init container for RAG data

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@are-ces are-ces force-pushed the LCORE-1496-rhoai-version-bump branch from 06ce083 to 538071c on March 29, 2026 17:45
@are-ces are-ces closed this Mar 29, 2026
radofuchs pushed a commit that referenced this pull request May 1, 2026
Add the spike doc (decisions up front, background below, 4 proposed
JIRAs) and the spec doc (R1..R6 requirements, architecture, key files
and insertion points, known limitations) under docs/design/byok-pdf/.

The spike is lightweight by design: HTML support shipped under
LCORE-1035 (commit 7f688b0, 2026-01-15), so the architectural pattern,
docling dependency, BaseReader plumbing, CLI shape, and test layout
are all already established. PDF support is a scaffold-and-mirror job
plus a one-line addition to document_processor.py's doc_type branches.

Decisions captured for confirmation (each with options table and
recommendation in the spike doc):

  D1: Library                -- docling (already a dependency)
  D2: OCR for scanned PDFs   -- out of scope; track as follow-up
  D3: Repo placement          -- rag-content (impl) + lightspeed-stack
                                  (BYOK guide update only)
  D4: Pipeline knobs          -- hard-coded sensible defaults; no CLI
                                  flags in v1 (mirrors HTMLReader)
  D5: Chunking strategy       -- reuse MarkdownNodeParser; add "pdf" to
                                  document_processor.py:75 and :87
  D6: Code organization       -- new pdf/ package mirroring html/
  D7: Test coverage           -- unit/integration in JIRA #2, e2e in #3

Four sub-JIRAs proposed under LCORE-1471 (parseable by
dev-tools/file-jiras.sh):

  1. rag-content: Implement PDF support
  2. rag-content: Unit and integration tests
  3. rag-content: End-to-end test (PDF -> vector store -> stack query)
  4. lightspeed-stack: Update BYOK guide for native PDF support

PoC evidence under poc-results/:

  01-poc-report.txt    Methodology, findings, implications
  02-conversion-log.txt  Exact commands and timings
  03-sample-jira-1311.md  Clean conversion (Atlassian Cloud PDF)
  04-sample-jira-836.md   Body clean, headings degraded
                          (Confluence PDF, letter-spaced display font)

Honest PoC findings worth surfacing:

- No new dependencies are needed (docling is already in pyproject.toml).
- Body text and tables convert cleanly to Markdown.
- MarkdownNodeParser handles the output -- no parallel chunking pipeline.
- Letter-spaced display fonts (typical of Confluence "Export to PDF")
  produce noisy heading text; documented as a v1 known limitation.
- Cold model load is ~5 minutes on CPU; warm conversions ~30-90 s for
  small/medium PDFs. Acceptable for offline indexing.

Per howto-run-a-spike.md step 10, poc/ and poc-results/ will be
removed before merge; spike doc and spec doc remain in the repo.
radofuchs pushed a commit that referenced this pull request May 1, 2026
Apply CodeRabbit's actionable comments and the per-comment nits:

1. PoC results section in spike doc previously listed paths under
   poc-results/ that are deleted before merge per howto-run-a-spike.md
   step 10, leaving broken links in the merged document. Replace the
   file list with a self-contained summary of what the PoC proved
   plus the heading-degradation finding, and a note pointing future
   readers at the PR diff if the raw artifacts are ever needed.

2. Drop the reference to docs/local-stack-testing.md (a local-only
   file, never committed to the repo).

3. Replace fragile line-numbered references (document_processor.py:75,
   :87, byok_guide.md ~106-118) with stable symbol anchors:
   _BaseDB.__init__, _LlamaStackDB.__init__, "Knowledge Sources"
   subsection, "Step 1" subsection. Line numbers rot; section names
   and symbol names rot less.

4. Spec doc now instructs the implementation ticket to extract the
   ("markdown", "html", "pdf") predicate to a single
   MARKDOWN_COMPATIBLE_DOC_TYPES: Final[tuple[str, ...]] constant in
   document_processor.py and reference it from both call sites,
   instead of duplicating the tuple. JIRA #1 scope updated to match.

5. Add R7: PDFReader.load_data emits a logger.warning when its docling
   output is empty / under a small threshold (a likely indicator of a
   scanned PDF given R5's no-OCR scope). Threshold is a module-level
   Final[int] constant. JIRA #1 scope and JIRA #2 test patterns
   updated to require coverage via caplog. Surfacing the silent-
   degradation case in custom_processor.py logs costs nothing and
   makes the OCR-needed signal visible.
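The constant extraction in item 4 could look like this (a sketch of the spec's suggestion, not merged code; `is_markdown_compatible` is a hypothetical helper name, the constant name follows the commit message):

```python
from typing import Final

# Single source of truth for doc types routed through MarkdownNodeParser,
# replacing the tuple previously duplicated at two call sites.
MARKDOWN_COMPATIBLE_DOC_TYPES: Final[tuple[str, ...]] = ("markdown", "html", "pdf")

def is_markdown_compatible(doc_type: str) -> bool:
    """Hypothetical predicate referenced from both call sites."""
    return doc_type in MARKDOWN_COMPATIBLE_DOC_TYPES
```

With one `Final` constant, adding a future doc type is a one-line change instead of two tuples drifting apart.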

Plus the two reviewer nits worth carrying into JIRA #1:

- Use docling's TableFormerMode.ACCURATE enum, not the string literal
  "accurate"; both work via Pydantic coercion but the enum is
  type-checked.
- Mirror HTMLReader's choice on whether to call super().__init__();
  llama-index's BaseReader does not require it but symmetry between
  the two readers is preferred.

The spec doc changelog records this revision and its trigger (the
PR lightspeed-core#1598 CodeRabbit review).
radofuchs pushed a commit that referenced this pull request May 13, 2026
radofuchs pushed a commit that referenced this pull request May 13, 2026


5 participants