feat: add OCI Generative AI embeddings provider #4966

Draft

fede-kamel wants to merge 11 commits into crewAIInc:main from fede-kamel:feat/oci-embeddings

Conversation

@fede-kamel

Summary

  • OCIEmbeddingFunction: ChromaDB-compatible embedding callable with batching, config serialization, and image embedding support
  • OCIProvider: Pydantic-based provider with AliasChoices for env var and config key validation
  • Factory registration in embeddings/factory.py + types.py (provider name: "oci")
  • Supports text and image embeddings, configurable output dimensions, custom endpoints, all 4 OCI auth modes
  • Shared auth via utilities/oci.py (from PR 1)

Depends on #4964, #4963, #4962, #4961, #4959. Draft until all merge.
Tracking issue: #4944
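
The batching behavior described in the summary can be sketched as a minimal ChromaDB-style callable. This is an illustrative stand-in, assuming a backend that embeds one batch of texts at a time; `BatchingEmbeddingFunction` and its parameters are hypothetical names, not the actual OCIEmbeddingFunction API.

```python
from typing import Callable, Sequence

class BatchingEmbeddingFunction:
    """ChromaDB-style callable that splits input into fixed-size batches."""

    def __init__(self, embed_batch: Callable[[Sequence[str]], list[list[float]]],
                 batch_size: int = 96):
        self._embed_batch = embed_batch
        self._batch_size = batch_size

    def __call__(self, input: Sequence[str]) -> list[list[float]]:
        # ChromaDB calls the function with `input` and expects one vector
        # per document, in the original order.
        vectors: list[list[float]] = []
        for start in range(0, len(input), self._batch_size):
            vectors.extend(self._embed_batch(input[start:start + self._batch_size]))
        return vectors

# Toy backend: embeds each text as [length] so batching stays observable.
fn = BatchingEmbeddingFunction(lambda batch: [[float(len(t))] for t in batch],
                               batch_size=2)
print(fn(["a", "bb", "ccc"]))  # [[1.0], [2.0], [3.0]]
```

Re-concatenating the per-batch results preserves the one-vector-per-document contract ChromaDB relies on.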

Diff breakdown (vs multimodal PR)

| Change | Lines |
| --- | --- |
| embedding_callable.py (new) | +181 |
| oci_provider.py (new) | +75 |
| types.py (new) | +30 |
| __init__.py (new) | +17 |
| factory.py (registration + overload) | +9 |
| types.py (ProviderSpec + Literal) | +3 |
| test_factory_oci.py (5 unit tests) | +207 |
| test_oci_embedding_integration.py (2 live tests) | +66 |
| **Total** | **+588** |

Test plan

  • 5 unit tests: factory wiring, batching, output dimensions, config serialization, image embeddings
  • 2 live integration tests against cohere.embed-english-v3.0 (API_KEY_AUTH): single text + batch
  • All 42 prior LLM unit tests still pass

Add native OCI Generative AI support to CrewAI with basic text
completion for generic (Meta, Google, OpenAI, xAI) and Cohere model
families. This is the first in a series of PRs to incrementally
build out full OCI support (streaming, tool calling, structured
output, embeddings, and multimodal in follow-up PRs).

Tracking issue: crewAIInc#4944
Supersedes: crewAIInc#4885

Tool calling is not implemented in this PR. Returning True would
cause CrewAI to choose the native tools path, silently dropping
tools from agents. Flagged by Cursor Bugbot review.

Both methods are unnecessary in this PR. The base class and callers
already default correctly when the methods are absent:
- supports_function_calling: callers use getattr with a False default
- supports_stop_words: the base class already returns True

These will be added back in the tool-calling follow-up PR.
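
The defaults described above can be demonstrated with a small sketch; BaseLLM and MinimalProvider here are illustrative stand-ins, not CrewAI's actual classes.

```python
class BaseLLM:
    def supports_stop_words(self) -> bool:
        return True  # safe default when a provider does not override

class MinimalProvider(BaseLLM):
    pass  # defines neither optional capability method

provider = MinimalProvider()

# Callers probe function calling with getattr and a False default...
supports_tools = getattr(provider, "supports_function_calling", lambda: False)()
# ...and stop-word support falls back to the base class.
print(supports_tools, provider.supports_stop_words())  # False True
```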
Remove the json and re imports and the _OCI_SCHEMA_NAME_PATTERN regex,
which are only needed for structured output (not in this PR's scope).

Use model_lower instead of model in the dot check to match the
convention used by all other providers in _matches_provider_pattern.
Flagged by Cursor Bugbot.
Add streaming text completion via OCI SSE events:
- stream=True in call() routes to _stream_call_impl with chunk events
- iter_stream() yields raw text chunks (sync generator)
- astream() wraps iter_stream via thread+queue for async callers
- _stream_chat_events holds client lock for full stream duration
- SSE event parsing handles both string and mapping payloads

Tested live against meta.llama-3.3-70b-instruct,
cohere.command-r-plus-08-2024, google.gemini-2.5-flash,
and openai.gpt-5.2-chat-latest.

Depends on: crewAIInc#4959
Tracking issue: crewAIInc#4944
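
The thread+queue bridge that astream() uses can be sketched as below. iter_stream here is a toy sync generator standing in for the real SSE stream, and all names are illustrative rather than the provider's actual API.

```python
import asyncio
import queue
import threading
from typing import AsyncIterator, Iterator

_SENTINEL = object()

def iter_stream() -> Iterator[str]:
    # Stands in for the sync generator that yields raw text chunks.
    yield from ["Hel", "lo"]

async def astream() -> AsyncIterator[str]:
    q: "queue.Queue[object]" = queue.Queue()

    def pump() -> None:
        try:
            for chunk in iter_stream():
                q.put(chunk)
        finally:
            q.put(_SENTINEL)  # always unblock the consumer, even on error

    threading.Thread(target=pump, daemon=True).start()
    loop = asyncio.get_running_loop()
    while True:
        # Block in the executor so the event loop stays responsive.
        item = await loop.run_in_executor(None, q.get)
        if item is _SENTINEL:
            return
        yield item

async def main() -> str:
    return "".join([c async for c in astream()])

print(asyncio.run(main()))  # Hello
```

The sentinel in the finally block is what guarantees the async side terminates even if the sync stream raises mid-flight.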
Add native function calling for generic and Cohere model families:
- _format_tools converts CrewAI tool specs to OCI SDK format
- _extract_tool_calls normalizes responses back to CrewAI shape
- _handle_tool_calls executes tools and recurses until model finishes
- Cohere tool message handling with trailing tool results
- Tool choice control (auto/none/required/function)
- Passthrough parameter filtering via SDK introspection
- Streaming tool call accumulation from SSE fragments
- supports_function_calling() returns True

Tested live against meta.llama-3.3-70b-instruct with
raw tool call return and recursive tool execution.

Depends on: crewAIInc#4961 (streaming), crewAIInc#4959 (basic text)
Tracking issue: crewAIInc#4944
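
A minimal sketch of the _format_tools idea, assuming OpenAI-style input specs; the output field names are hypothetical, not the OCI SDK's actual classes.

```python
def format_tools(tools: list[dict]) -> list[dict]:
    """Normalize tool specs into a flat, provider-friendly shape."""
    formatted = []
    for tool in tools:
        # Accept both {"function": {...}} wrappers and bare specs.
        fn = tool.get("function", tool)
        formatted.append({
            "name": fn["name"],
            "description": fn.get("description", ""),
            "parameters": fn.get("parameters",
                                 {"type": "object", "properties": {}}),
        })
    return formatted

specs = [{"type": "function",
          "function": {"name": "get_weather",
                       "description": "Look up weather",
                       "parameters": {"type": "object",
                                      "properties": {"city": {"type": "string"}}}}}]
print(format_tools(specs)[0]["name"])  # get_weather
```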
Add response_model (Pydantic) support for structured output:
- _build_response_format converts Pydantic schema to OCI
  JsonSchemaResponseFormat (generic) or CohereResponseJsonFormat
- _parse_structured_response validates and returns typed models
- response_model threaded through call, _call_impl, _stream_call_impl,
  and _handle_tool_calls for full coverage
- Handles JSON in markdown fences via base class _validate_structured_output

Tested live against meta.llama-3.3-70b-instruct and
google.gemini-2.5-flash.

Depends on: crewAIInc#4962 (tool calling), crewAIInc#4961 (streaming), crewAIInc#4959 (basic text)
Tracking issue: crewAIInc#4944
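
The markdown-fence handling can be sketched with stdlib tools. This approximates the behavior attributed to _validate_structured_output and is not CrewAI's actual implementation.

```python
import json
import re

FENCE_MARK = "`" * 3  # literal triple-backtick fence

# Lazily capture whatever sits between an opening (optionally ```json)
# fence and the closing fence.
_FENCE = re.compile(FENCE_MARK + r"(?:json)?\s*(.*?)\s*" + FENCE_MARK,
                    re.DOTALL)

def parse_structured_response(raw: str) -> dict:
    match = _FENCE.search(raw)
    payload = match.group(1) if match else raw  # bare JSON passes through
    return json.loads(payload)

reply = FENCE_MARK + 'json\n{"city": "Lima", "temp_c": 19}\n' + FENCE_MARK
print(parse_structured_response(reply))  # {'city': 'Lima', 'temp_c': 19}
```

A real implementation would go on to validate the parsed dict against the caller's Pydantic response_model.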
Add multimodal content handling for generic model families:
- vision.py: model lists, data URI helpers, image encoding utilities
- _build_generic_content handles image_url, document_url, video_url,
  audio_url content types mapped to OCI SDK content objects
- _message_has_multimodal_content detects non-text payloads
- Cohere models reject multimodal with clear error message
- supports_multimodal() returns True

Depends on: crewAIInc#4963, crewAIInc#4962, crewAIInc#4961, crewAIInc#4959
Tracking issue: crewAIInc#4944
Send a 2x2 red PNG to google.gemini-2.5-flash via data URI and
verify it identifies the color. Tests the full image_url content
pipeline end-to-end against a live OCI vision model.
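
The data URI step can be sketched as follows; to_data_uri is a hypothetical helper, and the payload below is placeholder bytes rather than a real 2x2 PNG.

```python
import base64

def to_data_uri(image_bytes: bytes, mime: str = "image/png") -> str:
    """Encode raw image bytes as a data: URI for an image_url content entry."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime};base64,{encoded}"

uri = to_data_uri(b"\x89PNG placeholder")  # not a valid PNG, just a demo
print(uri.startswith("data:image/png;base64,"))  # True
```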
Add OCI embedding support integrated with CrewAI's RAG pipeline:
- OCIEmbeddingFunction: ChromaDB-compatible embedding callable with
  batching, config serialization, image embedding support
- OCIProvider: Pydantic-based provider with alias validation for
  env vars and config keys
- Factory registration in embeddings/factory.py + types.py
- Supports text and image embeddings, output dimensions,
  custom endpoints, all 4 OCI auth modes

Tested live against cohere.embed-english-v3.0 with API_KEY auth.

Depends on: crewAIInc#4964, crewAIInc#4963, crewAIInc#4962, crewAIInc#4961, crewAIInc#4959
Tracking issue: crewAIInc#4944
fede-kamel added a commit to fede-kamel/crewAI that referenced this pull request Mar 20, 2026
Replace asyncio.to_thread wrappers with true async I/O using aiohttp
for acall() and astream(). The OCI SDK is sync-only, so we bypass it
for HTTP and use its signer for request authentication directly.

- oci_async.py: OCIAsyncClient with aiohttp, OCI request signing,
  native SSE parsing, connection pooling
- acall(): true async chat completion (no thread pool)
- astream(): true async SSE streaming (no thread+queue bridge)
- Graceful fallback to asyncio.to_thread when aiohttp unavailable
  or client is mocked (unit tests)
- aiohttp + certifi added to crewai[oci] optional deps

Temporary measure until OCI SDK ships native async support.

Tested live: acall, astream, and concurrent acall against
meta.llama-3.3-70b-instruct with API_KEY auth.

Depends on: crewAIInc#4966, crewAIInc#4964, crewAIInc#4963, crewAIInc#4962, crewAIInc#4961, crewAIInc#4959
Tracking issue: crewAIInc#4944
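
The graceful-fallback pattern can be sketched like this; sync_chat and acall are illustrative stand-ins for the provider's methods, not its real code.

```python
import asyncio

def sync_chat(prompt: str) -> str:
    # Stands in for the synchronous OCI SDK chat call.
    return f"echo: {prompt}"

async def acall(prompt: str, native_async=None) -> str:
    # Prefer a native-async implementation (e.g. an aiohttp-backed client)
    # when one is available; otherwise push the sync call onto a worker
    # thread so the event loop is never blocked.
    if native_async is not None:
        return await native_async(prompt)
    return await asyncio.to_thread(sync_chat, prompt)

print(asyncio.run(acall("hi")))  # echo: hi
```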