The decision engine for agentic AI.

One API. 17 providers. 31 endpoints. Automatic model selection.

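[Live dashboard demo for the sample account acme-corp: a request log with columns TIME, KEY, STRATEGY, ROUTED TO, TPS, and MS (sample keys prod_7x and beta_3k), plus 99.9% uptime, 17/17 providers available, and a strategy distribution of 38% / 34% / 28%.]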

Simulated routing decisions. Request content is illustrative only.

What you don't have to build.

No SDK sprawl

One format for every provider. Format translation, tool schemas, and vision payloads handled automatically.
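
As a rough illustration of what format translation involves, the sketch below converts an OpenAI-style message list into the shape used by APIs that take the system prompt as a separate top-level field (Anthropic's Messages API is one such case). The function name `to_system_separated` is hypothetical, it handles plain-text content only, and it is not Cailos's actual translation layer.

```python
from typing import Any


def to_system_separated(messages: list[dict[str, Any]]) -> dict[str, Any]:
    """Split OpenAI-style messages into a top-level system prompt plus turns.

    Hypothetical helper for illustration; handles plain-text content only.
    """
    system_parts: list[str] = []
    turns: list[dict[str, str]] = []
    for message in messages:
        role, content = message["role"], message["content"]
        if role == "system":
            # System prompts move out of the message list.
            system_parts.append(content)
        elif role in ("user", "assistant"):
            turns.append({"role": role, "content": content})
        else:
            # Tool and function messages need provider-specific handling.
            raise ValueError(f"unsupported role in this sketch: {role!r}")
    payload: dict[str, Any] = {"messages": turns}
    if system_parts:
        payload["system"] = "\n\n".join(system_parts)
    return payload
```

Tool schemas and vision payloads need similar per-provider mapping, which is the part a routing layer takes off your hands.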

No maintenance

Provider APIs change. Models deprecate. Cailos absorbs every breaking change so your integration doesn't have to.

No lock-in

Switch providers in seconds. Circuit breakers fail over automatically when a provider goes down. Your on-call never wakes up.
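
For readers unfamiliar with the pattern, here is a minimal sketch of a circuit breaker with ordered failover. It assumes a consecutive-failure threshold and a fixed cooldown; the class, function, and parameter names are hypothetical and do not describe Cailos internals.

```python
import time
from typing import Callable, Optional


class CircuitBreaker:
    """Opens after `max_failures` consecutive errors; retries after `cooldown` seconds."""

    def __init__(self, max_failures: int = 3, cooldown: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_failures = max_failures
        self.cooldown = cooldown
        self.clock = clock
        self.failures = 0
        self.opened_at: Optional[float] = None

    def allows_request(self) -> bool:
        if self.opened_at is None:
            return True
        # Once the cooldown has passed, let a trial request through (half-open).
        return self.clock() - self.opened_at >= self.cooldown

    def record_success(self) -> None:
        self.failures = 0
        self.opened_at = None

    def record_failure(self) -> None:
        self.failures += 1
        if self.failures >= self.max_failures:
            # Open (or re-open) the breaker and restart the cooldown.
            self.opened_at = self.clock()


def call_with_failover(providers, breakers, request):
    """Try (name, send) pairs in order, skipping providers whose breaker is open."""
    for name, send in providers:
        breaker = breakers[name]
        if not breaker.allows_request():
            continue
        try:
            result = send(request)
        except Exception:
            breaker.record_failure()
            continue
        breaker.record_success()
        return name, result
    raise RuntimeError("all providers unavailable")
```

The injectable `clock` is a design choice for testability: a fake clock lets you exercise the cooldown without sleeping.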

Live evals

Every endpoint is evaluated on intelligence, tool calling, and vision. Routing always reflects the current model landscape.

Build smarter agents.

Drop Cailos into any OpenAI-compatible agent framework. The SDK doesn't change — just the endpoint. Model selection becomes automatic.

Before: hardcoded models
from agents import Agent

triage = Agent(
    name="triage",
    model="gpt-4o-mini",
    instructions="Classify: billing, technical, or escalate.",
)

resolver = Agent(
    name="resolver",
    model="gpt-4o",
    instructions="Draft resolution from KB.",
    tools=[search_kb, lookup_customer],
)
Triage is cheap but may misclassify. Resolver is expensive for simple tickets. If GPT-4o goes down, the pipeline breaks.
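After: Cailos
from agents import Agent, set_default_openai_api, set_default_openai_client
from openai import AsyncOpenAI

cailos = AsyncOpenAI(base_url="https://cailos.com/v1", api_key="cai_...")

# Route every agent through Cailos; otherwise the client above goes unused.
set_default_openai_client(cailos, use_for_tracing=False)
# Assumption: Cailos serves the Chat Completions API, not the Responses API.
set_default_openai_api("chat_completions")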

triage = Agent(
    name="triage",
    model="auto",              # fastest cheap model
    instructions="Classify: billing, technical, or escalate.",
)

resolver = Agent(
    name="resolver",
    model="auto",              # best tool-calling model
    instructions="Draft resolution from KB.",
    tools=[search_kb, lookup_customer],
)
With model="auto", Cailos selects a model for each agent on every request, with automatic fallback across 17 providers.

How it works

Stack capability constraints. Cailos narrows to one endpoint. Integrate with two lines.

Capability filtering

[Diagram: filtering funnel with three stages: available, post hard_filters, post soft_filters.]
Every request flows through the filtering pipeline. Cailos narrows 17 endpoints down to the best candidates for your constraints.
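
One way to picture the pipeline is the sketch below. It assumes hard filters are mandatory capability checks and soft filters are preferences that are relaxed if they would eliminate every candidate; the page does not define either term, so treat that split, the `Endpoint` fields, and the function names as hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    name: str
    supports_vision: bool
    supports_tools: bool
    tokens_per_second: float
    quality: float  # hypothetical eval score between 0 and 1


def apply_hard_filters(endpoints, require_vision=False, require_tools=False):
    """Drop endpoints that lack a required capability."""
    return [
        e for e in endpoints
        if (e.supports_vision or not require_vision)
        and (e.supports_tools or not require_tools)
    ]


def apply_soft_filters(endpoints, min_tps=None):
    """Prefer endpoints above a speed threshold, but never return an empty list."""
    if min_tps is None:
        return list(endpoints)
    preferred = [e for e in endpoints if e.tokens_per_second >= min_tps]
    return preferred or list(endpoints)


def select_endpoint(endpoints, require_vision=False, require_tools=False, min_tps=None):
    """Run hard filters, then soft filters, then pick the highest-quality survivor."""
    candidates = apply_hard_filters(endpoints, require_vision, require_tools)
    if not candidates:
        raise LookupError("no endpoint satisfies the hard constraints")
    candidates = apply_soft_filters(candidates, min_tps)
    return max(candidates, key=lambda e: e.quality)
```

In this sketch a request that requires vision can never be routed to a text-only endpoint, while a speed preference that nothing satisfies simply falls back to the full set of capable endpoints.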

Integration (main.py)
from openai import OpenAI

client = OpenAI(
    base_url="https://cailos.com/v1",
    api_key="cai_...",
)

response = client.chat.completions.create(
    model="auto",
    messages=[{"role": "user", "content": "..."}],
    extra_body={
        "cailos": {
            "optimise": "quality",
            "require_vision": True,
            "speed": "fast",
        }
    },
)

Standard OpenAI SDK. Change base_url and api_key. Add cailos for routing hints.
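
If you set routing hints in several places, a small helper keeps the request shape consistent. The sketch below uses only the three hint keys shown in the example above (optimise, require_vision, speed); the helper name `routing_kwargs` is hypothetical, and the set of accepted values beyond those shown is not documented here.

```python
from typing import Any, Optional


def routing_kwargs(optimise: Optional[str] = None,
                   require_vision: Optional[bool] = None,
                   speed: Optional[str] = None) -> dict[str, Any]:
    """Build keyword arguments carrying Cailos routing hints.

    Hints left as None are omitted, so the request carries no unset fields.
    """
    hints = {"optimise": optimise, "require_vision": require_vision, "speed": speed}
    hints = {key: value for key, value in hints.items() if value is not None}
    kwargs: dict[str, Any] = {"model": "auto"}
    if hints:
        # The OpenAI SDK merges extra_body into the JSON request body.
        kwargs["extra_body"] = {"cailos": hints}
    return kwargs
```

A call would then look like `client.chat.completions.create(messages=[...], **routing_kwargs(optimise="quality", speed="fast"))`.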

cailos.com/v1

Drop-in replacement for any OpenAI SDK.
Change two lines. Access 31 endpoints across 17 providers.

Get Started