Specification Details
Core Specification
The core specification (v1/spec.yaml) defines the standard vocabulary that all provider manifests and runtimes share.
Standard Parameters
These parameters have consistent meaning across all providers:
| Parameter | Type | Description |
|---|---|---|
| temperature | float | Randomness control (0.0 – 2.0) |
| max_tokens | integer | Maximum response tokens |
| top_p | float | Nucleus sampling threshold |
| stream | boolean | Enable streaming response |
| stop | string[] | Stop sequences |
| tools | object[] | Tool/function definitions |
| tool_choice | string/object | Tool selection mode |
| response_format | object | Structured output format |
Provider manifests map these standard names to provider-specific parameter names. For example, OpenAI uses max_completion_tokens while Anthropic uses max_tokens.
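As a rough illustration of that mapping, the sketch below renames standard parameters before a request is sent. The `PARAM_MAPS` table and the `to_provider_params` helper are hypothetical names invented for this example, not part of the specification; only the `max_tokens` entries come from the text above, and any parameter without an entry is assumed to pass through unchanged.

```python
# Hypothetical per-provider rename tables (standard name -> provider name).
# Only the max_tokens entries are taken from the documentation above.
PARAM_MAPS = {
    "openai": {"max_tokens": "max_completion_tokens"},
    "anthropic": {"max_tokens": "max_tokens"},
}


def to_provider_params(provider: str, params: dict) -> dict:
    """Rename standard parameters to the names a provider expects.

    Assumption: parameters with no mapping entry keep their standard name.
    """
    mapping = PARAM_MAPS.get(provider, {})
    return {mapping.get(name, name): value for name, value in params.items()}


request = to_provider_params("openai", {"temperature": 0.2, "max_tokens": 256})
print(request)
```

A real manifest would carry this table as data, so supporting a new provider means adding entries rather than code.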
Streaming Events
The specification defines unified streaming event types that runtimes emit:
| Event | Description |
|---|---|
| PartialContentDelta | Text content fragment |
| ThinkingDelta | Reasoning/thinking block (extended thinking models) |
| ToolCallStarted | Function/tool invocation begins |
| PartialToolCall | Tool call argument streaming |
| ToolCallEnded | Tool invocation complete |
| StreamEnd | Response stream complete |
| StreamError | Stream-level error |
| Metadata | Usage statistics, model info |
Provider manifests declare JSONPath-based rules that map provider-specific events to these standard types.
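One way to picture such rules is the minimal sketch below. The rule shape (`when`, `emit`, `payload`), the `resolve` and `map_event` helpers, and the sample payloads are all assumptions made for illustration; the actual manifest schema may differ. To stay self-contained, `resolve` handles only a tiny JSONPath subset (dotted keys such as `$.delta.text`) rather than full JSONPath.

```python
from typing import Any, Optional


def resolve(path: str, data: Any) -> Any:
    """Resolve a tiny JSONPath subset: '$.a.b' with dotted dict keys only."""
    keys = path[2:] if path.startswith("$.") else path
    current = data
    for key in keys.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


# Hypothetical rules: every path in "when" must match for the rule to fire.
RULES = [
    {"when": {"$.type": "content_block_delta", "$.delta.type": "text_delta"},
     "emit": "PartialContentDelta", "payload": "$.delta.text"},
    {"when": {"$.type": "content_block_delta", "$.delta.type": "thinking_delta"},
     "emit": "ThinkingDelta", "payload": "$.delta.thinking"},
    {"when": {"$.type": "message_stop"}, "emit": "StreamEnd", "payload": None},
]


def map_event(raw: dict) -> Optional[tuple]:
    """Return (standard_event_type, payload) for the first matching rule."""
    for rule in RULES:
        if all(resolve(path, raw) == value for path, value in rule["when"].items()):
            payload = resolve(rule["payload"], raw) if rule["payload"] else None
            return rule["emit"], payload
    return None  # Assumption: unmatched provider events are ignored.


event = {"type": "content_block_delta",
         "delta": {"type": "text_delta", "text": "Hel"}}
print(map_event(event))
```

Keeping the rules as plain data mirrors the declarative intent of the manifests: the runtime loop stays the same while each provider supplies its own rule list.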
Error Classes
13 standard error types normalize provider-specific error responses:
| Error Class | Typical HTTP Status | Description |
|---|---|---|
| authentication | 401 | Invalid or missing API key |
| permission | 403 | Insufficient permissions |
| not_found | 404 | Model or endpoint not found |
| rate_limited | 429 | Rate limit exceeded |
| quota_exhausted | 402 | Billing/quota limit reached |
| invalid_request | 400 | Malformed request |
| context_length | 400 | Context window exceeded |
| content_filter | 400 | Content policy violation |
| overloaded | 503/529 | Server overloaded |
| server_error | 500 | Internal server error |
| timeout | 408/504 | Request timeout |
| network | n/a | Network connectivity issue |
| unknown | n/a | Unclassified error |
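The table suggests a first-pass classification by HTTP status, sketched below. `STATUS_TO_CLASS` and `classify_error` are hypothetical names for this example. Two assumptions are baked in: a missing status (no HTTP response at all) is treated as `network`, and every 400 falls back to `invalid_request`, because telling `context_length` and `content_filter` apart would require inspecting the provider's error body, which this sketch does not attempt.

```python
from typing import Optional

# Status codes taken from the table above; 400 is ambiguous and defaults
# to invalid_request in this sketch.
STATUS_TO_CLASS = {
    400: "invalid_request",
    401: "authentication",
    402: "quota_exhausted",
    403: "permission",
    404: "not_found",
    408: "timeout",
    429: "rate_limited",
    500: "server_error",
    503: "overloaded",
    504: "timeout",
    529: "overloaded",
}


def classify_error(status: Optional[int]) -> str:
    """Map an HTTP status to a standard error class (status-only heuristic)."""
    if status is None:
        # Assumption: no HTTP response means a connectivity problem.
        return "network"
    return STATUS_TO_CLASS.get(status, "unknown")


for status in (429, 529, None, 418):
    print(status, classify_error(status))
```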
Retry Policies
The spec defines standard retry strategies:
    retry_policy:
      strategy: "exponential_backoff"
      max_retries: 3
      initial_delay_ms: 1000
      max_delay_ms: 30000
      backoff_multiplier: 2.0
      retryable_errors:
        - "rate_limited"
        - "overloaded"
        - "server_error"
        - "timeout"

Termination Reasons
Normalized finish reasons for response completion:
| Reason | Description |
|---|---|
| end_turn | Natural completion |
| max_tokens | Token limit reached |
| tool_use | Model wants to call a tool |
| stop_sequence | Stop sequence encountered |
| content_filter | Filtered by content policy |
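Normalization of this kind can be sketched as a lookup per API family. `FINISH_REASON_MAP` and `normalize_finish_reason` are hypothetical names, and the provider-side values are assumptions based on the publicly documented OpenAI `finish_reason` and Anthropic `stop_reason` fields rather than anything this specification states. Note that OpenAI's `stop` does not distinguish a natural end from a stop sequence, so this sketch maps it to `end_turn`.

```python
from typing import Optional

# Hypothetical lookup tables: provider finish reason -> normalized reason.
FINISH_REASON_MAP = {
    "openai": {
        "stop": "end_turn",  # may also mean a stop sequence was hit
        "length": "max_tokens",
        "tool_calls": "tool_use",
        "content_filter": "content_filter",
    },
    "anthropic": {
        "end_turn": "end_turn",
        "max_tokens": "max_tokens",
        "tool_use": "tool_use",
        "stop_sequence": "stop_sequence",
    },
}


def normalize_finish_reason(family: str, raw: str) -> Optional[str]:
    """Return the normalized reason, or None when no mapping is known."""
    return FINISH_REASON_MAP.get(family, {}).get(raw)


print(normalize_finish_reason("openai", "length"))
print(normalize_finish_reason("anthropic", "stop_sequence"))
```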
API Families
Providers are categorized into API families to prevent request/response format confusion:
- openai: OpenAI-compatible APIs (also used by Groq, Together, DeepSeek, etc.)
- anthropic: Anthropic Messages API
- gemini: Google Gemini API
- custom: Provider-specific format
Next Steps
- Provider Manifests: How provider configs work
- Model Registry: Model configuration details