zeroclaw-labs/zeroclaw
[Feature]: Provider-scoped model fallback chains (not global model fallback only)
Open
#4647 opened on Mar 25, 2026
Labels: config, enhancement, help wanted, priority:p2, provider, provider:reliable, risk: medium, status:accepted, status:no-stale
Description
### Summary
Provider-scoped model fallback chains (not global model fallback only)
### Problem statement
Summary
Today, `reliability.fallback_providers` and `reliability.model_fallbacks` are configured separately.
This makes fallback behavior hard to control when each provider has different valid models.
I’d like provider-aware fallback chains, so each fallback provider can define its own model list (predefined or custom).
Current behavior
Fallback order is effectively:
- model chain (`model_fallbacks`)
- provider chain (`fallback_providers`)
- retries

Because model fallback is global, a model may be attempted on providers where it is invalid or unavailable before the runtime moves on.
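To make the problem concrete, here is a minimal Python sketch of how a global model list pairs with a provider list. It is an illustration only, not ZeroClaw's code: the primary model, the provider list, the `supported` catalogue, and the exact nesting order are assumptions.

```python
from itertools import product

# Hypothetical values for illustration only; real configs will differ.
primary_model = "gpt-4o-mini"
model_fallbacks = ["anthropic/claude-sonnet-4-6"]  # global, not tied to a provider
providers = ["openai", "openrouter"]               # primary provider + fallback_providers

# Assumed catalogue of which provider serves which model name.
supported = {
    "openai": {"gpt-4o-mini"},
    "openrouter": {"anthropic/claude-sonnet-4-6"},
}

def global_attempts(models, providers):
    """Pair every model with every provider (model chain outermost)."""
    return [(provider, model) for model, provider in product(models, providers)]

attempts = global_attempts([primary_model, *model_fallbacks], providers)

# Pairs that can never succeed because the provider does not serve the model.
wasted = [(p, m) for p, m in attempts if m not in supported[p]]

for provider, model in wasted:
    print(f"invalid combination attempted: {provider} + {model}")
```

In this toy setup, two of the four attempts are spent on provider/model pairs that cannot work, which is the behavior the request aims to eliminate.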
Requested feature
Support provider-scoped model fallback (priority chain of provider+model pairs), for example:
- predefined provider IDs (`openai`, `ollama`, etc.)
- custom providers (`custom:https://...`)
- optional per-entry profile/credential selection
### Proposed solution
Possible config design (example)
[reliability]
provider_retries = 2
provider_backoff_ms = 500
[[reliability.fallback_chain1]]
provider = "openrouter"
models = ["anthropic/claude-sonnet-4-6"]
[[reliability.fallback_chain2]]
provider = "custom:http://127.0.0.1:1234/v1"
models = ["model-a", "model-b"]
[[reliability.fallback_chain3]]
provider = "openai"
models = ["gpt-4o-mini", "gpt-4.1-mini"]
Expected behavior
Runtime should try entries in order:
1. openrouter + anthropic/claude-sonnet-4-6
2. custom + model-a
3. custom + model-b
4. openai + gpt-4o-mini
5. openai + gpt-4.1-mini
(With configured retries/backoff at each step.)
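The ordering above could be implemented along these lines. This is a minimal Python sketch under stated assumptions: `run_chain` and `call` are hypothetical names, "retries" is taken to mean extra attempts after the first try, and the backoff is a fixed delay between attempts on the same pair.

```python
import time

def run_chain(chain, call, retries=2, backoff_ms=500, sleep=time.sleep):
    """Try each (provider, model) pair in order and return the first success.

    `call(provider, model)` is a hypothetical request function that raises
    on failure. Each pair gets one try plus `retries` retries (assumed).
    """
    errors = []
    for provider, models in chain:
        for model in models:
            for attempt in range(retries + 1):
                try:
                    return provider, model, call(provider, model)
                except Exception as exc:
                    errors.append((provider, model, attempt, exc))
                    if attempt < retries:
                        sleep(backoff_ms / 1000)  # fixed backoff before retrying
    raise RuntimeError(f"all {len(errors)} attempts failed")

# Usage with a simulated outage: only "model-b" on the custom provider works.
chain = [
    ("openrouter", ["anthropic/claude-sonnet-4-6"]),
    ("custom:http://127.0.0.1:1234/v1", ["model-a", "model-b"]),
    ("openai", ["gpt-4o-mini", "gpt-4.1-mini"]),
]
calls = []

def fake_call(provider, model):
    calls.append((provider, model))
    if model != "model-b":
        raise ConnectionError("simulated outage")
    return "ok"

result = run_chain(chain, fake_call, retries=1, backoff_ms=0, sleep=lambda s: None)
print(result)
```

Because every attempt is a provider/model pair taken from the same entry, a model is never sent to a provider that did not list it, and the `openai` entries are never reached once an earlier pair succeeds.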
Why this helps
- avoids invalid model/provider combinations
- predictable failover in production
- easier to reason about outages and costs
- supports mixed cloud/local/custom routing
### Non-goals / out of scope
_No response_
### Alternatives considered
_No response_
### Acceptance criteria
_No response_
### Architecture impact
_No response_
### Risk and rollback
_No response_
### Breaking change?
No
### Data hygiene checks
- [x] I removed personal/sensitive data from examples, payloads, and logs.
- [x] I used neutral, project-focused wording and placeholders.