[runtime] Provider safety-filtered LLM responses can still trigger tool execution loops
#3028 opened on May 17, 2026
Description
Background: Real Run Evidence
I observed this in a real DeerFlow run persisted in the local database.
- Model: kimi-k2.6
- User prompt: 帮我整理一下最近一周政经新闻 ("Help me put together a summary of the past week's political and economic news")
- Final status: interrupted
Event trace summary:
- The run started normally and entered streaming mode.
- The agent loaded the deep-research skill.
- The agent performed multiple web_search calls to gather political/economic news.
- The agent attempted to write a markdown report to:
/mnt/user-data/outputs/political-economic-news-weekly-may-16-2026.md
- Starting from the report-writing phase, the LLM repeatedly returned:
response_metadata.finish_reason = "content_filter"
tool_calls = [write_file(...)] or [bash(...)]
- DeerFlow still executed those tool calls.
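- The repeated write_file calls all wrote truncated markdown content. Each time, the content stopped mid-sentence around the same incomplete section (the line below reads roughly "Meeting time: May 12-13, 2026, Trump visited China, and"):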
- **会晤时间**:2026年5月12日—13日,特朗普访问中国,与
- The agent then checked file size via bash:
cat /mnt/user-data/outputs/political-economic-news-weekly-may-16-2026.md | wc -c
Result:
2596
- The agent continued trying alternate write methods:
- python3 << 'PYEOF' ... was blocked by SandboxAudit.
- cat > ... << 'EOF' passed audit but produced an unterminated heredoc warning.
- The run stopped only after an explicit cancel request:
POST /api/threads/.../runs/7619fda5-9794-439c-a02a-b78ca8b5101c/cancel?wait=0&action=interrupt
This shows that content_filter + tool_calls can cause DeerFlow to keep executing incomplete tool calls and enter a repeated write/retry loop.
Problem
DeerFlow currently has no centralized policy for LLM finish_reason.
In practice, control flow follows the presence of AIMessage.tool_calls. If a provider returns:
finish_reason = "content_filter"
tool_calls = [...]
the tool calls are still treated as valid and executed.
This is unsafe because content_filter indicates that the provider stopped, filtered, or truncated the response. Any tool call arguments included in that response may be incomplete or unreliable.
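The difference between routing on tool_calls alone and routing with finish_reason taken into account can be sketched as below. This is a minimal illustration, not DeerFlow code: messages are modeled as plain dicts carrying the two fields named in this issue, and the function names are hypothetical.

```python
from typing import Any, Mapping


def is_filtered_tool_call_turn(message: Mapping[str, Any]) -> bool:
    """True when a response carries tool calls but was cut off by a content filter."""
    metadata = message.get("response_metadata") or {}
    tool_calls = message.get("tool_calls") or []
    return metadata.get("finish_reason") == "content_filter" and len(tool_calls) > 0


def should_execute_tools(message: Mapping[str, Any]) -> bool:
    """Route to tool execution only when tool calls come from an unfiltered response."""
    has_tool_calls = bool(message.get("tool_calls"))
    # Checking has_tool_calls alone reproduces the behavior reported above;
    # the second condition is the proposed precedence rule.
    return has_tool_calls and not is_filtered_tool_call_turn(message)
```

With this check, a message shaped like the ones in the trace (finish_reason "content_filter" plus a write_file call) would not reach the tool executor, while a normal finish_reason "tool_calls" turn is unaffected.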
Provider Behavior Context [help wanted]
Other model providers generally treat safety-filtered responses as a terminal/intervention state rather than a normal tool-call turn.
[!IMPORTANT] I haven't been able to find enough information about what the providers currently supported by DeerFlow return when streaming output is interrupted, so I need help from the community. Here are the provider behaviors I have been able to confirm:
- Moonshot: finish_reason="content_filter" means content was omitted due to filtering; tool_calls is a separate finish reason.
- Anthropic: exposes refusal via stop_reason="refusal". docs
- Gemini: exposes safety termination through finishReason=SAFETY / related safety finish reasons. docs
So for DeerFlow, finish_reason=content_filter should take precedence over tool execution.
Proposed Strategy
Workaround for deer-flow-2.0-m1
Handle it in the existing LLMErrorHandlingMiddleware as a stopgap.
Better Solution
Add a narrowly scoped middleware, for example:
SafetyFinishReasonMiddleware
Place it in the lead agent middleware chain after model response generation and before tool execution, preferably before LoopDetectionMiddleware.
This should not be added to:
- LLMErrorHandlingMiddleware, because this is not an exception/retry problem.
- ToolErrorHandlingMiddleware, because that is too late: the tool is already being executed.
- LoopDetectionMiddleware, because loop detection handles the symptom, while this is a safety finish reason policy.
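Why the position in the chain matters can be shown with a toy hook runner. Everything here is a hypothetical stand-in: the hook names, the dict-shaped messages, and run_after_model_hooks are illustrations only and do not reflect DeerFlow's actual middleware API.

```python
from typing import Any, Callable

Message = dict[str, Any]
Hook = Callable[[Message], Message]


def safety_finish_reason_hook(message: Message) -> Message:
    """Drop tool calls when the provider reports a content-filtered response."""
    metadata = message.get("response_metadata") or {}
    if metadata.get("finish_reason") == "content_filter" and message.get("tool_calls"):
        return {**message, "tool_calls": []}
    return message


def loop_detection_hook(message: Message) -> Message:
    """Placeholder for loop detection; it only records how many tool calls it was shown."""
    return {**message, "loop_detector_saw": len(message.get("tool_calls") or [])}


def run_after_model_hooks(message: Message, hooks: list[Hook]) -> Message:
    """Apply post-model hooks in order; tool execution would see the final message."""
    for hook in hooks:
        message = hook(message)
    return message
```

Running the hooks as [safety_finish_reason_hook, loop_detection_hook] means the loop detector never sees the truncated tool calls. In the reverse order it would still count them before they are suppressed.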
Desired Behavior
When the last AIMessage has:
response_metadata.finish_reason == "content_filter"
tool_calls is non-empty
DeerFlow should:
- Suppress tool execution.
- Clear structured tool_calls.
- Clear raw provider tool-call metadata from additional_kwargs, such as:
- tool_calls
- function_call
- Preserve observability, for example by storing:
- original_finish_reason = "content_filter"
- suppressed_tool_calls = [...] or a count
- Convert the assistant message into a normal user-facing response explaining that the provider filtered/truncated the response and tool execution was skipped.
- Example message: "[...] was stopped may be incomplete or unsafe. Please rephrase the request or ask for a narrower output."
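The steps above can be sketched as a single pure function. This is a rough illustration under stated assumptions: AssistantMessage is a hypothetical stand-in that carries only the AIMessage fields named in this issue, NOTICE is placeholder wording, and storing the observability fields in response_metadata is one possible choice rather than a decided design.

```python
from dataclasses import dataclass, field, replace
from typing import Any

# Raw provider keys named in this issue that may still carry tool-call data.
RAW_TOOL_CALL_KEYS = ("tool_calls", "function_call")

# Placeholder wording (assumption); the final text is up to the maintainers.
NOTICE = (
    "The model provider filtered or truncated this response, so tool execution "
    "was skipped. Please rephrase the request or ask for a narrower output."
)


@dataclass(frozen=True)
class AssistantMessage:
    """Hypothetical stand-in for an AIMessage."""
    content: str = ""
    tool_calls: list[dict[str, Any]] = field(default_factory=list)
    additional_kwargs: dict[str, Any] = field(default_factory=dict)
    response_metadata: dict[str, Any] = field(default_factory=dict)


def suppress_filtered_tool_calls(msg: AssistantMessage) -> AssistantMessage:
    """Return a sanitized copy for content_filter + tool_calls; otherwise return msg unchanged."""
    if msg.response_metadata.get("finish_reason") != "content_filter" or not msg.tool_calls:
        return msg
    kept_kwargs = {k: v for k, v in msg.additional_kwargs.items() if k not in RAW_TOOL_CALL_KEYS}
    metadata = {
        **msg.response_metadata,
        "original_finish_reason": "content_filter",
        # Only tool names are recorded, since the arguments may be truncated.
        "suppressed_tool_calls": [call.get("name") for call in msg.tool_calls],
    }
    return replace(
        msg,
        content=NOTICE,
        tool_calls=[],
        additional_kwargs=kept_kwargs,
        response_metadata=metadata,
    )
```

The sketch replaces the partial content with the notice. Appending the notice to the partial text would also satisfy the list above; which is preferable depends on whether truncated output is considered safe to show.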
Scope
This issue is intentionally limited to:
finish_reason == "content_filter" AND tool_calls present
Out of scope:
- General model refusals expressed as normal assistant text.
- API-level auth/quota/rate-limit errors.
- Normal finish_reason="tool_calls".
- General loop detection tuning.
- Broader provider-specific safety taxonomy beyond content_filter.
Acceptance Criteria
- Add middleware or equivalent control-flow guard that detects content_filter + tool_calls.
- Tool calls from such messages are not executed.
- The final assistant message is user-facing and clearly explains that execution was skipped due to provider content filtering.
- The original finish reason and suppressed tool-call info remain observable in logs/events/metadata.
- Existing normal tool-call behavior remains unchanged.
- Tests cover:
- content_filter + tool_calls suppresses tool execution.
- normal tool_calls still executes tools.
- content_filter without tool calls produces/keeps a safe final assistant response.
- raw provider tool-call metadata is cleared so no dangling tool-call/tool-message state is created.