bytedance/deer-flow

[runtime] Provider safety-filtered LLM responses can still trigger tool execution loops

Closed

#3028 opened on May 17, 2026

View on GitHub
 (2 comments) (0 reactions) (0 assignees)Python (67,767 stars) (9,005 forks)batch import
help wanted

Description

Background: Real Run Evidence

I observed this in a real DeerFlow run persisted in the local database.

  • Model: kimi-k2.6
  • User prompt: 帮我整理一下最近一周政经新闻
  • Final status: interrupted

Event trace summary:

  1. The run started normally and entered streaming mode.

  2. The agent loaded the deep-research skill.

  3. The agent performed multiple web_search calls to gather political/economic news.

  4. The agent attempted to write a markdown report to:

    /mnt/user-data/outputs/political-economic-news-weekly-may-16-2026.md
    
    
  5. Starting from the report-writing phase, the LLM repeatedly returned:

     response_metadata.finish_reason = "content_filter"
     tool_calls = [write_file(...)] or [bash(...)]
  1. DeerFlow still executed those tool calls.
  2. The repeated write_file calls all wrote truncated markdown content. The content stopped around the same incomplete section:
     - **会晤时间**:2026年5月12日—13日,特朗普访问中国,与
  1. The agent then checked file size via bash:

    cat /mnt/user-data/outputs/political-economic-news-weekly-may-16-2026.md | wc -c

    Result:

    2596

  2. The agent continued trying alternate write methods:

    • python3 << 'PYEOF' ... was blocked by SandboxAudit.
    • cat > ... << 'EOF' passed audit but produced an unterminated heredoc warning.
  3. The run stopped only after an explicit cancel request:

  POST /api/threads/.../runs/7619fda5-9794-439c-a02a-b78ca8b5101c/cancel?
  wait=0&action=interrupt

This shows that content_filter + tool_calls can cause DeerFlow to keep executing incomplete tool calls and enter a repeated write/retry loop.

Problem

DeerFlow currently has no centralized policy for LLM finish_reason.

In practice, control flow follows the presence of AIMessage.tool_calls. If a provider returns:

  finish_reason = "content_filter"
  tool_calls = [...]

the tool calls are still treated as valid and executed.

This is unsafe because content_filter indicates that the provider stopped, filtered, or truncated the response. Any tool call arguments included in that response may be incomplete or unreliable.

Provider Behavior Context [help wanted]

Other model providers generally treat safety-filtered responses as a terminal/intervention state rather than a normal tool-call turn.

[!IMPORTANT] I haven't been able to find enough information about what kinds of responses the currently supported providers in Deer Flow send when a streaming output is interrupted, so I need help from the community. Here are the provider behaviors that I've been able to confirm:

  • Moonshot: finish_reason="content_filter" means content was omitted due to filtering; tool_calls is a separate finish reason.
  • Anthropic: exposes refusal via stop_reason="refusal". docs
  • Gemini: exposes safety termination through finishReason=SAFETY / related safety finish reasons. docs

So for DeerFlow, finish_reason=content_filter should take precedence over tool execution.

Proposed Strategy

Workaround for deer-flow-2.0-m1

Handle it in LLMErrorHandlingMiddleware.

Better Solution

Add a narrowly scoped middleware, for example:

SafetyFinishReasonMiddleware

Place it in the lead agent middleware chain after model response generation and before tool execution, preferably before LoopDetectionMiddleware.

This should not be added to:

  • LLMErrorHandlingMiddleware, because this is not an exception/retry problem.
  • ToolErrorHandlingMiddleware, because that is too late: the tool is already being executed.
  • LoopDetectionMiddleware, because loop detection handles the symptom, while this is a safety finish reason policy.

Desired Behavior

When the last AIMessage has:

response_metadata.finish_reason == "content_filter" tool_calls is non-empty

DeerFlow should:

  1. Suppress tool execution.
  2. Clear structured tool_calls.
  3. Clear raw provider tool-call metadata from additional_kwargs, such as:
    • tool_calls
    • function_call
  4. Preserve observability, for example by storing:
    • original_finish_reason = "content_filter"
    • suppressed_tool_calls = [...] or a count
  5. Convert the assistant message into a normal user-facing response explaining that the provider filtered/truncated the response and tool execution was skipped.
  6. was stopped may be incomplete or unsafe. Please rephrase the request or ask for a narrower output.

Scope

This issue is intentionally limited to:

finish_reason == "content_filter" AND tool_calls present

Out of scope:

  • General model refusals expressed as normal assistant text.
  • API-level auth/quota/rate-limit errors.
  • Normal finish_reason="tool_calls".
  • General loop detection tuning.
  • Broader provider-specific safety taxonomy beyond content_filter.

Acceptance Criteria

  • Add middleware or equivalent control-flow guard that detects content_filter + tool_calls.
  • Tool calls from such messages are not executed.
  • The final assistant message is user-facing and clearly explains that execution was skipped due to provider content filtering.
  • The original finish reason and suppressed tool-call info remain observable in logs/events/ metadata.
  • Existing normal tool-call behavior remains unchanged.
  • Tests cover:
    • content_filter + tool_calls suppresses tool execution.
    • normal tool_calls still executes tools.
    • content_filter without tool calls produces/keeps a safe final assistant response.
    • raw provider tool-call metadata is cleared so no dangling tool-call/tool-message state is created.

Contributor guide