bytedance/deer-flow
[Stability][BUG-003] Large write_file failures amplify token usage
Open
#3114 opened on May 21, 2026
help wanted
Description
Parent stability dashboard: #3107
This issue tracks BUG-003 from #3107.
Problem
When generating a large HTML artifact, write_file can fail because the model output is truncated or tool arguments become incomplete. The failure path can echo large attempted file contents back into the conversation state, causing subsequent model calls to carry much larger context.
Evidence
Source: gateway log, token usage middleware.
LLM token usage: input=29324 output=8192 total=37516
LLM token usage: input=46564 output=8192 total=54756
LLM token usage: input=63274 output=3903 total=67177
LLM token usage: input=71117 output=2682 total=73799
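To make the growth in the log lines above explicit, here is a small sketch that parses them and computes the per-call increase in input tokens. The log lines are copied from the gateway log; the helper name `parse_usage` and the regex are illustrative assumptions, not part of deer-flow.

```python
import re

# The four gateway log lines quoted above, verbatim.
LOG_LINES = [
    "LLM token usage: input=29324 output=8192 total=37516",
    "LLM token usage: input=46564 output=8192 total=54756",
    "LLM token usage: input=63274 output=3903 total=67177",
    "LLM token usage: input=71117 output=2682 total=73799",
]

# Assumed line shape, inferred only from the samples above.
PATTERN = re.compile(r"input=(\d+) output=(\d+) total=(\d+)")


def parse_usage(line: str) -> dict:
    """Extract input/output/total token counts from one log line (hypothetical helper)."""
    match = PATTERN.search(line)
    if match is None:
        raise ValueError(f"unrecognized usage line: {line!r}")
    input_tokens, output_tokens, total_tokens = map(int, match.groups())
    return {"input": input_tokens, "output": output_tokens, "total": total_tokens}


usages = [parse_usage(line) for line in LOG_LINES]
inputs = [u["input"] for u in usages]

# Increase in input context between consecutive model calls.
deltas = [later - earlier for earlier, later in zip(inputs, inputs[1:])]
growth_ratio = inputs[-1] / inputs[0]

print(deltas)                  # [17240, 16710, 7843]
print(round(growth_ratio, 2))  # 2.43
```

Across these four calls the input context grows by roughly 2.4x, and the first two calls both end at output=8192, the same output count that appears alongside finish_reason=length in the second observed shape.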
Source: checkpoint/state inspection of write_file tool messages.
write_file error payload: ~23.7K chars
write_file error payload: ~24.1K chars
write_file error payload: ~10.6K chars
Another observed shape:
Source: checkpoint/state inspection of AI message usage + following write_file tool result.
write_file output=8192 finish_reason=length
write_file missing required path
tool error echoed ~23K chars of attempted HTML content
Suspected mechanism
- The model tries to generate a large HTML report as one write_file call.
- Output hits a limit or tool args become incomplete.
- write_file fails.
- The tool error includes a large portion of the attempted content.
- That large error becomes part of conversation state.
- The next LLM call has a much larger input context.
- The agent retries with another writing strategy.
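One possible way to break this chain early is to detect a truncated or incomplete tool call before the tool runs, and to answer with a short error that reports argument sizes instead of argument values. The sketch below is an assumption about how such a guard could look, not deer-flow's actual code: `REQUIRED_ARGS` and `check_tool_call` are hypothetical names, and the assumption that write_file requires both `path` and `content` is only partly confirmed by the evidence (which shows `path` as required).

```python
from typing import Any, Optional

# Hypothetical schema: required argument names per tool.
REQUIRED_ARGS = {"write_file": ("path", "content")}


def check_tool_call(
    name: str, args: dict[str, Any], finish_reason: str
) -> Optional[dict[str, Any]]:
    """Return a short error dict if the call looks truncated, else None.

    The error never includes argument values, only their lengths.
    """
    missing = [key for key in REQUIRED_ARGS.get(name, ()) if key not in args]
    truncated = finish_reason == "length"
    if not truncated and not missing:
        return None  # call looks complete; let the tool run
    return {
        "tool": name,
        "error": "truncated_tool_call" if truncated else "missing_required_args",
        "missing": missing,
        # Sizes only, so the model can see how much it tried to write.
        "arg_sizes": {k: len(v) for k, v in args.items() if isinstance(v, str)},
    }


# Mirrors the observed shape: output cut off, path missing, ~23K chars of HTML.
result = check_tool_call(
    "write_file", {"content": "<html>" + "x" * 23000}, finish_reason="length"
)
print(result)
```

With a guard like this, the message stored in conversation state stays around a hundred characters regardless of how large the attempted content was.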
Impact
- Token usage can grow from that of an ordinary large task into a million-token-class run.
- Runtime cost becomes hard for users to predict.
- Persistence/checkpoint writes also increase.
- The final artifact may eventually succeed, but after expensive retries.
Expected behavior
- Tool errors should not echo large content arguments back into model context.
- Large artifact generation should use a bounded, reliable writing strategy.
- If an artifact cannot be written, the error returned to the model should be concise and structured.
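As a rough illustration of what a concise, structured error could look like, the sketch below replaces any long string argument with its length, a short hash prefix, and a bounded preview before the error is serialized. All names here (`MAX_PREVIEW_CHARS`, `summarize_arg`, `build_tool_error`) and the JSON field layout are assumptions made for this example; the actual error format would be up to the maintainers.

```python
import hashlib
import json
from typing import Any

# Hypothetical bound on how much of any argument may be echoed back.
MAX_PREVIEW_CHARS = 80


def summarize_arg(value: Any) -> Any:
    """Return short values unchanged; replace long strings with a bounded summary."""
    if not isinstance(value, str) or len(value) <= MAX_PREVIEW_CHARS:
        return value
    digest = hashlib.sha256(value.encode("utf-8")).hexdigest()[:12]
    return {
        "omitted_chars": len(value),
        "sha256_prefix": digest,
        "preview": value[:MAX_PREVIEW_CHARS],
    }


def build_tool_error(tool: str, reason: str, args: dict[str, Any]) -> str:
    """Serialize a tool failure as compact JSON with summarized arguments."""
    payload = {
        "tool": tool,
        "status": "error",
        "reason": reason,
        "args": {key: summarize_arg(val) for key, val in args.items()},
    }
    # default=str keeps serialization from failing on non-JSON argument types.
    return json.dumps(payload, ensure_ascii=False, default=str)


# A 24,000-character content argument yields an error of a few hundred characters.
message = build_tool_error(
    "write_file",
    "missing required argument: path",
    {"content": "<p>" * 8000},
)
print(len(message))
```

The hash prefix lets a later step confirm whether a retry carries the same content without the content itself re-entering model context.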