Prompt Replay Cache to Reduce Redundant Model Calls · google-gemini/gemini-cli#21570

(6 comments) (0 reactions) (1 assignee)TypeScript (13,657 forks)batch import

area/agenthelp wantedkind/enhancementkind/featurepriority/p3status/bot-triaged

Repository metrics

Stars: (103,992 stars)
PR merge metrics: (平均マージ 4d 2h) (30d で 55 merged PRs)

説明

What would you like to be added?

Introduce a Prompt Replay Cache mechanism that stores responses for previously executed prompts and reuses them when the same prompt is issued again within the same project context.

Currently, identical prompts always trigger a new model request even if the same prompt was executed moments earlier. A caching layer would allow the CLI to check if a prompt has already been processed and return the cached response instead of making another API call.

Proposed workflow:

User Prompt
→ Check local cache
→ If cached response exists → return cached result
→ If not → call the model → store response in cache

Example cache structure:

.cache/ prompt-cache.json

Each entry could store:

prompt hash
original prompt
response
timestamp
project path (optional for project scoping)

This would be implemented as a lightweight cache layer before the model invocation step.

Why is this needed?

Currently, repeated prompts cause repeated API calls, which leads to:

Increased latency for users
Unnecessary API usage
Higher operational costs
Repeated computation for identical queries

During development workflows, users often repeat prompts such as:

explaining a file
summarizing code
debugging errors

A prompt replay cache would significantly improve the developer experience by making repeated interactions faster while reducing unnecessary load on the model API.

This feature also helps improve CLI responsiveness during iterative workflows.

Additional context

Possible implementation considerations:

Generate a hash from the prompt + project directory to uniquely identify cache entries
Store cache files in a local directory (e.g., .cache/)
Implement cache expiration (TTL) or LRU eviction to prevent unbounded growth
Optionally log cache hits/misses in debug mode

Example pseudo-code:

const key = hash(prompt + projectPath);

if (cache.has(key)) { return cache.get(key); }

const response = await callModel(prompt); cache.set(key, response);

This feature would be relatively lightweight to implement and could significantly improve performance for repeated prompts.

コントリビューターガイド

調査方針: プロンプトとプロジェクトパスのハッシュに基づいてモデル応答を保存および取得するキャッシュ層を実装します。ローカルキャッシュファイルとオプションのTTLを使用します。
技術スタック: typescriptnodejs
領域: cli
Issue 種別: 機能
難度: 3
推定時間: 半日
活動状況: アクティブ
明確さ: 明確
前提条件: Node.jsGit
初心者向け度: 80

Repository metrics

説明

What would you like to be added?

Why is this needed?

Additional context

コントリビューターガイド

新着 Easy issues をメールで受け取る。