reworkd/AgentGPT

✨ Investigate the best similarity score threshold to remove duplicate tasks

Open

#729 created on June 6, 2023

(0 comments) (0 reactions) (0 assignees) TypeScript (34,594 stars) (9,446 forks)
enhancement, help wanted

Description

When we generate tasks, we filter out any task whose similarity score to an existing task in the vector database is too high:

```python
similar_tasks = memory.get_similar_tasks(
    task, score_threshold=0.95  # TODO: Once we use ReAct, revisit
)
```

This is done with the code above. The 0.95 threshold is arbitrary, and even a task that scores above it may not actually be related to the existing one.

On the other hand, closely related or duplicate tasks may score below this threshold.

This ticket is about investigating the best value for this threshold, or finding some other way to calculate similarity for this case.
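One possible way to approach the investigation is to hand-label a sample of task pairs as duplicate or not, record the similarity score each pair gets, and sweep candidate thresholds to see which gives the best trade-off between wrongly dropped tasks and missed duplicates. The sketch below is only an illustration under those assumptions: `sweep_thresholds`, `best_threshold`, and the labeled-pair format are hypothetical and not part of AgentGPT, and F1 is just one of several metrics that could be used.

```python
from typing import List, Tuple

# Hypothetical input format: (similarity_score, is_duplicate) for each
# hand-labeled pair of tasks. Scores would come from the vector database.
LabeledPair = Tuple[float, bool]


def sweep_thresholds(
    pairs: List[LabeledPair], thresholds: List[float]
) -> List[Tuple[float, float, float, float]]:
    """Return (threshold, precision, recall, f1) for each candidate threshold.

    A pair is predicted to be a duplicate when its score >= threshold.
    """
    results = []
    for t in thresholds:
        tp = sum(1 for score, dup in pairs if score >= t and dup)
        fp = sum(1 for score, dup in pairs if score >= t and not dup)
        fn = sum(1 for score, dup in pairs if score < t and dup)
        # Report 0.0 when a ratio is undefined (no predictions or no duplicates).
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        results.append((t, precision, recall, f1))
    return results


def best_threshold(pairs: List[LabeledPair], thresholds: List[float]) -> float:
    """Pick the candidate threshold with the highest F1 score."""
    return max(sweep_thresholds(pairs, thresholds), key=lambda r: r[3])[0]
```

If wrongly discarding a distinct task is more costly than keeping a duplicate, it may make more sense to choose the lowest threshold that still meets a target precision rather than maximizing F1.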

