alibaba/GraphScope

[BUG] Loading from large dataframe/large numpy requires holding all chunks in coordinator

Open

#2342 opened on December 23, 2022

Labels: bug, component:coordinator, good first issue

Description

Describe the bug

It looks strange that the coordinator accumulates all chunks from the request stream into a list before sending them to the analytical engine. That requires the coordinator pod to have enough available memory to hold the entire dataframe or numpy array at once.

https://github.com/alibaba/GraphScope/blob/b80a35599424580325a750e734f8a3b2dead2a5b/coordinator/gscoordinator/dag_manager.py#L77-L107
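To make the concern concrete, here is a minimal sketch contrasting the two relay patterns. It is not GraphScope's code: `relay_buffered`, `relay_streaming`, `make_stream`, and `consume` are hypothetical names, and plain Python generators stand in for the gRPC request stream and the analytical engine. It only illustrates why collecting chunks into a list forces the whole payload into memory, while forwarding them lazily holds one chunk at a time.

```python
from typing import Iterable, Iterator, List, Tuple

Log = List[Tuple[str, int]]


def relay_buffered(request_stream: Iterable[bytes]) -> Iterator[bytes]:
    """Pattern described in the issue: materialize every chunk, then forward."""
    chunks: List[bytes] = list(request_stream)  # whole payload is resident here
    yield from chunks


def relay_streaming(request_stream: Iterable[bytes]) -> Iterator[bytes]:
    """Possible alternative: forward each chunk as soon as it arrives."""
    for chunk in request_stream:
        yield chunk  # only the current chunk is held by the relay


def make_stream(n_chunks: int, chunk_size: int, log: Log) -> Iterator[bytes]:
    """Hypothetical stand-in for the client's request stream."""
    for i in range(n_chunks):
        log.append(("produced", i))
        yield bytes(chunk_size)


def consume(stream: Iterable[bytes], log: Log) -> int:
    """Hypothetical stand-in for the engine; returns total bytes received."""
    total = 0
    for i, chunk in enumerate(stream):
        log.append(("consumed", i))
        total += len(chunk)
    return total


if __name__ == "__main__":
    for relay in (relay_buffered, relay_streaming):
        log: Log = []
        consume(relay(make_stream(3, 4, log)), log)
        print(relay.__name__, log)
```

In the buffered variant every chunk is produced before the first one is consumed, so peak memory grows with the size of the dataset. In the streaming variant production and consumption interleave. Whether the coordinator can adopt the second pattern depends on whether it needs to inspect or rewrite the full request before forwarding, which this sketch does not model.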

