alibaba/GraphScope
[BUG] Running clustering app on graphscope.nx is slower than networkx over p2p dataset
Open
#2934 created on June 26, 2023
component:networkx, good first issue, performance
Description
import os
import graphscope.nx as gs_nx
import networkx as nx
import time
start = time.time()
g1 = nx.read_edgelist(
    os.path.expandvars('./p2p-31.e'),
    nodetype=int,
    data=False,
    create_using=nx.Graph
)
print(type(g1))
# networkx.classes.graph.Graph
print("networkx = ", time.time() - start)
start = time.time()
g2 = gs_nx.read_edgelist(
    os.path.expandvars('./p2p-31.e'),
    nodetype=int,
    data=False,
    create_using=gs_nx.Graph
)
print(type(g2))
print("gs = ", time.time() - start)
start = time.time()
ret_nx = nx.clustering(g1)
print("networkx = ", time.time() - start)
# 0.91s
start = time.time()
ret_gs = gs_nx.clustering(g2)
print("gs = ", time.time() - start)
# 2.12s
# compare the results
print(ret_gs == ret_nx)
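The exact `==` comparison on the last line may report `False` even when the two results differ only by floating-point rounding. Below is a minimal sketch of a tolerance-based comparison. It assumes both results are plain dicts mapping node to clustering coefficient. The helper name `results_close`, the tolerances, and the sample values are illustrative and not part of networkx or graphscope.nx.

```python
import math


def results_close(a, b, rel_tol=1e-9, abs_tol=1e-12):
    """Return True if two node->value mappings agree within a tolerance.

    Hypothetical helper; assumes both arguments behave like dicts.
    """
    if a.keys() != b.keys():
        return False
    return all(
        math.isclose(a[k], b[k], rel_tol=rel_tol, abs_tol=abs_tol) for k in a
    )


# Made-up values standing in for ret_nx and ret_gs.
example_nx = {1: 0.1 + 0.2, 2: 0.0}
example_gs = {1: 0.3, 2: 0.0}

print(example_nx == example_gs)               # False: exact float comparison
print(results_close(example_nx, example_gs))  # True: equal within tolerance
```

With the script above, `results_close(ret_gs, ret_nx)` could replace the final `print`, provided both calls return dict-like results.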
In addition, our blog reports that graphscope.nx is over 25x faster than networkx on the Twitter dataset, but on my testbed graphscope.nx is only about 7x faster than networkx.
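The timings above come from single runs measured with `time.time()`, so they may include one-time warm-up costs. Here is a rough sketch of a best-of-N timing helper that could make the speedup figures easier to compare. The names `best_of` and `speedup` are hypothetical, and the `sum` workload is only a stand-in for the clustering calls.

```python
import time


def best_of(fn, *args, repeat=3, **kwargs):
    """Call fn `repeat` times; return (fastest wall-clock seconds, last result)."""
    best = float("inf")
    result = None
    for _ in range(repeat):
        start = time.perf_counter()
        result = fn(*args, **kwargs)
        best = min(best, time.perf_counter() - start)
    return best, result


def speedup(baseline_seconds, candidate_seconds):
    """Ratio > 1 means the candidate is faster than the baseline."""
    return baseline_seconds / candidate_seconds


# Stand-in workload; in the script above this would be e.g.
# best_of(nx.clustering, g1) and best_of(gs_nx.clustering, g2).
elapsed, total = best_of(sum, range(100_000))
print(f"best of 3: {elapsed:.6f}s, result = {total}")

# Using the two timings reported in this issue (0.91s vs 2.12s):
print(f"speedup = {speedup(0.91, 2.12):.2f}x")
```

Taking the minimum over several runs is one common choice. Reporting the median instead would be equally reasonable if run-to-run variance matters.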