piskvorky/gensim

bug in gensim.summarization.mz_entropy.mz_keywords

Open

#2523, created on June 10, 2019

1 comment · 1 reaction · 0 assignees · Python · 15,144 stars · 4,349 forks
Labels: Hacktoberfest, bug, difficulty easy, good first issue, impact LOW, reach LOW

Description

Problem statement:

This looks like a bug that shows up when the text is too short, i.e. when the number of words is lower than blocksize. In my case the values were n_words = 232.0 and blocksize = 1024.

Log:

gensim\summarization\mz_entropy.py:127: RuntimeWarning: invalid value encountered in double_scalars
  - __log_combinations(n_words, blocksize)

Dirty solution:

Override blocksize value from the default 1024 to something lower:

mz_keywords(text, blocksize=128)
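
Instead of hard-coding 128, the blocksize could be derived from the length of the text. The following is only a rough sketch: `pick_blocksize` is a hypothetical helper (not part of gensim), it counts words with a plain whitespace split (which may differ from gensim's own tokenization), and it assumes that a single full block is enough to avoid the warning, as the 232-word / blocksize=128 case above suggests.

```python
def pick_blocksize(text, default=1024, min_blocks=1):
    """Hypothetical helper: halve `default` until the text holds at least
    `min_blocks` full blocks of that size.

    Assumption: word count is approximated with str.split(); gensim's
    tokenizer may yield a slightly different number of words.
    """
    n_words = len(text.split())
    blocksize = default
    # Keep halving while the text is too short for `min_blocks` blocks.
    while blocksize > 1 and n_words < blocksize * min_blocks:
        blocksize //= 2
    return blocksize


# A 232-word text, as in the report above, ends up with blocksize 128.
short_text = " ".join(["word"] * 232)
print(pick_blocksize(short_text))
```

The result can then be passed along as `mz_keywords(text, blocksize=pick_blocksize(text))`. Raising `min_blocks` gives a more conservative (smaller) blocksize if one block turns out not to be enough.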
