piskvorky/gensim
WordEmbeddingsKeyedVectors.add() doesn't clear `vectors_norm`, causing `IndexError` on later `most_similar()`
Open
#2532, created on June 18, 2019
Labels: Hacktoberfest, bug, difficulty easy, good first issue, impact MEDIUM, reach LOW
Description
As reported in a StackOverflow question/answer: https://stackoverflow.com/a/56641265/130288
An adapted version of the asker's minimal test case (which could become a unit test):
import numpy as np
from gensim.models.keyedvectors import WordEmbeddingsKeyedVectors

kv = WordEmbeddingsKeyedVectors(vector_size=3)
kv.add(entities=['a', 'b'],
       weights=[np.random.rand(3), np.random.rand(3)])
kv.most_similar('a')  # works

kv.add(entities=['c'], weights=[np.random.rand(3)])
kv.most_similar('c')  # fails with `IndexError`
Clearing the `vectors_norm` property (with either `del` or assignment to `None`) should be sufficient to trigger recalculation on the next `most_similar()` call.
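To make the failure mode and the proposed fix concrete, here is a minimal sketch that does not use gensim at all. `ToyKeyedVectors`, its `clear_cache` flag, and `demo()` are hypothetical names invented for illustration. They only mimic the pattern described above (a lazily built normalized-vector cache that `add()` may or may not invalidate) and are not gensim's actual implementation.

```python
import math


class ToyKeyedVectors:
    """Hypothetical stand-in for illustration only; not gensim's code."""

    def __init__(self):
        self.index = {}           # key -> row number
        self.vectors = []         # raw vectors, one list of floats per key
        self.vectors_norm = None  # lazily built cache of unit-length vectors

    def add(self, entities, weights, clear_cache=True):
        for key, vec in zip(entities, weights):
            self.index[key] = len(self.vectors)
            self.vectors.append(list(vec))
        if clear_cache:
            # The fix suggested in this issue: drop the stale cache so the
            # next most_similar() call rebuilds it with the new rows.
            self.vectors_norm = None

    def init_sims(self):
        # Build the cache only if it is missing (assumes non-zero vectors).
        if self.vectors_norm is None:
            self.vectors_norm = [
                [x / math.sqrt(sum(y * y for y in v)) for x in v]
                for v in self.vectors
            ]

    def most_similar(self, key):
        self.init_sims()
        # Raises IndexError if the cache predates the row for `key`.
        query = self.vectors_norm[self.index[key]]
        scores = [
            (other, sum(a * b for a, b in zip(query, self.vectors_norm[i])))
            for other, i in self.index.items()
            if other != key
        ]
        return sorted(scores, key=lambda pair: pair[1], reverse=True)


def demo(clear_cache):
    """Replay the test case above with and without cache invalidation."""
    kv = ToyKeyedVectors()
    kv.add(['a', 'b'], [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
           clear_cache=clear_cache)
    kv.most_similar('a')  # builds the cache with two rows
    kv.add(['c'], [[1.0, 1.0, 0.0]], clear_cache=clear_cache)
    try:
        return [key for key, _ in kv.most_similar('c')]
    except IndexError:
        return 'IndexError'
```

Calling `demo(clear_cache=False)` reproduces the stale-cache `IndexError` in the toy class, while `demo(clear_cache=True)` returns the other two keys. Until `add()` itself is fixed, the same idea applied by hand to the test case above would be `kv.vectors_norm = None` right after the second `kv.add(...)`, as suggested in the description.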