Vocab vectors using complete pretrained-embedding? · pytorch/text#446

(6 comments) (0 reactions) (0 assignees) Python (822 forks)

enhancement, help wanted

Repository metrics

Stars: 3,396
PR merge metrics: no PRs merged in the last 30 days

Description

I am new to PyTorch and NLP. I have a question that came up while I was building a model.

Since my training dataset is not very big, its vocabulary is relatively small (around 5,000 words). However, I want to handle arbitrary user input, which may contain words outside this vocabulary.

The problem is that in the model I trained, the embedding layer's weights are built from the vectors of the field's vocab, not from the complete word2vec pretrained embeddings, so I cannot modify the layer after training is done.

Is there a better approach to do this? Thanks in advance!
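
One possible approach, sketched here in plain PyTorch rather than through the torchtext field API: build the embedding layer from the complete pretrained table before training and keep it frozen, so every word the pretrained table knows can be looked up at inference time. The word list and vectors below are made-up stand-ins for a real word2vec table, and `encode` is a hypothetical helper, not a library function.

```python
import torch
import torch.nn as nn

# Toy stand-in for the complete pretrained table (assumption: a real
# word2vec table would be loaded here instead of random values).
pretrained_itos = ["<unk>", "<pad>", "the", "cat", "sat", "zebra"]
pretrained_stoi = {word: idx for idx, word in enumerate(pretrained_itos)}
torch.manual_seed(0)
pretrained_vectors = torch.randn(len(pretrained_itos), 4)
pretrained_vectors[pretrained_stoi["<pad>"]] = 0.0

# The layer covers the whole pretrained vocabulary, not only the training
# vocabulary. freeze=True keeps all rows in the original pretrained space.
embedding = nn.Embedding.from_pretrained(
    pretrained_vectors,
    freeze=True,
    padding_idx=pretrained_stoi["<pad>"],
)

def encode(tokens):
    """Hypothetical helper: map tokens to ids, falling back to <unk>."""
    unk = pretrained_stoi["<unk>"]
    return torch.tensor([pretrained_stoi.get(tok, unk) for tok in tokens])

# "zebra" may never appear in the training data, yet it has a vector;
# "qwerty" is missing from the pretrained table and maps to <unk>.
ids = encode(["the", "zebra", "qwerty"])
vectors = embedding(ids)
```

The trade-off is memory: the layer holds the full pretrained matrix, and the vectors are not fine-tuned on the task.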

Contributor guide

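Research direction: Explore how to extend the embedding layer to cover OOV tokens by looking them up in the complete pretrained embedding matrix, possibly with a separate mapping for unknown words.
Tech stack: python, pytorch
Domain: machine learning
Issue type: investigation
Difficulty: 1
Estimated time: under 1 hour
Activity status: active
Clarity: clear
Prerequisites: Python, PyTorch
Beginner friendliness: 40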
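
The research direction above could be sketched roughly as follows. This is a minimal illustration in plain PyTorch, not the torchtext API: `extend_embedding` is a hypothetical helper, and `pretrained` is assumed to be a dict mapping words to 1-D tensors taken from the complete pretrained table. It appends rows for new words to an already trained layer and returns an updated word-to-index mapping.

```python
import torch
import torch.nn as nn

def extend_embedding(embedding, stoi, new_words, pretrained):
    """Hypothetical helper: append pretrained rows for unseen words.

    Words already in `stoi`, or absent from `pretrained`, are skipped;
    the latter should keep mapping to the unknown-word index.
    """
    new_stoi = dict(stoi)
    rows = []
    for word in new_words:
        if word in new_stoi or word not in pretrained:
            continue
        new_stoi[word] = embedding.num_embeddings + len(rows)
        rows.append(pretrained[word])
    if not rows:
        return embedding, new_stoi
    # Keep the trained rows as they are and stack the new rows below them.
    weight = torch.cat([embedding.weight.detach(), torch.stack(rows)], dim=0)
    extended = nn.Embedding.from_pretrained(
        weight, freeze=True, padding_idx=embedding.padding_idx
    )
    return extended, new_stoi

# Usage with toy values (all vectors here are made up).
torch.manual_seed(0)
trained = nn.Embedding(3, 4)
stoi = {"<unk>": 0, "the": 1, "cat": 2}
pretrained = {"zebra": torch.ones(4), "cat": torch.zeros(4)}
extended, new_stoi = extend_embedding(
    trained, stoi, ["zebra", "cat", "qwerty"], pretrained
)
```

One caveat: if the original layer was fine-tuned during training, its rows may have drifted away from the pretrained space, so the appended vectors are only directly comparable when the layer was kept frozen.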
