Vocab vectors using complete pretrained-embedding? · pytorch/text#446

6 comments · 0 reactions · 0 assignees · Python · 822 forks

enhancement, help wanted

Repository metrics

Stars: 3396
PR merge metrics: no PRs merged in the last 30 days

Description

I am new to PyTorch and NLP, and I ran into a question while building a model.

Since my training dataset is not very big, its vocabulary is relatively small (around 5,000 words). However, I want to handle arbitrary user input, which may fall outside this vocabulary.

The problem is that, in the model I trained, the embedding layer's weights are based on the vectors of the field, not on the whole word2vec pretrained embeddings, so I cannot modify them after training is done.

Is there a better approach? Thanks in advance!

Contributor guide

Research direction: Explore how to extend the embedding layer to include OOV tokens by looking up the full pretrained embedding matrix at inference time, possibly using a separate mapping for unknown words.
Tech stack: Python, PyTorch
Domain: machine learning
Issue type: Research
Difficulty: 1
Estimated time: Less than 1 hour
Activity status: Active
Clarity: Clear
Prerequisites: Python, PyTorch
Beginner-friendliness: 40
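
One way to read the research direction above is sketched below. It is a minimal illustration, not the torchtext API: the helper names `extend_embedding` and `encode`, the plain `stoi` dict, and the `pretrained` dict of token vectors are all assumptions standing in for a field's vocabulary and a loaded word2vec table. The idea is to append one row to the trained embedding matrix for each inference-time token that is missing from the training vocabulary but present in the full pretrained table, and to fall back to `<unk>` for everything else.

```python
import torch
import torch.nn as nn


def extend_embedding(embedding, stoi, pretrained, tokens):
    """Return (new_embedding, new_stoi) with extra rows for OOV tokens.

    embedding:  trained nn.Embedding whose rows follow `stoi`
    stoi:       dict token -> row index (the training vocabulary)
    pretrained: dict token -> 1-D tensor (stand-in for the full pretrained table)
    tokens:     tokens seen at inference time
    """
    new_stoi = dict(stoi)
    extra_rows = []
    for tok in tokens:
        # Skip tokens we already know, and tokens with no pretrained vector
        # (those will be mapped to <unk> by `encode`).
        if tok in new_stoi or tok not in pretrained:
            continue
        new_stoi[tok] = embedding.num_embeddings + len(extra_rows)
        extra_rows.append(pretrained[tok])
    if not extra_rows:
        return embedding, new_stoi
    with torch.no_grad():
        weight = torch.cat([embedding.weight, torch.stack(extra_rows)], dim=0)
    # Assumption: the extended layer is only used for inference, so it is frozen.
    # Any padding_idx on the original layer is not carried over in this sketch.
    return nn.Embedding.from_pretrained(weight, freeze=True), new_stoi


def encode(tokens, stoi, unk_token="<unk>"):
    """Map tokens to row indices, using the <unk> row for unknown tokens."""
    unk = stoi[unk_token]
    return torch.tensor([stoi.get(tok, unk) for tok in tokens])


# Toy data: a 3-token training vocabulary and a slightly larger pretrained table.
torch.manual_seed(0)
stoi = {"<unk>": 0, "cat": 1, "dog": 2}
emb = nn.Embedding(len(stoi), 4)
pretrained = {w: torch.randn(4) for w in ["cat", "dog", "bird"]}

user_input = ["bird", "cat", "zzz"]  # "bird" is OOV but pretrained; "zzz" is neither
new_emb, new_stoi = extend_embedding(emb, stoi, pretrained, user_input)
ids = encode(user_input, new_stoi)
vectors = new_emb(ids)  # shape: (3, 4)
```

A caveat on this design: if the embedding rows were fine-tuned during training, they may have drifted away from the pretrained space, so appended pretrained vectors are most meaningful when the embedding layer was kept frozen (or nearly so) while the rest of the model trained.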
