Description
This project doesn't currently support predicting the type of an input, because it has no notion of which type an input value maps to.
Normally, using a classifier is a two-stage process:
1. fit(X, y), using training input and output data
2. predict(X), using unknown data, and returning the estimated output type
It would be good if this project presented a similar interface.
I would suggest creating a class, wmd_classifier, which implements these two methods.
fit, which would:
- take in an array of documents and break them down into bows
- create a WMD instance
- cache centroids
predict, which would:
- take in a document
- break it into a bow
- calculate its centroid
- call nearest_neighbours
- calculate the output type, based on the k nearest neighbours, weighted by their closeness
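
To make the proposal concrete, here is a rough sketch of how the fit and predict steps above could fit together. It is only an illustration under several assumptions, not this project's API: the class name `WMDClassifier`, the `embeddings` dictionary, the `k` parameter and the helper methods are all hypothetical, and `_nearest_neighbours` uses plain Euclidean distance between centroids as a stand-in for the project's WMD nearest-neighbour search, which a real implementation would call instead.

```python
import math
from collections import Counter, defaultdict


class WMDClassifier:
    """Sketch of the proposed fit/predict interface (hypothetical names)."""

    def __init__(self, embeddings, k=3):
        self.embeddings = embeddings  # assumed: word -> vector (list of floats)
        self.k = k                    # number of neighbours that get a vote

    def _bow(self, document):
        # Bag of words: counts of the tokens that have an embedding.
        return Counter(w for w in document.lower().split() if w in self.embeddings)

    def _centroid(self, bow):
        # Count-weighted mean of the word vectors in the bag.
        total = sum(bow.values())
        if total == 0:
            raise ValueError("document contains no known words")
        dim = len(next(iter(self.embeddings.values())))
        centroid = [0.0] * dim
        for word, count in bow.items():
            for i, x in enumerate(self.embeddings[word]):
                centroid[i] += x * count / total
        return centroid

    def fit(self, documents, labels):
        if len(documents) != len(labels):
            raise ValueError("documents and labels must have the same length")
        self.labels_ = list(labels)
        # Cache one centroid per training document.
        self.centroids_ = [self._centroid(self._bow(d)) for d in documents]
        return self

    def _nearest_neighbours(self, centroid):
        # Stand-in only: Euclidean distance between centroids, not true WMD.
        dists = [(math.dist(centroid, c), i) for i, c in enumerate(self.centroids_)]
        return sorted(dists)[: self.k]

    def predict(self, document):
        centroid = self._centroid(self._bow(document))
        votes = defaultdict(float)
        for distance, index in self._nearest_neighbours(centroid):
            # Closer neighbours get a larger vote; epsilon avoids division by zero.
            votes[self.labels_[index]] += 1.0 / (distance + 1e-9)
        return max(votes, key=votes.get)


# Toy data, made up purely for illustration.
embeddings = {
    "cat": [1.0, 0.0], "dog": [0.9, 0.1],
    "car": [0.0, 1.0], "truck": [0.1, 0.9],
}
clf = WMDClassifier(embeddings, k=3).fit(
    ["cat dog", "dog dog cat", "car truck", "truck car car"],
    ["animal", "animal", "vehicle", "vehicle"],
)
print(clf.predict("cat"))  # animal
```

Weighting each vote by inverse distance is one possible reading of "weighted by their closeness"; other weightings would slot into predict the same way.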