src-d/wmd-relax

Classification use-case

Open

#16 opened on 2017年6月11日

GitHub で見る
 (2 comments) (0 reactions) (1 assignee)Python (456 stars) (83 forks)batch import
enhancementhelp wanted

説明

This project doesn't currently allow for the predicting the type of an input, as there is no sense of knowing to what type an input value maps.

Normally when using a classifier, there is a two stage process. 1 - fit(X, y), using training input and output data 2 - predict(X), using unknown data, and returning the estimated

It would be good if this project presented a similar interface.

I would suggest creating a class, wmd_classifier, which implements these two models.

fit, which would:

  • take in an array of documents and break them down into bows
  • create a WMD instance
  • cache centroids

predict, which would:

  • take in a document
  • break it into a bow
  • calculate its centroid
  • call nearest_neighbours
  • calculate the output type, based on the k nearest neighbours, weighted by their closeness

コントリビューターガイド