EpistasisLab/pmlb

Credit/Origin?

Open

#13 opened on Dec 19, 2017

 (10 comments) (2 reactions) (0 assignees) Python (129 forks)
Labels: enhancement, help wanted

Description

  1. Nice resource! I may add some datasets to it in the future (although the ones I use for benchmarking are considerably "rarer" than the ones here: time series, raw text, locations, entities, etc.).
  2. The various datasets don't seem to have any credit as to their origin (e.g. "winered": I assume it is the wine dataset from UCI, but there's nothing about that in the data folder or the csv.gz file). Adding the origin (even at the "site" level, e.g. "UCI", "OpenML", "Kaggle datasets", "KDD") would make it much easier to analyze the original datasets, their context, domain and interpretation (e.g. "Looking for datasets on time-series + predictive maintenance").
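
To make point 2 a bit more concrete, here is a rough sketch of what site-level origin metadata could look like. Everything in it is hypothetical: the `DATASET_ORIGINS` mapping and the `datasets_from` helper are not part of pmlb, the "winered" entry only reflects my unverified guess that it comes from UCI, and "example_dataset" is a placeholder name.

```python
# Hypothetical site-level origin metadata; NOT part of pmlb.
DATASET_ORIGINS = {
    "winered": "UCI",             # assumed origin, unverified
    "example_dataset": "OpenML",  # placeholder name for illustration only
}


def datasets_from(source, origins=DATASET_ORIGINS):
    """Return sorted dataset names whose recorded origin matches `source`.

    The comparison ignores case and surrounding whitespace.
    """
    wanted = source.strip().lower()
    return sorted(
        name for name, origin in origins.items() if origin.lower() == wanted
    )


if __name__ == "__main__":
    # List every dataset recorded as coming from UCI.
    print(datasets_from("uci"))
```

Even a flat mapping like this would probably be enough to filter datasets by source; richer fields (original URL, domain, task description) could be added later if they turn out to be useful.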
