BurntSushi/xsv

feature: binned histograms

Open

#84 opened on June 13, 2017

(3 comments) (2 reactions) (0 assignees) · Rust (9,730 stars) (315 forks) · batch import
Labels: enhancement, help wanted

Description

Histograms could be done in numerous ways. Here are some thoughts:

  • like most of xsv, should operate over huge tables with a single pass
  • .idx files could store metadata helpful for binning? e.g., min and max values per column
  • "choose N evenly spaced numerical bins" seems to require more than one pass (or keeping all values in memory). Keeping a tree of round-sized bins and merging them when the tree gets too big would avoid that
  • logarithmic bins
  • power-of-two or power-of-1024 bins (e.g., for file sizes)
  • binning of strings or decimal numbers by prefix
  • "other" / "NaN" / null bins
  • CSV/TSV output by default, then a separate mode like xsv table to pretty-print bars
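
To make the "tree of round-sized bins" idea concrete, here is a minimal single-pass sketch. It is not xsv code: `AdaptiveHistogram`, its fields, and its methods are hypothetical names, and it assumes "round-sized" can be approximated by doubling the bin width whenever the bin count exceeds a cap (a 1/2/5/10 progression would need a different merge rule). Fields that do not parse as finite numbers go into a single "other" bucket.

```rust
use std::collections::BTreeMap;

/// Hypothetical single-pass histogram (illustration only, not part of xsv).
/// Bin `k` covers the half-open range [k * width, (k + 1) * width).
struct AdaptiveHistogram {
    width: f64,
    max_bins: usize,
    bins: BTreeMap<i64, u64>,
    /// Count of empty, non-numeric, NaN, and infinite fields.
    other: u64,
}

impl AdaptiveHistogram {
    fn new(initial_width: f64, max_bins: usize) -> Self {
        // With fewer than 2 bins, values on both sides of zero could never
        // be merged into one bin, so coarsening would not terminate.
        assert!(max_bins >= 2, "max_bins must be at least 2");
        assert!(initial_width > 0.0, "initial_width must be positive");
        AdaptiveHistogram {
            width: initial_width,
            max_bins,
            bins: BTreeMap::new(),
            other: 0,
        }
    }

    /// Record one raw CSV field.
    fn add(&mut self, field: &str) {
        match field.trim().parse::<f64>() {
            Ok(x) if x.is_finite() => {
                let key = (x / self.width).floor() as i64;
                *self.bins.entry(key).or_insert(0) += 1;
                // Too many bins: merge neighbours until we are under the cap.
                while self.bins.len() > self.max_bins {
                    self.coarsen();
                }
            }
            _ => self.other += 1,
        }
    }

    /// Double the bin width and merge each pair of adjacent bins.
    fn coarsen(&mut self) {
        let mut merged = BTreeMap::new();
        for (&key, &count) in &self.bins {
            // div_euclid rounds toward negative infinity, so negative
            // bins merge consistently with positive ones.
            *merged.entry(key.div_euclid(2)).or_insert(0) += count;
        }
        self.bins = merged;
        self.width *= 2.0;
    }

    /// Rows of (lower bound, upper bound, count), sorted by lower bound.
    fn rows(&self) -> Vec<(f64, f64, u64)> {
        self.bins
            .iter()
            .map(|(&k, &c)| {
                let lo = k as f64 * self.width;
                (lo, lo + self.width, c)
            })
            .collect()
    }
}
```

Feeding every field of a column through `add` and then printing `rows()` as CSV would give the default output described above. Memory stays bounded by `max_bins` regardless of table size, at the cost of bin edges that depend on the chosen `initial_width` rather than on the column's exact min and max.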

