open-guides/og-aws

S3: Discuss ways to list and tally objects efficiently

Open

#58 opened on September 2, 2016

9 comments · 0 reactions · 0 assignees
Labels: batch import, help wanted, under discussion

Description

Topics:

  • Listing and pagination
  • Need for multi-threaded S3 crawl over keys for speed
    • Prefix-based listings, with delimiters
    • Hash-type prefixes with known alphabet, uniform distribution
  • Possibly: Reassigning work; using markers to optimize if alphabet is not known
  • Tallying usage with a MapReduce over keys that propagates usage up the folder hierarchy
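
To make the last topic concrete, here is a minimal single-process sketch of the roll-up a MapReduce job would perform: each key contributes its count and size to every ancestor "folder" prefix (the map step), and the contributions are summed per prefix (the reduce step). The function name `tally_by_folder`, the `(key, size)` input shape, and the use of `""` for the bucket root are assumptions made for illustration, not something this issue specifies.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def tally_by_folder(objects: Iterable[Tuple[str, int]],
                    delimiter: str = "/") -> Dict[str, Tuple[int, int]]:
    """Return {prefix: (object_count, total_bytes)}, rolled up to every ancestor.

    Hypothetical helper: `objects` is any iterable of (key, size_in_bytes).
    """
    counts: Dict[str, int] = defaultdict(int)
    sizes: Dict[str, int] = defaultdict(int)
    for key, size in objects:
        # Folder components of the key; the last element is the leaf name.
        folders = key.split(delimiter)[:-1]
        prefix = ""  # "" stands for the bucket root
        counts[prefix] += 1
        sizes[prefix] += size
        # "Map": emit one contribution per ancestor prefix.
        # "Reduce": the dictionaries sum contributions per prefix.
        for folder in folders:
            prefix += folder + delimiter
            counts[prefix] += 1
            sizes[prefix] += size
    return {p: (counts[p], sizes[p]) for p in counts}
```

In a distributed setting the same per-prefix sums could be computed independently per listing shard and then merged, since addition is associative.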

https://docs.aws.amazon.com/AmazonS3/latest/API/RESTBucketGET.html
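
As a starting point for the listing topics, here is a rough sketch of paginated listing plus a multi-threaded crawl that fans out over first-character prefixes. It assumes a boto3-style S3 client exposing `list_objects_v2` (the newer variant of the list operation linked above, which returns at most 1,000 keys per request), and it assumes keys start with a character from a known hex alphabet; keys outside that alphabet would be silently missed. The names `list_keys` and `parallel_list` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Dict, Iterator, List, Tuple


def list_keys(client: Any, bucket: str, prefix: str = "") -> Iterator[Tuple[str, int]]:
    """Yield (key, size) for every object under `prefix`, following pagination."""
    kwargs: Dict[str, Any] = {"Bucket": bucket, "Prefix": prefix}
    while True:
        page = client.list_objects_v2(**kwargs)
        # "Contents" is absent when a page has no matching keys.
        for obj in page.get("Contents", []):
            yield obj["Key"], obj["Size"]
        if not page.get("IsTruncated"):
            return
        # Resume where the previous page stopped.
        kwargs["ContinuationToken"] = page["NextContinuationToken"]


def parallel_list(client: Any, bucket: str,
                  alphabet: str = "0123456789abcdef",
                  workers: int = 8) -> List[Tuple[str, int]]:
    """List one prefix per alphabet character concurrently and merge results.

    Assumption: every key begins with a character in `alphabet`.
    """
    def crawl(first_char: str) -> List[Tuple[str, int]]:
        return list(list_keys(client, bucket, prefix=first_char))

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves the order of `alphabet` in its results.
        chunks = pool.map(crawl, alphabet)
        return [item for chunk in chunks for item in chunk]
```

With boto3 this might be called as `parallel_list(boto3.client("s3"), "my-bucket")`, where `my-bucket` is a placeholder. The speedup depends on keys being spread roughly evenly across the alphabet; with a skewed distribution, a few threads end up doing most of the work, which is where reassigning work or splitting on markers would come in.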
