open-guides/og-aws
S3: Discuss ways to list and tally objects efficiently
Open
#58 opened on Sep 2, 2016
Labels: help wanted, under discussion
Description
Topics:
- Listing and pagination
- Need for multi-threaded S3 crawl over keys for speed
- Prefix-based listings, with delimiters (separators)
- Hash-type prefixes with known alphabet, uniform distribution
- Possibly: Reassigning work; using markers to optimize if alphabet is not known
- Tallying usage with a MapReduce over keys that propagates usage up through each parent folder
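To make the topics above concrete, here is a rough Python sketch, not a definitive implementation. It assumes keys start with a hex character (the "known alphabet" case), that `client` is a boto3 S3 client (or anything exposing `list_objects_v2`), and that `/` is the folder separator. The helper names `list_prefix`, `tally_by_folder`, and `parallel_tally` are made up for illustration. It uses the newer `list_objects_v2` call with continuation tokens rather than the marker-based call in the linked doc.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

HEX_ALPHABET = "0123456789abcdef"  # assumption: keys begin with a hex character


def list_prefix(client, bucket, prefix):
    """Yield (key, size) for every object under prefix, following pagination."""
    kwargs = {"Bucket": bucket, "Prefix": prefix}
    while True:
        page = client.list_objects_v2(**kwargs)
        for obj in page.get("Contents", []):
            yield obj["Key"], obj["Size"]
        if not page.get("IsTruncated"):
            break
        kwargs["ContinuationToken"] = page["NextContinuationToken"]


def tally_by_folder(objects, sep="/"):
    """Sum sizes per folder, adding each object's size to every ancestor folder."""
    totals = Counter()
    for key, size in objects:
        parts = key.split(sep)[:-1]  # folder components, without the file name
        totals[""] += size           # "" stands for the bucket root
        for depth in range(1, len(parts) + 1):
            totals[sep.join(parts[:depth]) + sep] += size
    return totals


def parallel_tally(client, bucket, alphabet=HEX_ALPHABET, workers=16):
    """Crawl one prefix per alphabet character in parallel, then merge the tallies."""
    def crawl(ch):
        return tally_by_folder(list_prefix(client, bucket, ch))

    total = Counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for partial in pool.map(crawl, alphabet):
            total.update(partial)  # Counter.update adds counts together
    return total
```

With boto3 installed, this could be called as `parallel_tally(boto3.client("s3"), "my-bucket")`, where `my-bucket` is a placeholder. Each thread handles one prefix, so the speedup depends on keys being spread fairly evenly across the alphabet; a skewed distribution would leave most threads idle, which is where reassigning work comes in.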
https://docs.aws.amazon.com/AmazonS3/latest/API/RESTBucketGET.html
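
When the key alphabet is not known up front, one option is to ask S3 for the immediate "subfolders" under a prefix by passing a delimiter, then use those as the units of work. A minimal sketch, assuming a boto3-style client with `list_objects_v2` (the function name `list_subfolders` is hypothetical):

```python
def list_subfolders(client, bucket, prefix="", delimiter="/"):
    """Return the immediate child prefixes under prefix, following pagination."""
    kwargs = {"Bucket": bucket, "Prefix": prefix, "Delimiter": delimiter}
    found = []
    while True:
        page = client.list_objects_v2(**kwargs)
        # With a delimiter set, keys sharing a path segment are rolled up
        # into CommonPrefixes instead of being listed individually.
        found.extend(cp["Prefix"] for cp in page.get("CommonPrefixes", []))
        if not page.get("IsTruncated"):
            return found
        kwargs["ContinuationToken"] = page["NextContinuationToken"]
```

This only discovers one level at a time, so a deep or very wide hierarchy would still need several rounds of calls before the work can be split up.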