quickwit-oss/tantivy
Generate meaningful SegmentIDs instead of pure random
Open
#971 opened on Jan 7, 2021
Labels: enhancement, good first issue, high priority, quickwit
Description
Is your feature request related to a problem? Please describe.
Related to #969. I would like to suggest a cheap feature that will help with debugging in the future. SegmentIDs are currently generated randomly, which wastes 16 bytes that could otherwise be used to embed debugging info.
Describe the solution you'd like
Generate SegmentIDs containing the following info:
- timestamp of segment creation
- segment origin (merging or writing new data)
- hash of hostname, probably useful for those who will implement sharding/replication paired with Tantivy.
Additionally, we should ensure that enough randomness is left to avoid any possibility of collisions, and that names are not too long, to avoid metadata bloat (not sure whether this is an actual concern; the number of segments is supposed to be relatively low by design, afaik).
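To make the proposal concrete, here is a rough sketch of how the 16 bytes could be split. The layout (48-bit millisecond timestamp, 1 origin byte, 16-bit hostname hash, 56 random bits) and all names below (`SegmentOrigin`, `encode_segment_id`, and so on) are assumptions for illustration only, not Tantivy's existing API. The random bytes are passed in by the caller because the standard library has no random number generator; a real implementation would presumably fill them from whatever RNG the crate already uses.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// Hypothetical tag describing how a segment came to exist.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum SegmentOrigin {
    Write = 0,
    Merge = 1,
}

/// FNV-1a (32-bit), written out by hand so the hash stays stable
/// across Rust versions (unlike `DefaultHasher`).
fn fnv1a_32(data: &[u8]) -> u32 {
    let mut hash: u32 = 0x811c_9dc5;
    for &byte in data {
        hash ^= u32::from(byte);
        hash = hash.wrapping_mul(0x0100_0193);
    }
    hash
}

/// Current time in milliseconds since the Unix epoch (0 if the clock is before it).
fn now_ms() -> u64 {
    SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_millis() as u64)
        .unwrap_or(0)
}

/// Assumed layout of the 16-byte id:
///   bytes 0..6   creation time in ms, big-endian (low 48 bits)
///   byte  6      segment origin
///   bytes 7..9   hostname hash folded to 16 bits
///   bytes 9..16  caller-supplied random bytes (56 bits)
fn encode_segment_id(
    timestamp_ms: u64,
    origin: SegmentOrigin,
    hostname: &str,
    random: [u8; 7],
) -> [u8; 16] {
    let mut id = [0u8; 16];
    id[0..6].copy_from_slice(&timestamp_ms.to_be_bytes()[2..8]);
    id[6] = origin as u8;
    let hash = fnv1a_32(hostname.as_bytes());
    let folded = ((hash >> 16) ^ (hash & 0xffff)) as u16;
    id[7..9].copy_from_slice(&folded.to_be_bytes());
    id[9..16].copy_from_slice(&random);
    id
}

/// Recover the creation timestamp (ms) from an id.
fn decode_timestamp_ms(id: &[u8; 16]) -> u64 {
    let mut buf = [0u8; 8];
    buf[2..8].copy_from_slice(&id[0..6]);
    u64::from_be_bytes(buf)
}

/// Recover the origin tag, or `None` if the byte is not a known value.
fn decode_origin(id: &[u8; 16]) -> Option<SegmentOrigin> {
    match id[6] {
        0 => Some(SegmentOrigin::Write),
        1 => Some(SegmentOrigin::Merge),
        _ => None,
    }
}
```

Putting the timestamp first in big-endian order means ids sort roughly by creation time, which may be handy when scanning a directory listing. The trade-off is that only 56 bits remain random, so collision resistance relies on the timestamp and hostname hash differing as well; whether that is enough is exactly the open question raised above.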