Labels: enhancement, help wanted, possible mentorship
Description
Currently we have two analytics solutions for generating service maps:
- Jaeger Analytics Flink
  - Real-time streaming; requires Kafka.
  - More feature-rich: includes code for both 1-hop and transitive dependency graphs (https://www.jaegertracing.io/docs/1.47/features/#topology-graphs).
  - Aggregates data for a given time window (originally 15 min at Uber) and writes a summary snapshot to storage.
  - No easy deployment solution is provided in the repository.
- Spark Dependencies
  - Batch job that reads all data for a period of time, aggregates it, and writes a summary snapshot to storage.
  - Does not require Kafka.
  - Theoretically can be run as often as every 15 min to produce results similar to the Flink job above, but the Cassandra implementation may need to be tweaked for that.
  - Does not support transitive dependency graphs.
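To make the difference between the two graph types concrete, here is a minimal Python sketch. It is not the code from either project: the `Span` shape, `one_hop_edges`, and `service_paths` are hypothetical names, and the "transitive" view is simplified to counting root-to-leaf service paths per trace.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Span:
    # Hypothetical, simplified span model (not the Jaeger data model).
    trace_id: str
    span_id: str
    parent_id: Optional[str]
    service: str


def one_hop_edges(spans: Iterable[Span]) -> Counter:
    """Count direct parent-service -> child-service calls (1-hop graph)."""
    spans = list(spans)
    by_id = {(s.trace_id, s.span_id): s for s in spans}
    edges = Counter()
    for s in spans:
        parent = by_id.get((s.trace_id, s.parent_id))
        # Skip root spans and spans that stay inside the same service.
        if parent is not None and parent.service != s.service:
            edges[(parent.service, s.service)] += 1
    return edges


def service_paths(spans: Iterable[Span]) -> Counter:
    """Count root-to-leaf service paths (a simplified transitive view)."""
    spans = list(spans)
    by_id = {(s.trace_id, s.span_id): s for s in spans}
    has_children = {(s.trace_id, s.parent_id) for s in spans if s.parent_id}
    paths = Counter()
    for s in spans:
        if (s.trace_id, s.span_id) in has_children:
            continue  # only start walking from leaf spans
        chain, seen, node = [], set(), s
        while node is not None and (node.trace_id, node.span_id) not in seen:
            seen.add((node.trace_id, node.span_id))  # guard against cycles
            if not chain or chain[-1] != node.service:
                chain.append(node.service)  # collapse same-service hops
            node = by_id.get((node.trace_id, node.parent_id))
        paths[tuple(reversed(chain))] += 1
    return paths
```

The 1-hop result only says who calls whom directly, while the path counts keep the full chain, which is the information a transitive graph needs and a per-edge summary loses.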
Objectives:
- Ideally we want a single code base that supports both types of service dependencies.
- The solution needs to be documented, packaged (e.g. published containers), and easy to deploy (e.g. with Docker Compose or a Kubernetes operator).
- Supporting both batch (goes directly against span storage) and streaming (reads from Kafka) is nice to have.
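One way to picture the "single code base" objective is a shared aggregation function with thin batch and streaming front-ends. The Python sketch below is only an illustration under stated assumptions: the `Call` tuple, `aggregate`, `run_batch`, and `run_streaming` are hypothetical names, the input is an in-memory iterable rather than span storage or Kafka, and the streaming path assumes calls arrive in timestamp order (a real streaming job would need watermarks or late-data handling).

```python
from collections import Counter
from typing import Iterable, Iterator, List, Tuple

# Hypothetical record: (timestamp in seconds, parent service, child service).
Call = Tuple[float, str, str]

WINDOW_SECONDS = 15 * 60  # the 15 min window mentioned in the issue


def aggregate(calls: Iterable[Call]) -> Counter:
    """Shared core: count calls per (parent, child) service pair."""
    counts = Counter()
    for _, parent, child in calls:
        counts[(parent, child)] += 1
    return counts


def run_batch(calls: Iterable[Call], start: float, end: float) -> Counter:
    """Batch mode: aggregate every call with start <= timestamp < end."""
    return aggregate(c for c in calls if start <= c[0] < end)


def run_streaming(
    calls: Iterable[Call], window: float = WINDOW_SECONDS
) -> Iterator[Tuple[float, Counter]]:
    """Streaming mode: emit one snapshot per tumbling window.

    Assumes calls arrive ordered by timestamp.
    """
    current_start = None
    buffer: List[Call] = []
    for call in calls:
        window_start = call[0] - (call[0] % window)
        if current_start is not None and window_start != current_start:
            yield current_start, aggregate(buffer)  # window closed
            buffer = []
        current_start = window_start
        buffer.append(call)
    if buffer:
        yield current_start, aggregate(buffer)  # flush the last window
```

Because both modes call the same `aggregate`, a batch run over a given window and the streaming snapshot for that window should agree, which is the property a unified implementation would want to preserve.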