jaegertracing/jaeger

Productionize Streaming Jobs for Service Dependencies

Open

#4590 opened on Jul 18, 2023

Labels: enhancement, help wanted, possible mentorship

Description

Currently we have two analytics solutions for generating service maps:

  • Jaeger Analytics Flink
    • Real time streaming, requires Kafka.
    • More feature-rich: includes code for both 1-hop and transitive dependency graphs (see https://www.jaegertracing.io/docs/1.47/features/#topology-graphs).
    • Aggregates data for a given time window (originally 15 minutes at Uber) and writes a summary snapshot to storage.
    • No easy deployment solution is provided in the repository.
  • Spark Dependencies
    • Batch job that reads all data for a period of time, aggregates, and writes a summary snapshot to storage.
    • Does not require Kafka.
    • In theory, it can be run as often as every 15 minutes to produce results similar to the Flink jobs above, but the Cassandra implementation may need to be tweaked for that.
    • Does not support transitive dependency graphs.
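
To make the difference between the two graph types concrete, here is a minimal Python sketch. It is not taken from either code base: the `Span` type and the functions `one_hop_links` and `transitive_links` are hypothetical names, and "transitive" is simplified here to mean any ancestor-to-descendant service pair within a trace, which is likely coarser than what the Flink jobs actually compute.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, Optional, Set, Tuple


@dataclass(frozen=True)
class Span:
    # Hypothetical, stripped-down span model used only for this illustration.
    trace_id: str
    span_id: str
    parent_id: Optional[str]
    service: str


def one_hop_links(spans: Iterable[Span]) -> Counter:
    """Count direct parent -> child calls that cross a service boundary."""
    spans = list(spans)
    by_id = {(s.trace_id, s.span_id): s for s in spans}
    links: Counter = Counter()
    for s in spans:
        parent = by_id.get((s.trace_id, s.parent_id))
        if parent is not None and parent.service != s.service:
            links[(parent.service, s.service)] += 1
    return links


def transitive_links(spans: Iterable[Span]) -> Set[Tuple[str, str]]:
    """Collect (upstream, downstream) service pairs along every ancestor chain."""
    spans = list(spans)
    by_id = {(s.trace_id, s.span_id): s for s in spans}
    pairs: Set[Tuple[str, str]] = set()
    for s in spans:
        seen = set()  # guards against parent cycles in malformed traces
        parent = by_id.get((s.trace_id, s.parent_id))
        while parent is not None and parent.span_id not in seen:
            seen.add(parent.span_id)
            if parent.service != s.service:
                pairs.add((parent.service, s.service))
            parent = by_id.get((parent.trace_id, parent.parent_id))
    return pairs


# One trace: frontend -> checkout -> payments
example_trace = [
    Span("t1", "a", None, "frontend"),
    Span("t1", "b", "a", "checkout"),
    Span("t1", "c", "b", "payments"),
]
```

For `example_trace`, the 1-hop view contains only the two direct edges, while the transitive view also reports that `frontend` depends on `payments` through `checkout`.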

Objectives:

  • Ideally we want a single code base that supports both types of service dependencies
  • The solution needs to be documented, packaged (e.g. published containers) and easy to deploy (e.g. with docker compose or k8s operator)
  • Supporting both batch (goes directly against span storage) and streaming (reads from Kafka) is nice to have
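
One way to picture the "single code base" objective is a shared aggregation core that does not care where its input comes from. The sketch below is only an illustration under that assumption: `DependencyAggregator`, `batch_source`, and `stream_source` are hypothetical names, the sources are in-memory stand-ins rather than real span-storage or Kafka clients, and the 15-minute window mirrors the figure mentioned above.

```python
from collections import Counter, defaultdict
from typing import DefaultDict, Iterable, Iterator, List, Tuple

# (unix timestamp in seconds, parent service, child service)
Call = Tuple[int, str, str]

WINDOW_SECONDS = 15 * 60  # 15-minute tumbling windows


class DependencyAggregator:
    """Counts service-to-service calls per tumbling time window."""

    def __init__(self, window_seconds: int = WINDOW_SECONDS) -> None:
        self.window_seconds = window_seconds
        self.windows: DefaultDict[int, Counter] = defaultdict(Counter)

    def add(self, call: Call) -> None:
        ts, parent, child = call
        window_start = ts - ts % self.window_seconds  # align to window boundary
        self.windows[window_start][(parent, child)] += 1

    def consume(self, calls: Iterable[Call]) -> "DependencyAggregator":
        for call in calls:
            self.add(call)
        return self


def batch_source(stored: List[Call]) -> Iterator[Call]:
    # Stand-in for reading a time range directly from span storage.
    return iter(sorted(stored))


def stream_source(messages: Iterable[Call]) -> Iterator[Call]:
    # Stand-in for consuming messages one at a time from a topic.
    for message in messages:
        yield message


calls = [(0, "a", "b"), (100, "a", "b"), (900, "a", "b"), (1000, "b", "c")]
from_batch = DependencyAggregator().consume(batch_source(calls)).windows
from_stream = DependencyAggregator().consume(stream_source(calls)).windows
```

Because both paths feed the same `add` method, the batch and streaming runs produce identical per-window summaries for the same input; only the source adapter differs.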
