jaegertracing/jaeger

Productionize Streaming Jobs for Service Dependencies

Open

#4590, created on July 18, 2023

Labels: enhancement, help wanted, possible mentorship

Description

Currently we have two analytics solutions for generating service maps:

  • Jaeger Analytics Flink
    • Real time streaming, requires Kafka.
    • More feature-rich: includes code for both 1-hop and transitive dependency graphs -- https://www.jaegertracing.io/docs/1.47/features/#topology-graphs
    • Aggregates data for a given time window (originally 15 min at Uber) and writes a summary snapshot to storage.
    • No easy deployment solution is provided in the repository.
  • Spark Dependencies
    • Batch job that reads all data for a period of time, aggregates, and writes a summary snapshot to storage.
    • Does not require Kafka.
    • In theory it can be run as often as every 15 min to produce results similar to the Flink jobs above, but the Cassandra implementation may need to be tweaked for that.
    • Does not support transitive dependency graphs.
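
To make the difference between the two graph types concrete, here is a minimal sketch in Python of what such a job might compute for one window. It is not taken from either code base: the `Span` fields, the function names, and the 15 min tumbling window are all assumptions for illustration, and the transitive version below is a simple ancestor/descendant reachability that may differ from what the Flink jobs actually produce.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Set, Tuple

WINDOW_SECONDS = 15 * 60  # assumed window size (the 15 min mentioned above)


@dataclass(frozen=True)
class Span:
    """Hypothetical, stripped-down span record (not Jaeger's real model)."""
    trace_id: str
    span_id: str
    parent_id: Optional[str]  # None for a root span
    service: str
    start_time: float  # seconds since epoch


def window_start(ts: float, size: int = WINDOW_SECONDS) -> int:
    """Start of the tumbling window that contains ts."""
    return int(ts // size) * size


def one_hop_links(spans: Iterable[Span]) -> Counter:
    """Count direct parent-service -> child-service calls."""
    by_id = {(s.trace_id, s.span_id): s for s in spans}
    links: Counter = Counter()
    for s in by_id.values():
        parent = by_id.get((s.trace_id, s.parent_id))
        # Skip roots, missing parents, and calls within the same service.
        if parent is not None and parent.service != s.service:
            links[(parent.service, s.service)] += 1
    return links


def transitive_links(spans: Iterable[Span]) -> Set[Tuple[str, str]]:
    """(ancestor service, descendant service) pairs reachable within a trace."""
    by_id = {(s.trace_id, s.span_id): s for s in spans}
    pairs: Set[Tuple[str, str]] = set()
    for s in by_id.values():
        seen: Set[str] = set()  # guards against malformed parent cycles
        cur = by_id.get((s.trace_id, s.parent_id))
        while cur is not None and cur.span_id not in seen:
            seen.add(cur.span_id)
            if cur.service != s.service:
                pairs.add((cur.service, s.service))
            cur = by_id.get((cur.trace_id, cur.parent_id))
    return pairs


def snapshots(spans: Iterable[Span]) -> Dict[int, Counter]:
    """One 1-hop summary per window, keyed by window start time."""
    buckets: Dict[int, list] = {}
    for s in spans:
        buckets.setdefault(window_start(s.start_time), []).append(s)
    return {w: one_hop_links(group) for w, group in sorted(buckets.items())}
```

Note that this sketch buckets spans by their own start time, so a trace that crosses a window boundary loses the link between the two halves. A real job would need to decide how to handle that, for example by assigning a whole trace to one window.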

Objectives:

  • Ideally we want a single code base that supports both types of service dependencies
  • The solution needs to be documented, packaged (e.g. published containers) and easy to deploy (e.g. with docker compose or k8s operator)
  • Supporting both batch (goes directly against span storage) and streaming (reads from Kafka) is nice to have
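
One possible shape for a shared code base, sketched in Python under stated assumptions: the aggregation logic is written once against a plain iterable of already-resolved `(timestamp, parent service, child service)` records, so a batch reader over span storage and a Kafka consumer could both feed it. The record shape and the `aggregate` function are hypothetical. The sketch also assumes records arrive roughly in time order, and it leaves out the late-data handling (e.g. watermarks) that a real streaming job needs.

```python
from collections import Counter
from typing import Iterable, Iterator, Tuple

# Hypothetical record: (timestamp in seconds, parent service, child service).
Record = Tuple[float, str, str]


def aggregate(
    records: Iterable[Record], window_seconds: int = 900
) -> Iterator[Tuple[int, Counter]]:
    """Yield (window start, call counts) each time the window changes.

    Works on any iterable, so the same code can serve a finite batch read
    or an unbounded stream. Assumes time-ordered input.
    """
    current = None
    counts: Counter = Counter()
    for ts, parent, child in records:
        window = int(ts // window_seconds) * window_seconds
        if current is not None and window != current:
            yield current, counts  # the snapshot a job would write to storage
            counts = Counter()
        current = window
        counts[(parent, child)] += 1
    if current is not None:
        yield current, counts  # flush the last (possibly partial) window


# Batch mode: a finite list, as if read directly from span storage.
batch = [
    (0.0, "frontend", "api"),
    (10.0, "api", "db"),
    (905.0, "frontend", "api"),
]


# Streaming mode: a generator standing in for a Kafka consumer loop.
def stream() -> Iterator[Record]:
    yield from batch


for window, counts in aggregate(stream()):
    print(window, dict(counts))
```

With an unbounded source the final flush never runs, so the last window is only emitted once a record from a later window arrives. That is one of the places where batch and streaming behavior would have to be reconciled.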
