[flyte2] Add gRPC/Connect RPC metrics interceptor to the runs service
#7447 opened on May 29, 2026
Description
Part of #7445. Depends on #7446 (the /metrics endpoint + Scope must exist first).
Summary
Add RPC-level Prometheus metrics (request count, error count, latency) to the runs service by attaching a shared Connect interceptor to every service handler.
Background
The runs service is a Connect (connectrpc.com/connect) server. Handlers are mounted in runs/setup.go via calls like:
runsPath, runsHandler := workflowconnect.NewRunServiceHandler(runsSvc)
sc.Mux.Handle(runsPath, runsHandler)
There are currently no interceptors anywhere in the v2 tree, so no RPC metrics are emitted.
What to do
- Write a Connect interceptor (a connect.UnaryInterceptorFunc / connect.Interceptor) that records, per RPC procedure:
  - request count (e.g. requests_total labeled by procedure)
  - error count (labeled by procedure, and ideally connect.CodeOf(err))
  - latency (a Prometheus histogram / Scope.MustNewStopWatch style timer)
  Use the sc.Scope provided by #7446 to create the metrics (e.g. a sub-scope sc.Scope.NewSubScope("grpc")).
- Pass the interceptor to every New*ServiceHandler(...) call in runs/setup.go via connect.WithInterceptors(...), e.g.:
  interceptors := connect.WithInterceptors(metricsInterceptor)
  runsPath, runsHandler := workflowconnect.NewRunServiceHandler(runsSvc, interceptors)
  Apply it to RunService, InternalRunService, TaskService, IdentityService, AuthMetadataService, TriggerService, ProjectService (and RunLogsService when mounted).
Acceptance criteria
- After making RPC calls, /metrics exposes per-procedure request count, error count, and latency metrics.
- The interceptor is shared/created once and reused across all handlers.
- A unit test verifies the interceptor increments the request counter (and error counter on error) for a sample procedure.
Pointers
- runs/setup.go: all the sc.Mux.Handle(...) registrations (lines ~78-120+).
- Connect interceptor docs: https://connectrpc.com/docs/go/interceptors/
- flytestdlib/promutils/scope.go: Scope helpers (MustNewCounter, MustNewStopWatch, NewSubScope, etc.).
Notes for contributors
- Keep label cardinality bounded — label by procedure name and status code, not by arbitrary user input.
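One way to enforce a bound on label values, shown here only as a sketch: map any value outside a fixed set to a catch-all before using it as a label. The helper name boundedLabel and the "other" bucket are assumptions for illustration, not part of the existing code.

```go
package main

// boundedLabel returns value when it belongs to the fixed allowed set and
// the catch-all "other" otherwise, so the number of distinct label values
// can never exceed len(allowed)+1.
func boundedLabel(allowed map[string]struct{}, value string) string {
	if _, ok := allowed[value]; ok {
		return value
	}
	return "other"
}
```

With the set populated from the registered procedure names, a stray value such as a run ID or user-supplied string collapses into "other" instead of creating a new time series.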