PostHog/posthog

chore(flags): inject governor Clock to make rate-limit tests deterministic

Open

#55412 opened on Apr 21, 2026

View on GitHub
 (1 comment) (0 reactions) (1 assignee)Python (16,704 stars) (988 forks)batch import
feature/feature-flagsgood first issueteam/flags-platform

Description

Background

PR #55399 fixed a 100% test failure rate on Blacksmith CI runners for the rust/feature-flags rate-limit tests. The root cause is that governor's default QuantaClock is TSC-backed, and TSC calibration differs across hypervisors (Blacksmith vs Depot), causing the token bucket to replenish faster than wall-clock seconds.

The test-only fix in #55399 widens the rate-limit windows (1/second1/minute) so that wall-clock waits dominate any TSC drift. It ships real-time sleeps up to 11 seconds, which is a reasonable CI cost but leaves two things unresolved:

  1. The tests are still non-deterministic in principle. They rely on real elapsed wall-clock time rather than a controllable clock.
  2. Production code uses the default QuantaClock by construction, so any production rate limiter with sub-minute granularity has the same TSC exposure on the same hypervisors. For per-minute-or-longer windows this is noise, but worth auditing.

Proposed approach

governor supports injecting a Clock when constructing a RateLimiter, and ships a FakeRelativeClock explicitly for tests. Introducing a clock seam lets tests be fully deterministic (advance the clock instantly) and removes wall-clock sleeps from the suite.

Rough shape:

pub struct FlagsRateLimiter<C: Clock = QuantaClock> {
    inner: RateLimiter<SomeKey, SomeState, C>,
}

impl<C: Clock> FlagsRateLimiter<C> {
    pub fn new_with_clock(quota: Quota, clock: C) -> Self { ... }
}

// tests
let clock = FakeRelativeClock::default();
let limiter = FlagsRateLimiter::new_with_clock(quota, clock.clone());
clock.advance(Duration::from_secs(60));   // no real sleep

Scope

  • Identify every governor::RateLimiter construction in rust/feature-flags/.
  • Audit which are per-team keyed vs direct; DashMapStateStore-backed keyed limiters need the same clock plumbing and the type signatures get verbose, so a short spike first is worthwhile.
  • Add a Clock-parameterized construction path for production (default QuantaClock) and tests (FakeRelativeClock).
  • Convert the tests in rust/feature-flags/tests/test_flag_definitions.rs and test_rate_limiting.rs to use FakeRelativeClock and revert #55399's window widening + multi-second sleeps where possible.
  • Consider whether any production limiter with sub-minute granularity should use MonotonicClock instead of the default QuantaClock.

Non-goals

  • Not blocking on this. #55399 is the right short-term fix; this issue is the principled cleanup.
  • Not changing rate-limit semantics for users.

Context

Contributor guide