Support periodic DNS re-resolution for FQDN targets discovered via Consul SD
#18,387 建立於 2026年3月27日
描述
Proposal
In dynamic environments where services are accessed via FQDN (e.g., multi-tenant systems behind nginx), IP addresses frequently change due to instance rescheduling or migration.
In our setup, Prometheus uses Consul service discovery and targets are defined via FQDN. When the underlying IP changes, Prometheus does not re-resolve DNS for already discovered targets. As a result, targets become permanently DOWN even though they are healthy and reachable.
This behavior creates a significant operational issue:
- Monitoring becomes unreliable
- Alerts are triggered incorrectly
- Manual intervention is required to recover
We verified that:
- DNS updates correctly reflect new IPs
- Prometheus only re-resolves DNS when the target is rediscovered (e.g., after deregistration and ~6 minutes delay)
- Restarting Prometheus also restores correct behavior, as DNS is resolved again on startup
This demonstrates that DNS resolution is tied to discovery events and process lifecycle, rather than ongoing scraping.
In environments where IPs are ephemeral and FQDN is the only stable identifier, this limitation makes Prometheus difficult to operate reliably without additional workarounds.