envoyproxy/envoy

dynamic_modules/tls: support auto_host_sni for runtime hosts and SNI-scoped session reuse

Open

#45962 opened on July 3, 2026

help wanted

Description

Title: dynamic_modules/tls: support auto_host_sni for runtime hosts and SNI-scoped session reuse

Description:

I would like to propose making auto_host_sni usable with hosts added at runtime by dynamic-module clusters, and making upstream TLS session reuse safe when one shared UpstreamTlsContext connects to multiple effective SNI names.

cc @wbpcode for visibility/context.

Today, dynamic-module clusters can add hosts dynamically, but the host-add ABI only carries socket addresses. For HTTPS upstreams using:

auto_host_sni: true
auto_sni_san_validation: true

Envoy needs a logical hostname on the selected HostDescription, separate from the concrete socket address used to connect. Without it, the only way for runtime-added hosts to get host-driven SNI and SAN validation is to push per-host transport socket config through xDS.
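
To make the address/hostname split concrete, here is a minimal sketch of how an effective SNI could be derived per selected host. `HostInfo`, `TlsOptions`, and `effectiveSni` are hypothetical names for illustration only, not Envoy types, and the fallback order (host hostname first, then a statically configured SNI, then no SNI) is an assumption rather than a description of the current implementation.

```cpp
#include <optional>
#include <string>

// Hypothetical stand-in for the per-host data the issue talks about.
struct HostInfo {
  std::string address;   // concrete socket address used to connect
  std::string hostname;  // logical hostname; empty for address-only hosts
};

// Hypothetical subset of the shared upstream TLS settings.
struct TlsOptions {
  bool auto_host_sni = false;
  std::string configured_sni;  // static SNI from the TLS context, may be empty
};

// Returns the SNI to present when connecting to `host`, or nullopt for no SNI.
// Assumed precedence: host hostname (when auto_host_sni is on), then static SNI.
inline std::optional<std::string> effectiveSni(const TlsOptions& tls,
                                               const HostInfo& host) {
  if (tls.auto_host_sni && !host.hostname.empty()) {
    return host.hostname;
  }
  if (!tls.configured_sni.empty()) {
    return tls.configured_sni;
  }
  return std::nullopt;
}
```

With address-only hosts the hostname is empty, so the first branch never fires and every runtime-added host ends up with the same (or no) SNI, which is the gap described above.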

There is also a related TLS correctness issue: upstream client TLS sessions are currently cached at ClientContextImpl scope. That is fine when a client TLS context maps to one server name, but when effective SNI varies by selected host, a session established for one SNI must not be offered to another SNI.

The proposed behavior is:

  1. Extend the dynamic-module cluster host-add API so runtime-added hosts may carry an optional logical hostname.
  2. Scope upstream client TLS session caching by effective SNI.
  3. Include the router/async host-selection support needed for this to work after async ChooseHost.
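
As an illustration of item 2, the following is a rough sketch of a bounded session cache keyed by effective SNI. It is not the proposed Envoy implementation: `SniSessionCache` is a hypothetical class, `Session` is an opaque string standing in for a real TLS session handle, and the choices to keep one session per SNI, to evict the least recently used SNI, and to treat the empty SNI as its own bucket are all assumptions made for the example.

```cpp
#include <cstddef>
#include <list>
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>

// Opaque stand-in for a TLS session; a real cache would hold session handles.
using Session = std::string;

// Hypothetical bounded cache: at most `max_snis` SNI buckets, LRU eviction.
class SniSessionCache {
public:
  explicit SniSessionCache(std::size_t max_snis) : max_snis_(max_snis) {}

  // Stores the newest session for `sni`, evicting the least recently used
  // SNI bucket when the cache is full.
  void store(const std::string& sni, Session session) {
    if (max_snis_ == 0) return;  // caching disabled
    auto it = index_.find(sni);
    if (it != index_.end()) {
      it->second->second = std::move(session);
      lru_.splice(lru_.begin(), lru_, it->second);  // mark as most recent
      return;
    }
    if (index_.size() >= max_snis_) {
      index_.erase(lru_.back().first);
      lru_.pop_back();
    }
    lru_.emplace_front(sni, std::move(session));
    index_[sni] = lru_.begin();
  }

  // Returns a session only if one was stored under exactly this SNI.
  std::optional<Session> lookup(const std::string& sni) {
    auto it = index_.find(sni);
    if (it == index_.end()) return std::nullopt;
    lru_.splice(lru_.begin(), lru_, it->second);  // mark as most recent
    return it->second->second;
  }

  std::size_t size() const { return index_.size(); }

private:
  using Entry = std::pair<std::string, Session>;
  std::size_t max_snis_;
  std::list<Entry> lru_;  // front = most recently used
  std::unordered_map<std::string, std::list<Entry>::iterator> index_;
};
```

The key property is in `lookup`: a session stored for `a.example.com` is never returned for `b.example.com`, while repeated connections to the same SNI can still resume. The bound keeps memory predictable when a cluster cycles through many hostnames.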

Public API/interface notes:

  • Add a dynamic-module cluster ABI path for adding hosts with optional hostnames.
  • Preserve existing address-only ABI behavior and compatibility.
  • Add or discuss TLS config surface for bounded SNI-scoped client session caching.
  • No xDS per-host transport_socket_matches should be required for the target use case.
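
To show the shape of the first two bullets, here is a small self-contained sketch of an additive host-add path. All names (`ExampleCluster`, `ExampleBuffer`, `exampleAddHost`, `exampleAddHostWithHostname`) are hypothetical and do not correspond to the real dynamic-module ABI symbols; the pointer-plus-length buffer and the rule that a null or empty hostname means legacy address-only behavior are assumptions for illustration.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record of a host added at runtime.
struct AddedHost {
  std::string address;
  std::string hostname;  // empty means no logical hostname (legacy behavior)
};

// Hypothetical in-memory stand-in for the Envoy-side host set.
struct ExampleCluster {
  std::vector<AddedHost> hosts;
};

// Hypothetical borrowed, non-owning buffer as an ABI might pass it.
struct ExampleBuffer {
  const char* ptr;
  std::size_t len;
};

// New-style path: address plus optional hostname.
// A null or zero-length hostname is treated exactly like the address-only path.
inline bool exampleAddHostWithHostname(ExampleCluster& cluster,
                                       ExampleBuffer address,
                                       ExampleBuffer hostname) {
  if (address.ptr == nullptr || address.len == 0) {
    return false;  // an address is always required
  }
  std::string name;
  if (hostname.ptr != nullptr && hostname.len > 0) {
    name.assign(hostname.ptr, hostname.len);
  }
  cluster.hosts.push_back(
      AddedHost{std::string(address.ptr, address.len), std::move(name)});
  return true;
}

// Existing-style path: address only. Kept as a thin wrapper so current
// callers see no behavior change.
inline bool exampleAddHost(ExampleCluster& cluster, ExampleBuffer address) {
  return exampleAddHostWithHostname(cluster, address, ExampleBuffer{nullptr, 0});
}
```

Keeping the address-only entry point as a wrapper over the new one is one way to preserve compatibility: modules built against the old path keep working, and only modules that opt in supply a hostname.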

The PoC demonstrates one dynamic-module cluster with two HTTPS upstreams. Each runtime-added host has:

  • a concrete resolved socket address for connection, and
  • a distinct logical hostname for SNI/SAN validation.

The Envoy config uses one shared UpstreamTlsContext with:

auto_host_sni: true
auto_sni_san_validation: true

Expected validation for an upstream implementation:

  • dynamic-module cluster tests cover hostnames passed through the ABI, null/empty hostname legacy behavior, and synthesized hostname preservation.
  • TLS tests cover session reuse within the same SNI and no reuse across different SNI names.
  • TLS tests cover bounded eviction and empty-SNI behavior if SNI-scoped caching is configurable/bounded.
  • router/async tests cover worker-local host resolution and transport socket option rebuild after async host selection.
  • integration or regression coverage alternates requests between at least two runtime-added HTTPS hosts with distinct hostnames and verifies:
    • both upstreams complete TLS successfully,
    • SAN validation uses the selected host hostname,
    • request order does not affect correctness,
    • session resumption remains possible within the same SNI bucket only.

I am happy to split the implementation into reviewable PRs if maintainers prefer, but I think the issue should track the full behavior because the pieces interact.
