[Feature][Zeta] Implement proper dry-run mode with progressive validation layers
#10681 opened on Mar 31, 2026
Description
Motivation
SeaTunnel currently lacks a meaningful dry-run capability. The existing --check flag (SeaTunnelConfValidateCommand) is effectively a stub — its execute() body contains only a // TODO: validate config using new api comment and performs no real validation beyond checking that the config file path exists.
This means users must submit jobs to a cluster (or run them fully in local mode) just to discover problems like:
- Typos in option names
- Missing required fields
- Wrong types for option values
- Unreachable source/sink systems
- Schema mismatches between source output and sink expectation
- Broken transform SQL expressions
These failures account for the majority of user-filed bugs and support requests, yet they could all be caught before a single byte of data is read or written.
Proposed Solution
Introduce a proper --dry-run mode with four progressive validation layers, each independently selectable via --dry-run=<level>:
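As a rough illustration of how the `--dry-run=<level>` value might be parsed, including the `sample[=N]` form described under Layer 2, here is a minimal sketch. `DryRunSpec` and its members are hypothetical names invented for this example, not existing SeaTunnel classes, and rejecting `=N` on the other levels is an assumption.

```java
import java.util.Locale;

/** Hypothetical parsed form of --dry-run=<level>; not an existing SeaTunnel class. */
final class DryRunSpec {
    enum Level { STATIC, CONNECT, SAMPLE, SHADOW }

    static final int DEFAULT_SAMPLE_ROWS = 100; // default N from the proposal

    final Level level;
    final int sampleRows; // only meaningful when level == SAMPLE

    private DryRunSpec(Level level, int sampleRows) {
        this.level = level;
        this.sampleRows = sampleRows;
    }

    /** Parses the text after "--dry-run=", e.g. "static", "sample", "sample=500". */
    static DryRunSpec parse(String value) {
        String v = value.trim().toLowerCase(Locale.ROOT);
        int eq = v.indexOf('=');
        String name = eq < 0 ? v : v.substring(0, eq);
        String arg = eq < 0 ? null : v.substring(eq + 1);

        Level level;
        switch (name) {
            case "static":  level = Level.STATIC;  break;
            case "connect": level = Level.CONNECT; break;
            case "sample":  level = Level.SAMPLE;  break;
            case "shadow":  level = Level.SHADOW;  break;
            default:
                throw new IllegalArgumentException("Unknown dry-run level: " + name);
        }
        // Assumption: only the sample level takes an "=N" argument.
        if (arg != null && level != Level.SAMPLE) {
            throw new IllegalArgumentException("Only 'sample' accepts '=N', got: " + value);
        }
        int rows = DEFAULT_SAMPLE_ROWS;
        if (arg != null) {
            try {
                rows = Integer.parseInt(arg);
            } catch (NumberFormatException e) {
                throw new IllegalArgumentException("Sample size must be an integer: " + arg, e);
            }
            if (rows <= 0) {
                throw new IllegalArgumentException("Sample size must be positive: " + rows);
            }
        }
        return new DryRunSpec(level, rows);
    }
}
```

Failing loudly on an unknown level keeps a mistyped flag from silently falling back to a real run.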
Layer 0: Static Analysis (no network, local files only)
Triggered by: --dry-run=static
- Config file syntax parsing (HOCON/YAML validity)
- All Option keys validated against the connector's declared option set (catches typos, unknown fields)
- Required options presence check
- Option value type validation (e.g., string value in an integer field)
- Plugin class loadability check (connector JAR exists and class is loadable)
- DAG topology validation (at least one Source, at least one Sink, no cycles)
- Transform SQL syntax parsing (parse-only, no execution)
Cost: milliseconds, zero external dependencies.
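The option checks in Layer 0 (unknown keys, missing required options, wrong value types) could look roughly like the sketch below. `OptionSpec` and `StaticOptionValidator` are hypothetical types made up for illustration; SeaTunnel's real option API is not used here. For readability the sketch collects every problem in one pass, whereas a strictly fast-fail variant would return on the first one.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical declaration of one connector option; not SeaTunnel's real Option API. */
final class OptionSpec {
    final String key;
    final Class<?> type;
    final boolean required;

    OptionSpec(String key, Class<?> type, boolean required) {
        this.key = key;
        this.type = type;
        this.required = required;
    }
}

final class StaticOptionValidator {
    /** Returns one message per problem; an empty list means the config passed. */
    static List<String> validate(List<OptionSpec> declared, Map<String, Object> config) {
        Map<String, OptionSpec> byKey = new LinkedHashMap<>();
        for (OptionSpec spec : declared) {
            byKey.put(spec.key, spec);
        }
        List<String> errors = new ArrayList<>();
        for (Map.Entry<String, Object> e : config.entrySet()) {
            OptionSpec spec = byKey.get(e.getKey());
            if (spec == null) {
                String hint = closest(e.getKey(), byKey.keySet());
                errors.add("unknown option '" + e.getKey() + "'"
                        + (hint == null ? "" : " (did you mean '" + hint + "'?)"));
            } else if (e.getValue() != null && !spec.type.isInstance(e.getValue())) {
                errors.add("option '" + spec.key + "' expects " + spec.type.getSimpleName()
                        + " but got " + e.getValue().getClass().getSimpleName());
            }
        }
        for (OptionSpec spec : declared) {
            if (spec.required && !config.containsKey(spec.key)) {
                errors.add("missing required option '" + spec.key + "'");
            }
        }
        return errors;
    }

    /** Suggests a declared key within edit distance 2 of the unknown key, or null. */
    private static String closest(String key, Iterable<String> candidates) {
        String best = null;
        int bestDist = 3;
        for (String c : candidates) {
            int d = editDistance(key, c);
            if (d < bestDist) {
                bestDist = d;
                best = c;
            }
        }
        return best;
    }

    /** Standard Levenshtein distance computed with two rolling rows. */
    private static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] cur = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) prev[j] = j;
        for (int i = 1; i <= a.length(); i++) {
            cur[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int sub = prev[j - 1] + (a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1);
                cur[j] = Math.min(sub, Math.min(prev[j] + 1, cur[j - 1] + 1));
            }
            int[] t = prev;
            prev = cur;
            cur = t;
        }
        return prev[b.length()];
    }
}
```

The "did you mean" hint is an addition of this sketch rather than part of the proposal; it is cheap to compute and addresses the typo case directly.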
Layer 1: Connectivity Check (connect, no data)
Triggered by: --dry-run=connect
- Establishes connections to source and sink systems
- Validates credentials and permissions
- Verifies source tables/topics/paths exist
- Verifies sink write permissions and target existence (or createability)
- Infers source schema
- Checks source schema compatibility with sink schema (field names, type mapping)
Cost: seconds, requires live systems. Catches ~80% of real-world job failures before execution.
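The schema compatibility step of Layer 1 might be sketched as follows, once both schemas have been inferred. Schemas are simplified here to field-name-to-type-name maps, and the widening table is purely illustrative; the `SchemaCompatibility` class and its type names are assumptions, not SeaTunnel's actual type system.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical field-by-field comparison of a source schema against a sink schema. */
final class SchemaCompatibility {
    /** Illustrative widening rules only: source type -> other sink types that can hold it. */
    private static final Map<String, Set<String>> WIDENS_TO = new HashMap<>();
    static {
        WIDENS_TO.put("INT", new HashSet<>(Arrays.asList("BIGINT", "DOUBLE", "STRING")));
        WIDENS_TO.put("BIGINT", new HashSet<>(Arrays.asList("STRING")));
        WIDENS_TO.put("DOUBLE", new HashSet<>(Arrays.asList("STRING")));
    }

    static boolean assignable(String sourceType, String sinkType) {
        if (sourceType.equals(sinkType)) {
            return true;
        }
        Set<String> targets = WIDENS_TO.get(sourceType);
        return targets != null && targets.contains(sinkType);
    }

    /** Returns one message per incompatible or missing field; empty means compatible. */
    static List<String> check(Map<String, String> source, Map<String, String> sink) {
        List<String> problems = new ArrayList<>();
        for (Map.Entry<String, String> field : source.entrySet()) {
            String sinkType = sink.get(field.getKey());
            if (sinkType == null) {
                problems.add("field '" + field.getKey() + "' is missing in sink");
            } else if (!assignable(field.getValue(), sinkType)) {
                problems.add("field '" + field.getKey() + "': source type " + field.getValue()
                        + " is not assignable to sink type " + sinkType);
            }
        }
        return problems;
    }
}
```

Reporting the field name alongside both types matches the proposal's goal of pointing at the exact location of a failure.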
Layer 2: Data Sampling (read N rows, sink to memory only)
Triggered by: --dry-run=sample[=N] (default N=100)
- Reads up to N rows from the source
- Passes data through the full transform chain
- Validates transform logic on real data (NULL handling, type casts, SQL semantics)
- Writes output to memory/Console — never to the real sink
- For transactional sinks (JDBC, Kafka), opens a transaction/producer but always rolls back / does not commit
- Prints a sample of the output rows for human review
Cost: seconds to minutes depending on source. Zero side effects on sink.
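The core loop of Layer 2 can be sketched in a few lines: pull at most N rows, run each through the transform chain, and collect the results in memory. `SampleRunner` is a hypothetical helper; modelling the source as an `Iterator`, transforms as `Function`s, and a `null` result as "row filtered out" are all simplifying assumptions of this example.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

/** Hypothetical sampling loop: bounded read, full transform chain, in-memory output. */
final class SampleRunner {
    /**
     * Pulls at most maxRows rows from the source, applies each transform in order,
     * and returns the surviving rows as an in-memory list. No sink is involved.
     */
    static <T> List<T> sample(Iterator<T> source, List<Function<T, T>> transforms, int maxRows) {
        List<T> out = new ArrayList<>();
        int read = 0;
        while (read < maxRows && source.hasNext()) {
            T row = source.next();
            read++;
            for (Function<T, T> transform : transforms) {
                row = transform.apply(row);
                if (row == null) {
                    break; // assumption: a transform returns null to drop the row
                }
            }
            if (row != null) {
                out.add(row);
            }
        }
        return out;
    }
}
```

Because the loop checks the row budget before calling `hasNext()`, the source is never asked for more than N rows, which keeps the cost bounded even on unbounded sources.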
Layer 3: Shadow Execution (full read, sink discarded)
Triggered by: --dry-run=shadow
- Full source read + full transform execution
- Sink replaced with a no-op implementation — no data written to target system
- Generates complete statistics: row count, null distribution, throughput, checkpoint behavior
- Useful for capacity planning and validating split enumeration logic at scale
Cost: same as a real job run. Only the final write is skipped.
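A no-op sink that still gathers the statistics listed above might look like this sketch. `DiscardingSink` is a hypothetical stand-in, rows are simplified to `Map<String, Object>`, and checkpoint behavior is left out; a real implementation would plug into the engine's sink writer interface instead.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical stand-in for a real sink writer: counts rows and nulls, writes nothing. */
final class DiscardingSink {
    private long rowCount;
    private final Map<String, Long> nullCounts = new TreeMap<>();
    private final long startNanos = System.nanoTime();

    /** Records statistics for the row and then drops it. */
    void write(Map<String, Object> row) {
        rowCount++;
        for (Map.Entry<String, Object> field : row.entrySet()) {
            if (field.getValue() == null) {
                nullCounts.merge(field.getKey(), 1L, Long::sum);
            }
        }
    }

    long rowCount() {
        return rowCount;
    }

    /** Fraction of rows whose given field was null; 0.0 when nothing was written. */
    double nullRatio(String field) {
        if (rowCount == 0) {
            return 0.0;
        }
        return nullCounts.getOrDefault(field, 0L) / (double) rowCount;
    }

    /** Average throughput since the sink was created. */
    double rowsPerSecond() {
        double seconds = (System.nanoTime() - startNanos) / 1e9;
        return seconds > 0 ? rowCount / seconds : 0.0;
    }
}
```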
Design Principles
- Fast-fail per layer: on any error within a layer, stop immediately and report the exact location (connector index, option name, field name).
- Strict no-side-effect guarantee for Layers 0–2: verified at the framework level, not relying on connector authors to remember to skip writes.
- CI/CD-friendly exit codes: exit 0 on pass, non-zero with structured output on failure, so --dry-run=connect can be used as a pre-deploy gate in pipelines.
- Layer independence: --dry-run=static does not require live systems; --dry-run=connect does not read data. Users can choose the trade-off between speed and coverage.
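The fast-fail and exit-code principles could be combined in a small runner like the sketch below. `Check` and `FastFailRunner` are hypothetical names, the `key=value` report line is only one possible structured format, and the failure code `1` is an assumption, since the proposal only requires a non-zero value.

```java
import java.io.PrintStream;
import java.util.List;
import java.util.Optional;
import java.util.function.Supplier;

/** Hypothetical unit of validation: an empty result means the check passed. */
final class Check {
    final String name;
    final Supplier<Optional<String>> body;

    Check(String name, Supplier<Optional<String>> body) {
        this.name = name;
        this.body = body;
    }
}

final class FastFailRunner {
    static final int EXIT_OK = 0;
    static final int EXIT_FAILED = 1; // assumption: the proposal only says "non-zero"

    /** Runs checks in order, stops at the first failure, and returns the process exit code. */
    static int run(List<Check> checks, PrintStream out) {
        for (Check check : checks) {
            Optional<String> error = check.body.get();
            if (error.isPresent()) {
                out.println("status=FAIL check=" + check.name
                        + " message=\"" + error.get() + "\"");
                return EXIT_FAILED; // later checks are never executed
            }
        }
        out.println("status=PASS checks=" + checks.size());
        return EXIT_OK;
    }
}
```

A command entry point would pass the returned value to `System.exit`, which is what lets a pipeline treat the dry run as a gate.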
Current State vs Target
| Layer | Target Capability | Current State |
|---|---|---|
| Layer 0 | Full option validation, DAG check, SQL parse | Not implemented (--check is a stub) |
| Layer 1 | Connectivity + schema inference + compatibility | Not implemented |
| Layer 2 | Data sampling through full transform chain | Not implemented |
| Layer 3 | Shadow full execution | Not implemented |
Suggested Implementation Priority
Layer 0 + Layer 1 should be addressed first. They require the least infrastructure investment and deliver the highest value-to-cost ratio. Most user-reported "job failed immediately" issues fall into these two layers.
Affected Modules
- seatunnel-core/seatunnel-core-starter: AbstractCommandArgs, SeaTunnelConfValidateCommand
- seatunnel-core/seatunnel-starter: ClientCommandArgs, ClientExecuteCommand
- seatunnel-api: option validation framework
- seatunnel-engine: execution mode extension
- All connectors (Layer 1 requires a validateConnection() SPI method)
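One possible shape for the validateConnection() SPI method is sketched below. Only the method name comes from this proposal; the `ConnectionValidatable` interface, the `ConnectionCheckResult` type, and the `TcpEndpointConnector` example are hypothetical. The default method is an assumption meant to keep existing connectors compiling until they opt in.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Hypothetical result of a connectivity probe. */
final class ConnectionCheckResult {
    enum Status { OK, FAILED, SKIPPED }

    final Status status;
    final String detail;

    private ConnectionCheckResult(Status status, String detail) {
        this.status = status;
        this.detail = detail;
    }

    static ConnectionCheckResult ok() {
        return new ConnectionCheckResult(Status.OK, "");
    }

    static ConnectionCheckResult failed(String detail) {
        return new ConnectionCheckResult(Status.FAILED, detail);
    }

    static ConnectionCheckResult skipped(String detail) {
        return new ConnectionCheckResult(Status.SKIPPED, detail);
    }
}

/** Hypothetical SPI; connectors that do not override it are reported as skipped. */
interface ConnectionValidatable {
    default ConnectionCheckResult validateConnection() {
        return ConnectionCheckResult.skipped("validateConnection() not implemented");
    }
}

/** Illustrative connector that only probes TCP reachability of its endpoint. */
final class TcpEndpointConnector implements ConnectionValidatable {
    private final String host;
    private final int port;
    private final int timeoutMillis;

    TcpEndpointConnector(String host, int port, int timeoutMillis) {
        this.host = host;
        this.port = port;
        this.timeoutMillis = timeoutMillis;
    }

    @Override
    public ConnectionCheckResult validateConnection() {
        // Opens and immediately closes a socket; no application data is sent.
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return ConnectionCheckResult.ok();
        } catch (IOException e) {
            return ConnectionCheckResult.failed(
                    host + ":" + port + " unreachable: " + e.getMessage());
        }
    }
}
```

Distinguishing SKIPPED from OK lets the dry-run report show which connectors were not actually verified, rather than implying a pass.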