Description
Background
IBM Informix is a relational database widely used in financial institutions, retail chains, and legacy enterprise systems, especially in banking core systems and point-of-sale (POS) applications.
SeaTunnel currently supports Informix via the JDBC connector (dialect-based). However, JDBC-based reads are snapshot-only: they cannot capture ongoing changes (CDC), and they cannot leverage Informix-specific capabilities (for example, fragmentation-aware reads) for better performance.
Motivation
- Legacy system integration: Informix remains mission-critical in many industries.
- Multi-table CDC: Sync multiple tables in one job to modern platforms.
- Performance: JDBC cannot utilize Informix fragmentation/parallelism effectively.
- Near real-time: CDC is required for timely analytics and replication.
Current Status vs. Proposed Enhancement
| Feature | Current (JDBC Connector) | Proposed (Informix-CDC Source) |
|---|---|---|
| Read | Snapshot | Snapshot + CDC |
| CDC | No | Yes (API / log / trigger based) |
| Multi-Table | Limited (depends on connector) | Yes (CDC multi-table patterns) |
| Optimizations | Generic | Informix-aware (when available) |
| Data Types | Basic (limited) | SERIAL8, LVARCHAR, Smart LOBs |
Proposed Solution
Implement a dedicated Informix-CDC Source connector.
Why not table_list?
For CDC connectors in SeaTunnel, multi-table selection and per-table overrides are typically done via:
- table-names / table-pattern (choose tables)
- table-names-config (per-table overrides such as custom PK and snapshot split column)
This is already established in MySQL-CDC and the shared CDC base config model. Using table_list here would introduce a third multi-table style and create confusion around parameter naming (for example primary_keys vs primaryKeys).
If we need a batch-style multi-table snapshot/incremental reader for Informix, it should continue to align with JDBC Source and use table_list there (separate from this CDC connector).
Goals
- Align with existing CDC connector configuration patterns (table-names, table-pattern, table-names-config).
- Support snapshot + CDC with clear offset/checkpoint semantics.
- Improve read performance with Informix-specific optimizations when possible.
Non-goals (for the initial version)
- Cover every Informix edition-specific CDC feature in one release.
- Replace JDBC Source for batch reads.
Core Features
- Multi-Table Support (CDC-style)
  - Use table-names for explicit table lists.
  - Use table-pattern for regex-based table selection.
  - Use table-names-config for per-table overrides (for example primaryKeys, snapshotSplitColumn).
- Change Data Capture (CDC)
  - Option A: CDC API (Enterprise Edition), preferred for low latency.
  - Option B: Log-based CDC (logical logs), for Standard Edition if feasible.
  - Option C: Trigger-based CDC as a fallback.
- Type Mapping
  - Support Informix-specific types (for example, SERIAL8, LVARCHAR, Smart LOBs) beyond basic JDBC mappings.
- Performance Tuning
  - Fragmentation-aware and parallel reads where applicable.
  - Standard connector tuning parameters such as fetch_size.
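As a rough illustration of the type-mapping item, the sketch below maps a few Informix column types to SeaTunnel-style type names. The target names (BIGINT, STRING, BYTES) and the choice to map CLOB to a string and BLOB to bytes are assumptions made for illustration, not a decided mapping; `map_informix_type` is a hypothetical helper.

```python
# Hypothetical mapping table; the target types are assumptions, not a final design.
INFORMIX_TYPE_MAP = {
    "SERIAL8": "BIGINT",   # 8-byte auto-increment integer
    "LVARCHAR": "STRING",  # variable-length character data
    "CLOB": "STRING",      # smart large object, character data
    "BLOB": "BYTES",       # smart large object, binary data
}


def map_informix_type(column_type: str) -> str:
    """Map an Informix type name such as 'LVARCHAR(2048)' to a target type name."""
    # Drop any length/precision suffix and normalize case before the lookup.
    base = column_type.split("(", 1)[0].strip().upper()
    try:
        return INFORMIX_TYPE_MAP[base]
    except KeyError:
        raise ValueError(f"unsupported Informix type: {column_type}") from None
```

Failing loudly on unknown types, rather than silently falling back to a string, is one possible choice here; it makes gaps in the mapping visible during testing.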
Configuration Examples
CDC (Multi-Table, table-names + table-names-config)
env {
  parallelism = 4
  checkpoint.interval = 5000
}
source {
  Informix-CDC {
    # Connection
    url = "jdbc:informix-sqli://informix-server.example.com:9088/stores_demo:INFORMIXSERVER=ol_informix"
    username = "informix"
    password = "******"
    # Multi-table selection
    table-names = ["stores_demo.customer", "stores_demo.orders", "stores_demo.items"]
    # Per-table overrides (custom PK / snapshot split column)
    table-names-config = [
      {
        table = "stores_demo.customer"
        primaryKeys = ["customer_num"]
        snapshotSplitColumn = "customer_num"
      },
      {
        table = "stores_demo.orders"
        primaryKeys = ["order_num"]
        snapshotSplitColumn = "order_num"
      }
    ]
    # Startup mode (example naming; align with existing CDC connectors)
    startup.mode = "initial"
    fetch_size = 1000
  }
}
sink {
  Console {}
}
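To make the relationship between table-names and table-names-config concrete, here is a minimal sketch of how per-table overrides could be resolved. `TableOverride` and `resolve_overrides` are hypothetical names, and rejecting overrides for unselected tables is an assumption about desirable behavior, not something the existing CDC base is known to do.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class TableOverride:
    """Hypothetical in-memory form of one table-names-config entry."""
    table: str
    primary_keys: Tuple[str, ...] = ()
    snapshot_split_column: Optional[str] = None


def resolve_overrides(
    table_names: List[str], table_names_config: List[dict]
) -> Dict[str, Optional[TableOverride]]:
    """Pair every selected table with its override, or None if it has none."""
    by_table = {}
    for entry in table_names_config:
        by_table[entry["table"]] = TableOverride(
            table=entry["table"],
            primary_keys=tuple(entry.get("primaryKeys", [])),
            snapshot_split_column=entry.get("snapshotSplitColumn"),
        )

    # Assumption: an override for a table that was never selected is a typo.
    unknown = sorted(set(by_table) - set(table_names))
    if unknown:
        raise ValueError(f"table-names-config references unselected tables: {unknown}")

    return {name: by_table.get(name) for name in table_names}
```

With the configuration above, stores_demo.customer and stores_demo.orders would resolve to their overrides, while stores_demo.items would resolve to None and fall back to whatever defaults the connector discovers.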
CDC (Regex-based multi-table via table-pattern)
source {
  Informix-CDC {
    # Connection omitted (same as above)
    table-pattern = "stores_demo\\..*"
    startup.mode = "initial"
  }
}
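The quoted config string "stores_demo\\..*" unescapes to the regex `stores_demo\..*`, where `\.` matches the literal dot between database and table name. The sketch below shows why that escape matters. It assumes the pattern is applied to `database.table` identifiers with full-match semantics; `select_tables` and the discovered table names are hypothetical.

```python
import re
from typing import List


def select_tables(pattern: str, discovered: List[str]) -> List[str]:
    """Return the discovered tables whose full identifier matches the pattern."""
    regex = re.compile(pattern)
    return [name for name in discovered if regex.fullmatch(name)]


# Hypothetical discovery result, for illustration only.
discovered = [
    "stores_demo.customer",
    "stores_demo.orders",
    "stores_demo_archive.orders",
    "sysmaster.systables",
]

# Escaped dot: only tables in the stores_demo database are selected.
escaped = select_tables(r"stores_demo\..*", discovered)

# Unescaped dot: '.' matches any character, so stores_demo_archive slips in too.
unescaped = select_tables(r"stores_demo..*", discovered)
```

Here `escaped` contains the two stores_demo tables, while `unescaped` also picks up stores_demo_archive.orders.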
Technical Considerations
- Configuration alignment: Follow existing CDC connector patterns and the CDC base config model.
- Table discovery: Use a stable table identifier (for example TableId) to match table-names-config entries reliably.
- Snapshot split strategy: Use snapshotSplitColumn and align split semantics with CDC base.
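For the snapshot split strategy, one simple way to picture the idea is to cut the numeric range of the split column into fixed-size chunks that parallel readers can scan independently. The sketch below is a simplification under that assumption: `split_key_range` is a hypothetical helper, and the actual CDC base splitting logic may differ (for example, to handle unevenly distributed keys).

```python
from typing import List, Tuple


def split_key_range(min_key: int, max_key: int, chunk_size: int) -> List[Tuple[int, int]]:
    """Return half-open [start, end) ranges that cover min_key..max_key inclusive."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    splits = []
    start = min_key
    while start <= max_key:
        # Cap the last chunk so it ends just past max_key.
        end = min(start + chunk_size, max_key + 1)
        splits.append((start, end))
        start = end
    return splits
```

Each range would then translate into a bounded snapshot query along the lines of `WHERE customer_num >= start AND customer_num < end`; for example, keys 101 through 128 with a chunk size of 10 yield the ranges (101, 111), (111, 121), and (121, 129).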