Verdict Summary
This section provides a high-level summary of whether multiple Debezium connectors can concurrently read from the same Oracle database using XStream or LogMiner. The verdict depends heavily on whether the connectors are targeting the same or different schemas. The table below outlines the support status, rationale, and key risks for each scenario.
| Mode | Allowed? | Why | Required Setup | Risks |
|---|---|---|---|---|
| XStream (Same Schema) | No | Oracle's XStream architecture implies a single client per outbound server. Multiple clients would conflict. | N/A (Unsupported pattern) | Connection errors, missed data, unpredictable behavior. |
| XStream (Different Schemas) | Yes | Supported scale-out pattern where each connector has its own dedicated outbound server. | One outbound server per Debezium connector instance. Each must be configured for a different schema/table set. | Increased database resource overhead. GoldenGate licensing costs per outbound server. |
| LogMiner (Same Schema) | It Depends | Oracle allows multiple LogMiner sessions, but without filtering, it will lead to duplicate change events. | Connectors must have disjoint (non-overlapping) table.include.list configurations to partition the workload. | High risk of data duplication if table filters overlap or are misconfigured. |
| LogMiner (Different Schemas) | Yes | This is the standard and recommended scale-out pattern for LogMiner. | Run one Debezium connector per schema. Ensure supplemental logging is enabled for all captured tables. | Increased I/O and CPU load on the database server from multiple concurrent LogMiner sessions. |
XStream Deep Dive
Oracle XStream provides an API for a client application to receive changes from an Oracle database. The architecture is designed around a dedicated outbound server that streams Logical Change Records (LCRs) to an attached client. This section explores the concurrency constraints and supported patterns for using XStream with Debezium.
1. Concurrent Readers (Same Schema)
Attaching more than one Debezium connector instance to the same XStream outbound server is not supported. The outbound server maintains a single stream position for its client. Two clients attempting to read and acknowledge LCRs from the same stream would race against each other, and one client's acknowledged progress would cause the other to miss data.
Verdict: Unsupported Configuration
Unsupported Architecture:
"An outbound server in an XStream Out configuration streams Oracle database changes to a client application. The client application attaches to the outbound server..." - Oracle Streams Documentation (implies a single client-server relationship)
2. Multiple Schemas (Scale-Out Pattern)
The correct and supported way to capture data from multiple schemas (or shard a single large schema) using XStream is to deploy multiple, independent Debezium connectors. Each connector must be configured with its own dedicated XStream outbound server. This ensures isolation and prevents any conflicts.
Verdict: Supported Configuration
Configuration & Licensing Notes:
- Each Debezium instance requires a unique database.out.server.name.
- Each outbound server must be created and configured separately within the Oracle database.
- Using Oracle XStream requires a GoldenGate license. Each outbound server may have licensing implications.
- This pattern increases resource consumption (CPU, memory) on the database server.
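The one-connector-per-outbound-server rule can be checked before anything is deployed. The following is a minimal sketch, not a definitive implementation: the helper name `build_xstream_configs`, the connector names, and the outbound server names `dbzxout_a` and `dbzxout_b` are hypothetical, and the outbound servers are assumed to already exist in Oracle. Property names other than database.out.server.name should be verified against the Debezium version in use.

```python
from typing import Dict, List


def build_xstream_configs(schema_to_outbound: Dict[str, str]) -> List[Dict[str, str]]:
    """Build one connector config per schema, each bound to its own outbound server.

    Raises ValueError if two schemas are mapped to the same outbound server,
    because a single outbound server serves only one client.
    """
    servers = list(schema_to_outbound.values())
    if len(set(servers)) != len(servers):
        raise ValueError("each connector needs its own XStream outbound server")

    configs = []
    for schema, server in schema_to_outbound.items():
        configs.append({
            # Hypothetical naming scheme for the connector and its topic prefix.
            "name": f"oracle-xstream-{schema.lower()}",
            "topic.prefix": f"oracle_{schema.lower()}",
            "connector.class": "io.debezium.connector.oracle.OracleConnector",
            "database.connection.adapter": "xstream",
            "database.out.server.name": server,
            "schema.include.list": schema,
            "tasks.max": "1",
        })
    return configs


# Hypothetical mapping: two schemas, two dedicated outbound servers.
configs = build_xstream_configs({"SCHEMA_A": "dbzxout_a", "SCHEMA_B": "dbzxout_b"})
for config in configs:
    print(config["name"], "->", config["database.out.server.name"])
```

Rejecting a shared outbound server at configuration time is cheaper than diagnosing connection errors or missed data after two connectors have attached to the same stream.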
LogMiner Deep Dive
The LogMiner strategy uses Oracle's built-in LogMiner utility to directly query the redo and archive logs for changes. This approach is more flexible regarding concurrent sessions but introduces a critical risk of data duplication if not configured carefully. This section details the supported patterns and necessary configurations.
1. Concurrent Readers (Same Schema)
Oracle allows multiple, concurrent LogMiner sessions against the same database logs. If two Debezium connectors are configured to capture the same schema without any further filtering, both will read the same changes and produce duplicate events downstream. This pattern is only viable if the workload is explicitly partitioned.
Verdict: Supported with caution (requires partitioning)
Configuration Notes for Partitioning:
To prevent duplicates, each connector must have a mutually exclusive set of tables.
Connector 1 Config:
table.include.list=SCHEMA.TABLE_A,SCHEMA.TABLE_B
Connector 2 Config:
table.include.list=SCHEMA.TABLE_C,SCHEMA.TABLE_D
Risk: Any overlap in the table.include.list will result in duplicate change events.
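Overlap between include lists is easy to introduce by accident, so it can help to check for it before deployment. The sketch below is one possible check under stated assumptions: the helper names `parse_include_list` and `find_overlaps` are hypothetical, entries are treated as literal fully qualified table names compared case-insensitively, and regular expressions in table.include.list are not evaluated.

```python
from itertools import combinations
from typing import Dict, Set, Tuple


def parse_include_list(value: str) -> Set[str]:
    """Split a comma-separated table.include.list into a set of upper-cased names."""
    return {entry.strip().upper() for entry in value.split(",") if entry.strip()}


def find_overlaps(connectors: Dict[str, str]) -> Dict[Tuple[str, str], Set[str]]:
    """Return the tables shared by each pair of connectors (empty dict if disjoint)."""
    parsed = {name: parse_include_list(value) for name, value in connectors.items()}
    overlaps = {}
    for (name_a, tables_a), (name_b, tables_b) in combinations(parsed.items(), 2):
        shared = tables_a & tables_b
        if shared:
            overlaps[(name_a, name_b)] = shared
    return overlaps


# The two connector configurations from the text: disjoint, so no overlap.
disjoint = find_overlaps({
    "connector-1": "SCHEMA.TABLE_A,SCHEMA.TABLE_B",
    "connector-2": "SCHEMA.TABLE_C,SCHEMA.TABLE_D",
})
print(disjoint)
```

Because the check only compares literal names, an include list that uses regular expressions still needs a manual review or a check against the actual table catalog.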
2. Multiple Schemas (Scale-Out Pattern)
Running multiple Debezium connectors where each captures a different schema is a standard, supported, and recommended scale-out pattern for LogMiner. Each connector runs an independent LogMiner session, isolating its work to the specified schema.
Verdict: Recommended Configuration
Supported Architecture: one connector for Schema A and one for Schema B, each running its own LogMiner session.
"If two connectors are configured to capture changes from different tables or schemas within the same database, they can operate concurrently without interference. Each connector would only read the log entries relevant to the tables it is configured to observe." - Debezium Documentation Principles
Performance & Setup Considerations:
- Supplemental logging must be enabled on the source database for all tables being captured by any connector.
- Each active LogMiner session adds CPU and I/O overhead. Monitor database performance closely as you add more connectors.
- Ensure proper sizing of redo logs and archive log retention policies to support the combined read activity.
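The supplemental logging requirement applies to the union of tables across all connectors, which is easy to lose track of once the workload is split. As a rough sketch, the hypothetical helper below generates the statements for a DBA to review. It assumes the statement forms commonly shown for Debezium's Oracle setup (minimal database-level logging plus ALL COLUMNS logging per table), and the table names are placeholders.

```python
from typing import Iterable, List


def supplemental_logging_statements(tables: Iterable[str]) -> List[str]:
    """Return SQL statements enabling supplemental logging for the given tables.

    Duplicate table names (for example, from merging several connectors'
    include lists) are emitted only once.
    """
    statements = ["ALTER DATABASE ADD SUPPLEMENTAL LOG DATA;"]
    for table in sorted(set(tables)):
        statements.append(
            f"ALTER TABLE {table} ADD SUPPLEMENTAL LOG DATA (ALL) COLUMNS;"
        )
    return statements


# Placeholder tables drawn from two connectors' include lists.
statements = supplemental_logging_statements(
    ["SCHEMA_A.ORDERS", "SCHEMA_B.CUSTOMERS", "SCHEMA_A.ORDERS"]
)
for statement in statements:
    print(statement)
```

The sketch only prints the statements; running them against the database is left to whoever holds the required privileges.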
Scaling & High Availability (HA)
Understanding how Debezium scales is crucial for a robust deployment. The Debezium Oracle connector is a single-task connector, which has specific implications for how High Availability (HA) and horizontal scaling are achieved in an environment like Kubernetes.
Connector Parallelism: Single Task Only
The Debezium Oracle Connector does not support internal parallelism. Setting the Kafka Connect tasks.max property to a value greater than 1 will have no effect. The connector will always run as a single task.
"The spec.class names the Debezium...connector and spec.tasksMax must be 1 because that's all this connector ever uses." - Strimzi Blog (Debezium Deployment Guide)
HA vs. Scale-Out
It is vital to distinguish between High Availability and scaling out the workload.
High Availability (HA)
Achieved by running a single instance (replicas: 1) with a restart policy. If it fails, it comes back up and resumes.
Provides: Fault Tolerance
Horizontal Scale-Out
Achieved by running multiple, independent instances, each with a partitioned workload (for example, one instance for Schema A and another for Schema B).
Provides: Increased Throughput
Recommended Scaling Pattern
The only way to horizontally scale the capture process is to deploy multiple, independent Debezium Server (or Kafka Connect) instances. Each instance should be configured to handle a disjoint part of the total workload.
Implementation Strategy:
- Deploy multiple Debezium Server resources in Kubernetes.
- Each deployment should have replicas: 1 for HA.
- Assign a specific schema or a non-overlapping list of tables to each deployment.
  - Use table.include.list or schema.include.list to partition the work.
- This creates multiple parallel streams, increasing overall throughput.
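The implementation strategy above can be sketched as a small planning script that splits schemas across a fixed number of deployments and renders a properties fragment for each one. This is an illustration under assumptions: the helper names, the round-robin assignment, and the `oracle_part_N` topic prefixes are hypothetical, and the `debezium.source.` prefix assumes Debezium Server rather than Kafka Connect. Connection settings and sink configuration are omitted.

```python
from typing import List


def partition_schemas(schemas: List[str], instances: int) -> List[List[str]]:
    """Assign each schema to exactly one instance using round-robin."""
    if instances < 1:
        raise ValueError("at least one instance is required")
    groups: List[List[str]] = [[] for _ in range(instances)]
    for index, schema in enumerate(sorted(set(schemas))):
        groups[index % instances].append(schema)
    # Drop empty groups when there are fewer schemas than instances.
    return [group for group in groups if group]


def render_properties(index: int, schemas: List[str]) -> str:
    """Render a partial Debezium Server properties fragment for one instance."""
    lines = [
        "debezium.source.connector.class=io.debezium.connector.oracle.OracleConnector",
        f"debezium.source.topic.prefix=oracle_part_{index}",
        f"debezium.source.schema.include.list={','.join(schemas)}",
    ]
    return "\n".join(lines)


# Placeholder schema names split across two deployments (each with replicas: 1).
plan = partition_schemas(["SCHEMA_A", "SCHEMA_B", "SCHEMA_C"], instances=2)
for index, group in enumerate(plan):
    print(f"# deployment {index}")
    print(render_properties(index, group))
```

Round-robin keeps the groups disjoint by construction, but it ignores how busy each schema is. Balancing by change volume would need measurements that are not available here.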