Optimizing Debezium Server for Oracle
A research report on tuning Debezium Server (standalone on Kubernetes) for streaming changes from Oracle to HTTP sinks, focusing on the Online Log Mining strategy.
Strategy: Online Log Mining
Sink: HTTP / Custom Batch
debezium.source.max.batch.size
Determines memory pressure & throughput.
Why this matters
Debezium Server is purely a change data capture (CDC) runtime. Unlike a connector deployed on Kafka Connect, it runs as a standalone Java application and delivers events directly to the configured sink. With an HTTP sink, network round-trips become the bottleneck. Tuning the read side (Oracle log mining and the internal queue) and the write side (the HTTP batch) is essential to prevent backpressure or OutOfMemoryError crashes in Kubernetes.
Configuration Tuning Studio
⚙ Generated `application.properties`
High Throughput
# Default Placeholder
debezium.source.connector.class=io.debezium.connector.oracle.OracleConnector
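As an illustration of what a high-throughput profile might contain, the fragment below is a sketch rather than generated output. The property names are standard Debezium settings; every value is an illustrative assumption and should be validated against your own workload and Debezium version.

```properties
# Sketch of a high-throughput profile. All values are assumptions, not measured recommendations.
debezium.source.connector.class=io.debezium.connector.oracle.OracleConnector
debezium.source.log.mining.strategy=online_catalog

# Larger batches handed to the sink per iteration.
debezium.source.max.batch.size=4096
# The queue must be larger than max.batch.size; a bigger queue absorbs bursts but uses more heap.
debezium.source.max.queue.size=16384
# Shorter wait for new events before a batch is processed (milliseconds).
debezium.source.poll.interval.ms=100
```

Raising `max.queue.size` trades memory for burst tolerance, so a profile like this assumes the pod's heap limit has been sized accordingly.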
Technical Deep Dive
Source: Oracle Log Mining
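The online_catalog strategy tells Debezium to resolve table and column names against the database's current (online) data dictionary instead of writing the dictionary to the redo logs. This generally makes startup and mining faster, but it couples the connector tightly to the current database state: LogMiner cannot track DDL changes in this mode, so it suits captured tables whose schema changes rarely or never.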
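A minimal sketch of selecting this strategy in `application.properties`. The `debezium.source.` prefix is how Debezium Server passes options through to the connector.

```properties
# Resolve names from the online data dictionary; no dictionary is written to the redo logs.
# Schema changes to captured tables need to be coordinated with the connector in this mode.
debezium.source.log.mining.strategy=online_catalog
```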
Polling vs. Streaming
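Although Oracle LogMiner reads the redo logs, Debezium does not receive a pushed stream of changes; it repeatedly queries the LogMiner view (V$LOGMNR_CONTENTS) for a range of SCNs.
The poll.interval.ms setting does not control how often the database is read. It sets how long the connector waits for new change events to appear in its internal queue before it processes a batch.
How much work each mining pass does is governed by the log.mining.batch.size.* parameters, which bound the SCN interval mined per pass.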
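The relationship between these settings can be sketched as follows. The property names are the Oracle connector's own; the values are illustrative starting points, not recommendations, and defaults may differ between Debezium versions.

```properties
# How long the connector waits for new change events before processing a batch (milliseconds).
debezium.source.poll.interval.ms=500

# SCN interval mined per LogMiner pass. The connector starts at "default"
# and adjusts between "min" and "max" depending on how far behind it is.
debezium.source.log.mining.batch.size.min=1000
debezium.source.log.mining.batch.size.default=20000
debezium.source.log.mining.batch.size.max=100000
```

A larger maximum lets the connector catch up faster after a backlog, at the cost of heavier LogMiner queries on the database.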
Batch Size Impact
Custom HTTP Batch Sink
A standard HTTP sink sends one POST request per event, so every event pays a full network round-trip, which severely limits throughput. Your custom batch sink (up to 100 events per request) is critical for performance.
- ✓ High Throughput: Wait until the batch is full (100 events) before sending.
- ✓ Low Latency: Reduce the batch size (10-20 events) or add a time-based flush (e.g., send the batch after 50 ms even if it is not full).
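The trade-off in the two bullets above can be made concrete with some rough arithmetic. This is a back-of-the-envelope sketch, not a measurement: the round-trip time $R = 20$ ms and the arrival rate $\lambda = 200$ events/s are assumed values, requests are assumed to be sent one at a time, and the effect of payload size on $R$ is ignored.

```latex
\begin{align*}
\text{Throughput with batch size } B:\quad
  & \frac{B}{R} \\
B = 1:\quad
  & \frac{1}{0.02\,\text{s}} = 50 \text{ events/s} \\
B = 100:\quad
  & \frac{100}{0.02\,\text{s}} = 5000 \text{ events/s} \\[6pt]
\text{Wait before a batch is sent (time-flush } T\text{):}\quad
  & \min\!\left(\frac{B}{\lambda},\; T\right) \\
B = 100,\ \lambda = 200\text{ events/s},\ T = 50\text{ ms}:\quad
  & \min(0.5\,\text{s},\ 0.05\,\text{s}) = 50\text{ ms},
    \ \text{with about } \lambda T = 10 \text{ events per batch}
\end{align*}
```

Under these assumptions, a full batch of 100 would add up to half a second of waiting at low traffic, while the 50 ms flush caps the added latency and still sends roughly 10 events per request.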
Resource Constraints
Low Memory: The internal queue (max.queue.size) is the largest memory consumer. If the HTTP sink is slow, this queue fills with buffered change events. Reduce it to avoid OutOfMemoryError crashes, keeping it larger than max.batch.size.
Low CPU: Parsing Oracle redo logs is CPU-intensive. Increasing poll.interval.ms lets the connector idle longer between batches, which lowers CPU usage but increases latency.
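A low-memory variant might look like the sketch below. The values are illustrative assumptions. `max.queue.size.in.bytes` is an additional standard Debezium setting not mentioned above; it caps the queue by volume rather than by event count, which helps when event sizes vary widely.

```properties
# Smaller batches and a shorter queue reduce the number of buffered events.
debezium.source.max.batch.size=512
# Must remain larger than max.batch.size.
debezium.source.max.queue.size=2048
# Optional byte cap on the queue: 64 MiB (0 disables the cap).
debezium.source.max.queue.size.in.bytes=67108864
```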
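For a CPU-constrained pod, the idea can be sketched as below. The values are illustrative assumptions. The `log.mining.sleep.time.*` settings are not mentioned in the text above; they are the Oracle connector's controls for how long it sleeps between mining passes and are included here as a related option to evaluate.

```properties
# Wait longer for new events before processing a batch (milliseconds).
debezium.source.poll.interval.ms=2000

# Optional: lengthen the pause between LogMiner passes as well.
debezium.source.log.mining.sleep.time.default.ms=2000
debezium.source.log.mining.sleep.time.max.ms=5000
```

Both changes raise end-to-end latency, so they fit workloads where a delay of a few seconds is acceptable.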