Modern data platforms are moving away from slow batch ETL pipelines. Organizations increasingly need systems that capture data changes as they occur, stream them across distributed systems, and power real-time analytics, applications, and machine learning workflows.
Change Data Capture in Practice is a hands-on guide to building real-time data pipelines using industry-proven technologies such as Debezium, Apache Kafka, Kafka Connect, PostgreSQL, and MySQL, along with lakehouse and data warehouse platforms including Apache Iceberg, Delta Lake, and Snowflake.
Rather than focusing on theory, this book takes a practical engineering approach. You will learn how to capture database changes directly from transaction logs, stream them through scalable event pipelines, process them with modern stream-processing frameworks, and deliver continuously updated datasets into analytical systems.
Inside the book, you will learn how to:
- Understand modern CDC architectures and real-time data pipelines
- Capture database changes using Debezium connectors for PostgreSQL and MySQL
- Build scalable streaming pipelines with Kafka and Kafka Connect
- Transform and process change events using modern stream-processing frameworks
- Deliver CDC data into lakehouse platforms such as Iceberg and Delta Lake
- Integrate CDC pipelines with Snowflake for modern data warehouse workflows
- Build real-time analytical queries and dashboards powered by CDC data
- Operate, monitor, and scale CDC pipelines in production environments
Each chapter includes practical explanations and hands-on labs designed to help you implement real CDC pipelines step by step. The book concludes with a full end-to-end capstone project, where you will build a complete real-time data platform, from source databases to analytics.
This book is ideal for:
- Data engineers building real-time data platforms
- Platform engineers designing streaming architectures
- Analytics engineers working with lakehouse systems
- Software engineers building event-driven applications
- Architects designing modern data infrastructure
By the end of this book, you will have the knowledge and practical skills needed to design, deploy, and operate modern Change Data Capture systems capable of powering real-time analytics and event-driven data platforms.
If you want to move beyond traditional batch ETL and build modern streaming data pipelines that operate continuously, this book will give you the practical blueprint to do it.