Showing posts from August, 2011

Change Data Capture

Change data capture (CDC) is the process of capturing changes made at the data source and applying them throughout the enterprise. CDC minimizes the resources required for ETL (extract, transform, load) processes because it deals only with data changes. The goal of CDC is to ensure data synchronicity. There are four methods for handling CDC:

1. Timestamp-based CDC
   A timestamp column in the source table records the date and time of the last change, whether it is a new entry or an update to an existing row.
   * Simple method
   * Cannot identify deleted records

2. Trigger-based CDC
   Database triggers are added to the source tables so that all changes (inserts, updates, deletes) are replicated in a second set of tables used specifically for the CDC process. Only the changed records captured in the CDC tables are used to update the data warehouse during the ETL process.
   * Complex
   * Can identi