SCD Type 2 in MySQL/ coforge

 To implement SCD Type 2 in MySQL, you typically use SQL scripts to:

  1. Detect changes between source and target

  2. Expire old rows in the dimension table

  3. Insert new rows with updated values

Assume two tables:

  • customer_src (source/staging table)

  • customer_dim (dimension table with SCD Type 2 tracking)

CREATE TABLE customer_dim (
    customer_sk INT AUTO_INCREMENT PRIMARY KEY,
    customer_id INT,
    name VARCHAR(100),
    address VARCHAR(255),
    start_date DATE,
    end_date DATE,
    is_current CHAR(1) -- 'Y' or 'N'
);

Step 1: Identify Changes (New or Modified Data)

-- Find records in source that are new or have changed
CREATE TEMPORARY TABLE scd2_changes AS
SELECT s.*
FROM customer_src s
LEFT JOIN customer_dim d
  ON s.customer_id = d.customer_id AND d.is_current = 'Y'
WHERE d.customer_id IS NULL
   OR s.name <> d.name
   OR s.address <> d.address;

Step 2: Expire Existing Records

-- Update old records as expired (for changed customers)
UPDATE customer_dim
SET end_date = CURDATE(),
    is_current = 'N'
WHERE customer_id IN (SELECT customer_id FROM scd2_changes)
  AND is_current = 'Y';

Step 3: Insert New Versions

-- Insert new records for changed or new customers
INSERT INTO customer_dim (
    customer_id, name, address, start_date, end_date, is_current
)
SELECT
    customer_id, name, address, CURDATE(), NULL, 'Y'
FROM scd2_changes;

Optional: Clean Up

DROP TEMPORARY TABLE IF EXISTS scd2_changes;

Output Example

Before:


customer_id | name | address | start_date | end_date | is_current -------------------------------------------------------------- 101 | John | NY | 2021-01-01 | NULL | Y

After John moves from NY to LA (in source):


customer_id | name | address | start_date | end_date | is_current --------------------------------------------------------------------- 101 | John | NY | 2021-01-01 | 2025-06-28 | N 101 | John | LA | 2025-06-28 | NULL | Y


Comments

Popular posts from this blog

sql question set 1

3 days running average / capco