DW Boot Camp
DW Overview
βA
reckenridge cademy Breckenridge Academy
Inmon vs. Kimball: Let’s get to the bottom of this! TDWI LA Chapter Los Angeles, CA March 9, 2010 Speaker: Bob Conway
[email protected] 303-885-4811 ©Copyright 2010, Breckenridge Academy. All rights reserved.
Breckenridge Academy
Data Warehouse Boot Camp
DW Overview, Page 2
Anthony’s Pyramid
Strategic Planning
Decision Support
Management Monitoring & Control
Analysis & Reporting
Business Operations
Transaction Processing
Business Perspective
Systems Perspective
©Copyright 2010, Breckenridge Academy. All rights reserved.
©Copyright 2010, Breckenridge Academy. All rights reserved. 1
DW Boot Camp
DW Overview
βA
reckenridge cademy Breckenridge Academy
Data Warehouse Boot Camp
DW Overview, Page 3
Why Data Warehousing? Transaction Processing
Analytical/Reporting
•Real-time perspective
•Historic perspective
•Detailed data
•Summarized/derived info.
•Update intensive
•Read-only
•Online updates
•Batch updates
•Batch reporting
•Online reporting
•Performance sensitive
•Flexibility priority
•Structured processes
•Ad-hoc reporting
•Stable data structures
•Evolving data structures
•Functional organization
•Cross-functional
•Clerical community
•Mgmt/analyst community
©Copyright 2010, Breckenridge Academy. All rights reserved.
Breckenridge Academy
Data Warehouse Boot Camp
DW Overview, Page 4
OLTP versus Reporting Design Transaction Processing
Analytical Processing
•Highly normalized
•Denormalized
•Minimal indexing
•Liberal indexing
•Transaction logging on
•Transaction logging off
•Record locking on
•Record locking off
•Individual records
•Sets of data
•Calculate derived data
•Store derived data
©Copyright 2010, Breckenridge Academy. All rights reserved.
©Copyright 2010, Breckenridge Academy. All rights reserved. 2
DW Boot Camp
DW Overview
βA
reckenridge cademy Breckenridge Academy
Data Warehouse Boot Camp
DW Overview, Page 5
Traditional System Development Purchasing Application
PARTS SUPPLIERS POs
•Data belongs to an application Inventory •Point-to-point interfaces Management
PARTS ASSYs REQUISITIONS Accounts Payable
INVOICES VENDORS PAYMENTS ©Copyright 2010, Breckenridge Academy. All rights reserved.
Breckenridge Academy
Data Warehouse Boot Camp
DW Overview, Page 6
Typical Legacy ‘Architecture’ • Redundant data ($) • Inconsistent, unreliable data ($$$)
©Copyright 2010, Breckenridge Academy. All rights reserved.
©Copyright 2010, Breckenridge Academy. All rights reserved. 3
DW Boot Camp
DW Overview
βA
reckenridge cademy Breckenridge Academy
Data Warehouse Boot Camp
DW Overview, Page 7
Data Warehouse Concept Data Warehouse
Reporting and Analysis Read Only Snapshots Historical, Summarized Validated, Integrated
Data Transformation Layer
Extract, Translate Load
Operational Layer (source of record)
©Copyright 2010, Breckenridge Academy. All rights reserved.
Breckenridge Academy
Data Warehouse Boot Camp
DW Overview, Page 8
Anthony’s Pyramid •DW complement Tx Proc •DW pathway to integration •DW immediate value
Strategic Planning
DW addresses these
Management Monitoring & Control
Business Operations
Business Perspective
Decision Support
Analysis & Reporting
IE only addresses these
Transaction Processing
Systems Perspective
©Copyright 2010, Breckenridge Academy. All rights reserved.
©Copyright 2010, Breckenridge Academy. All rights reserved. 4
DW Boot Camp
DW Overview
βA
reckenridge cademy Breckenridge Academy
Data Warehouse Boot Camp
DW Overview, Page 9
Inmon Architecture circa 1992
Oper DB •op detail •current •App-oriented •Unintegrated
Atomic DW Dept DW •op grain •parochial •latency •summary •subject oriented •derived data •time variant •enterprise integrated
Individual •PC-based •temporary •ad-hoc •heuristic
©Copyright 2010, Breckenridge Academy. All rights reserved.
Breckenridge Academy
Data Warehouse Boot Camp
DW Overview, Page 10
Relational (3NF) Design COUNTRY cntry_id
REGION region_id
DIVISION division_id
MARKET market_id
YEAR
PRODUCT_CLASS
ACCOUNT
year_id
prodcut_class_id
account_id
QUARTER INVOICE
quarter_id STATE cntry_id state_id
DEPARTMENT TERRITORY territory_id
department_id division_id
region_id
customer_id market_id postal_cd cntry_id
invoice_id
year_id
CUSTOMER
account_id
PRODUCT_CATEGORY product_category_id
MONTH
prodcut_class_id
month_id quarter_id
COUNTY county_id state_id cntry_id
PAY_PERIOD
WEEK
EMPLOYEE
pay_period_id
week_id
employee_id
PRODUCT product_id
territory_id department_id
product_category_id
month_id
month_id DAY day_id month_id week_id pay_period_id
CITY city_id county_id
ORDER order_id customer_id employee_id day_id
POSTAL_ZONE postal_cd cntry_id city_id
INVOICE_ITEM ORDER_ITEM order_id order_line_id product_id
invoice_line_id invoice_id order_id order_line_id day_id
©Copyright 2010, Breckenridge Academy. All rights reserved.
©Copyright 2010, Breckenridge Academy. All rights reserved. 5
DW Boot Camp
DW Overview
βA
reckenridge cademy Breckenridge Academy
Data Warehouse Boot Camp
DW Overview, Page 11
Atomic History Table Design CSTMR_HST cstmr_dwidhst_sqnc_id 101 1 102 1 103 1 101 2
strt_ext_ts Mon night Mon night Tues night Wed night
end_ext_ts** Wed night 12/31/2999 12/31/2999 12/31/2999
hst_crf N Y Y Y
cstmr_nm Bob Joe Mary Robert
101 - Bob 101-Robert Mon Wed 29991231 No gaps or overlaps in time spans for a given DWID value ** Use high date (12/31/2999) instead of NULL for current row
©Copyright 2010, Breckenridge Academy. All rights reserved.
Breckenridge Academy
Data Warehouse Boot Camp
DW Overview, Page 12
Inmon Architecture circa 1996
ODS
Oper DB •op detail •current •App-oriented •Unintegrated
Oper Data Store •op grain •latency •subject oriented •volatile •enterprise integrated
EDW Enterprise DW •op grain •latency •subject oriented •time variant •enterprise integrated
Data Marts Dept DW •parochial •summary •derived data
©Copyright 2010, Breckenridge Academy. All rights reserved.
©Copyright 2010, Breckenridge Academy. All rights reserved. 6
DW Boot Camp
DW Overview
βA
reckenridge cademy Breckenridge Academy
Data Warehouse Boot Camp
DW Overview, Page 13
Kimball Architecture circa 1996 3NF is great for tx processing but is inappropriate for DW Dimensional model (star schemas) Central fact table –aggregate measures Grouped by dimensions - denormalized
EDW- collection of star schemas (by subject area) with shared (conforming) dimensions.
©Copyright 2010, Breckenridge Academy. All rights reserved.
Breckenridge Academy
Data Warehouse Boot Camp
DW Overview, Page 14
Dimensional (Star Schema) Design D_DAY day_id week_id pay_period_id month_id quarter_id year_id
F_SALES D_EMPLOYEE employee_id territory_id region_id department_id division_id
day_id employee_id product_id customer_id sales_units sales_amount ytd_sales
D_PRODUCT product_id prodcut_class_id product_category_id
D_CUSTOMER customer_id market_id postal_cd city_id county_id state_id cntry_id
©Copyright 2010, Breckenridge Academy. All rights reserved.
©Copyright 2010, Breckenridge Academy. All rights reserved. 7
DW Boot Camp
DW Overview
βA
reckenridge cademy Breckenridge Academy
Data Warehouse Boot Camp
DW Overview, Page 15
Proliferation of Data Marts ETL tools Sales
HR
Sales DM
BI Tools
Mfg DM
BI Tools
Finance DM
BI Tools
MRP
Acctg
©Copyright 2010, Breckenridge Academy. All rights reserved.
Breckenridge Academy
Data Warehouse Boot Camp
DW Overview, Page 16
DW Challenges in 2010
Multiple DWs-Redunant/Inconsistent Data Integration/Data Quality Performance/Scalability (ETL and BI) Maintenance/Extensibility Evolving/Expanding Reporting Needs Changing/Expanding Source Systems
©Copyright 2010, Breckenridge Academy. All rights reserved.
©Copyright 2010, Breckenridge Academy. All rights reserved. 8
DW Boot Camp
DW Overview
βA
reckenridge cademy Breckenridge Academy
Data Warehouse Boot Camp
DW Overview, Page 17
Program Management Office BusReq1
Bus Req2
PMO
StrgcProj1 SP2
SP3
SP4
BusReq3 TactProj11 TP12 TP13 TP14 BusReq4 •Reconcile Business Requirements •Scope DW Projects •Justify DW Projects •Prioritize DW Projects ©Copyright 2010, Breckenridge Academy. All rights reserved.
Breckenridge Academy
Data Warehouse Boot Camp
DW Overview, Page 18
Data Warehouse Concept Data Warehouse
Reporting and Analysis Read Only Snapshots Historical, Summarized Validated, Integrated
Data Transformation Layer
Extract, Translate Load
Operational Layer (source of record)
©Copyright 2010, Breckenridge Academy. All rights reserved.
©Copyright 2010, Breckenridge Academy. All rights reserved. 9
DW Boot Camp
DW Overview
βA
reckenridge cademy Breckenridge Academy
Data Warehouse Boot Camp
DW Overview, Page 19
Data Warehouse Concept Order Mgmt SOR1 Extraction Transformation Load (ETL)
Oracle Billing
Data Warehouse SQL Server
SOR2 DB2 •Optimized for Tx processing •Heterogeneous technology •Redundant data •Inconsistent semantics
•Optimized for Analysis Reporting •Integrated data •Detail, Summarized, Historic •Cross-Functional •Enterprise Perspective •Flexibility, Ad-Hoc Access
©Copyright 2010, Breckenridge Academy. All rights reserved.
Breckenridge Academy
Data Warehouse Boot Camp
DW Overview, Page 20
Data Warehouse Architecture Order Mgmt SOR1 Oracle Billing SOR2 DB2
Extraction Filter Conform Datatype Conform Domain Error Trapping Consolidate Generate PKs Renormalize Aggregate Track History Load
Data Warehouse SQL Server
©Copyright 2010, Breckenridge Academy. All rights reserved.
©Copyright 2010, Breckenridge Academy. All rights reserved. 10
DW Boot Camp
DW Overview
βA
reckenridge cademy Breckenridge Academy
Data Warehouse Boot Camp
DW Overview, Page 21
RAPID Architecture Components
As-Was
SOR1
ETL1
ACQ1
ETL2
As-Is
CNF1 ETL3
SOR2
ETL1
… •Heterogeneous •Redundancy •Inconsistency •Difficult Reports
ACQ2
ETL2
•Persistent •Historic •Non-transformed •Comprehensive* *SOR tables *All rows/cols.
Extract Change Capture Audit Columns
HIST ETL4
BASE ETL5
CNF2
•Transient •Subset •Source Layout* •Target Domains
Filter Tables/Rows/Cols Conform Data Types Conform Values Default/Error
DIM
•Persistent •Integrated •Detailed •Normalized •Non-Historic
Renormalize Match/Merge Gen DW PKs Enforce FK/RI
•Dimensional •Historic •Summarized •Derived Data
Denormalize Aggregation Derivation Time Resolved
©Copyright 2010, Breckenridge Academy. All rights reserved.
Breckenridge Academy
Data Warehouse Boot Camp
DW Overview, Page 22
Dimensional Layer Bus. Objects
Cognos
©Copyright 2010, Breckenridge Academy. All rights reserved.
©Copyright 2010, Breckenridge Academy. All rights reserved. 11
DW Boot Camp
DW Overview
βA
reckenridge cademy Breckenridge Academy
Data Warehouse Boot Camp
DW Overview, Page 23
DW Layers - Purpose/Usage As-Was SOR1
ETL1
ACQ1
ETL2
As-Is
CNF1 ETL3
SOR2
ETL1
ACQ2
…
ETL2
HIST ETL4
BASE ETL5
CNF2
…
…
• Drill Down • Reusable BASE-DIM • Ad-Hoc Reporting • Operational Reports
• Drill Back • Error Reporting
• Simplify ACQ-BASE • Reusable CNF tables • Expedite Development • Expedite Prod Schedule
DIM
• Simplify Reports • Performance • Std. Derivations • Leverage BI Tools
©Copyright 2010, Breckenridge Academy. All rights reserved.
Breckenridge Academy
Data Warehouse Boot Camp
DW Overview, Page 24
Error Tracking/Reporting 1st Logic
RULES
ACQ
ETL2
CNF Inline
ETL3
BASE
Inline ERR Log
BI Tool Offline
Data Stewards
Error Reports ©Copyright 2010, Breckenridge Academy. All rights reserved.
©Copyright 2010, Breckenridge Academy. All rights reserved. 12
DW Boot Camp
DW Overview
βA
reckenridge cademy Breckenridge Academy
Data Warehouse Boot Camp
DW Overview, Page 25
DW Layers - Purpose/Usage As-Was SOR1
ETL1
ACQ1
ETL2
As-Is
CNF1 ETL3
SOR2
ETL1
ACQ2
…
ETL2
BASE ETL5
CNF2
…
…
HIST ETL4
• Drill Down • Reusable BASE-DIM • Ad-Hoc Reports • Operational Reports
• Drill Back • Error Reporting • Retrospective ETL Rules • Simplify ACQ-BASE • Reusable CNF tables • Expedite Development • Expedite Prod Schedule
DIM
• Simplify Reports • Performance • Std. Derivations • Leverage BI Tools
©Copyright 2010, Breckenridge Academy. All rights reserved.
Breckenridge Academy
Data Warehouse Boot Camp
DW Overview, Page 26
DW Layers - Purpose/Usage As-Was SOR1
ETL1
ACQ1
ETL2
As-Is
CNF1 ETL3
SOR2
…
ETL1
ACQ2
ETL2
…
• Drill Back • Error Reporting • Retrospective ETL Rules • Retrospective DW Scope • Stable ETL1 • Interim Reporting
BASE ETL5
CNF2
…
HIST ETL4
• Drill Down • Reusable BASE-DIM • Ad-Hoc Reports • Operational Reports
• Simplify ACQ-BASE • Reusable CNF tables • Expedite Development • Expedite Prod Schedule
DIM
• Simplify Reports • Performance • Std. Derivations • Leverage BI Tools
©Copyright 2010, Breckenridge Academy. All rights reserved.
©Copyright 2010, Breckenridge Academy. All rights reserved. 13
DW Boot Camp
DW Overview
βA
reckenridge cademy Breckenridge Academy
Data Warehouse Boot Camp
DW Overview, Page 27
ACQ Layer for Interim Reporting SOR1
ETL1
Bus. Objects
ACQ1
V1 SOR2
ETL1
ACQ2
V2 SOR3
ETL1
EII Layer
Cognos
ACQ3
V3 FF4
ETL1
ACQ4
2ndry
Indexes
View Layer
SAS
©Copyright 2010, Breckenridge Academy. All rights reserved.
Breckenridge Academy
Data Warehouse Boot Camp
DW Overview, Page 28
Interim Reporting in ACQ Layer Rapid deployment of ‘tactical’ reports Offload OLTP reporting Unified tactical/strategic reporting Common extraction/delta process/history
Early insight/experience into the business requirement
©Copyright 2010, Breckenridge Academy. All rights reserved.
©Copyright 2010, Breckenridge Academy. All rights reserved. 14
DW Boot Camp
DW Overview
βA
reckenridge cademy Breckenridge Academy
Data Warehouse Boot Camp
DW Overview, Page 29
EDW Architecture Roadmap S1
A1
S2 S3 S4
A2 A3 A4
S1
A1
St1
S2
A2 A3 A4
St2
S1
A1
St1
S2
A2 A3 A4
St2
S3 S4
S3 S4
V1
EII Project 1
V2
EDW Release 1
V3
Hist
EDW Release 2
Base Dim
Project 2
St4
Hist
EDW Release 3
Base
Project 3
Dim
©Copyright 2010, Breckenridge Academy. All rights reserved.
Breckenridge Academy
Data Warehouse Boot Camp
DW Overview, Page 30
Re-Acquire Legacy DW Legacy ETL
SOR
ACQ
Legacy Reports
Legacy DW
Legacy ETL
HIS
CNF BAS
SOR …
ACQ …
CNF …
DIM
©Copyright 2010, Breckenridge Academy. All rights reserved.
©Copyright 2010, Breckenridge Academy. All rights reserved. 15
DW Boot Camp
DW Overview
βA
reckenridge cademy Breckenridge Academy
Data Warehouse Boot Camp
DW Overview, Page 31
Re-Conform Legacy DW
Legacy ETL
SOR
ACQ
Legacy ETL
Legacy Reports
Legacy DWs
HIS
CNF BAS
SOR …
ACQ …
CNF …
DIM
©Copyright 2010, Breckenridge Academy. All rights reserved.
Breckenridge Academy
Data Warehouse Boot Camp
DW Overview, Page 32
Re-Integrate Legacy DW
Legacy ETL
SOR
ACQ
Legacy DWs
Legacy Reports
HST
CNF
Re-Arch Reports
BAS SOR …
ACQ …
CNF …
DIM
©Copyright 2010, Breckenridge Academy. All rights reserved.
©Copyright 2010, Breckenridge Academy. All rights reserved. 16
DW Boot Camp
DW Overview
βA
reckenridge cademy Breckenridge Academy
Data Warehouse Boot Camp
DW Overview, Page 33
RAPID Architecture Summary
Multi-layer, modular design Supports incremental development Durable for expansion Flexible for changes Leverage reusability Strategic and tactical solutions Ad-hoc and structured reporting Technology neutral ©Copyright 2010, Breckenridge Academy. All rights reserved.
Breckenridge Academy
Data Warehouse Boot Camp
DW Overview, Page 34
Questions
©Copyright 2010, Breckenridge Academy. All rights reserved.
©Copyright 2010, Breckenridge Academy. All rights reserved. 17
DW Boot Camp
DW Overview
βA
reckenridge cademy Breckenridge Academy
Data Warehouse Boot Camp
DW Overview, Page 35
Bill Inmon on Ralph Kimball The Data Warehouse Toolkit, 1996 Forward: “…Kimball’s stark cognizance and revolutionary approaches…have been tested in crucible of reality. DW Toolkit is one of the definitive books of our industry and mandatory reading for IT professionals…to successfully and profitably conduct business. W.H. Inmon Prism Solutions Aug. 27, 1995
©Copyright 2010, Breckenridge Academy. All rights reserved.
Breckenridge Academy
Data Warehouse Boot Camp
DW Overview, Page 36
Retrospective Scope Expansion Release 1
Release 2
Time
Release 3
Release 4
Incremental data elements ©Copyright 2010, Breckenridge Academy. All rights reserved.
©Copyright 2010, Breckenridge Academy. All rights reserved. 18
DW Boot Camp
DW Overview
βA
reckenridge cademy Breckenridge Academy
Data Warehouse Boot Camp
DW Overview, Page 37
ACQ Layer Storage Management XYZ 200 tables •90 SOR •110 Non-SOR
ETL1
ACQ_XYZ 90 SOR •15 In Scope •75 Out Scope
Monthly Archive (Out Scope, CRF=N) (In Scope >120d)
ETL2
CNF_XYZ 15 In Scope
Restore (On-Demand) (Specify tables and dates) Off Line Storage
©Copyright 2010, Breckenridge Academy. All rights reserved.
Breckenridge Academy
Data Warehouse Boot Camp
DW Overview, Page 38
Data Warehouse Critical Success Factors Incremental/ Opportunistic
Technology/ Tools
DW C.S.F.s
Business/IT Partnership
Architecture/ Methodology
Governance Resources/ Organization
©Copyright 2010, Breckenridge Academy. All rights reserved.
©Copyright 2010, Breckenridge Academy. All rights reserved. 19