DATA MODELING FOR DATA WAREHOUSING AND BIG DATA INTEGRATION
Keynote 11:00 – 11:40
Hans Hultgren June 2016
* Register for the Data Vault Certification class CDVDM on GeneseeAcademy.com
AGENDA
1 • Agile & the Modern EDW
2 • Ensemble, DV & ELF
3 • Big Data & Data Modeling
AGILE & DATA WAREHOUSING
Data Warehousing
Integrated, Non-Volatile, Time-Variant, Subject-Oriented Data in support of [the Business]
+ Agile
+ Auditable
+ Predictable
+ Scalable & Repeatable
Agility: Accept Change / Embrace Change
• Recognize that Change is ever-present in your data warehouse
• Accept and Embrace Change
Agility is “The measure of our ability to adapt to Change”
Agility is the primary feature of the enterprise data warehouse
DW Agility: Change Engineering
• Engineering for Change means including Agility as a requirement of the data warehouse and a variable that should be optimized
• The Problem:
– 3NF & Dimensional Modeling techniques are not “Change Friendly”
– Hardened Forms require Re-Engineering
Ensemble Modeling
• Data Modeling for the Data Warehouse
• A family of data modeling approaches that are optimized for the data warehouse
• Data modeling forms that are particularly strong at accommodating Change…
How do we Engineer for Change?
• Engineering for Change in Ensemble Modeling begins with one common premise; one common approach…
“Every time anything changes it impacts the whole thing!”
“Hmmm, why don’t you separate the things that change from the things that don’t change?”
ENSEMBLE MODELING
Ensemble Modeling™
• The constellation of component parts acts as a whole – an Ensemble.
• An Ensemble is based on all things defining a Core Business Concept that can be uniquely and specifically said for one instance of that Concept.
Ensemble Modeling Forms
[Figure: the Ensemble family of modeling forms. Labels: Ensemble, Anchor, Focal Point, Data Vault, Hyper Agility, Temporal, DV2.0, Your Style 2G, Anchor Vault Matter]
Ensemble & Thinking Differently
• The minimal construct for an “entity” such as “Customer” is now (in Data Vault) a Hub with a set of Satellites
[Figure: Customer]
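The Hub-with-Satellites construct can be illustrated with a minimal Python sketch. The class names, the business key "CUST-001", and the satellite names and attributes are assumptions made for illustration; they do not come from the slides, and this is not a full Data Vault implementation.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass(frozen=True)
class Hub:
    """One instance of a Core Business Concept; holds only the business key."""
    business_key: str


@dataclass
class SatelliteRow:
    """A dated slice of context describing one Hub instance."""
    business_key: str
    load_date: date
    attributes: dict


@dataclass
class Satellite:
    """A group of context attributes attached to a Hub."""
    name: str
    rows: list = field(default_factory=list)

    def add(self, hub: Hub, load_date: date, **attributes) -> None:
        # A change is appended as a new row; existing rows are never updated.
        self.rows.append(SatelliteRow(hub.business_key, load_date, attributes))

    def current(self, hub: Hub) -> dict:
        # Return the attributes of the latest row for this Hub instance.
        matching = [r for r in self.rows if r.business_key == hub.business_key]
        return max(matching, key=lambda r: r.load_date).attributes


# Hypothetical Customer ensemble: one Hub, two Satellites.
customer = Hub("CUST-001")
contact = Satellite("customer_contact")
profile = Satellite("customer_profile")

contact.add(customer, date(2016, 1, 1), city="Denver")
profile.add(customer, date(2016, 1, 1), segment="Retail")
# Only the contact context changes; the Hub and the profile Satellite are untouched.
contact.add(customer, date(2016, 6, 1), city="Stockholm")

print(contact.current(customer))
print(len(profile.rows))
```

In this sketch the things that change (context) live in Satellites, while the thing that does not change (the business key) lives in the Hub.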
Business Driven Modeling with EMF
[Figure: business model with the concepts Store, Region, Customer, Sale, Product, Employee, Vendor, Sale LI]
Ensemble Modeling Process
The Modeling Process for creating a Data Vault model includes three primary steps:
1) Identify and Model the Core Business Concepts
• Business interviews are at the heart of this step: What do you do? What are the main things you work with?
• Also find the best/target Natural Business Key
2) Identify and Model the Natural Business Relationships
• Specific Unique Relationships
• Be considerate of the Unit of Work and Grain
3) Analyze and Design the Context Satellites
• Consider Rate of Change, Type of Data and also the Sources of your data during the design process
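Step 3 can be sketched in Python as a grouping exercise. The attribute catalogue, the source names ("crm", "billing"), the "slow"/"fast" labels and the satellite naming pattern are all hypothetical, and the sketch groups only by source and rate of change, leaving type of data aside for brevity.

```python
from collections import defaultdict

# Hypothetical attribute catalogue for the Customer concept:
# (attribute, source system, rate of change)
attributes = [
    ("name", "crm", "slow"),
    ("birth_date", "crm", "slow"),
    ("email", "crm", "fast"),
    ("credit_limit", "billing", "fast"),
    ("payment_terms", "billing", "slow"),
]


def design_satellites(concept, catalogue):
    """Group attributes so each Satellite holds one source and one rate of change."""
    groups = defaultdict(list)
    for attribute, source, rate in catalogue:
        groups[f"sat_{concept}_{source}_{rate}"].append(attribute)
    return dict(groups)


satellites = design_satellites("customer", attributes)
for name, columns in satellites.items():
    print(name, columns)
```

Splitting this way means a fast-changing attribute produces new rows only in its own Satellite, which is one possible reading of the design considerations listed above.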
Big Data & Data Modeling
Modeling is Man’s Search for Meaning…
• Conceptual Modeling
• Logical Modeling
• Information Modeling
• Physical Data Modeling
Inconvenient Truth about BIG DATA
http://community.embarcadero.com/blogs/entry/the-hidden-elephant-in-big-data-modeling
And we Need Data Integration
• Enterprise Data is all about Integration
• Data kept in silos is less valuable than integrated data
• Data needs to be understood in order for it to be used, applied, integrated, leveraged…
[Figure: EDW]
Ensemble Logical Form (ELF)
Logical Business Model
• Leveraged for all logical model needs including the data warehouse, big data lake, master data management (MDM) and operational integration initiatives
[Figure: logical business model with the concepts Region, Store, Customer, Sale, Product, Employee, Vendor, Sale LI]
The Ensemble Backbone
• Core Business Concepts (CBC)
• Natural Business Relationships (NBR)
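A backbone of Core Business Concepts and Natural Business Relationships can be represented very simply, as in the Python sketch below. The concept names are taken from the example model in the slides; the relationships themselves are hypothetical, since the slides do not spell them out, and the helper functions are illustrative only.

```python
# Core Business Concepts taken from the example model in the slides.
cbcs = {"Store", "Region", "Customer", "Sale", "Product", "Employee", "Vendor", "Sale LI"}

# Hypothetical Natural Business Relationships; the slides do not list them.
nbrs = [
    ("Store", "Region"),
    ("Sale", "Customer", "Store", "Employee"),
    ("Sale LI", "Sale", "Product"),
    ("Product", "Vendor"),
]


def unknown_concepts(concepts, relationships):
    """Return concept names used in a relationship but missing from the backbone."""
    return {name for rel in relationships for name in rel} - concepts


def relationships_of(concept, relationships):
    """List every relationship in which the given concept participates."""
    return [rel for rel in relationships if concept in rel]


print(unknown_concepts(cbcs, nbrs))
print(relationships_of("Sale", nbrs))
```

The backbone here carries no attribution at all: only the concepts and how they relate.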
Multiple Paths for Modeling
• Structured / Known: CBC, NBR, Attribution, Columns
• N-Structured / NVP: CBC, NBR, Attribution
• N-Structured / KVP: CBC, NBR
• Big Data Platform: CBC, NBR, Attribution
• Virtualized (DMTD): CBC, NBR, Attribution
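The difference between attribution held as columns and attribution held as name-value pairs can be shown with a short Python sketch. It assumes NVP stands for name-value pairs; the sample row, the field names and both helper functions are hypothetical.

```python
# Structured / known path: attribution is modeled as named columns.
customer_columns = {"customer_key": "CUST-001", "city": "Denver", "segment": "Retail"}


def to_name_value_pairs(row, key_field):
    """Pivot a column-style row into (business key, name, value) triples."""
    key = row[key_field]
    return [(key, name, value) for name, value in row.items() if name != key_field]


def to_columns(pairs, key_field):
    """Rebuild column-style rows from (business key, name, value) triples."""
    rows = {}
    for key, name, value in pairs:
        rows.setdefault(key, {key_field: key})[name] = value
    return list(rows.values())


pairs = to_name_value_pairs(customer_columns, "customer_key")
print(pairs)
print(to_columns(pairs, "customer_key") == [customer_columns])
```

In both shapes the business key ties the attribution back to the same Core Business Concept, so the backbone stays the same whichever path is chosen.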
Big Data Modeling
Ensemble for the Big Data, DMTD Virtualization & Modern EDW World
• Conceptual Modeling +
• Logical Modeling +
• Information Modeling +
• Physical Data Modeling +
• Integration Platform
PREDICTION
• By 2020, 30% of all Data Warehousing Projects will be based on Ensemble Modeling
About Data Vault Ensemble
• Estimated 1,400+ Data Vault based Data Warehouses around the world
Links and Information
• CDVDM Training & Certification: www.GeneseeAcademy.com
• [email protected]
• gohansgo
• Book: DataVaultBook.blogspot.com
• HansHultgren.WordPress.com
• HansHultgren
• DataVaultAcademy
• Online video-lesson training: DataVaultAcademy.com