Common Warehouse Metamodel (CWM), UML and XML
Dr. Daniel T. Chang, IBM Database Technology Institute
Chair, OMG CWM Working Group ([email protected])
Meta Data Conference, March 19-23, 2000
Topics
• Why CWM?
• What is CWM?
• CWM and UML
• The CWM Metamodel
• CWM and XML
• CWM Extensions
• Conclusion
Dan Chang
Why CWM?
The Problem Domain: Data Warehousing/Business Intelligence
[Figure: cleansing tools, transformation tools, a relational warehouse, an OLAP warehouse and end-user tools produce and consume metadata, linked by metadata interchange.]
The Problem
• I/T Perspective
  – Product integration challenges are immense
  – End user requirements are continually changing
• User Perspective
  – We can’t find the information that we need
  – The interpretation of information is a challenge
• Vendor Perspective
  – No industry standard way of sharing information
  – Metadata integration costs are significant
Problem Statement
• Metadata management and integration is the number one integration problem in data warehousing and business intelligence.
  – Data warehousing and business intelligence often involve the use of a variety of tools and products, each with its own definition and format for metadata.
  – Creating, sharing and managing the metadata for these tools and products is time consuming and error prone.
The Solution: CWM
[Figure: cleansing tools, transformation tools, an RDBMS, a data warehouse and end-user tools produce and consume metadata through the Common Warehouse Metamodel and repository services.]
Solution Framework
• A successful framework for solving the metadata management and integration problem must provide:
  – A standard language for defining the structure and semantics of metadata in a formal way
  – A standard interchange mechanism for sharing metadata defined in the standard language
  – A common specification that defines, in the standard language, the structure and semantics of shared metadata in data warehousing and business intelligence
Solution Requirements: CWMI RFP
• Interchange of all warehouse metadata including both technical metadata and business metadata
• Interchange of metadata that describes all warehouse data elements including data sources, transformations and data targets
• Interchange of metadata that describes all warehouse processing elements including scheduling, status reporting and history recording
Solution Requirements: CWMI RFP
• Interchange of metadata that describes informational data and the use of major types of informational data models (such as multidimensional and hierarchical classification) for representing informational data
• Interchange of metadata that describes operational data and the use of major types of operational data models (such as relational, object-oriented and hierarchical) for representing operational data
What is CWM?
CWM
• A complete specification of the syntax and semantics needed to export/import shared warehouse metadata and the common warehouse metamodel, including:
  – The CWM Metamodel (Volume 1)
  – Interchange format for shared warehouse metadata (CWM DTD, Volume 2)
  – Interchange format for the CWM Metamodel (CWM XML, Volume 2)
  – Access API for shared warehouse metadata (CWM IDL, Volume 2)
CWM Design Basis (I)
• OMG Metamodeling Architecture: the best starting point for developing a solution framework
  – Metamodeling language (M3)
  – Metamodels (M2)
  – Metadata or models (M1)
  – Data or objects (M0)
OMG Metamodeling Architecture
[Figure: four-layer stack, with side labels Application and Middleware]
• Meta-metamodel Layer (M3) – MOF: Class, Attribute, Operation, Association
• Metamodel Layer (M2) – UML: Class, Attribute; CWM: Table, Column; ElementType, Attribute
• Metadata/Model Layer (M1) – Stock: name, price
• User Data/Object Layer (M0)
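The four layers can be made concrete with a minimal Python sketch, using the Stock example from the slide. The class names `MetaClass`, `MetamodelElement` and `ModelElement`, the `conforms` helper, and the M0 values are all hypothetical illustrations; they are not part of MOF, UML or CWM.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MetaClass:
    """M3: a MOF-like construct that metamodel elements are instances of."""
    name: str


@dataclass
class MetamodelElement:
    """M2: a metamodel element such as UML Class or CWM Table."""
    name: str
    meta: MetaClass


@dataclass
class ModelElement:
    """M1: a piece of metadata, e.g. the Stock model and its attributes."""
    name: str
    kind: MetamodelElement
    attributes: List[str] = field(default_factory=list)


# M3: the meta-metamodel layer (MOF: Class, Attribute, ...)
mof_class = MetaClass("Class")

# M2: the metamodel layer (UML: Class)
uml_class = MetamodelElement("UML Class", mof_class)

# M1: the metadata/model layer (Stock: name, price)
stock_model = ModelElement("Stock", uml_class, ["name", "price"])

# M0: user data described by the M1 model (values are made up)
stock_object = {"name": "ACME", "price": 12.5}


def conforms(obj: dict, model: ModelElement) -> bool:
    """Check that an M0 object has exactly the attributes its M1 model declares."""
    return set(obj) == set(model.attributes)


print(conforms(stock_object, stock_model))
```

Each layer is described by the one above it: the data by the Stock model, the model by UML Class, and UML Class by the MOF construct.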
OMG Modeling Architecture
[Figure: layered stack]
• Applications, Tools, Repositories
• Metamodels (UML, CWM, …)
• Meta Object Facility (MOF Model, MOF-IDL)
• XML Metadata Interchange (XMI)
CWM Co-submitting Companies
• IBM (Dan Chang, J. J. Daudenarde, Debra LaVergne, Christoph Lingenfelder)
• Unisys (Sridhar Iyengar, Don Baisley, Doug Tolbert)
• NCR (Vihelm Rosenqvist, Bruce McLean)
• Hyperion (John Poole, David Zhang)
• Oracle (David Last, David Mellor, Mark Hornick)
• UBS (Hans-Peter Hoidn, Jeffrey Peckham)
• Genesis (David Frankel, Phil Longden)
• Dimension EDI (Chris Nelson, Anders Tornqvist)
• Expertise in UML, XML, metadata repository, databases, data warehousing, and business intelligence (OLAP, data mining)
CWM Supporting Companies
• Deere (Dave Smith)
• Sun (Chuck Mosher, Karsten Riemer, Nidhi Rao)
• HP (Jishnu Mukerji)
• Data Access (Cory Casanave)
• InLine Software (Jack Greenfield)
• Aonix (Charles Simon)
• Hitachi (Yuichi Yugawa)
• Expertise in using databases, data warehouses, and business intelligence tools
CWM and UML
Why UML - Interchange?
• Why not XML DTD or XML Schema?
  – XML DTD
    • Primitive data model: no inheritance, no operations, no associations, no constraints
    • Only string data types
  – XML Schema
    • Same primitive data model as above
    • Richer data types
• UML
  – Rich object-oriented model with associations/constraints
  – Extensible data types: CORBA (MOF), etc.
  – UML => XML DTD (per XMI) and/or XML Schema (per XMI, coming soon)
Why UML - Access?
• Why not CORBA IDL or Java?
  – CORBA IDL
    • Interface only, little structure or semantics
    • CORBA data types
  – Java
    • Java object model
    • Java data types
• UML
  – Rich object-oriented model with associations/constraints
  – Extensible data types: CORBA (MOF), etc.
  – UML => CORBA IDL (per MOF) and/or Java (per JMI, coming soon)
UML 1.3
[Figure: UML 1.3 package structure]
• Behavioral_Elements (from UML): Activity_Graphs, Collaborations, Use_Cases, State_Machines, Common_Behavior
• Model_Management (from UML)
• Foundation (from UML): Extension_Mechanisms, Core, Data_Types
CWM Design Basis (II)
• OMG Metamodeling Architecture
  – UML as the standard language for defining models of metadata
    • UML semantics (Class, Attribute, Operation; Association, Role/AssociationEnd; Constraint)
    • UML notation (class diagram, object diagram, collaboration diagram)
    • OCL (Object Constraint Language)
Roles of UML in CWM
• The MOF-equivalent meta-metamodel
  – UML Semantics, UML Notation, OCL
• The foundation metamodel
  – UML Foundation, Common_Behavior, and Model_Management packages
• The object (resource) metamodel
  – Same as above
The CWM Metamodel
The CWM Metamodel
[Figure: layered package diagram of the CWM Metamodel, top to bottom]
• Management – Warehouse Process, Warehouse Operation
• Analysis – Transformation, OLAP, Data Mining, Information Visualization, Business Nomenclature
• Resource – Object (UML), Relational, Record, Multi Dimensional, XML
• Foundation – Business Information, Data Types, Expressions, Keys Index, Type Mapping, Software Deployment
• UML 1.3 (Foundation, Behavioral_Elements, Model_Management)
CWM - Top Level
The top-level packages in CWM:
• UML (org.omg.uml) { UML 1.3 }
• CWM (org.omg.cwm)
CWM - Overview
[Figure: CWM packages]
• Foundation
• Management
• Resource
• Analysis
CWM - Resource
[Figure: Resource packages]
• UML
• Relational
• Record
• Multidimensional
• XML
CWM - Analysis
[Figure: Analysis packages]
• Transformation
• Olap
• InformationVisualization
• DataMining
• BusinessNomenclature
Relational Metamodel
[Figure: class diagram of the Relational package]
• UML base classes: Classifier (from Core), Package (from Model_Management), Attribute (from Core), Method (from Core), Class (from Core)
• Relational classes: Catalog, Schema, ColumnSet, Table, BaseTable, View, SQLQuery, Column, Procedure, SQLDataType, SQLSimpleType, SQLDistinctType, SQLStructuredType
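As a rough sketch of how Relational classes specialize UML classes, the Python below assumes the correspondence given in the Resource/Analysis matrix (Catalog and Schema as Packages, Table as a Class, Column as an Attribute). The generalization and containment shown here, the `sql_type` field, the `qualified_columns` helper, and all instance names are illustrative assumptions, not taken from the CWM specification.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Attribute:
    """Stand-in for UML Attribute (from Core)."""
    name: str


@dataclass
class Class:
    """Stand-in for UML Class (from Core)."""
    name: str
    attributes: List[Attribute] = field(default_factory=list)


@dataclass
class Package:
    """Stand-in for UML Package (from Model_Management)."""
    name: str
    owned_elements: list = field(default_factory=list)


@dataclass
class Column(Attribute):
    sql_type: str = "VARCHAR"  # hypothetical field for illustration


@dataclass
class Table(Class):
    pass


@dataclass
class Schema(Package):
    pass


@dataclass
class Catalog(Package):
    pass


def qualified_columns(catalog: Catalog) -> List[str]:
    """List every column as catalog.schema.table.column."""
    names = []
    for schema in catalog.owned_elements:
        for table in schema.owned_elements:
            for column in table.attributes:
                names.append(f"{catalog.name}.{schema.name}.{table.name}.{column.name}")
    return names


# Hypothetical metadata instance
stock = Table("STOCK", [Column("NAME"), Column("PRICE", "DECIMAL")])
catalog = Catalog("SALES_DB", [Schema("PUBLIC", [stock])])
print(qualified_columns(catalog))
```

Because `Table` is also a `Class` and `Column` is also an `Attribute`, a tool that only understands the UML base classes could still walk this metadata.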
OLAP Metamodel
[Figure: class diagram of the OLAP package]
• UML base classes: Package (from Model_Management), Class (from Core), Attribute (from Core)
• OLAP classes: Schema, Dimension, Cube, CubeDimAssoc, CubeRegion, Hierarchy, Level, CodedLevel, HierarchyLevelAssoc, MemberSelection, MemberSelGrp, StructureMap, Measure
• Related class: TransformationMap (from Transformation)
Transformation Metamodel
[Figure: class diagram of the Transformation package]
• Classes: TransformationMap, ClassifierMap, FeatureMap, ClassifierFeatureMap, Classifier (from Core), Feature (from Core)
• Association roles: /namespace, /ownedElement, classifierMap, featureMap, cfMap, source, target, classifier, feature
• Multiplicities shown: 0..1, 1..*, *
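The classes named on this slide can be sketched in Python as follows. This is a simplified illustration: only TransformationMap, ClassifierMap, FeatureMap, Classifier and Feature with their source and target roles are modeled, ClassifierFeatureMap is omitted, and the exact multiplicities and ownership are assumptions. The `lineage` helper and all instance names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Feature:
    name: str


@dataclass
class Classifier:
    name: str
    features: List[Feature] = field(default_factory=list)


@dataclass
class FeatureMap:
    """Maps one or more source features to one or more target features."""
    source: List[Feature]
    target: List[Feature]


@dataclass
class ClassifierMap:
    """Maps source classifiers to target classifiers and groups feature maps."""
    source: List[Classifier]
    target: List[Classifier]
    feature_maps: List[FeatureMap] = field(default_factory=list)


@dataclass
class TransformationMap:
    name: str
    classifier_maps: List[ClassifierMap] = field(default_factory=list)

    def lineage(self, target_feature: str) -> List[str]:
        """Return the names of source features that feed the given target feature."""
        sources = []
        for cmap in self.classifier_maps:
            for fmap in cmap.feature_maps:
                if any(f.name == target_feature for f in fmap.target):
                    sources.extend(f.name for f in fmap.source)
        return sources


# Hypothetical example: a relational table feeding a dimension
prod_name = Feature("PROD_NAME")
product_table = Classifier("PRODUCT", [prod_name])
product_name = Feature("productName")
product_dim = Classifier("Product", [product_name])

tmap = TransformationMap(
    "ProductLoad",
    [ClassifierMap([product_table], [product_dim],
                   [FeatureMap([prod_name], [product_name])])],
)
print(tmap.lineage("productName"))
```

Keeping source and target as lists reflects the many-to-many multiplicities visible in the diagram; a tool could use the same structure to answer lineage questions in either direction.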
OLAP Deployment to Relational
[Figure: class diagram relating OLAP and Multidimensional classes to Relational classes through Transformation maps]
• From Transformation: Transformation, TransformationMap, ClassifierMap, FeatureMap
• From Core: Class, Attribute
• From Relational: ColumnSet, Table, Column
• From Multidimensional: Dimension, DimensionedObject
• OLAP classes: Dimension, Hierarchy, LevelBasedHierarchy, Level, HierarchyLevelAssoc, StructureMap
• Association roles and constraints: /source, /target, /ClassifierMapToFeatureMap, currentLevel, {ordered}
CWM Resource/Analysis Matrix
[Table: each CWM Resource/Analysis package set against the UML concepts Package, Class and Attribute]
• UML – Package, Class, Attribute
• Object – Package, Class, Attribute
• Relational – Catalog, Schema, Table, Column
• Record – RecordFile, RecordDef, Field
• Multidimensional – Schema, Dimension, DimensionedObject
• XML – Schema, ElementType, Attribute
• OLAP – Schema, Dimension, Cube, Attribute, Measure
CWM Specification
[Figure: the CWM Specification, containing the CWM Metamodel (in UML Notation)]
CWM and XML
XML Metadata Interchange (XMI)
• Use W3C XML for the transfer syntax and interchange format
  – Specify XML DTD to enable transfer and verification of
    • MOF-based metamodels (using MOF DTD)
    • UML-based models (using UML DTD)
• Specify a precise MOF to XML Mapping
  – Enables automatic generation of XML DTDs
  – Enables automatic generation of XML documents
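To illustrate what automatic DTD generation from a metamodel might look like, here is a minimal Python sketch. It is not the XMI DTD production rules: the `generate_dtd` function, the dictionary input format, and the `Class.attribute` element naming are simplifying assumptions chosen for illustration.

```python
from typing import Dict, List


def generate_dtd(classes: Dict[str, List[str]]) -> str:
    """Emit one element declaration per class and one per attribute.

    `classes` maps a class name to its attribute names (a toy stand-in
    for a metamodel; real XMI rules cover far more than this).
    """
    lines = []
    for class_name, attributes in classes.items():
        if attributes:
            children = ", ".join(f"{class_name}.{attr}" for attr in attributes)
            lines.append(f"<!ELEMENT {class_name} ({children})>")
        else:
            lines.append(f"<!ELEMENT {class_name} EMPTY>")
        for attr in attributes:
            # Every attribute value is carried as character data.
            lines.append(f"<!ELEMENT {class_name}.{attr} (#PCDATA)>")
    return "\n".join(lines)


dtd = generate_dtd({"Table": ["name"], "Column": ["name", "type"]})
print(dtd)
```

Because the declarations are derived mechanically from the metamodel, any change to the metamodel can be followed by regenerating the DTD rather than editing it by hand.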
XMI Simplified
[Figure: XMI brings together XML (syntax and encoding), MOF (metadata definitions and management) and UML (modeling language). XML DTDs (UML 1.3 DTD, CWM DTD, MOF 1.3 DTD) validate XML documents (UML models, CWM metadata, MOF metamodels).]
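XMI - Automobile Example
[Figure: a UML model and the XMI DTD are processed by XMI to yield an XMI document]
• UML Model: class Auto with attributes Color : String, Door : Integer, Engine : Integer
• XMI Document (element values): Red, 4, 2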
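The Auto example can be sketched in Python using the standard `xml.etree.ElementTree` module. The element naming (`Auto.Color` and so on), the `AUTO_MODEL` dictionary, and the `to_xml` / `from_xml` helpers are assumptions for illustration; the markup produced here is not the XMI document format shown on the original slide.

```python
import xml.etree.ElementTree as ET

# Toy model: class name -> attribute name -> Python type
AUTO_MODEL = {"Auto": {"Color": str, "Door": int, "Engine": int}}


def to_xml(class_name: str, values: dict) -> ET.Element:
    """Build an XML element for one object, checking it against the model."""
    declared = AUTO_MODEL[class_name]
    root = ET.Element(class_name)
    for attr, attr_type in declared.items():
        value = values[attr]
        if not isinstance(value, attr_type):
            raise TypeError(f"{attr} must be {attr_type.__name__}")
        child = ET.SubElement(root, f"{class_name}.{attr}")
        child.text = str(value)
    return root


def from_xml(root: ET.Element) -> dict:
    """Read the object back, converting text to the declared types."""
    declared = AUTO_MODEL[root.tag]
    prefix = root.tag + "."
    result = {}
    for child in root:
        attr = child.tag[len(prefix):]
        result[attr] = declared[attr](child.text)
    return result


doc = to_xml("Auto", {"Color": "Red", "Door": 4, "Engine": 2})
text = ET.tostring(doc, encoding="unicode")
print(text)
print(from_xml(ET.fromstring(text)))
```

The round trip through `from_xml` shows the point of a model-driven interchange format: the receiver recovers typed values because both sides share the same model.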
CWM Design Basis (III)
• OMG Metamodeling Architecture
  – UML as the standard language for defining models of metadata
  – XMI as the standard mechanism for interchanging metadata and metamodels in XML
    • XML DTD Production Rules
    • XML Document Production Rules
CWM Specification: CWM XML, CWM DTD
[Figure: the CWM Specification, containing the CWM Metamodel (in UML Notation), shown with the MOF DTD, CWM XML and the CWM DTD, leading to CWM Metadata Interchange (in XML)]
CWM Extensions
CWM Extensions (CWMX)
• Published vendor specific metamodel for the purpose of metadata interchange (Volume 3)
• Common ancestry in the CWM metamodel
• Demonstrates the validity of the CWM metamodel
• Demonstrates the extensibility of the CWM metamodel
CWMX - Top Level
The top-level packages in CWMX:
• UML (org.omg.uml) { UML 1.3 }
• CWM (org.omg.cwm)
• CWMX (org.omg.cwmx)
CWMX - Foundation
[Figure: Foundation packages]
• UML
• Entity Relationship
CWMX - Resource
[Figure: Resource packages]
• UML
• Relational
• Record
• Multidimensional
• XML
• COBOLData
• IMS
• DMSII
• Essbase
• Express
CWMX - Analysis
[Figure: Analysis packages]
• Transformation
• Olap
• InformationVisualization
• DataMining
• BusinessNomenclature
• InformationSet
• InformationReporting
Conclusion
CWM Design Basis
• OMG Metamodeling Architecture
  – UML as the standard language for defining metamodels
  – XMI as the standard mechanism for interchanging metadata and metamodels in XML
  – MOF to IDL Mapping as the standard mechanism for accessing metadata through APIs (independent of programming languages and object models)
  – MOF to Java Mapping as the standard mechanism for accessing metadata in Java (coming soon)
CWM
• A common specification that defines, in UML, the structure and semantics of shared metadata in data warehousing and business intelligence
  – Resources: Object, Relational, Record, Multidimensional, XML
  – Analysis: Transformation, OLAP, Data Mining, Information Visualization, Business Nomenclature
• A common specification that defines, in XML, the interchange format and, in IDL, the access API of such shared metadata
CWM Specification: CWM XML, CWM DTD, CWM IDL
[Figure: the CWM Specification, containing the CWM Metamodel (in UML Notation), shown with the MOF DTD, CWM XML, the CWM DTD and CWM IDL, leading to CWM Metadata Interchange (in XML) and CWM Metadata Access]
CWM
• Enables interchange and access of shared metadata at three abstraction levels
  – UML – Object: Class, Attribute
  – CWM – Relational: Table, Column; XML: ElementType, Attribute; ...
  – CWMX – ER: Entity, Attribute; Your own ...
CWM: Past, Present and Future
• Past
  – Initial submission: 9/17/99
  – OMG Demo: 11/99
  – Evaluation: 9/99 - 1/00
• Present
  – Final submission: 2/11/00
  – Evaluation: 2/00 - present
• Future
  – Adopted specification: 3/00 or 6/00
  – Available specification: 9/00 or 12/00
  – Product enablement and interoperability showcase
November 1999 OMG Demo
CWM References
• OMG CWMI RFP Web page http://www.omg.org/techprocess/meetings/schedule/CWMI_RFP.html
  – CWM Specification (ad/2000-01-01)
  – CWM Specification Volume 2. XML, IDL and DTD (ad/2000-01-02)
  – CWM Specification Volume 3. Extensions (CWMX) (ad/2000-01-03)
  – CWM Specification Volume 4. Extensions XML, IDL and DTD (ad/2000-01-11)
• CWM Forum Web site http://www.cwmforum.org/