Generic Data Model Pattern for Data Warehouse

2011 International Conference on Electrical Engineering and Informatics 17-19 July 2011, Bandung, Indonesia D3 - 2 Generic Data Model Pattern for Da...

Author: Jocelyn Porter

2011 International Conference on Electrical Engineering and Informatics 17-19 July 2011, Bandung, Indonesia

D3 - 2

Generic Data Model Pattern for Data Warehouse

Pocut Viqarunnisa #1, Hira Laksmiwati #2, Fazat Nur Azizah #3

# Data and Software Engineering Research Group (DSE-RG)
Sekolah Teknik Elektro dan Informatika (STEI), Institut Teknologi Bandung (ITB)
Jln. Ganesha no. 10, Bandung, Indonesia

1 [email protected]
2 [email protected]
3 [email protected]

Abstract: Useful decision-making information can be produced through a subject-oriented data warehouse that stores integrated, time-variant, and non-volatile data. The key to building such a data warehouse is a good data model that defines the structure of the data kept in it; the correctness and completeness of the resulting information depend on how well the data model is constructed. One way to obtain a good data model is to utilize patterns. This research derives eighteen generic data model patterns for data warehouses from which a modeller can choose. They are created based on an analysis of data warehousing needs, existing patterns, and Kimball’s case studies. To measure the reusability of the patterns, four metrics are defined: two related to flexibility and two related to comprehensibility. The test results show that the flexibility metric scores are adequate, while the comprehensibility metric scores are almost perfect. The frequency with which the patterns occur was tested on two case studies. The test concludes that patterns associated with changes in dimensions, product heterogeneity, and multi-valued attributes are seldom or almost never used, while the patterns used most frequently are those related to dimension tables, especially the generic dimension pattern and the date dimension pattern.

Keywords: Data warehouse, data model, pattern, reusability

I. INTRODUCTION

Patterns have been developed in many areas of software engineering in order to produce high-quality solutions in a short amount of time. Patterns simplify analysis and design by reusing existing solutions to particular problems. A widely used collection of patterns in software design is the design patterns [5]. Data modelling is an area of software engineering in which patterns have been developed to assist the data design process. In general, data model patterns are divided into two groups: domain-specific patterns (seed models) and generic data model patterns. A domain-specific pattern is developed at a low level of abstraction, so it is tied to a particular domain [4] [6]. A generic data model pattern, on the other hand, is developed to model data at a higher level of abstraction, so that its use does not depend on a particular application domain [1] [2] [3].

A data warehouse is a data store whose purpose is to produce accurate and useful information for decision making. To produce high-quality information, the data warehouse requires a good data model. This data model determines the structure of the data stored in the data warehouse, so the accuracy of the information depends on how the data model is structured. One way to obtain a good data model is to utilize data model patterns. In [2], [3], and [9], generic data model patterns have been developed based on best practices of data modelling for data warehouses. However, these patterns consider only the structure of the data model, without addressing business needs. To produce better-quality information, a data warehouse should also take business needs into account. This research aims to develop generic data model patterns for data warehouses based not only on the structure of the data model, but also on the underlying business requirements.

978-1-4577-0752-0/11/$26.00 ©2011 IEEE

II. RELATED STUDIES

The data warehouse pattern, or transaction summary for data warehouse pattern, is one pattern defined to describe the structure of a data warehouse [2]. It is a structure-oriented pattern that solves a data modelling problem by offering a specific data model structure.

The star schema pattern is a data model pattern commonly applied in data warehouse applications [3]. It is a structure-oriented pattern that represents data as facts related to dimensions. Facts measure business performance or aspects of the business, while dimensions specify the basis of the facts. The structure is simple, so it is easy to use for ad-hoc querying to get an idea about the enterprise. The primary key of each dimension becomes a foreign key in the fact table, and the combination of these foreign keys becomes the primary key of the fact table.

Stars is a pattern language for data warehouses [9] that is constructed in two parts. The first part discusses the issue of finding and managing the relevant factors of the business that need to be analysed. The second part relates to implementing these factors in a star schema for querying the system (Fig. 1).

Fig. 1 Generic STAR Schema [9]

III. DEVELOPMENT OF THE PATTERNS

We define several stages for pattern development: top-down analysis, bottom-up analysis, identification of the pattern candidates, pattern design, and finally organization and identification of the patterns.

In this research, top-down analysis was conducted on the data warehouse patterns [2], the star schema patterns [3], and Stars [9]. These three collections of patterns encompass the same basic structures. Based on a study of those patterns against the characteristics of a data model for a data warehouse, we defined four candidate patterns: transaction history, rollup dimension pattern, time pattern, and stars pattern.

Bottom-up analysis was carried out to find what is common across several cases of data warehouse modelling. In total, 14 cases presented in [8] were studied in this research. Based on the study, we found several candidate patterns:

1. Generic star schema pattern: obtained from the general structure of the star schema, including the basic concepts of dimensional modelling.
2. The data warehouse bus matrix pattern: a candidate pattern to describe the state of the entire enterprise and to support consistent information across the organization.
3. Four candidate patterns related to the fact tables in the dimensional modelling concept: transaction level fact table pattern, periodic snapshot fact table pattern, accumulation snapshot fact table pattern, and fact-less fact table pattern.
4. Six candidate patterns related to the dimension table: generic dimension pattern, date dimension pattern, degenerate dimension pattern, role playing dimension pattern, junk dimension pattern, and outrigger dimension pattern.
5. Two candidate patterns associated with the problem of multiple level hierarchies: multiple level hierarchy pattern and variable depth hierarchy pattern.
6. Two candidate patterns related to how facts change: slowly changing dimension pattern and rapid changing dimension pattern.
7. Two candidate patterns to deal with product diversity and the diversity of product attributes: heterogeneous product pattern and multi-valued attribute pattern.

IV. GENERIC DATA MODEL PATTERNS

Based on the pattern candidates, 18 generic data model patterns for data warehouses are defined. The patterns are classified into 2 main patterns and 4 groups of patterns. The relationships among the patterns can be observed in Fig. 2.

Fig. 2 Relationships among the generic patterns

1. Generic Star Schema Pattern

The generic star schema pattern is designed to meet the characteristics of a subject-oriented data warehouse: it is designed for analysis (slicing and dicing) and is easily understood by the business community. The basic structure of this pattern is found in the data warehouse pattern [2], the star schema pattern [3], Stars [9], and the general structure of the dimensional schema. The complete structure of this pattern is illustrated in Fig. 3. Each dimension table must have a primary key, which is referred to by a foreign key in the fact table, as well as several supporting attributes. The combination of the foreign keys referring to the dimension tables becomes the primary key of the fact table. This means that each fact (row) recorded in the fact table is related to a combination of the referred values in the dimension tables. For each of these facts, several measurements can be recorded.

2. Data Warehouse Bus Matrix Pattern

The data warehouse bus matrix pattern is designed to support the production of consistent organizational information and to describe the state of the entire enterprise. To achieve consistency of information throughout the enterprise, each dimension and fact in every subject must have the same level of granularity. This is achieved by providing conformed dimensions and conformed facts. To ensure that all business needs are met, each [business] subject must be mapped to each common dimension. The mapping is carried out using the bus matrix shown in Fig. 4, which is a generic form of the bus matrix presented by Kimball in [8].
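The subject-to-dimension mapping behind the bus matrix can be sketched in a few lines of Python. This is only an illustration under assumptions: the subject and dimension names are hypothetical placeholders in the style of Fig. 4, and the helper names (`all_dimensions`, `shared_dimensions`, `render`) are not taken from the paper or from [8].

```python
# Hypothetical subjects (business processes) and the common dimensions each one uses.
bus_matrix = {
    "Subject1": {"Dimension1", "Dimension2", "Dimension3"},
    "Subject2": {"Dimension1", "Dimension3"},
    "Subject3": {"Dimension2", "Dimension4"},
}

def all_dimensions(matrix):
    """Return every dimension named anywhere in the matrix, sorted for stable output."""
    return sorted(set().union(*matrix.values()))

def shared_dimensions(matrix):
    """Return the dimensions used by two or more subjects (candidates for conforming)."""
    return [d for d in all_dimensions(matrix)
            if sum(d in dims for dims in matrix.values()) >= 2]

def render(matrix):
    """Render the matrix as text; an 'X' marks that a subject uses a dimension."""
    dims = all_dimensions(matrix)
    lines = ["Subject".ljust(8) + " | " + " | ".join(dims)]
    for subject in sorted(matrix):
        marks = ["X".center(len(d)) if d in matrix[subject] else " " * len(d)
                 for d in dims]
        lines.append(subject.ljust(8) + " | " + " | ".join(marks))
    return "\n".join(lines)

print(render(bus_matrix))
print("Shared dimensions:", shared_dimensions(bus_matrix))
```

A dimension that appears in more than one row of the matrix is the kind of dimension that has to be conformed, which is why the sketch reports the shared ones separately.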

Fig. 3 Generic Star Schema Pattern (diagram: a central Fact table holding the foreign keys Dim1 Key, Dim2 Key, Dim3 Key, Dim... Key and the facts, surrounded by the tables Dimension 1, Dimension 2, Dimension 3, Dimension ..., each with a primary key and its attributes)
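As a minimal sketch of the structure in Fig. 3, the following Python script builds a two-dimension star schema in an in-memory SQLite database. The table and column names are assumptions chosen to mirror the figure labels, and the sample rows are invented for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dimension1 (dim1_key INTEGER PRIMARY KEY, dim1_attribute TEXT);
CREATE TABLE dimension2 (dim2_key INTEGER PRIMARY KEY, dim2_attribute TEXT);
-- The fact table's primary key is the combination of its foreign keys.
CREATE TABLE fact (
    dim1_key INTEGER NOT NULL REFERENCES dimension1(dim1_key),
    dim2_key INTEGER NOT NULL REFERENCES dimension2(dim2_key),
    measure  REAL,
    PRIMARY KEY (dim1_key, dim2_key)
);
""")
conn.executemany("INSERT INTO dimension1 VALUES (?, ?)", [(1, "A"), (2, "B")])
conn.executemany("INSERT INTO dimension2 VALUES (?, ?)", [(1, "X"), (2, "Y")])
conn.executemany("INSERT INTO fact VALUES (?, ?, ?)",
                 [(1, 1, 10.0), (1, 2, 5.0), (2, 1, 7.5)])

# Slice and dice: total the measure per attribute value of dimension 1.
totals = conn.execute("""
    SELECT d1.dim1_attribute, SUM(f.measure)
    FROM fact AS f
    JOIN dimension1 AS d1 ON d1.dim1_key = f.dim1_key
    GROUP BY d1.dim1_attribute
    ORDER BY d1.dim1_attribute
""").fetchall()
print(totals)
```

Because the composite primary key covers both foreign keys, a second fact row for the same combination of dimension values would be rejected, which matches the statement that each fact relates to one combination of referred dimension values.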

Fig. 4 Data Warehouse Bus Matrix Pattern (matrix: rows Subject1, Subject2, Subject3, Subject4, Subject...; columns are the common dimensions Dimension1, Dimension2, Dimension3, Dimension4, Dimension...)

3. Transaction Level Fact Table Pattern

The transaction level fact table pattern is a pattern associated with the fact table. It handles transaction-level events: each transaction event becomes one record in the fact table. The pattern is usually marked by naming the fact table with transaction fact or entry fact. The facts that appear in the fact table show the number of transactions for a single event. A transaction level fact table must have a time dimension, as shown in Fig. 5.

Fig. 5 Transaction Level Fact Table Pattern

4. Periodic Snapshot Fact Table Pattern

The periodic snapshot fact table pattern is a pattern associated with the fact table. It describes the situation at a certain snapshot of time, which can be daily, weekly, monthly, or annual. One record in the fact table describes an event in a period of time. Therefore, this pattern must involve a time dimension. The pattern is usually marked by naming the fact table with snapshot fact. The facts that appear in the fact table indicate the amount or measurement for a period of time. A common form of this pattern can be observed in Fig. 6.

Fig. 6 Periodic Snapshot Fact Table Pattern

5. Accumulation Snapshot Fact Table Pattern

The accumulation snapshot fact table pattern is a pattern associated with the fact table. It describes the conditions of a sequence of states in a single fact table. This approach is also suitable for tracking changes from one state to another. One record in this fact table describes a series of circumstances or subprocesses of a long process. Because this pattern describes a series of objects, it is related to the role playing dimension pattern (see pattern 10), which describes several references to the time dimension in the fact table. Because each row in the fact table represents a fact related to each subprocess, it presents a more detailed level than the transaction level fact or the periodic snapshot fact. This pattern is usually marked by naming the fact table with accumulating fact or with the name of the process flow. The general form of this pattern can be observed in Fig. 7.

Fig. 7 Accumulation Snapshot Fact Table Pattern

6. Fact-less Fact Table Pattern

The fact-less fact table pattern is a pattern associated with the fact table. It is intended to record an event that does not have any measurement other than the number of occurrences of the event. The pattern is characterized by the absence of any measurement except for an event counter, which is always 1 for every fact recorded in the fact table. An example of the use of this pattern is student attendance. The general form of the fact-less fact table pattern can be seen in Fig. 8.

Fig. 8 Fact-less Fact Table Pattern

7. Generic Dimension Pattern

The generic dimension pattern is a generic form of the dimension table. In a star schema, a dimension can be people, place, time, or other things that are related to a particular fact. A generic form of dimension contains the dimension key and other attributes. A common form of this pattern can be observed in Fig. 9.

Fig. 9 Generic Dimension Pattern

8. Date Dimension Pattern

The date dimension pattern is a pattern associated with the dimension table. It is used to meet the characteristics of a data warehouse in handling the aspect of time (historical data). This pattern has the same structure as the time pattern in Stars [9]. The date/time dimension is modelled in a separate table with multiple attributes such as basic date, fiscal period, seasons, holidays, and so on. A common form of this pattern can be seen in Fig. 10.
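A date dimension of this kind is typically generated rather than entered by hand. The sketch below shows one way to do so in Python; the attribute names, the `YYYYMMDD` surrogate key, the fiscal year starting in July, and the sample holiday are all assumptions made for illustration and are not prescribed by the pattern.

```python
from datetime import date, timedelta

def build_date_dimension(start, end, fiscal_year_start_month=7, holidays=frozenset()):
    """Return one dictionary (row) per calendar day from start to end inclusive."""
    rows = []
    day = start
    while day <= end:
        # Assumption: days on or after the fiscal start month get next year's label.
        fiscal_year = day.year + 1 if day.month >= fiscal_year_start_month else day.year
        rows.append({
            "date_key": int(day.strftime("%Y%m%d")),  # e.g. 20110717
            "full_date": day.isoformat(),
            "month": day.month,
            "quarter": (day.month - 1) // 3 + 1,
            "year": day.year,
            "fiscal_year": fiscal_year,
            "is_weekend": day.weekday() >= 5,          # Saturday or Sunday
            "is_holiday": day in holidays,
        })
        day += timedelta(days=1)
    return rows

# Hypothetical three-day range with one invented holiday.
rows = build_date_dimension(date(2011, 7, 17), date(2011, 7, 19),
                            holidays=frozenset({date(2011, 7, 18)}))
for row in rows:
    print(row)
```

Keeping these attributes in a separate table lets every fact table share the same calendar logic through a single foreign key.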

Fig. 10 Date Dimension Pattern

9. Degenerate Dimension Pattern

The degenerate dimension pattern is a pattern associated with a dimension table in which some of the dimension's attributes are implemented as attributes of a fact table instead of a separate dimension table. A common form of this pattern can be seen in Fig. 11.

Fig. 11 Degenerate Dimension Pattern

10. Role Playing Dimension Pattern

The role playing dimension pattern is a pattern associated with the dimension table. It is used when one dimension has several roles in a fact table. The accumulation snapshot fact table pattern (see pattern 5) shows a typical use of the role playing dimension pattern, since the time dimension plays many roles in that fact table. The role playing dimension pattern is also typically used to show departure and arrival locations that both refer to a location dimension. A common form of this pattern can be seen in Fig. 12.

Fig. 12 Role Playing Dimension Pattern

11. Junk Dimension Pattern

The junk dimension pattern is a pattern associated with the dimension table. It aims to group dimensions with low cardinality into one dimension table. The attributes of this dimension table are usually indicators. Each record in the dimension table holds one possible combination of the indicators. The name of this dimension table usually contains the word indicator or a combination of words. The modelling of a junk dimension can be seen in Fig. 13.

Fig. 13 Junk Dimension Pattern

12. Outrigger Dimension Pattern

The outrigger dimension pattern is a pattern associated with the dimension table. It is required when there are two dimension tables in which the attributes of one dimension table differ in substance from those of the other dimension table (called the parent dimension), but are still associated with the parent dimension. A common form of this pattern can be observed in Fig. 14.

Fig. 14 Outrigger Dimension Pattern

13. Multiple Level Hierarchy Pattern

The multiple level hierarchy pattern handles multiple hierarchies (hierarchies at different levels) in a dimension table. The hierarchy is not modelled in normal form, but denormalized into a single table. The generic form of the multiple level hierarchy pattern is shown in Fig. 15. An example of a hierarchy is shown in Fig. 16, and an example of a dimension table based on that hierarchy is shown in Fig. 17.

Fig. 15 Multiple Level Hierarchy Pattern

Fig. 16 Hierarchy Example

Dim Key  Desc / Name  Attributes  Group A  SubGroup A  Group B
1        Name1        Attr1       A1       A11         B1
2        Name2        Attr2       A1       A11         B2
3        Name3        Attr3       A1       A12         B3
4        Name4        Attr4       A1       A12         B1
5        Name5        Attr5       A2       A21         B1
6        Name6        Attr6       A2       A21         B2
7        Name7        Attr7       A2       A21         B3
8        Name8        Attr8       A2       A22         B3
...

Fig. 17 Multiple Level Hierarchy Detail Pattern

14. Variable Depth Hierarchy Pattern

The variable depth hierarchy pattern handles a hierarchy in a dimension table whose depth varies. A hierarchy bridge is created between the fact table and the dimension to handle the variations in the hierarchy. This bridge table contains attributes that describe the hierarchy, such as parent key, subsidiary key, number of levels from parent, parent name, last updated, and first updated. The variable depth hierarchy pattern can be observed in Fig. 18.

Fig. 18 Variable Depth Hierarchy Detail Pattern
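The hierarchy bridge can be illustrated by deriving its rows from a simple parent-child mapping. The Python sketch below is an assumption-laden illustration: the organization is hypothetical, only three of the bridge attributes (parent key, subsidiary key, levels from parent) are produced, and including a zero-level row that pairs each node with itself is a common convention rather than something stated in the pattern description.

```python
def build_hierarchy_bridge(parent_of):
    """Return (parent_key, subsidiary_key, levels_from_parent) rows for every
    ancestor/descendant pair, plus each node paired with itself at level 0.
    Assumes the parent-child mapping contains no cycles."""
    rows = []
    for node in parent_of:
        ancestor, depth = node, 0
        while ancestor is not None:
            rows.append((ancestor, node, depth))
            ancestor = parent_of.get(ancestor)
            depth += 1
    return sorted(rows)

def descendants(bridge, parent_key):
    """All subsidiaries below parent_key, at any depth."""
    return sorted(child for parent, child, depth in bridge
                  if parent == parent_key and depth > 0)

# Hypothetical hierarchy of uneven depth: key 1 is the root.
parent_of = {1: None, 2: 1, 3: 1, 4: 2, 5: 4}
bridge = build_hierarchy_bridge(parent_of)

for row in bridge:
    print(row)
print(descendants(bridge, 2))
```

Joining a fact table to such a bridge on the subsidiary key and filtering on the parent key rolls facts up to any level without knowing the depth of the hierarchy in advance.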

15. Slowly Changing Dimension Pattern

The slowly changing dimension (SCD) pattern is designed to meet the need of a data warehouse to be adaptive and able to handle changes of facts. There are three ways to deal with changes, described as three subpatterns: SCD type 1, SCD type 2, and SCD type 3.

SCD type 1

SCD type 1 handles a change by removing the old values and replacing them with the new values. This pattern is usually used when there is no need to keep historical records, and for correction purposes. The SCD type 1 pattern can be seen in Fig. 19.

Fig. 19 SCD Type 1

Fig. 20 SCD Type 2

Fig. 21 SCD Type 3
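The three subpatterns of Figs. 19 to 21 differ in what happens to the old value: type 1 overwrites it, type 2 adds a new row, and type 3 adds a new attribute. The following Python sketch contrasts them on a single in-memory dimension row. The column names (`dim_key`, `natural_key`, `is_current`, `previous_city`) and the sample customer are assumptions introduced for illustration; the pattern itself does not prescribe them.

```python
from copy import deepcopy

def scd_type1(rows, natural_key, attribute, new_value):
    """Type 1: overwrite the old value in place; no history is kept."""
    for row in rows:
        if row["natural_key"] == natural_key:
            row[attribute] = new_value
    return rows

def scd_type2(rows, natural_key, attribute, new_value):
    """Type 2: mark the current row as no longer current and append a new row
    that carries the new value under a new surrogate key."""
    current = next(r for r in rows
                   if r["natural_key"] == natural_key and r["is_current"])
    current["is_current"] = False
    new_row = dict(current, is_current=True)
    new_row["dim_key"] = max(r["dim_key"] for r in rows) + 1
    new_row[attribute] = new_value
    rows.append(new_row)
    return rows

def scd_type3(rows, natural_key, attribute, new_value):
    """Type 3: keep the old value in an extra 'previous_<attribute>' column."""
    for row in rows:
        if row["natural_key"] == natural_key:
            row["previous_" + attribute] = row[attribute]
            row[attribute] = new_value
    return rows

# Hypothetical customer whose city changes from Bandung to Jakarta.
base = [{"dim_key": 1, "natural_key": "C001", "city": "Bandung", "is_current": True}]
t1 = scd_type1(deepcopy(base), "C001", "city", "Jakarta")
t2 = scd_type2(deepcopy(base), "C001", "city", "Jakarta")
t3 = scd_type3(deepcopy(base), "C001", "city", "Jakarta")
print(t1, t2, t3, sep="\n")
```

Only the type 2 result grows by a row, which is why it is the variant that preserves history for facts recorded before the change.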

SCD type 2

SCD type 2 handles changes by adding a new row in the dimension for the new values. This approach is the primary technique for accurately tracking slowly changing dimension attributes for historical data. The SCD type 2 pattern can be seen in Fig. 20.

SCD type 3

SCD type 3 handles changes by adding a new attribute that contains the new values in the dimension table, besides the old ones. This pattern is normally required when changes do not happen frequently, or when a change affects the naming of something rather than its related characteristics. The SCD type 3 pattern is shown in Fig. 21.

16. Rapid Changing Dimension Pattern

The rapid changing dimension pattern is designed to meet the need of a data warehouse to be adaptive and able to handle change. The difference from the previous pattern is that this pattern is used when data changes occur relatively frequently. The problem is addressed by making a mini dimension for the frequently changing attributes of the dimension. The frequent changes then occur in the mini dimension, which avoids swelling the actual dimension table. Fig. 23 describes the rapid changing dimension pattern.

17. Heterogeneous Product Pattern

The heterogeneous product pattern is designed to overcome the problem of heterogeneous products. The problem is solved by creating two kinds of schema: a generic schema that is used for all products, and specific schemas that are used to address specific products. The fact tables related to the two kinds of schema are similar. The difference lies only in the dimension table, which contains some additional attributes. The heterogeneous product pattern can be seen in Fig. 22. Note that in this case, each product must have the same primary key in both the generic dimension and the specific dimension.

Fig. 22 Heterogeneous Product Pattern

Fig. 23 Rapid Changing Dimension Pattern

18. Multi-valued Attribute Pattern

The multi-valued attribute pattern is designed to handle the problem of an attribute that has more than one value. Examples of such attributes are multiple employee skills in a human resource management case, multiple customers on an account in financial services, multiple diagnoses in health services, and multiple insured drivers in insurance cases. The problem is solved by creating a bridge table between the fact table and the dimension table. Fig. 24 shows the multi-valued attribute pattern.

Fig. 24 Multi-valued Attribute Pattern
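A minimal sketch of the bridge table, using the multiple-diagnoses example mentioned above, is given below in Python with an in-memory SQLite database. All table names, column names, and sample rows are hypothetical; grouping diagnoses under a `diagnosis_group_key` is one common way to build such a bridge and is an assumption here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE diagnosis (diagnosis_key INTEGER PRIMARY KEY, name TEXT);
-- Bridge table: one row per (group, diagnosis) pair.
CREATE TABLE diagnosis_bridge (
    diagnosis_group_key INTEGER NOT NULL,
    diagnosis_key       INTEGER NOT NULL REFERENCES diagnosis(diagnosis_key),
    PRIMARY KEY (diagnosis_group_key, diagnosis_key)
);
CREATE TABLE treatment_fact (
    treatment_id        INTEGER PRIMARY KEY,
    diagnosis_group_key INTEGER NOT NULL,
    billed_amount       REAL
);
""")
conn.executemany("INSERT INTO diagnosis VALUES (?, ?)",
                 [(1, "Diabetes"), (2, "Hypertension")])
conn.executemany("INSERT INTO diagnosis_bridge VALUES (?, ?)",
                 [(10, 1), (10, 2), (20, 2)])
conn.executemany("INSERT INTO treatment_fact VALUES (?, ?, ?)",
                 [(100, 10, 300.0), (101, 20, 50.0)])

# Count treatments per diagnosis by going from the fact through the bridge.
rows = conn.execute("""
    SELECT d.name, COUNT(*) AS treatments
    FROM treatment_fact AS f
    JOIN diagnosis_bridge AS b ON b.diagnosis_group_key = f.diagnosis_group_key
    JOIN diagnosis AS d ON d.diagnosis_key = b.diagnosis_key
    GROUP BY d.name
    ORDER BY d.name
""").fetchall()
print(rows)
```

Treatment 100 carries two diagnoses, so it is counted once under each of them. Summing `billed_amount` through the bridge would count that amount twice, which is the usual caveat when querying across a multi-valued bridge.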

V. PATTERN QUALITY METRIC

To measure the quality of the generic data model patterns, the reusability measurement from [10] is adopted. Two subfactors of reusability are used: flexibility and comprehensibility. Flexibility is measured by the parameterization metric and by the frequency of occurrence of the patterns in several cases. Comprehensibility is measured by the completeness metric and the groupness metric.

A. Parameterization

The parameterization metric (parametric) is applied to each pattern. To measure it, we require the parameter participant score and the constant participant score of a pattern. The parameter participant score is obtained by multiplying the number of parts of the pattern that must be modified when the pattern is used by the weight assigned to such modifications. A weight is required because different modifications may require different effort. The constant participant score is the number of parts of the pattern that remain constant when the pattern is used. The higher the value of the parameterization metric, the more flexible the pattern is. The formula used to compute the parametric is shown in formula (1) below:

(1)

B. Frequency of Occurrence

The level of flexibility is also measured by the frequency of occurrence of a pattern in several cases. The more often a pattern shows up in the cases, the more flexible the pattern is. The formula used to measure the frequency of occurrence of a pattern is as follows:

\[ \text{Occurrence Frequency} = \frac{\text{Total \# case pattern}}{\text{Total all cases}} \tag{2} \]

C. Completeness of Pattern Element

The completeness of pattern elements is measured by determining whether the documentation of a pattern meets all the required elements of a defined pattern format. If the pattern meets all the required elements of the pattern format, the comprehensibility of the pattern is high. The formula used to measure the completeness of the pattern elements is as follows:

\[ \text{Element pattern completeness} = \frac{\text{Total defined element pattern}}{\text{Total all element pattern}} \tag{3} \]

D. Groupness

The groupness level is not measured for each pattern individually, but for the whole collection of patterns. The formula used to measure the metric is as follows:

\[ \text{Groupness} = 1 - \frac{\text{Total ungrouped pattern}}{\text{Total all pattern}} \tag{4} \]

VI. TESTING AND RESULT ANALYSIS

A test is carried out by measuring the patterns using the flexibility and comprehensibility quality factors defined in the previous section.

A. Parameterization Metric

As described in Section V, the parameterization metric is applied to each pattern. To measure this metric, the parameter participant score and the constant participant score of each pattern must be calculated. A parameter participant is defined as a part of a pattern that must be modified when the pattern is used to create a data model. It can be the name of a table or attribute, a whole new attribute to be added to an existing table, or a whole new table to be added to a model. To measure the parameter participant score, the weight of each kind of modification must be determined. We define three parameter weights:

1. Parameter weight 1 is used for modifications that involve only the names of tables/attributes.
2. Parameter weight 2 is used for modifications that involve the addition of a new attribute to a table.
3. Parameter weight 5 is used for modifications that involve the addition of a new table.

A constant participant, on the other hand, is defined as a part of a pattern that does not need to be modified when the pattern is used to create a data model. It can be a name, an attribute, or an existing table.

As an example, in the Generic Star Schema Pattern (see Section IV, pattern 1, especially Fig. 3), there are:

1. 2 constant participants: the key attribute of the Fact table and the key attribute of the Dimension 1 table.
2. 2 parameter participants with parameter weight 1: the name of the Fact table and the name of the Dimension 1 table.
3. 2 parameter participants with parameter weight 2: a measurement attribute in the Fact table and an attribute in the Dimension 1 table.
4. 1 parameter participant with parameter weight 5: the Dimension table.

The parameter participant score (PPS) for the pattern is calculated as follows:

Thus the parameterization metric (parametric) is calculated as follows:
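The weighted count behind the parameter participant score can be sketched as follows, using the three weights (1, 2, and 5) and the Generic Star Schema counts given in this section. The function and dictionary names are hypothetical, and the sketch deliberately stops at the two scores: it does not compute the parameterization metric itself, because formula (1) is not reproduced in the text.

```python
# Weights defined in the text: renaming = 1, new attribute = 2, new table = 5.
WEIGHTS = {"rename": 1, "new_attribute": 2, "new_table": 5}

def parameter_participant_score(participants):
    """Sum of (number of parts x weight) over each kind of modification."""
    return sum(count * WEIGHTS[kind] for kind, count in participants.items())

# Counts for the Generic Star Schema Pattern as listed in the example:
# 2 names to change, 2 attributes to add, 1 table to add, 2 constant parts.
generic_star_schema = {"rename": 2, "new_attribute": 2, "new_table": 1}
constant_participant_score = 2

pps = parameter_participant_score(generic_star_schema)
print("PPS:", pps)
print("Constant participant score:", constant_participant_score)
```

With these counts the weighted sum is 2 x 1 + 2 x 2 + 1 x 5.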

B. Frequency of Occurrence

The frequency of occurrence of a pattern is a metric that measures the number of occurrences of a pattern in real cases. In this research, a test was conducted by counting the occurrences of all patterns in the insurance case defined by Kimball [8] as a benchmark (case #1), and then counting the occurrences of all patterns in two existing data warehouse cases: AdventureWorks Cycle (case #2) and the Database of Higher Education, or PDPT (case #3). Afterward, the frequency of occurrence of a pattern in case #2 and case #3 was compared with its frequency of occurrence in case #1 (the benchmark) to obtain the standard frequency of occurrence of the pattern. As an example, the Generic Star Schema Pattern occurs 5 times in case #1, 2 times in case #2, and 3 times in case #3. The frequency of occurrence of a pattern in the real cases (case #2 and case #3) is computed using formula (2). Thus, for the Generic Star Schema Pattern, the frequency of occurrence in real cases is calculated as follows:
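Substituting the counts quoted for the Generic Star Schema Pattern into formula (2) gives the worked instance below. It is a reconstruction from the stated numbers (2 occurrences in case #2, 3 occurrences in case #3, and two real cases), not the authors' original rendering; the result agrees with the value 2.5 listed for cases 2 and 3 combined in Table 1.

```latex
\[
\text{Occurrence Frequency} = \frac{2 + 3}{2} = 2.5
\]
```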

The standard frequency of occurrence of a pattern is calculated by comparing the frequency of occurrence in the real cases (case #2 and case #3) with the frequency of occurrence in the benchmark (case #1). If a pattern does not appear in the benchmark, its standard frequency of occurrence is set to 1. The standard frequency of occurrence for the Generic Star Schema Pattern is calculated as follows:
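The comparison can be sketched in Python as a ratio of the real-case frequency to the benchmark count. Reading "comparing" as a ratio is an assumption, although it reproduces the value 0.5 reported for this pattern in Table 1. The function names are hypothetical, and the rule for a pattern absent from the benchmark follows the sentence above literally.

```python
def occurrence_frequency(real_case_counts):
    """Formula (2): total occurrences of the pattern divided by the number of cases."""
    return sum(real_case_counts) / len(real_case_counts)

def standard_frequency(real_case_counts, benchmark_count):
    """Real-case frequency relative to the benchmark count.
    A pattern that does not appear in the benchmark is given 1, as stated in the text."""
    if benchmark_count == 0:
        return 1.0
    return occurrence_frequency(real_case_counts) / benchmark_count

# Generic Star Schema Pattern: 2 and 3 occurrences in cases #2 and #3, 5 in case #1.
print(occurrence_frequency([2, 3]))
print(standard_frequency([2, 3], 5))
```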

C. Completeness of Pattern Elements

The completeness metric is applied to each pattern by counting the number of elements that are fulfilled in the documentation of the pattern. For example, the completeness metric for the Generic Star Schema Pattern is calculated as follows:

D. Groupness

The groupness metric is applied to the entire collection of patterns. The value of the groupness metric for the patterns is defined as follows:

E. Test Results

The test results are presented in Table 1. The average value of the parameterization metric over all patterns, which measures flexibility, is 0.771. A pattern is called totally flexible if the value is 1, so this figure shows that the patterns are adequately flexible. The average standard frequency of occurrence is 0.569. The average frequency of occurrence of all patterns in case #2 and case #3 is 3.275, and the average frequency of occurrence of all patterns in the benchmark case (case #1) is 5.750. This indicates that, on average, a pattern appears about 3 times in the real cases, compared with about 5 times in the benchmark case. It means that in the real cases (case #2 and case #3), the use of the patterns is still lower than expected. The tests on the metrics related to comprehensibility show more promising results. The completeness of pattern elements has a value of 0.969, while the groupness metric has a value of 0.900. These figures are close to 1.000, which shows that the patterns are nearly complete and well clustered.

VII. CONCLUSIONS AND FUTURE WORKS

Eighteen generic patterns for creating data models of data warehouses have been developed in this research. The patterns are created based on the requirements of data modelling for data warehouses, as well as an analysis of existing patterns and Kimball’s 14 case studies [8]. The collection of patterns consists of two main patterns (Generic Star Schema Pattern and Data Warehouse Bus Matrix Pattern) and five groups of patterns: 1) fact table related patterns; 2) dimension table related patterns; 3) dimension hierarchy related patterns; 4) fact changing related patterns; and finally 5) product heterogeneity related patterns.

The tests on two factors of reusability, flexibility and comprehensibility, show considerably good results. Two metrics are defined for the flexibility factor: the parameterization metric and the frequency of occurrence of the patterns in real cases. Two metrics are defined for the comprehensibility factor: the completeness of the elements of the patterns and the groupness of the pattern collection. The average value of 0.771 for the parameterization metric and the average value of 0.569 for the standard frequency of occurrence of the patterns in real cases show that the patterns are adequately flexible. The average value of 0.969 for the completeness of pattern elements and the value of 0.900 for the groupness of the patterns indicate good comprehensibility.

The test on the frequency of occurrence of the patterns in real cases shows that some patterns associated with changes in dimensions, product heterogeneity, and multi-valued attributes are rarely or almost never used, although in the benchmark case those patterns have a relatively high frequency of occurrence. This is probably because the test cases cover only common modelling problems, so the particular problems addressed by those patterns are not covered.
Further research can be conducted on the quality of the patterns using other reusability quality factors and qualities other than reusability. Tests to measure the frequency of occurrence of the patterns can be carried out on many more cases in order to provide a more accurate result. Finally, there must be a study on how the generic data model patterns created in this research are used in real-world cases, in order to provide proof of the usefulness of the patterns and to find improvements to the patterns if necessary.

REFERENCES

[1] Azizah, Fazat Nur. “Generic Data Model Patterns using Fully Communication Oriented Information Modeling (FCO-IM)”. Bandung Institute of Technology: 2009.
[2] Batra, Dinesh. “Conceptual Data Modeling Patterns: Representation and Validation”. Florida International University: 2005.

Table 1. Test Results

Pattern quality metrics. Parametric, percentage of occurrence (Pct, in %), and frequency of occurrence (Freq) are the flexibility metrics; completeness and groupness are the comprehensibility metrics. Columns 1, 2, and 3 refer to case #1 (benchmark, BM), case #2, and case #3; Stndr is the standard frequency of occurrence.

Pattern                                   Parametric  Pct 1   Pct 2   Pct 3   Freq 1 (BM)  Freq 2  Freq 3  Freq 2 & 3  Stndr  Completeness
Generic Star Schema Pattern               0.556       4.35    5.56    3.16    5            2       3       2.5         0.5    1
Data Warehouse Bus Matrix Pattern         0.846       0.87    0.00    1.05    1            0       1       0.5         0.5    1
Transaction Level Fact Table Pattern      1           2.61    8.33    1.05    3            3       1       2           0.667  1
Periodic Snapshot Fact Table Pattern      0.75        1.74    2.78    9.47    2            1       9       5           2.5    1
Accumulation Snapshot Fact Table Pattern  0.79        0.87    0.00    0.00    1            0       0       0           0      1
Factless Fact Table Pattern               0.8         0.87    0.00    1.05    1            0       1       0.5         0.5    1
Generic Dimension Pattern                 0.667       35.65   25.00   28.42   41           9       27      18          0.439  1
Date Dimension Pattern                    0.727       6.09    11.11   10.53   7            4       10      7           1      0.875
Degenerate Dimension Pattern              0.857       6.09    13.89   0.00    7            5       0       2.5         0.357  0.875
Role Playing Dimension Pattern            0.786       9.57    22.22   0.00    11           8       0       4           0.363  0.875
Junk Dimension Pattern                    0.867       0.00    0.00    0.00    0            0       0       0           0      0.875
Outrigger Dimension Pattern               0.778       0.00    8.33    7.37    0            3       7       5           1      0.875
Multiple Level Hierarchy Pattern          0.889       0.00    0.00    37.89   0            0       36      18          1      1
Variable Depth Hierarchies Pattern        0.55        0.00    0.00    0.00    0            0       0       0           0      1
Slowly Changing Dimension Pattern
• SCD Type 1                              0.75        6.09    0.00    0.00    7            0       0       0           0      1
• SCD Type 2                              0.667       6.09    0.00    0.00    7            0       0       0           0      1
• SCD Type 3                              0.714       6.09    0.00    0.00    7            0       0       0           0      1
Rapid Changing Dimension Pattern          0.765       6.09    0.00    0.00    7            0       0       0           0      1
Heterogeneous Product Pattern             0.846       3.48    0.00    0.00    4            0       0       0           0      1
Multivalued Attribute Pattern             0.647       3.48    2.78    0.00    4            1       0       0.5         0.125  1
Average                                   0.771                               5.750        1.800   4.750   3.275       0.569  0.969

Groupness (whole collection of patterns): 0.9; average 0.900

[3] Blaha, Michael. “Patterns of Data Modeling”. CRC Press: 2010.
[4] Fowler, Martin. “Analysis Patterns: Reusable Object Models”. Addison Wesley Professional: 1996.
[5] Gamma, Erich; Helm, Richard; Johnson, Ralph; Vlissides, John. “Design Patterns: Elements of Reusable Object-Oriented Software”. Addison-Wesley Professional Computing Series: 1997.
[6] Hay, David C. “Data Model Patterns: Conventions of Thought”. Dorset House Publishing: 1997.
[7] Inmon. “Building the Data Warehouse, Third Edition”. John Wiley & Sons: 2002.
[8] Kimball, Ralph. “The Data Warehouse Toolkit, 2nd Edition”. John Wiley & Sons: 2002.
[9] Peterson, Stephen. “Stars: A Pattern Language for Query Optimized Schema”. Proceedings of PLoP’94, Monticello, IL: August 1994. http://c2.com/ppr/stars.html. Accessed: 21 October 2010.
[10] Zeiss, Benjamin; Vega, D.; Schieferdecker, I. “Applying the ISO 9126 Quality Model to Test Specifications Exemplified for TTCN-3 Test Specifications”. Gesellschaft für Informatik, Köllen Verlag, Bonn: 2007.