DATABASE DESIGN WITH UML: IMPLEMENTATION ISSUES

DATABASE DESIGN WITH UML: IMPLEMENTATION ISSUES MOJCA CIGLARIČ, MATJAŽ PANČUR, MATEJ TRAMPUŠ, TONE VIDMAR Faculty of computer and Information Science,...
Author: Abraham Nelson
5 downloads 0 Views 212KB Size
DATABASE DESIGN WITH UML: IMPLEMENTATION ISSUES MOJCA CIGLARIČ, MATJAŽ PANČUR, MATEJ TRAMPUŠ, TONE VIDMAR Faculty of computer and Information Science, Tržaška 25, 1000 Ljubljana University of Ljubljana, Slovenia

ABSTRACT The paper examines the benefits and drawbacks of the object-relational database technology using UML diagrams in comparison with the entity-relationship approach. Guidelines for transition to UML are presented and mapping strategy to relational and object-relational schema is overviewed. OR database extensions are outlined and two of the most popular object oriented modelling tools Oracle Designer 6i and Rational Rose 2000 - are compared. The discussion of how well these tools are adapted to reallife situations is included. and their ability of generating DDL code for object-relational databases is studied.

KEYWORDS: Object-relational extensions, ORDBMS, UML tools, database design.

INTRODUCTION As information systems grow more and more complex, they also represent high investment for a company. Often, legacy systems are based on traditional data model and RDBMS while at the moment, OO analysis and design, considering data and functionality together, seems most natural and also brings other advantages (consistency, reusability). The UML specification is a semantic model, representing OO concepts at all levels – requirements, architecture, details, and implementation of functionality as well as data. The paper addresses several issues regarding transition from ERD to UML with emphasis on logical modelling. Further, it discusses UML data model implementation on pure relational as well as object-relational DBMS-es. A real-life case study is included and commented on.

I.

BACKGROUND AND RELATED WORK

Currently, ER modelling is a widespread option for relational database design. UML diagrams are used in a similar but more powerful manner. The UML class diagrams in fact represent superset of entity-relationship (ER) diagrams. UML models deal with data and functions simultaneously, which implies that the same model can be

utilized for developing programming code as well as for database design. The way the models are being developed helps us see the superstructure of a problem before we have to dig into details. In the last few years, object-relational DBMSes developed remarkably. In 97-98, papers like [1] have shown that UML finally took its shape. Two years ago, in [2] P. Dorsey referring to OR-DBMSes stated that "their time has come… almost". A year later, in the paper [3] he suggests the transition towards UML database design, although he observes that we are not ready to build production objectrelational systems. At the same time, Blaha and Premerlani in [4] emphasize mapping of UML structures and functionality to relational database, while they are not exactly enthusiastic about OO extensions. In the paper [5] and more thoroughly in his book [6] Muller explains UML information modelling. The instructions for mapping UML data model to relational and object-relational schema are particularly helpful, while he strongly disagrees with using arrays of variables and nested tables because they violate 1st NF rule and can reduce performance. In October 99 Oracle issued a White Paper [7] where its object philosophy is made clear. Examples of object-relational extensions to relational DBMSes are overviewed in [8] and [9]. In [10], additional mechanisms are presented (inheritance), that probably weren’t mature enough to be released yet. Today we are surprised to observe that although ORDBMSes are evolving, Dorsey’s “almost” cited above still stands.

II. PROBLEM Since legacy systems represent big investment in traditional data models and DBMSes, we usually want to make the shifts in information area as smooth and painless as possible. We believe that due to the magnitude of earlier investments, the companies will demand easiest possible transition to object-relational and later on, possibly pure object RDBMSes. When a company recognizes a need for more flexible system, that would allow more precise representation of business rules, or a need for more information on how objects interact with each other and how they are changing

states in their lifecycle, or lack of understanding how objects function and what actions are acceptable, then it is time to consider a new approach to logical modelling. II A.

TRANSITION FROM ERD TO UML With reverse engineering tools we are usually able to obtain a legacy data schema that is acceptable as a basis for further design. However, we have to resist the temptation of simply translating old ERDs to UML syntax – tables to classes, columns to attributes. This way, we would have gained nothing. Further options are assigning roles to class associations; limiting classes’ behaviour with constraints, extending or redefining classes’ behaviour with stereotypes; including functionality (methods). When we’re done (re)modelling, the structure is mapped back to tables. While object-relational DBMSs are still rare, some vendors have added OO-extensions to their products. Nevertheless, good old relational database is a perfectly good implementation for object models. Once the tables are established, further issues are assuring referential integrity and mapping functionality to program code. II B. MAPPING TO PURE RELATIONAL SCHEMA Once we have an UML data model, we can apply the following guidelines to obtain a database schema. Classes are mapped to tables, attributes to columns. UML attribute types (in classes) become column types. For classes without generalization, primary keys are created. Associations are implemented considering their cardinality (one-to-one, many-to-many, zero-to-one-to-somewherebetween-2-and-7…). For subclasses, the key of each parent class is added to existing primary and foreign keys. For association classes, primary key from each table is added to existing primary and foreign keys. Explicit constraints (check, unique, etc.) have to be implemented with regard to DBMS used. Foreign keys are created for 0..1 and 1..1 roles. For composite aggregation, primary keys are created with foreign key to aggregating table. Optionally, binary association classes can be optimised. Additional tables with primary and foreign keys are created for many-to-many associations without association classes. Representation of behaviour (functionality) is the least formal step. Usually we choose stored procedures and try to encapsulate underlying tables in packages. Afterwards, the applications have to avoid accessing data directly instead of through the package. II C. MAPPING TO OBJECT – RELATIONAL SCHEMA The rules for transformation from UML into an objectrelational schema are somewhat more complicated. Since Object-relational database extensions are not standardized yet, the procedure depends on vendor-specific implementation of OO extensions. Most of them include object types, object tables and various types of associations. Due to their

controversy, nested tables and arrays of variables are not included in the procedure. UML classes become object types, usually with a corresponding table(s), which would hold the object rows. UML types become object types, UML attributes and their types in class become attributes and types in object type. UML attribute tags are implemented with corresponding constraints or other mechanisms regarding to DBMS specifics. For implicit classes with no generalization primary key is generated. For subclasses, keys of parent classes are added to existing primary and foreign keys. Parent classes reference their subclasses. Object types are created for association classes, primary and foreign keys are identified. Constraints and tags are modelled according to DBMS specifics. Foreign keys are created for 0..1 and 1..1 roles or reference / collection of references is used. Primary keys are created for composite aggregations, with foreign keys to aggregating tables. Alternately, aggregate could be stored in the table itself as an object attribute or a collection (array of variables or nested table). Optionally, binary association classes can be optimised. Additional tables (usually without underlying object type) with primary and foreign keys are created for many-to-many associations without association classes. The last step is creation of methods on object types for operations on corresponding UML classes. Instead of creating tables directly from classes and associations (as shown in section III B), we first create object types, then tables from these types. Both types and tables are modelled as UML classes. Multiple tables can be created for one object type, where table represents an instance of on object type. The latter cannot be done in purely relational world.

III. CASE STUDY Gradually, UML is confirming its position as a powerful, standardized design tool. We speculated about what benefits we could achieve using UML for conceptual database design. Most of all we were concerned about the connection among UML design tools and ORDBMSes. Since we ran into these questions when working on Ljubljana municipal information system, which runs on Oracle platform, we concentrated on the Oracle 8i database, officially declared as an object-relational DBMS. Almost every diagramming tool that supports UML also generates DDL (either ANSI standard or vendor specific). Our case study concentrates on a few database fragments of the Ljubljana municipal information system. We modelled apartment rentals, database included data on apartments, customers - owners, renters and inhabitants, addresses, contracts with annexes. Following the guidelines for transition from ERD to UML, we came up with the conceptual model shown on Fig. III.1.

The UML class diagram on conceptual level is very similar to the original ERD since we chose not to overload it with methods, tags, constraints, stereotypes etc. Our first idea was to design a true object model and implement it with Oracle 8i RDBMS as closely as possible. We took two powerful design tools: Rational Rose 2000 and Oracle Designer 6i, both of which include SQL code generation from UML (or UML-like) class diagrams. Rent_Contract Features hot water heating phone cable TV elevator

Address zip street number +Renter

Annex area price

+Landlord

Customer name1 name2 +Inhabitant account 1..*

+contact 1.. 3

• Nested tables - These are useful where referential integrity is required and are suited to master-detail and one-to-many relationships. Nested tables can have attributes but no methods. • VARRAYs - These are useful for providing quick access to small or uniform-sized collections of objects in a table. VARRAYs do not have methods or attributes. Typically nested tables are used when there is no limit on the potential number of items to be stored. Conversely, VARRAYs can only have a limited number of elements. Nested tables and VARRAYs can only be referenced indirectly via the outer table.

Appartment +owner 1..*

area

Ownership Cust_Corp

collection types are used to define datatypes. They provide support for collections of similar items, and can be implemented as either of the following:

Cust_ind Social Security NR

fraction date from date until

Figure III.1: Conceptual model

III A. OBJECT-RELATIONAL FEATURES IN ORACLE 8i OR features facilitate the employment of new objectoriented structures together with the traditional relational constructs. They are the basic building blocks of an objectrelational schema. A quick summary follows below. Object identifiers (OIDs) - Every row object in an object table has an associated logical object identifier (OID). OIDs behave similarly to Primary and Unique Keys. They can be system-generated or based upon user-defined attributes. New integrity constraints must be added that prevent orphaning in OIDs. The purpose of the OID for a row object is to uniquely identify it in an object table. To do this, ORDBMS implicitly creates and maintains an index on the OID column of an object table.

III B. ORACLE DESIGNER 6i Oracle Designer is a toolset for modelling, designing and generating client/server databases and database applications. A component called “Design Editor” is used for database generation and/or capture. The Server Modeler is a graphical tool for modelling logical database designs. Although not loudly announced, server model diagram includes a subset of UML class diagram notation. It also includes much more so that we cannot say this is a pure UML modelling tool. Instead, this is a tool that enables us to create UML-like diagrams if we want to and if we know how to do it. Instead of classes and objects, object types and object tables are used. Relational tables that may hold attributes of object type can be included too. Associations are called links: embedded link, ref link, embedded collection link, collection of refs link.

Object types - Object type is an abstraction of the real-world entity. It is a schema object with three kinds of components: name, attributes and methods. It acts as a template - defining it doesn't result in storage allocation. Object types are physical structures that serve as templates or building blocks for other types and/or tables. They are userdefined. The methods of an object type model the behaviour of objects. Broadly, they fall into three categories: member, static, and comparison. Possible disadvantages of using object types: they do not support default values, check constraints, optionality and miscellaneous trigger validation. Object table is based on an object type while relational table can have attributes of object type. Oracle

Figure III.2: Server Model Diagram (Oracle Designer 6i)

On the diagram, embedded links are represented with a black diamond on the embedding side. Ref links have empty diamond. Collections (nested tables and arrays of variables) are represented with a “multiplied” icon on the left from attribute name. When creating a diagram, the designer has to define at least object types and most of the implementation details (nested table, VARRAY, ref, scope for references…). Typically, the designer would add at least one object or relational table for each object type (not shown on the diagram!) – to hold rows of that type. The process of code generation in Oracle Designer is simple and quite fast. It’s no surprise that the generated code meets expectations. First, object and collection types are generated for every object and collection type on the diagram: • Simple object type with a method: CREATE OR REPLACE TYPE ADDRESS AS OBJECT (ZIP NUMBER(6) ,STREET VARCHAR2(30) ,NO NUMBER(3,0) ,MEMBER PROCEDURE INSERT_ADDRESS (CODE NUMBER) )

• Object type with ref link: CREATE OR REPLACE TYPE CUST_IND AS OBJECT (SOCIAL_SECURITY_NO NUMBER(13) ,CUST_DATA REF CUSTOMER)

• Object type with ref link (address), embedded link (features) and embedded collection (appt owned by): CREATE OR REPLACE TYPE APPT AS OBJECT (AREA NUMBER ,ADDRESS REF ADDRESS ,OWNERSHIP APPT OWNED BY ,FEATURES FEATURES)

• Collection type as ref array of variables: CREATE TYPE CONTACTS AS VARRAY(3) OF REF CUST_IND

• Collection type as a nested table: CREATE TYPE APPT OWNED BY AS TABLE OF OWNER

Next, object or relational tables are generated if user defined. If none were defined in the diagram, none will be generated either. • Simple object table: CREATE TABLE T_CUST OF CUSTOMER

• Relational table with object type attribute (attribute app_details is of object type appt; table t_appt holds rows of object type appt; since attribute app_details is referenced in table test_tab_appt, scope (table) is defined where app_details are saved): CREATE TABLE TEST_TAB_APPT (ID_APPT NUMBER NOT NULL ,APP_DETAILS REF APPT ) NESTED TABLE APP_DETAILS.OWNERSHIP STORE AS TEST_TAB_APPT_NT42

CREATE TABLE T_APPT OF APPT OWNERSHIP STORE AS T_APPT_NT42 ALTER TABLE TEST_TAB_APPT (APP_DETAILS) IS T_APPT)

ADD

NESTED

(SCOPE

TABLE

FOR

Primary and foreign key constraints are generated for relational tables – if any. At last, bodies of member procedures (methods) and database procedures are generated. In our case, procedure body only consists of keyword “null”: CREATE OR REPLACE TYPE BODY ADDRESS IS MEMBER PROCEDURE INSERT_ADDRESS (CODE NUMBER) IS BEGIN NULL; END; END;

If we choose target database for code generation other than Oracle 8i, only pure relational tables and views (without object attributes) are generated. Procedures are only generated for some targets (Oracle 7), while for others (SQL Server, DB2, Sybase…) they are not generated. Our opinion is that one has to know the underlying database very well in order to create effective server model diagrams or to generate useful code. Nothing is wrong with this when the user is committed to Oracle nevertheless. But if he were not, one would probably find the tool almost useless. III C. RATIONAL ROSE 2000 Rational Rose is an UML modelling tool developed by Rational Software Corporation – the same company that employs the famous Three Amigos (Rumbaugh, Jacobson, Booch), the leading authors of Unified Modelling Language. The Rational Rose features an open architecture and provides numerous plug-ins that provide mappings from UML to variety of development tools and languages. We have evaluated Rational Rose 2000 enterprise, which includes a plug-in with Oracle 8 specific features. Those features include creation of custom classes, associations between those classes, and database tables that are used to store objects. Oracle specific data is added to the model by means of UML stereotypes and custom properties. The Schema generation tool uses stereotypes for mapping model elements to Oracle constructs such as object types and object tables. The data provided by custom properties (for example MethodKind) is used to refine this mapping. The best way to create Oracle's object specific constructs is by using Rose's Data Type Creation Wizard. The wizard offers a simple, yet effective user interface for creation of object types, object tables, object views, VARRAYs, and nested tables. Although these constructs can be created through the standard Rose user interface, it requires much more manual work, is more error-prone, and does not guarantee the compatibility of the result with the Oracle database. The SQL data definition language code generated by Rose is very similar to code generated by Oracle

Designer and is therefore not discussed in the following sections. Classes

Rose’s Data Type Creation Wizard can be used to create a new class, and add methods and attributes to it. The newly created class is stereotyped as Object Type. The class is associated with a component, representing a database schema that will contain the class.

CUSTOMER

CUSTOMER_T

NAME1 : VARCHAR NAME2 : VARCHAR2 PHONE : NUMBER ACCOUNT : VARCHAR2

Figure III.3 - Rational Rose, Object Types and Object Tables

Instances of the particular class are stored in one or more object tables. An object table provides a relational view of the attributes of the objects it stores. Rose models an object table as a class with a stereotype Object Table and no attributes and methods. A dependency relationship between a class stereotyped as Object Table and a class stereotyped as Object Type defines what the class of the objects that are to be stored in the object table is (see Fig. III.3). Associations

An association between classes is modelled in two different ways (see Fig. III. 4). A one-to-one relationship is modelled as a direct association between two classes. Unidirectional aggregation (see Fig. III. 4, apartment - address) is used for by-reference association. Despite the facts that UML offers a consistent notation for by-value and byreference aggregation (black or white diamond on the association line), and that standard Rose user interface provides a way to choose between by-value and by-reference associations, the Oracle plug-in uses a plain unidirectional association for modelling by-value associations (Fig. III. 4, apartment - features). +ADDRESS

OWNERSHIP

+OWN ER

CUSTOMER

ADDRESS +ADDRESS

OWNERSHIP_NT

+OWN ERS

APARTMENT

FEATURES

Figure III.4 - Rational Rose - associations between classes

An association with higher cardinality (one-to-many) is modelled in a separate class - either as a VARRAY or as a nested table (Fig. III. 4 - ownership_nt). A dependency relationship between this separate class and the target class

stereotyped as Object Type defines what the class of the objects that are stored in the nested table or the VARRAY is. We do not find this exactly the best solution. UML already features a language construct that can express cardinality (the cardinality adornment near the end of an association line). The fact that an association with higher cardinality can be implemented either as a nested table or as a VARRAY could be better modelled as a property of the association relationship. This solution provides a more uniform way of modelling associations with different cardinalities. The Data Type Creation Wizard provides no way to choose between a by-value or a by-reference association for nested tables and VARRAYs. This choice is made in standard Rose user interface by setting a value of the classes' custom property called CollectionofREFs. When this property is set to true, the by-reference association should be used. Unfortunately, despite the documentation, which states that this property is used for VARRAYs and nested tables, Rose ignores it for nested tables. This means that the SQL data definition language code that Rose generates for model segment shown in Fig. III. 4 is not correct, and the user must modify the generated code by hand to get desired results. General impression

We had expected a lot more from Rational Rose. Our general impression is that the Oracle plug-in is not appropriate for every day use and that the plug-in was made only to make Oracle "Rose compatible" (or the other way around). The Data Type Creation Wizard is only a one-way tool: once the class is created for the first time, the wizard cannot be used to modify it (although modification is a very common task in iterative development). The mapping between Oracle constructs and UML is poorly chosen, especially for associations between classes. When generating code for the model, Rose does not warn the user whether the UML elements that are not supported by the Schema Generation tool were used (examples of such elements include cardinality and by-value/by-ref selections). The code that Rose generates is sometimes wrong. Therefore, in-depth knowledge of Oracle features is required - general knowledge of UML and Rose is not enough and thus the learning curve could be pretty steep. Rose is frequently used at the conceptual modelling level. On the other hand, the database design represents lowlevel modelling. Rose does not provide a way to bridge this gap and transform conceptual diagrams (such as the one shown on Fig. III.1) to lower-level diagrams that are used to generate database schemas. We had expected that Rose would provide a tool of such kind together with the common patterns that could be applied to the conceptual diagram in order to simulate constructs that are not yet directly supported on Oracle (inheritance, polymorphism).

III D. WHAT CAN WE EXPECT Currently, object relational support in almost all RDBMSes lack many much needed functions. Oracle9i (not released yet) has been enhanced to support full object modelling capabilities, including inheritance and multi-level collections, with type evolution capabilities [10]. XML (with derivates) is becoming a standard for information exchange. Using the object-relational framework, XML data can be stored, indexed and queried efficiently. Key enhancements are: inheritance (including type hierarchy, view hierarchy, substitutability, method invocation and support for different client environments), type evolution, mapping SQL objects to Java, C++ and XML, extensibility framework, etc… Type inheritance is a fundamental concept in any object-oriented system. It allows sharing similarities between types as well as extending their characteristics. Single type inheritance is sufficient to support type organization for most applications. A type may extend (inherit from) one supertype. Single type inheritance model is closely aligned with the ANSI SQL99 standards. Type Evolution is a mechanism that enables the user to change the type and propagate these changes to other schema objects that reference the modified type. The schema objects that may reference a type include types, subtypes, program units (packages, functions, procedures), views, functional indexes, or triggers. Mapping SQL Objects to Java, C++ and XML - The object relational facilities provide a more natural and productive way to maintain a consistent structure between a set of Java classes at the application level and the data model at the data storage level. In Oracle, the objectrelational facilities have been tightly integrated with the Java, C++ environments and XML. The SQL object types can be mapped to Java, C++ classes, XML and vice-versa. Similar functions are expected to be integrated or already are integrated into other ORDBMS products soon.

IV. CONCLUSION More natural than separation of data and function is to consider them together. The paper shows some benefits of adopting UML as a complete object-modelling environment. Although the object-relational DBMSes are still evolving, they will soon reach the stage where we’ll have no excuse for not using them.

REFERENCES: [1] Reed, P.: The Unified Modelling Language Takes Shape, DBMS, July 1998. [2] Dorsey, P.: Object-Relational Databases: Their Time has Come… Almost, Dorsey, P., Select Magazine, July 1998.

[3] Dorsey, P.: Making the Transition from ERD’s to UML and Oracle’s Object Database Designer, Oracle ECO/SEOUC 99, 3. 11. 1999. [4] Blaha, M., Premerlani, W.: Using UML to Create Database Applications, (C) 1999 Rational Software and Miller Freeman, Inc., a United News & Media company. [5] Muller, R. J.: Using UML to Design Oracle 8i Schemas, Oracle Open World ’99 Paper #501, Brisbane 1999. [6] Muller, R. J.:,Database Design for Smarties: Using UML for Data Modeling, Muller, R. J., Morgan Kaufmann, San Francisco 1999. [4] Dorsey, P.: Introduction to Object Modelling in Oracle 8, , IOUG-A Live '99, [6] Fowler, M., Scott, K.: UML Distilled: A Brief Guide to the Standard Object Modelling Language, 2/e, Addison Wesley, 2000. [7] Bringing Objects to Mainstream, an Oracle White Paper, Oracle Corp., Oct. 1999, http://technet.oracle.com/ . [8] Hudicka, J. R.: Taking Advantage of Oracle8i and the Object Layer, Today!, Dulcian 99, http://www.dulcian.com. [9] Oracle8i Concepts, Release 2 (8.1.6), Oracle Corp, 2000, http://technet.oracle.com/. [10] Krishnamurthy, V., Murthy, R., Morsi, M.: The Latest Advances in Oracle's Object-Relational Technology, Oracle Open World, Paper #412, San Francisco, CA, 2000.

Suggest Documents