Unit 5. Data Modeling and Database Design

Student Notebook Unit 5. Data Modeling and Database Design What This Unit Is About This unit describes the BASICS of Data Modeling. What You Should ...
Author: Grace Chandler
4 downloads 1 Views 1MB Size
Student Notebook

Unit 5. Data Modeling and Database Design What This Unit Is About This unit describes the BASICS of Data Modeling.

What You Should Be Able to Do After completing this unit, you should be able to: State the purpose of a business model Identify an Entity-Relational Diagram (ERD) model State the purpose of an association table List several DB2 UDB column data types Identify non-standard column and table names Identify the characteristics of a Primary Key and Foreign Key State the purpose of Referential Integrity

How You Will Check Your Progress Accountability: Checkpoint questions

 Copyright IBM Corp. 1997, 2000 Unit 5. Data Modeling Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

5-1

Student Notebook

Figure

5-1. Unit Objectives (CF035005)

Notes:

5-2

 Copyright IBM Corp. 1997, 2000 Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

DB2 Family Fundamentals

Student Notebook

5.1 Business Modeling

 Copyright IBM Corp. 1997, 2000 Unit 5. Data Modeling Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

5-3

Student Notebook

Figure

5-2. Business Modeling (CF035010)

Notes: A business model is a formal representation of business information with the focus on the business, not how the data is going to be stored or on which platform the applications will run.

5-4

 Copyright IBM Corp. 1997, 2000 Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

DB2 Family Fundamentals

Student Notebook

Figure

5-3. Define the Business Entities (CF035015)

Notes: A business entity is something that is fundamental to the organization and an individual instance or occurrence of this thing can be uniquely identified.

 Copyright IBM Corp. 1997, 2000 Unit 5. Data Modeling Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

5-5

Student Notebook

Figure

5-4. Draw An Entity Relationship Diagram (ERD) (CF035020)

Notes: Draw lines between related entities and notate the relationships.

5-6

 Copyright IBM Corp. 1997, 2000 Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

DB2 Family Fundamentals

Student Notebook

Figure

5-5. Benefits of Business Modeling (CF035025)

Notes: When everyone agrees that the model represents the way business processes are really performed it ensures that all participants have the same understanding of those processes.

 Copyright IBM Corp. 1997, 2000 Unit 5. Data Modeling Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

5-7

Student Notebook

5.2 Data Modeling

5-8

 Copyright IBM Corp. 1997, 2000 Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

DB2 Family Fundamentals

Student Notebook

Figure

5-6. Data Modeling (CF035030)

Notes: Each business entity will be come a data entity (something about which we store data).

 Copyright IBM Corp. 1997, 2000 Unit 5. Data Modeling Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

5-9

Student Notebook

Figure

5-7. Identify Primary Keys (CF035035)

Notes: Each occurrence of an entity must be uniquely identified. In small companies, this may be the first time an employee, customer, product, etc., has been assigned a unique identifier.

5-10

 Copyright IBM Corp. 1997, 2000 Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

DB2 Family Fundamentals

Student Notebook

Figure

5-8. Data Modeling (CF035040)

Notes: Defining a table's columns, its primary and foreign keys, and identifying and removing redundancies are all part of Data Modeling.

 Copyright IBM Corp. 1997, 2000 Unit 5. Data Modeling Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

5-11

Student Notebook

Figure

5-9. One Table Per Entity (CF035045)

Notes: General starting point - one table per entity. One row per occurrence of the entity.

5-12

 Copyright IBM Corp. 1997, 2000 Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

DB2 Family Fundamentals

Student Notebook

Figure

5-10. One Column for Each Data Element (CF035050)

Notes: Each attribute of an entity will be a column in the entity's table.

 Copyright IBM Corp. 1997, 2000 Unit 5. Data Modeling Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

5-13

Student Notebook

5.3 Table and Column Names

5-14

 Copyright IBM Corp. 1997, 2000 Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

DB2 Family Fundamentals

Student Notebook

Figure

5-11. Names - Review (CF035055)

Notes: Column and table names must conform to the above rules.

 Copyright IBM Corp. 1997, 2000 Unit 5. Data Modeling Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

5-15

Student Notebook

Figure

5-12. Name Examples (CF035060)

Notes: When non-standard characters are used in table or column names, within our queries the names must be enclosed in double quotes.

5-16

 Copyright IBM Corp. 1997, 2000 Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

DB2 Family Fundamentals

Student Notebook

Figure

5-13. Table Name Construction (CF035065)

Notes: Table names consist of two parts separated by a period.

 Copyright IBM Corp. 1997, 2000 Unit 5. Data Modeling Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

5-17

Student Notebook

Figure

5-14. Table Names - References (CF035070)

Notes: The owner of a table can use the simple table name within an SQL statement. Other users must use the fully qualified table name.

5-18

 Copyright IBM Corp. 1997, 2000 Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

DB2 Family Fundamentals

Student Notebook

Figure

5-15. Column Names - Construction (CF035075)

Notes: Column names can be prefixed with their table name. Column names must be qualified when the names appear in more than one of the tables accessed in the query.

 Copyright IBM Corp. 1997, 2000 Unit 5. Data Modeling Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

5-19

Student Notebook

5.4 Data Types

5-20

 Copyright IBM Corp. 1997, 2000 Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

DB2 Family Fundamentals

Student Notebook

Figure

5-16. Data Types (CF035080)

Notes: These are all of the DB2 UDB data type categories.

 Copyright IBM Corp. 1997, 2000 Unit 5. Data Modeling Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

5-21

Student Notebook

Figure

5-17. Numeric Formats (CF035085)

Notes: The approximate data types are used when there is a need to store super large or small numbers (the number of grains of sand on all the beaches of the world). Big integer is an eight byte field that holds very large whole numbers.

5-22

 Copyright IBM Corp. 1997, 2000 Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

DB2 Family Fundamentals

Student Notebook

Figure

5-18. Character String Formats (CF035090)

Notes: LONG VARCHAR data in DB2 UDB for UNIX/Intel is stored in data sets separate from the rest of the table.

 Copyright IBM Corp. 1997, 2000 Unit 5. Data Modeling Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

5-23

Student Notebook

Figure

5-19. Variable Length Strings (CF035095)

Notes: First page of data holds 2 2 KB rows. The OS/390 page is an example of how OS/390 holds one row due to the LONG VARCHAR field having a maximum length of 'to the end of the page'. The DB2 UDB for UNIX/Intel page shows how 'long field' data is physically stored in data sets. Within a row are pointers to where the long field data is stored.

5-24

 Copyright IBM Corp. 1997, 2000 Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

DB2 Family Fundamentals

Student Notebook

Figure

5-20. Graphic String Formats (CF035100)

Notes: Graphic data types are analogous to the CHAR data types except in the graphic data types it takes two bytes to represent each character being stored (the alternate name for these data types is 'double byte character sets' (DBCS)).

 Copyright IBM Corp. 1997, 2000 Unit 5. Data Modeling Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

5-25

Student Notebook

Figure

5-21. Large Objects (CF035105)

Notes: BLOBs can store images and sound data. CLOBs can store large amounts of character data. DBCLOBS can store large amounts of double byte character data.

5-26

 Copyright IBM Corp. 1997, 2000 Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

DB2 Family Fundamentals

Student Notebook

Figure

5-22. Create Table (CF035110)

Notes: These data types are specified just like the other data we've learned about. This is an example of using these data types with DB2 UDB for UNIX, Windows, and OS/2.

 Copyright IBM Corp. 1997, 2000 Unit 5. Data Modeling Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

5-27

Student Notebook

Figure

5-23. DB2 UDB Relational Extenders (CF035115)

Notes: The DB2 UDB products provide toolkits, called Extenders, that contain sets of predefined datatypes and functions. Each extender helps you to manage a particular kind of data, such as text,image, audio, or video.

5-28

 Copyright IBM Corp. 1997, 2000 Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

DB2 Family Fundamentals

Student Notebook

Figure

5-24. User-Defined Distinct Types and User-Defined Functions (CF035120)

Notes: These have been discussed earlier, just repeated here for completeness.

 Copyright IBM Corp. 1997, 2000 Unit 5. Data Modeling Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

5-29

Student Notebook

Figure

5-25. Date/Time Formats (CF035125)

Notes: DB2 UDB accepts any of these formats in an SQL statement (character string representation of dates, times, and timestamp).

5-30

 Copyright IBM Corp. 1997, 2000 Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

DB2 Family Fundamentals

Student Notebook

5.5 NULLs

 Copyright IBM Corp. 1997, 2000 Unit 5. Data Modeling Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

5-31

Student Notebook

Figure

5-26. Nulls (CF035130)

Notes: There are three null characteristics.

5-32

 Copyright IBM Corp. 1997, 2000 Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

DB2 Family Fundamentals

Student Notebook

5.6 Related Data

 Copyright IBM Corp. 1997, 2000 Unit 5. Data Modeling Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

5-33

Student Notebook

Figure

5-27. Related Data (CF035135)

Notes:

5-34

 Copyright IBM Corp. 1997, 2000 Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

DB2 Family Fundamentals

Student Notebook

Figure

5-28. Related Tables (CF035140)

Notes:

 Copyright IBM Corp. 1997, 2000 Unit 5. Data Modeling Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

5-35

Student Notebook

Figure

5-29. Which Table Gets the Foreign Key? (CF035145)

Notes: Which table carries the foreign key is determined from the business model.

5-36

 Copyright IBM Corp. 1997, 2000 Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

DB2 Family Fundamentals

Student Notebook

5.7 Association Table

 Copyright IBM Corp. 1997, 2000 Unit 5. Data Modeling Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

5-37

Student Notebook

Figure

5-30. Association Table - Many-To-Many (CF035150)

Notes: A pair of entities may have a many-to-many relationship. An employee may be assigned to many projects, and a project may have many employees assigned to it. For each pair of many-to-many related tables we create an additional table - called an association table - which will be used to document relationships between the related tables.

5-38

 Copyright IBM Corp. 1997, 2000 Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

DB2 Family Fundamentals

Student Notebook

Figure

5-31. Association Table - Normal DB2 UDB Table (CF035155)

Notes: Association tables are standard DB2 UDB tables and can carry information pertinent to the relationship.

 Copyright IBM Corp. 1997, 2000 Unit 5. Data Modeling Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

5-39

Student Notebook

5.8 Referential Integrity

5-40

 Copyright IBM Corp. 1997, 2000 Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

DB2 Family Fundamentals

Student Notebook

Figure

5-32. The Rules DB2 Enforces to Maintain Referential Integrity (CF035160)

Notes: Primary key values cannot be null. A foreign key contains a set of values from some table's unique key.

 Copyright IBM Corp. 1997, 2000 Unit 5. Data Modeling Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

5-41

Student Notebook

Figure

5-33. Adding Primary and Foreign Keys to an Existing Table (CF035165)

Notes: Referential integrity can be added after the tables have been created, via an ALTER TABLE statement.

5-42

 Copyright IBM Corp. 1997, 2000 Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

DB2 Family Fundamentals

Student Notebook

Figure

5-34. Terms (CF035170)

Notes: The above are definitions of terms.

 Copyright IBM Corp. 1997, 2000 Unit 5. Data Modeling Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

5-43

Student Notebook

Figure

5-35. Processing Rules Enforced to Maintain Referential Integrity (CF035175)

Notes: Defined referential integrity causes DB2 to enforce these rules.

5-44

 Copyright IBM Corp. 1997, 2000 Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

DB2 Family Fundamentals

Student Notebook

Figure

5-36. Primary Key Violation (CF035180)

Notes: Why did the insert fail?

 Copyright IBM Corp. 1997, 2000 Unit 5. Data Modeling Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

5-45

Student Notebook

Figure

5-37. Foreign Key Violation (CF035185)

Notes: Why did the insert fail?

5-46

 Copyright IBM Corp. 1997, 2000 Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

DB2 Family Fundamentals

Student Notebook

Figure

5-38. Delete Rules (CF035190)

Notes: There are three delete rules to choose from. ON DELETE RESTRICT is the default.

 Copyright IBM Corp. 1997, 2000 Unit 5. Data Modeling Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

5-47

Student Notebook

Figure

5-39. Delete Rules - ON DELETE CASCADE (CF035195)

Notes: In this example, the referenced table is the department table. The foreign key EMPNO.DEP was defined with an ON DELETE CASCADE rule. What happens when the BLU DEP row is deleted from the department table with ON DELETE CASCADE as the delete rule? All employees in the BLU DEP are also deleted.

5-48

 Copyright IBM Corp. 1997, 2000 Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

DB2 Family Fundamentals

Student Notebook

Figure

5-40. Delete Rules - ON DELETE SET NULL (CF035200)

Notes: In this example, the referenced table is the department table. The foreign key EMPNO.DEP was defined with an ON DELETE SET NULL rule. When the BLU DEP row is deleted from the department table, DB2 will set the DEP value for all BLU DEP employees, to NULL.

 Copyright IBM Corp. 1997, 2000 Unit 5. Data Modeling Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

5-49

Student Notebook

Figure

5-41. Delete Rules - ON DELETE RESTRICT (CF035205)

Notes: In this example, the referenced table is the department table. The foreign key EMPNO.DEP was defined with an ON DELETE RESTRICT rule. Before DB2 will allow the BLU DEP row to be deleted from the department table, it checks to ensure that there are no employee rows with BLU DEP values. If DB2 finds one employee row with a value of BLU in the DEP column, the delete fails (is restricted).

5-50

 Copyright IBM Corp. 1997, 2000 Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

DB2 Family Fundamentals

Student Notebook

5.9 Triggers

 Copyright IBM Corp. 1997, 2000 Unit 5. Data Modeling Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

5-51

Student Notebook

Figure

5-42. Triggers (CF035210)

Notes: If your application environment needs to enforce particular business rules, before or after changes to DB2 UDB tables take place, DB2 UDB offers you the ability to automatically activate a user-written program (a trigger) which will perform any action you consider appropriate. These programs can be independent from the applications affecting the contents of the table; when a trigger is run, the application that fired the trigger will wait for its completion, as if the trigger had actually been called by the application itself. Triggers are associated with specific tables and will be activated no matter how the table change took place.

5-52

 Copyright IBM Corp. 1997, 2000 Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

DB2 Family Fundamentals

Student Notebook

Figure

5-43. Triggers: A Simple Example (CF035215)

Notes:

 Copyright IBM Corp. 1997, 2000 Unit 5. Data Modeling Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

5-53

Student Notebook

Figure

5-44. Triggers: A Sophisticated Example (CF035220)

Notes: The picture shows an example of how triggers can be used to integrate a traditional application such as order entry with advanced technologies. This could be a very common case of a company acquiring orders over the phone, validating them and sending a confirmation fax to their customer afterwards. Using triggers, the fax could be automatically sent to the customer as soon as the order is inserted into the database file, without changing the order entry application. The insert trigger is invoked every time an order enters the orders file; the new records are automatically passed to the trigger program as input parameters. The program might produce a predefined document with the new order and customer data, and send a fax to the customer.

5-54

 Copyright IBM Corp. 1997, 2000 Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

DB2 Family Fundamentals

Student Notebook

5.10 Normalization

 Copyright IBM Corp. 1997, 2000 Unit 5. Data Modeling Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

5-55

Student Notebook

Figure

5-45. Normalization - First Normal Form (CF035225)

Notes: Normalization is the process of steps that will identify for elimination redundancies in a database design.

5-56

 Copyright IBM Corp. 1997, 2000 Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

DB2 Family Fundamentals

Student Notebook

Figure

5-46. Normalization - Second Normal Form (CF035230)

Notes: Second normal form requires a close look at proposed tables that have multi-column primary keys. The objective is to eliminate all columns from a table's design, that are dependent on only part of the primary key.

 Copyright IBM Corp. 1997, 2000 Unit 5. Data Modeling Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

5-57

Student Notebook

Figure

5-47. Normalization - Third Normal Form (CF035235)

Notes: Third normal form tells us to look at each proposed table and ask of each column 'is this column's value dependent on the primary key or some other column within the table.

5-58

 Copyright IBM Corp. 1997, 2000 Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

DB2 Family Fundamentals

Student Notebook

Figure

5-48. Normalization Recommendation (CF035240)

Notes:

 Copyright IBM Corp. 1997, 2000 Unit 5. Data Modeling Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

5-59

Student Notebook

Figure

5-49. Questions to Answer (CF035245)

Notes: Answers to these questions help to determine whether data should be denormalized.

5-60

 Copyright IBM Corp. 1997, 2000 Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

DB2 Family Fundamentals

Student Notebook

Figure

5-50. Normalization/Denormalization Example (CF035250)

Notes:

 Copyright IBM Corp. 1997, 2000 Unit 5. Data Modeling Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

5-61

Student Notebook

Figure

5-51. Denormalization Example (CF035255)

Notes: Denormalization could improve the performance of our critical transaction. Transition Statement — Let's look at the costs and benefits of denormalization.

5-62

 Copyright IBM Corp. 1997, 2000 Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

DB2 Family Fundamentals

Student Notebook

Figure

5-52. Costs and Benefits of Denormalization (CF035260)

Notes: The costs are 900 additional pages of storage and additional updates to the second MGRNO column. The benefits are 20,000 avoided joins each day. Transition Statement — Let's answer the checkpoint questions.

 Copyright IBM Corp. 1997, 2000 Unit 5. Data Modeling Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

5-63

Student Notebook

Checkpoint Exercise — Unit 5 Checkpoint

5-64

1.

What is a business entity?

2.

What are the two basic parts to a table name?

3.

List some of the DB2 UDB data types.

 Copyright IBM Corp. 1997, 2000 Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

DB2 Family Fundamentals

Student Notebook

4.

Suppose we are creating two tables. One called ORDER_TAB will contain one row for each order taken by our company. It's primary key will be ORDER_NO. The second table is called ITEM_TAB and will contain one row for each item ordered (within an order a customer can order many items). Into which table should the foreign key be placed?

5.

For a foreign key to be defined the column (or columns) it references must have what characteristic?

6.

When must an association table be created.

7.

From which SQL statements may primary and foreign keys be defined?

8.

What is a referential constraint?

9.

What is the 'automatic enforcement' of referential constraints called?

 Copyright IBM Corp. 1997, 2000 Unit 5. Data Modeling Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

5-65

Student Notebook

Figure

5-53. Unit Summary (CF035265)

Notes:

5-66

 Copyright IBM Corp. 1997, 2000 Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

DB2 Family Fundamentals

Suggest Documents