Introduction to Database Design

Introduction to Database Design UVic C SC 370 Dr. Daniel M. German Department of Computer Science May 5, 2004 Version: 1.1.1 2–1 Introduction to Da...
Author: Morgan Black
7 downloads 4 Views 276KB Size
Introduction to Database Design

UVic C SC 370

Dr. Daniel M. German Department of Computer Science

May 5, 2004 Version: 1.1.1 2–1 Introduction to Database Design (1.1.1)

CSC 370 [email protected]

Overview ✥ What are the steps in designing a database? ✥ What is the entity-relationship (ER) model? ✥ How does UML related to the ER model?

Chapter 2 of textbook

2–2 Introduction to Database Design (1.1.1)

CSC 370 [email protected]

ER Model ✥ The Entity-Relationship data model allows us to describe the

data involved in a real-world system in terms of objects and their relationships ✥ It is widely used in database design

2–3 Introduction to Database Design (1.1.1)

CSC 370 [email protected]

Database Design Database design can be divided in six major steps: ✥ Requirements analysis ✥ Conceptual Database design (mostly done using the ER model) ✥ Logical Database design ✥ Schema refinement ✥ Physical Database Design ✥ Application and Security Design

2–4 Introduction to Database Design (1.1.1)

CSC 370 [email protected]

ER diagram ER Diagram: ✥ an approximate description of the data, ✥ constructed through a subjective evaluation of the information, ✥ that was collected through the requirements analysis phase.

2–5 Introduction to Database Design (1.1.1)

CSC 370 [email protected]

Entities ✥ Entity: is an object in the real world that is distinguishable from

other objects ✥ Entity Set: collection of similar objects ✥ An entity is described using a set of attributes ✥ Each attribute has a domain of possible values. ✥ For each entity set, we should select a key ✥ A key is a minimal set of attributes that uniquely identify an

entity in a set

2–6 Introduction to Database Design (1.1.1)

CSC 370 [email protected]

Example of an Entity name

ssn

lot

Employees

2–7 Introduction to Database Design (1.1.1)

CSC 370 [email protected]

Relationships ✥ A relationship is an association among two or more entities ✥ A set of similar relationships is called a relationship set:

{(e1 , ..., en )|e1 ∈ E1 , ..., en ∈ En } ✥ Each n-tuple denotes a relationship involving n entities

(e1 , ..., en ) where ei is in the entity set Ei ✥ A relationship can also include its own attributes (called

descriptive attributes ✥ A relationship must be uniquely identified by its participating

entities

2–8 Introduction to Database Design (1.1.1)

CSC 370 [email protected]

Example of a Relationship

dname

name

ssn

Since did

lot

Employees

2–9 Introduction to Database Design (1.1.1)

WorksIn

budget

Departments

CSC 370 [email protected]

Example of a Ternary Relationship

dname

name

Since

ssn

did

lot

Employees

address

2–10 Introduction to Database Design (1.1.1)

WorksIn2

Locations

budget

Departments

capacity

CSC 370 [email protected]

Role Indicators name

ssn

lot

Employees

supervisor

subordinate

ReportsTo

2–11 Introduction to Database Design (1.1.1)

CSC 370 [email protected]

Key Constraints ✥ One-to-many: an entity is related to many other entities, but each

of these entities can only be related to one entity ✥ Many-to-many: an entity is related to many other entities, and

vice-versa

2–12 Introduction to Database Design (1.1.1)

CSC 370 [email protected]

Key Constraint on Manages

dname

name

ssn

Since did

lot

Employees

2–13 Introduction to Database Design (1.1.1)

Manages

budget

Departments

CSC 370 [email protected]

Key Constraint on a Ternary Rel.

dname

name

Since

ssn

did

lot

Employees

address

2–14 Introduction to Database Design (1.1.1)

WorksIn3

Locations

budget

Departments

capacity

CSC 370 [email protected]

Participation Constraints There are two types of participation constraints for an entity in a relationship: ✥ Total: Every instance of the entity is present in the relationship

(represent it by a thick line) ✥ Partial: Not every instance of the entity is present in the

relationship represented

2–15 Introduction to Database Design (1.1.1)

CSC 370 [email protected]

Total participation

name

ssn

dname

Since did

lot

Employees

Manages

budget

Departments

WorksIn

Since

2–16 Introduction to Database Design (1.1.1)

CSC 370 [email protected]

Weak Entities ✥ Sometimes the attributes associated with an entity set do not

include a key ✥ A weak entity can be identified uniquely only by considering

some of its attributes in conjunction with the primary key of another entity (called identifying owner entity) ✥ The following restrictions must hold:

✦ The owner entity set is a one-to-many to the weak entity (identifying relationship set) ✦ The weak entity set should have total participation in the identifying relationship set. ✥ Represented by drawing the relation and the weak entity in thick

lines 2–17 Introduction to Database Design (1.1.1)

CSC 370 [email protected]

A Weak Entity Set

name

ssn

cost pname

lot

Employees

2–18 Introduction to Database Design (1.1.1)

Policy

age

Dependents

CSC 370 [email protected]

Class Hierarchies ✥ Sometimes we need to define entities as “derivations” of others

(ISA) ✥ That is, the attributes of an entity are those of another entity (its

parent) plus other ones ✥ A class hierarchy can be seen in two different ways:

✦ Specialization: identify subsets of an entity that share some distinguishing characteristics ✦ Generalization: An entity is created that includes several characteristics common to different entity sets.

2–19 Introduction to Database Design (1.1.1)

CSC 370 [email protected]

A Class Hierarchy name lot

ssn

Employees

hoursWorked

ISA

contractId

hourlyWages

HourlyEmps

2–20 Introduction to Database Design (1.1.1)

ContractEmps

CSC 370 [email protected]

Class hierarchies... ✥ Why do we subclass?

✦ We might want to include attributes that only make sense for the subclass (specialization) ✦ We might want to identify a set of entities that participate in a given relation (generalization)

2–21 Introduction to Database Design (1.1.1)

CSC 370 [email protected]

Aggregation ✥ Sometimes a relationship needs to relate one relationship with a

collection of entities or other relationships ✥ Aggregation allows us to indicate that a relationship set

participates in another relationship set. ✥ Illustrated by drawing a dashed box around the set of related

entities and relationships.

2–22 Introduction to Database Design (1.1.1)

CSC 370 [email protected]

Aggregation... name ssn

lot

Employees

Monitors

until

startedOn

dname since

pid

did

pbudget

Projects

2–23 Introduction to Database Design (1.1.1)

Sponsors

budget

Departments

CSC 370 [email protected]

Conceptual Design with the ER Model Developing an ER diagram presents several choices: ✥ Should a concept be modelled as an entity or an attribute? ✥ Should a concept be modelled as an entity or a relationship? ✥ What are the relationship sets and their participating entity sets? ✥ Should we use binary or ternary relationships? ✥ Should we use aggregation?

2–24 Introduction to Database Design (1.1.1)

CSC 370 [email protected]

Entity vs. Attribute ✥ It is not always clear what should be an attribute of an entity and

what should be moved to a new entity set ✥ In general, an attribute should not be an entity unless:

✦ We need to record the same attribute(s) for more than one entity ✦ We want to capture the structure of this “attribute” in our ER-diagram

2–25 Introduction to Database Design (1.1.1)

CSC 370 [email protected]

Entity vs. Attribute...

name

ssn

from

did

lot

Employees

2–26 Introduction to Database Design (1.1.1)

dname

to

WorksIn4

budget

Departments

CSC 370 [email protected]

Entity vs. Attribute...

dname

name

ssn

did

lot

Employees

budget

Departments

WorksIn4

from

2–27 Introduction to Database Design (1.1.1)

Duration

to

CSC 370 [email protected]

Entity vs. Relationship ✥ The imprecise nature of ER modelling makes it difficult to

recognize when to define an attribute as part of an entity or as part of a relationship ✥ The only solution (at this point) is to apply common sense: is the

attribute part of the relation, or is it part of the entity? ✥ In general, a mistake in this stage will lead to wasted storage ✥ We will fix this in the future (normalization)

2–28 Introduction to Database Design (1.1.1)

CSC 370 [email protected]

Entity vs. Relationship...

name

ssn

since

did

lot

Employees

2–29 Introduction to Database Design (1.1.1)

dname

dbudget

Manages2

budget

Departments

CSC 370 [email protected]

Binary vs. Ternary Relationships ✥ In cases where we can use either a binary or ternary relationship, ✥ the decision is usually determined by any restrictions (integrity

constraints) on the relationship that we are trying to model and ✥ if we can or cannot do it with a ternary relationship

2–30 Introduction to Database Design (1.1.1)

CSC 370 [email protected]

Binary vs. Ternary... name

ssn

pname

lot

Employees

age

Dependents

Covers

Policies

policyId

cost

• A policy cannot be owned jointly by two or more employees • Every policy must be owned by some employee • Dependents is a weak entity (uniquely identified by policyid and pname) 2–31 Introduction to Database Design (1.1.1)

CSC 370 [email protected]

Binary vs. Ternary... name pname ssn

age

lot

Employees

Purchaser

Beneficiary

Dependents

Policies policyId

2–32 Introduction to Database Design (1.1.1)

cost

CSC 370 [email protected]

Aggregation vs. Ternary Relationships ✥ Again, the choice depends on integrity constraints

2–33 Introduction to Database Design (1.1.1)

CSC 370 [email protected]

Aggregation vs. Ternary... name ssn

lot

Employees startedOn

pid

dname

did

pbudget

Project

Sponsors2

2–34 Introduction to Database Design (1.1.1)

budget

Departments

CSC 370 [email protected]