Identifying potential problems, called update anomalies, in the design of a relational database
Methods for correcting these problems
potential problems (Update, Inconsistent data, Additions, and Deletions)
table has desirable properties
First normal form (1NF)
Second normal form (2NF)
Third normal form (3NF)
Fourth normal form (4NF)
Table in first normal form better than table not in first normal form
Table in second normal form better than table in first normal form, and so on
Goal: new collection of tables that is free of update anomalies
Column B is functionally dependent on column A
Each value for A is associated with exactly one value of B
A → B
A functionally determines B
candidate key not chosen as primary key
multiple entries for a single record
A relation (table) that contains a repeating group
When is a Table in first normal form (1NF)?
When it does not contain repeating groups.
when it is in first normal form and no nonkey column is dependent on only a portion of the primary key.
Second Normal Form
Nonkey column (nonkey attribute)
not part of primary key
arrows indicate all functional dependencies
Arrows above boxes: normal dependencies
Arrows below boxes: partial dependencies
are dependencies only on a portion of the primary key
Any column (or collection of columns) that determines another column
third normal form (3NF)
when it is in second normal form and the only determinants it contains are candidate keys
fourth normal form (4NF)
when it is in third normal form and there are no multivalued dependencies.
In a table with columns A, B, and C, there is a ____ of column B on column A when each value for A is associated with a specific collection of values for B and, further, this collection is independent of any values for C.
a condition that involves two or more relations
A table is in fourth normal form when it is in third normal form and there are no ____.
A table is in first normal form if it does not contain ____.
If B is functionally dependent on A, you can also say that B functionally determines A.
A column is a nonkey column if it is ____.
not part of primary key
Functional dependencies can be determined by looking at sample data.
Another name for a nonkey column is a ____.
The definition for ____ also defines a candidate key.
Partial dependencies are dependencies on only a portion of the ____.
The primary key in a table is a determinant but candidate keys are not determinants.
Tables that are in second normal form do not update anomalies.
The fact that column B is functionally dependent on column A can be written as ____.
A table that is in first normal form is better than one that is in second normal form.
If B (an attribute) is functionally dependent on A, we can also say that ____.
A functionally determines B
A(n) ____ is a column or collection of columns on which all columns in the table are functionally dependent but not has not necessarily been chosen as the primary key.
The possibility of data dependence is one of the four categories of update anomalies.
To convert a table to fourth normal form, split the third normal form table into separate tables, each containing the column that ____ the others.
____ normal form has an additional condition that the only determinants the table contains are candidate keys.
By splitting relations to achieve third normal form tables, you create the need to express interrelation constraints.
If the primary key of a table contains only a single column, the table is automatically in third normal form.
To correct update anomalies in a database, tables must be converted into various types of normal forms.
Potential problems in the design of a relational database are known as ____.
In a dependency diagram, the arrows below the boxes indicate the partial dependencies.
Crucial issue in making the determination between a single many-to-many relationship and two (or three) many-to-many relationships is the independence. When all three entities are critical in the relationship, the three-way relationship is appropriate. When there is independence among the to-many-to-many relationship is created when it is not appropriate to do so, the conversion to fourth normal form will correct the problem.
independence (when there is independence among the individual relationships, separate many to many relationships are appropriate)
The normalization process used to convert a relation or collection of relations to an equivalent collection of third normal form tables is a crucial part of the database design process.
Removal of repeating groups is the starting point in the quest to create tables that are as free of problems as possible.
Second normal form can be defined as a table that is in first normal form but that contains no ____.
no nonkey column is dependent on only a portion of the primary key
A structure that satisfies the properties required to be a relation (table) with the exception of allowing repeating groups (the entries in the table do not have to be single-valued)
A column is a nonkey attribute when it is not part of the primary key.
____ contains a repeating group.
An alternate key is a ____.
column that could be a primary key but was not chosen
The conversion of an unnormalized table to first normal form requires the removal of ____.
Converting to third normal form always avoids the problems related to dependencies.
If there is more than one possible choice for the primary key, and one of the possibilities is chosen to be the primary key, the other choices are referred to as ____.
The ____ is a column (or collection of columns) A such that all other columns are functionally dependent on A and no subcollection of the columns in A has this property.
A column created for an entity to serve solely as the primary key and that is visible to users.
A candidate key that was not chosen to be the primary key.
A minimal collection of columns (attributes) in a table on which all columns are functionally dependent but that has not necessarily been chosen as the primary key
A design method in which specific user requirements are synthesized into a design. Design starts at a low level. Specific user requirements drive design process.
Bottom-up Design Method
The number of items that must be included in a relationship.
OPTIONAL role in the relationship.
An entity in a relationship with a *MINIMUM cardinality of ZERO* plays an
MANDATORY role in the relationship.
An entity with a *MINIMUM cardinality of ONE* plays a
An IDEF1X name for an entity sub-type.
An IDEF1X. A collection of subtypes with the property that every element of the supertype is an element of at least one subtype.
An entity in the entity-relationship model used to *implement a many-to-many-relationship*
Supports all the user views encountered during a design process.
A relational-like language that is used to represent the result of the database design process.
DBDL (Database Design Language)
– Tables (relations), columns (attributes), and primary keys are written by first listing the table name and then, in parentheses, listing the columns that make up the table. The columns that make up the primary key are underlined.
– alternate keys are identified by AK and the column that make up the alternate keys
– Secondary keys are identified by SK and the columns that make up secondary keys
– Foreign keys are identified as FK, and they are followed by an arrow pointing to the table identified by the foreign key.
Rules for defining tables and their keys using DBDL
An entity that *requires a relationship* to another entity for identification .
An approach to representing data in a database that uses ER diagrams exclusively as the tool for representing entities, attributes and relationships. Visually represents database structure.
Entity-Relationship Model (ER Model)
*Rectangles* represents in the Entity-Relationship Model (ER Model) as:
The primary key for each entity appears ____ in the E-R diagram
appears above the line in the rectangle for each entity
The other columns in each entity appear __ in the E-R diagram
below the line within each rectangle
*Diamonds* represents in the Entity-Relationship Model (ER Model) as:
*Oval* represents in the Entity-Relationship Model (ER Model) as:
Alternate keys, secondary keys, and foreign keys are identified with the letters
AK, SK, and FK appear in the parentheses following the alternate key, secondary key, and foreign key, (appear below the line)
Dashed lines from the rectangle indicates ____ in the E-R diagram
the relationship between the foreign key and the table being identified.
Dot indicates the ___ in the E-R diagram
indicates the “many” part of the one-to-many relationship
Entity “A” is a subtype of entity “B” *IF EVERY* occurrence of entity “A” is *ALSO* an occurrence of entity “B.”
Table that is a subtype of another table.
A relationship in which the *existence of one entity depends on the existence of another related entity*.
Table does not contain multiple entries in a field for a single record.
First Normal Form (1NF)
A type of ER diagram; or technically a language of the IDEF (Integrated Definition) family of languages that is used for data modeling.
A relationship that is necessary for identification of an entity.
An IDEF1X; a collection of subtypes with the property that there are elements of the super type that are NOT elements of any subtype.
An entity that *does not require* a relationship to another entity for identification
The step during DB design in which the goal is to create a clean, DBMS independent design, that will support all user requirements.
Step 1 of information-level design
.1. Represent the user view as a collection of tables
*STEP 1: Determine the entities involved and create a separate table for each type of entry
*STEP 2: Determine the primary key for each table
*STEP 3: Determine the attributes/properties for each entity
*STEP 4: Determine relationships between the entities
ONE-TO-MANY/ MANY-TO-MANY/ ONE-TO-ONE
Step 2 of information-level design
2.) Normalize these tables (3rd number form (3NF) the Primary Key)
Step 3 of information-level design
Identify all keys in these tables (primary, alternate, secondary, and foreign)
Step 4 of information-level design
Merge the results of Steps 1-3 into the cumulative design.
A primary key that consists of a column or collection of columns that is an inherent characteristic of the entity. ALSO CALLED A LOGICAL KEY.
A primary key that consists of a column or collection of columns that is an inherent characteristic of the entity. ALSO CALLED AN INTELLIGENT KEY.
The role in a relationship played by an entity with a minimum cardinality of 1 (THAT IS, THERE MUST BE AT LEAST ONE OCCURRENCE OF THE ENTITY).
A relationship between *2 entities* in which each occurrence of each entity can be related to many occurrences of the other entity.
*CREATE A NEW TABLE WHOSE PRIMARY KEY IS THE COMBINATION OF THE PRIMARY KEYS OF THE ORIGINAL TABLES
*YOU CANNOT IMPLEMENT A M2M
A relationship between *3 entities* in which each occurrence of each entity can be related to many occurrences of each of the other entities.
Consists of a column that uniquely identifies an entity. Also called a logical key or an intelligent key.
A relationship that is not necessary for identification.
Special value. Represents absence of a value in a field. Used when a vale is unknown or inapplicable.
A relationship between *two entities* in which *EACH* occurrence of the 1st entity is related to *MANY* occurrences of the *2nd entity*, and *EACH* occurrence of the *2nd entity* is related to at most, one occurrence of the 1st entity.
*INCLUDES PRIMARY KEY OF THE “ONE” TABLE AS A FOREIGN KEY IN THE “MANY” TABLE
A relationship between *2 entities* in which each occurrence of the first entity is related to *ONE* occurrence of the 2nd entity AND each occurrence of the 2nd entity is related to, at most, one occurrence of the 1st entity.
*SIMPLEST IMPLEMENTATION IS TO TREAT IT AS A ONE-TO-MANY RELATIONSHIP
The role in a relationship played by an entity with a minimum cardinality of *ZERO* (that is, there need not be any occurrences of the entity).
The step during DB design in which a design for a given DBMS is produced from the final information-level design.
*Process consists of creating a table for each entity in the DBDL design.
Table is in 1NF and no nonkey attribute is dependent on only a portion of the primary key.
Second Normal Form (2NF)
A column (attribute) or collection of columns that is of interest for retrieval purposes (and that is not already designated as some other type of key).
To avoid use of null values
A system-generated primary key that is usually hidden from users. Also called a synthetic key.
A system-generated primary key that is usually hidden from users.
A design method that begins with a general DB design that models the overall enterprise and then repeatedly refines the model to achieve a design that supports all necessary applications. Refines model until design supports all necessary applications.
Top-Down Design Method
Tabel is in 2NF and only determinants are candidate keys. (primary (candidate) key can determine another column.)
Target/ Third normal Form
Set of requirements necessary to support operations of a particular database user.
Depends on another entity for its own existence.
Database design is a two-part process
*Information-Level Design (NOT dependent on a particular DBMS).
*Physical-Level Design (appropriate for the particular DBMS being used).
*Database design is represented in DBDL (Database Language Design).
*Designs can be represented visually, using entity-relationship (E-R) diagrams.
To obtain information from existing documents:
1.) List all attributes present in the document
2.) Identify potential functional dependencies
3.) Make a tentative list of tables
4.) Use the functional dependencies to refine the list
To implement a one-to-one relationship
1.) Include primary key of one table in the other table as a foreign key and indicate the foreign key as an alternate key.
If a table’s primary key consists of three (or more) columns:
1.) Determine whether there are independent relationships between pairs of these columns.
If a table contains columns that can be null and the nulls mean that the column is inapplicable for some rows:
You can split the table, placing the null column(s) in separate tables
The result of merging third normal form tables:
May not be in third normal form
Entity-Relationship (ER) MODEL represents:
The structure of a DB using and E-R DIAGRAM
a diagram that indicates the dependencies among the columns in a table. Arrows indicate all functional dependencies. A
Dependency diagram- *Arrows above boxes* indicates:
*normal dependencies* that should be present; in other words, *the primary key functionally determines all other columns.*
Dependency diagram- *Arrows below boxes* indicates:
prevents the table from being in second normal form – partial dependencies
dependencies on only a portion of the primary key
Column B is functionally dependent on Column A (or on a collection of columns) if a value for A determines a single value for B at any one time
Column A functionally determines column B if B is functionally dependent on A
A minimum collection of columns (attributes) in a table on which all columns are functionally dependent and that is chosen as the main direct-access vehicle to individual rows.
Column A (or a collection of columns) is ___ for a relation (table) R, if: 1. All columns in R are functionally dependent on A. 2. No subcollection of the columns in A (assuming A is a collection of columns and not just a single column, also has Property 1
When a table’s primary key contains only one column,
the table is automatically in the 2NF because there would be no way for a column to be dependent on only a portion of the primary key.