Practice Exams  
  - CCNA -  
  - CCSP -  
  - CCIE -  
  Search the Web  
  Certification Kits
  Subnet Calculator  
  Online Degrees  
  Exam Vouchers  
  Free Magazines  

MCDBA - 70-229 SQL Server Design
Developing the Logical Model
1. Tables Overview
2. Data Integrity Overview
2.1. Entity Integrity
2.2. Domain Integrity
2.3. Referential Integrity
3. Constraints Overview
4. Normalization

1. Tables Overview

Tables are database objects that represent real world entities and contain all the data related to the entities they represent. A table is made up of columns and rows. In tables, data is organized in a row-and-column format similar to a spreadsheet. For example, a table containing customer data can contain a row for each customer and columns representing the customers detailed information such as his name, address, country, and phone number.

2. Data Integrity

Enforcing Data Integrity ensures the quality of a table. A table is said to have low integrity if any given row in a table is not uniquely identified. In other words, if a table contains a customer named ABCD, the system should not accept another customer with the same name, ABCD. Data Integrity is divided into three categories:

1- Entity Integrity: Entity Integrity or Row Integrity defines a row as a unique entity for a particular table. For example, a table that contains customer data will have one and only one row for every customer. This is done by giving every row a unique identifier column(s) called a primary key.

2- Domain Integrity: Domain Integrity or Column Integrity maintains the validity of entries for a given column. For example, if a table contains a country column for each customer you can enforce domain integrity by restricting the type of the column through data types (i.e. whether the column should allow names or numbers), the format through CHECK constraints and rules, or the range of possible values (through FOREIGN KEY constraints, CHECK constraints, DEFAULT definitions, NOT NULL definitions, and rules). Therefore, by using domain integrity, you can rest assured that an age column (that is supposed to contain a number) will never contain a countries name.

3- Referential Integrity: When more than one type of information is included in a table, i.e. employees data and departments data, it is recommended that the integrity be maintained for all “foreign data” (i.e. data belonging to another table). This is accomplished by using referential integrity which preserves the relationship between the two tables. To put it in easier terms, foreign data, or a copy of a subset of another tables data that does not belong to a table but has to be included because of a necessity should always be the same as it’s original. For example, an employees table contains employee information and a departments table contains department information. Every employee belongs to a department and every department has its own set of attributes (columns) that store information about each department such as the department name, department head, the department type, and so on. Instead of having all the columns in the departments table copied into the employees table in order to know what department a given employee belongs to, it is recommended that one includes only one column called a “foreign key” column that references the “primary key” column of the referenced table or a unique column. The following diagram is a visual example of what I’ve just explained:

3. Constraints Overview

In order to able to efficiently design a table complete with its entity, domain, and referential integrity, one has to have some knowledge about constraints. Constraints are the programming elements used to enforce such integrity. There are six types of constraints in Microsoft SQL Server and they are as follows:

1- Primary Key constraints: a table usually has one or more columns whose values uniquely identify each row in a table. This column or columns is referred to as the “primary key” of a table and enforces the entity integrity of the table. A table can have only one primary key constraint, and a column or columns that participates in the primary key cannot accept null values. If a primary key constraint is defined on more than one column (referred to as a composite key), values may be duplicated within one column, but each combination of values from all the columns in the primary key constraint definition must be unique. Figure 1.1. shows an example of a primary key.

2- Foreign Key constraints: a foreign key is a column or combination of columns used to establish and enforce a link between the data in two tables. This is accomplished by adding the column or columns that hold one table's primary key values to the other table. This column in the second table is then referred to as a “foreign key”. Although the primary purpose of a foreign key is to control the values stored in the foreign key table, it also controls changes to data in the primary key table. For example, all employees that work in a company are assigned to departments and no employee can be assigned to a non-existent department. On the other hand, no department that has an employee assigned to it can be deleted from the departments table. If the sole purpose of the foreign key would be to control values stored in the foreign key table, some one would be able to delete a referenced row and hence break the link and the referential integrity.

3- Unique constraints: if every employee in the employees table has a Social Security Number, one would want to ensure that no two Social Security Numbers are the same throughout the rows. Since it is not appropriate to use a primary key in order to enforce such uniqueness, a unique constraint can be defined on the Social Security Number column and hence prevent duplicate values from being entered into the Social Security Number column.

4- Check constraints: a check constraint is used to enforce domain integrity (i.e. column integrity) by limiting the values that are accepted by a column. For example, a table that stores membership information for Junior Members in a sports club might contain an age field. A member is not accepted as a “junior” if his age is younger than eight or older than sixteen. In order to restrict the age values being entered in the age column, you would want to specify a check constraint to enforce that rule. You can create a check constraint with any logical expression that returns TRUE or FALSE. The logical expression for the previous example would be:
Age > = 8 and Age <= 16

5- Default constraints: every column in a table should contain a value. There are certain situations where a column’s value can be automatically discerned from the values entered in another column. For example, if a table contains a birth date column and an age column, the date entered in the birth date column subtracted from the current date would be the age of the person. Hence, a default constraint is defined on the age column and programmed with the logical expression that would produce the person’s age.

6- NULL or NOT NULL constraints: people typically confuse NULL with zero, blank, or a zero length character (i.e. “”); NULL means no entry has been made. The presence of NULL often defines that there either is no value or that the value is not known. For example, is the price column of the products table contains null, that doesn’t mean the product doesn’t have a price, it means the price hasn’t been entered. In order to ensure that no product will be added to the database unless its price is entered, you can specify a NOT NULL constraint on the price column. A column defined with the NULL constraint will allow a NULL entry from the user whereas a column defined with a NOT NULL constraint will reject NULL entries from the user. By carefully using NULL and NOT NULL constraints, you can maintain a very high level of data integrity.

4. Normalization

In the previous sections we looked at entities and data integrity. The following section is about putting the picture together. As we discussed in the previous sections, a database usually consists of one or more tables representing different entities. A database is said to be normalized if it complies with a set of formal rules, referred to as normal forms. These formal rules define how data that represents different entities should be organized in a database.

The figure above shows a table (the only table in the database) that is not normalized. In order to normalize this table and hence the database, you have to go through three steps or rules.

1- First Normal Form: the first step is to make the table compliant with the first normal form. A table complies with the first normal form if no columns define similar attributes and if no column contains multiple values. Let’s take each condition one at a time.

a. No columns define similar attributes: in this scenario, the Customer Purchases Table includes three Address columns (i.e. Address1, Address2, and Address3) which define similar attributes (i.e. the customers address). By simply altering the three column’s names to Address, City, and Country respectively, would make the table compliant with the first normal form’s first rule.
b. No column contains multiple values: in this scenario, the Customer Name column in the Customer Purchases Table contains multiple values. Multiple values do not mean text separated by spaces. For example, with this current design, if the company’s regional sales officer where to ask you to produce a year-to-date report of all customer purchases sorted out by the customer’s last name, one would have to query the column then separate the values and finally sort out the results by the last name. This process not only requires a few more lines of unnecessary code, it also causes a substantial deterioration to the speed of the query and hence blocks up resources and increases the network traffic. So, by splitting up the whole customer’s full name into three separate columns (i.e. First Name, Middle Name, and Last Name), you reduce the amount of effort needed to produce your report and the amount of traffic and blocked resources on the network. Hence, by doing that, we suffice the second rule in the first normal form and are ready to move on to the second normal form. The following diagram shows the Customer Purchases Table normalized to the first normal form.

2- Second Normal Form: the second step in normalization is to make the table(s) compliant with the second normal form. A table complies with the second normal form if it complies with the first normal form, and if each column that is not part of the primary key depends on all of the columns that are covered by the primary key in the table and not on a subset of the columns that are covered by the primary key. Let’s look at each one at a time.

a. Primary Key: the first sub rule in the second normal form is that every table includes a column that uniquely identifies each row within the table. It is highly recommended that this column be an identity column. An identity column is an auto-incremented number that is assigned automatically to every new row inserted into the table, hence preventing any user typo and increasing the efficiency and speed of an indexed table. It is recommended that the primary key column be named after the entity’s name followed by “ID” or “Number”, thus maintaining consistency throughout the database design.
b. Each column that is not part of the primary key depends on all columns that are covered by the primary key in a table and not a subset of it: this second sub rule is one of the most important rules in normalization and is the rule responsible for splitting up a big confusing chuck of data so that it becomes more but smaller tables that represent specific things. In this scenario, the table contains three different pieces of data (i.e. customer data, product data, and purchase data). Since the purchase details (i.e. quantity, discount) depend on the purchase, the product details (i.e. product name and price) depend on the product, and the customer details (i.e. address, city, and country) depend on the customer, we have columns that depend on other columns other than the primary key. In order to meet the rule’s requirement, we are forced to break up the big chunk into several tables that each represents only one entity. By doing so, we are left with three tables; the Customers table, the Products table, and the Purchases table similar to the following table diagram:

Since a purchase combines both a customer and a product being purchased, the purchases table must include a reference (i.e. foreign key, as we discussed in previous sections) to the customers table and the products table. This referential link is shown above in Figure 1.4

3- Third Normal Form: the last rule in normalization, the third normal form, puts in the final touches to the design. A database complies with the third normal form if it complies with the second normal form and if, in each table, columns that are not covered by the primary key do not depend on each other. In this scenario, the quantity and discount columns in the Purchases table depend on the ProductID rather than the PurchaseID. In order to meet the requirement, we have to split up the purchases table’s information so that we have a separate table for purchase information only, and another for the purchase details. The new table PurchaseDetails is usually referred to as a junction table. A junction table is a table that has 2 one-to-many relationships with the two tables it junctions, hence creating a many-to-many relationship. The junction table’s primary key is a combination of the two foreign columns combines in order to make the junction. This combined primary key is referred to as a composite key. In this scenario, a purchase can contain more than one product, and a product can be included in more than one purchase. By defining a composite key on the ProductID and PurchaseID column, we have a many-to-many relationship compliant with the third normal form. Since we haven’t yet reached relationships, this will be covered in later TechNotes. The following diagram shows the table structure of what the database would look like after we normalized it to the third normal form.


Current related exam topics for the 70-229 SQL Server Design exam:

Developing a Logical Data Model

Define entities. Considerations include entity composition and normalization.
- Specify entity attributes.
- Specify degree of normalization.

Design entity keys. Considerations include FOREIGN KEY constraints, PRIMARY KEY constraints, and UNIQUE constraints.
- Specify attributes that uniquely identify records.
- Specify attributes that reference other entities.

Design attribute domain integrity. Considerations include CHECK constraints, data types, and nullability.
- Specify scale and precision of allowable values for each attribute.
- Allow or prohibit NULL for each attribute.
- Specify allowable values for each attribute

Click here for the complete list of exam objectives.

Discuss this TechNote here Author: Abdul-Rahman Ali


All images and text are copyright protected, violations of these rights will be prosecuted to the full extent of the law.
2002-2015 TechExams.Net | Advertise | Disclaimer

IT Showcase

IT banner exchange