|
1. Databases
Overview
2. Parts of a Database
3. File groups
4. Overview of T-SQL and SQL Query Analyzer
5. Creating and Altering Databases
6. Creating & Designing Tables
1. Databases Overview
A database is a data repository that is made
up of a set of objects such as tables, stored procedures and
views. Data stored in a database is usually related to a particular
subject or theme, such as Inventory/Manufacturing information
for a manufacturing company. Before creating a database, it
is important to understand the different parts of a database
and some design considerations.
2. Parts of a Database
A database stores all information about the
objects it contains in one or more operating-system files.
These files are divided into three main categories, primary
database files, secondary database files, and transaction
log files. Let’s look at each one at a time:
2.1. Primary database files: The
primary database file contains the start-up information
for the database and is used to store the data. A database
can have only one primary database file and no two databases
can share the same primary database file. The primary database
file has a .mdf extension.
2.2. Secondary database files:
Secondary database files are used to store all data that
does not fit in the primary database file. A database should
not include a secondary database file if the primary database
file can hold all of the data. Some databases might need
one or more secondary data files because of their size or
use secondary files on separate disk drives to spread data
across multiple disks. A secondary database file has a .ndf
extension.
2.3. Transaction log files: Every
database usually contains one primary database file, as
we have mentioned before, and one transaction log file.
The transaction log files are used to hold the log information
used to recover the database. The transaction log is a serial
record of all modifications that have occurred in the database
as well as the transaction that performed each modification.
The transaction log records the start of each transaction.
It records the changes to the data and enough information
to undo the modifications (if necessary later) made during
each transaction. Data and transaction log information is
never mixed up on the same file. Every database must include
at least one transaction log file. Transaction log files
have a .ldf extension.
3. File Groups
A filegroup in SQL Server is a named collection
of one or more files the forms a single unit for data allocation
or administrative purposes. For example, two files can be
created on two separate disks (i.e. f1 and f2)
and assigned to the filegroup filegroup1. A table
(i.e. Orders) can then be created in the filegroup filegroup1.
As data is written to the filegroup, SQL Server writes an
amount proportional to the free space in the file to each
file within the filegroup, rather than writing all the data
to the first file until full, and then writing to the next
file. Let’s consider an example where the data that
is being written on the filegroup is 60MB. If file f1
has 100 megabytes (MB) free and file f2 has 200 MB
free, one extent (i.e. 20MB) is allocated from file f1,
and two extents (i.e. 40 MB) from file f2. This way
both files become full at about the same time. When the files
become full, if automatic growth is on, SQL Sever
expands the files in a round-robin fashion. For example, if
both files become full, SQL sever expands the first file and
starts writing data to it. When the first file becomes full,
SQL server expands the second file and starts writing data
to it. When the second file becomes full, SQL sever expands
the first file and so on. This way both disks become full
at about the same time.
4. Overview of T-SQL and SQL Query
Analyzer
Before we move on to real examples of creating
databases and database objects, it is necessary that we get
a brief understanding of what T-SQL and the SQL Query Analyzer
are. Let’s look at each one at a time.
4.1. T-SQL: Just like how
MS DOS commands are used to send commands to the computer,
Transact-SQL is the language used to send commands to SQL
server from all applications. Transact-SQL contains statements
that support all administrative work done in SQL Server. These
statements fall into two main categories:
4.1.1. Data Definition Language (DDL):
which is used to define and manage all the objects in an
SQL database. We’ll focus on DDL in this section when
we talk about creating databases and database objects.
4.1.2. Data Manipulation Language (DML): which
is used to select, insert, update, and delete data in the
objects defined using DDL. We’ll focus on DML in the
upcoming TechNotes when we talk about queries, data manipulation,
and data retrieval. More detailed information about Transact-SQL
can be found here.
4.2. SQL Query Analyzer:
is a graphical application that is used for the ad hoc, interactive
execution of Transact-SQL statements and scripts. In SQL Query
Analyzer, users enter Transact-SQL statements in a full-text
window, execute the statements, and view the results in a
results window. Users also can open a text file containing
Transact-SQL statements, execute the statements, and view
the results in the results window. More detailed information
about SQL Sever Query Analyzer can be found here.
5. Creating and Altering a Database
5.1 Creating a Database
Creating a database in SQL server is a very
simple process if the rules are followed accordingly. The
CREATE DATABASE statement is used to create a new database
in SQL server. Although the syntax for the CREATE DATABASE
statement is very long and complex, we will only focus on
the important parts of the statement for now. The following
figure explains the syntax of the CREATE DATABASE statement.

1. The CREATE DATABASE statement followed
by the new database name is the first and most important clause
in creating a database. Database names must be unique within
a server and conform to the rules for identifiers. Some important
identifier rules to keep in mind are:
a. The first character must be a letter
and should not be an at ‘@’ sign or a number
‘#’ sign.
b. The subsequent characters can be letters or numbers.
c. The database name must not be a Transact-SQL reserved
word. SQL Server reserves both the uppercase and lowercase
versions of reserved words.
2. The second clause in Figure 2.1 specifies
the primary data file for the database. The ON keyword is
followed by a comma-separated list of items defining the data
files for the primary filegroup. The parameters for the primary
filegroup are:
a. Name: specifies the logical
filename which is the name used to reference the file in
any Transact-SQL statements executed after the database
is created. The logical file name must be unique
in the database and conform to the rules for identifiers.
SQL Server creates a default logical file name if no logical
file name is defined.
b. Filename: specifies the physical operating-system
file name. This name should include the path (i.e. drive
and directory) followed by the name of the data file. The
primary data file’s extension is .mdf.
c. Size: is the initial size of the file defined
in the FILENAME. The kilobyte (KB), megabyte (MB), gigabyte
(GB), or terabyte (TB) suffixes can be used. The default
is MB. Specify a whole number; do not include a decimal.
The minimum value for size is 512 KB. If size is not specified,
the default is 1 MB.
d. Maxsize: is the maximum size to which the file
defined in the filename can grow. The kilobyte (KB), megabyte
(MB), gigabyte (GB), or terabyte (TB) suffixes can be used.
The default is MB. Specify a whole number; do not include
a decimal. If maxsize is not specified, the file grows until
the disk is full.
e. Filegrowth: is the amount of space added to
the file each time new space is needed. Specify a whole
number; do not include a decimal. A value of 0 indicates
no growth. The value can be specified in MB, KB, GB, TB,
or percent (%). If a number is specified without an MB,
KB, or % suffix, the default is MB. When % is specified,
the growth increment size is the specified percentage of
the size of the file at the time the increment occurs. If
filegrowth is not specified, the default value is 10 percent
and the minimum value is 64 KB.
3. The third clause in Figure 2.1 specifies
the transaction log file for the database. The LOG ON keyword
is followed by a comma-separated list of items defining the
data files for the transaction log. The parameters for the
transaction log are the same as the primary data files and
have been outlined above.
Once we understand the most important aspects
of the CREATE DATABASE statement, we are ready to create a
real world database. The following is a step-by-step guide
in creating the Sales Database with a SIZE of 50MB, a MAXSIZE
of 200MB, and a FILEGROWTH of 10MB. The example also includes
the creation of the transaction log file with a SIZE of 5MB,
a MAXSIZE of 25MB, and a FILEGROWTH of 5MB.
1. Open SQL Query Analyzer (Query Analyzer
is found in the SQL Server folder in the programs menu)
and start a new connection.
2. In the Query Textbox, type the following:
CREATE DATABASE Sales
ON (
NAME = 'Sales_dat',
FILENAME = 'C:\Program Files\Microsoft SQL Server\MSSQL\Data\Sales_dat.mdf',
SIZE = 50MB,
MAXSIZE = 200MB,
FILEGROWTH = 10MB)
LOG ON (
NAME = 'Sales_log',
FILENAME = 'C:\Program Files\Microsoft SQL Server\MSSQL\Data\Sales_log.ldf',
SIZE = 5MB,
MAXSIZE = 20MB,
FILEGROWTH = 5MB)
Note: the path for the file names might
not be the same as your computer’s SQL Server Data
directory path.
3. Press F5
5.2 Viewing database information
The sp_helpdb command followed by
the database name will show you all the information about
the database. In this example, type sp_helpdb
sales to view the sales database information.
5.3 Altering a database
The ALTER DATABASE statement, as in figure
2.2, is divided into two parts; the statement and the parameters.
The five parameters we will look into are only five of the
ten parameters in the ALTER DATABASE statement. These five
are the most important and with them you can add a file, modify
a file’s settings, remove a file, modify the database’s
name, and set options for the database. We will look into
each one at a time.
1) Add File: data and transaction
log files can be added to expand a database or deleted to
shrink a database. When a file is added, the file is available
immediately for use by the database. The file specifications
are the same as the file specifications we explained
in the creating a database section. The following example
will add a new data file to the sales database. This example
requires that the sales database be created, if you have not
created the sales database, do so following the step-by-step
procedure for creating the sales database in the create a
database section.
a) Open SQL Query Analyzer
b) Type the following in to the query text box:
ALTER DATABASE Sales
ADD FILE (
NAME= 'Sales_dat2',
FILENAME = 'C:\Program Files\Microsoft SQL Server\MSSQL\Data\Sales_dat2.ndf',
SIZE = 15MB,
MAXSIZE = 30MB,
FILEGROWTH = 5MB)
c) Press F5
2) Modify File: once files are added into a database
and changes are required, you either have the choice of deleting
the files or modifying them. Since requirements such as expanding
a data file or shrinking it otherwise requires that the data
remain intact, the modify file statement is the best choice.
The following two example demonstrate the use of the MODIFY
FILE clause.
Example One (Increasing the MAXSIZE of a data file):
a) Open SQL Query Analyzer
b) Type the following in the Query Text Box
ALTER DATABASE Sales
MODIFY FILE (
NAME= 'Sales_dat2',
MAXSIZE = 50MB)
c) Press F5
Example Two (Changing the logical name of
a data file):
a) Open SQL Query Analyzer
b) Type the following in the Query Text Box
ALTER DATABASE Sales
MODIFY FILE (NAME= 'Sales_dat2',NEWNAME=
'Sales2_dat')
c) Press F5
3) Remove File: removes the file description from
the database system tables and deletes the physical file.
To remove a file from a database, use the REMOVE FILE clause
followed by the file’s logical name. Example:
ALTER DATABASE Sales
REMOVE FILE ‘Sales2_dat’
4) Modify Database name: this clause
is used to rename the database itself. To rename the database,
use the MODIFY NAME = clause followed by the new database
name. Example:
ALTER DATABASE Sales
MODIFY NAME = New_Sales
6. Creating & Designing Tables
In previous TechNotes, we looked into what
tables are, how they store data, and formal rules used to
implement normalization. In this section, we’ll talk
about creating tables and other important table design aspects.
As we’ve discussed before, tables are made up of rows
and columns. A table that stores information about customers
would have one row per customer and one or more columns that
store the customer’s attributes or details. Every column
has a name and a data type. A data type defines data allowed
on each column. For example, if you want a column to contain
only numbers you can assigned a numeric data type. Similarly,
you can define a date data type on a column you want to store
date and time information only. Assigning data types with
care and understanding how they work, are the basics of database
design.
6.1 Data Types
Data types are divided into two main categories;
System Data Types and User-defined Data Types.
In this section, we will focus on System Data Types only.
There are many types of System Data Types, but for simplicity
reasons, we’ll look into the most important ones; Character
Data, Numeric Data, Monetary Data,
and Date and Time Data. Let’s take
each one at a time.
1. Character Data: Character data
consists of any combination of letters, symbols, and numeric
characters. For example, valid character data includes "123",
"TechExams", and "()/$%&".
To specify the size of the data type, write the name of the
data type (i.e. char or varchar)
followed by the size in brackets “()” (i.e. char(10),
varchar(20)). Character constants
must be enclosed in single quotation marks (') or double quotation
marks ("). Enclosing a character constant in single quotation
marks is recommended. If the data to be stored is longer than
the number of characters allowed, the data is truncated. For
example, if a column is defined as char(10)
and the value "This is a really long character string"
is stored into the column, SQL Server truncates the character
string to "This is a". There are two main types
of Character Data Types; char and
varchar.
1.1. The char data type: The char
data type is a fixed-length data type when the NOT NULL
constraint is specified. If a value shorter than the length
of the column is inserted into a char
NOT NULL column, the value is right-padded with blanks to
the size of the column. For example, if a column is defined
as char(10) NOT NULL and the data
to be stored is "music", SQL Server stores this
data as "music_____", where "_" indicates
a blank. Use char when the data
values in a column are expected to be consistently close
to the same size.
1.2. The varchar data type: The
varchar data type is a variable-length
data type. Values shorter than the size of the column are
not right-padded to the size of the column. Use varchar
when the data values in a column are expected to vary considerably
in size.
Char and varchar
data types have a maximum size of 8KB and store data composed
of:
- Upper or Lower Case characters such as A, b, or Ab.
- Numerals such as 1, 2, or 3.
- Special characters such as the "at" sign (@),
ampersand (&), and exclamation point (!).
2. Numeric Data: Numeric data consists
of numbers only. Numeric data includes positive and negative
numbers, decimal and fractional numbers, and whole numbers
(integers). Numeric Data is divided into three categories;
Integer Data, Decimal Data, and Approximate Numeric Data.
Let’s look at each one at a time.
2.1. Integer Data: Integer data
consists of negative or positive whole numbers, such as
-10, 0, 5, and 1595. In SQL Sever, integer data is stored
using the bigint, int,
smallint, and tinyint
data types. The bigint data type
can store a larger range of numbers than the int
data type. The int data type can
store a larger range of integers than smallint,
which can store a larger range of numbers than tinyint.
- Use the bigint data type to
store numbers in the range from -2^63 (-9223372036854775808)
through 2^63-1 (9223372036854775807). Storage size is 8
bytes.
- Use the int data type to store
numbers in the range from -2,147,483,648 through 2,147,483,647
only (requires 4 bytes of storage per value).
- Use the smallint data type to
store numbers in the range from -32,768 through 32,767 only
(requires 2 bytes of storage per value).
- Use the tinyint data type to
store numbers in the range from 0 through 255 only (requires
1 byte of storage per value).
2.2. Decimal Data: Decimal data
consists of data that is stored to the least significant
digit. In SQL Sever, decimal data is stored using decimal
or numeric data types. The number
of bytes required to store a decimal
or numeric value depends on the
total number of integer digits to the left of the decimal
point and the total number of decimal digits to the right
of the decimal point. For example, more bytes are required
to store the value 19283.29383 than to store the value 1.1.
In SQL Server, the numeric data
type is equivalent to the decimal
data type.
2.3. Approximate Numeric Data:
Approximate numeric (floating-point) data consists of data
preserved as accurately as the binary numbering system can
offer. Approximate numeric data is stored using the float
and real data types in SQL Server.
For example, because the fraction one-third in decimal notation
is .333333 (repeating), this value cannot be represented
precisely using approximate decimal data. Therefore, the
value retrieved from SQL Server may not be exactly what
was stored originally in the column. Additional examples
of numeric approximations are floating-point values ending
in .3, .6, and .7.
3. Monetery Data: Monetary data
represents positive or negative amounts of money. In SQL Sever,
monetary data is stored using the money and smallmoney data
types. Monetary data can be stored to an accuracy of four
decimal places.
- Use the money data type to store
values in the range from -922,337,203,685,477.5808 through
+922,337,203,685,477.5807 (requires 8 bytes to store a value).
- Use the smallmoney data type to
store values in the range from -214,748.3648 through 214,748.3647
(requires 4 bytes to store a value). If a greater number of
decimal places are required, use the decimal
data type instead.
4. Date and Time Data: Date and
time data consists of valid date or time combinations. For
example, valid date and time data includes both "08/01/2003
9:30:00:00:00 AM" and "1:28:29:15:01 AM 8/01/99".
In SQL Sever, date and time data is stored using the datetime
and smalldatetime data types.
- Use datetime to store dates in
the range from January 1, 1753 through December 31, 9999 (requires
8 bytes of storage per value).
- Use smalldatetime to store dates
in the range from January 1, 1900 through June 6, 2079 (requires
4 bytes of storage per value).
6.2 Creating Tables
After you have designed the tables in a database, you are
now ready to create the tables. A table can have a maximum
of 1,204 columns. Table and column names must follow the rules
for identifiers; they must be unique within a given table,
but you can use the same column name in different tables in
the same database. You must also define a data type for each
column.

The CREATE TABLE statement, as in Figure 2.3, is divided into
two parts; the column definitions, and the table constraint
definitions. The full syntax of the CREATE TABLE statement
is much longer than the one we currently see in Figure 2.3,
but for simplicity reasons, we’ll focus only on the
important parts of the statement which are shown in figure
2.3. When you create a table, you define the table’s
name, its columns, their data types, and if required, their
constraints.
6.2.1 Table Columns
When choosing column names in a table, it’s best to
choose names that are both descriptive and short. Since you’re
not usually the only person who’ll deal with a database
design, it’s important that others be able to easily
discern a columns purpose from its name. For example, a column
that stores a customer’s name should be named CustomerName
rather than NameCustomer, HisName, or CustomerIdentity. A
column should also have a data type as we’ve discussed
previously. It is also recommended that columns should be
assigned data types that best describe their purpose. For
example, a column that stores a customer’s age should
be a numeric data type instead of a character data type. Although
character data types can store numeric data, assigning a character
data type to an age column might cause difficulties when querying
the column and using its value in numeric calculations. The
following example shows the creation of a simple customers
table with only four columns.
CREATE TABLE Customers
(FirstName varchar(30),
MiddleName varchar(10),
LastName varchar(25),
Age int)
6.2.2 Table Constraints
Table Constraints are divided into two main categories; column
level constraints, and table level constraints. A column level
constraint is specified as part of a column definition and
applies only to that column. A table level constraint is declared
independently from a column definition and can apply to more
than one column in a table. For example, a customer is uniquely
identified by his/her first name and last name. No two customers
should have the same first name and last name in the database.
You want to define a primary key on both columns.
This is not possible if you define the constraint at column
level because column level constraints can reference only
one column and defining two primary keys on one table is not
allowed. In order to achieve this goal, you have to specify
a primary key constraint at table level and include
both columns in its expression. All constraints defined in
a table have a name. SQL Sever automatically generates a unique
name if no name is provided. To simplify things, constraint
definitions can be divided into two categories; simple definition
and full definition. The simple definition doesn’t contain
the name of the constraint whereas the full definition does.
Following are examples of simple and full constraint definitions.

It’s a good idea to define constraints using the full
definition because knowing the names of your constraints make
it easier to alter information later on. The following are
example showing the creation of different types of constraints
at different definition levels.
Example One: Column Level Constraints.
CREATE TABLE Customers
(FirstName varchar(30),
MiddleName varchar(10),
LastName varchar(25) PRIMARY KEY)
Example Two: Table Level
Constraints
CREATE TABLE Customers
(FirstName varchar(30),
MiddleName varchar(10),
LastName varchar(25),
CONSTRAINT PK_CustomerName PRIMARY KEY
(FirstName,LastName))
Example Three:
Creating a DEFAULT constraint in a table. Suppose your company
is based in Toronto, Canada. Almost 90% of your customers
live in Canada. Only 10% live in the United States. You don’t
want your customer service rep’s to enter the value
“Canada” again and again. Since most of your customers
are from Canada, customer service rep’s won’t
have to enter “CA” repetitively if you define
a DEFAULT constraint on the Country column and use “CA”
as the default value. Following is an example of creating
a DEFAULT constraint.
CREATE TABLE Customers
(FirstName varchar(30),
MiddleName varchar(10),
LastName varchar(25),
City varchar(10),
Country varchar(2) DEFAULT 'CA',
CONSTRAINT PK_CustomerName PRIMARY KEY (FirstName,LastName))
Example Four: Creating
a CHECK constraint. Suppose your company serves customers
from Canada and the United States only. You want your customer
service rep’s to enter either of the two values only.
By defining a CHECK constraint on the country column, you
can limit the values that are entered in the country column
to either CA or US. Following is an example of creating a
CHECK constraint. Notice how the two constraints (i.e. DEFAULT
and CHECK) are followed by each other without a comma (,)
in between. If you define more than one constraint on a column
at column level, don’t separate the constraints with
a comma.
CREATE TABLE Customers
(FirstName varchar(30),
MiddleName varchar(10),
LastName varchar(25),
City varchar(10),
Country varchar(10) DEFAULT 'CA' CHECK (Country='CA' or Country='US'),
CONSTRAINT PK_CustomerName PRIMARY KEY
(FirstName,LastName))
Example Five: Creating
a UNIQUE constraint. Suppose your customers table includes
a social security number for every customer. No two customers
should have the same SSN in the table. A PRIMARY KEY would
be the right choice if the rows were uniquely identified by
the SSN. But since the SSN is only an attribute like the customer’s
and rows within the customer table are uniquely identified
by the first name and last name column, defining a UNIQUE
constraint on the SSN column would be the right choice. Following
is an example of a UNIQUE constraint being defined on the
SSN column.
CREATE TABLE Customers
(FirstName varchar(30),
MiddleName varchar(10),
LastName varchar(25),
City varchar(10),
Country varchar(10) DEFAULT 'CA' CHECK (Country='CA' or Country='US'),
SSN int UNIQUE,
CONSTRAINT PK_CustomerName PRIMARY KEY
(FirstName,LastName))
Example Six: Creating a
NULL/NOT NULL constraint. Customers usally have many means
of contact; phone, fax or email are several possible methods.
Your company uses the person’s home phone number as
their permanent contact. If a customer service rep leaves
out the home phone number, contact with a customer might be
lost since all customers don’t have a fax or email.
To ensure that a number is entered in the Phone column, define
a NOT NULL constraint on the phone column. Columns defined
as a PRIMARY KEY are always set to NOT NULL. Since a table
can’t have more than one PRIMARY KEY, using the NOT
NULL constraint itself will produce the required results.
Following is an example of a customers table that has the
Phone column set to NOT NULL.
CREATE TABLE Customers
(FirstName varchar(30),
MiddleName varchar(10),
LastName varchar(25),
City varchar(10),
Country varchar(10) DEFAULT 'CA' CHECK (Country='CA' or Country='US'),
SSN int UNIQUE,
Phone int NOT NULL,
Fax int,
Email varchar(40),
CONSTRAINT PK_CustomerName PRIMARY KEY
(FirstName,LastName))
6.2.3 Viewing Table Information
In Query Analyzer, type exec sp_help followed by
the table name to retrieve system information about the table.
Information you’ll see includes the columns in the table,
their data types and nullability, constraints, both column
level and table level, and other important information.
6.3 Altering a Table
Once the creation of a table is complete, there is always
a possibility that something needs to be changed, either because
of improvement or elimination of design bugs. This is made
possible by the ALTER TABLE T-SQL statement. The ALTER TABLE
T-SQL statement allows you to change existing column definitions,
add new columns, and drop existing columns and/or constraints.
Let’s take a look at the statement’s syntax.

When using the ALTER TABLE statement, altering, adding, and
dropping actions are mutually exclusive. This means that you
can not do more than one type of action within every statement
execution. The first part of the alter table statement is
to specify the name of table to be altered. Let’s take
a look at the three other parts, one at a time.
1. Alter: altering a column is very easy. You just specify
the column’s name and the new data type you want assigned
to it. SQL Sever will refuse altering a column if:
a. The column is part of a PRIMARY KEY or FOREIGN key definition.
Remember we talked about table integrity and how columns that
are being depended on by other columns should not be deleted
without removing the dependency first.
b. The column is used in a CHECK or UNIQUE constraint. The
only exception in such a case is altering the length of a
variable-length column such as varchar.
2. Add: to add a new column to a table, specify the new column’s
name followed by its data type.
3. Drop: to drop a column, use the DROP COLUMN part of the
syntax followed by the column’s name. To drop a constraint,
use the DROP CONSTRAINT part of the syntax followed by the
constraint name.
|