SQL Notes
- SQL: Structured Query Language
- many variations exist
- mysql is a public domain version and one that we have running
on the EECS servers
- SQL is an example of a relational database
- Relational databases are comprised of a collection of tables
- Each table is comprised of a collection of named columns
- Each column has a fixed type associated with it. Some examples
include:
- int(size)
- decimal(size, d): size is the maximum number of digits
and d is the number of decimal digits
- char(size): fixed length string--must have this number
of characters
- varchar(size): variable length string--there are other data
types that allow even larger size strings. Different
implementations will specify different maximum sizes for
varchar.
- date(yyymmdd): a date
- Databases tradeoff speed for flexibility: Although the specifics
differ, databases typically assume that each row will have the
same set of columns and hence the data does not need to be tagged
with markup elements. XML is more flexible in that each "row"
can define different data but the data is identifiable to a
processor via tags. However, this requires that an XML file contain
not just data but also data definition commands. The data definition
commands (i.e., markup) adds additional storage cost to a file and
also makes it more difficult to optimize queries because the
file structure is not completely known in advance.
- Data Manipulation Language (DML)-Executes queries against a database
- Select: extract data
- Update: update a column in a row
- Delete: delete a row from a table
- Insert into: insert a new row into a table
- Data Definition Language (DDL): Creates the structure of a table
and deletes tables
- Operation types
- Create Table: create a table
- Alter Table: add or delete a column from an existing table
- Drop Table: delete a table
- Create Index: creates a secondary index
- Drop Index: deletes an index
- Indices and Keys
- primary key: A column which is guaranteed to have a
unique value. When creating a table the designer
can denote exactly one column as being the primary
key
- clustering index: A clustering index is one in which
the records (rows) of a table are stored in the leaves
of the index (typically either a b-tree or extendible
hash table is used).
- The table is said to be physically
sorted on this column
- There can be only one clustering index per table
- A clustering index is useful for range queries
because several records can be usually stored
in a single block, thus reducing the required
number of block accesses.
- non-clustering index: An index consisting only of the
keys in a column. It allows for fast random access
on a single key but cannot be used to facilitate
fast range queries.