Data Modeling (slightly different definition than book)
  Data Model -- A high level description of
    1) a data's entities
    2) the entities' properties
    3) the relationships among entities
    4) the integrity constraints on the data

  Entity: An object we are trying to model
    Formally: an artifact, either physical or 
	virtual, about which we want to maintain information

    Physical objects: Staff member, a branch office
    Virtual objects: contract, invoice

  Relationships: The way in which objects interact
    1) A staff member belongs to only one branch
    2) A branch manager may manage multiple staff members
    3) A customer leases a property from an owner (this is a tertiary
       relationship involving three entities--a customer, a property, and
       an owner).

  Integrity Constraints: A restriction on properties or relationships
    1) No one makes a salary more than $100k (property restriction)
    2) No employee manages more than 30 properties (relationship restriction)

Database Model: A set of rules that define
  1) the structural part of the data (DDL)
  2) the query language that manipulates the data
	(insert, update, delete, find)
  3) constraints on the data

  Once the designer has developed the data model, the designer must decide
        which database model to use (relational, network, hierachical, OO, etc)

Layers of a Database (diagram)

View1   View2 ...  View n
       conceptual view (data model)
       internal view (DDL)
       physical implementation (files)

    User = views
    DB modeler = conceptual view
    DB administrator = internal view
    Vendor (oracle, microsoft) = physical implementation

Creating the Data Model 

    We will use an example involving a property leasing firm named
	Dream Home. Dream Home manages properties. It has a stable
	of owners, each of whom owns one or more properties. It has
	staff that manage properties and show them to clients. For
	each rented property it has a lease. For each viewing, the
	client may optionally fill out a form listing their comments.
   	Dream Homes has a number of branches.

  1.  Identify entities: an artifact, either physical or 
	virtual, about which we want to maintain information
        a) forms (contract, prescription, invoice): good for identifying
	   entities and properties
        b) interviews: good for identifying integrity constraints and
	c) observation: good for identifying entities

  2.  Associate properties with entities: most fields in a form map to a
	e.g., branch has street, city, zip code, manager
	      lease has client name, monthly rent, deposit paid, start
		date, end date, duration, property address
  3. Identify keys that uniquely identify entities (called primary keys): 

	initially identify natural keys. natural keys involve only attributes
	   that are "innate" to an object
	e.g., street and zip code uniquely identify branch 
	      first/middle/last name probably uniquely identifies person

	1) both of the above natural keys may not uniquely identify either
		the branch or the person (two cities in same zip code might
		have same street or two people might have same name).
	2) natural keys may be verbose and since we duplicate keys in the
		database, we want them to be compact.

	resolution: create artificial key, such as ownerno, leaseno, 
		branchno. Artificial keys guarantee entity uniqueness and
		are compact.

   4. Identify relationships among entities

	1-1: one manager per branch
	n-1: many staff per branch
	n-m: properties in newspapers (1 newspaper may advertise multiple
		properties and 1 property may be advertised in multiple

	     viewing: 1 client may view multiple properties and 1 property
		may be viewed by multiple clients

    5. Relationships help determine

	a) integrity constraints

	    e.g., once we know that staff manage properties, we may decide
		to limit this number to 30

	b) possible queries: queries typically follow the "paths" formed
		by relationships (e.g., list the staff in branch x)

	c) foreign keys: a property which is a primary key in another entity
		(serves as a pointer to that entity and represents a

		e.g., a staff member may have a manager property which is
		      the manager's staffno

		      a property may have a realtor property which is the
			realtor's staffno

Independence: Go back to view-conceptual view-internal view diagram

    1) Physical Independence: the structure of the data is not
	hard coded into the software (draw a line between internal view
	and conceptual view)
	a) means we don't have to change the query software if we change physical
		implementation (e.g., change from using a tree to a hash
		table to store some information)

    2) Logical Independence: Views should not change when the
	underlying entities change unless directly affected by the change
	(draw a line between views and conceptual view)

	a) means we don't have to destroy cache information for views when
	   the underlying entity changes

Next time:

    Internal View: The relational model (its theoretical underpinnings)
    Then User View: SQL first, then its theoretical underpinnings
	as relational algebra/calculus
    Then Conceptual View (normalization and entity-relationship diagrams)
    Finally physical implementation