CS465 Midterm -- Spring 2021

  1. This exam is open-note, open-book.
  2. You must answer all of the questions.
  3. The final question requires that you draw a B+-tree. You will take a picture of your drawing and upload it immediately after you exit the exam. You will have until 1:10pm to complete your upload. The link is called "Midterm-Upload". If you decide to draw a picture of your ER diagram then you will also need to take a picture of this drawing and upload it as well.

  1. (Queries) Write the following queries in the requested language

    1. (SQL) Print the patient name, date and time of appointment, groupNo, and clinic address of where the shot will be administered for all patients who have appointments to get a shot between 2021-03-01 and 2021-03-10 inclusive.

      		 SELECT p.name, a.date, a.time p.groupNo, c.address
      		  FROM Appointments a NATURAL JOIN Patient p NATURAL JOIN Clinic c
      		  WHERE a.day BETWEEN '2021-03-01' AND '2021-03-10';
      		

    2. (SQL) Print all patients who have been administered a dose of the "Moderna" vaccine (that is the vaccine's name) between 2021-01-16 and 2021-02-16 inclusive. Print the patient name, date of administration, and dosage.

      SELECT p.name, m.date, v.dosage
         FROM Patient p NATURAL JOIN MedicalRecord m NATURAL JOIN Vaccine v 
         WHERE (v.name = "Moderna") AND (m.date BETWEEN '2021-01-16' AND '2021-02-16);
      

    3. (SQL) Print the names of all staff members who were at the "EZ Clinic" on either 2021-03-10, 2021-04-05, or 2021-05-04 (if the staff member was there on any one of those days, the staff member's name should be included in the result).
      
      SELECT s.Name 
      FROM Staff s NATURAL JOIN Staffing st NATURAL JOIN Clinic c
      WHERE c.Name = 'EZ Clinic' AND st.Date IN ('2021-03-10', '2021=04-05', '2021-05-04');
      		    
    4. (SQL) Print all clinics that have more than 10 staff on 2021-03-05. For each clinic you only need to print the clinicNo and number of staff. Name the number of staff "staffCount".
      
      Select clinicNo, count(*) as staffCount
      FROM Staffing
      WHERE Date = '2021-03-05'
      GROUP BY clinicNo
      HAVING staffCount > 10;
      		    
    5. (SQL) Print all clinics that have a staff count that is greater than 10 on 2021-03-05. For each clinic you only need to print the clinicNo and number of staff. Name number of staff "staffCount".
       SELECT clinicNo, COUNT(*) as staffCount
      		      FROM Staffing
      		      WHERE day = '2021-03-05'
      		      GROUP BY ClinicNo
      		      HAVING staffCount > 10;
      		    

    6. (SQL) Print all groups that have more than 20 patients who were born before 2000-01-01. For each group you only need to print the groupNo and number of patients whose date of birth is before 2000-01-01. Name the number of patients "groupCount".
      
      SELECT groupNo, count(*) as groupCount			  
      FROM Patient
      WHERE DateOfBirth < '2000-01-01'
      GROUP BY groupNo
      HAVING groupCount > 20;		    
      

    7. (Relational Calculus--CS465 students) Print the name, address, groupNo, group description, and the group start date for the patient named 'Brad Vander Zanden'.

       { P.name, P.address, P.groupNo, G.description, G.startDate | Patient(P) ∧ Group(G) ∧ P.groupNo = G.groupNo ∧ P.name = 'Brad Vander Zanden'}

    8. (Relational Calculus--CS465 students) Print the staff members who have administered a vaccine to 'Brad Vander Zanden' and the vaccine that was administered by that staff member. Print only the staffNo and the vaccineNo.
      { M.staffNo, M.vaccineNo | MedicalRecord(M) ∧ (∃P)(Patient(P) ∧ (P.patientNo = M.patientNo) ∧ (P.name = 'Brad Vander Zanden')}
      			
    9. (Relational Calculus--CS465 students only) Print the days on which 'Jerry Mouse' was a staff member at clinicNo 3.
      { ST.date | Staffing(ST) ∧ ST.clinicNo = 3 ∧ (∃S)(Staff(S) ∧ (S.staffNo = ST.staffNo) ∧ S.name = 'Jerry Mouse')}
      			    

    10. (Relational Calculus--CS565 students) Print the patient name, date, and the name of the vaccine of every patient who has been administered a shot by 'Wily Coyote'.
       { P.name, M.date, V.name | Patient(P) ∧ MedicalRecord(M) ∧ Vaccine(V) ∧ P.patientNo = M.PatientNo ∧ M.vaccineNo = V.vaccineNo ∧ ∃(S)(Staff(S) ∧ M.administerer = S.staffNo and S.name = 'Wily Coyote') }
    11. (Relational Calculus--CS565 students) Print all patients who have been administered a dose of the "Moderna" vaccine (that is the vaccine's name) between 2021-01-16 and 2021-02-16 inclusive. Print the patient name, date of administration, and dosage.
      { P.name, M.date, V.dosage | Patient(P) ∧ MedicalRecord(M) ∧ Vaccine(V) ∧ P.patientNo = M.patientNo ∧ M.vaccineNo = V.vaccineNo ∧ V.name = 'Moderna' ∧ M.date ≥ 2021-01-16 ∧ M.date ≤ 2021-02-16 }
      			    
    12. (Relational Calculus--CS565 students only) Print the names of the clinics and the days on which 'Bugs Bunny' was a staff member at that clinic. The result should have one row for each unique clinic name/date.
      
      { C.name, ST.date | Clinic(C) ∧ Staffing(ST) ∧ C.clinicNo = ST.clinicNo ∧ (∃S)(Staff(S) ∧ S.staffNo = ST.staffNo ∧ S.name = 'Bugs Bunny')}
      				

  2. (Physical Design) Which of the following attribute in the Appointment database would be the best candidate for a cluster index?

    1. Appointment(PatientNo)
    2. Appointment(Date)
    3. Group(Description)
    4. Staffing(StaffNo)
    5. Patient(Address)

  3. (Physical Design) Which of the following relations should not have any indices associated with them?

    1. Patient
    2. Clinic
    3. Staff
    4. Appointment
    5. Vaccine
    6. MedicalRecord
    7. Group

  4. (ER Diagrams) A university database has three entities named students, courses, and teachers with the following relationships:
                 take              teach
    Students------------>Courses<-------------Instructors	  
          (a)          (b)      (c)         (d)
    	       
    You may assume that:
    1. Students take a minimum of 0 and a maximum of 7 courses
    2. Courses enroll a minimum of 10 and a maximum of 50 students
    3. An instructur teaches a minimum of 1 and a maximum of 4 courses
    4. A course is taught by exactly 1 instructor

    For each of the above letters a-d, enter the multiplicity for that entity in the relationship (e.g., (a) represents the multiplicity of students that may occur in the Students take Courses relationship).

    1. 10..50
    2. 0..7
    3. 1..4
    4. 1..1

  5. (Normalization) You are constructing a database for an online order company. A customer places an order for one or more products. A service representative fulfills this order by finding the appropriate warehouse for each product, bundling the products if necessary, and sending them to the customer's address. Since different warehouses may be used to fulfill the order, some products may arrive on different dates. A customer may place multiple orders on the same day and the same service representative will handle all orders for that day. Based on the above information, you have derived the following relation and functional dependencies:

    Order(customerNo, customerName, city, street, zipcode, dateOrdered, timeOrdered, [productNo, productName, description, pricePerItem, quantity, dateDelivered, warehouseNo, warehouseName, warehouseAddress, quantityAvailable]*, serviceRepNo, serviceRepName, totalOrderCost)

    Functional dependencies
    	  customerNo --> customerName, city, street, zipcode
    	  productNo --> productName, description, pricePerItem
    	  warehouseNo --> warehouseName, warehouseAddress
    	  warehouseNo, productNo --> quantityAvailable
    	  serviceRepNo --> serviceRepName
    	  customerNo, dateOrdered --> serviceRepNo
    	  customerNo, dateOrdered, timeOrdered --> totalOrderCost, [productNo, quantity, dateDelivered, warehouseNo]*
    	
    The primary key is (customerNo, dateOrdered, timeOrdered)

    Answer the following questions.

    1. This table is susceptible to various types of anomalies that can be eliminated using normalization. For each of the following items, indicate if it is or is not an update anomaly and if it is an anomaly, then indicate whether it is an insert, delete, or modification anomaly. If changing/deleting/inserting something can cause an inconsistency that cannot be eliminated via normalization, then it is not an anomaly.

      1. If you delete the last order associated with a product, you lose all information associated with that product.
      2. You cannot delete a warehouse unless you have deleted all products associated with that warehouse.
      3. If you change a customer's number, then you must change every instance of that customer's number in the database or you will have an inconsistency.
      4. You cannot enter a new warehouse unless it has an order associated with it.
      5. If you change a product's name you must change every instance of that product's name or you will have an inconsistency.
      6. If you change the day or time of an order, you must change every instance of that day or time or you will have an inconsistency.
      7. If you change a service representative's number you must change every instance of that number or you will have an inconsistency.

    2. Next to each of the following functional dependencies, indicate whether they are:
      1. Used to convert the relation from 1st to 2nd normal form
      2. Used to convert the relation from 2nd to 3rd normal form
      3. Not used in the decomposition because the dependency represents a candidate or primary key
      a  b  c	  customerNo --> customerName, city, street, zipcode
      a  b  c	  productNo --> productName, description, pricePerItem
      a  b  c	  warehouseNo --> warehouseName, warehouseAddress
      a  b  c	  warehouseNo, productNo --> quantityAvailable
      a  b  c	  serviceRepNo --> serviceRepName
      a  b  c	  customerNo, dateOrdered --> serviceRepNo
      a  b  c	  customerNo, dateOrdered, timeOrdered --> totalOrderCost, [productNo, quantity, dateDelivered, warehouseNo]*
      	    
    3. List the final 3rd normal form relations that you should derive. Use the notation R(a,b,c) where R is the name of the relation and a, b, and c are the names of the attributes. I do not care to see your 2nd Normal form relations.

      Note that TransactionDetails is a relation created to handle a multi-valued attribute and would be created when you convert the Order relation from unnormalized form to 1st normal form. I added orderNo to the Order relation to simplify the foreign key for the TransactionDetails relation as repeating the primary key for Order in TransactionDetails would be storage consuming.

      Customer(customerNo, customerName, city, street, zipcode)		    
      Product(productNo, productName, description, pricePerItem)
      Warehouse(warehouseNo, warehouseName, warehouseAddress)
      Inventory(warehouseNo, productNo, quantityAvailable)
      ServiceRep(serviceRepNo, serviceRepName)
      Orders(orderNo, customerNo, dateOrdered, timeOrdered, totalOrderCost)
      TransactionDetails(orderNo, productNo, quantity, dateDelivered, warehouseNo)
      OrderHandling(customerNo, dateOrdered, serviceRepNo)		    
      		    

  6. (ER Diagrams)
    1. List the entities in the Appointment database

      Patient, Clinic, Staff, Vaccine--If you listed Group as an entity I am okay with that. I view Group as a composite attribute, much like a date or an address, which you in this case factor out into a separate relation to avoid redundancy.

      Patient(PatientNo, Name, Address, DateOfBirth, GroupNo) Clinic(ClinicNo, Name, Address) Staff(StaffNo, Name) Staffing(ClinicNo, StaffNo, Date) Appointment(PatientNo, ClinicNo, Date, Time) Vaccine(VaccineNo, Name, Dosage, AgeFrom, AgeTo) MedicalRecord(PatientNo, VaccineNo, Date, Administerer (StaffNo)) Group(GroupNo, Description, StartDate)

    2. Draw each of the relationships that exist among entities in the Appointment database.
                  works at	         
      Staff -------------------> Clinic (from Staffing relation)
      
                  contains
      Group -------------------> Patient (only if you have Group as an entity)
      
      	    services
      Clinic ------------------> Patient (from Appointment)
      
      	    Staff
                    |
      Patient---Administers---Vaccine (from Medical Record)
      	    

  7. MongoDB The following Mongo collection represents a catalog of autos that we have for rent.
    1. {	
            "_id" : 100,
            "make" : "Ford",
            "model" : "Fusion",
            "year" : 2014,
            "options" : [ "engine start", "moon roof" ],
            "price" : 135,
            "location" : { "street" : "1414 Cherry Lane",
                           "zipcode" : 37996 }
          }
    2. {
            "_id" : 200,
            "make" : "Honda",
            "model" : "Accord",
            "year" : 2014,
            "options" : [ "spoiler", "alloy wheels", "sunroof" ],
            "price" : 169,
            "location" : { "street" : "518 Magnolia Ave",
                           "zipcode": 37916 }
          }
    3. {
            "_id" : 300,
            "make" : "Honda",
            "model" : "Civic",
            "year" : 2013,
            "options" : [ "spoiler", "engine start", "sunroof" ],
            "price" : 85,
            "location" : { "street" : "518 Magnolia Ave",
                           "zipcode": 37916 }
         }
    4. {
            "_id" : 400,
            "make" : "Ford",
            "model" : "Mustang",
            "year" : 2021,
            "options" : [ "turbocharger" ],
            "price" : 215,
            "location" : { "street" : "1414 Cherry Lane",
                           "zipcode": 37996 }
         }
    5. {
            "_id" : 500,
            "make" : "Ford",
            "model" : "Mustang",
            "year" : 2019,
            "options" : [ "turbocharger", "satellite radio", "moon roof" ],
            "price" : 155,
            "location" : { "street" : "110 Apple Orchard Circle",
                           "zipcode": 37996 }
             }

    For each of the following Mongo queries, circle the tuples that would be returned by the query (these are all or nothing questions so you must get all tuples correct in order to get full credit):

    1   2   3   4   5   : db.catalog.find({ "price" : { $gte : 100, $lte: 200 } })
    1   2   3   4   5   : db.catalog.find({ "location.zipcode" : 37996, options: { $all : ["turbocharger", "moon roof"] } } )
    1   2   3   4   5   : db.catalog.find({ $or: [ { year: 2013 }, { options: { $in: ["turbocharger", "engine start"] } } ] } )
    qty : { $gte : 15 } } } })
    
  8. (Normalization, Functional Dependencies) Here is a possible table for airline reservations and some sample data:

    FlightNoDateDeparture TimeDeparture AirportDeparture GateArrival TimeArrival AirportArrival GateCustomerNoCustomer NameSeatNoTicket PriceAircraftNoAircraft Model

    You should make the following assumptions:

    1. A reservation with the same flightNo always has the same departure and arrival airports, regardless of the day.
    2. Each day a flightNo might have a different departure time/gate and arrival time/gate, and a different plane flying the route
    3. A seat might have a different price on each day on which a flightNo occurs
    4. A flightNo only goes once each day
    5. Seats are never double booked on any flight and customers may not have more than one seat on any given flight. Of course it is okay for a seat to have different customers on different days and for a customer to have reservations on the same flight on two different days.
    6. A customer is allowed to have multiple reservations per day since they might require multiple flights to get to their destination.
    7. There may be many flights on any given day (but the flightNos must be unique for that day).

    List the functional dependencies for this table. These are the ones that I expected you to get:

    customerNo, flightNo, date ⟶ seatNo
    customerNo ⟶ customerName
    flightNo ⟶ departureAirport, arrivalAirport
    flightNo, Date ⟶ departureTime, departureGate, arrivalTime, arrivalGate, aircraftNo
    flightNo, Date, seatNo ⟶ ticketPrice, customerNo
    aircraftNo ⟶ aircraftModel
    		
    These are two other ones that I saw that I was fine with
    flightNo, Date, seatNo ⟶ ticketPrice (omits customerNo but customerNo might be NULL)
    customerNo, flightNo, Date ⟶ seatNo, ticketPrice (but if customerNo is NULL you have a problem)
    		  
    The original scientific literature on functional dependencies did not consider the possibility of NULL values. I am more comfortable with the first dependency because if a seat is unoccupied, then (flightNo, Date, seatNo) will evaluate to a NULL value for customerNo and that is fine--it indicates an empty seat. The problem with the second dependency is that if the seat is unoccupied, then the flightNo does not have a customerNo for that seat and you have (NULL, flightNo, date) and that will not have a unique right hand side if there are multiple unoccupied seats on the plane. It's really the (flightNo, date, seatNo) that determines the ticket price, not the customerNo. On the other hand, the customer books only one seat so the customerNo does determine the seat where the customer sits. That's why I'm ok with the customerNo determining the seat but not so much with the customerNo determining the ticket price of the seat.

    Here are two more that are valid and got you extra credit:

    Date, DepartureTime, DepartureAirport, DepartureGate ⟶ FlightNo, ArrivalTime, ArrivalAirport, ArrivalGate
    Date, ArrivalTime, ArrivalAirport, ArrivalGate ⟶ FlightNo, DepartureTime, DepartureAirport, DepartureGate
    		      

  9. B+ Trees Show what the following 4 degree B+ tree looks like after 150 is added it. For this problem M = 4 and L = 3. You will need to draw your solution on a piece of paper, take a picture of it with your smart phone, and upload it to Canvas after you exit the exam.

    When you add 150 to the leaf (100, 200, 300), the leaf cannot accommodate the extra key and hence must be split into two leavings containing (100, 150) and (200, 300) respectively. 200 gets copied up to the parent to represent the new child, giving the parent the keys (70, 100, 200, 400) but the parent now has 5 children, which is 1 too many, and hence the parent must be split into two nodes. The left node gets the additional child and hence points to the leaves (10, 30), (70, 80), and (100, 150) while the right node points to the leaves (200, 300) and (400, 500). When the parent splits, the keys (70, 100) go to the left node and the key 400 goes to the right node. 200 gets promoted to the parent, thus giving the final result shown in the above diagram.

  10. ( Relational Algebra-3 points) Which of the following queries requires set division in order to answer the query?

    1. Find all staff making more than $100,000
    2. Find all students who have taken CS100
    3. Find all people who have visited all the Baskin Robbins stores in Knoxville
    4. List all the students who have scored more than 90 on every exam in a course