CS460/CS565 Languages Midterm 2

Fall 2010


  1. This exam is a take home exam. It is due Monday, Oct. 25 at 5:00pm.
  2. Submit your answers using the 460_submit or 565_submit scripts. When prompted for a number, enter "midterm2" instead.
  3. In addition to submitting your Perl and XML files electronically, please print a hard copy of your exam and bring it with you to class on Tuesday, Oct. 26. You may not modify your answers after the test is due on Monday, Oct. 25. Please staple the pages of your exam together.
  4. Each problem tells you the name of the file in which you should place your answer. Question 6 asks you to put your answer in a file named xml.txt. Please make this file a plain ASCII text file. You can create such a file using vi, emacs, notepad, etc.
  5. I will test your answers with sample input, so try to make sure they work. I will award partial credit if an answer does not work.
  6. You must answer all of the questions.
  7. Unlike the last exam, some of the questions ask you to use specific elements of Perl. If you fail to do so, you may lose most or all of the points for the question.
  8. Good luck!

  1. (15 points) Write a Perl program named birth.pl that reads a file of people and their birthdates and then prints the num oldest people, where the filename and num are command line arguments provided by the user. As an example, if your input file is:
    Fred Flintstone 1/25/1975
    James Wilfrid Vander Zanden 11/28/1930
    Ebber B. Vander Zanden 9/8/1964
    Abe Lincoln 2/12/1809
    Jackie O 2/12/1964
    Rhonda Jones 2/3/1964
    Brad Vander Zanden 2/3/1964
    Smiley 5/5/2005
    Claudia 7/6/2006
    
    and your command is:
    bvz> perl birth.pl names.txt 5
    
    then your output should be:
    Abe Lincoln 2/12/1809
    James Wilfrid Vander Zanden 11/28/1930
    Brad Vander Zanden 2/3/1964
    Rhonda Jones 2/3/1964
    Jackie O 2/12/1964
    
    Your program should make the following assumptions:

    For maximum credit your program must have the following features:

  2. (30 points) For each of the following cases, write a short Perl program that achieves the desired goal. The program should use a regular expression to find or type check the requested data.

    1. bullets.pl: Takes a set of files given on the command line and prints all lines in each file that start with a number, followed by a period and one or more blank spaces. For example, "12. brad" is fine, but "12 brad", "12.brad", and "brad 12." are not. I do not need to know the file name that contains the line, just print the lines. A sample invocation might be:
      perl bullets.pl article1.txt article2.txt article3.txt
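
To make the matching rule concrete, here is a minimal sketch that checks the four example strings above against one possible pattern. The helper name is_bullet is hypothetical, and the sketch assumes that "number" means a run of decimal digits and that "blank spaces" means literal space characters; it does not read any files.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the line starts with one or more digits,
# then a period, then at least one space character.
sub is_bullet {
    my ($line) = @_;
    return $line =~ /^\d+\. +/ ? 1 : 0;
}

# The four example strings from the problem statement.
for my $s ("12. brad", "12 brad", "12.brad", "brad 12.") {
    printf "%-10s %s\n", $s, is_bullet($s) ? "match" : "no match";
}
```

Only "12. brad" is reported as a match under these assumptions.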
      

    2. date.pl: Determines if a date given as a command line argument is in the form "Month Day, Year Era" and prints either "passed" or "failed" to indicate whether or not the date matches the pattern. The date should have the following format:

      • Month is any three letter word followed by a period (it does not have to be a valid month). The month should start with a capital letter.
      • Day is any one or two digit number (do not worry about it being between 1 and 31)
      • Year is a four digit number.
      • Era is either "AD" or "BC"

      For example, "Jan. 43, 1964 AD" is a valid date but "Feb. 4, 607" is not (missing an era and year is only 3 digits), nor is "February 2, 1855 BC" (month is too long), nor is "Wednesday Jun. 23, 1563 AD" (extraneous information). An example invocation might be:

      perl date.pl "Jan. 43, 1964 AD"
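
The four rules above can be sketched as a single anchored pattern. This is one possible reading, not the only acceptable one: the helper name is_valid_date is hypothetical, and the sketch assumes exactly one space between the parts and allows the second and third letters of the month to be in either case.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the whole string has the form
# "Mon. D, YYYY Era", with nothing before or after it.
sub is_valid_date {
    my ($date) = @_;
    return $date =~ /\A[A-Z][A-Za-z]{2}\. \d{1,2}, \d{4} (?:AD|BC)\z/ ? 1 : 0;
}

# The example dates from the problem statement.
for my $d ("Jan. 43, 1964 AD", "Feb. 4, 607",
           "February 2, 1855 BC", "Wednesday Jun. 23, 1563 AD") {
    print "$d: ", (is_valid_date($d) ? "passed" : "failed"), "\n";
}
```

The \A and \z anchors are what reject extraneous text such as the leading "Wednesday".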
      
    3. (7 points) substitute.pl: Reads the file named on the command line and uses a Perl substitution pattern to replace all substrings of the form hh:mm:ss with substrings of the form hh hours, mm minutes, and ss seconds. For example, the substring "03:53:22" should become the substring "03 hours, 53 minutes, and 22 seconds".

      • Hours, minutes, and seconds should all be two digits.
      • Write the file back out to the same file that you read from.

      A sample invocation might be:

      perl substitute.pl raceresults.txt
      
      You may replace any pattern that matches the above description, even if it is embedded within another pattern, such as "03:18:19AM".
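
As an illustration of the substitution itself, the following sketch applies it to strings in memory rather than to a file. The helper name expand_times is hypothetical, and the file reading and writing that the question asks for are deliberately left out.

```perl
use strict;
use warnings;

# Hypothetical helper: rewrite every hh:mm:ss (two digits each)
# as "hh hours, mm minutes, and ss seconds".
sub expand_times {
    my ($text) = @_;
    $text =~ s/(\d{2}):(\d{2}):(\d{2})/$1 hours, $2 minutes, and $3 seconds/g;
    return $text;
}

print expand_times("Winner: 03:53:22"), "\n";
print expand_times("Start: 03:18:19AM"), "\n";
```

Because the pattern is not anchored, the embedded time in "03:18:19AM" is rewritten as well, which the problem statement allows.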

  3. (15 points) Write a Perl script named rename.pl that changes the suffix of all .cc files in a directory to .cpp. For example, foo.cc should become foo.cpp. The Perl script should read the name of the directory from the command line and should exit if the directory is invalid (please print an appropriate error message). You do not need to recursively descend into sub-directories. The part of the file name before the suffix must be at least one character long. Hence you will not convert a hidden file of the form .cc to .cpp.
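
The naming rule in question 3 can be sketched on plain strings, without touching the file system. The helper name cc_to_cpp is hypothetical; the sketch only computes the new name and leaves out reading the directory, validating it, and renaming the files.

```perl
use strict;
use warnings;

# Hypothetical helper: return the .cpp name for a .cc file name,
# or nothing if the name does not qualify (wrong suffix, or no
# characters before the suffix).
sub cc_to_cpp {
    my ($name) = @_;
    if ($name =~ /\A(.+)\.cc\z/) {
        return "$1.cpp";
    }
    return;
}

for my $f ("foo.cc", ".cc", "notes.txt", "a.b.cc") {
    my $new = cc_to_cpp($f);
    print "$f -> ", (defined $new ? $new : "(unchanged)"), "\n";
}
```

Requiring at least one character before .cc is what leaves the hidden file .cc unchanged.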
    For the following two questions consider the XML snippet shown in airport.html. Assume the following restrictions on the data:

  4. (10 points) Write a DTD specification for the above XML specification and place it in a file named airport.dtd. Use parsed character data for the elements' content models and just ensure that there is at least one runway. Do not worry about specifying that there can be any number of runways from 1 to 10, since this is very difficult to do with a DTD.

  5. (15 points) Write an XML schema specification for the above XML specification and place it in a file named airport.xsd.


  6. (15 points) xml.txt: Consider the following recipe for Durango Chili, courtesy of Jim Plank. Design an XML specification for a recipe that contains the same kind of information that you see there. Your solution should show me the hierarchy of element tags you will use, following the same basic structure used in question 1 of hw6. Place an asterisk (*), plus (+), or question mark (?) next to an element if it can occur 0 or more times, 1 or more times, or 0 or 1 times respectively. If you require one of two elements, you can write | and use the following syntax:
       <element1> | <element2>
       <element1> ...
       <element2> ...
     Put types and/or enumerated types next to each element to describe its content model. As in question 1 of hw6, I want an informal description of the types, not a formal XSD model. Some of the information presented contiguously in the recipe may be too coarse grained and you may need to make it finer grained.