Scheme


These notes cover the rudimentary aspects of Scheme. Because I wish to emphasize the functional aspects of Scheme, these notes do not cover looping constructs--I want you to write your Scheme programs recursively. I also will not be covering side-effect inducing statements, such as set!, again because I wish to emphasize the functional aspects of Scheme. You can download a version of Scheme from the GNU site, or you can use the Gambit Scheme version on the hydra/tesla machines, which you can invoke from the command line using the command gsi.

  1. Historical Origins: Scheme is a simplified teaching dialect of Lisp.
  2. Basic Syntax: Lists form the basis of everything in Scheme, including both the representation of data and of commands. All Scheme commands start with an operator, followed by a list of operands. This notation is called both prefix notation and Cambridge Polish notation. Lists are enclosed in parentheses, (), so a sample Scheme expression for adding two numbers would be:
    (+ 3 4)
    
    I can use the define command to define a global variable:
    (define a 8)
    
    and I can use the let command to create a block with local variables:
    (let ((a 10)
          (b 20))
         (+ a b))
    
    About the only time I will let you use the define command is to give a function a name. Otherwise you will use let statements, just like in C++ you use { ... } to create blocks and function bodies.

    If I wanted to write an arithmetic expression like (a + 10) * (b - 20), I would write:

    (* (+ a 10) (- b 20))
    
    A function call has a similar syntax. If I want to call the function avg on three variables, I would write:
    (avg x y z)
    
    Although prefix notation takes some getting used to, it does have an advantage of conciseness when writing out an arithmetic expression involving the same operator. For example, the sum (a + b + c + x + y + z) can be concisely specified as:
    (+ a b c x y z)
    
    1. Comments: Comments start with a semi-colon (;). Anything following a semi-colon to the end of the line is a comment. For example:
      ; add two numbers
      (+ 3 7)  ; the result is 10
      
      In imperative languages semi-colons frequently delimit statements, but right parentheses perform this function in Scheme.

    2. Printing Stuff
      1. The display function prints a value, including lists. It can only print a single argument.
      2. The newline function prints a newline.
      3. Example:
        (display "The result is ")
        (display (+ 3 4))
        (newline)
        
    3. Debugging: You may or may not find the following two debugging aids helpful:
      1. (trace function-name): Shows the input arguments and output value when the function is called. Redefining the function, such as by loading a file that defines it, undoes the tracing.
      2. Gambit debugging commands

      I typically use a combination of the trace function and display statements to debug my scheme programs.

  3. The Scheme Interpreter: Most Scheme implementations employ an interpreter that runs a "read-eval-print" loop (often abbreviated REPL). The interpreter repeatedly reads an expression from standard input, evaluates that expression, and prints the resulting value. On our departmental machines, you can invoke the gambit scheme interpreter by typing:
    gsi
    
    You can now start typing expressions and it will immediately print the result:
    (+ 3 4)
    7
    (/ (+ 8 10 10) 3)
    28/3     ; Lisp supports rational numbers
    
    If you make a mistake, such as typing an undefined variable, you will see something like:
    > a
    *** ERROR IN (console)@1.1 -- Unbound variable: a
    1> 
    
    Type ,t to get back to the main prompt.

    To quit the interpreter you can from the main prompt type any of the following: ,q, (exit), or Ctrl-D.

    You may load programs from a file using the load function:

    (load "qsort.scm")   ; .scm is the common Scheme suffix
    
    Each time you load "qsort.scm", it will evaluate all the expressions in "qsort.scm" and update any previously declared bindings with the new results. In particular, previous function definitions will be overwritten with new function definitions, and hence it is unnecessary to exit the scheme interpreter if you want to fix a program. Instead you edit the file and then reload it.

  4. Basic Lisp Data Types: Lisp supports the usual set of data types--numbers, strings, and boolean values:

    1. numbers: In addition to integers and floating point numbers, Scheme also supports rational numbers, such as 28/3. When you perform division on integer numbers, Scheme returns a rational number if there is a non-zero remainder, rather than a floating point number.
    2. strings: Strings are enclosed in double quotes. For example "Brad" is a string. You may think that 'Brad' is also a string, because it gets accepted by the Lisp interpreter. However, 'Brad' is really the symbol Brad' (see symbols below).
    3. boolean values: #t and #f represent the values true and false respectively.
    4. symbols: Sometimes you will wish to store a symbolic reference to an identifier, without having it be evaluated. You can do so by prefixing the identifier with a quote ('). For example, 'a allows you to store a symbolic reference to the identifier a without evaluating it. This can be helpful when creating functions or expressions dynamically.
    5. characters: Characters are rather clunky in Lisp and we won't deal with them. They start with the notation #\. For example, the character 'a' in C is represented as #\a in Scheme.

  5. Lists: Lists are a built-in data type in Scheme and are the most common way of representing structured data. They are created using the list function:
    (list 3 8 10)
    (list "brad" 7 #t (list 5 8) (list 4 3.2 #t))
    (list)   ; the empty list
    
    As you can see, lists can contain heterogeneous values and may be nested. Calling list can be a bit verbose, and so you can also create lists by quoting them:
    '(3 8 10)
    '("brad" 7 #t (5 8) (4 3.2 #t))
    '()  ; the empty list
    
    There is one important distinction between creating a list using a quote versus creating a list using the list command. If you use the list command, then all executable expressions will be evaluated, whereas if you use a quote to create the list, then the executable expressions will not be evaluated and instead the individual components of the expression will appear as symbols in the created list. For example:
    (list (+ 4 5) (+ 3 8))   ; creates the list '(9 11)
    '((+ 4 5) (+ 3 8))       ; creates the list '((+ 4 5) (+ 3 8))
    
    1. cons cells: Lists are created from cons cells, each of which contains two pointers, one to the list element and one to the next cons cell. For example:
      -----   -----   -----
      | | |-->| | |-->| |/|
      -----   -----   -----
       |       |       |
       3       8      10
      
      Primitive elements, such as numbers or strings, are often called atoms. You can also create a cons cell consisting of two atoms using the cons command:
      (cons 6 8)
      
      which will be displayed by the scheme interpreter as (6 . 8). Such a cons cell is often called an improper list. A proper list always ends with a nil (empty) list. In other words, in a proper list, the final cons cell has a null pointer as its second element.
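      To make the diagram concrete, here is a small sketch that builds the same three-element list by hand, one cons cell at a time, and then checks the proper/improper distinction. It uses only the standard cons, list, equal?, pair?, and list? functions; the name by-hand is made up for illustration.

      ```scheme
      ; build (3 8 10) one cons cell at a time, ending with the empty list
      (let ((by-hand (cons 3 (cons 8 (cons 10 '())))))
        (equal? by-hand (list 3 8 10)))   ; ==> #t, same structure as (list 3 8 10)

      (pair? (cons 6 8))    ; ==> #t, (6 . 8) is a cons cell
      (list? (cons 6 8))    ; ==> #f, but it does not end in '() so it is not a proper list
      (list? (cons 6 '()))  ; ==> #t, the one-element list (6)
      ```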

    2. basic list operations: The most basic list operations are the ones that construct them (cons, append) or extract their components (car, cdr, and their variants)
      1. list constructors
        1. (cons x y): returns a cons cell with the elements x and y.
        2. (append list1 list2 ... listn): merges the argument lists into a single list and returns that list, leaving the individual arguments untouched
          (cons '8 '(10))    ==> (8 10)  ; the list consisting of 8 and 10
          (append '(10 15) '(8) '(6 10 14)) ==> '(10 15 8 6 10 14)
          
      2. list extractors
        1. (car list): returns the first element of the list. The name is an abbreviation of "contents of the address part of register", a term from the IBM 704 on which the first Lisp implementation ran. Technically car returns the first element of a cons cell. Some Scheme implementations provide the first, second, and third functions (from the SRFI 1 list library) as more mnemonic ways to extract the first, second, and third elements of a list. Unfortunately, the gambit interpreter does not support first, second, third, etc., so you must use car and cdr instead.
        2. (cdr list): returns the remaining elements in the list. The name is an abbreviation of "contents of the decrement part of register", from the same IBM 704 terminology. Technically cdr returns the second element of a cons cell, which is typically a pointer to the rest of the list.
        3. Scheme supports various combinations of these two commands to extract elements near the front of a list. For example, cadr is shorthand for (car (cdr ...)) and returns the second element of a list. Similarly, caddr returns the third element of a list. You can experiment with other forms to see whether or not Scheme supports them.
          (car '(8 10 15))   ==> 8       ; the first element of the list
          (cdr '(8 10 15))   ==> '(10 15) ; the rest of the list  
          (car '())          ==> ERROR  ; the empty list has no elements
          (cdr '())          ==> ERROR  ; the empty list has no elements
          (cdr '(10))        ==> '() ; the empty list '()
          (first '(8 10 15)) ==> 8     ; only where first is supported (not in gambit)
          (second '(8 10 15)) ==> 10   ; only where second is supported (not in gambit)
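        Since these notes ask for recursive programs, it may help to see how car and cdr drive a recursion over a list. The following is a minimal sketch; sum-list is a name made up for illustration, not a built-in function.

        ```scheme
        ; sum-list (hypothetical helper): adds up the numbers in a list
        (define sum-list
          (lambda (lst)
            (if (null? lst)
                0                                       ; empty list: nothing to add
                (+ (car lst) (sum-list (cdr lst))))))   ; first element + sum of the rest

        (sum-list '(8 10 15))   ; ==> 33
        (cadr '(8 10 15))       ; ==> 10, same as (car (cdr '(8 10 15)))
        (caddr '(8 10 15))      ; ==> 15
        ```

        The pattern is always the same: test for the empty list, otherwise combine the car with the result of recursing on the cdr.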
          
      3. the nil value (don't use it--it's undefined!): In Scheme, the symbol nil used to represent the empty list, '(). However, it has now been abolished from the standard, so use '() instead if you want to denote an empty list. For example:
        (list? (cons 3 '()))   ; returns #t
        
      4. Finding elements of a list: If you want to determine whether or not an element is in a list, you can use one of memq, memv, or member. For example:
        (memq 3 '(6 3 4))   ; returns (3 4), the sublist starting at the match, which counts as true
        (memq 8 '(6 3 4))   ; returns #f
        
        memq uses the eq? function to compare elements, memv uses the eqv? function to compare elements, and member uses the equal? function to compare elements. The eq?, eqv?, and equal? functions are described under Relational Operators below.
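        The difference between memq and member shows up when the elements are themselves lists. The sketch below uses only standard functions; contains? is a hypothetical helper name that turns the search result into a plain #t or #f.

        ```scheme
        (memq 'c '(a b c d))                       ; ==> (c d), the sublist starting at the match
        (memq 'z '(a b c d))                       ; ==> #f
        (member (list 1 2) '((0 1) (1 2) (2 3)))   ; ==> ((1 2) (2 3)), equal? compares contents
        (memq (list 1 2) '((0 1) (1 2) (2 3)))     ; ==> #f, eq? only compares pointers

        ; contains? (hypothetical helper): #t if item is in lst, #f otherwise
        (define contains?
          (lambda (item lst)
            (if (member item lst) #t #f)))

        (contains? 3 '(6 3 4))   ; ==> #t
        (contains? 8 '(6 3 4))   ; ==> #f
        ```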
    3. Association Lists: You can treat lists like associative maps if you create a list of pairs. The first element of the pair represents the key and the second element of the pair represents the value. For example:
      (define a '(("brad" 3) ("nels" 6) ("summer" 10)))
      
      You can return the pair associated with a key in an association list by using the assoc function:
      (assoc "summer" a)   ; returns ("summer" 10)
      
      assoc uses the equal? function to check for equality. You can also use assq or assv, which will use the eq? and eqv? functions respectively to check for equality.
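      Because assoc returns the whole pair (or #f when the key is missing), you usually wrap it in a small function that extracts the value. The following is a sketch under that assumption; lookup and its default parameter are names made up for illustration.

      ```scheme
      ; lookup (hypothetical helper): returns the value stored under key,
      ; or default when the key is not in the association list
      (define lookup
        (lambda (key alist default)
          (let ((entry (assoc key alist)))
            (if entry
                (cadr entry)     ; entry looks like (key value)
                default))))

      (lookup "nels" '(("brad" 3) ("nels" 6) ("summer" 10)) 0)   ; ==> 6
      (lookup "zed"  '(("brad" 3) ("nels" 6) ("summer" 10)) 0)   ; ==> 0
      ```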

  6. Type Predicate Functions: Scheme provides a number of type predicate functions that allow you to determine what type of value you are examining. The most useful ones are:

    1. (string? x): is x a string?
    2. (number? x): is x a number?
    3. (pair? x): is x a cons cell?
    4. (list? x): is x a list?
    5. (null? x): is x the empty list?
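    A quick way to get a feel for these predicates is to combine them in one function. In the sketch below, describe is a hypothetical name, and it uses cond, which is covered in the section on conditionals. The order of the tests matters: the empty list satisfies both null? and list?, and every non-empty list also satisfies pair?.

    ```scheme
    ; describe (hypothetical helper): names the kind of value it is given
    (define describe
      (lambda (x)
        (cond ((string? x) "string")
              ((number? x) "number")
              ((null? x)   "empty list")    ; must come before list?
              ((list? x)   "list")          ; must come before pair?
              ((pair? x)   "pair")
              (else        "something else"))))

    (describe "brad")       ; ==> "string"
    (describe 28/3)         ; ==> "number"
    (describe '())          ; ==> "empty list"
    (describe '(1 2))       ; ==> "list"
    (describe (cons 6 8))   ; ==> "pair"
    (describe 'a)           ; ==> "something else"
    ```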

  7. Variables
    1. Scheme uses dynamic typing so you do not have to declare the type of a variable--its type is determined by the value currently assigned to it
    2. Scheme uses static scoping
    3. You must declare a variable before you use it. There are two ways to do this:
      1. global variables: global variables are declared using the define function:
        (define a 10)
        (define name "brad")
        
      2. local variables: local variables are introduced using the let function:
        (let ((a 10)
              (b 7))
             (+ a b))   ; result is 17
        
        let can be thought of as introducing a block with the indicated bindings. The bindings are "live" within the let statement and any nested let statements that do not override the binding.
        1. syntax: The initial argument to let is a list of bindings. The remaining arguments are a list of expressions to be evaluated within that let environment. The value of the let expression is the value of the last expression in the let.
        2. order of evaluation in the binding list: The expressions in the binding list are conceptually executed in parallel, and hence bindings later in the list may not depend on bindings earlier in the list. For example, the following is not allowed:
          (let ((a 10)
                (b (* a 5)))   ; illegal: b is not allowed to use the value of a
             ...)
          
          because b depends on a having already been evaluated
        3. order of evaluation in the body: The statements in the body are executed sequentially. The value of the last statement in the body is the value of the let expression.

          1. one reason for having multiple statements in a let body is when you have side-effects, such as when using the display function.
            (let ()
              (display "The result is ")
              (display (+ 3 4))
              (newline)
            )
            
        4. let*: If you want to force the bindings in the binding list to be evaluated sequentially, with earlier bindings being available to later bindings, then use let*:
          (let* ((a 10)
                 (b (* a 5)))   ; b is 50
             (+ a b))          ; result is 60
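          One subtlety is worth a small experiment. If an outer binding of a happens to exist, the plain let version is not an error; the expression for b quietly uses the outer a instead of the new one. The sketch below contrasts the two forms using only let and let*.

          ```scheme
          (let ((a 1))
            (let ((a 10)
                  (b (* a 5)))    ; this a is the outer a, so b is 5
              (+ a b)))           ; ==> 15

          (let ((a 1))
            (let* ((a 10)
                   (b (* a 5)))   ; this a is the new a, so b is 50
              (+ a b)))           ; ==> 60

          ; let* behaves like nested lets, one per binding
          (let ((a 10))
            (let ((b (* a 5)))
              (+ a b)))           ; ==> 60
          ```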
          

  8. Conditionals, Relational Operators, and Boolean Operators

    1. Relational Operators:
      1. For Numbers: Use =, <, <=, >, >= (but not !=; you must say (not (= op1 op2)) instead)
      2. For Strings: Use string=?, string<?, string<=?, etc. but not string!=?
      3. For anything else (although works with numbers and strings as well)
        1. (eq? operand1 operand2): shallow comparison using pointers
        2. (eqv? operand1 operand2): shallow comparison like eq?, except that it also returns true for two numbers or two characters that have the same value. For strings and lists it still compares pointers: if both name1 and name2 have the value "brad" but point to different copies of "brad", then both eq? and eqv? return false, and you need equal? or string=? to compare the contents. I will not use eqv? in this course.
        3. (equal? operand1 operand2): deep recursive comparison of two structures, such as two nested lists
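        The following sketch contrasts the three comparison functions. It builds its lists with the list function so that each call is guaranteed to produce a separate copy; all of the functions used are standard.

        ```scheme
        (eq? 'a 'a)                          ; ==> #t, a symbol is always the same object
        (eq? '() '())                        ; ==> #t, there is only one empty list
        (eq? (list 1 2) (list 1 2))          ; ==> #f, two separate copies
        (eqv? (list 1 2) (list 1 2))         ; ==> #f, still a pointer comparison for lists
        (eqv? 2.5 2.5)                       ; ==> #t, numbers are compared by value
        (equal? (list 1 (list 2 3))
                (list 1 (list 2 3)))         ; ==> #t, deep comparison of the contents

        (let ((x (list 1 2)))
          (eq? x x))                         ; ==> #t, the same copy compared with itself
        ```

        When in doubt, equal? is the safe choice for lists and strings, at the cost of walking the whole structure.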

    2. Boolean Operators: and, or, and not are defined. For example, you can use (not (string=? ...)) to test whether two strings are unequal.

    3. Conditionals
      1. Simple if-else:
        (if (< a 0)
            0
            a)
        
        The first argument is an expression that is treated as a boolean: any value other than #f counts as true. The second and third arguments are the "then" and "else" expressions.

      2. Compound if-else:
        (cond ((and (<= grade 100) (>= grade 90)) "A")
              ((>= grade 80) "B")
              ((>= grade 70) "C")
              (else "F"))
        
        1. The arguments to cond are pairs
        2. The pairs are considered in order from first to last
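        3. The value of the cond is the value of the second element of the first pair in which the first element evaluates to a true value (anything other than #f)
        4. If none of the first elements evaluates to a true value and there is no else pair, then the value of the cond is unspecified, so do not rely on it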
        5. You can use either else or #t as the first element of the final pair to ensure that the final pair provides a default value
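        6. Each pair normally holds a single expression as its second element. Standard Scheme does accept several expressions there, evaluated in order with the last one providing the value, but that is only useful for side effects; for a more complicated computation, call a function or use let.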
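        A cond is just a more readable way of writing a chain of nested if expressions. The sketch below writes the same function both ways; clamp and clamp-if are hypothetical names chosen for illustration.

        ```scheme
        ; clamp (hypothetical helper): forces x into the range lo..hi
        (define clamp
          (lambda (x lo hi)
            (cond ((< x lo) lo)
                  ((> x hi) hi)
                  (else x))))

        ; the same logic written with nested ifs
        (define clamp-if
          (lambda (x lo hi)
            (if (< x lo)
                lo
                (if (> x hi)
                    hi
                    x))))

        (clamp -5 0 100)      ; ==> 0
        (clamp 250 0 100)     ; ==> 100
        (clamp 42 0 100)      ; ==> 42
        (clamp-if 250 0 100)  ; ==> 100
        ```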

  9. Functions: Functions are constructed using the lambda constructor:
    (lambda (x) (* x x))   ; a function that computes x squared
    
    lambda creates an anonymous function object. For it to be useful you usually bind it to a name using either a define or a let statement. For example:
    (define x2 (lambda (x) (* x x)))
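    For comparison, here is a short sketch of the other two ways to use a lambda: calling it on the spot without naming it, and binding it locally with let. The name square is made up for illustration.

    ```scheme
    ; call an anonymous function directly: the lambda sits in the operator position
    ((lambda (x) (* x x)) 5)              ; ==> 25

    ; bind the function to a local name with let
    (let ((square (lambda (x) (* x x))))
      (+ (square 3) (square 5)))          ; ==> 34
    ```

    The let form keeps the helper private to the block that needs it, which matches the advice to reserve define for naming top-level functions.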
    

    1. syntax: The first argument to lambda is a list of parameters. The remaining arguments constitute the body of the function.
    2. return value: The value produced by the last expression executed in a function constitutes the return value of the function. Scheme does not have an explicit return statement so you have to organize your function so that a return statement is not required. That may mean breaking a large function into smaller functions.
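    3. parameter passing is by value, but for structured data such as lists and strings the value passed is a pointer to the object, so the object itself is not copied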
    4. recursive functions: If you define a nested recursive function (i.e., one inside of a let statement), you need to use a letrec (let recursion) statement, because in a simple let the name being bound is not yet visible inside its own lambda expression. Here is a simple factorial function:
      (letrec ((fact (lambda (n)
                       (if (or (= n 0) (= n 1)) 
                           1
                           (* n (fact (- n 1)))))))
         (fact 5)) ; result is 120
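      A common use of letrec is a recursive helper tucked inside a top-level function, so that the helper can see the outer function's parameters without having them passed along on every call. The following is a sketch of that pattern; count-matches, count, and remaining are hypothetical names.

      ```scheme
      ; count-matches (hypothetical helper): how many elements of lst are equal? to item
      (define count-matches
        (lambda (item lst)
          (letrec ((count (lambda (remaining)
                            (cond ((null? remaining) 0)
                                  ((equal? item (car remaining))
                                   (+ 1 (count (cdr remaining))))
                                  (else (count (cdr remaining)))))))
            (count lst))))

      (count-matches 3 '(3 8 3 10 3))   ; ==> 3
      (count-matches 7 '(3 8 3 10 3))   ; ==> 0
      ```

      Note that a function named with a top-level define can call itself directly; letrec is only needed for the nested case.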