Other Functional Language Issues

  1. Evaluation Issues:
    1. Evaluation Order: Because many functional programs do not have side effects, the order in which function arguments are evaluated should be immaterial. Therefore, many functional languages allow their compilers to evaluate function arguments in any order they prefer, or even in parallel. Some Scheme compilers evaluate arguments right to left (our Gambit compiler is an exception; it evaluates arguments left to right). For example, the following trace shows an implementation that evaluates the arguments to min right to left:
      (define add (lambda (x) (+ x 20)))
      (define min (lambda (x y) (if (< x y) x y)))
      
      (trace add)
      
      (min (add 5) (add 20))
      
      [Entering #[compound-procedure 4 add]
          Args: 20]
      [40
            <== #[compound-procedure 4 add]
          Args: 20]
      [Entering #[compound-procedure 4 add]
          Args: 5]
      [25
            <== #[compound-procedure 4 add]
          Args: 5]
      ;Value: 25
      
      1. Scheme Evaluation Order
        1. Functions: Any order--determined by the compiler (a probe for observing the order your compiler chose is sketched after this list)
        2. Special Forms (e.g., cond, if, loops, or/and): Have a pre-specified evaluation order. For example, an if expression always evaluates its test expression first, and then evaluates either the then expression or the else expression, but never both
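        You can also observe the order your own compiler chose without a tracing facility by giving the arguments visible side effects. Here is a minimal sketch that reuses the add and min procedures defined above; noisy is a hypothetical helper, while display and newline are standard procedures:
          (define noisy
            (lambda (tag val)
              (display tag)   ; report which argument is being evaluated
              (newline)
              val))           ; then behave like the identity function

          (min (noisy "left argument" (add 5))
               (noisy "right argument" (add 20)))

        The order in which the two messages print tells you whether the compiler evaluated the arguments to min left to right or right to left.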

    2. Applicative-Order versus Normal-Order Evaluation
      1. Definitions
        1. Applicative-Order Evaluation: evaluates function arguments before passing them to a function (Scheme functions use applicative-order evaluation)
        2. Normal-Order Evaluation: passes arguments unevaluated to a function (Scheme special forms use normal-order evaluation)

      2. Illustrations of the difference between the two
        1. The most obvious difference is short-circuit evaluation of boolean operators. Applicative-order evaluation evaluates all arguments before applying the boolean operator, which can cause unwanted effects when the first condition is guarding the second, as in:
          (and (not (= y 0)) (/ x y))
          
          Normal-order evaluation allows us to short-circuit the evaluation as soon as one of the conditional expressions determines the outcome.
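          To see why and must be a special form rather than an ordinary function, consider writing it as one. The sketch below is hypothetical (my-and is not a standard procedure); because Scheme functions use applicative-order evaluation, both arguments are evaluated before my-and ever runs:
          (define my-and
            (lambda (a b)
              (if a b #f)))

          (define x 10)
          (define y 0)

          (and (not (= y 0)) (/ x y))      ; ==> #f; the division is never attempted
          (my-and (not (= y 0)) (/ x y))   ; signals a division-by-zero error in most Schemes,
                                           ; because (/ x y) is evaluated before my-and is called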

        2. Consider the following function and function call:
          (define double (lambda (x) (+ x x )))
          
          (double (* 3 4))
          
          Under applicative order evaluation we have:
          (double (* 3 4))
          ==> (double 12)
          ==> (+ 12 12)
          ==> 24
          
          Under normal order evaluation we have:
          (double (* 3 4))
          ==> (+ (* 3 4) (* 3 4))
          ==> (+ 12 (* 3 4))
          ==> (+ 12 12)
          ==> 24
          
          and hence we perform twice the work.

    3. Strictness and Lazy Evaluation: As shown above, evaluation order can affect both program correctness and execution speed. In the first example, applicative-order evaluation could cause a division-by-zero error, whereas normal-order evaluation would allow the program to terminate normally. In the second example, normal-order evaluation did twice the work of applicative-order evaluation. However, in the first example, normal-order evaluation may also do less work than applicative-order evaluation, since it never evaluates (/ x y) when the guard fails.

      1. Strictness
        1. A function is strict if it is undefined (fails to terminate or encounters an error) when any of its arguments is undefined. Such a function can safely evaluate all its arguments, so its result will not depend on evaluation order. A strict language may safely use applicative order evaluation.
        2. A function is non-strict if it is sometimes defined, even when one of its arguments is not. A non-strict language must use normal order evaluation.
        3. Scheme is strict for functions, but non-strict for special forms

      2. Lazy Evaluation: Lazy evaluation does not evaluate an expression until its value is actually needed. Normal order evaluation is essentially lazy evaluation
        1. Lazy evaluation in Scheme: The delay and force constructs allow you to use lazy evaluation in Scheme. For example:
          (define expr (delay (+ a 10)))
          (define a 15)
          (force expr) ==> 25
          
          delay wraps an expression in an unevaluated promise (define then binds that promise to the name expr), and force evaluates the delayed expression. Scheme uses the memoization technique described below to save the value of the evaluated expression, so if force is called again on the same promise, it simply returns the cached value.
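          A minimal way to see this caching is to put a visible side effect inside the delayed expression; in most Scheme implementations the message is printed only by the first force:
          (define p (delay (begin (display "computing...")
                                  (newline)
                                  (+ 2 3))))

          (force p)   ; prints "computing..." and returns 5
          (force p)   ; returns the cached 5 without printing again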
        2. Memoization: Memoization is a technique that internally tags an expression when it is first evaluated and saves the expression's computed value. Thereafter, references to the expression use this saved value, rather than re-evaluating the expression. Memoization can considerably improve the performance of normal-order evaluation; in fact, it brings normal-order evaluation within a constant factor of applicative-order evaluation. In the double example shown earlier, the first evaluation of (* 3 4) would be saved, so that the second reference would simply use the value 12, rather than re-computing the expression.
        3. One issue with memoization is that it may not work properly in the presence of side-effects. For example, suppose we lazily evaluate the expression (* x y) and memoize its result. If the value of x is altered before the next reference to this expression is encountered, the memoized value is now incorrect.
        4. Spreadsheets use memoization to prevent a spreadsheet from having to recursively evaluate a tree of expressions each time a cell's value is requested. They also use an out-of-date flag to handle the situation where a memoized value becomes invalid. For example, if the user has typed the formulas:
          a10 = b10 + c10
          b10 = 3 * b9
          c10 = 8 * c9
          b9 = 5
          c9 = 10
          
          then a spreadsheet will cache the results of evaluating the three formulas for a10, b10, and c10. If the user changes the value of a cell, such as b9, then the spreadsheet will use a depth-first traversal to find all formulas that depend directly or indirectly on this changed cell, and mark these formulas and their related cells out-of-date. When a cell's value is requested, the spreadsheet checks the cell's out-of-date flag. If it is set to false, the spreadsheet returns the cached value. If the flag is set to true, the spreadsheet evaluates the cell's formula, caches the result, sets the out-of-date flag to false, and returns the newly computed value. Note that this evaluation could recursively trigger the evaluation of other out-of-date formulas.
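          Here is a minimal sketch of this bookkeeping in Scheme itself. It is not production spreadsheet code; the cell representation (a vector holding a formula thunk, a cached value, an out-of-date flag, and a list of dependent cells) and all of the procedure names are assumptions made for illustration:
          (define make-cell                  ; formula is a zero-argument procedure
            (lambda (formula)
              (vector formula 'unset #t '())))   ; formula, cached value, out-of-date?, dependents

          (define add-dependent!             ; record that dep's formula uses cell
            (lambda (cell dep)
              (vector-set! cell 3 (cons dep (vector-ref cell 3)))))

          (define mark-out-of-date!          ; depth-first traversal of the dependents
            (lambda (cell)
              (vector-set! cell 2 #t)
              (for-each mark-out-of-date! (vector-ref cell 3))))

          (define cell-value                 ; re-evaluate only when the flag is set
            (lambda (cell)
              (if (vector-ref cell 2)
                  (begin (vector-set! cell 1 ((vector-ref cell 0)))
                         (vector-set! cell 2 #f)))
              (vector-ref cell 1)))

          (define set-formula!               ; the user edits a cell
            (lambda (cell formula)
              (vector-set! cell 0 formula)
              (mark-out-of-date! cell)))

          ;; The cells and formulas from the example above:
          (define b9  (make-cell (lambda () 5)))
          (define c9  (make-cell (lambda () 10)))
          (define b10 (make-cell (lambda () (* 3 (cell-value b9)))))
          (define c10 (make-cell (lambda () (* 8 (cell-value c9)))))
          (define a10 (make-cell (lambda () (+ (cell-value b10) (cell-value c10)))))
          (add-dependent! b9 b10)    (add-dependent! c9 c10)
          (add-dependent! b10 a10)   (add-dependent! c10 a10)

          (cell-value a10)                  ; ==> 95; all three formulas are now cached
          (set-formula! b9 (lambda () 7))   ; marks b9, b10, and a10 out-of-date
          (cell-value a10)                  ; ==> 101; c10's cached 80 is reused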

  2. Higher-Order Functions: A function is said to be a higher-order function if it takes a function as an argument, or returns a function as a result.

    1. Examples
      1. C obtains polymorphic sort and search functions (e.g., qsort and bsearch) by taking a comparison function as a parameter.
      2. Map function: The Scheme map function takes a function and a sequence of lists, one list for each of the function's arguments. The lists must all have the same length. It then applies the function element-wise to corresponding sets of elements from the lists and returns a list of the results. For example:
        (map * '(2 4 6) '(3 5 7)) ==> (6 20 42)
        
        map can be applied to any number of lists. Here is an example where it adds 1 to each element in a single list:
        > (map (lambda (x) (+ x 1)) '(3 6 9 10 12))
        (4 7 10 11 13)
        
        If you are struggling to understand how map is implemented, here is pseudo-code showing how the pair-wise list multiplication could be implemented as an imperative procedure, assuming that the lists are implemented as arrays:
        map(list1, list2, function) {
            newlist = []
            for i = 0 to list1.length-1 {
                newlist.append(function(list1[i], list2[i]));
            }
            return newlist
        }
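        For comparison, here is how the same two-list map could be written recursively in Scheme itself. The name my-map is hypothetical, chosen so it does not shadow the built-in map:
        (define my-map
          (lambda (fct list1 list2)
            (if (null? list1)
                '()
                (cons (fct (car list1) (car list2))
                      (my-map fct (cdr list1) (cdr list2))))))

        (my-map * '(2 4 6) '(3 5 7))   ; ==> (6 20 42)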
        
      3. Reduce (fold) function: Another common operation is to reduce a list of values to a single value using a binary operator, such as '*'. This operation is commonly called reduce, although it is sometimes also called fold. Its definition is:
        (define reduce (lambda (fct identity-value sequence)
                       (if (null? sequence) 
                           identity-value   ; e.g., 0 for +, 1 for *
                           (fct (car sequence) (reduce fct identity-value (cdr sequence))))))
        
        (reduce * 1 '(2 4 6)) ==> 48
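        Tracing the recursion in the definition above shows how the binary operator is applied, with the identity value supplied when the list runs out:
        (reduce * 1 '(2 4 6))
        ==> (* 2 (reduce * 1 '(4 6)))
        ==> (* 2 (* 4 (reduce * 1 '(6))))
        ==> (* 2 (* 4 (* 6 (reduce * 1 '()))))
        ==> (* 2 (* 4 (* 6 1)))
        ==> 48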
        
        reduce and map are frequently used in tandem. For example, if I am doing a matrix multiply, each element in the newly computed matrix is the dot product of some row and column. The dot product is obtained by doing a pair-wise multiplication of the corresponding elements in the row and column, and then summing the resulting products. You can express this operation elegantly using map and reduce as follows:
        (reduce + 0 (map * row column))
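        For example, with a concrete three-element row and column:
        (reduce + 0 (map * '(1 2 3) '(4 5 6)))
        ==> (reduce + 0 '(4 10 18))
        ==> 32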
        
        If you are struggling to understand how reduce works, here is the pseudo-code for an imperative version of reduce. The pseudo-code assumes the list is implemented as an array; note that it combines the elements left to right, which gives the same result as the recursive version above for associative operators like + and *:
        reduce(fct, identity-value, sequence) {
            result = identity-value
            for i = 0 to sequence.length-1 {
                result = fct(result, sequence[i])
            }
            return result
        }
        
        As a more concrete example, here is a reduce function for adding a list of numbers:
        result = 0;  // 0 is the identity value
        for i = 0 to sequence.length - 1 {
            result = result + sequence[i]
        }
        return result
        
      4. Currying an argument: You can "curry" a function by fixing one of its arguments to a constant value and returning a function that accepts one fewer argument. For example:
        (define curried-plus (lambda (a) (lambda (b) (+ a b))))
        
        ((curried-plus 3) 4)
        ==> ((lambda (b) (+ 3 b)) 4)  ; curried-plus replaced by its lambda function
        ==> 7
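        Curried functions also combine nicely with the higher-order functions above, since map wants a one-argument function here:
        (map (curried-plus 10) '(1 2 3))   ; ==> (11 12 13)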
        
        Here's a more useful example: currying a reduce function so that it produces a function that performs the indicated operation on a list, without our having to supply the function and identity element every time. We are actually currying two arguments in this example: the function to be used in the "reduce" operation and the identity element used to initialize the result we are computing:
        (define reduce (lambda (fct identity-value sequence)
          (if (null? sequence)
              identity-value   ; e.g., 0 for +, 1 for *
              (fct (car sequence) (reduce fct identity-value (cdr sequence))))))
        
        (define curry_reduce (lambda (fct identity)
                                   (lambda (L) (reduce fct identity L))))
        
        (define total (curry_reduce + 0))
        (define product (curry_reduce * 1))
        
        (total '(1 2 3 4 5))  ;; produces 15
        (product '(2 3 5))    ;; produces 30
        
        Notice how nice it is to just type the function we want, such as total or product, instead of having to write out:
        (reduce + 0 '(1 2 3 4 5))
        (reduce * 1 '(2 3 5))
        
      5. Interpreting a user-entered expression: Although it is beyond the scope of this course, there are ways to allow the user to input a Scheme expression, read this expression as a string, dump the contents of the string into a lambda form, and then return the resulting lambda function. This function can now be evaluated, which achieves the effect of interpreting the user's input. The task of converting a string to a function is what a spreadsheet needs to do when a user types in an equation.

    2. Performance Issues: Because pure functional languages do not have side-effects, naive implementations can suffer from performance issues associated with what has been called the trivial update problem. These are problems in which a relatively small change must be made to state information, but because side-effects are not allowed, an entirely new structure must be created. This forced re-initialization and garbage collection of discarded memory can significantly slow down a program. Three examples of this problem are:

      1. Initialization of complex structures: It can be difficult to incrementally put together complex structures that are not lists, such as multi-dimensional arrays, since each incremental addition requires that the arrays be re-created from scratch.
      2. Summarization: One often wants to summarize information by providing frequency counts of items, such as frequency counts of words in a file. The natural way to accumulate these frequency counts is with a dictionary (i.e., hash table) and to increment the appropriate item each time it is encountered in the file. However, a purely functional language requires the entire dictionary to be re-created from scratch in this situation.
      3. In-place mutation: Many programs do in-place mutations to data elements, such as sorting programs or linear algebra programs. A purely functional language must recreate the data structure from scratch each time a change is required.
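      As a small illustration of the third problem, here is a sketch of a side-effect-free update of a single vector element. The name functional-vector-set is hypothetical (it is not a standard Scheme procedure); the point is that the entire vector must be copied in order to change one slot:
        (define functional-vector-set
          (lambda (vec index value)
            (let ((new (make-vector (vector-length vec))))
              (do ((i 0 (+ i 1)))                        ; copy every element...
                  ((= i (vector-length vec)) new)
                (vector-set! new i                       ; ...mutating only the fresh copy
                             (if (= i index) value (vector-ref vec i)))))))

        (define v (vector 1 2 3 4 5))
        (functional-vector-set v 2 99)   ; ==> #(1 2 99 4 5)
        v                                ; ==> #(1 2 3 4 5); the original is untouched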

      Various compiler techniques and programmer annotations have been developed to handle these situations and make the performance hit to a functional program much less pronounced.

  3. My Perspective: One of my favorite quotes is the following one by an anonymous individual:
    The greatest tragedy of all history is the murder of a beautiful theory by a gang of brutal facts.
    I feel this quotation applies to functional programming. In theory it is elegant, and its lack of side-effects can lead to reduced development times because of fewer errors. In practice, though, the world is a messy place that constantly demands side-effects (input/output, the update problems mentioned above), and in my experience, functional programming falls apart in the presence of these side-effects. As one example, we have seen in class how elegant-looking functional programs get transformed into messy-looking, obscure code when we have to convert the original versions to more efficient, tail-recursive forms. In practice, I have seen very few situations that benefit from a purely functional approach (tree and graph traversals come to mind, as do some recurrence relations).

    Another knock against functional programming, from my perspective, is that we are brought up from birth in an imperative-oriented world of "do this" and "do that". It takes considerable training to re-orient most programmers' thinking to the more functional thinking of mathematicians, and most programmers either will not, or cannot, master it.

    Finally, many of the most useful ideas from functional programming, such as garbage collection, anonymous functions created on the fly (i.e., lambda functions), and higher-order functions such as map and reduce (fold), have been finding their way into imperative languages, especially scripting languages.

    In fact, the one very valuable thing that I think is evolving into imperative languages from functional languages is the concept of higher-order functions that can do a tremendous amount of computation with just one or two lines of code. As a post-doc I spent two years programming in Lisp, and its higher-order functions were the biggest asset to me in programming. In the end I had to use side-effects all the time, because I was programming for graphical user interfaces. It took me about 6 months to become comfortable with this "pidgin" form of functional programming, and I never really became comfortable with Lisp's prefix notation. I am also not comfortable with the more mathematical notation that later functional languages developed to denote functions.

    For those of you with a natural mathematical bent, I encourage you to examine functional languages in more depth. However, I suspect that the world at large will remain a largely imperative shop, with the more desirable characteristics of functional languages mixed in.