If anything is unclear or you have other questions, please send me mail.
In ~mclennan/pub/594/A-L.dta you will find a data file containing 12 training pairs:
In this project you will explore the linear associator using this training data. Much of this project can be done without programming if you use a software package like MatLab to compute matrix products, SVDs, etc. I have indicated where you need to do some programming. Even in these cases, you can use off-the-shelf subroutines for matrix multiplication, etc.
First, normalize the input patterns (but not the target patterns).
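A minimal sketch of the normalization step, assuming the input patterns are stacked as columns of a matrix P and the targets as columns of Q. The random P and identity Q here are only placeholders for the 12 pairs in A-L.dta (whose actual dimensions this sketch does not know):

```python
import numpy as np

# Hypothetical stand-in for the 12 training pairs in A-L.dta:
# columns of P are input patterns, columns of Q are target patterns.
rng = np.random.default_rng(0)
P = rng.integers(0, 2, size=(35, 12)).astype(float)  # placeholder 0/1 inputs
Q = np.eye(12)                                       # placeholder targets

# Normalize each input pattern to unit Euclidean length;
# the target patterns are left untouched.
P = P / np.linalg.norm(P, axis=0, keepdims=True)
```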
a) Compute the covariance matrix C of your (normalized) inputs and attach a printout.
b) Use MatLab [X,S,Y] = svd(C), or another software package, to compute the singular value decomposition of the input covariance matrix, and attach a list of the singular values. [You can use these in (3) to pick a learning rate for the Delta Rule.]
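The covariance computation and SVD of parts (a) and (b) can be sketched as follows. The placeholder P again stands in for the normalized inputs; the definition of C as the average of the outer products of the input columns is an assumption, as is the standard stability bound relating the largest singular value to the Delta Rule learning rate:

```python
import numpy as np

rng = np.random.default_rng(0)
P = rng.random((35, 12))                 # placeholder inputs, one per column
P /= np.linalg.norm(P, axis=0)           # normalized as in the project setup

n = P.shape[1]
C = (P @ P.T) / n                        # input covariance matrix (assumed form)

# SVD of C, analogous to MatLab's [X,S,Y] = svd(C).
X, s, Yt = np.linalg.svd(C)
print("singular values:", s)

# A common stability bound: batch gradient descent on the SSE is stable
# for learning rates below 2 / (largest singular value of C).
eta_max = 2.0 / s.max()
```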
a) Use the Outer Product Rule to compute an associative matrix (weight matrix) for the entire data set.
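For column-stacked patterns, the Outer Product Rule's sum of outer products q_k p_k^T collapses to the single matrix product Q P^T; a sketch with placeholder data (not the actual A-L.dta patterns):

```python
import numpy as np

rng = np.random.default_rng(1)
P = rng.random((35, 12))
P /= np.linalg.norm(P, axis=0)           # placeholder normalized inputs
Q = np.eye(12)                           # placeholder targets

# Outer Product Rule: sum of outer products of each target with its input.
M = sum(np.outer(Q[:, k], P[:, k]) for k in range(P.shape[1]))

# Equivalent one-line form for column-stacked patterns.
assert np.allclose(M, Q @ P.T)
```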
b) Evaluate the memory's accuracy on each of the training patterns. Report both sum-of-squares error and number of bits in error (using 0.5 as a threshold for distinguishing 0 outputs from 1 outputs).
Actually, you can use any consistent rule (e.g. any fixed threshold) you like. The simplest rule is to interpret any value > 0.5 as 1, since that would be the effect of passing the output through a logistic sigmoid.
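The two error measures in part (b) might be computed as below; the patterns are placeholders, and recall of all 12 inputs is done in one matrix product:

```python
import numpy as np

rng = np.random.default_rng(2)
P = rng.random((35, 12))
P /= np.linalg.norm(P, axis=0)                   # placeholder normalized inputs
Q = (rng.random((10, 12)) > 0.5).astype(float)   # placeholder 0/1 targets
M = Q @ P.T                                      # outer-product memory

Y = M @ P                                        # recall all 12 training inputs
sse = np.sum((Y - Q) ** 2)                       # sum-of-squares error
bits_wrong = int(np.sum((Y > 0.5) != (Q > 0.5))) # 0.5 threshold on each output
print(f"SSE = {sse:.4f}, bits in error = {bits_wrong}")
```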
c) Does your memory seem to work better on certain inputs than others? Are there certain pairs of inputs it is more likely to confuse or mix?
d) What does the input covariance matrix [computed in (1a)] tell you about the input patterns?
a) Implement the Batch Delta Rule and use your program to compute an associative matrix for the data. Attach a graph of sum-of-squares error versus iteration.
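A sketch of one way to implement the Batch Delta Rule, again on placeholder data. The learning rate is chosen from the largest singular value of the input covariance matrix, as suggested in (1b); the particular update form (average gradient over the whole batch) and the 200-iteration budget are assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)
P = rng.random((35, 12))
P /= np.linalg.norm(P, axis=0)           # placeholder normalized inputs
Q = np.eye(12)                           # placeholder targets

C = P @ P.T / P.shape[1]
eta = 1.0 / np.linalg.svd(C, compute_uv=False).max()  # safe rate from (1b)

M = np.zeros((Q.shape[0], P.shape[0]))
errors = []                              # SSE per iteration, for the graph
for it in range(200):
    E = Q - M @ P                        # residual over the whole batch
    errors.append(np.sum(E ** 2))
    M += (eta / P.shape[1]) * E @ P.T    # batch Delta Rule update
# Plot `errors` against iteration number with your plotting tool of choice.
```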
b) Repeat parts (b-d) of (2) for the resulting memory, and compare the performance of the Batch Delta Rule with the Outer Product Rule.
a) Use MatLab [A,S,B]= svd(P) to compute the SVD of your input matrix.
b) Use (a) to compute P+, the pseudo-inverse of the input matrix, and M = QP+, the optimal linear associator matrix.
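Parts (a) and (b) can be sketched as below, inverting only the singular values above a small tolerance to form the pseudo-inverse. The data are placeholders, and the cross-check against NumPy's built-in pinv is just a sanity test:

```python
import numpy as np

rng = np.random.default_rng(4)
P = rng.random((35, 12))
P /= np.linalg.norm(P, axis=0)           # placeholder normalized input matrix
Q = np.eye(12)                           # placeholder target matrix

# SVD of the input matrix, analogous to MatLab's [A,S,B] = svd(P).
A, s, Bt = np.linalg.svd(P, full_matrices=False)

# Pseudo-inverse P+ = B S^+ A^T, inverting only the nonzero singular values.
tol = s.max() * max(P.shape) * np.finfo(float).eps
s_inv = np.where(s > tol, 1.0 / s, 0.0)
P_pinv = Bt.T @ np.diag(s_inv) @ A.T
assert np.allclose(P_pinv, np.linalg.pinv(P))

M = Q @ P_pinv                           # optimal linear associator M = Q P+
```

Because these 12 placeholder inputs are linearly independent, M recalls every training pattern exactly; on the real data, exact recall holds only if the normalized inputs are likewise independent.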
c) Compare the performance of this matrix with those given by the Delta Rules in (3) and (4). As usual, consider both sum-of-squares error and incorrect bits.