Thu Oct 29 07:32:45 EST 1998
Big O
Big O notation is one of the ways in which we talk about how efficient
an algorithm or program is. It gives us a nice way of quantifying or
classifying how fast or slow a program is as a function of the size
of its input.
Examples
Let's look at a program (in linear1.c):
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main(int argc, char **argv)
{
  int n;
  double f;
  double count;
  int i, j;
  long t0;

  if (argc != 2) {
    fprintf(stderr, "Usage: linear1 n\n");
    exit(1);
  }
  n = atoi(argv[1]);
  t0 = time(0);
  count = 0;
  f = 23.4;
  for (i = 0; i < n; i++) {
    count++;
    f = f * f + 1;
    j = (int) f / 1000;   /* how many whole thousands are in f */
    f = f - j*1000;       /* strip them off, leaving f in [0, 1000) */
    /* printf("%lf\n", f); */
  }
  printf("N: %d Count: %.0lf Time: %ld\n", n, count, (long) (time(0) - t0));
  return 0;
}
What this does is compute n random-ish numbers between 0 and
1000, and if we uncomment the printf() statement, it
will print them -- try it out.
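If you want to play with that squaring-and-remainder trick by itself,
here is a minimal standalone sketch of it (my own illustration -- this
file is not one of the course examples):

#include <stdio.h>

/* A minimal sketch of linear1.c's "random-ish" number trick:
   square f, add one, then keep only the remainder below 1000. */
int main(void)
{
  double f = 23.4;   /* the same starting value linear1.c uses */
  int i, j;

  for (i = 0; i < 10; i++) {
    f = f * f + 1;
    j = (int) f / 1000;   /* how many whole thousands are in f */
    f = f - j * 1000;     /* strip them off, leaving f in [0, 1000) */
    printf("%lf\n", f);
  }
  return 0;
}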
Suppose we run this program with varying values of n.
What do we expect? Well, as n increases, so will
the count, and so will the running time of the program:
(This is on my machine at home -- I don't know how fast the machines
here will be.)
UNIX> gcc -o linear1 linear1.c
UNIX> linear1 1
N: 1 Count: 1 Time: 0
UNIX> linear1 10000
N: 10000 Count: 10000 Time: 0
UNIX> linear1 10000000
N: 10000000 Count: 10000000 Time: 8
UNIX> linear1 20000000
N: 20000000 Count: 20000000 Time: 14
UNIX> linear1 100000000
N: 100000000 Count: 100000000 Time: 72
UNIX>
Just what you'd think. In fact, the running time is roughly linear:
running time = 72n/100000000 seconds
As a sanity check, that predicts 72*20000000/100000000 = 14.4 seconds
for n = 20000000, which is close to the measured 14 seconds.
Obviously, count equals n.
Now, look at four other programs below. I will just show their loops:
linear2.c:
...
for (i = 0; i < n; i++) {
  count++;
  f = f * f + 1;
  j = (int) f / 1000;
  f = f - j*1000;
}
for (i = 0; i < n; i++) {
  count++;
  f = f * f + 1;
  j = (int) f / 1000;
  f = f - j*1000;
}
...
log.c:
...
for (i = 1; i <= n; i *= 2) {
  count++;
  f = f * f + 1;
  j = (int) f / 1000;
  f = f - j*1000;
}
...
nlogn.c:
...
for (k = 0; k < n; k++) {
  for (i = 1; i <= n; i *= 2) {
    count++;
    f = f * f + 1;
    j = (int) f / 1000;
    f = f - j*1000;
  }
}
...
nsquared.c:
...
for (i = 0; i < n; i++) {
  for (k = 0; k < n; k++) {
    count++;
    f = f * f + 1;
    j = (int) f / 1000;
    f = f - j*1000;
  }
}
...
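To see how one of these loops sits in a complete file, here is a sketch
of what log.c plausibly looks like in full, reusing the scaffolding of
linear1.c (details such as the usage string are my guesses, since only
the loop is shown above):

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main(int argc, char **argv)
{
  int n, i, j;
  double f, count;
  long t0;

  if (argc != 2) {
    fprintf(stderr, "Usage: log n\n");
    exit(1);
  }
  n = atoi(argv[1]);
  t0 = time(0);
  count = 0;
  f = 23.4;
  /* i doubles every iteration, so this body runs about log2(n) times */
  for (i = 1; i <= n; i *= 2) {
    count++;
    f = f * f + 1;
    j = (int) f / 1000;
    f = f - j*1000;
  }
  printf("N: %d Count: %.0lf Time: %ld\n", n, count, (long) (time(0) - t0));
  return 0;
}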
Across the five programs, the value of count will be the
following (note: I will expect you to be able to do things like
tell me what count is as a function of n on tests):
- linear1.c: count = n.
- linear2.c: count = 2n.
- log.c: count = log(n). (where the logarithm is base 2).
- nlogn.c: count = n*log(n). (where the logarithm is base 2).
- nsquared.c: count = n*n.
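If it helps to see those formulas side by side, here is a quick
throwaway sketch (not one of the five programs) that tabulates
count for a few values of n; log2() is the base-2 logarithm from
math.h, so compile with -lm:

#include <stdio.h>
#include <math.h>

/* Tabulate the count formulas above for a few sample values of n. */
int main(void)
{
  double n;

  printf("%10s %10s %10s %6s %12s %16s\n",
         "n", "linear1", "linear2", "log", "nlogn", "nsquared");
  for (n = 1000; n <= 100000000; n *= 100) {
    printf("%10.0f %10.0f %10.0f %6.1f %12.0f %16.0f\n",
           n, n, 2*n, log2(n), n*log2(n), n*n);
  }
  return 0;
}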
In each program, the running time is going to be directly proportional
to count. Why? Read chapter two for how to count instructions.
So, what do the running times look like if you increase n to large
values? I have the output of the various programs in the following
table of links:
[Table of links: timing output for each of the five programs.]
Some things you should notice right off the bat: log(n) is very,
very small in comparison to n. This means that log.c
is blazingly fast for even huge values of n. On the other end
of the spectrum, n*n grows very quickly as n increases.
Below, I graph all of the programs' running times as a function of
n:
[Graph: running time vs. n for all five programs.]
So, this shows what you'd think:
log(n) < n < 2n < n*log(n) < n*n
Perhaps it's hard to gauge how much each is less than the other until you
see it. Below I plot the same graph, but zoomed in a bit so you can get
a better feel for n*log(n) and n*n.
[Graph: the same running times, zoomed in.]
Back to Big O: Function comparison
Big-O notation is a way of classifying functions. The functions that
we care about are the running times of programs. The first concept in
dealing with Big O is comparing functions. Basically, we will say that
one function f(n) is greater than another g(n)
if there is a value x so that for all i >= x:
f(i) >= g(i)
Put graphically, it means that after a certain point on the x axis,
as we go right, the curve for f(n) will always be at least as
high as the curve for g(n). Thus, given the graphs above, you can see that
n*n is greater than
n*log(n), which is greater than
2n, which is greater than
n, which is greater than
log(n).
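One way to convince yourself of a comparison like this is to hunt for
that value x numerically. Here is a sketch that does so for n*log(n)
versus 2n (my choice of functions and search range, purely for
illustration):

#include <stdio.h>
#include <math.h>

/* Find the point x after which f(i) >= g(i) held for every i we
   tested.  Here f(n) = n*log2(n) and g(n) = 2n; compile with -lm. */
double f(double n) { return n * log2(n); }
double g(double n) { return 2 * n; }

int main(void)
{
  double i, x = 1;

  for (i = 1; i <= 1000000; i++) {
    if (f(i) < g(i)) x = i + 1;   /* remember the last failure */
  }
  printf("f(i) >= g(i) for every tested i >= %.0f\n", x);
  return 0;
}

This prints 4: n*log(n) dips below 2n only for n < 4, so x = 4 works
in the definition above.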
So, here are some functions:
- a(n) = 1
- b(n) = 100
- c(n) = 6-n
- d(n) = n
- e(n) = 2n
- f(n) = 2n-5
- g(n) = n*n - 5000000000
- h(n) = log(n)
- i(n) = log(n) - 100
- j(n) = n*log(n)-100
So, we can ask ourselves questions: Is b(n) > a(n)? Yes. Why?
Because for any value of n, b(n) is 100, and a(n) is
1. Therefore for any value of n, b(n) is greater than
a(n).
That was easy. How about c(n) and b(n)? b(n) is
greater, because for any value of n greater than 6, b(n)
is 100 and c(n) is negative.
Here's a total ordering of the above. Make sure you can prove all
of these to yourselves:
g(n) > j(n) > e(n) > f(n) > d(n) > h(n) > i(n) > b(n) > a(n) > c(n)
Some rules:
- If a function consists only of polynomial terms, the Big-O value of the
function is determined by the term with the
largest degree, regardless of how big the coefficients are. For example,
f(n) = .00001n^2 + 1000n is O(n^2) despite the fact that
n's coefficient is so much greater than n^2's coefficient
(see the sketch after this list).
Formally this rule can be written as:
If f(n) is a polynomial of degree k and g(n) is a
polynomial of degree l < k, and both lead coefficients are
positive, then f(n) > g(n).
- If f(n) and g(n) are both polynomials of
degree k, then the lead coefficient defines which one is greater. For
example, if f(n) = 6n and g(n) = 4n then f(n) is greater.
- When you write the Big-O value for a function omit any coefficients. For
example, write O(n), not O(6n). If a function's Big-O value is constant,
write O(1), not O(6).
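As promised in the first rule, here is a numerical sketch of why
.00001n^2 + 1000n is O(n^2). With c = 2, the bound
c*(.00001n^2) >= f(n) holds exactly when .00001n^2 >= 1000n, that is,
when n >= 100000000:

#include <stdio.h>

/* Check f(n) = .00001*n*n + 1000*n against the bound 2*(.00001*n*n).
   The bound starts holding at n = 100000000. */
int main(void)
{
  double n, f, bound;

  for (n = 1000000; n <= 1e10; n *= 10) {
    f = .00001*n*n + 1000*n;
    bound = 2 * .00001*n*n;
    printf("n = %12.0f  f(n) = %18.0f  bound = %18.0f  %s\n",
           n, f, bound, bound >= f ? "holds" : "fails");
  }
  return 0;
}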
Big O
Given the above, you should now be able to read the book's definitions
of big-O, Omega, Theta and little-o:
T(N) = O(f(N)) if there exist positive constants c and n0 such that
c*f(N) >= T(N) whenever N >= n0.
This means given the definitions of a(n) through j(n)
above:
- a(n) = O(1).
- b(n) = O(1).
- b(n) = O(n). This is because n >= b(n) for all n >= 100.
- b(n) = O(n*n). This is because n*n >= b(n) for all n >= 10.
- b(n) = O(a(n)).
- a(n) = O(b(n)).
- e(n) = O(n). This is because 2n >= e(n). I.e.
with c = 2, cn >= e(n).
- i(n) = O(log(n)).
- j(n) = O(n*log(n))
- g(n) = O(n*n)
Note that O(f(N)) is an upper bound on T(N). That means
that T(N) grows no faster than f(N). Similarly,
if g(N) > f(N) and T(N) = O(f(N)), then
T(N) = O(g(N)) too.
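To tie this back to the definition, here is one more sketch that
spot-checks the claim b(n) = O(n) by testing a concrete witness pair,
c = 1 and n0 = 100 (the range tested is arbitrary):

#include <stdio.h>

/* Spot-check b(n) = O(n): with c = 1 and n0 = 100, we should find
   c*n >= b(n) = 100 for every n >= n0 that we test. */
int main(void)
{
  int n, c = 1, n0 = 100, ok = 1;

  for (n = n0; n <= 1000000; n++) {
    if (c * n < 100) ok = 0;
  }
  printf("c*n >= b(n) for all n in [%d, 1000000]: %s\n",
         n0, ok ? "yes" : "no");
  return 0;
}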
Read the book for the other definitions.