Thu Oct 29 07:32:45 EST 1998
Big O
Big O notation is one of the ways in which we talk about how efficient
an algorithm or program is. It gives us a nice way of quantifying or
classifying how fast or slow a program is as a function of the size
of its input.
Examples
Let's look at a program (in linear1.c):
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main(int argc, char **argv)
{
  int n;
  double f;
  double count;
  int i, j;
  long t0;

  if (argc != 2) {
    fprintf(stderr, "Usage: linear1 n\n");
    exit(1);
  }
  n = atoi(argv[1]);
  t0 = time(0);
  count = 0;
  f = 23.4;
  for (i = 0; i < n; i++) {
    count++;
    f = f * f + 1;
    j = (int) f / 1000;   /* how many whole thousands are in f */
    f = f - j*1000;       /* strip them off, leaving f in [0, 1000) */
    /* printf("%lf\n", f); */
  }
  printf("N: %d Count: %.0lf Time: %ld\n", n, count, (long) (time(0) - t0));
  return 0;
}
What this does is compute n random-ish numbers between 0 and
1000, and if we uncomment the printf() statement, it
will print them -- try it out.
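If you want to play with that squaring-and-remainder trick by itself,
here is a minimal standalone sketch of it (my own illustration -- this
file is not one of the course examples):

#include <stdio.h>

/* A minimal sketch of linear1.c's "random-ish" number trick:
   square f, add one, then keep only the remainder below 1000. */
int main(void)
{
  double f = 23.4;   /* the same starting value linear1.c uses */
  int i, j;

  for (i = 0; i < 10; i++) {
    f = f * f + 1;
    j = (int) f / 1000;   /* how many whole thousands are in f */
    f = f - j * 1000;     /* strip them off, leaving f in [0, 1000) */
    printf("%lf\n", f);
  }
  return 0;
}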
Suppose we run this program with varying values of n.
What do we expect? Well, as n increases, so will
the count, and so will the running time of the program:
(This is on my machine at home -- I don't know how fast the machines
here will be.)
UNIX> gcc -o linear1 linear1.c
UNIX> linear1 1
N: 1 Count: 1 Time: 0
UNIX> linear1 10000
N: 10000 Count: 10000 Time: 0
UNIX> linear1 10000000
N: 10000000 Count: 10000000 Time: 8
UNIX> linear1 20000000
N: 20000000 Count: 20000000 Time: 14
UNIX> linear1 100000000
N: 100000000 Count: 100000000 Time: 72
UNIX>
Just what you'd think. In fact, the running time is roughly linear:
running time = 72n/100000000 seconds
As a sanity check, that predicts 72*20000000/100000000 = 14.4 seconds
for n = 20000000, which is close to the measured 14 seconds.
Obviously, count equals n.
Now, look at four other programs below. I will just show their loops:
linear2.c:
...
for (i = 0; i < n; i++) {
  count++;
  f = f * f + 1;
  j = (int) f / 1000;
  f = f - j*1000;
}
for (i = 0; i < n; i++) {
  count++;
  f = f * f + 1;
  j = (int) f / 1000;
  f = f - j*1000;
}
...
log.c:
...
for (i = 1; i <= n; i *= 2) {
  count++;
  f = f * f + 1;
  j = (int) f / 1000;
  f = f - j*1000;
}
...
nlogn.c:
...
for (k = 0; k < n; k++) {
  for (i = 1; i <= n; i *= 2) {
    count++;
    f = f * f + 1;
    j = (int) f / 1000;
    f = f - j*1000;
  }
}
...
nsquared.c:
...
for (i = 0; i < n; i++) {
  for (k = 0; k < n; k++) {
    count++;
    f = f * f + 1;
    j = (int) f / 1000;
    f = f - j*1000;
  }
}
...
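To see how one of these loops sits in a complete file, here is a sketch
of what log.c plausibly looks like in full, reusing the scaffolding of
linear1.c (details such as the usage string are my guesses, since only
the loop is shown above):

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main(int argc, char **argv)
{
  int n, i, j;
  double f, count;
  long t0;

  if (argc != 2) {
    fprintf(stderr, "Usage: log n\n");
    exit(1);
  }
  n = atoi(argv[1]);
  t0 = time(0);
  count = 0;
  f = 23.4;
  /* i doubles every iteration, so this body runs about log2(n) times */
  for (i = 1; i <= n; i *= 2) {
    count++;
    f = f * f + 1;
    j = (int) f / 1000;
    f = f - j*1000;
  }
  printf("N: %d Count: %.0lf Time: %ld\n", n, count, (long) (time(0) - t0));
  return 0;
}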
Across the five programs, the value of count will be the
following (note: I will expect you to be able to do things like
tell me what count is as a function of n on tests):
- linear1.c: count = n.
- linear2.c: count = 2n.
- log.c: count = log(n). (where the logarithm is base 2).
- nlogn.c: count = n*log(n). (where the logarithm is base 2).
- nsquared.c: count = n*n.
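If it helps to see those formulas side by side, here is a quick
throwaway sketch (not one of the five programs) that tabulates
count for a few values of n; log2() is the base-2 logarithm from
math.h, so compile with -lm:

#include <stdio.h>
#include <math.h>

/* Tabulate the count formulas above for a few sample values of n. */
int main(void)
{
  double n;

  printf("%10s %10s %10s %6s %12s %16s\n",
         "n", "linear1", "linear2", "log", "nlogn", "nsquared");
  for (n = 1000; n <= 100000000; n *= 100) {
    printf("%10.0f %10.0f %10.0f %6.1f %12.0f %16.0f\n",
           n, n, 2*n, log2(n), n*log2(n), n*n);
  }
  return 0;
}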
In each program, the running time is going to be directly proportional
to count. Why? Read chapter two for how to count instructions.
So, what do the running times look like if you increase n to large
values? I have the output of the various programs in the following
table of links:
[Table of links: timing output for each of the five programs.]
Some things you should notice right off the bat: log(n) is very,
very small in comparison to n. This means that log.c
is blazingly fast for even huge values of n. On the other end
of the spectrum, n*n grows very quickly as n increases.
Below, I graph all of the programs' running times as a function of
n:
[Graph: running time vs. n for all five programs.]
So, this shows what you'd think:
log(n) < n < 2n < n*log(n) < n*n
Perhaps it's hard to gauge how much each is less than the other until you
see it. Below I plot the same graph, but zoomed in a bit so you can get
a better feel for n*log(n) and n*n.
[Graph: the same running times, zoomed in.]
Back to Big O: Function comparison
Big-O notation is a way of classifying functions. The functions that
we care about are the running times of programs. The first concept in
dealing with Big O is comparing functions. Basically, we will say that
one function f(n) is greater than another g(n)
if there is a value x so that for all i >= x:
f(i) >= g(i)
Put graphically, it means that after a certain point on the x axis,
as we go right, the curve for f(n) will always be at least as
high as the curve for g(n). Thus, given the graphs above, you can see that
n*n is greater than
n*log(n), which is greater than
2n, which is greater than
n, which is greater than
log(n).
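One way to convince yourself of a comparison like this is to hunt for
that value x numerically. Here is a sketch that does so for n*log(n)
versus 2n (my choice of functions and search range, purely for
illustration):

#include <stdio.h>
#include <math.h>

/* Find the point x after which f(i) >= g(i) held for every i we
   tested.  Here f(n) = n*log2(n) and g(n) = 2n; compile with -lm. */
double f(double n) { return n * log2(n); }
double g(double n) { return 2 * n; }

int main(void)
{
  double i, x = 1;

  for (i = 1; i <= 1000000; i++) {
    if (f(i) < g(i)) x = i + 1;   /* remember the last failure */
  }
  printf("f(i) >= g(i) for every tested i >= %.0f\n", x);
  return 0;
}

This prints 4: n*log(n) dips below 2n only for n < 4, so x = 4 works
in the definition above.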
So, here are some functions:
- a(n) = 1
- b(n) = 100
- c(n) = 6-n
- d(n) = n
- e(n) = 2n
- f(n) = 2n-5
- g(n) = n*n - 5000000000
- h(n) = log(n)
- i(n) = log(n) - 100
- j(n) = n*log(n)-100
So, we can ask ourselves questions: Is b(n) > a(n)? Yes. Why?
Because for any value of n, b(n) is 100, and a(n) is
1. Therefore for any value of n, b(n) is greater than
a(n).
That was easy. How about c(n) and b(n)? b(n) is
greater, because for any value of n greater than 6, b(n)
is 100 and c(n) is negative.
Here's a total ordering of the above. Make sure you can prove all
of these to yourselves:
g(n) > j(n) > e(n) > f(n) > d(n) > h(n) > i(n) > b(n) > a(n) > c(n)
Some rules:
- If a function consists only of polynomial terms, the Big-O value of the
function is determined by the term with the
largest degree, regardless of how big the coefficients are. For example,
f(n) = .00001n^2 + 1000n is O(n^2) despite the fact that
n's coefficient is so much greater than n^2's coefficient
(see the sketch after this list).
Formally this rule can be written as:
If f(n) is a polynomial of degree k and g(n) is a
polynomial of degree l < k, and both lead coefficients are
positive, then f(n) > g(n).
- If f(n) and g(n) are both polynomials of
degree k, then the lead coefficient defines which one is greater. For
example, if f(n) = 6n and g(n) = 4n then f(n) is greater.
- When you write the Big-O value for a function omit any coefficients. For
example, write O(n), not O(6n). If a function's Big-O value is constant,
write O(1), not O(6).
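As promised in the first rule, here is a numerical sketch of why
.00001n^2 + 1000n is O(n^2). With c = 2, the bound
c*(.00001n^2) >= f(n) holds exactly when .00001n^2 >= 1000n, that is,
when n >= 100000000:

#include <stdio.h>

/* Check f(n) = .00001*n*n + 1000*n against the bound 2*(.00001*n*n).
   The bound starts holding at n = 100000000. */
int main(void)
{
  double n, f, bound;

  for (n = 1000000; n <= 1e10; n *= 10) {
    f = .00001*n*n + 1000*n;
    bound = 2 * .00001*n*n;
    printf("n = %12.0f  f(n) = %18.0f  bound = %18.0f  %s\n",
           n, f, bound, bound >= f ? "holds" : "fails");
  }
  return 0;
}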
Big O
Given the above, you should now be able to read the book's definitions
of big-O, Omega, Theta and little-o:
T(N) = O(f(N)) if there exist positive constants c and n0 such that
c*f(N) >= T(N) whenever N >= n0.
This means given the definitions of a(n) through j(n)
above:
- a(n) = O(1).
- b(n) = O(1).
- b(n) = O(n). This is because n >= b(n) for all n >= 100.
- b(n) = O(n*n). This is because n*n >= b(n) for all n >= 10.
- b(n) = O(a(n)).
- a(n) = O(b(n)).
- e(n) = O(n). This is because 2n >= e(n). I.e.
with c = 2, cn >= e(n).
- i(n) = O(log(n)).
- j(n) = O(n*log(n))
- g(n) = O(n*n)
Note that O(f(N)) is an upper bound on T(N). That means
that T(N) grows no faster than f(N). Similarly,
if g(N) > f(N) and T(N) = O(f(N)), then
T(N) = O(g(N)) too.
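To tie this back to the definition, here is one more sketch that
spot-checks the claim b(n) = O(n) by testing a concrete witness pair,
c = 1 and n0 = 100 (the range tested is arbitrary):

#include <stdio.h>

/* Spot-check b(n) = O(n): with c = 1 and n0 = 100, we should find
   c*n >= b(n) = 100 for every n >= n0 that we test. */
int main(void)
{
  int n, c = 1, n0 = 100, ok = 1;

  for (n = n0; n <= 1000000; n++) {
    if (c * n < 100) ok = 0;
  }
  printf("c*n >= b(n) for all n in [%d, 1000000]: %s\n",
         n0, ok ? "yes" : "no");
  return 0;
}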
Read the book for the other definitions.