Graph Definitions

See also Chapter 9 of Weiss.

Glossary


Graph Representation

Data Structures for Vertices

Vertices are typically represented as a set. Four commonly used representations for this set are a hash table, an array, a list, and a binary search tree (such as a red-black tree). A hash table is generally preferred because it has O(1) expected access time and supports dynamic insertions and deletions. Its one drawback is that it is somewhat inefficient for visiting all the vertices in a graph, since you have to walk through each hash table entry and, for each entry, walk through its list of vertices. Since most hash tables have some empty entries, the visits to those empty entries are wasted.
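As a minimal sketch of the hash table option, the Python code below uses the built-in dict as a stand-in for the hash table. The Vertex class, its fields, and the vertex names are hypothetical, chosen only for illustration. Python's dict is laid out differently from the chained table described above, so it hides the walk over empty entries.

```python
class Vertex:
    """Hypothetical vertex record; the fields are illustrative only."""
    def __init__(self, name):
        self.name = name
        self.adjacent = []   # placeholder for edge data

# A Python dict is a hash table, so lookup, insertion, and deletion
# take O(1) expected time.
vertices = {}

def add_vertex(name):
    # Insert only if the name is not already present.
    if name not in vertices:
        vertices[name] = Vertex(name)
    return vertices[name]

def remove_vertex(name):
    # Delete the vertex if present; do nothing otherwise.
    vertices.pop(name, None)

add_vertex("ATL")
add_vertex("ORD")
add_vertex("TYS")
remove_vertex("ORD")

# Visiting every vertex means scanning the whole table.
names = sorted(v.name for v in vertices.values())
```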

An array works well when the number of vertices is fixed and known in advance. In this case the array can be pre-allocated. If the vertices are numbered consecutively, then the array supports direct access; otherwise the array must be kept sorted by vertex number and a vertex is found by binary search in O(lg V) time. An advantage of an array is that all the vertices can be visited by simply walking through the array entries.
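The two array cases can be sketched as follows. This is an illustration only: the vertex ids and the find_vertex helper are hypothetical, and the second case assumes the array is kept sorted by id.

```python
from bisect import bisect_left

# Case 1: vertices numbered 0..V-1, so the vertex number is the array index.
V = 5
labels = [None] * V          # pre-allocated, one slot per vertex
labels[3] = "vertex three"   # direct O(1) access by vertex number

# Case 2: non-consecutive ids, kept in sorted order.
ids = [4, 17, 23, 42, 99]    # hypothetical vertex ids

def find_vertex(vertex_id):
    """Return the array index of vertex_id, or -1 if absent (O(lg V))."""
    i = bisect_left(ids, vertex_id)
    if i < len(ids) and ids[i] == vertex_id:
        return i
    return -1
```

Under these assumptions, find_vertex(23) returns the index 2, while an id that is not in the array yields -1.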

A list is typically the least preferable of the data structures. Its advantage is that all the vertices can be visited by simply walking through the list. However, finding a vertex requires O(V) time, and inserting a vertex may also require O(V) time if the list is kept sorted.

The red-black tree has the advantage of finding a vertex in O(lg V) time.

Data Structures for Edges

The two most common representations for edges are a matrix and an adjacency list.

Adjacency Matrix

An adjacency matrix is a two-dimensional V x V array, where V is the number of vertices: it has one row and one column per vertex.

Values of the entries

Implementation: If the language supports a bit data type, then matrices are typically stored as bit arrays. However, if the language does not support a bit data type, or if the edges are represented as objects, then an array of integers or of edge objects is typically used.
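Python has no bit data type, so the sketch below packs the matrix into a bytearray, one bit per entry. It assumes a directed graph on vertices numbered 0..n-1 and assumes that a 1 bit means the edge is present and a 0 bit means it is absent. The BitMatrix class and its method names are hypothetical.

```python
class BitMatrix:
    """Adjacency matrix for a directed graph on vertices 0..n-1, one bit per entry."""

    def __init__(self, n):
        self.n = n
        # n*n bits, rounded up to whole bytes; all zeros means "no edges".
        self.bits = bytearray((n * n + 7) // 8)

    def _locate(self, u, v):
        k = u * self.n + v            # row-major position of entry (u, v)
        return k >> 3, 1 << (k & 7)   # byte index, bit mask within that byte

    def add_edge(self, u, v):
        i, mask = self._locate(u, v)
        self.bits[i] |= mask

    def remove_edge(self, u, v):
        i, mask = self._locate(u, v)
        self.bits[i] &= ~mask & 0xFF  # clear the bit, keep the byte in range

    def has_edge(self, u, v):
        i, mask = self._locate(u, v)
        return bool(self.bits[i] & mask)
```

Each operation touches a single byte, so it runs in O(1) time. For an undirected graph, one option is to set both entry (u, v) and entry (v, u) when adding an edge.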

Adjacency List

The adjacency list for a vertex is a list of the vertices to which it is connected (i.e., the vertices that are attached to it by edges). In an adjacency list representation, each vertex typically points to its own adjacency list. In an undirected graph, if there is an edge from x to y, then the adjacency list for x will have an entry for y and the adjacency list for y will have an entry for x.
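A minimal sketch of the undirected case, assuming each vertex is identified by a string and its adjacency list is a Python list held in a dict. The function names are hypothetical.

```python
from collections import defaultdict

# Each vertex maps to the list of vertices it shares an edge with.
adj = defaultdict(list)

def add_undirected_edge(x, y):
    adj[x].append(y)   # entry for y on x's list
    adj[y].append(x)   # entry for x on y's list

def has_edge(x, y):
    # Scans x's list, so this can take O(V) time in a simple graph.
    # adj.get avoids creating an empty list for an unknown vertex.
    return y in adj.get(x, ())

add_undirected_edge("a", "b")
add_undirected_edge("a", "c")
```

After these two calls, the list for "a" holds "b" and "c", while the lists for "b" and "c" each hold only "a".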

Matrices versus Lists

As a rough rule of thumb, matrices are typically used for dense graphs and adjacency lists for sparse graphs. The reason is that matrices consume less space for dense graphs and adjacency lists consume less space for sparse graphs. However, space is only one consideration. Other factors must also be taken into account:

  1. Inserting an edge, determining if an edge exists, and deleting an edge all take O(1) time in a matrix but potentially O(V) time in an adjacency list. If time is of the essence and these operations predominate, then a matrix may be the best choice, period.
  2. Initializing a matrix takes O(V^2) time. Consequently, while subsequent operations may be fast, any algorithm that builds a matrix pays a start-up cost proportional to V^2, so its running time is at least quadratic in V. Important: This is a case where Big-O notation can trip you up. In a long-running system like an airline reservation system, the most important consideration is generally how long the steady-state operations take, not how long the initialization step takes. Consequently, while an algorithm may appear to be O(V^2) because of the initialization time, for all intents and purposes it may act like an O(V) algorithm or an O(1) algorithm if it runs for a long period of time. Nonetheless, in computing the Big-O running time of an algorithm, the time devoted to the initialization steps must be included.
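The steady-state point in item 2 can be made concrete with a little arithmetic. The sketch below assumes a simplified cost model that is not from the text: initialization costs exactly V*V steps and each later operation costs one step. The function name average_cost is hypothetical.

```python
def average_cost(V, k):
    """Average steps per operation: V*V steps to initialize, then k one-step operations."""
    return (V * V + k) / k

V = 100
for k in (10, 10_000, 10_000_000):
    print(k, average_cost(V, k))
```

With V = 100, the average is 1001 steps per operation after 10 operations, 2 steps after 10,000 operations, and about 1.001 steps after 10,000,000 operations. The longer the system runs, the closer the average gets to the O(1) steady-state cost.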