As each row of the map is traversed, the value of the current pixel is compared to the value of the previous pixel in that row (west neighbor) and the pixel value in the same column of the previous row (north neighbor). Abiding by the NEWS neighborhood rule, if all pixels are in the same cluster, the smallest cluster label is assigned to all three pixels and any changes in the status of the west and north neighbors are recorded in the working arrays. The smallest of the two values is chosen to reduce the memory requirements of the csize array where the maximum size is a function of the density of the map and the pixel clustering patterns. For the best use of memory by the csize array, the best case scenario is a map entirely of 1's or 0's which would contain only 1 cluster of size . The csize array size would be 1. Conversely, the worst case scenario is a checkerboard pattern, alternating 1's and 0's throughout the map, resulting in clusters each containing only 1 pixel. The csize array size would be . When the algorithm completes, each index in the csize array holds one of three values: a zero, the number of pixels in the cluster identified by the csize index value, or a pointer to another csize index. The last two values are differentiated by the arithmetic sign of the value---positive represents the pixel count in the indexed cluster whereas negative represents a pointer to another csize array index. De-referencing the pointer, which could follow a non-circular, multi-step path, is accomplished by indexing into the csize array using the absolute value of the pointer. The HK algorithm does not guarantee a properly labeled map; however, relabeling, or adjusting pixel values to accurately reflect cluster membership can be accomplished in linear time using the working array produced by the HK algorithm. Figure 6 illustrates the results of the working array csize after merging the temporary cluster labeled 2 from the top row into the temporary cluster labeled 1.
Before the merge, temporary cluster 1 contained 9 pixels, and temporary cluster 2 contained 2 pixels. After the addition of the current pixel (1 pixel) and the pixels from temporary cluster 2 (2 pixels), temporary cluster 1 now contains 12 pixels (9 + 2 + 1), and temporary cluster 2 shows a pointer to temporary cluster 1 in the csize working array. It is possible to process extremely large maps in linear time using a divide-and-conquer approach to cluster identification because at a point in time, the algorithm only requires access to the working array csize, the current pixel, the north neighboring pixel, and the pixel value of the west neighbor. In this work, both an improved sequential implementation of the HK algorithm and a parallel implementation on the Thinking Machines CM-5 are presented.