Your second project will be to build a sorting utility called
volsort
in C++ that will mimic the UNIX tool "sort" with a
twist: the underlying container must be a linked list.
Your program must implement the following sorting methods or modes:
qsort: This sorting method simply uses the qsort function.
merge: This sorting method uses a custom implementation of the merge sort algorithm.
quick: This sorting method uses a custom implementation of the quick sort algorithm.
This project is to be done in groups of 1 - 4 students and is due Friday September 18th, 2024 at 5:00pm.
My implementation, which you should mimic, has the following usage message:
$ volsort -h usage: volsort -m MODE Sorting mode (oblivious, stl, qsort, merge, quick) -n Perform numerical ordering
volsort
therefore must take two optional arguments: the -m
flag, which lets the user select which sorting method to use; and
the -n
flag, which informs the program to treat the input values as numeric
integers rather than textual strings.
As with most Unix filters, volsort
reads from [standand input] and then
outputs to standard output:
# Use the volsort solution to order 10 random numbers in the file input $ volsort -m quick < input 12 32 37 37 50 61 69 71 90 97
Your header file should contains the definition of the struct Node
and struct List
such as this:
struct Node { std::string string; int number; Node *next; }; struct List { Node *head; size_t size; List(); ~List(); void push_front(const std::string &s); };
This struct List
is a singly-linked list with a
constructor, destructor, and a single push_front method. This
struct List
consists of struct Node
entries that contain both string
values, the corresponding number value, and a pointer to the next struct
Node
.
It is required that you implement the struct List
constructor,
destructor, and push_front method. The push_front method should
insert a new struct Node
into the front of the list. To get the number
value for the new struct Node
, you should use the std::stoi function (if
this conversion fails, you can default to 0
for the number value).
You should implement the struct
Node
comparison functions in C
++ style as:
// C++ Style comparison function bool node_number_compare(const Node *a, const Node *b); bool node_string_compare(const Node *a, const Node *b);
Thus, there are two comparision functions in total: one for comparing the
number field and the other for comparing the string field of the
struct Node
Additionally, it will help to debug if you also implement the dump_node
utility function
which outputs the contents of each node start from n
until nullptr
is
reached:
// Utility function void dump_node(Node *n);
The first sorting mode is to use the std::sort method. Because our struct List
does not implement random access iterators,
we will need to copy the struct Node
s in the struct List
into another
container such as an array. Once we have this copy, we can then call the
std::sort
function on this copy with the appropriate comparison function.
Note that our Node struct has three values, one of which is a string.
The most efficient way to reconstitute the sorted order is to update
the links between the struct Node
s -- not to simply
replacing the list data with node data. For example, for a list of
n strings you may have to run atoi n times, versus
simply updating n pointers as suggested here.
Hint: You should store struct Node *
in your copy array.
Hint: Make sure you set the head of the struct List
after you have
set the links of all the struct Node
s.
The second sorting mode could be nearly identical to the first function, except you will use qsort instead of std::sort.
Both of the custom merge sort and quick sort implementations must use the following divide-and-conquer strategy. Msort is used here as the initial example:
Node *msort(Node *head, CompareFunction compare) { // Handle base case // Divide into left and right sublists // Conquer left and right sublists // Combine left and right sublists }
Specifically, you could implement the third sorting mode as follows:
// Public functions void merge_sort(List &l, bool numeric); // Internal functions Node *msort(Node *head, CompareFunction compare); void split(Node *head, Node *&left, Node *&right); Node *merge(Node *left, Node *right, CompareFunction compare);
merge_sort
takes a struct List
and whether or not the comparison should
be numeric
and performs the top-down merge sort algorithm. This
function serves as a wrapper or helper function for the recursive msort
function.
msort
is the recursive portion of the algorithm and calls split
to
divide and calls merge
to conquer. It returns the new head
of
the list.
split
is a helper function that splits the singly-linked list in half
by using the slow and faster pointer trick (aka [tortoise and hare]).
merge
is a helper function that combines both the left
and right
lists and returns the new head
of the list.
The fourth sorting mode is a custom implementation of quick sort. You are to implement a simple version of the algorithm where the first element is the pivot.
// Public functions void quick_sort(List &l, bool numeric); // Internal functions Node *qsort(Node *head, CompareFunction compare) void partition(Node *head, Node *pivot, Node *&left, Node *&right, CompareFunction compare); Node *concatenate(Node *left, Node *right);
quick_sort
takes a struct List
and whether or not the comparison should
be numeric
and performs the quick sort algorithm. This function serves
as a wrapper or helper function for the recursive qsort
function.
qsort
is the recursive portion of the algorithm and calls partition
to
divide the list and calls concatenate
to conquer. It returns the
new head
of the list.
partition
is a helper function that splits the singly-linked list into
two left
and right
lists such that all elements less than the pivot
are in the left
side and everything else is on the right
.
concatenate
is a helper function that combines both the left
and
right
lists and returns the new head
of the list.
To test your programs, simply run make test
. Your programs must correctly
process the input
correctly, have no memory errors, and pass the test cases provided.
When you are finished implementing your volsort
sorting utility answer the
following questions in the project02/README.md
:
Benchmark all four different sorting modes on files with the following number of integers:
10, 100, 1000, 10000, 100000, 1000000, 10000000
Record the elapsed time and memory consumption of each mode and file size in a Markdown table:
| Mode | Size | Elapsed Time | |---------|-------|---------------| | STL | 10 | 0 | | STL | 100 | 0 | | ... | ... | ... | | QSORT | 10 | 0 | | ... | ... | ... | | MERGE | 10 | 0 | | ... | ... | ... | | QUICK | 10 | 0 | | ... | ... | ... |
Hint: Use the Unix utility time to get the elapsed time per test.
Hint: Write a simple C/C++ program or create a Python script to generate the input files.
Optional: If you have prior experience, write another simple script (shell script for example) to automate the benchmarking.
You may wish to create your test files in /tmp
as that is likely larger than your EECS local directory.
Likewise, you should redirect the output of volsort
to /dev/null
for
the benchmarking.
After you have performed your benchmark:
Discuss the relative performance of each sorting method and try to explain the differences.
What do these results reveal about the relationship between theoretical complexity discussed in class and actual performance?
In your opinion, which sorting mode is the best? Justify your conclusion by examining the trade-offs for the chosen mode.
In addition to the questions, please provide a brief summary of each individual group members contributions to the project.
To faciliate testing, you were previously asked to clone the course Github repository as follows:
git clone https://github.com/semrich/cs302-fall24.git cs302
For this assignment, update this clone by using the following:
git pull
We'll discuss this in class but note that your program must be compilable using make. To test your solution against ours, type:
make test
As we grade the assignment, "as intended" means you achieve the correct output in a given mode as outlined in this project document above. We may choose to assign up to 50% of the total points for as many alternative implementations, e.g., volsort -m stl copies list into a vector, sorts using std:sort(v.begin(), v.end()), and copies back into the list if:
Full rubric in three total parts follows:
Part 1: Project high-level requirements
Part 2: Make test
Part 3: Report in Git readme (as requested) or separate doc
To submit your solution, you must submit a single TAR archive (.tar) on Canvas prior to the deadline.
Note: Although submission will be faciliated by Canvas, we will compile and test on EECS lab machines!
If you develop your solution elsewhere please make sure it works on the lab computers prior to the deadline