The code is in reverse_1.cpp:
#include <iostream>
#include <list>
#include <string>
using namespace std;

int main()
{
  list <string> lines;
  list <string>::iterator lit;
  string s;

  while (getline(cin, s)) lines.push_front(s);

  for (lit = lines.begin(); lit != lines.end(); lit++) {
    cout << *lit << endl;
  }
  return 0;
}
A few things -- you declare an empty list just like you declare an empty vector. In fact, the code to create the list is very much like the code to create a vector, except we are using push_front() to prepend each string to the front of the list.
To traverse the list, we use an iterator, which is a special type defined by the template library. The for loop is typical -- you start with the first element of the list, obtained with the begin() method, and traverse until you are one element beyond the end of the list (signified by the end() method). To go from one element to the next, you increment the iterator. I don't like this usage of overloading, but it wasn't up to me.
Then, to access the element in the list, you use pointer indirection (the asterisk). When you get used to seeing this code, it reads nicely. It does take a little acclimation though. Regardless, it works:
UNIX> cat input.txt
Born in the night
She would run like a leopard
That freaks at the sight
Of a mind close beside herself
UNIX> reverse_1 < input.txt
Of a mind close beside herself
That freaks at the sight
She would run like a leopard
Born in the night
UNIX>

#include <iostream>
#include <list>
#include <string>
using namespace std;

int main()
{
  list <string> lines;
  list <string>::reverse_iterator lit;
  string s;

  while (getline(cin, s)) lines.push_back(s);

  for (lit = lines.rbegin(); lit != lines.rend(); lit++) {
    cout << *lit << endl;
  }
  return 0;
}
We've created the list with push_back(), and we change lit to be a reverse_iterator. The iteration proceeds from rbegin(), which is the last element of the list, to rend(), which is one element before the first element of the list. Note, we still increment lit -- is that natural? You be the judge.
UNIX> reverse_2 < input.txt
Of a mind close beside herself
That freaks at the sight
She would run like a leopard
Born in the night
UNIX>
The program reverse_3.cpp implements reversal by inserting each element at the front and traversing the list in the forward direction:
#include <iostream>
#include <list>
#include <string>
using namespace std;

int main()
{
  list <string> lines;
  list <string>::iterator lit;
  string s;

  while (getline(cin, s)) lines.insert(lines.begin(), s);

  for (lit = lines.begin(); lit != lines.end(); lit++) {
    cout << *lit << endl;
  }
  return 0;
}
It works like the others:
UNIX> reverse_3 < input.txt
Of a mind close beside herself
That freaks at the sight
She would run like a leopard
Born in the night
UNIX>

void insertItem(string newItem, string target, list<string> &groceryList)
{
  list <string>::iterator lit;

  for (lit = groceryList.begin(); lit != groceryList.end(); lit++) {
    if (*lit == target) {
      groceryList.insert(lit, newItem);
      break;
    }
  }
  if (lit == groceryList.end()) groceryList.push_back(newItem);
}
The code iterates through the grocery list until it locates the target item. It then inserts the new item before this target item. If the code fails to locate the target item, then it will fall out of the loop, having reached groceryList.end(). We can test for this condition, and if it is true, then we add the new item to the back of the list.
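To see both cases of insertItem in action, here is a small self-contained sketch. The grocery items and the helper name demoInsert are made up for illustration; insertItem itself is the function shown above, with the strings passed by const reference to avoid copies.

```cpp
#include <list>
#include <string>
using namespace std;

// insertItem as described above: insert newItem before target,
// or append newItem if target is not in the list.
void insertItem(const string &newItem, const string &target, list<string> &groceryList)
{
  list<string>::iterator lit;

  for (lit = groceryList.begin(); lit != groceryList.end(); lit++) {
    if (*lit == target) {
      groceryList.insert(lit, newItem);   // insert() places newItem before lit
      break;
    }
  }
  if (lit == groceryList.end()) groceryList.push_back(newItem);
}

// Hypothetical helper: exercises the "found" and "not found" cases.
list<string> demoInsert()
{
  list<string> g;
  g.push_back("milk");
  g.push_back("eggs");
  g.push_back("bread");

  insertItem("butter", "eggs", g);   // "eggs" is present: butter goes before it
  insertItem("jam", "tofu", g);      // "tofu" is absent: jam goes at the back
  return g;
}
```

After demoInsert() runs, the list holds milk, butter, eggs, bread, jam, in that order.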
You can also efficiently delete items from the middle of a list. For example, suppose we have a list of integers and we want to output them in sorted order, from smallest to largest. One way to do this is to iterate through the list and find the smallest integer. We output this integer and delete it from the list. We then repeat this procedure on the list until we have printed all the integers in the list and emptied it. As a concrete example, suppose the list has the elements { 8, 2, 1, 10 }. Our algorithm will then work as follows:
The code for this program can be found in sort.cpp. sort.cpp takes a list of command line arguments, which should be integers, and prints them out in sorted order. We are assuming that we are sorting a list of box sizes, each of which is an integer. The relevant function is findMin, which iterates through the list, finds the minimum element, deletes it, and returns the minimum element:
int findMin(list<int> &boxSizes)
{
  list <int>::iterator lit;
  list <int>::iterator min;

  min = boxSizes.begin();
  for (lit = boxSizes.begin(); lit != boxSizes.end(); lit++) {
    if (*lit < *min) {
      min = lit;
    }
  }
  int returnValue = *min;
  boxSizes.erase(min);
  return returnValue;
}

findMin uses two iterators, min and lit. lit is used to iterate through each element of the list and min is used to "remember" the location of the smallest element found so far in the list. When the loop terminates, min points to the smallest element in the list. We assign this value to a temporary variable, and then delete the element from the list using the list's erase method. The order in which we save the value and do the deletion is important. Once we call erase(min), the iterator min no longer refers to a valid element, so dereferencing it afterwards is undefined. Hence we must save the value before erasing the element. Note also that findMin assumes that the list is not empty.
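The repeated "find the minimum, print it, delete it" procedure can be sketched as a loop around findMin. This is not the actual main() of sort.cpp, which is not shown here; the name sortedOrder and the choice to return a vector are assumptions made for illustration.

```cpp
#include <list>
#include <vector>
using namespace std;

// findMin from above: removes the smallest element and returns it.
// The list must not be empty.
int findMin(list<int> &boxSizes)
{
  list<int>::iterator lit;
  list<int>::iterator min;

  min = boxSizes.begin();
  for (lit = boxSizes.begin(); lit != boxSizes.end(); lit++) {
    if (*lit < *min) min = lit;
  }
  int returnValue = *min;    // save the value before erasing
  boxSizes.erase(min);
  return returnValue;
}

// Hypothetical driver: boxSizes is passed by value, so the caller's
// list is left untouched while our copy is emptied.
vector<int> sortedOrder(list<int> boxSizes)
{
  vector<int> out;
  while (!boxSizes.empty()) out.push_back(findMin(boxSizes));
  return out;
}
```

On the list { 8, 2, 1, 10 } the successive calls to findMin return 1, 2, 8 and 10. Each call scans the whole remaining list, so the overall sort is quadratic, but each individual deletion is cheap.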
#include <iostream>
#include <vector>
#include <string>
using namespace std;

int main()
{
  vector <string> lines;
  vector <string>::iterator lit;
  string s;

  while (getline(cin, s)) lines.insert(lines.begin(), s);

  for (lit = lines.begin(); lit != lines.end(); lit++) {
    cout << *lit << endl;
  }
  return 0;
}
I call this ill-judged because when you perform an insertion such as v.insert(v.begin(), x), the STL basically does the following:
v.resize(v.size()+1);
for (i = v.size()-1; i > 0; i--) v[i] = v[i-1];
v[0] = x;

In other words, it moves every existing element of the vector over by one to make room for the new element at v[0]. This is expensive, and makes reverse_4.cpp above run in O(n^2) time.
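To make the quadratic cost concrete, the sketch below performs the front insertion by hand and counts how many elements get shifted. The function names are invented for this illustration, and a real STL implementation moves the elements in its own way; only the count is the point.

```cpp
#include <string>
#include <vector>
using namespace std;

// Insert x at the front of v by hand and return the number of
// existing elements that had to be shifted one slot to the right.
size_t insertFrontCounting(vector<string> &v, const string &x)
{
  size_t shifts = 0;

  v.resize(v.size() + 1);
  for (size_t i = v.size() - 1; i > 0; i--) {
    v[i] = v[i-1];
    shifts++;
  }
  v[0] = x;
  return shifts;
}

// Total number of shifts when n elements are inserted at the front, one by one.
size_t totalShifts(size_t n)
{
  vector<string> v;
  size_t total = 0;

  for (size_t k = 0; k < n; k++) total += insertFrontCounting(v, "line");
  return total;
}
```

The k-th insertion shifts k-1 elements, so n insertions shift 0 + 1 + ... + (n-1) = n(n-1)/2 elements in total. For n = 100 that is 4950 shifts, and multiplying n by 4 multiplies the work by roughly 16.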
To illustrate, input-2.txt is an input file with 10,000 lines, and input-3.txt is one with 40,000 lines. Look at the difference in speed between reverse_3 and reverse_4:
UNIX> wc input-2.txt
10000 10000 80000 input-2.txt
UNIX> wc input-3.txt
40000 40000 320000 input-3.txt
UNIX> time reverse_3 < input-2.txt > /dev/null
0.012u 0.000s 0:00.01 100.0% 0+0k 0+0io 0pf+0w
UNIX> time reverse_3 < input-3.txt > /dev/null
0.024u 0.008s 0:00.03 66.6% 0+0k 0+0io 0pf+0w
UNIX> time reverse_4 < input-2.txt > /dev/null
0.452u 0.000s 0:00.45 100.0% 0+0k 0+0io 0pf+0w
UNIX> time reverse_4 < input-3.txt > /dev/null
7.008u 0.012s 0:07.04 99.5% 0+0k 0+0io 0pf+0w
UNIX>

As you can see, reverse_3 is very fast (0.012 and 0.024 seconds on one of our hydra machines in 2014), while reverse_4 is painfully slow (0.45 and 7 seconds). With four times as many lines, reverse_4 takes roughly 16 times as long, which is what you expect from a quadratic algorithm. This is important, and you should take care that it doesn't happen to you.
A good rule of thumb is to use a vector as an array and not a list. Don't use iterators -- use integer indices. Then you're ok.
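Following that rule of thumb, a vector can still reverse its input cheaply: append with push_back() and walk the indices backward. The sketch below is one way to write that; the function name reverseLines and the stream parameters are assumptions for illustration, not a program from these notes.

```cpp
#include <iostream>
#include <sstream>   // only needed when calling reverseLines on string streams
#include <string>
#include <vector>
using namespace std;

// Read every line from in, then write the lines to out in reverse order.
// push_back() is cheap, and indexing from the back needs no iterators.
void reverseLines(istream &in, ostream &out)
{
  vector<string> lines;
  string s;

  while (getline(in, s)) lines.push_back(s);

  for (int i = (int) lines.size() - 1; i >= 0; i--) {
    out << lines[i] << endl;
  }
}
```

Calling reverseLines(cin, cout) from main() behaves like the reverse programs above, without ever inserting at the front of the vector.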
#include <iostream>
#include <deque>
#include <string>
using namespace std;

int main()
{
  deque <string> lines;
  size_t i;
  string s;

  while (getline(cin, s)) lines.push_front(s);

  for (i = 0; i < lines.size(); i++) cout << lines[i] << endl;
  return 0;
}
Unlike the vector version, this one runs very fast:
UNIX> reverse_5 < input.txt
Of a mind close beside herself
That freaks at the sight
She would run like a leopard
Born in the night
UNIX> time reverse_5 < input-2.txt > /dev/null
0.004u 0.004s 0:00.01 0.0% 0+0k 0+0io 0pf+0w
UNIX> time reverse_5 < input-3.txt > /dev/null
0.032u 0.000s 0:00.03 100.0% 0+0k 0+0io 0pf+0w
UNIX>

#include <iostream>
#include <list>
#include <string>
using namespace std;

int main()
{
  list <string> lines;
  list <string>::iterator lit;
  string s;

  while (getline(cin, s)) {
    lines.push_back(s);
    if (lines.size() > 10) lines.erase(lines.begin());
  }

  for (lit = lines.begin(); lit != lines.end(); lit++) {
    cout << *lit << endl;
  }
  return 0;
}
Works fine:
UNIX> mytail_list < input-2.txt
9991
9992
9993
9994
9995
9996
9997
9998
9999
10000
UNIX> mytail_list < input-3.txt
39991
39992
39993
39994
39995
39996
39997
39998
39999
40000
UNIX>

As with the previous example, we can port the code directly to vectors and to deques, since they both implement an erase() method. As with the other example, we see that the vector implementation performs worse, since it copies all of the remaining elements upon deletion (the shell scripts make them do more work so that you can see the difference):

UNIX> time sh big_mytail_list.sh
0.411u 0.013s 0:00.42 100.0% 0+0k 0+0io 0pf+0w
UNIX> time sh big_mytail_deque.sh
0.370u 0.012s 0:00.38 100.0% 0+0k 0+0io 0pf+0w
UNIX> time sh big_mytail_vector.sh
0.507u 0.012s 0:00.52 98.0% 0+0k 0+1io 0pf+0w
UNIX>

The difference isn't huge, but it is there. Were we to keep the last 100 lines instead of the last 10, the difference would be much more pronounced (we did this in class).
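The deque port mentioned above is not listed in these notes. Here is a minimal sketch of what it might look like, written as a function so that the number of lines to keep is a parameter. The name lastLines and its signature are assumptions for illustration.

```cpp
#include <deque>
#include <iostream>
#include <sstream>   // only needed when calling lastLines on string streams
#include <string>
using namespace std;

// Return the last `keep` lines read from in.
// Same logic as mytail_list, with a deque in place of the list.
deque<string> lastLines(istream &in, size_t keep)
{
  deque<string> lines;
  string s;

  while (getline(in, s)) {
    lines.push_back(s);
    if (lines.size() > keep) lines.erase(lines.begin());   // drop the oldest line
  }
  return lines;
}
```

Changing deque to vector or list in this function is enough to get the other two versions, which makes it easy to time them against each other with a larger value of keep.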
They give a few examples:
String | Number of diamonds |
"><<><>>><" | 3 |
">>>><<" | 0 |
"<<<<<<<<<>>>>>>>>>" | 9 |
"><<><><<>>>><<>><<><<>><<<>>>>>><<<" | 14 |
This implementation works directly from the problem statement, using the find() method of strings to find a diamond, and then using substr() to remove the diamond:
#include <iostream>
#include <string>
#include <cstdio>
#include <cstdlib>
using namespace std;

class DiamondHunt {
  public:
    int countDiamonds(string mine);
};

int DiamondHunt::countDiamonds(string mine)
{
  int nd;
  size_t i;

  nd = 0;
  while (1) {
    i = mine.find("<>");
    if (i == string::npos) return nd;
    nd++;
    mine = mine.substr(0, i) + mine.substr(i+2);
  }
}

int main()
{
  DiamondHunt d;
  string s;

  while (cin >> s) {
    cout << d.countDiamonds(s) << endl;
  }
  exit(0);
}
When we test it out, it works fine:
UNIX> g++ -o DiamondHunt1 DiamondHunt1.cpp
UNIX> DiamondHunt1
<>
1
><
0
><<><>>><
3
>>>><<
0
<<<<<<<<<>>>>>>>>>
9
><<><><<>>>><<>><<><<>><<<>>>>>><<<
14
UNIX>

Although this solution works, think about its running time. In particular, think about the "<<<<<<<<<>>>>>>>>>" input. It has to scan nine characters before finding the diamond. Then the next time it has to scan 8, then 7, etc. In other words, if you have a string of n less-than signs followed by n greater-than signs, you will perform n scans that together examine on the order of n^2 characters, and each substr() call copies the string besides. When n is small (25 in the topcoder constraints), that doesn't make a difference. However, it can matter. The program make_bad_diamond.cpp is a very simple C++ program that takes n on the command line and produces a string with n less-than signs followed by n greater-than signs. See what happens when we call it with successively larger values and time the output:

UNIX> time sh -c "make_bad_diamond 10 | DiamondHunt1"
10
0.000u 0.000s 0:00.00 0.0% 0+0k 0+0io 0pf+0w
UNIX> time sh -c "make_bad_diamond 100 | DiamondHunt1"
100
0.010u 0.000s 0:00.00 0.0% 0+0k 0+0io 0pf+0w
UNIX> time sh -c "make_bad_diamond 1000 | DiamondHunt1"
1000
0.010u 0.000s 0:00.01 100.0% 0+0k 0+0io 0pf+0w
UNIX> time sh -c "make_bad_diamond 10000 | DiamondHunt1"
10000
0.800u 0.000s 0:00.79 101.2% 0+0k 0+0io 0pf+0w
UNIX> time sh -c "make_bad_diamond 100000 | DiamondHunt1"
100000
79.350u 0.010s 1:19.66 99.6% 0+0k 0+0io 0pf+0w
UNIX>

When the input size is increased by a factor of 10 (from 10,000 to 100,000), the running time is increased by a factor of 100. That's not good.
Instead, DiamondHunt2.cpp uses a list. It copies the elements of mine to a list, and then uses three iterators on the list:
#include <iostream>
#include <string>
#include <cstdio>
#include <cstdlib>
#include <list>
using namespace std;

class DiamondHunt {
  public:
    int countDiamonds(string mine);
};

int DiamondHunt::countDiamonds(string mine)
{
  int nd;
  size_t i;
  list <char> l;
  list <char>::iterator left, right, newleft;

  for (i = 0; i < mine.size(); i++) l.push_back(mine[i]);

  nd = 0;
  left = l.begin();
  while (left != l.end()) {
    if (*left == '>') {
      left++;                  // If left is not the beginning of a diamond, move on.
    } else {
      right = left;
      right++;
      if (right == l.end()) return nd;
      if (*right == '<') {     // If right is not the end of a diamond, move on
        left++;
      } else {                 // Otherwise, we've found a diamond.  We need to
        nd++;                  // increment nd, and set newleft to point to the previous
                               // char, or if left is at the beginning, to the next one.
        if (left == l.begin()) {
          newleft = right;
          newleft++;
        } else {
          newleft = left;
          newleft--;
        }
        l.erase(left);         // Now erase left and right, and set left to newleft.
        l.erase(right);
        left = newleft;
      }
    }
  }
  return nd;
}

int main()
{
  DiamondHunt d;
  string s;

  while (cin >> s) {
    cout << d.countDiamonds(s) << endl;
  }
  exit(0);
}
It works on the examples as before:
UNIX> g++ -o DiamondHunt2 DiamondHunt2.cpp
UNIX> DiamondHunt2
<>
1
><
0
><<><>>><
3
>>>><<
0
<<<<<<<<<>>>>>>>>>
9
><<><><<>>>><<>><<><<>><<<>>>>>><<<
14
UNIX>

However, it is much faster than the previous version because we don't rescan from the beginning on each iteration, as we did with mine.find(), and because erasing from a list copies nothing:

UNIX> time sh -c "make_bad_diamond 1000 | DiamondHunt2"
1000
0.020u 0.000s 0:00.00 0.0% 0+0k 0+0io 0pf+0w
UNIX> time sh -c "make_bad_diamond 10000 | DiamondHunt2"
10000
0.020u 0.000s 0:00.00 0.0% 0+0k 0+0io 0pf+0w
UNIX> time sh -c "make_bad_diamond 100000 | DiamondHunt2"
100000
0.040u 0.010s 0:00.04 125.0% 0+0k 0+0io 0pf+0w
UNIX> time sh -c "make_bad_diamond 1000000 | DiamondHunt2"
1000000
0.470u 0.000s 0:00.44 106.8% 0+0k 0+0io 0pf+0w
UNIX>

As we increase the string by a factor of 10, we increase the running time by a factor of ten. That's much better than DiamondHunt1.
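Both sets of timings rely on make_bad_diamond.cpp, which is not listed in these notes. From its description (n less-than signs followed by n greater-than signs), the string it produces can be sketched as below. The function name makeBadDiamond is an assumption, and the real program additionally reads n from the command line and prints the string.

```cpp
#include <string>
using namespace std;

// Build the worst-case input described above:
// n '<' characters followed by n '>' characters.
string makeBadDiamond(size_t n)
{
  return string(n, '<') + string(n, '>');
}
```

For example, makeBadDiamond(3) is "<<<>>>". Every diamond in such a string sits in the middle, which is why a search that always restarts from the left does so much repeated work.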
It's important for you to understand the code in DiamondHunt2.cpp. To help you, here's an example of what happens when we call it on the string "<<>><<>". I will draw every iteration of the while() loop. Here are the list and the iterators in the first iteration:
I'm drawing the list with a sentinel node at each end. Before the first node is a sentinel node for l.rend(), and after the last node is a sentinel node for l.end(). We start with left equaling l.begin(), and since it points to a less-than character, we set right to be the next node. Since right also points to a less-than node, there is no diamond -- we increment left and go to the next iteration of the while() loop:
Now left points to a less-than and right points to a greater-than. So, we increment nd and then set newleft to be the node before left. That is pictured below:
We then erase left and right, and set left to newleft before going back to the top of the while() loop. Here's what happens in the next iteration:
The two erased nodes are gone from the picture, and left and right point to a diamond. Thus, nd is incremented, and since left is equal to l.begin(), we set newleft to be the node after right. That is the state pictured above. We then erase left and right, and set left to newleft before going back to the top of the while() loop. Here's what happens in the fourth iteration:
This is the same case as the first iteration -- no diamond. We increment left and move on:
We have a diamond. We first increment nd. Next, since left is not equal to l.begin(), we set newleft to point to the node before left. That is depicted above. We then erase, set left to newleft and reach the last iteration of the while() loop:
Since right equals l.end(), we return 3, and we're done. It's important that you step through this example until you understand it. You may even want to step through what happens when we call it on the string "<<>>><>". The execution is very similar, except the fourth and sixth iterations look a little different.
After creating the list of lists, we traverse it and print out each bottom-level list on one line:
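If you want to check a hand trace, a simple counter gives the same answers. To be clear, this is not the list-based method of DiamondHunt2.cpp: it is a different technique (bracket matching with a counter of unmatched less-than signs), and the function name is made up. It relies on the observation that repeatedly removing "<>" pairs each '>' with the nearest unmatched '<' to its left.

```cpp
#include <string>
using namespace std;

// Cross-check for hand traces: count diamonds by matching each '>'
// with an earlier, still-unmatched '<'.  Not the method from the notes.
int countDiamondsByCounting(const string &mine)
{
  int open = 0;   // number of '<' seen so far that have no partner yet
  int nd = 0;     // number of diamonds found

  for (size_t i = 0; i < mine.size(); i++) {
    if (mine[i] == '<') {
      open++;
    } else if (open > 0) {   // a '>' that closes an unmatched '<'
      open--;
      nd++;
    }                        // a '>' with nothing to match is ignored
  }
  return nd;
}
```

It returns 3 for both "<<>><<>" and "<<>>><>", which matches the trace above, and it agrees with the four examples from the problem statement.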
#include <iostream>
#include <list>
using namespace std;

typedef list <int> intlist;

int main()
{
  list <intlist *> numlists;
  list <intlist *>::iterator nlit;
  intlist *il;
  intlist::iterator ilit;
  int i, j;

  for (j = 0; j < 100; j += 10) {
    il = new intlist;
    numlists.push_back(il);
    for (i = 0; i < 10; i++) {
      il->push_back(i+j);
    }
  }

  for (nlit = numlists.begin(); nlit != numlists.end(); nlit++) {
    il = *nlit;
    for (ilit = il->begin(); ilit != il->end(); ilit++) cout << *ilit << " " ;
    cout << endl;
  }
  return 0;
}
The typedef makes the code cleaner, so that you don't have nested list declarations.
This code runs nicely, and as you'd expect:
UNIX> list_o_list_1
0 1 2 3 4 5 6 7 8 9
10 11 12 13 14 15 16 17 18 19
20 21 22 23 24 25 26 27 28 29
30 31 32 33 34 35 36 37 38 39
40 41 42 43 44 45 46 47 48 49
50 51 52 53 54 55 56 57 58 59
60 61 62 63 64 65 66 67 68 69
70 71 72 73 74 75 76 77 78 79
80 81 82 83 84 85 86 87 88 89
90 91 92 93 94 95 96 97 98 99
UNIX>

You have undoubtedly noticed the fact that our bottom level list is a pointer to a list, which we create using new. Why do we do this? The answer is that if we don't use pointers, we expose ourselves to problems with making copies of things. Let's see what happens if we try to avoid pointers.
A first straightforward try is in list_o_list_2.cpp, which just takes out the new and changes pointers to non-pointers:
#include <iostream>
#include <list>
using namespace std;

typedef list <int> intlist;

int main()
{
  list <intlist> numlists;
  list <intlist>::iterator nlit;
  intlist il;
  intlist::iterator ilit;
  int i, j;

  for (j = 0; j < 100; j += 10) {
    numlists.push_back(il);
    for (i = 0; i < 10; i++) {
      il.push_back(i+j);
    }
  }

  for (nlit = numlists.begin(); nlit != numlists.end(); nlit++) {
    il = *nlit;
    for (ilit = il.begin(); ilit != il.end(); ilit++) cout << *ilit << " " ;
    cout << endl;
  }
  return 0;
}
When we run it, we get some icky results:
UNIX> list_o_list_2

0 1 2 3 4 5 6 7 8 9
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89
UNIX>

What's going on? Well, first, every call to numlists.push_back(il) stores a copy of il as it is at that moment. The inner loop then inserts the integers into il itself, not into the copy that is now in numlists. Because il is never cleared, it keeps growing, so each later copy contains everything inserted so far. The first copy was made while il was still empty, which is why the first line is blank -- we are printing an empty list.
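The copying behavior can be isolated in a few lines. The sketch below is an illustration written for this explanation, with a made-up function name: it pushes an empty list onto a list of lists, then adds to the original, and reports the sizes of both.

```cpp
#include <list>
#include <utility>
using namespace std;

typedef list<int> intlist;

// Returns (size of the copy stored in numlists, size of il itself).
pair<size_t, size_t> sizesAfterPush()
{
  list<intlist> numlists;
  intlist il;

  numlists.push_back(il);   // stores a copy of il, which is empty right now
  il.push_back(1);          // these change il only; the stored copy
  il.push_back(2);          // is a separate list and stays empty

  return make_pair(numlists.back().size(), il.size());
}
```

The stored copy has size 0 while il has size 2. This is exactly the situation in list_o_list_2.cpp after the first pass through the outer loop.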
We can fix this by getting rid of il and accessing the list elements directly. A solution is in list_o_list_3.cpp:
#include <iostream>
#include <list>
using namespace std;

typedef list <int> intlist;

int main()
{
  list <intlist> numlists;
  list <intlist>::iterator nlit;
  intlist il;
  intlist::iterator ilit;
  int i, j;

  for (j = 0; j < 100; j += 10) {
    numlists.resize(numlists.size()+1);
    for (i = 0; i < 10; i++) {
      numlists.back().push_back(i+j);   /* Yuck */
    }
  }

  for (nlit = numlists.begin(); nlit != numlists.end(); nlit++) {
    il = *nlit;
    for (ilit = il.begin(); ilit != il.end(); ilit++) cout << *ilit << " " ;
    cout << endl;
  }
  return 0;
}
That's an awful line of code, isn't it? Spend some time reading it to make sure you understand it. It seems to work fine:
UNIX> list_o_list_3
0 1 2 3 4 5 6 7 8 9
10 11 12 13 14 15 16 17 18 19
20 21 22 23 24 25 26 27 28 29
30 31 32 33 34 35 36 37 38 39
40 41 42 43 44 45 46 47 48 49
50 51 52 53 54 55 56 57 58 59
60 61 62 63 64 65 66 67 68 69
70 71 72 73 74 75 76 77 78 79
80 81 82 83 84 85 86 87 88 89
90 91 92 93 94 95 96 97 98 99
UNIX>

However, there is a bug. The output is correct, but the program does needless work. The culprit is the line

il = *nlit;
This line makes a copy of *nlit, which makes a copy of the list. As I said in class, it makes one pine for C, which doesn't let you make copies so wantonly. To fix this, remove il completely (list_o_list_4.cpp):
#include <iostream>
#include <list>
using namespace std;

typedef list <int> intlist;

int main()
{
  list <intlist> numlists;
  list <intlist>::iterator nlit;
  intlist::iterator ilit;
  int i, j;

  for (j = 0; j < 100; j += 10) {
    numlists.resize(numlists.size()+1);
    for (i = 0; i < 10; i++) {
      numlists.back().push_back(i+j);
    }
  }

  for (nlit = numlists.begin(); nlit != numlists.end(); nlit++) {
    for (ilit = nlit->begin(); ilit != nlit->end(); ilit++) cout << *ilit << " " ;
    cout << endl;
  }
  return 0;
}
Again, I find that code unreadable -- in fact, this code is so ugly, you may as well put it all on one line (list_o_list_5.cpp):
#include <iostream>
#include <list>
using namespace std;
typedef list <int> intlist;
int main() { list <intlist> numlists; list <intlist>::iterator nlit; intlist::iterator ilit; int i, j; for (j = 0; j < 100; j += 10) { numlists.resize(numlists.size()+1); for (i = 0; i < 10; i++) { numlists.back().push_back(i+j); } } for (nlit = numlists.begin(); nlit != numlists.end(); nlit++) { for (ilit = nlit->begin(); ilit != nlit->end(); ilit++) cout << *ilit << " " ; cout << endl; } return 0; }
For the record, I don't advocate doing this -- it's just that list_o_list_4.cpp is so unreadable it may as well be on one line.
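If the nlit-> syntax is what makes list_o_list_4.cpp hard to read, one alternative is to bind a reference to the inner list instead of copying it. This is a sketch written for these notes rather than one of the list_o_list programs, and the function name printLists is an assumption. The reference gives the inner list a readable name without making a copy.

```cpp
#include <iostream>
#include <list>
#include <sstream>   // only needed when calling printLists on string streams
using namespace std;

typedef list<int> intlist;

// Print each bottom-level list on its own line without copying any of them.
void printLists(const list<intlist> &numlists, ostream &out)
{
  list<intlist>::const_iterator nlit;
  intlist::const_iterator ilit;

  for (nlit = numlists.begin(); nlit != numlists.end(); nlit++) {
    const intlist &il = *nlit;   // a reference: no copy is made here
    for (ilit = il.begin(); ilit != il.end(); ilit++) out << *ilit << " ";
    out << endl;
  }
}
```

Passing numlists by const reference matters for the same reason: passing it by value would copy the outer list and every inner list along with it.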