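    #include <set>
    #include <string>
    #include <iostream>
    using namespace std;

    int main()
    {
      string s;
      set <string> names;
      set <string>::iterator nit;

      /* Read lines until input fails, inserting each one into the set. */
      while (!cin.fail()) {
        getline(cin, s);
        if (!cin.fail()) names.insert(s);
      }

      /* Traversing the set visits the strings in sorted order. */
      for (nit = names.begin(); nit != names.end(); nit++) {
        cout << *nit << endl;
      }
      return 0;
    }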
Instead of using push_back(), like you do with lists or vectors, you use insert(), which puts the string into its sorted position in the set. The traversal is exactly like traversing a list.
    UNIX> cat input-1.txt
    Tim
    David
    Adrian
    Hamza
    UNIX> simple_set < input-1.txt
    Adrian
    David
    Hamza
    Tim
    UNIX>

The first question you should have is: "What about duplicate entries?" For example, let's try input-2.txt:
    UNIX> cat input-2.txt
    Tim
    David
    Adrian
    Hamza
    Tim
    UNIX> simple_set < input-2.txt
    Adrian
    David
    Hamza
    Tim
    UNIX>

As you can see, it does not insert duplicates. If you want to allow duplicates, you use a multiset, as in simple_multiset.cpp:
    #include <set>
    #include <string>
    #include <iostream>
    using namespace std;

    int main()
    {
      string s;
      multiset <string> names;
      multiset <string>::iterator nit;

      /* Same as simple_set, but a multiset keeps duplicate lines. */
      while (!cin.fail()) {
        getline(cin, s);
        if (!cin.fail()) names.insert(s);
      }

      for (nit = names.begin(); nit != names.end(); nit++) {
        cout << *nit << endl;
      }
      return 0;
    }
    UNIX> simple_multiset < input-2.txt
    Adrian
    David
    Hamza
    Tim
    Tim
    UNIX>
    map <string, int> names;
    map <string, int>::iterator nit;
We'll write a simple example. This example assumes that input is as in Roster.txt: it is composed of first and last names of people. (Our example is all the NFL players in 2009 whose last names begin with "A", in random order.) We'll use a map as declared above to keep track of the last names and how many players have each last name. The program for this is in simple_map.cpp:
    #include <stdio.h>
    #include <iostream>
    #include <string>
    #include <map>
    using namespace std;

    int main()
    {
      map <string, int> names;
      map <string, int>::iterator nit;
      string fn, ln;

      /* Read first/last name pairs and count the players with each last name. */
      while (!cin.eof()) {
        cin >> fn >> ln;
        if (!cin.fail()) {
          nit = names.find(ln);
          if (nit == names.end()) {
            names.insert(make_pair(ln, 1));    /* First time we see this last name. */
          } else {
            nit->second++;                     /* Seen before: bump the count. */
          }
        }
      }

      for (nit = names.begin(); nit != names.end(); nit++) {
        cout << "Last name: " << nit->first << ". Number of players: " << nit->second << endl;
      }
      return 0;
    }
When you insert into a map, since you are inserting two things (a key and value), you must combine them into a pair with the make_pair() procedure. The types of the arguments must match the types specified in the declaration -- in this case, they must be a string and an integer.
The iterator for a map is different, too. Instead of simply dereferencing it with "*", you grab the key from an iterator with "->first" and the value with "->second". Yes, I wish they were called key and val, but that is life. When we run it on Roster.txt, we get:
    UNIX> simple_map < Roster.txt
    Last name: Abdallah. Number of players: 1
    Last name: Abdullah. Number of players: 2
    Last name: Abiamiri. Number of players: 1
    Last name: Abraham. Number of players: 1
    Last name: Adams. Number of players: 7
    .....

We can check for correctness with grep:
    UNIX> grep Abdallah Roster.txt
    Nader Abdallah
    UNIX> grep Adams Roster.txt
    Gaines Adams
    Jamar Adams
    Anthony Adams
    Michael Adams
    Titus Adams
    Flozell Adams
    Mike Adams
    UNIX> grep Adams Roster.txt | wc
          7      14      90
    UNIX>

As with sets, you traverse maps in ascending order of their keys, and you can't insert duplicate keys. Since simple_map.cpp performs the find() and only performs the insert() when the key is not found, the limitation on duplicate keys is not a problem. If you need duplicate keys, use a multimap.
    #include <stdio.h>
    #include <iostream>
    #include <string>
    #include <set>
    #include <map>
    using namespace std;

    typedef set <string> fnset;

    int main()
    {
      map <string, fnset *> lnames;
      map <string, fnset *>::iterator lnit;
      fnset *fnames;
      fnset::iterator fnit;
      int i;
      string fn, ln, name;

      while (!cin.eof()) {
        cin >> fn;
        if (!cin.fail()) {
          cin >> ln;
          lnit = lnames.find(ln);
          if (lnit == lnames.end()) {
            fnames = new fnset;                      /* New last name: create its set. */
            lnames.insert(make_pair(ln, fnames));
          } else {
            fnames = lnit->second;                   /* Existing last name: reuse its set. */
          }
          fnames->insert(fn);
        }
      }

      /* Nested traversal: last names in order, then first names in order. */
      for (lnit = lnames.begin(); lnit != lnames.end(); lnit++) {
        fnames = lnit->second;
        for (fnit = fnames->begin(); fnit != fnames->end(); fnit++) {
          cout << *fnit << " " << lnit->first << endl;
        }
      }
      return 0;
    }
The program uses a map to sort the last names. The "second" field of the map is a pointer to a set, which sorts the first names that belong to that last name. When the program reads a name, it checks whether the last name is in the map. If so, it sets fnames to the set of first names with that last name. If not, it creates a new fnames set and inserts it and the last name into the map. Last, it inserts the first name into the set.
When it's done reading input, it does a nested traversal to print out all of the names.
Note the typedef statement to make the program read more easily.
This program will not print out duplicate names, because sets don't hold duplicate entries. If you wanted it to print out duplicate names, you would have to use a multiset.
    UNIX> sort_names_1 < Roster.txt | head
    Nader Abdallah
    Hamza Abdullah
    Husain Abdullah
    Victor Abiamiri
    John Abraham
    Anthony Adams
    Flozell Adams
    Gaines Adams
    Jamar Adams
    Michael Adams
    UNIX>
    #include <stdio.h>
    #include <iostream>
    #include <string>
    #include <set>
    #include <map>
    using namespace std;

    typedef set <string> fnset;

    int main()
    {
      map <string, fnset> lnames;
      map <string, fnset>::iterator lnit;
      fnset fnames;
      fnset::iterator fnit;
      int i;
      string fn, ln, name;

      while (!cin.eof()) {
        cin >> fn;
        if (!cin.fail()) {
          cin >> ln;
          lnit = lnames.find(ln);
          if (lnit == lnames.end()) {
            lnames.insert(make_pair(ln, fnames));
          } else {
            fnames = lnit->second;
          }
          fnames.insert(fn);
        }
      }

      for (lnit = lnames.begin(); lnit != lnames.end(); lnit++) {
        fnames = lnit->second;
        for (fnit = fnames.begin(); fnit != fnames.end(); fnit++) {
          cout << *fnit << " " << lnit->first << endl;
        }
      }
      return 0;
    }
This program is very buggy. Take a simple example:
    UNIX> head -n 2 Roster.txt
    Adam Anderson
    Andy Alleman
    UNIX> head -n 2 Roster.txt | sort_names_1
    Andy Alleman
    Adam Anderson
    UNIX> head -n 2 Roster.txt | sort_names_bad
    Adam Alleman
    UNIX>

Yuck. What's going on? Well, two things. Let's concentrate on the more egregious one first: the program reuses fnames to insert a first name into the set, and then uses that same fnames when it inserts a new last name into the map. That's wrong. Let's fix that by having two fnsets: fnames, which we'll use to insert first names, and fnames_empty, which we use to put an empty set into a newly created last name entry in the map. The result is sort_names_bad2.cpp:
    #include <stdio.h>
    #include <iostream>
    #include <string>
    #include <set>
    #include <map>
    using namespace std;

    typedef set <string> fnset;

    int main()
    {
      map <string, fnset> lnames;
      map <string, fnset>::iterator lnit;
      fnset fnames, fnames_empty;
      fnset::iterator fnit;
      int i;
      string fn, ln, name;

      while (!cin.eof()) {
        cin >> fn;
        if (!cin.fail()) {
          cin >> ln;
          lnit = lnames.find(ln);
          if (lnit == lnames.end()) {
            lnames.insert(make_pair(ln, fnames_empty));
            lnit = lnames.find(ln);
          }
          fnames = lnit->second;
          fnames.insert(fn);
        }
      }

      for (lnit = lnames.begin(); lnit != lnames.end(); lnit++) {
        fnames = lnit->second;
        for (fnit = fnames.begin(); fnit != fnames.end(); fnit++) {
          cout << *fnit << " " << lnit->first << endl;
        }
      }
      return 0;
    }
This one still doesn't work:
    UNIX> head -n 2 Roster.txt | sort_names_bad2
    UNIX>

Why? The culprit lies in these two lines:
    fnames = lnit->second;
    fnames.insert(fn);
The first of these lines makes a copy of lnit->second. You then insert the first name into the copy, which does not modify the fnset that is actually in lnit->second. To fix this, you need to insert directly into lnit->second. I do this in sort_names_bad3.cpp:
    #include <stdio.h>
    #include <iostream>
    #include <string>
    #include <set>
    #include <map>
    using namespace std;

    typedef set <string> fnset;

    int main()
    {
      map <string, fnset> lnames;
      map <string, fnset>::iterator lnit;
      fnset fnames, fnames_empty;
      fnset::iterator fnit;
      int i;
      string fn, ln, name;

      while (!cin.eof()) {
        cin >> fn;
        if (!cin.fail()) {
          cin >> ln;
          lnit = lnames.find(ln);
          if (lnit == lnames.end()) {
            lnames.insert(make_pair(ln, fnames_empty));
            lnit = lnames.find(ln);
          }
          lnit->second.insert(fn);
        }
      }

      for (lnit = lnames.begin(); lnit != lnames.end(); lnit++) {
        fnames = lnit->second;
        for (fnit = fnames.begin(); fnit != fnames.end(); fnit++) {
          cout << *fnit << " " << lnit->first << endl;
        }
      }
      return 0;
    }
This works as it should:
    UNIX> sort_names_1 < Roster.txt > out1.txt
    UNIX> sort_names_bad3 < Roster.txt > out2.txt
    UNIX> diff out1.txt out2.txt
    UNIX>

So, now you say, "Ok, it works. Why can't I do this?" The answer is twofold. First, the fact that you can't have a variable point to lnit->second is not only inconvenient, it makes your programs very hard to read. Second, you'll find yourself setting variables to lnit->second and making copies when you don't have to. For example, look at the for loop that prints out the names:
    for (lnit = lnames.begin(); lnit != lnames.end(); lnit++) {
      fnames = lnit->second;
      for (fnit = fnames.begin(); fnit != fnames.end(); fnit++) {
        cout << *fnit << " " << lnit->first << endl;
      }
    }
It is making copies of lnit->second. Even though it's not a bug, it's extremely inefficient in terms of both time and memory. Get into the habit of using pointers in the second field of your maps.
    pair<iterator, bool> set::insert(const TYPE& val);
The "(const TYPE& val)" simply means that it works with the type that you specify when you define the set.
The return value is a pair much like what you pass to the insert() call of a map. Its first field will be an iterator for the set, and the second will be a boolean. If the element is inserted, then the iterator will point to the newly inserted element. Otherwise, you tried to insert a duplicate, and the iterator points to the value already in the set. The second field reports whether the item was inserted or not.
To see usage, take a look at setreturn.cpp:
    #include <set>
    #include <string>
    #include <iostream>
    using namespace std;

    typedef set <string> string_set;

    int main()
    {
      string s;
      string_set names;
      string_set::iterator nit;
      pair <string_set::iterator, bool> retval;

      while (!cin.fail()) {
        getline(cin, s);
        if (!cin.fail()) {
          retval = names.insert(s);
          if (retval.second) {          /* True if s was actually inserted. */
            cout << s << ": Successfully inserted.\n";
          } else {
            cout << s << ": Duplicate not inserted.\n";
          }
        }
      }
      return 0;
    }
Note how it returns a pair, whose fields you access with dots rather than arrows. Why then do you use arrows in iterators on maps? Because those iterators point to pairs -- they are not pairs themselves.
    UNIX> cat input-2.txt
    Tim
    David
    Adrian
    Hamza
    Tim
    UNIX> setreturn < input-2.txt
    Tim: Successfully inserted.
    David: Successfully inserted.
    Adrian: Successfully inserted.
    Hamza: Successfully inserted.
    Tim: Duplicate not inserted.
    UNIX>