Leetcode.com problem 1268: "Search-Suggestions-System"
James S. Plank
Mon Jun 20 01:01:58 PM EDT 2022
Introduction
This is a nice problem in running time analysis. I solved it in two ways:
Way one -- just use a set and lower_bound()
This was my first pass -- quite straightforward:
- Insert all of the words from products into a set.
- Iterate for every prefix of the search word:
- Call lower_bound() on the set, and then check up to three elements from the return value,
to make sure that the prefix is a prefix of that word. You are guaranteed that when
you fail, you're done.
I've put the commented code in
set-solution.cpp. It has a main() so you can test
with example files. Here's the first example from Leetcode:
UNIX> cat example-1.txt
mobile mouse moneypot monitor mousepad
mouse
UNIX> ./a.out < example-1.txt
mobile moneypot monitor
mobile moneypot monitor
mouse mousepad
mouse mousepad
mouse mousepad
UNIX>
When I submitted it, I was surprised that it was faster than anything:
- Runtime: 97 ms, faster than 67.40% of C++ online submissions.
- Memory Usage: 33.3 MB, less than 67.40% of C++ online submissions.
I suspect the reason is that for some reason students hate sets and always want to use
vectors and poor searching. Maybe? No clue.
But let's analyze the running time of this. Let's assign some variables:
- Number of elements in problems: n
- Length of the search word w.
- Average length of words in problems: a.
Obviously, creating the set is O(n log n), but if the words are similar, we should
factor in a, so make it O(an log n).
Now, for each substring, we create the prefix, which will be O(prefix-size).
We then perform lower_bound(), which is O(log n), and then do up to three
comparisons -- those are O(prefix-size) too. So I've got each iteration
to be O(w log n). That makes O(w2 log n).
So that's a total of O(aw2 log n). That feels slow, but of
course, the size of the return value can be O(w2), so maybe
it's not so bad.
Way two -- a list of indices
This one was a little challenging to program, but it did turn out to be faster.
I did the following:
- I sorted problems.
- Next, I put the indices from 0 to problems.size()-1 onto a list.
The idea is that the list gives me potential words from problems to
test. I use a list, because once I determine that a word doesn't have
the prefix, I'll delete it from the list.
- Now, I start with the first character of the search word, and I
look on the list, starting at the front. If a word on the list has
the first character, I put it on the return vector (until the return
vector has three elements. Once it has three elements, I don't put
it on the return vector, but I do keep processing). I then move onto
the next word.
If a word doesn't have the first character, then I delete it.
You'll notice that at the end of the first pass, the list will only
contain indices of words that have the first character of the search
word.
- Then, I can keep processing successive characters of the search word,
again, deleting elements of the list that don't have the proper
character.
I should give an example, but I don't have time. The running time of this
is O(an log n) for sorting the products, and then O(an)
for processing the list. Why O(an)? -- well, potentially each character
of each word of products is visited, with an overhead of O(1)
for each visit (remember, deleting an element from a list is O(1).)
Of course, creating the solution ends up being O(w2), so that's
probably the dominant term. That wasn't the case in the first answer.
My solution was a little more efficient than the description above, but
those are the broad strokes. My timing was much better:
- Runtime: 60 ms, faster than 82.96% of C++ online submissions.
- Memory Usage: 24 MB, less than 92.19% of C++ online submissions.
The code is in
list-solution.cpp.
I said, it deviates from the code above, but I popped in a few comments
to help you understand. Only venture there if you're brave....