Leetcode.com problem 1268: "Search-Suggestions-System"

James S. Plank

Mon Jun 20 01:01:58 PM EDT 2022

Problem Statement.
Success Rate (as of June, 2022): 67%

Introduction

This is a nice problem in running time analysis. I solved it in two ways:

Way one -- just use a set and lower_bound()

This was my first pass -- quite straightforward:

Insert all of the words from products into a set.
Iterate for every prefix of the search word:
Call lower_bound() on the set, and then check up to three elements from the return value, to make sure that the prefix is a prefix of that word. You are guaranteed that when you fail, you're done.

I've put the commented code in set-solution.cpp. It has a main() so you can test with example files. Here's the first example from Leetcode:

UNIX> cat example-1.txt
mobile mouse moneypot monitor mousepad
mouse
UNIX> ./a.out < example-1.txt
 mobile moneypot monitor
 mobile moneypot monitor
 mouse mousepad
 mouse mousepad
 mouse mousepad
UNIX>

When I submitted it, I was surprised that it was faster than anything:

Runtime: 97 ms, faster than 67.40% of C++ online submissions.
Memory Usage: 33.3 MB, less than 67.40% of C++ online submissions.

I suspect the reason is that for some reason students hate sets and always want to use vectors and poor searching. Maybe? No clue.

But let's analyze the running time of this. Let's assign some variables:

Number of elements in problems: n
Length of the search word w.
Average length of words in problems: a.

Obviously, creating the set is O(n log n), but if the words are similar, we should factor in a, so make it O(an log n).

Now, for each substring, we create the prefix, which will be O(prefix-size). We then perform lower_bound(), which is O(log n), and then do up to three comparisons -- those are O(prefix-size) too. So I've got each iteration to be O(w log n). That makes O(w² log n).

So that's a total of O(aw² log n). That feels slow, but of course, the size of the return value can be O(w²), so maybe it's not so bad.

Way two -- a list of indices

This one was a little challenging to program, but it did turn out to be faster. I did the following:

I sorted problems.
Next, I put the indices from 0 to problems.size()-1 onto a list. The idea is that the list gives me potential words from problems to test. I use a list, because once I determine that a word doesn't have the prefix, I'll delete it from the list.
Now, I start with the first character of the search word, and I look on the list, starting at the front. If a word on the list has the first character, I put it on the return vector (until the return vector has three elements. Once it has three elements, I don't put it on the return vector, but I do keep processing). I then move onto the next word. If a word doesn't have the first character, then I delete it. You'll notice that at the end of the first pass, the list will only contain indices of words that have the first character of the search word.
Then, I can keep processing successive characters of the search word, again, deleting elements of the list that don't have the proper character.

I should give an example, but I don't have time. The running time of this is O(an log n) for sorting the products, and then O(an) for processing the list. Why O(an)? -- well, potentially each character of each word of products is visited, with an overhead of O(1) for each visit (remember, deleting an element from a list is O(1).)

Of course, creating the solution ends up being O(w²), so that's probably the dominant term. That wasn't the case in the first answer.

My solution was a little more efficient than the description above, but those are the broad strokes. My timing was much better:

Runtime: 60 ms, faster than 82.96% of C++ online submissions.
Memory Usage: 24 MB, less than 92.19% of C++ online submissions.

The code is in list-solution.cpp. I said, it deviates from the code above, but I popped in a few comments to help you understand. Only venture there if you're brave....