Leetcode.com problem 1239: "Maximum Length of a Concatenated String with Unique Characters"

James S. Plank

Tue Oct 25 00:14:09 EDT 2022


In case Leetcode's servers are down

Here is a summary of the problem:

You are to write the following method of the Solution class:

int maxLength(vector<string>& arr);

Here are details:

Examples:

1: arr = { "un", "iq", "ue" }             Answer: 4 - either "uniq" or "ique"
2: arr = { "cha", "r", "act", "ers" }     Answer: 6 - either "chaers" or "acters"
3: arr = { "abcdefghijklmnopqrstuvwxyz" } Answer: 26 - duh
4: arr = { "aba", "bcc", "e" }            Answer: 1 - "e"


Hints

The first thing that you should do is identify the strings in arr that have no repeated characters. While you can do that with a map or a 26-element vector, let's do the fastest and most compact thing we can do: Bit arithmetic!!

Use an int to represent all of the characters in the string. For each character in a string, check to see if its bit in the integer is set, and if so, the string has duplicates. If not, for every character, then the string is fine.

Example: aba:
index    character         integer(in binary)         bit-set?             new-integer
0           a       00000000000000000000000000000000    no     00000000000000000000000000000001
1           b       00000000000000000000000000000001    no     00000000000000000000000000000011
2           a       00000000000000000000000000000011   yes     string has duplicates

Example: dab:
index    character         integer(in binary)         bit-set?             new-integer
0           d       00000000000000000000000000000000    no     00000000000000000000000000001000
1           a       00000000000000000000000000001000    no     00000000000000000000000000001001
2           b       00000000000000000000000000001001    no     00000000000000000000000000001011
String is ok.

Ok -- now we have a new vector, call it arru, whose strings have no repeated characters. In fact, you shouldn't store strings in arru, but the integer representations that you calculated above.

The maximum size of this vector is 16, so we can easily do a power set enumeration of all of its subsets. For each subset, test for repeated characters in the same way as above, with bit arithmetic. However, now you don't have to go character by character. You can simply have an int that has all of the characters seen so far, and you can:


Here's example 1 to illustrate:

arr = { "un", "iq", "ue" }

                zyxwvutsrqponmlkjihgfedcba
arru[0] = 00000000000100000010000000000000
arru[1] = 00000000000000010000000100000000
arru[2] = 00000000000100000000000000010000

Test subset 000 = {}.       set = 00000000000000000000000000000000   answer = 0
Test subset 001 = { 0 }     set = 00000000000100000010000000000000   answer = 2
Test subset 010 = { 1 }     set = 00000000000000010000000100000000   answer = 2
Test subset 011 = { 0,1 }   set = 00000000000100010010000100000000   answer = 4
Test subset 100 = { 2 }     set = 00000000000100000000000000010000   answer = 2
Test subset 101 = { 0, 2 }  find character u twice                   answer = 0
Test subset 110 = { 1, 2 }  set = 00000000000100010000000100010000   answer = 4
Test subset 111 = { 0,1,2 } find character u twice                   answer = 0