Hash Algorithm to Check A List Containing Almost Same Strings (Differ By One Character)


You are given a list of lowercase alphabet strings words where each string is of the same length. Return whether there’s two strings that differ only in one index.

Constraints
1 ≤ n ≤ 100,000 where n is the total number of characters in words
Example 1
Input
words = [“abcd”, “aaaa”, “abcf”]
Output
True
Explanation
We see that “abcd” and “abcf” only differ in the last index.

Using Set to Find Out If there Are Almost Same Strings

The trick here is to use a hash set to include all the variants of strings that differ by only 1 character. Thus we can go through each string, and character by character replacing each character by a wide-char (or any other non-alpha characters), then insert this into the hash set. If we find a variant is already in the hash set, we know there are at least two strings (original) that differ by only 1 character.

The following is the Python code to check if there are at least two almost-same strings in the list:

1
2
3
4
5
6
7
8
9
10
class Solution:
    def hasAlmostSameStrings(self, words):
        seen = set()
        for i in words:
            for j in range(len(i)):
                s = i[:j] + "*" + i[j + 1:]
                if s in seen:
                    return True
                seen.add(s)
        return False
class Solution:
    def hasAlmostSameStrings(self, words):
        seen = set()
        for i in words:
            for j in range(len(i)):
                s = i[:j] + "*" + i[j + 1:]
                if s in seen:
                    return True
                seen.add(s)
        return False

The time complexity is O(NM) where N is the number of strings and M is the number of characters in each string. The space complexity is O(NM) as we are using hash set.

C++ implementation of the same algorithm:

1
2
3
4
5
6
7
8
9
10
11
bool hasAlmostSameStrings(vector<string>& words) {
    unordered_set<string> data;
    for (auto &n: words) {
        for (int i = 0; i < static_cast<int>(n.size()); ++ i) {
            string v = n.substr(0, i) + "*" + n.substr(i + 1);
            if (data.count(v)) return true;
            data.insert(v);
        }
    }
    return false;
}
bool hasAlmostSameStrings(vector<string>& words) {
    unordered_set<string> data;
    for (auto &n: words) {
        for (int i = 0; i < static_cast<int>(n.size()); ++ i) {
            string v = n.substr(0, i) + "*" + n.substr(i + 1);
            if (data.count(v)) return true;
            data.insert(v);
        }
    }
    return false;
}

See also: Teaching Kids Programming – Using Hash Set to Find Out Almost Same Strings

–EOF (The Ultimate Computing & Technology Blog) —

GD Star Rating
loading...
419 words
Last Post: Teaching Kids Programming - Algorithm to Check If Array is Monotonic
Next Post: Teaching Kids Programming - Adding Two Linked List

The Permanent URL is: Hash Algorithm to Check A List Containing Almost Same Strings (Differ By One Character)

Leave a Reply