How to Filter the Unique Email Addresses?


The same email address can be written in different forms. In the “username@domain” format, the username part may contain optional dots as separators, and these dots are ignored. The following all point to the same email address:

  • userabcdef@domain
  • user.abcdef@domain
  • userabc.def@domain
  • user.abc.def@domain

We can also attach a label to an email address using the format “username+label@domain”. Everything from the plus sign up to the “@” is ignored. For example, the following all point to the same email address:

  • userabcdef+label1@domain
  • user.abcdef+label2@domain
  • userabc.def+label3@domain
  • user.abc.def+label4@domain

Your task is to count the number of unique email addresses in a given list of valid email addresses.
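The two rules above can be sketched as a small normalization helper. This is a minimal illustration, not part of the original solutions: the name normalizeEmail is hypothetical, and the input is assumed to contain an “@”, as the problem guarantees valid addresses.

```cpp
#include <string>

// Hypothetical helper: reduce an address to its canonical form by
// dropping dots in the username and cutting the username at the first '+'.
std::string normalizeEmail(const std::string& email) {
    const auto at = email.find('@');
    if (at == std::string::npos) {
        return email; // assumption: valid input always has '@'; pass through otherwise
    }
    std::string local;
    for (std::string::size_type i = 0; i < at; ++i) {
        if (email[i] == '+') {
            break;                      // the label starts here, ignore the rest
        }
        if (email[i] != '.') {
            local.push_back(email[i]);  // keep every character except dots
        }
    }
    return local + email.substr(at);    // substr(at) already begins with '@'
}
```

With this helper, user.abc.def+label4@domain and userabcdef@domain both reduce to userabcdef@domain, so counting unique addresses becomes counting distinct normalized strings. Dots in the domain part are left untouched.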

C++ using unordered_set

The C++ unordered_set can be used to keep the set of unique email addresses seen so far. We iterate over the array and, for each email address, split the string into the username and domain parts. We then scan the username character by character, skipping dots and appending every other character, until we reach either the plus sign or the end of the username.

#include <string>
#include <unordered_set>
#include <vector>
using namespace std;

class Solution {
public:
    int numUniqueEmails(vector<string>& emails) {
        unordered_set<string> data;
        for (const auto &n: emails) {
            int p = n.find('@');
            string x = "";
            for (auto i = 0; i < p; ++ i) {
                if (n[i] == '+') {
                    break;
                }
                if (n[i] != '.') {
                    x.push_back(n[i]); // append the character itself, not its numeric code
                }
            }
            data.insert(x + n.substr(p)); // n.substr(p) already starts with '@'
        }
        return data.size();
    }
};

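std::string::erase combined with std::remove (the erase-remove idiom) deletes every occurrence of a character from a std::string. The following version uses it to strip the dots, then cuts the username at the plus sign, which makes the code more readable.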

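#include <algorithm>
#include <string>
#include <unordered_set>
#include <vector>
using namespace std;

class Solution {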
public:
    int numUniqueEmails(vector<string>& emails) {
        unordered_set<string> data;
        for (const auto &n: emails) {
            int p = n.find('@');
            auto x = n.substr(0, p);
            // remove dots
            x.erase(std::remove(begin(x), end(x), '.'), end(x));
            auto y = x.find('+');
            if (y != string::npos) {
                x.erase(x.begin() + y, end(x));
            }
            data.insert(x + n.substr(p)); // n.substr(p) already starts with '@'
        }
        return data.size();
    }
};

JavaScript using Regular Expressions

In ES6, you can use the Set object in JavaScript. Therefore, we can iterate over the array using forEach, use regular expressions to remove the dots and the label after the plus sign from the username, and simply add the resulting email address to the Set.

/**
 * @param {string[]} emails
 * @return {number}
 */
var numUniqueEmails = function(emails) {
    var data = new Set();
    emails.forEach(function(v) {
        var arr = v.split("@");
        arr[0] = arr[0].replace(/\./g, ''); // remove dot
        arr[0] = arr[0].replace(/(\+.*)/g, '');  // remove label
        data.add(arr.join("@"));
    });
    return data.size;
};
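For comparison, the same regular-expression idea can be sketched in C++ with std::regex_replace. This is an illustrative variant only: the function name countUniqueEmailsRegex is hypothetical, and skipping entries without an “@” is an assumption, since the problem states that all inputs are valid.

```cpp
#include <regex>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical function name; mirrors the two replace() calls used in JavaScript.
int countUniqueEmailsRegex(const std::vector<std::string>& emails) {
    static const std::regex label(R"(\+.*)"); // the plus sign and everything after it
    static const std::regex dot(R"(\.)");     // a literal dot
    std::unordered_set<std::string> seen;
    for (const auto& e : emails) {
        const auto at = e.find('@');
        if (at == std::string::npos) {
            continue; // assumption: ignore entries that have no '@'
        }
        std::string local = e.substr(0, at);
        local = std::regex_replace(local, label, ""); // remove label
        local = std::regex_replace(local, dot, "");   // remove dots
        seen.insert(local + e.substr(at));            // e.substr(at) starts with '@'
    }
    return static_cast<int>(seen.size());
}
```

std::regex is typically slower than a hand-written scan, so the character-by-character versions are likely the better choice for large inputs. The regex form mainly trades speed for brevity.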

See also: Teaching Kids Programming – Number of Unique Email Addresses
