831. Masking Personal Information
Problem Description
This problem involves designing an algorithm to mask personal information. Personal information can be an email address or a phone number, and each has specific rules for masking.
Email Address Masking Rules:
- Transform all uppercase letters in the name and domain parts of the email address to lowercase.
- Mask all but the first and last letter of the name part with five asterisks (
*****
).
Phone Number Masking Rules:
- Remove all separation characters (which may include
+
,-
,(
,)
, and spaces) to only retain digits. - Format the number based on the leftover digit count after the local number (last 10 digits). The formatting is as follows:
- If there is no country code, the format is
***-***-XXXX
, whereXXXX
is the last four digits of the local number. - If there is a country code, it should be represented by the same number of asterisks as digits, followed by the formatted local number
***-***-XXXX
.
- If there is no country code, the format is
The algorithm must identify whether the provided string is an email or a phone number and mask it accordingly.
Intuition
The intuition behind solving this problem starts with identifying if the given s
is an email address or a phone number. This can usually be identified by the presence of alphabetic characters and the special character @
. If s[0]
is an alphabetic character, it is an email, otherwise it's a phone number.
For an email address:
- Convert the entire string to lowercase to satisfy the case-insensitivity requirement.
- Replace the middle part of the name with five asterisks, but maintain the first character, the character before the
@
, and the complete domain after the@
.
For a phone number:
- Remove all separation characters by filtering out non-digit characters, leaving only the digits.
- The number of characters remaining after the last 10 gives us the number of digits in the country code. Depending on this count, mask the country code with asterisks.
- The last 4 digits of the local number remain unmasked, while the others are replaced by asterisks, keeping consistent with the
***-***-
prefix.
The provided solution uses Python comprehensions to filter and the string methods to replace and format according to the rules. The code is compact and only uses conditional logic to distinguish the masking process between an email address and a phone number.
Solution Approach
The implementation of the masking algorithm can be broken down into several steps and patterns used throughout the code:
-
Determining the Type of Personal Information: The algorithm first checks if the given string
s
represents an email address or a phone number. This is done by checking the first character of the string:if s[0].isalpha():
If the first character is an alphabetical letter, it is an email address; otherwise, it is a phone number.
-
Email Masking:
- All characters of the email address are converted to lowercase using
lower()
to ensure case insensitivity. - The first and the last character of the name part, along with the domain, are preserved.
- The middle part of the name is replaced with five asterisks (
*****
).
return s[0] + '*****' + s[s.find('@') - 1 :]
- All characters of the email address are converted to lowercase using
-
Phone Number Masking:
- All non-digit characters are filtered out, leaving only digits in the string:
s = ''.join(c for c in s if c.isdigit())
- The algorithm calculates the length of the country code by subtracting 10 from the total length of the filtered digits. This is because the local number always consists of the last 10 digits.
- A switch-like statement, using if-elif, is avoided. Instead, a more elegant formatting via an f-string and repetition of asterisk characters is used to handle different lengths of country codes:
cnt = len(s) - 10 suf = '***-***-' + s[-4:] return suf if cnt == 0 else f'+{"*" * cnt}-{suf}'
- All non-digit characters are filtered out, leaving only digits in the string:
In summary, the solution makes use of string methods (like lower()
and find()
), list comprehensions (to filter characters), and string concatenation to assemble the masked personal information. It uses the compactness of Python's string formatting to keep the code concise and readable. There is no explicit usage of complex data structures as the operations are basic string manipulations. This approach is efficient because it goes through the string at most two times: once to either lower the case of the characters (for emails) or filter digits (for phone numbers), and once to construct the masked string. This is essentially O(n) complexity, where n is the length of the input string.
Ready to land your dream job?
Unlock your dream job with a 2-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's walk through a small example illustrating the solution approach for both an email and a phone number.
Email Example
Given the email [email protected]
, let's apply the solution approach:
- Since the first character 'E' is an alphabetical letter, we identify the string as an email address.
- Convert the email to lowercase:
[email protected]
. - Preserve the first and the last character of the name part and the entire domain: 'e' and 'l' are the two characters we preserve from
exampleemail
. - Replace the middle part of the name with five asterisks:
e*****[email protected]
.
After following these steps, we get the masked email address: e*****[email protected]
.
Phone Number Example
Given the phone number +1 (234) 567-8901
, let's apply the solution approach:
- The first character '+' indicates it's a phone number.
- Remove all non-digit characters, leaving only the digits:
12345678901
. - Identify the country code, which in this case is '1', by counting the digits beyond the last 10, which gives us 1 digit here.
- Replace the country code with an asterisk and apply the standard mask for the remaining number, preserving the last four digits:
* ***-***-8901
.
After following these steps, we get the masked phone number: * ***-***-8901
.
These two examples demonstrate how the algorithm distinguishes between email addresses and phone numbers and how it applies the respective masking rules to conceal personal information while following the specified format.
Solution Implementation
1class Solution:
2 def maskPII(self, string: str) -> str:
3 # Check if the string contains an email address by looking for an alphabetic character at the start
4 if string[0].isalpha():
5 string = string.lower() # Convert the email to lowercase
6 # Build the masked email with the first character, five asterisks, and the domain part
7 masked_email = string[0] + '*****' + string[string.find('@') - 1:]
8 return masked_email
9 # If it's not an email, assume it's a phone number
10 else:
11 # Remove all non-digit characters from the string
12 digits_only = ''.join(c for c in string if c.isdigit())
13 # Count the number of digits beyond the local 10-digit format
14 extra_digits_count = len(digits_only) - 10
15 # Create a suffix for the masked phone number
16 phone_suffix = '***-***-' + digits_only[-4:]
17 # If there are no extra digits, return the standard format
18 if extra_digits_count == 0:
19 return phone_suffix
20 # If there are extra digits (international format), add the country code mask
21 else:
22 return f'+{"*" * extra_digits_count}-{phone_suffix}'
23
24# Example usage:
25# solution = Solution()
26# print(solution.maskPII("[email protected]")) # Output: "j*****[email protected]"
27# print(solution.maskPII("1(234)567-890")) # Output: "***-***-7890"
28# print(solution.maskPII("+1(234)567-890")) # Output: "+*-***-***-7890"
29
1class Solution {
2
3 public String maskPII(String input) {
4 // Check if the first character is a letter to determine if it's an email
5 if (Character.isLetter(input.charAt(0))) {
6 // Convert the entire input string to lower case for uniformity
7 input = input.toLowerCase();
8 // Find the index of the '@' symbol in the email
9 int atIndex = input.indexOf('@');
10 // Create a masked email with the first character, five stars, and the domain part
11 return input.charAt(0) + "*****" + input.substring(atIndex - 1);
12 }
13
14 // StringBuilder to hold only the digits from the input (phone number)
15 StringBuilder digitBuilder = new StringBuilder();
16 // Loop through the characters in the input
17 for (char c : input.toCharArray()) {
18 // Append only if the character is a digit
19 if (Character.isDigit(c)) {
20 digitBuilder.append(c);
21 }
22 }
23
24 // Convert the StringBuilder to a string containing only digits
25 String digits = digitBuilder.toString();
26 // Calculate the number of digits that are in the international code
27 int internationalCodeLength = digits.length() - 10;
28 // Create a masked string for the last 7 digits of the phone number
29 String maskedSuffix = "***-***-" + digits.substring(digits.length() - 4);
30
31 // Depending on whether there is an international code, mask the phone number appropriately
32 if (internationalCodeLength == 0) {
33 // If there's no international code, return just the masked U.S. number
34 return maskedSuffix;
35 } else {
36 // If there's an international code, mask it and append the U.S. number
37 String starsForInternational = "+";
38 for(int i = 0; i < internationalCodeLength; i++) {
39 starsForInternational += "*";
40 }
41 return starsForInternational + "-" + maskedSuffix;
42 }
43 }
44}
45
1#include <cctype> // include ctype for using tolower, isdigit functions
2#include <string> // include string for using string class
3using namespace std;
4
5class Solution {
6public:
7 /**
8 * @param s: The original string, either an email or a phone number.
9 * @return: The masked string with PIIs (Personally Identifiable Information) hidden.
10 */
11 string maskPII(string s) {
12 // Find the position of '@', which indicates an email.
13 size_t atPosition = s.find('@');
14 // If "@" is found, mask it as an email.
15 if (atPosition != string::npos) {
16 string maskedEmail;
17 // Convert first character to lowercase and append.
18 maskedEmail += tolower(s[0]);
19 // Append 5 asterisks to hide part of the local name.
20 maskedEmail += "*****";
21 // Convert from before '@' to lowercase and append the domain.
22 for (size_t j = atPosition - 1; j < s.size(); ++j) {
23 maskedEmail += tolower(s[j]);
24 }
25 return maskedEmail;
26 }
27
28 // If "@" is not found, it's a phone number.
29 // Extract all digits from the string.
30 string digits;
31 for (char c : s) {
32 if (isdigit(c)) {
33 digits += c;
34 }
35 }
36
37 // Determine how many digits are in the country code.
38 int countryCodeLength = digits.size() - 10;
39 // Create the suffix for the standard 10-digit mask.
40 string suffix = "***-***-" + digits.substr(digits.size() - 4);
41
42 // If there's no country code, return just the suffix.
43 if (countryCodeLength == 0) {
44 return suffix;
45 } else {
46 // If there's a country code, prepend it with masks.
47 string countryCodeMask = "+" + string(countryCodeLength, '*') + "-";
48 return countryCodeMask + suffix; // Full masked string with country code.
49 }
50 }
51};
52
1function maskPII(s: string): string {
2 // Find the index of '@' to determine if it's an email.
3 const atIndex: number = s.indexOf('@');
4 if (atIndex !== -1) {
5 // If it's an email, mask the characters between the first character and the '@' symbol.
6 let maskedEmail: string = s[0].toLowerCase() + '*****';
7 for (let j: number = atIndex - 1; j < s.length; j++) {
8 maskedEmail += s.charAt(j).toLowerCase();
9 }
10 return maskedEmail;
11 }
12
13 // If it's not an email, build a string containing only the digits from the input.
14 let numericString: string = '';
15 for (const char of s) {
16 if (/\d/.test(char)) { // Check if the character is a digit.
17 numericString += char;
18 }
19 }
20
21 // Compute the count of digits to be replaced by '*' in the case there are country codes.
22 const countryCodeStarsCount: number = numericString.length - 10;
23
24 // The suffix for the phone number hiding the first 6 digits, revealing only the last 4.
25 const maskedPhoneSuffix: string = `***-***-${numericString.substring(numericString.length - 4)}`;
26
27 // Return the masked phone number with country code as '*' if there is a country code.
28 return countryCodeStarsCount === 0 ? maskedPhoneSuffix : `+${'*'.repeat(countryCodeStarsCount)}-${maskedPhoneSuffix}`;
29}
30
Time and Space Complexity
Time Complexity:
The time complexity of the maskPII
function is determined by several operations that the function performs:
-
Checking whether the first character is alphabetical:
O(1)
, since it's a constant-time check on a single character. -
Converting the string to lower case if it's an email:
O(n)
, wheren
is the length of the strings
. -
Finding the position of '@' in the string:
O(n)
in the worst case, since it requires scanning the string in case of an email address. -
Stripping non-digit characters and joining digits into a new string:
O(n)
, because each character of the original strings
is checked once. -
Constructing the masked string: The actual construction is
O(1)
because we are appending fixed strings and characters to a masked result.
Combining these, the time complexity is dictated by the string operations, which are linear in the length of the input string s
. Therefore, the time complexity is O(n)
.
Space Complexity:
The space complexity is determined by the additional space used by the function:
-
A new lower case email id string if
s[0].isalpha()
:O(n)
space for the new string. -
The list comprehension that filters digits:
O(n)
space for the new list of characters before it is joined into a string. -
The final masked string has a constant size related to the format of the phone number or email (
O(1)
space), not dependent on the input size.
Thus, the dominating factor is the creation of new strings, which takes O(n)
space. Hence, the space complexity is O(n)
.
Learn more about how to find time and space complexity quickly using problem constraints.
What data structure does Breadth-first search typically uses to store intermediate states?
Recommended Readings
LeetCode Patterns Your Personal Dijkstra's Algorithm to Landing Your Dream Job The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way we
Recursion Recursion is one of the most important concepts in computer science Simply speaking recursion is the process of a function calling itself Using a real life analogy imagine a scenario where you invite your friends to lunch https algomonster s3 us east 2 amazonaws com recursion jpg You first
Runtime Overview When learning about algorithms and data structures you'll frequently encounter the term time complexity This concept is fundamental in computer science and offers insights into how long an algorithm takes to complete given a certain input size What is Time Complexity Time complexity represents the amount of time
Want a Structured Path to Master System Design Too? Don’t Miss This!