Difficulty: Easy
Tags: Database

Description

Table: Users

+-----------------+---------+
| Column Name     | Type    |
+-----------------+---------+
| user_id         | int     |
| email           | varchar |
+-----------------+---------+
(user_id) is the unique key for this table.
Each row contains a user's unique ID and email address.

Write a solution to find all the valid email addresses. A valid email address meets the following criteria:

  • It contains exactly one @ symbol.
  • It ends with .com.
  • The part before the @ symbol contains only alphanumeric characters and underscores.
  • The part after the @ symbol and before .com is a domain name that contains only letters.
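
The four criteria can also be checked one at a time without a regular expression. The following is a minimal sketch, not part of the original solutions: the helper name `is_valid_email` is hypothetical, and it assumes that both the part before the @ and the domain name must be non-empty, which the criteria imply but do not state outright.

```python
import string

# Character sets allowed by the criteria.
LOCAL_CHARS = set(string.ascii_letters + string.digits + "_")
LETTERS = set(string.ascii_letters)


def is_valid_email(email: str) -> bool:
    """Check the four criteria one at a time (hypothetical helper)."""
    # Criterion 1: exactly one @ symbol.
    if email.count("@") != 1:
        return False
    # Criterion 2: the address ends with .com.
    if not email.endswith(".com"):
        return False
    local, domain = email.split("@")
    name = domain[: -len(".com")]  # domain part before the .com suffix
    # Criterion 3: local part of letters, digits, and underscores
    # (assumed non-empty).
    if not local or not set(local) <= LOCAL_CHARS:
        return False
    # Criterion 4: domain name made of letters only (assumed non-empty).
    return bool(name) and set(name) <= LETTERS
```

Splitting the checks this way makes it easy to see which criterion rejects a given address, for example `is_valid_email("eve@invalid")` fails at the second check.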

Return the result table ordered by user_id in ascending order.

Example:

Input:

Users table:

+---------+---------------------+
| user_id | email               |
+---------+---------------------+
| 1       | alice@example.com   |
| 2       | bob_at_example.com  |
| 3       | charlie@example.net |
| 4       | david@domain.com    |
| 5       | eve@invalid         |
+---------+---------------------+

Output:

+---------+-------------------+
| user_id | email             |
+---------+-------------------+
| 1       | alice@example.com |
| 4       | david@domain.com  |
+---------+-------------------+

Explanation:

  • alice@example.com is valid because it contains exactly one @, the part before the @ (alice) is alphanumeric, and the domain name (example) contains only letters and is followed by .com.
  • bob_at_example.com is invalid because it contains an underscore instead of an @.
  • charlie@example.net is invalid because the domain does not end with .com.
  • david@domain.com is valid because it meets all criteria.
  • eve@invalid is invalid because the domain does not end with .com.

Result table is ordered by user_id in ascending order.

Solutions

Solution 1: Regular Expression

We can use a regular expression with REGEXP to match valid email addresses.

For each email, the time complexity is $O(n)$ and the space complexity is $O(1)$, where $n$ is the length of the email string.

MySQL

# Write your MySQL query statement below
# Local part: letters, digits, underscores; domain name: letters only; suffix: .com
SELECT user_id, email
FROM Users
WHERE email REGEXP '^[A-Za-z0-9_]+@[A-Za-z]+\\.com$'
ORDER BY user_id;

Pandas

import pandas as pd


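def find_valid_emails(users: pd.DataFrame) -> pd.DataFrame:
    # Local part: letters, digits, underscores; domain name: letters only.
    email_pattern = r"^[A-Za-z0-9_]+@[A-Za-z]+\.com$"
    # str.match anchors at the start of the string; the trailing $ anchors the end.
    valid_emails = users[users["email"].str.match(email_pattern)]
    # Return the matching rows ordered by user_id.
    return valid_emails.sort_values(by="user_id")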
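
A pattern of this kind can be sanity-checked without a database or pandas. The sketch below uses only the standard `re` module and writes the pattern in verbose form so each piece can be annotated. The pattern is an illustrative variant that follows the description's letters-only rule for the domain name, and the sample addresses are assumptions made up for the check.

```python
import re

# Verbose form of a pattern for the four criteria (an illustrative variant,
# not necessarily the exact pattern used in the solutions above).
EMAIL_RE = re.compile(
    r"""
    [A-Za-z0-9_]+   # part before the @: letters, digits, underscores
    @               # the single @ separator
    [A-Za-z]+       # domain name: letters only
    \.com           # literal ".com" suffix
    """,
    re.VERBOSE,
)

# Hypothetical sample addresses covering each criterion.
samples = [
    "alice@example.com",
    "bob_at_example.com",
    "a@b@c.com",
    "eve@invalid",
    "x_1@abc.com",
    "x@abc1.com",
]

# fullmatch requires the whole string to match, so no ^ or $ is needed.
valid = [s for s in samples if EMAIL_RE.fullmatch(s)]
print(valid)
```

Using `fullmatch` instead of explicit anchors avoids the corner case where `$` also matches just before a trailing newline.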