Regular Expression
- Pattern that can represent a variety of strings
- Tool for verifying information
Why?
- Validation
- Searching (such as grep)
- Identify valid credit card numbers
- Represent complex strings with a single simpler* string
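The searching use case can be sketched as a tiny grep-like filter. This is only an illustration: the `grep` helper and the log lines are made up for this example and are not part of the real grep tool.

```python
import re

def grep(pattern, lines):
    """Return the lines that contain a match for pattern, like a tiny grep."""
    regex = re.compile(pattern)
    return [line for line in lines if regex.search(line)]

# Hypothetical log lines, invented for this example
log_lines = [
    "INFO server started",
    "ERROR disk full",
    "INFO request served",
    "ERROR timeout after 30s",
]

errors = grep(r"^ERROR", log_lines)   # lines that start with ERROR
with_numbers = grep(r"\d+", log_lines)  # lines that contain a number
print(errors)
print(with_numbers)
```

One short pattern stands in for every line that fits it, which is the "single simpler string" idea from the list above.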
Set Correspondence
- The set corresponding to a sequence is the set of all distinct elements of the sequence
- {a, b} is the set corresponding to the sequence abababababbababab
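In Python, the built-in `set` does this reduction directly. A minimal sketch (the variable names are chosen for this example):

```python
sequence = "abababababbababab"

# set() keeps each distinct element once and discards order and repeats
corresponding_set = set(sequence)

print(sorted(corresponding_set))
```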
Basics
- Given a set of characters
- The strings over that set would be the words that could be created with the characters, including non-valid words
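To make this concrete, the sketch below lists every word that can be built from a small set of characters, up to a length limit. The helper name `words_up_to` and the length limit are assumptions for illustration (the full collection is infinite), and counting the empty word is also an assumption of this sketch.

```python
from itertools import product

def words_up_to(alphabet, max_length):
    """Return every string over alphabet with length 0..max_length."""
    words = []
    for length in range(max_length + 1):
        # product(...) yields every ordered choice of `length` characters
        for letters in product(sorted(alphabet), repeat=length):
            words.append("".join(letters))
    return words

words = words_up_to({"a", "b"}, 2)
print(words)
```

Most of the results ("ab", "bb", ...) are not dictionary words, which is what "including non-valid words" refers to.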
Examples
Phone Number
(123) 456-7890
- Opening paren
- 3 digits
- Closing paren
- Space
- 3 digits
- Hyphen
- 4 digits
\(\d{3}\) \d{3}-\d{4}
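A minimal sketch of using this pattern with Python's `re` module. The helper names are made up for this example. `fullmatch` is used for validation so that extra characters before or after the number are rejected, and `findall` is used to pull numbers out of longer text.

```python
import re

PHONE = re.compile(r"\(\d{3}\) \d{3}-\d{4}")

def is_phone_number(text):
    """True only if the entire string is a phone number in this format."""
    return PHONE.fullmatch(text) is not None

def find_phone_numbers(text):
    """Return every phone number in this format found inside text."""
    return PHONE.findall(text)

print(is_phone_number("(123) 456-7890"))
print(is_phone_number("123-456-7890"))
print(find_phone_numbers("Call (123) 456-7890 or (555) 000-1111."))
```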
Email Address
foo@bar.com
\w+@\w+\.\w+$
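This pattern is deliberately simple. The sketch below tries it on a few made-up addresses to show what it accepts and what it leaves out: because `\w` does not match a dot, addresses with a dot in the name or more than one dot in the domain are rejected.

```python
import re

EMAIL = re.compile(r"\w+@\w+\.\w+$")

# Sample addresses invented for this example
samples = [
    "foo@bar.com",         # name @ domain . suffix
    "foo@bar",             # no dot after the domain
    "first.last@bar.com",  # dot in the name
    "foo@mail.bar.com",    # two dots in the domain
]

# re.match anchors the pattern at the start of each string
results = {s: bool(EMAIL.match(s)) for s in samples}
for address, ok in results.items():
    print(f"{address}: {ok}")
```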
Word Ending in "ing"
Crying
\w+ing\b
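A short sketch using the pattern `\w+ing\b` from the Python code later in these notes. The sentence is made up for this example. `\w+` requires at least one character before "ing", and `\b` requires the word to end right after it.

```python
import re

ING_WORD = re.compile(r"\w+ing\b")

sentence = "Crying and singing while running in the ring"

# findall returns every non-overlapping match, left to right
ing_words = ING_WORD.findall(sentence)
print(ing_words)
```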
Password
SuperStr0ngPassword!
- At least one letter → (?=.*[a-zA-Z])
- At least one digit → (?=.*\d)
- At least one special character → (?=.*[!@#$%^&*?])
- At least 8 characters → [a-zA-Z\d!@#$%^&*?]{8,}
^(?=.*[a-zA-Z])(?=.*\d)(?=.*[!@#$%^&*?])[a-zA-Z\d!@#$%^&*?]{8,}$
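The sketch below checks the combined pattern against a few made-up passwords, each breaking a different rule. Each `(?=...)` is a lookahead: it checks that something appears later in the string without consuming any characters, so all three checks start from the beginning. The final character class then also limits which characters are allowed at all.

```python
import re

PASSWORD = re.compile(
    r"^(?=.*[a-zA-Z])(?=.*\d)(?=.*[!@#$%^&*?])[a-zA-Z\d!@#$%^&*?]{8,}$"
)

# Candidate passwords invented for this example
candidates = [
    "SuperStr0ngPassword!",  # meets every rule
    "Sh0rt!",                # fewer than 8 characters
    "NoDigitsHere!",         # no digit
    "NoSpecial123",          # no special character
    "12345678!",             # no letter
    "Str0ng Pass!",          # space is not in the allowed set
]

results = {pw: bool(PASSWORD.match(pw)) for pw in candidates}
for pw, ok in results.items():
    print(f"{pw!r}: {ok}")
```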
Hex Color
#ffffff or #fff
- Hash sign (#)
- 6 or 3 hex digits (0-9, a-f, or A-F)
^#([a-fA-F\d]{6}|[a-fA-F\d]{3})$
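An end anchor matters for this pattern. The sketch below compares two variants, named `HEX_OPEN` and `HEX_CLOSED` for this example only: without `$`, a string that merely starts with a valid color (such as the four-digit #ffff) is accepted, while with `$` it is rejected.

```python
import re

HEX_OPEN = re.compile(r"^#([a-fA-F\d]{6}|[a-fA-F\d]{3})")     # no end anchor
HEX_CLOSED = re.compile(r"^#([a-fA-F\d]{6}|[a-fA-F\d]{3})$")  # with end anchor

colors = ["#ffffff", "#fff", "#ffff", "#ggg"]

open_results = {c: bool(HEX_OPEN.match(c)) for c in colors}
closed_results = {c: bool(HEX_CLOSED.match(c)) for c in colors}

for c in colors:
    print(f"{c}: open={open_results[c]} closed={closed_results[c]}")
```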
Python Code
import re

# Sample strings, one intended for each pattern below
test_cases = [
    "(123) 456-7890",
    "foo@bar.com",
    "Crying",
    "MyPasswordIsVeryStr0ng!",
    "#abc",
]

# Label -> regular expression
patterns = {
    "Phone Number": r"\(\d{3}\) \d{3}-\d{4}",
    "Email Address": r"\w+@\w+\.\w+$",
    "Word ending in 'ing'": r"\w+ing\b",
    "Password": r"^(?=.*[a-zA-Z])(?=.*\d)(?=.*[!@#$%^&*?])[a-zA-Z\d!@#$%^&*?]{8,}$",
    "Hex Code": r"^#([a-fA-F\d]{6}|[a-fA-F\d]{3})$",
}

# Try every pattern against every test case
for label, pattern in patterns.items():
    print(f"\nTesting {label}:")
    for case in test_cases:
        # re.match only looks for a match at the start of the string
        match = re.match(pattern, case)
        print(f"  {case}: {'✅' if match else '❌'}")