Regular Expressions
Regular expressions (regex) are powerful patterns for matching and manipulating strings. Python's `re` module provides functions like `search()`, `match()`, `findall()`, `sub()`, and `compile()`.
Basic Patterns
- `.` – any character except newline
- `\d` – digit
- `\w` – word character (letter, digit, underscore)
- `\s` – whitespace
- `^` – start of string
- `$` – end of string
- `*` – zero or more
- `+` – one or more
- `?` – zero or one
- `{n}` – exactly n
- `[abc]` – character set
import re
text = "My email is alice@example.com and bob@test.org"
# \w+ matches the name and domain parts; \. matches a literal dot
pattern = r"\w+@\w+\.\w+"
emails = re.findall(pattern, text)
print(emails) # ['alice@example.com', 'bob@test.org']
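The remaining tokens from the list above (character sets, `{n}`, `?`, and the `^`/`$` anchors) can be sketched the same way. The sample strings and variable names here are made up purely for illustration.

```python
import re

# Character set: matches any single listed character.
vowels = re.findall(r"[aeiou]", "regex")

# {n}: exactly four digits per match; a fifth digit is left over.
four_digits = re.findall(r"\d{4}", "2023 and 19999")

# ?: the "u" is optional, so both spellings match.
spellings = re.findall(r"colou?r", "color colour")

# ^ and $ together require the whole string to consist of digits.
all_digits = bool(re.search(r"^\d+$", "12345"))
has_gap = bool(re.search(r"^\d+$", "123 45"))

print(vowels, four_digits, spellings, all_digits, has_gap)
```

Anchoring with `^` and `$` is a common way to validate an entire input rather than find a fragment inside it.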
Search and Match
- `re.search()` – find first occurrence anywhere
- `re.match()` – match from start of string
- `re.findall()` – list of all matches
- `re.finditer()` – iterator of match objects
- `re.sub()` – replace matches
match = re.search(r"(\d+)", "Order number: 12345")
if match:
    print(match.group(0)) # 12345 (the whole match)
    print(match.group(1)) # 12345 (the first capture group)
cleaned = re.sub(r"\d", "X", "a1b2c3")
print(cleaned) # aXbXcX
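The difference between `re.match()` and `re.search()` is easy to miss, so here is a small sketch contrasting them, along with `re.finditer()`. The sample text and variable names are hypothetical.

```python
import re

text = "Order 12345 shipped, order 678 pending"

# match() succeeds only if the pattern matches at the very start.
at_start = re.match(r"\d+", text)     # None, because the text starts with "Order"

# search() scans forward and returns the first match anywhere.
anywhere = re.search(r"\d+", text)

# finditer() yields match objects, so positions are available as well.
spans = [(m.group(), m.start()) for m in re.finditer(r"\d+", text)]

print(at_start)
print(anywhere.group())
print(spans)
```

Because `re.match()` and `re.search()` both return `None` when nothing matches, check the result before calling `.group()` on it.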
Compiling Patterns
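If a pattern is used repeatedly, compile it once with `re.compile()` and call methods on the resulting pattern object. The `re` module already caches recently used patterns, so the speed gain is often modest; the main benefits are skipping the cache lookup in tight loops and keeping the pattern in one named place.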
pattern = re.compile(r"\d+")
matches = pattern.findall("There are 42 and 7 apples.")
print(matches) # ['42', '7']
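As a rough sketch of reuse, one compiled pattern object can serve several operations across many inputs. The `lines` list and the names below are illustrative assumptions, not part of the original example.

```python
import re

# Compile once, then reuse the same pattern object.
number = re.compile(r"\d+")

lines = ["There are 42 and 7 apples.", "No numbers here.", "Room 101"]

# The pattern object exposes the same operations as the module-level functions.
found = [number.findall(line) for line in lines]
masked = [number.sub("#", line) for line in lines]

print(found)
print(masked)
```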
Two Minute Drill
- Regex patterns are powerful for text processing.
- Use `re.findall()` for all matches, `re.sub()` for replacement.
- Compile patterns you reuse; the gain is mostly cleaner code, plus a small speedup in tight loops.
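- Escape special characters with a backslash when you want them matched literally, and write patterns as raw strings (`r"..."`) so Python passes the backslashes through unchanged.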
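To illustrate the last point, this sketch shows what happens when a special character is left unescaped. It also uses `re.escape()`, a standard-library helper not covered above, to escape a literal string automatically. The sample text is made up.

```python
import re

text = "Total: 3.50 (approx) vs 3x50"

# Unescaped "." matches any character, so "3x50" is picked up too.
unescaped = re.findall(r"3.50", text)

# "\." matches only a literal dot.
escaped = re.findall(r"3\.50", text)

# re.escape() backslash-escapes metacharacters such as "(" and ")".
literal = re.escape("(approx)")
in_text = re.search(literal, text) is not None

print(unescaped)
print(escaped)
print(literal, in_text)
```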
