
Practical Regex Guide: Essential Patterns for Search, Replace, and Validation
From regex basics to practical patterns for work. Covers email, phone, URL, postal code validation, search/replace techniques, and common metacharacters with examples.
What Is Regex and Why Is It Necessary?
Regular expressions (RegEx) are a language for describing string patterns. They enable "searching, extracting, or replacing strings matching specific patterns" in just a few lines of code.
For example, checking whether a string has the shape of an email address takes many hand-written character checks in ordinary code, whereas a regex expresses the same check as a single pattern.
Where Regex Excels
- Validation: Check correct format in input forms (email, phone, postal code)
- Search/Extract: Extract specific error messages from log files
- Replace: Bulk replacement in many text files (e.g., hide all phone numbers)
- Data Cleansing: Batch delete unnecessary spaces and line breaks
Basic Regex Syntax
Metacharacters (Characters with Special Meaning)
| Char | Meaning | Example |
|---|---|---|
| . | Any single character | a.c → "abc", "a9c", "a c" |
| ^ | Line start | ^Hello → line starts with "Hello" |
| $ | Line end | world$ → line ends with "world" |
| * | 0+ occurrences of preceding char | ab*c → "ac", "abc", "abbc" |
| + | 1+ occurrences of preceding char | ab+c → "abc", "abbc" (not "ac") |
| ? | 0 or 1 occurrence of preceding char | ab?c → "ac", "abc" |
| \| | OR (either) | cat\|dog → "cat" or "dog" |
| () | Grouping | (ab)+ → "ab", "abab", "ababab" |
| [] | Character class (any one char) | [abc] → "a", "b", "c" |
| [^] | Negated character class (except these) | [^abc] → anything but "a", "b", "c" |
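The table rows can be checked directly. The following is a minimal sketch in Python (the helper name matches is made up for this example), using re.fullmatch so that the whole string has to fit the pattern:

```python
import re

def matches(pattern: str, text: str) -> bool:
    """Return True when the whole text matches the pattern."""
    return re.fullmatch(pattern, text) is not None

print(matches(r"a.c", "a9c"))        # True: . stands for any one character
print(matches(r"ab*c", "ac"))        # True: * allows zero "b"
print(matches(r"ab+c", "ac"))        # False: + needs at least one "b"
print(matches(r"ab?c", "abbc"))      # False: ? allows at most one "b"
print(matches(r"cat|dog", "dog"))    # True: either alternative
print(matches(r"(ab)+", "ababab"))   # True: the group repeats as a unit
print(matches(r"[^abc]", "d"))       # True: any one char except a, b, c
```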
Special Characters Requiring Escape
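Escape these characters with \ to match them literally:
\ ^ $ . * + ? ( ) [ ] { } |
The slash / needs escaping only where it delimits the pattern, as in JavaScript regex literals (/.../) or sed.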
Example: To search $100 → \$100
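As a small illustration of the $100 example, here is a Python sketch (the sample text is invented). It compares the unescaped pattern, the hand-escaped pattern, and re.escape, which escapes literal text automatically:

```python
import re

price_text = "Total: $100"

# Unescaped, $ is an end-of-string anchor, so "$100" can never match.
unescaped = re.search(r"$100", price_text)

# Escaped by hand, \$ matches a literal dollar sign.
escaped = re.search(r"\$100", price_text)

# re.escape builds the escaped pattern from arbitrary literal text.
auto = re.search(re.escape("$100"), price_text)

print(unescaped)         # None
print(escaped.group())   # $100
print(auto.group())      # $100
```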
Common Character Classes & Shorthands
| Pattern | Meaning | Equivalent |
|---|---|---|
| \d | Digit | [0-9] |
| \D | Non-digit | [^0-9] |
| \w | Word character (alphanumeric+_) | [a-zA-Z0-9_] |
| \W | Non-word character | [^a-zA-Z0-9_] |
| \s | Whitespace (space/tab/newline) | [ \t\n\r\f\v] |
| \S | Non-whitespace | [^ \t\n\r\f\v] |
Example: "3-digit number" → \d{3}
Example: "Alphanumeric 4-8 chars" → \w{4,8}
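A quick Python sketch of the two examples follows. One engine-specific detail matters here: Python's \d and \w also accept non-ASCII digits and letters by default, so the re.ASCII flag is used to get the "Equivalent" column exactly. The constant names are invented for this example.

```python
import re

# re.ASCII restricts \d and \w to the ASCII ranges shown in the table.
THREE_DIGITS = re.compile(r"\d{3}", re.ASCII)
WORD_4_TO_8 = re.compile(r"\w{4,8}", re.ASCII)

print(bool(THREE_DIGITS.fullmatch("123")))      # True
print(bool(THREE_DIGITS.fullmatch("12a")))      # False
print(bool(WORD_4_TO_8.fullmatch("user_01")))   # True: 7 word characters
print(bool(WORD_4_TO_8.fullmatch("abc")))       # False: only 3 characters

# Without re.ASCII, full-width digits also count as \d in Python.
print(bool(re.fullmatch(r"\d{3}", "１２３")))     # True
print(bool(THREE_DIGITS.fullmatch("１２３")))     # False
```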
Practical Regex Pattern Collection
Email Address Validation
^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$
Explanation:
- ^...$: Match entire string
- [a-zA-Z0-9._%+-]+: Local part (before @)
- @: At sign
- [a-zA-Z0-9.-]+: Domain name
- \.[a-zA-Z]{2,}: Top-level domain (.com, .jp, etc.)
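Here is one way the pattern might be wrapped for form validation, sketched in Python. The function name looks_like_email is invented for this example, and fullmatch takes the place of the ^ and $ anchors. It checks the shape only, not whether the address exists.

```python
import re

EMAIL_RE = re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")

def looks_like_email(text: str) -> bool:
    """Shape check only; it does not prove the mailbox exists."""
    return EMAIL_RE.fullmatch(text) is not None

print(looks_like_email("user@example.com"))              # True
print(looks_like_email("user.name+tag@example.co.jp"))   # True
print(looks_like_email("user@example"))                  # False: no top-level domain
print(looks_like_email("user@@example.com"))             # False: two @ signs
```

The pattern is deliberately simple, so it also accepts doubtful inputs such as "a..b@example.com". For sign-up forms, a confirmation email is still the reliable check.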
Japanese Phone Number (With/Without Hyphens)
^0\d{1,4}-?\d{1,4}-?\d{4}$
Explanation:
- ^0: Starts with 0
- \d{1,4}: Area code (1-4 digits)
- -?: Hyphen 0 or 1 time
- \d{1,4}-?\d{4}: Local exchange and subscriber number
Example: Matches 03-1234-5678, 090-1234-5678, 0312345678
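The three sample numbers can be verified with a short Python sketch. The function name looks_like_jp_phone is invented for this example, and re.ASCII keeps \d to the digits 0-9.

```python
import re

PHONE_RE = re.compile(r"0\d{1,4}-?\d{1,4}-?\d{4}", re.ASCII)

def looks_like_jp_phone(text: str) -> bool:
    """Shape check only: leading 0, optional hyphens, 4-digit ending."""
    return PHONE_RE.fullmatch(text) is not None

for sample in ["03-1234-5678", "090-1234-5678", "0312345678"]:
    print(sample, looks_like_jp_phone(sample))   # all True

print(looks_like_jp_phone("1234-5678"))      # False: does not start with 0
print(looks_like_jp_phone("03-1234-567"))    # False: last block has 3 digits
```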
URL Extraction
https?:\/\/[\w\/:%#\$&\?\(\)~\.=\+\-]+
Explanation:
- https?: http or https
- :\/\/: :// (escaped slashes)
- [\w\/:%#\$&\?\(\)~\.=\+\-]+: URL-allowed characters
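A possible extraction routine, sketched in Python with an invented sample sentence. Because the character class includes the period, a sentence-ending period is captured along with the URL. Stripping trailing punctuation afterwards is an extra step assumed here, not part of the pattern.

```python
import re

URL_RE = re.compile(r"https?:\/\/[\w\/:%#\$&\?\(\)~\.=\+\-]+")

text = "See https://example.com/guide?id=1 and http://example.org/a_b."

raw_urls = URL_RE.findall(text)
# The class allows ".", so the final period of the sentence is included.
print(raw_urls)
# ['https://example.com/guide?id=1', 'http://example.org/a_b.']

# Assumed cleanup: drop trailing sentence punctuation.
urls = [url.rstrip(".,") for url in raw_urls]
print(urls)
# ['https://example.com/guide?id=1', 'http://example.org/a_b']
```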
Japanese Postal Code (〒123-4567)
^\d{3}-\d{4}$
Explanation:
- ^\d{3}: Starts with 3 digits
- -: Hyphen
- \d{4}$: Ends with 4 digits
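A short Python sketch of the postal code check follows (the function name is_postal_code is invented). It also shows a Python-specific detail: $ matches just before a trailing newline, so fullmatch is the stricter choice for validation.

```python
import re

POSTAL_RE = re.compile(r"\d{3}-\d{4}", re.ASCII)

def is_postal_code(text: str) -> bool:
    """Format check only: three digits, a hyphen, four digits."""
    return POSTAL_RE.fullmatch(text) is not None

print(is_postal_code("123-4567"))    # True
print(is_postal_code("1234567"))     # False: the hyphen is required

# In Python, $ also matches just before a trailing newline,
# so the anchored form accepts "123-4567\n" while fullmatch does not.
print(bool(re.match(r"^\d{3}-\d{4}$", "123-4567\n")))   # True
print(is_postal_code("123-4567\n"))                     # False
```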
Date (YYYY-MM-DD Format)
^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$
Explanation:
- \d{4}: Year (4 digits)
- (0[1-9]|1[0-2]): Month (01-12)
- (0[1-9]|[12]\d|3[01]): Day (01-31)
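The pattern checks the format but not the calendar, so it accepts 2023-02-31. One way to close that gap, sketched in Python, is to let the regex split the string and let datetime.date reject impossible days. The function name parse_iso_date is invented, and a capture group is added around the year.

```python
import re
from datetime import date
from typing import Optional

DATE_RE = re.compile(r"(\d{4})-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])", re.ASCII)

def parse_iso_date(text: str) -> Optional[date]:
    """Return a date when the text is well-formed and exists on the calendar."""
    m = DATE_RE.fullmatch(text)
    if m is None:
        return None                      # wrong format
    year, month, day = map(int, m.groups())
    try:
        return date(year, month, day)    # rejects e.g. February 31
    except ValueError:
        return None

print(parse_iso_date("2024-02-29"))      # 2024-02-29
print(parse_iso_date("2023-02-31"))      # None: format is fine, date is not
print(parse_iso_date("2024-13-01"))      # None: month 13 fails the regex
```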
Practical Search & Replace Techniques
Case 1: Replace All Phone Numbers with "***-****-****"
Search: 0\d{1,4}-?\d{1,4}-?\d{4}
Replace: ***-****-****
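In Python the same search and replace might look like the sketch below (the sample sentence is invented). Because the pattern is not anchored, it can also match a run of digits inside a longer number, so review the results on real data.

```python
import re

PHONE_IN_TEXT = re.compile(r"0\d{1,4}-?\d{1,4}-?\d{4}")

text = "Call 03-1234-5678 or 090-1234-5678."
masked = PHONE_IN_TEXT.sub("***-****-****", text)

print(masked)   # Call ***-****-**** or ***-****-****.
```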
Case 2: Remove All HTML Tags
Search: <[^>]+>
Replace: (empty string)
Explanation: <[^>]+> means "starts with <, followed by 1+ non-> chars, ends with >".
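A Python sketch of the tag removal, with invented sample markup. The second case shows where this approach breaks: a > inside an attribute value ends the match early. For untrusted or complex HTML, a real parser is the safer tool.

```python
import re

TAG_RE = re.compile(r"<[^>]+>")

simple = "<p>Hello <b>world</b></p>"
print(TAG_RE.sub("", simple))   # Hello world

# A ">" inside an attribute value cuts the tag match short.
tricky = '<a title="a>b">x</a>'
print(TAG_RE.sub("", tricky))   # b">x
```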
Case 3: Remove All Leading Spaces
Search: ^\s+
Replace: (empty string)
Explanation: ^ is line start, \s+ is 1+ whitespace.
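In Python, ^ matches only at the start of the whole string unless re.MULTILINE is set, and \s includes newlines, so ^\s+ in multiline mode also swallows blank lines. The sketch below (sample text invented) compares the three behaviours. The [ \t] variant is a suggested alternative, not part of the original recipe.

```python
import re

text = "  a\n\n  b"

# No flag: only the very first line is trimmed.
print(repr(re.sub(r"^\s+", "", text)))                         # 'a\n\n  b'

# MULTILINE: every line is trimmed, but \s+ also eats the blank line.
print(repr(re.sub(r"^\s+", "", text, flags=re.MULTILINE)))     # 'a\nb'

# Spaces and tabs only: every line is trimmed, blank lines survive.
print(repr(re.sub(r"^[ \t]+", "", text, flags=re.MULTILINE)))  # 'a\n\nb'
```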
Case 4: Reorder Using Capture Groups
Original: 田中太郎(Tanaka Taro)
Search: (.+)\((.+)\)
Replace: $2 - $1
Result: Tanaka Taro - 田中太郎
Explanation: Groups captured by () can be referenced as $1, $2. Some tools, such as Python and sed, write these references as \1, \2 instead.
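A Python sketch of the reordering follows. Two details are assumptions about the intended pattern: the parentheses in the original text are literal, so they are escaped as \( and \), and Python writes the references as \1 and \2.

```python
import re

original = "田中太郎(Tanaka Taro)"

# Group 1: text before "(", group 2: text between the literal parentheses.
reordered = re.sub(r"(.+)\((.+)\)", r"\2 - \1", original)

print(reordered)   # Tanaka Taro - 田中太郎
```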
Common Mistakes and Debugging
Mistake 1: Forgetting to Escape
. and * are interpreted as metacharacters unless escaped.
Wrong: file*.txt → "fil", then 0+ "e", then any one character, then "txt"
Correct: file\*\.txt → the literal string "file*.txt"
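To see the difference concretely, this Python sketch tests a pattern with both . and * left unescaped against one with both escaped. The string "filXtxt" is an invented sample.

```python
import re

name = "file*.txt"

# Unescaped: "fil", zero or more "e", any one character, then "txt".
print(bool(re.fullmatch(r"file*.txt", "filXtxt")))   # True: unintended match
print(bool(re.fullmatch(r"file*.txt", name)))        # False: misses the real name

# Both special characters escaped: only the literal name matches.
print(bool(re.fullmatch(r"file\*\.txt", name)))      # True
print(bool(re.fullmatch(r"file\*\.txt", "filXtxt"))) # False

# re.escape produces an escaped pattern from the literal text.
print(bool(re.fullmatch(re.escape(name), name)))     # True
```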
Mistake 2: Greedy Matching
.* is greedy: it matches as much as possible, so it can run past the intended end.
Example: <div>Hello</div><div>World</div> with <div>.*</div>
→ Matches entire <div>Hello</div><div>World</div> (unintended)
Solution: Use non-greedy .*?
→ <div>.*?</div> → Matches <div>Hello</div> and <div>World</div> separately
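The two behaviours can be compared side by side with re.findall, as in this minimal Python sketch of the example above:

```python
import re

html = "<div>Hello</div><div>World</div>"

greedy = re.findall(r"<div>.*</div>", html)
lazy = re.findall(r"<div>.*?</div>", html)

print(greedy)   # ['<div>Hello</div><div>World</div>']
print(lazy)     # ['<div>Hello</div>', '<div>World</div>']
```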
Mistake 3: Unable to Match Across Newlines
By default, . doesn't match newlines.
Solution: Enable s flag (dotall) or use [\s\S].
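In Python the s flag is spelled re.DOTALL (or re.S). The sketch below, with an invented two-line sample, shows the default failure and both workarounds:

```python
import re

text = "<p>line one\nline two</p>"

# Default: . stops at the newline, so nothing matches.
print(re.search(r"<p>(.*)</p>", text))   # None

# DOTALL lets . cross the newline.
with_flag = re.search(r"<p>(.*)</p>", text, re.DOTALL)
print(repr(with_flag.group(1)))          # 'line one\nline two'

# [\s\S] matches any character without needing a flag.
with_class = re.search(r"<p>([\s\S]*)</p>", text)
print(repr(with_class.group(1)))         # 'line one\nline two'
```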
FAQ: Regex Questions
Q1. Which languages/tools support regex?
A. Nearly all programming languages (JavaScript, Python, Java, PHP, Ruby, etc.), text editors (VS Code, Sublime Text, Vim), and command-line tools (grep, sed, awk). However, detailed syntax and features (lookahead, lookbehind, etc.) may vary by language.
Q2. Regex is too complex to read...
A. Regex easily becomes "write-only code" that nobody can read later. Maintain readability by:
- Adding comments (several engines, including PCRE and Python, support (?#comment); JavaScript does not)
- Splitting into variables and combining
- Testing incrementally with regex tester tools
For example, regex101.com and regexr.com are convenient online testers.
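The first two tips might look like this in Python, using the email pattern from earlier in this guide. The constant names are invented. re.VERBOSE is Python's flag for ignoring whitespace and allowing # comments inside a pattern; other engines offer a similar x flag.

```python
import re

# Tip: split into named parts, then combine.
LOCAL = r"[a-zA-Z0-9._%+-]+"
DOMAIN = r"[a-zA-Z0-9.-]+"
TLD = r"[a-zA-Z]{2,}"
EMAIL_FROM_PARTS = re.compile(rf"{LOCAL}@{DOMAIN}\.{TLD}")

# Tip: add comments, here with verbose mode.
EMAIL_VERBOSE = re.compile(r"""
    [a-zA-Z0-9._%+-]+   # local part (before @)
    @                   # at sign
    [a-zA-Z0-9.-]+      # domain name
    \.[a-zA-Z]{2,}      # top-level domain
""", re.VERBOSE)

# Inline (?#...) comments also work in Python.
POSTAL_COMMENTED = re.compile(r"\d{3}(?#first block)-\d{4}(?#second block)")

print(bool(EMAIL_FROM_PARTS.fullmatch("user@example.com")))   # True
print(bool(EMAIL_VERBOSE.fullmatch("user@example.com")))      # True
print(bool(POSTAL_COMMENTED.fullmatch("123-4567")))           # True
```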
Q3. How to specify "full-width katakana only"?
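A. Use ^[ァ-ヶー]+$. In engines with Unicode property support, ^\p{Katakana}+$ also works; JavaScript needs the u flag and the longer form ^\p{Script=Katakana}+$. Depending on the engine, the long-vowel mark ー may not count as Katakana (Unicode assigns it to the Common script), so test words like コーヒー and add ー to the class if needed, as in ^[\p{Katakana}ー]+$.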
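Python's standard re module does not support \p{...}, so the sketch below uses the explicit range (the function name is_fullwidth_katakana is invented). The range covers ァ to ヶ plus the long-vowel mark. Half-width katakana and the middle dot ・ fall outside it.

```python
import re

KATAKANA_RE = re.compile(r"[ァ-ヶー]+")

def is_fullwidth_katakana(text: str) -> bool:
    """True when every character is full-width katakana or ー."""
    return KATAKANA_RE.fullmatch(text) is not None

print(is_fullwidth_katakana("カタカナ"))        # True
print(is_fullwidth_katakana("コーヒー"))        # True: ー is in the class
print(is_fullwidth_katakana("ひらがな"))        # False: hiragana
print(is_fullwidth_katakana("ｶﾀｶﾅ"))           # False: half-width katakana
print(is_fullwidth_katakana("ジョン・スミス"))  # False: ・ is not in the class
```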
Summary: Practice Makes Perfect with Regex
Regex initially seems cryptic, but memorizing basic patterns and actually using them enables quick mastery. Follow these learning steps:
- Memorize basic metacharacters: . ^ $ * + ? | ( ) [ ]
- Imitate common patterns: email, phone, URL
- Test with regex tester: Verify actual matches
- Use in real work: Practice with search, replace, validation
- Split complex patterns: Maintain readability
Mastering regex dramatically streamlines text processing. Start using it today, little by little.


