Regex Cheat Sheet

Regular expressions are a pattern-matching syntax supported by nearly every major programming language, text editor, and command-line tool, although dialects differ in the details. The examples here use Python's re module unless marked as JavaScript. This reference sheet is designed to be scanned, not read; keep it open next to your editor.

Character Classes & Metacharacters

A character class matches exactly one character from a defined set. Metacharacters are symbols with special meaning inside a pattern.

Pattern         Matches                                   Notes
abc             Literal “abc” in sequence                 Case-sensitive by default
.               Any character except newline              Matches newline with re.DOTALL / the /s flag
\d              Any digit [0-9]                           Python 3 str patterns also match other Unicode digits unless re.ASCII is set
\D              Any non-digit                             Inverse of \d
\w              Word character [a-zA-Z0-9_]               Python 3 str patterns also match Unicode word characters unless re.ASCII is set
\W              Non-word character                        Inverse of \w
\s              Whitespace: space, tab, \n, \r, \f, \v
\S              Non-whitespace                            Inverse of \s
[abc]           One of: a, b, or c                        Most metacharacters lose their meaning inside […]
[^abc]          Any character except a, b, or c           Negated class
[a-z]           Any ASCII lowercase letter                Range syntax
[A-Z]           Any ASCII uppercase letter
[0-9]           Any ASCII digit (same as \d under re.ASCII)
[a-zA-Z0-9_]    Any ASCII word character (same as \w under re.ASCII)
\t              Tab character
\n              Newline character
\r              Carriage return
\\              Literal backslash                         Must be doubled again in non-raw string literals

import re

re.findall(r'\d+', 'Order 42, item 7')
## => ['42', '7']

re.findall(r'\w+', 'hello_world 123')
## => ['hello_world', '123']

re.findall(r'[aeiou]+', 'beautiful')
## => ['eau', 'i', 'u']

re.findall(r'[^a-zA-Z0-9]+', 'hello, world! 42')
## => [', ', '! ']
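The ASCII definitions of \d and \w are narrower than what Python 3 matches by default for str patterns. The following is a minimal sketch of that difference; the sample strings are illustrative assumptions, written with \u escapes so the non-ASCII characters are unambiguous.

```python
import re

# Hypothetical inputs: ASCII digits plus Arabic-Indic digits (U+0664, U+0662),
# and two words containing accented letters (U+00E9, U+00EF).
digits_sample = 'room 42, salle \u0664\u0662'
words_sample = 'caf\u00e9 na\u00efve'

# Default behaviour for str patterns: Unicode-aware \d and \w
unicode_digits = re.findall(r'\d+', digits_sample)
unicode_words = re.findall(r'\w+', words_sample)

# re.ASCII restricts \d and \w to [0-9] and [a-zA-Z0-9_]
ascii_digits = re.findall(r'\d+', digits_sample, re.ASCII)
ascii_words = re.findall(r'\w+', words_sample, re.ASCII)

print(len(unicode_digits), len(ascii_digits))  # 2 1
print(ascii_words)                             # ['caf', 'na', 've']
```

If input is expected to be plain ASCII (IDs, port numbers, hex strings), passing re.ASCII or spelling out [0-9] avoids accepting digits from other scripts.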

Anchors & Boundaries

Anchors assert a position in the string — they consume no characters.

Pattern   Position matched            Notes
^         Start of string             Start of each line with re.MULTILINE / the m flag
$         End of string               End of each line with re.MULTILINE / the m flag; in Python it also matches just before a trailing newline
\b        Word boundary               Between \w and \W, or at a string edge next to a word character
\B        Non-word boundary           Any position that is not a word boundary, e.g. inside a run of word characters
\A        Absolute start of string    Unaffected by re.MULTILINE
\Z        Absolute end of string      Unaffected by re.MULTILINE

text = "apple\nbanana\napricot"

re.findall(r'^a\w+', text, re.MULTILINE)
## => ['apple', 'apricot']

re.findall(r'\bcat\b', 'the cat in scatter')
## => ['cat']  — 'cat' inside 'scatter' is not matched

re.findall(r'\Aapple', text, re.MULTILINE)
## => ['apple']  — only the absolute start of the string

bool(re.fullmatch(r'\d{5}', '90210'))   ## => True
bool(re.fullmatch(r'\d{5}', 'ABC'))     ## => False
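One anchor detail that matters for validation in Python: $ also matches just before a single trailing newline, whereas \Z and re.fullmatch do not. A short sketch, assuming a hypothetical ZIP-code string that still carries its line ending:

```python
import re

zip_with_newline = '90210\n'   # hypothetical input, e.g. a line read from a file

# $ accepts the position just before the final newline
dollar = re.search(r'^\d{5}$', zip_with_newline)

# \Z only matches at the very end of the string
abs_end = re.search(r'^\d{5}\Z', zip_with_newline)

# fullmatch requires the pattern to consume the whole string
full = re.fullmatch(r'\d{5}', zip_with_newline)

print(bool(dollar), bool(abs_end), bool(full))  # True False False
```

For strict validation, re.fullmatch (or \Z) is the safer choice; stripping the input first is another option.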

Quantifiers: Greedy and Lazy

Quantifiers control how many times the preceding element repeats. Greedy quantifiers consume as much input as possible; lazy (non-greedy) quantifiers consume as little as possible.

Quantifier   Meaning                             Mode
*            0 or more                           Greedy
+            1 or more                           Greedy
?            0 or 1                              Greedy
{n}          Exactly n times                     Greedy
{n,}         n or more times                     Greedy
{n,m}        Between n and m times (inclusive)   Greedy
*?           0 or more                           Lazy
+?           1 or more                           Lazy
??           0 or 1                              Lazy
{n,m}?       Between n and m                     Lazy

text = '<b>bold</b> and <i>italic</i>'

re.findall(r'<.+>', text)
## => ['<b>bold</b> and <i>italic</i>']   — greedy, one giant match

re.findall(r'<.+?>', text)
## => ['<b>', '</b>', '<i>', '</i>']   — lazy, each tag separately

re.findall(r'\b\d{2,4}\b', 'a 5 b 12 c 1234 d 99999')
## => ['12', '1234']

re.findall(r'https?://\S+', 'Visit http://a.com or https://b.com')
## => ['http://a.com', 'https://b.com']
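A lazy quantifier is not the only way to stop a match early. A negated character class often expresses the same intent more directly, because it cannot run past the delimiter at all. The sketch below compares the two on the tag example above; the quoted-string input is an illustrative assumption.

```python
import re

text = '<b>bold</b> and <i>italic</i>'

# Lazy: expands one character at a time until '>' can match
lazy_tags = re.findall(r'<.+?>', text)

# Negated class: [^>] can never consume the closing '>'
negated_tags = re.findall(r'<[^>]+>', text)

# Same idea for double-quoted strings (hypothetical input)
quoted = re.findall(r'"([^"]*)"', 'say "hi" and "bye"')

print(lazy_tags == negated_tags)  # True
print(quoted)                     # ['hi', 'bye']
```

On this input both forms return the same list; they can differ on malformed input such as a stray '<' inside a tag.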

Groups, Capturing & Backreferences

Groups let you apply quantifiers to multi-character sequences, capture submatches, and refer back to earlier matches in the same pattern or in a replacement string.

Pattern          Description
(abc)            Capturing group: saves the matched text
(?:abc)          Non-capturing group: groups without saving
(?P<name>abc)    Named capturing group, Python/PCRE syntax
(?<name>abc)     Named capturing group, JavaScript ES2018/PCRE2
a|b              Alternation: match “a” or “b”
\1               Backreference to group 1 (by number)
\g<1>            Backreference to group 1 in re.sub replacement
(?P=name)        Named backreference, Python
\k<name>         Named backreference, JavaScript/PCRE2

pattern = r'(?P<protocol>https?)://(?P<domain>[^/]+)(?P<path>/[^\s]*)?'
m = re.match(pattern, 'https://devnook.dev/guides/')
m.group('protocol')  ## 'https'
m.group('domain')    ## 'devnook.dev'
m.group('path')      ## '/guides/'

re.findall(r'\b(\w+)\s+\1\b', 'the the quick brown fox fox')
## => ['the', 'fox']   — repeated words

re.findall(r'colo(?:u|)r', 'colour and color')
## => ['colour', 'color']   — alternation in non-capturing group

re.sub(r'(\w+)\s+(\w+)', r'\2 \1', 'John Smith')
## => 'Smith John'   — swap groups in replacement
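Capturing groups also change what re.findall returns, and named groups have their own reference syntax in patterns and replacements. A brief sketch of both behaviours, using made-up input strings:

```python
import re

# Two capturing groups: findall returns a list of tuples
pairs = re.findall(r'(\w+)=(\d+)', 'a=1 b=2')

# One capturing group (the other is non-capturing): list of strings
values = re.findall(r'(?:\w+)=(\d+)', 'a=1 b=2')

# \g<name> refers to a named group inside the replacement string
swapped = re.sub(r'(?P<first>\w+)\s+(?P<last>\w+)',
                 r'\g<last>, \g<first>', 'John Smith')

# (?P=name) refers back to a named group inside the pattern itself
doubled = re.search(r'\b(?P<word>\w+)\s+(?P=word)\b', 'it is is fine')

print(pairs)                  # [('a', '1'), ('b', '2')]
print(values)                 # ['1', '2']
print(swapped)                # Smith, John
print(doubled.group('word'))  # is
```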

Lookaheads & Lookbehinds

Lookarounds are zero-width assertions — they check what surrounds the current position without consuming any characters.

Pattern     Type                  What it asserts
(?=abc)     Positive lookahead    Current position is followed by “abc”
(?!abc)     Negative lookahead    Current position is NOT followed by “abc”
(?<=abc)    Positive lookbehind   Current position is preceded by “abc”
(?<!abc)    Negative lookbehind   Current position is NOT preceded by “abc”

re.findall(r'\d+(?=px)', '12px 5em 100px 3rem')
## => ['12', '100']   — digits followed by 'px', 'px' not captured

re.findall(r'new(?!line)', 'newline and new feature')
## => ['new']   — 'new' not followed by 'line'

re.findall(r'(?<=name=)\w+', 'id=42 name=alice role=admin')
## => ['alice']

re.findall(r'(?<!no )error', 'no error here; another error exists')
## => ['error']   — only the second occurrence

re.findall(r'(?<=\$)\d+(?=\s*USD)', '$100 USD and $50 EUR')
## => ['100']   — combined lookahead + lookbehind
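Because lookarounds consume nothing, they can also be used to insert text at a position or to stack several independent conditions at the same position. The helpers below are hypothetical names chosen for illustration, and the password rule is only an example policy, not a recommendation.

```python
import re

def add_thousands(digits: str) -> str:
    """Insert commas at positions preceded by a digit and followed by
    a multiple of three digits running to the end of the string."""
    return re.sub(r'(?<=\d)(?=(?:\d{3})+$)', ',', digits)

def looks_strong(password: str) -> bool:
    """Example policy: at least 8 characters, one uppercase letter, one digit.
    Each lookahead is checked from the start without consuming input."""
    return re.fullmatch(r'(?=.*[A-Z])(?=.*\d).{8,}', password) is not None

print(add_thousands('1234567'))   # 1,234,567
print(add_thousands('123'))       # 123
print(looks_strong('Passw0rdX'))  # True
print(looks_strong('password1'))  # False
```

Note that Python's re module only accepts fixed-width patterns inside a lookbehind.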

Regex Flags

Flags (also called modifiers) alter how the engine interprets the entire pattern. Combine multiple flags with | in Python, or use inline (?imsx) syntax to embed them inside the pattern itself.

Flag               Python constant          Python inline   JavaScript   What it changes
Case-insensitive   re.IGNORECASE / re.I     (?i)            i            Letters match any case
Multiline          re.MULTILINE / re.M      (?m)            m            ^/$ match line starts and ends
Dot-all            re.DOTALL / re.S         (?s)            s            . matches \n too
Verbose            re.VERBOSE / re.X        (?x)            (none)       Whitespace and # comments ignored
Global             (none)                   (none)          g            Find all matches, not just the first
Unicode            re.UNICODE / re.U        (?u)            u            Python: Unicode matching, already the default for str patterns; JavaScript: code-point mode and \p{…} property escapes
ASCII              re.ASCII / re.A          (?a)            (none)       Force \w, \d, \s to ASCII-only
Sticky             (none)                   (none)          y            Match only at lastIndex position

re.findall(r'(?i)hello', 'Hello HELLO hello')
## => ['Hello', 'HELLO', 'hello']   — inline case-insensitive flag

text = "Error: disk full\nerror: timeout\nWARN: retrying"
re.findall(r'^error:.+$', text, re.IGNORECASE | re.MULTILINE)
## => ['Error: disk full', 'error: timeout']

date_re = re.compile(r"""
    (?P<year>\d{4})                 # 4-digit year
    -
    (?P<month>0[1-9]|1[0-2])        # month 01-12
    -
    (?P<day>0[1-9]|[12]\d|3[01])    # day 01-31
""", re.VERBOSE)

date_re.match('2026-06-09').groupdict()
## => {'year': '2026', 'month': '06', 'day': '09'}

// JavaScript: flags follow the closing slash of the regex literal
'Hello HELLO hello'.match(/hello/gi);
// => ['Hello', 'HELLO', 'hello']

const { groups: { year, month, day } } =
  '2026-06-09'.match(/(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/);
// year='2026', month='06', day='09'

const sticky = /\d+/y;
sticky.lastIndex = 7;
sticky.exec('Order: 42');
// => ['42']  — matched exactly at index 7
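Back in Python, an inline flag can also be scoped to part of a pattern with the (?i:...) form, which is available in Python 3.6 and later. A small sketch with an invented input string:

```python
import re

text = 'HELLO World and hello world'

# Scoped flag: only "hello" is case-insensitive, " World" must match exactly
scoped = re.findall(r'(?i:hello) World', text)

# Global inline flag: the whole pattern is case-insensitive
whole = re.findall(r'(?i)hello World', text)

print(scoped)  # ['HELLO World']
print(whole)   # ['HELLO World', 'hello world']
```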

Common Regex Patterns

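Practical starting points for frequent validation and extraction tasks. Several are deliberately simplified (the IPv4 pattern, for example, does not check that each octet is at most 255), so verify them against your own data in a tool such as the Java Regex Tester before shipping.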

Use case                  Pattern
Email address             ^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$
URL (HTTP/HTTPS)          https?://[^\s/$.?#].[^\s]*
IPv4 address              \b(?:\d{1,3}\.){3}\d{1,3}\b
IPv6 (simplified)         (?:[0-9a-fA-F]{1,4}:){7}[0-9a-fA-F]{1,4}
Date YYYY-MM-DD           \d{4}-(?:0[1-9]|1[0-2])-(?:0[1-9]|[12]\d|3[01])
US phone number           \(?\d{3}\)?[-.\s]\d{3}[-.\s]\d{4}
Hex colour                #(?:[0-9a-fA-F]{3}){1,2}\b
Slug (URL-safe)           ^[a-z0-9]+(?:-[a-z0-9]+)*$
Positive integer          ^[1-9]\d*$
HTML tag (basic)          <([a-z][a-z0-9]*)\b[^>]*>.*?</\1>
UUID v4                   [0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}
Semantic version          \bv?\d+\.\d+\.\d+\b
JWT token                 ^[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+$
Whitespace-only string    ^\s*$
C-style block comment     /\*[\s\S]*?\*/

import re

email_re = re.compile(r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$')
email_re.match('user@example.com')  ## => Match object
email_re.match('not-an-email')      ## => None

css = "color: #fff; background: #1a2b3c; border: 1px solid #aabbcc;"
re.findall(r'#(?:[0-9a-fA-F]{3}){1,2}\b', css)
## => ['#fff', '#1a2b3c', '#aabbcc']

changelog = "Released v1.2.3; fixed bug from v1.2.1; target is v2.0.0"
re.findall(r'\bv?\d+\.\d+\.\d+\b', changelog)
## => ['v1.2.3', 'v1.2.1', 'v2.0.0']
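Some of these patterns check shape only. The IPv4 pattern, for instance, accepts 999.999.999.999 because \d{1,3} knows nothing about numeric ranges. One way to close that gap is to pair the regex with a numeric check, sketched below; is_ipv4 and IPV4_SHAPE are hypothetical names, not part of any library.

```python
import re

# Shape only: four dot-separated groups of 1-3 ASCII digits
IPV4_SHAPE = re.compile(r'(?:\d{1,3}\.){3}\d{1,3}', re.ASCII)

def is_ipv4(candidate: str) -> bool:
    """Check the shape with the regex, then the 0-255 range per octet."""
    if IPV4_SHAPE.fullmatch(candidate) is None:
        return False
    return all(int(octet) <= 255 for octet in candidate.split('.'))

print(is_ipv4('192.168.1.1'))  # True
print(is_ipv4('999.1.1.1'))    # False
print(is_ipv4('1.2.3'))        # False
```

Where strict parsing matters, Python's standard ipaddress module is usually a better fit than a hand-written pattern.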

Regex in Python: re Module Reference

The Python re module is the standard library’s full regex API. Always use raw strings (r'…') for patterns to avoid double-escaping backslashes.

Function / Method         What it returns                       Notes
re.search(pat, s)         First match anywhere                  Returns None if no match
re.match(pat, s)          Match at start of string only         Does not scan the whole string
re.fullmatch(pat, s)      Match spanning entire string          Strictest option for validation
re.findall(pat, s)        List of all non-overlapping matches   Returns list of strings or tuples
re.finditer(pat, s)       Iterator of Match objects             Use when you need .start() / .end()
re.sub(pat, repl, s)      String with substitutions             repl can be a string or callable
re.subn(pat, repl, s)     (new_string, count) tuple             Count = number of replacements made
re.split(pat, s)          List of substrings                    Capturing groups appear in the result
re.compile(pat, flags)    Compiled Pattern object               Reuse for performance-critical loops
m.group(n)                Captured group n (0 = full match)     None if group did not participate
m.groups()                All captured groups as tuple
m.groupdict()             Named groups as {name: value} dict
m.start() / m.end()       Start / end position in string        Integer index
m.span()                  (start, end) tuple                    Equivalent to (m.start(), m.end())

import re

text = "2026-06-09: Released version 2.1.4"

for m in re.finditer(r'\d+', text):
    print(f"'{m.group()}' at {m.span()}")
## '2026' at (0, 4)
## '06' at (5, 7)  ... etc.

def bump(m):
    return str(int(m.group()) + 1)

re.sub(r'\b\d+\b', bump, 'a=1 b=2 c=3')
## => 'a=2 b=3 c=4'

re.split(r'[,;\s]+', 'one, two;three  four')
## => ['one', 'two', 'three', 'four']

slug_re = re.compile(r'^[a-z0-9]+(?:-[a-z0-9]+)*$')
[s for s in ['hello-world', 'Bad Slug', 'ok-123'] if slug_re.match(s)]
## => ['hello-world', 'ok-123']
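A few notes from the table are easier to see in action: capturing groups in re.split, the count returned by re.subn, and None for a group that did not participate. A short sketch with invented inputs:

```python
import re

# A capturing group in the split pattern keeps the delimiters in the result
kept = re.split(r'([,;])', 'a,b;c')
dropped = re.split(r'[,;]', 'a,b;c')

# subn returns the new string and the number of replacements
masked, count = re.subn(r'\d', '#', 'a1b22')

# The second group is optional, so it can be None after a successful match
m = re.search(r'(\d+)(?:-(\d+))?', 'page 7')

print(kept)         # ['a', ',', 'b', ';', 'c']
print(dropped)      # ['a', 'b', 'c']
print(masked, count)  # a#b## 3
print(m.groups())   # ('7', None)
print(m.span())     # (5, 6)
```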

For more Python string operations that complement regex, see Python String Methods Cheat Sheet: split, join, replace & More.

Regex in JavaScript

JavaScript regex uses literal syntax /pattern/flags or new RegExp('pattern', 'flags'). The MDN Regular Expressions guide covers every detail of the spec.

Method                     Called on   What it does
str.match(re)              String      First match (or all with /g), returns array or null
str.matchAll(re)           String      Iterator of all match results; requires the /g flag
str.search(re)             String      Index of first match, or -1
str.replace(re, sub)       String      Replaces first match (or all with /g)
str.replaceAll(re, sub)    String      Replaces all matches; requires /g or a string
str.split(re)              String      Splits on each match, returns array
re.test(str)               RegExp      true if the pattern matches anywhere
re.exec(str)               RegExp      Next match object (stateful with /g or /y)

// Named groups (ES2018+) with destructuring
const dateRe = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/;
const { groups: { year, month, day } } = '2026-06-09'.match(dateRe);
// year='2026', month='06', day='09'

// replace() with a transform function; the /g flag applies it to every match
const result = 'a=1, b=2, c=3'.replace(/(\w+)=(\d+)/g, (_, k, v) => `${k}=${+v * 10}`);
// => 'a=10, b=20, c=30'

// matchAll: iterate all matches and extract groups
const str = 'cat bat hat mat';
const matches = [...str.matchAll(/(?<word>[cbhm]at)/g)];
matches.map(m => m.groups.word);
// => ['cat', 'bat', 'hat', 'mat']

// test() for fast boolean validation
const emailRe = /^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/;
emailRe.test('user@example.com');  // true
emailRe.test('not-an-email');      // false

For a deeper look at JavaScript’s native regex capabilities, see What is Regex Pattern Checking in JavaScript?. If you use regex in shell scripts or with CLI tools like grep, sed, and awk, the Linux Commands Cheat Sheet has the flags and syntax for POSIX and extended regex modes.

Escaping Special Characters

These characters carry special meaning in regex syntax and must be escaped with a backslash when you want to match them literally.

Character   Escaped form   Normal meaning in regex
.           \.             Match any character
*           \*             0-or-more quantifier
+           \+             1-or-more quantifier
?           \?             0-or-1 quantifier / lazy modifier
(           \(             Open capturing group
)           \)             Close capturing group
[           \[             Open character class
]           \]             Close character class
{           \{             Open repetition count
}           \}             Close repetition count
^           \^             Start anchor / negation inside […]
$           \$             End anchor
|           \|             Alternation operator
\           \\             Backslash itself
/           \/             Pattern delimiter in JavaScript literals

import re

re.escape('3.14 * x^2')
## => '3\\.14\\ \\*\\ x\\^2'

pattern = re.compile(re.escape('3.14 * x^2'))
pattern.search("result: 3.14 * x^2 + 1")
## => Match object — safe to use with user-supplied text

def is_valid_regex(s):
    try:
        re.compile(s)
        return True
    except re.error:
        return False

is_valid_regex(r'\d+')       ## => True
is_valid_regex(r'[unclosed') ## => False
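Two related points can be sketched briefly: a raw string and a doubled backslash spell the same pattern, and re.escape makes it possible to build an alternation out of literal search terms. The terms list here is a hypothetical stand-in for user-supplied input.

```python
import re

# r'\d' and '\\d' are the same two-character string: backslash, d
same = (r'\d' == '\\d')

# Hypothetical user-supplied terms containing regex metacharacters
terms = ['c++', 'a.b', 'x|y']

# Escape each term, then join with | so every term is matched literally
literal_any = re.compile('|'.join(re.escape(t) for t in terms))
found = literal_any.findall('c++ and a.b and axb and x|y')

print(same)   # True
print(found)  # ['c++', 'a.b', 'x|y']
```

Without re.escape, 'a.b' would also match 'axb' and 'c++' would fail to compile.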