Python String Methods: split, join, replace, strip & More

Every built-in str method organized by task — syntax, runnable examples, and the pitfalls that trip up even experienced developers. Python strings are immutable: every method returns a new string and leaves the original unchanged. Calling s.strip() without assigning the result is one of the most common beginner mistakes for exactly this reason. For the canonical complete list, see the Python string methods documentation.

Python String Methods Quick Reference

The table below maps every commonly used str method to its task area. Detailed sections with working code examples follow.

| Category | Key Methods |
| --- | --- |
| Splitting & joining | split(), rsplit(), splitlines(), join(), partition(), rpartition() |
| Searching & finding | find(), rfind(), index(), rindex(), count(), startswith(), endswith() |
| Replacing & stripping | replace(), strip(), lstrip(), rstrip(), removeprefix(), removesuffix() |
| Case conversion | lower(), upper(), title(), capitalize(), swapcase(), casefold() |
| Formatting & alignment | format(), center(), ljust(), rjust(), zfill() |
| Type-checking | isalpha(), isdigit(), isdecimal(), isalnum(), isspace(), isidentifier(), isascii() |
| Encoding & translation | encode(), bytes.decode() (called on bytes, not str), translate(), maketrans(), expandtabs() |

All str methods return a new value — Python strings never mutate in place. The variable must be reassigned to keep the result: s = s.strip().

Splitting and Joining Strings

Splitting turns a string into a list; joining reassembles a list back into a string. Together they handle the vast majority of text-parsing tasks.

| Task | Method & Syntax |
| --- | --- |
| Split on whitespace | text.split() |
| Split on a delimiter | text.split(',') |
| Limit number of splits | text.split(':', maxsplit=2) |
| Split from the right | text.rsplit('/', 1) |
| Split on line endings | text.splitlines() |
| Join list with separator | sep.join(items) |
| Partition into exactly 3 parts | text.partition(':') |
| Partition from the right | text.rpartition('/') |

csv      = "alice,bob,charlie,dana"
parts    = csv.split(',')                # ['alice', 'bob', 'charlie', 'dana']
rejoined = ' | '.join(parts)            # 'alice | bob | charlie | dana'

path                   = "/var/log/app/server.log"
dir_part, _, filename  = path.rpartition('/')     # filename = 'server.log'

log_line              = "2026-06-11T09:00:00Z INFO Server started"
timestamp, _, message = log_line.partition(' ')   # timestamp = '2026-06-11T09:00:00Z'

multiline = "line one\nline two\r\nline three\r"
lines     = multiline.splitlines()    # ['line one', 'line two', 'line three']

header        = "Content-Type: text/html; charset=utf-8"  # partition() splits at the first match only, like split(': ', 1)
key, _, value = header.partition(': ')   # key='Content-Type', value='text/html; charset=utf-8'

split() with no argument collapses any run of consecutive whitespace and ignores leading/trailing whitespace, which makes it far more robust than split(' ') for normalizing messy input. Prefer splitlines() over split('\n') when processing text that may contain Windows (\r\n) or old Mac (\r) line endings; note that it also splits on a few less common boundaries such as form feed (\f).
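To make the difference concrete, here is a small sketch that runs both forms on the same input. The sample strings are made up for illustration.

```python
messy = "  alpha   beta\tgamma \n"

# No argument: runs of whitespace collapse and both ends are ignored.
print(messy.split())       # ['alpha', 'beta', 'gamma']

# Explicit single space: every space is a boundary, so empty strings appear
# and the tab and newline stay embedded in the pieces.
print(messy.split(' '))    # ['', '', 'alpha', '', '', 'beta\tgamma', '\n']

mixed = "first\r\nsecond\nthird\r"

# split('\n') leaves stray carriage returns behind.
print(mixed.split('\n'))   # ['first\r', 'second', 'third\r']

# splitlines() recognizes \n, \r\n, and \r alike.
print(mixed.splitlines())  # ['first', 'second', 'third']
```

If you do need split(' ') semantics (for example, to preserve empty fields), strip or filter the pieces afterwards.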

Building a string from many parts: collect the pieces in a list and call sep.join(items) once. This is typically much faster than concatenating in a loop, which may copy the growing string on every iteration. See What is List Comprehension in Python? A Complete Guide with Examples for list-building patterns that pair naturally with join().
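One detail worth seeing in code: join() only accepts strings, so non-string items have to be converted first. The sketch below uses hypothetical sample data (readings, rows) to show the failure and two common fixes.

```python
readings = [21.5, 22.0, 19.7]          # hypothetical sample data

try:
    ", ".join(readings)                # floats are not strings
except TypeError as exc:
    print(f"join failed: {exc}")

# Convert (and format) each item inside a generator expression.
line = ", ".join(f"{r:.1f}" for r in readings)
print(line)                            # 21.5, 22.0, 19.7

# The same pattern builds multi-line output in one call.
rows = [("alice", 3), ("bob", 12)]     # hypothetical sample data
report = "\n".join(f"{name}: {count}" for name, count in rows)
print(report)
```

Formatting inside the generator keeps the conversion and the join in one expression, so no intermediate concatenation is needed.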

Searching and Finding Substrings

Use these methods when you need to locate positions, count occurrences, or confirm that a string starts or ends with a known value.

| Task | Method & Syntax |
| --- | --- |
| Check if substring exists | 'pat' in text |
| Find first occurrence (index) | text.find('pat') |
| Find last occurrence (index) | text.rfind('pat') |
| Raise ValueError if missing | text.index('pat') |
| Count non-overlapping matches | text.count('pat') |
| Check prefix | text.startswith('prefix') |
| Check suffix | text.endswith('.json') |
| Check multiple options at once | text.startswith(('http://', 'https://')) |
| Restrict search to a slice | text.find('x', start, end) |

url      = "https://api.example.com/v2/users?page=1"
has_v2   = '/v2/' in url                              # True
qs_start = url.find('?')                              # 32; returns -1 if absent
is_https = url.startswith(('https://', 'wss://'))     # True

log        = "WARN: retry 1. WARN: retry 2. WARN: retry 3. ERROR: abort."
warn_count = log.count('WARN')     # 3
err_count  = log.count('ERROR')    # 1

filename = "data_export_2026.csv"
is_csv   = filename.endswith('.csv')                          # True
is_data  = filename.endswith(('.csv', '.tsv', '.parquet'))    # True

path    = "/home/user/projects/report.tar.gz"   # rfind scans right-to-left
dot_pos = path.rfind('.')    # index of the LAST dot
ext     = path[dot_pos:]     # '.gz'

find() vs index(): find() returns -1 when the substring is absent; index() raises ValueError. Use find() when a missing value is a normal, expected condition; use index() when absence signals a bug and you want the exception to surface immediately. See How to Handle Errors in Python? A Complete Guide for patterns that pair cleanly with both approaches.
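The following sketch contrasts the two styles. The helper names query_string and split_key_value are hypothetical, chosen only for illustration.

```python
def query_string(url: str) -> str:
    """Return the part after '?', or '' when the URL has no query string."""
    pos = url.find('?')            # -1 is an expected, non-exceptional outcome
    return "" if pos == -1 else url[pos + 1:]


def split_key_value(line: str) -> tuple[str, str]:
    """Parse 'key=value'; a missing '=' is treated as a bug in the input."""
    eq = line.index('=')           # raises ValueError when '=' is absent
    return line[:eq], line[eq + 1:]


print(query_string("https://example.com/a?page=1"))   # page=1
print(query_string("https://example.com/a"))          # (empty string)
print(split_key_value("mode=fast"))                   # ('mode', 'fast')

# Pitfall: -1 is truthy and 0 is falsy, so never test find() with a bare `if`.
print(bool("abc".find("z")))   # True, although 'z' is absent
print(bool("abc".find("a")))   # False, although 'a' is at index 0
```

When only presence matters, `'pat' in text` is clearer than either method.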

For pattern-based searches that go beyond literal substrings — matching email addresses, IP addresses, or arbitrary formats — the Regex Cheat Sheet covers re.search(), re.findall(), and re.sub() with syntax you can drop straight into a project.

Replacing and Stripping Text

replace() handles literal substitution throughout a string. The strip family removes unwanted characters from string boundaries.

| Task | Method & Syntax |
| --- | --- |
| Replace all occurrences | text.replace('old', 'new') |
| Replace first N occurrences | text.replace('old', 'new', 2) |
| Strip whitespace from both ends | text.strip() |
| Strip from left only | text.lstrip() |
| Strip from right only | text.rstrip() |
| Strip a character set | text.strip('.,!? ') |
| Remove exact prefix (Python 3.9+) | text.removeprefix('https://') |
| Remove exact suffix (Python 3.9+) | text.removesuffix('.txt') |

msg = "ERROR: disk full. ERROR: retry failed. ERROR: gave up."
msg.replace('ERROR', 'WARN')       # replaces all 3 occurrences
msg.replace('ERROR', 'WARN', 1)    # replaces only the first

raw = "   \t hello world \n  "
raw.strip()    # 'hello world'
raw.lstrip()   # 'hello world \n  '
raw.rstrip()   # '   \t hello world'

messy = "...!!!Greetings!!!..."
messy.strip('.!')    # 'Greetings'  (strips each char in '.!' independently)

filename = "report_draft.md"
filename.removesuffix('.md')    # 'report_draft'
filename.removesuffix('.txt')   # 'report_draft.md' — no match, unchanged

url = "https://devnook.dev/blog/"
url.removeprefix('https://')    # 'devnook.dev/blog/'
url.removeprefix('http://')     # 'https://devnook.dev/blog/' — no match

strip() accepts a character set, not a substring. text.strip('abc') removes leading/trailing a, b, or c characters in any order — not the literal string "abc". For exact prefix or suffix removal, removeprefix() and removesuffix() (Python 3.9+) are always preferred: they are unambiguous, never surprise you, and return the original string unchanged when no match is found.
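Two short cases show how the character-set behavior can over-strip. The remove_suffix helper is a hypothetical fallback for interpreters older than Python 3.9, not part of the standard library.

```python
name = "data.txt.txt"
print(name.rstrip('.txt'))          # 'data'      (keeps removing '.', 't', 'x')
print(name.removesuffix('.txt'))    # 'data.txt'  (removes one exact suffix)

host = "https://httpd.example"
print(host.lstrip('https://'))      # 'd.example' (also eats the 'http' in 'httpd')
print(host.removeprefix('https://'))  # 'httpd.example'


def remove_suffix(text: str, suffix: str) -> str:
    """Hypothetical fallback for Python < 3.9."""
    # The `suffix and` guard matters: text[:-0] would return an empty string.
    if suffix and text.endswith(suffix):
        return text[:-len(suffix)]
    return text


print(remove_suffix("report_draft.md", ".md"))   # 'report_draft'
```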

Case Conversion and Normalization

| Task | Method & Syntax |
| --- | --- |
| Lowercase | text.lower() |
| Uppercase | text.upper() |
| Title case | text.title() |
| Capitalize only first character | text.capitalize() |
| Swap each character’s case | text.swapcase() |
| Unicode-safe folded lowercase | text.casefold() |

name = "john DOE"
name.lower()       # 'john doe'
name.upper()       # 'JOHN DOE'
name.title()       # 'John Doe'
name.capitalize()  # 'John doe'  — only the very first char is uppercased
name.swapcase()    # 'JOHN doe'

german = "Straße"
german.lower()    # 'straße'   — ß unchanged by lower()
german.casefold() # 'strasse'  — ß correctly expands to 'ss' for comparison

tags   = ["Python", "python", "PYTHON", "django", "Django"]
unique = list({t.casefold(): t for t in tags}.values())
len(unique)   # 2 — one Python variant, one Django variant

"it's a dog's life".title()   # "It'S A Dog'S Life" — apostrophe edge case

Use casefold(), not lower(), when comparing strings case-insensitively, especially with user-supplied input that may include non-ASCII characters. The two methods differ only for a small set of Unicode characters (German ß, Greek final sigma ς, ligatures such as ﬁ), but relying on lower() produces silent, hard-to-reproduce comparison bugs in multilingual applications.
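A minimal sketch of a case-insensitive comparison helper follows. The name same_text is hypothetical, and the NFC normalization step is an added assumption: it makes precomposed and decomposed accents compare equal, which casefold() alone does not do. It is a simplification of full Unicode caseless matching, not a complete implementation.

```python
import unicodedata


def same_text(a: str, b: str) -> bool:
    """Case-insensitive equality: normalize to NFC, then casefold both sides."""
    def fold(s: str) -> str:
        return unicodedata.normalize("NFC", s).casefold()
    return fold(a) == fold(b)


print("Straße".lower() == "STRASSE".lower())   # False: lower() keeps the ß
print(same_text("Straße", "STRASSE"))          # True

# 'e' followed by a combining acute accent vs. the precomposed 'É'
print(same_text("cafe\u0301", "CAFÉ"))         # True
```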

String Formatting and Alignment

Python offers three formatting styles: f-strings (3.6+) for most work, str.format() for reusable templates, and % formatting in legacy codebases. For the complete format mini-language spec, see Python String Formatting: f-strings, format(), and %.

| Task | Method & Syntax |
| --- | --- |
| f-string interpolation | f"Hello, {name}!" |
| Named placeholders | "{name} is {age}".format(name=n, age=a) |
| Positional placeholders | "{} {}".format('hello', 'world') |
| Float precision | f"{3.14159:.2f}" |
| Thousands separator | f"{1_000_000:,}" |
| Hex / binary output | f"{255:#010x}" / f"{10:#b}" |
| Center in fixed width | text.center(20, '-') |
| Left-justify (pad right) | text.ljust(20, '.') |
| Right-justify (pad left) | text.rjust(20) |
| Zero-pad a number | str(42).zfill(5) |

name, score, pct = "Alice", 1842, 0.9357
f"{name:<10} {score:>8,} {pct:.1%}"   # 'Alice         1,842 93.6%'
f"{255:#010x}"                          # '0x000000ff'
f"{score:+}"                            # '+1842'

fmt = "{:<12} {:>8} {:^6}"             # fixed-width table formatting
print(fmt.format("Name", "Score", "Grade"))  # 'Name            Score Grade '
print(fmt.format("Bob",  1750,    "B"))      # 'Bob              1750   B   '

order_id = f"order_{str(42).zfill(6)}"      # 'order_000042'

template = "User {username!r} logged in from {ip}"
template.format(username="alice", ip="192.168.1.1")   # "User 'alice' logged in from 192.168.1.1"
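As a rough side-by-side of the three styles named at the start of this section, the sketch below produces the same text with each one, then reuses a str.format() template across rows. The variable names and sample values are illustrative only.

```python
name, total = "Alice", 1234.5

f_string  = f"{name} owes {total:.2f}"
formatted = "{} owes {:.2f}".format(name, total)
percent   = "%s owes %.2f" % (name, total)

print(f_string)                          # Alice owes 1234.50
print(f_string == formatted == percent)  # True

# A template defined once and filled many times is where str.format() fits well.
ROW = "{label:<8}{value:>10.2f}"
for label, value in [("cpu", 3.14159), ("memory", 72.5)]:
    print(ROW.format(label=label, value=value))
```

f-strings evaluate immediately where they are written, so a template that is stored and filled later is usually expressed with str.format() instead.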

Type-Checking and Validation

These predicate methods return True or False and are most useful for validating raw input before conversion or downstream processing.

| Task | Method & Syntax |
| --- | --- |
| All alphabetic | text.isalpha() |
| Strict decimal digits only | text.isdecimal() |
| Broad digit characters | text.isdigit() |
| All alphanumeric | text.isalnum() |
| All lowercase | text.islower() |
| All uppercase | text.isupper() |
| All whitespace | text.isspace() |
| Title-cased | text.istitle() |
| Valid Python identifier | text.isidentifier() |
| ASCII characters only | text.isascii() |

raw_age = "25"
age = int(raw_age) if raw_age.isdecimal() else None

"alice".isalpha()       # True
"alice123".isalpha()    # False: digit present
"alice123".isalnum()    # True
"alice_123".isalnum()   # False: underscore is not alphanumeric

"123".isdecimal()   # True
"²³".isdecimal()    # False: Unicode superscripts fail isdecimal
"²³".isdigit()      # True: superscripts pass isdigit; int("²³") still raises ValueError

"my_column".isidentifier()    # True
"2bad_name".isidentifier()    # False: starts with digit
"for".isidentifier()          # True: keyword check is separate (keyword.iskeyword())

"hello".islower()   # True
"Hello".islower()   # False
"  ".isspace()      # True
"".isspace()        # False: the empty string returns False for every is* method except isascii()

Use isdecimal() — not isdigit() — when validating numeric input before calling int(). Superscript and subscript Unicode characters pass isdigit() but cause int() to raise ValueError. isalnum() excludes underscores and hyphens, so it is not a substitute for identifier validation — use isidentifier() for that.
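Here is one way the validation might look in practice. Both helper names are hypothetical. Note that isdecimal() rejects signs, so the first helper only handles non-negative integers; the second shows the try/except alternative for signed input.

```python
def parse_non_negative_int(raw: str):
    """Return an int for plain digit strings, or None for anything else."""
    cleaned = raw.strip()
    return int(cleaned) if cleaned.isdecimal() else None


def parse_int(raw: str):
    """Let int() decide what is valid; handles signs such as '-5'."""
    try:
        return int(raw)
    except ValueError:
        return None


print(parse_non_negative_int(" 42 "))   # 42
print(parse_non_negative_int("²"))      # None: superscript fails isdecimal()
print(parse_non_negative_int("-5"))     # None: '-' is not a decimal digit
print(parse_int("-5"))                  # -5
print(parse_int("²"))                   # None: int() raised ValueError
```

The pre-check is handy when you want to reject input without raising; the try/except form accepts everything int() accepts, which is a slightly wider set.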

Encoding and Byte Conversion

| Task | Method & Syntax |
| --- | --- |
| Encode string to bytes | text.encode('utf-8') |
| Decode bytes to string | b_obj.decode('utf-8') |
| Ignore unencodable characters | text.encode('ascii', errors='ignore') |
| Replace unencodable characters | text.encode('ascii', errors='replace') |
| XML-escape unencodable chars | text.encode('ascii', errors='xmlcharrefreplace') |
| Map or delete characters | text.translate(table) |
| Build a translation table | str.maketrans('abc', 'ABC') |
| Delete a set of characters | str.maketrans('', '', chars_to_delete) |
| Expand tab stops to spaces | text.expandtabs(4) |

text     = "Python 🐍 rocks"
as_bytes = text.encode('utf-8')          # b'Python \xf0\x9f\x90\x8d rocks'
back_str = as_bytes.decode('utf-8')      # 'Python 🐍 rocks'

text.encode('ascii', errors='ignore')   # b'Python  rocks'  — emoji stripped
text.encode('ascii', errors='replace')  # b'Python ? rocks' — emoji replaced

no_punct = str.maketrans('', '', '.,!?;:')
"Hello, World!".translate(no_punct)    # 'Hello World'

rot13 = str.maketrans(
    'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz',
    'NOPQRSTUVWXYZABCDEFGHIJKLMnopqrstuvwxyzabcdefghijklm'
)
"Hello".translate(rot13)   # 'Uryyb'

"col1\tcol2\tcol3".expandtabs(12)   # 'col1        col2        col3'

When reading files that may have mixed or unknown encodings, always specify the encoding parameter in open() and choose an errors strategy ('ignore', 'replace', or 'backslashreplace') to avoid hard crashes on unexpected characters. See How to File Handling in Python + Examples for encoding-aware file patterns. The Python string module also provides string.punctuation, string.ascii_letters, and string.Template for cases where the built-in str methods are not quite enough.
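The sketch below illustrates the errors strategies on a file whose bytes are not valid UTF-8. It writes a throwaway temporary file so the example is self-contained; the file contents are made up for the demonstration.

```python
import os
import tempfile

# Write bytes that are valid Latin-1 but invalid UTF-8.
with tempfile.NamedTemporaryFile(delete=False, suffix=".txt") as tmp:
    tmp.write("café\n".encode("latin-1"))    # b'caf\xe9\n'
    path = tmp.name

try:
    # Default (strict) handling raises on the first bad byte.
    try:
        with open(path, encoding="utf-8") as fh:
            fh.read()
    except UnicodeDecodeError as exc:
        print(f"strict decoding failed: {exc.reason}")

    # 'replace' substitutes U+FFFD for each undecodable byte.
    with open(path, encoding="utf-8", errors="replace") as fh:
        replaced = fh.read()

    # 'backslashreplace' keeps the raw byte value visible as an escape.
    with open(path, encoding="utf-8", errors="backslashreplace") as fh:
        escaped = fh.read()
finally:
    os.remove(path)

print(replaced == "caf\ufffd\n")   # True
print(escaped == "caf\\xe9\n")     # True
```

'replace' and 'ignore' lose information, so 'backslashreplace' is often the more useful choice when you need to debug where the bad bytes came from.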

Common Pitfalls and Performance Tips

These are the mistakes that appear most often in code reviews; the fixes are worth committing to muscle memory.

| Pitfall | Problem | Fix |
| --- | --- | --- |
| Discarding the return value | text.strip() alone changes nothing | Assign back: text = text.strip() |
| text.split('') | Raises ValueError | Use list(text) to iterate characters |
| strip() argument confusion | Strips individual chars, not a substring | Use removeprefix() / removesuffix() for exact removal |
| lower() for equality checks | Misses Unicode edge cases (German ß, etc.) | Use casefold() for case-insensitive comparisons |
| String concatenation in a loop | Can degrade to O(n²) copying | Collect into a list and call ''.join(items) once |
| isdigit() for int validation | Accepts Unicode superscripts like ² | Use isdecimal() instead |
| replace() replaces all | Unexpected mass substitution | Pass count arg: text.replace('x', 'y', 1) |
| index() on missing text | Raises ValueError at runtime | Use find() and check for -1 when absence is expected |

s = "  padded  "
s.strip()              # result discarded — s is still "  padded  "
s = s.strip()          # must reassign
print(s)               # 'padded'

parts  = [str(i) for i in range(10_000)]
fast   = "".join(parts)      # one pass over the parts, O(n)
slow   = ""
for p in parts:
    slow += p                # may copy the whole string each iteration, up to O(n²); avoid this

tag   = "<h1>Title</h1>"
clean = tag.removeprefix('<h1>').removesuffix('</h1>')  # 'Title' (exact, safe)
wrong = tag.strip('<h1>')    # 'Title</' (strips the CHARS '<', 'h', '1', '>' from both ends, not the literal tag)

words  = ["Café", "café", "CAFÉ"]
unique = {w.casefold() for w in words}  # 1 item — all map to 'café'

When you need batch string transformations over many items, How to Do Dictionary Comprehension in Python? shows patterns that compose cleanly with str method chains. The two fixes with the biggest payoff: use casefold() instead of lower() for case-insensitive comparisons (correctness), and replace any concatenation loop with a single join() call (performance).
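As a closing sketch, the helper below combines several of the fixes from this section: it reassigns the result of strip(), compares with casefold(), and builds the output with a single join(). The function name unique_tags and the sample input are hypothetical.

```python
def unique_tags(raw_tags):
    """Drop case-insensitive duplicates, keeping the first spelling seen."""
    seen = set()
    kept = []
    for tag in raw_tags:
        cleaned = tag.strip()          # keep the stripped result
        key = cleaned.casefold()       # comparison key, not the display value
        if cleaned and key not in seen:
            seen.add(key)
            kept.append(cleaned)
    return ", ".join(kept)             # one join instead of += in the loop


print(unique_tags([" Python", "python ", "Django", "DJANGO", "  "]))
# Python, Django
```

Keeping the casefolded key separate from the displayed value preserves the user's original capitalization in the output.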