How to Do Dictionary Comprehension in Python: A Complete Guide

Python

When you need to construct or transform a Python dictionary from an iterable, writing a traditional for loop is the instinctive first approach — but dictionary comprehensions express the same intent in a single line that reads closer to mathematical set notation.

The Problem

Building a dictionary from a list of values using a for loop is verbose. Every transformation involves declaring an empty dictionary, iterating over a source, conditionally filtering, and manually assigning each key-value pair.

# Building a product catalog from a list of raw records — the naive approach
raw_products = [
    {"id": 101, "name": "Laptop",   "price": 999.99, "in_stock": True},
    {"id": 102, "name": "Monitor",  "price": 349.00, "in_stock": False},
    {"id": 103, "name": "Keyboard", "price":  79.50, "in_stock": True},
    {"id": 104, "name": "Webcam",   "price":  59.99, "in_stock": False},
]

# Construct a lookup map of id -> name for in-stock items only
in_stock_catalog = {}
for product in raw_products:
    if product["in_stock"]:
        in_stock_catalog[product["id"]] = product["name"]

print(in_stock_catalog)
# {101: 'Laptop', 103: 'Keyboard'}

This approach works but requires four lines and introduces the mutable in_stock_catalog = {} intermediate state into the local scope. The developer’s intent — “map id to name for in-stock products” — is split across multiple statements and is not immediately obvious at a glance. In functions that build several different maps, this pattern clutters the namespace and makes the function longer than it needs to be.

The Python Solution: Dictionary Comprehension

Dictionary comprehension condenses the same logic into a single declarative expression using the syntax {key_expr: value_expr for item in iterable if condition}.

# The identical logic in a single, readable expression
in_stock_catalog = {
    p["id"]: p["name"]
    for p in raw_products
    if p["in_stock"]
}

print(in_stock_catalog)
# {101: 'Laptop', 103: 'Keyboard'}

The three components of a dictionary comprehension map directly to the three components of the for loop above:

p["id"]: p["name"] — the key-value expression (replaces the assignment catalog[key] = value)
for p in raw_products — the loop (replaces for product in raw_products:)
if p["in_stock"] — the filter (replaces the if statement inside the loop)

The result is identical but the intent is captured in one expression that can be read left-to-right as a sentence: “map each product’s ID to its name, for every product in raw_products, if the product is in stock.”

How Dictionary Comprehension Works in Python

Under the hood, the Python compiler translates a dictionary comprehension into a set of specialized bytecode instructions that differ from those used by a manually written for loop.

A traditional for loop that builds a dictionary executes the following bytecode on each iteration:

LOAD_FAST — load the loop variable
LOAD_FAST — load the dictionary being built
STORE_SUBSCR — execute dict[key] = value by calling dict.__setitem__

A dictionary comprehension uses a different instruction:

Creates a fresh, isolated dictionary scope using BUILD_MAP
Uses MAP_ADD for each key-value assignment — a low-level C instruction that bypasses Python’s attribute lookup and directly calls the CPython dictionary’s C-level PyDict_SetItem function

MAP_ADD skips the overhead of resolving the dictionary name in the local namespace on every iteration, which is what STORE_SUBSCR does. For comprehensions that iterate thousands of items, this difference accumulates. Benchmarks with timeit consistently show dictionary comprehensions running 15–30% faster than equivalent for loops for medium-sized dictionaries.

Additionally, because a comprehension runs in its own isolated scope (visible with dis.dis()), the loop variable (p in the example above) does not bleed into the enclosing function’s namespace. A for loop always leaves the loop variable bound to its last value after the loop exits — a minor namespace pollution that comprehensions avoid.

Going Further — Real-World Patterns

Pattern 1: Inverting a dictionary

One of the most common real-world use cases is reversing a dictionary — swapping keys and values to create a reverse lookup table. This arises constantly when you have a mapping in one direction (e.g., username → user ID) and need to query it in the other direction (user ID → username).

# Original mapping: product name -> SKU code
name_to_sku = {
    "Laptop":   "PRD-001",
    "Monitor":  "PRD-002",
    "Keyboard": "PRD-003",
}

# Invert: SKU code -> product name (for scanning barcodes, etc.)
sku_to_name = {sku: name for name, sku in name_to_sku.items()}

print(sku_to_name)
# {'PRD-001': 'Laptop', 'PRD-002': 'Monitor', 'PRD-003': 'Keyboard'}

Important caveat: inversion only works correctly when all values in the original dictionary are unique. If two keys share a value, only one mapping survives in the inverted dictionary — the one processed last. If you suspect duplicates, check with a set: assert len(set(name_to_sku.values())) == len(name_to_sku).

Pattern 2: Transforming keys and values simultaneously

In data pipeline work, incoming data often needs cleaning before it can be used — column names need normalizing, prices need rounding, or string values need type coercion. Dictionary comprehensions apply these transformations in a single pass.

# Raw API response with inconsistent formatting
raw_config = {
    "  MAX_RETRIES  ": "5",
    "timeout_ms":      "3000",
    "DEBUG_MODE":      "true",
    "  base_url    ":  " https://api.devnook.dev / ",
}

# Normalize: strip and lowercase all keys, strip and coerce all values
normalized = {
    k.strip().lower(): v.strip()
    for k, v in raw_config.items()
}

print(normalized)
# {'max_retries': '5', 'timeout_ms': '3000', 'debug_mode': 'true', 'base_url': 'https://api.devnook.dev /'}

Pattern 3: Building from two parallel lists with zip

When a database query returns separate lists for column names and row values, zip pairs them together, and a dictionary comprehension converts the pairs into a structured record.

# Database column names returned separately from row values
columns = ["user_id", "username", "email", "is_admin"]
row     = [8841, "alice_dev", "alice@example.com", False]

# Pair columns with their values and build a named record
user_record = {col: val for col, val in zip(columns, row)}

print(user_record)
# {'user_id': 8841, 'username': 'alice_dev', 'email': 'alice@example.com', 'is_admin': False}

# This scales naturally for multiple rows
rows = [
    [8841, "alice_dev", "alice@example.com", False],
    [8842, "bob_eng",   "bob@example.com",   True],
]

records = [{col: val for col, val in zip(columns, row)} for row in rows]
# records[0] = {'user_id': 8841, ...}
# records[1] = {'user_id': 8842, ...}

Using zip with a dictionary comprehension is cleaner than dict(zip(columns, row)) when you need to transform the keys or values during construction, because the comprehension expression allows arbitrary transformation logic on each col and val.

What to Watch Out For

Readability deteriorates with nesting: Python allows nested loops inside comprehensions — {k: v for outer in data for k, v in outer.items() if condition}. While syntactically valid, comprehensions with more than one for clause or complex conditions are significantly harder to read than the equivalent for loop. The guideline is: if the comprehension requires more than one line to read comfortably, rewrite it as a loop. Comprehensions optimize for the common case of simple one-to-one mappings; complex nested logic belongs in explicit loops with intermediate variables.

Memory usage for large datasets: A dictionary comprehension evaluates the entire source iterable and builds the complete output dictionary before the first key is accessible. If you are building a dictionary from a million-row database result set or a multi-gigabyte file, the comprehension will attempt to build a million-entry dictionary in memory simultaneously. For streaming or large datasets, use a generator expression with dict() to evaluate lazily, or build the dictionary incrementally in chunks.

Duplicate keys silently overwrite: If the key expression produces the same value for multiple items, later iterations silently overwrite earlier ones without raising any error. This is standard Python dictionary behaviour but is easy to miss when the comprehension filters and transforms are complex.

# Duplicate key scenario — only one 'Alice' survives
users = [
    {"name": "Alice", "role": "admin"},
    {"name": "Alice", "role": "user"},   # Same name — overwrites the first entry
    {"name": "Bob",   "role": "user"},
]

name_to_role = {u["name"]: u["role"] for u in users}
print(name_to_role)
# {'Alice': 'user', 'Bob': 'user'} — the 'admin' role was silently lost

If duplicate keys are a concern, validate uniqueness before building the map, or use collections.defaultdict(list) to accumulate multiple values per key.

Under the Hood: Performance & Mechanics

The performance advantage of dictionary comprehensions over equivalent for loops is rooted in the bytecode level. Using Python’s dis module to examine the compiled bytecode reveals the difference clearly.

import dis

# Comprehension version
dis.dis(compile("{k: v for k, v in items}", "<>", "eval"))
# Key instructions: GET_ITER, FOR_ITER, MAP_ADD — optimized inner loop

# For-loop version
code = """
d = {}
for k, v in items:
    d[k] = v
"""
dis.dis(compile(code, "<>", "exec"))
# Key instructions: GET_ITER, FOR_ITER, BINARY_SUBSCR, STORE_SUBSCR — more steps

The comprehension uses MAP_ADD which is a direct CPython C-API call: PyDict_SetItem(dict, key, value). The for-loop uses STORE_SUBSCR, which invokes the dict’s __setitem__ method through Python’s attribute resolution machinery — an additional layer of indirection.

For small dictionaries (fewer than ~50 items), the difference is immeasurable in practice. For larger dictionaries built in tight loops as part of a data pipeline, the cumulative savings are meaningful. On CPython 3.12, a comprehension building 100,000 entries is typically 20–25% faster than the equivalent for loop.

Advanced Edge Cases

Edge Case 1: Conditional value transformation with ternary

The if clause at the end of a comprehension acts as a filter — items that don’t match are excluded entirely. But sometimes you want to include all items and transform the value conditionally rather than exclude the item. The ternary operator in the value expression handles this.

# Test scores — assign grade labels, include all students
scores = {"Alice": 93, "Bob": 51, "Charlie": 77, "Diana": 88}

# The 'if/else' in the value position is a ternary, NOT the comprehension filter
graded = {
    name: ("Pass" if score >= 60 else "Fail")
    for name, score in scores.items()
}

print(graded)
# {'Alice': 'Pass', 'Bob': 'Fail', 'Charlie': 'Pass', 'Diana': 'Pass'}

# Contrast: the 'if' at the END acts as a filter (excludes items entirely)
passing_only = {name: score for name, score in scores.items() if score >= 60}
print(passing_only)
# {'Alice': 93, 'Charlie': 77, 'Diana': 88} — Bob is excluded entirely

The syntax distinction is subtle but important: {k: (a if cond else b) for k in items} transforms values conditionally; {k: v for k in items if cond} filters items conditionally.

Edge Case 2: Using walrus operator for computed conditions

Python 3.8 introduced the walrus operator (:=), which assigns a value inside an expression. In dictionary comprehensions, this allows computing an intermediate value once and using it in both the condition and the expression, avoiding double computation.

import re

log_lines = [
    "2026-04-17 ERROR: Disk full on /dev/sda1",
    "2026-04-17 INFO: Server started",
    "2026-04-17 ERROR: Connection refused to 10.0.0.5",
    "2026-04-17 WARN: Memory usage at 85%",
]

# Extract error messages — compute the regex match once, not twice
error_map = {
    i: m.group(0)
    for i, line in enumerate(log_lines)
    if (m := re.search(r"ERROR: (.+)", line))
}

print(error_map)
# {0: 'ERROR: Disk full on /dev/sda1', 2: 'ERROR: Connection refused to 10.0.0.5'}

Without the walrus operator, you would either call re.search() twice (once in the if condition and once in the value expression) or refactor to a for loop. The walrus operator makes single-pass filtering with computed values natural inside comprehensions.

Testing Dictionary Comprehension in Python

Because a dictionary comprehension is a pure expression with no side effects, testing it reduces to asserting the correctness of the output dictionary given a specific input.

import unittest


def build_product_catalog(products: list[dict]) -> dict[int, str]:
    """Build an id-to-name map for in-stock products only."""
    return {p["id"]: p["name"] for p in products if p["in_stock"]}


class TestBuildProductCatalog(unittest.TestCase):

    def setUp(self):
        self.products = [
            {"id": 101, "name": "Laptop",   "in_stock": True},
            {"id": 102, "name": "Monitor",  "in_stock": False},
            {"id": 103, "name": "Keyboard", "in_stock": True},
        ]

    def test_includes_only_in_stock_items(self):
        result = build_product_catalog(self.products)
        self.assertIn(101, result)
        self.assertIn(103, result)
        self.assertNotIn(102, result)

    def test_maps_id_to_name_correctly(self):
        result = build_product_catalog(self.products)
        self.assertEqual(result[101], "Laptop")
        self.assertEqual(result[103], "Keyboard")

    def test_empty_input_returns_empty_dict(self):
        result = build_product_catalog([])
        self.assertEqual(result, {})

    def test_all_out_of_stock_returns_empty_dict(self):
        out_of_stock = [
            {"id": 201, "name": "Mouse", "in_stock": False},
        ]
        result = build_product_catalog(out_of_stock)
        self.assertEqual(result, {})


if __name__ == "__main__":
    unittest.main()

The key testing strategy for dictionary comprehensions is to wrap them in named functions, which makes them independently callable and testable. Testing the boundary conditions — empty input, all-items-filtered, single-item input — ensures the comprehension handles the full range of production inputs.

Summary

Dictionary comprehensions are a first-class Python feature that replace verbose for-loop dictionary construction with a concise, declarative expression. They execute 15–30% faster than equivalent loops due to the MAP_ADD bytecode instruction, avoid namespace pollution by running in a private scope, and express transformational intent more clearly since the key-value relationship is captured in a single readable expression. Common patterns include dictionary inversion, key-value normalization, zip-based record construction, and conditional value transformation using ternary expressions. Keep comprehensions simple — the moment a comprehension becomes difficult to read in one pass, rewrite it as an explicit loop.

How to Do Dictionary Comprehension in Python: A Complete Guide

The Problem

The Python Solution: Dictionary Comprehension

How Dictionary Comprehension Works in Python

Going Further — Real-World Patterns

What to Watch Out For

Under the Hood: Performance & Mechanics

Advanced Edge Cases

Testing Dictionary Comprehension in Python

Summary

Related Posts