Stop Wasting Memory: Why Python Generators Will Change How You Write Code

Learn how Python generators and lazy evaluation let you process massive datasets with minimal memory, using practical examples like infinite sequences and file processing pipelines.

June 2026 · 8 min read

You're building a tool to process a 10GB log file, and your laptop just froze. Sound familiar? The problem isn't your machine — it's how you think about data. Python generators and lazy evaluation are the mental shift that lets you handle massive datasets with the memory footprint of a postage stamp.

What's the Big Deal About Lazy Evaluation?

Let's start with a concrete example. Imagine you need to process every line in a 10GB file. The naive approach:

# This will crash or swap like crazy
with open("huge_log.txt") as f:
    all_lines = f.readlines()  # builds a list holding every line at once
for line in all_lines:
    process(line)

readlines() loads the entire file into memory. On a 10GB file, that's 10GB of RAM gone — plus Python's overhead. Instead, generators give you:

# Memory usage: one line at a time
def read_lines(filename):
    with open(filename) as f:
        for line in f:
            yield line

for line in read_lines("huge_log.txt"):
    process(line)

The magic? yield pauses execution, returns a value, and remembers where it left off. Each iteration pulls only what's needed — the rest of the file never occupies memory.
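To watch the laziness happen, here is a minimal sketch that runs read_lines against a small throwaway file. The temporary file and the count_lines helper are assumptions made for illustration; they stand in for the huge log and are not part of the original example.

```python
import os
import tempfile

def read_lines(filename):
    """Yield lines one at a time; only the current line is held in memory."""
    with open(filename) as f:
        for line in f:
            yield line

def count_lines(filename):
    """Hypothetical helper: count lines without ever building a list."""
    return sum(1 for _ in read_lines(filename))

# Create a small throwaway file standing in for the huge log.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.writelines(f"event {i}\n" for i in range(1000))
    path = tmp.name

gen = read_lines(path)   # nothing has run yet; the file is not even open
first = next(gen)        # the body runs up to the first yield
gen.close()              # closing the generator exits the with-block
total = count_lines(path)
os.remove(path)
print(repr(first), total)  # 'event 0\n' 1000
```

Worth knowing: a Python file object is already a lazy line iterator, so wrapping it in a generator mostly buys you a place to hang the with-block and any per-line logic.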

The Generator Mechanics That Matter

A generator isn't just a function with yield — it's a state machine. Here's what Python does under the hood:

def counter(n):
    i = 0
    while i < n:
        yield i
        i += 1

gen = counter(3)
print(next(gen))  # 0 — function runs until first yield
print(next(gen))  # 1 — resumes after yield, continues loop
print(next(gen))  # 2
print(next(gen))  # raises StopIteration (nothing is printed): generator exhausted

Think of each next() call as: "Run the function until you hit yield, hand me that value, then freeze everything — local variables, execution point, everything."
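You can watch that freeze-and-resume cycle with the standard library's inspect.getgeneratorstate. This is a small sketch reusing the counter function from above; the states list and the exhausted flag are just names chosen for the demo.

```python
import inspect

def counter(n):
    i = 0
    while i < n:
        yield i
        i += 1

gen = counter(2)
states = [inspect.getgeneratorstate(gen)]      # created, body not started
next(gen)
states.append(inspect.getgeneratorstate(gen))  # paused at the yield
next(gen)

exhausted = False
try:
    next(gen)   # loop condition fails, the function returns
except StopIteration:
    exhausted = True
states.append(inspect.getgeneratorstate(gen))  # finished for good

print(states)  # ['GEN_CREATED', 'GEN_SUSPENDED', 'GEN_CLOSED']
```

A for loop does the same dance for you: it calls next() repeatedly and swallows the StopIteration at the end.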

Generator Expressions: The One-Liner Power Move

If you've used list comprehensions, generator expressions will feel familiar but dangerously powerful:

# List comprehension: computes all values upfront
squares = [x**2 for x in range(10_000_000)]  # ~80MB for the list alone, plus the int objects

# Generator expression: computes on demand
squares = (x**2 for x in range(10_000_000))  # A small, fixed-size object

# Usage is identical for iteration
for sq in squares:
    print(sq)

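Memory difference? Roughly 80MB for the list's pointer array alone (the ten million int objects add a few hundred MB on top) versus a hundred or two bytes for the generator object, no matter how large the range is. Generator expressions use parentheses () instead of brackets [], and they're memory-efficient by default. The trade-off: you can only iterate once, and you can't index into them.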
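Here is a rough way to check those claims yourself with sys.getsizeof. It's a sketch with a smaller n so it runs quickly, and keep in mind getsizeof reports only the container itself, not the integers it points to. Exact byte counts vary by Python version, so none are shown.

```python
import sys

n = 100_000  # smaller than the article's 10 million so it runs fast
as_list = [x**2 for x in range(n)]
as_gen = (x**2 for x in range(n))

list_bytes = sys.getsizeof(as_list)  # the list's pointer array only
gen_bytes = sys.getsizeof(as_gen)    # the generator object itself

# Trade-off 1: no indexing.
indexable = True
try:
    as_gen[0]
except TypeError:
    indexable = False

# Trade-off 2: single pass.
first_pass = sum(as_gen)    # consumes the generator
second_pass = sum(as_gen)   # already exhausted, so this sums nothing

print(list_bytes, gen_bytes)
print(indexable, first_pass == sum(as_list), second_pass)  # False True 0
```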

Real-World Patterns: When Generators Save Your Day

1. Infinite Sequences

Lists can't be infinite. Generators can — they produce values on-the-fly forever:

def fibonacci():
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

fib = fibonacci()
first_20 = [next(fib) for _ in range(20)]  # [0, 1, 1, 2, 3, 5...]

No upper bound, no memory growth. You take what you need.
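For taking a slice of an infinite generator, the standard library's itertools does the bookkeeping for you. A short sketch using islice and takewhile with the same fibonacci generator (the variable names are just for the demo):

```python
from itertools import islice, takewhile

def fibonacci():
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

# Take a fixed number of items.
first_10 = list(islice(fibonacci(), 10))

# Or take items until a condition fails.
below_100 = list(takewhile(lambda x: x < 100, fibonacci()))

print(first_10)   # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
print(below_100)  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
```

Both helpers are lazy themselves, so nothing is computed until list() starts pulling.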

2. Pipelining Data

Build processing chains that never create intermediate lists:

def read_ints(filepath):
    with open(filepath) as f:
        for line in f:
            yield int(line.strip())

def filter_even(stream):
    for x in stream:
        if x % 2 == 0:
            yield x

def multiply_by_10(stream):
    for x in stream:
        yield x * 10

# Pipeline — memory usage stays tiny
pipeline = multiply_by_10(filter_even(read_ints("numbers.txt")))
for result in pipeline:
    print(result)

Each generator in the chain processes one element and passes it forward. No full copies, no buffer bloat.
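If you want proof that elements flow through one at a time, log what each stage does. In this sketch the source function is a hypothetical in-memory stand-in for read_ints, and the events list exists only to record the order of operations.

```python
events = []  # records the order in which stages touch each element

def source(values):
    """Hypothetical stand-in for read_ints, fed from a list."""
    for v in values:
        events.append(f"read {v}")
        yield v

def filter_even(stream):
    for x in stream:
        if x % 2 == 0:
            events.append(f"keep {x}")
            yield x

def multiply_by_10(stream):
    for x in stream:
        events.append(f"scale {x}")
        yield x * 10

pipeline = multiply_by_10(filter_even(source([1, 2, 3, 4])))
built_without_work = (events == [])  # building the chain runs nothing

results = list(pipeline)
print(results)  # [20, 40]
print(events)
# ['read 1', 'read 2', 'keep 2', 'scale 2',
#  'read 3', 'read 4', 'keep 4', 'scale 4']
```

Notice that 2 is scaled before 3 is even read: the consumer at the end of the chain drives everything.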

3. Lazy File Processing

The most common win: handling files too large for memory.

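def tail(filename, n=10, chunk_size=1024):
    """Read last n lines of a huge file without loading all of it."""
    with open(filename, "rb") as f:  # binary mode allows arbitrary seeks
        f.seek(0, 2)  # Seek to end
        pos = f.tell()
        data = b""
        # Read chunks backwards until we have more than n newlines
        # (or reach the start of the file)
        while pos > 0 and data.count(b"\n") <= n:
            step = min(chunk_size, pos)
            pos -= step
            f.seek(pos)
            data = f.read(step) + data
    # Assumes UTF-8 text; only the tail chunks were ever read
    return [line.decode("utf-8") for line in data.splitlines()[-n:]]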
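If seeking feels like too much machinery, there is a simpler variant that leans purely on lazy iteration: feed the file iterator into a collections.deque with a maxlen. This is a different technique from the seek-based idea above. It still reads the whole file once from the front, but it never holds more than n lines in memory. The name tail_lazy and the temporary file are assumptions for the demo.

```python
import os
import tempfile
from collections import deque

def tail_lazy(filename, n=10):
    """Return the last n lines, holding at most n lines in memory."""
    with open(filename) as f:
        # deque pulls lines from the file iterator one by one and
        # discards the oldest line once maxlen is reached.
        return list(deque(f, maxlen=n))

# Throwaway file standing in for a large log.
with tempfile.NamedTemporaryFile("w", suffix=".log", delete=False) as tmp:
    tmp.writelines(f"line {i}\n" for i in range(100))
    path = tmp.name

last_three = tail_lazy(path, n=3)
os.remove(path)
print(last_three)  # ['line 97\n', 'line 98\n', 'line 99\n']
```

The trade-off is time for simplicity: seeking skips most of the file, while the deque version touches every line but stays a three-line function.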

When Not to Use Generators

Generators aren't free. They have overhead per next() call. For small datasets (under a few thousand items), a list is usually faster and simpler.
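Don't take the overhead claim on faith; measure it on your own machine. Here is a rough timeit sketch comparing a list comprehension with a generator expression on a small input. The function names and sizes are arbitrary choices, and the timings depend on your hardware and Python version, so no expected numbers are shown.

```python
import timeit

small = range(1000)

def with_list():
    # Builds the full list first, then sums it.
    return sum([x * 2 for x in small])

def with_generator():
    # Feeds sum() one value at a time through the generator protocol.
    return sum(x * 2 for x in small)

list_time = timeit.timeit(with_list, number=200)
gen_time = timeit.timeit(with_generator, number=200)

print(f"list: {list_time:.4f}s  generator: {gen_time:.4f}s")
```

Both functions return the same result; only the cost profile differs. Rerun it with a much larger range and watch memory instead of time to see the other side of the trade.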

Key trade-offs:

Scenario	Choose
Dataset fits in memory, accessed multiple times	List or tuple
Large dataset, single pass	Generator
Need random access by index	List or array
Infinite or unknown length	Generator
Debugging, need to inspect values	List (easier)

The yield from Shortcut

Python 3.3 introduced yield from for delegating to sub-generators:

def flatten(nested):
    for item in nested:
        if isinstance(item, list):
            yield from flatten(item)
        else:
            yield item

nested = [1, [2, [3, 4], 5], 6]
print(list(flatten(nested)))  # [1, 2, 3, 4, 5, 6]

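Same logic as a manual for loop that re-yields each item, but cleaner and often slightly faster. yield from also forwards send() and throw() calls to the sub-generator and evaluates to its return value.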
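To make the comparison concrete, here is a sketch with the manual version next to flatten, plus a second pair of generators showing that yield from hands back the sub-generator's return value. The names flatten_manual, chunk_total, and report are made up for this example.

```python
def flatten_manual(nested):
    """What you would write without yield from."""
    for item in nested:
        if isinstance(item, list):
            for sub in flatten_manual(item):
                yield sub
        else:
            yield item

def flatten(nested):
    for item in nested:
        if isinstance(item, list):
            yield from flatten(item)
        else:
            yield item

def chunk_total(values):
    """Yield each value, then return the running total."""
    total = 0
    for v in values:
        yield v
        total += v
    return total

def report(values):
    # yield from re-yields every value and evaluates to the return value.
    total = yield from chunk_total(values)
    yield f"total={total}"

nested = [1, [2, [3, 4], 5], 6]
print(list(flatten_manual(nested)) == list(flatten(nested)))  # True
print(list(report([1, 2, 3])))  # [1, 2, 3, 'total=6']
```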

Generator Sending: Two-Way Communication

Generators can receive values too, using .send():

def running_average():
    total = 0
    count = 0
    avg = None
    while True:
        value = yield avg  # Receives value via send()
        if value is not None:
            total += value
            count += 1
            avg = total / count

avg_gen = running_average()
next(avg_gen)  # Initialize — run to first yield
print(avg_gen.send(10))  # 10.0
print(avg_gen.send(20))  # 15.0
print(avg_gen.send(30))  # 20.0

This is the mechanism behind generator-based coroutines, and the groundwork that async/await was later built on, but that's a rabbit hole for another article.
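That initial next() call is easy to forget, and sending a non-None value to a generator that hasn't started raises a TypeError. One common workaround is a small decorator that primes the generator for you. The primed decorator below is a hypothetical helper written for this sketch, not something from the standard library.

```python
import functools

def primed(func):
    """Hypothetical decorator: advance a new generator to its first yield."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        gen = func(*args, **kwargs)
        next(gen)  # run to the first yield so send() works right away
        return gen
    return wrapper

@primed
def running_average():
    total = 0
    count = 0
    avg = None
    while True:
        value = yield avg  # Receives value via send()
        if value is not None:
            total += value
            count += 1
            avg = total / count

avg_gen = running_average()  # already primed, no next() needed
results = [avg_gen.send(v) for v in (10, 20, 30)]
avg_gen.close()              # stop the infinite loop cleanly
print(results)  # [10.0, 15.0, 20.0]
```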

The Bottom Line

Generators aren't about being clever. They're about working within constraints. Many large-scale data tools you've used lean on lazy evaluation somewhere: Spark's deferred transformations, database cursors, pandas' chunked readers. Learning to think this way makes you a better programmer, not just a Python developer.

Next time you write [x for x in something], ask yourself: do I really need all of this right now? If not, switch to () and save your RAM for what matters.
