Why Time Series Databases Exist and When You Actually Need One
Time series databases solve performance and storage issues that general-purpose databases face with timestamped data. This article explains their optimization tricks and provides a decision tree to determine if your workload truly benefits from one.
June 2026 · 5 min read
Every second, your IoT sensors, server metrics, and financial tickers generate another timestamped data point. Store them in a general-purpose database without planning for that volume, and you can expect pain. That's why time series databases exist.
The Problem With General-Purpose Databases
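Dump a million temperature readings into PostgreSQL or MongoDB with default settings, and you'll often find:
- Writes choke: unless you batch them, every row insertion is committed as its own transaction
- Reads crawl: without an index or partitioning on the timestamp, time-range queries scan the whole table
- Storage bloats: there is no time-series-aware compression for regularly spaced timestamps or slowly changing values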
Time series data isn't random; it's sequential, append-heavy, and rarely updated. General databases treat it like any other table, which wastes resources.
What Time Series Databases Optimize For
Writes at massive scale
InfluxDB, TimescaleDB, and Prometheus can ingest hundreds of thousands to millions of data points per second on suitable hardware by batching writes and using append-optimized storage engines. Most of the per-row locking and transaction overhead goes away.
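To make the batching idea concrete, here is a minimal sketch of a client-side write buffer. The `BatchWriter` class and its `sink` callback are hypothetical names for illustration, not the API of any of the databases above; real clients add time-based flushing, retries, and backpressure.

```python
class BatchWriter:
    """Buffers points and hands them to `sink` in batches (hypothetical helper)."""

    def __init__(self, sink, batch_size=1000):
        self.sink = sink              # callable that receives a list of points
        self.batch_size = batch_size
        self.buffer = []
        self.flushes = 0              # number of batches handed to the sink

    def write(self, ts, value):
        # Appending to a list is cheap; the expensive call happens once per batch.
        self.buffer.append((ts, value))
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        if not self.buffer:
            return
        self.sink(self.buffer)
        self.flushes += 1
        self.buffer = []              # start a fresh list; the sink keeps the old one


received = []                         # stands in for a database or network call
writer = BatchWriter(received.append, batch_size=100)
for i in range(250):
    writer.write(1_700_000_000 + i, 20.0)
writer.flush()                        # push the final partial batch
```

In this run, 250 points reach the sink in three calls instead of 250, which is where the savings come from when each call carries a fixed cost.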
Compression tricks
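Timestamps arrive at regular intervals, and consecutive values often stay within a narrow range. Time series DBs exploit both with:
- Delta-of-delta encoding for timestamps (store differences of differences, which are mostly zero at a fixed interval)
- Gorilla-style XOR compression for floats (the lossless scheme described in Facebook's Gorilla paper)
- Chunking: store blocks of data together, not individual rows
Result: often 90% or more storage savings compared to an uncompressed relational table, depending on how regular the data is.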
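The two encodings are easier to see in code. The sketch below shows only the transform step, assuming integer timestamps and 64-bit floats; the function names are made up for illustration, and a real implementation would go on to bit-pack the results, which is where the actual space savings happen.

```python
import struct


def delta_of_delta_encode(timestamps):
    """Return [first_ts, first_delta, dod, dod, ...] for a list of integer timestamps."""
    if not timestamps:
        return []
    out = [timestamps[0]]
    prev, prev_delta = timestamps[0], 0
    for ts in timestamps[1:]:
        delta = ts - prev
        out.append(delta - prev_delta)   # zero whenever the interval is unchanged
        prev, prev_delta = ts, delta
    return out


def delta_of_delta_decode(encoded):
    """Inverse of delta_of_delta_encode."""
    if not encoded:
        return []
    out = [encoded[0]]
    delta = 0
    for dod in encoded[1:]:
        delta += dod
        out.append(out[-1] + delta)
    return out


def float_bits(x):
    """Raw 64-bit pattern of a float, as an unsigned integer."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]


def xor_encode(values):
    """XOR each value's bit pattern with the previous one; repeats become 0."""
    out, prev = [], 0
    for v in values:
        bits = float_bits(v)
        out.append(bits ^ prev)
        prev = bits
    return out


def xor_decode(encoded):
    """Inverse of xor_encode."""
    out, prev = [], 0
    for x in encoded:
        prev ^= x
        out.append(struct.unpack(">d", struct.pack(">Q", prev))[0])
    return out


print(delta_of_delta_encode([1000, 1010, 1020, 1030, 1041]))  # [1000, 10, 0, 0, 1]
```

A sensor reporting every 10 seconds produces a long run of zeros after the first two entries, and a value that does not change XORs to zero as well. Runs of zeros and small numbers are what the bit-packing stage then stores in very few bits.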
Time-range queries that don't suck
Need "average CPU load every 5 minutes over the last 30 days"? Time series DBs use:
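- Downsampling: pre-computed aggregates at coarser resolutions, so long-range queries read summaries instead of raw points
- Retention policies: auto-delete old raw data, keep summaries
- Time-based partitioning: split data by hour or day so a range scan touches only the partitions that overlap the query window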
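The three techniques above reduce to small pieces of arithmetic on timestamps. This is a plain-Python sketch, assuming Unix timestamps in seconds and points given as `(timestamp, value)` pairs; the function names are illustrative only, and a real database runs the equivalent logic inside its storage engine.

```python
from collections import defaultdict


def downsample_mean(points, bucket_seconds):
    """Average (ts, value) points into fixed-width buckets keyed by bucket start."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for ts, value in points:
        bucket = ts - (ts % bucket_seconds)   # round down to the bucket boundary
        sums[bucket] += value
        counts[bucket] += 1
    return [(b, sums[b] / counts[b]) for b in sorted(sums)]


def apply_retention(points, now, max_age_seconds):
    """Keep only points newer than the retention cutoff."""
    cutoff = now - max_age_seconds
    return [(ts, v) for ts, v in points if ts >= cutoff]


def partitions_for_range(start, end, partition_seconds):
    """IDs of the time partitions that overlap the inclusive range [start, end]."""
    return list(range(start // partition_seconds, end // partition_seconds + 1))


raw = [(0, 1.0), (60, 3.0), (299, 5.0), (300, 10.0)]
five_minute = downsample_mean(raw, 300)        # [(0, 3.0), (300, 10.0)]
recent = apply_retention(raw, now=400, max_age_seconds=150)
days = partitions_for_range(2 * 86400 + 5, 3 * 86400 + 1, 86400)
```

The partition lookup is the reason range scans stay fast as data grows: a query for two days of data opens two partitions, no matter how many years are stored.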
When You Actually Need One (vs. Faking It)
✅ Use a time series database when:
- You write data far more often than you read it (on the order of 100x): monitoring, logging, sensor streams
- Your queries are "give me everything between two timestamps", not random record lookups
- You need automatic downsampling: keep 1-second resolution for 7 days, 5-minute for a year
- Your dataset grows forever: nothing is deleted except by a retention window
❌ Skip it when:
- You have a few thousand records: SQLite with an index works fine
- Your "time series" is actually event logs with varied schemas: a search-oriented store such as Elasticsearch fits better
- You need complex joins across non-time data: "temperature per sensor" is fine; "temperature per sensor joined with user profiles" is awkward in most purpose-built engines (TimescaleDB, as a PostgreSQL extension, is the exception)
- Your updates are frequent: time series DBs optimize for append-only, not mutations
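For the small-data case, the "SQLite with an index" option can be checked directly. The sketch below uses Python's built-in `sqlite3` module with a made-up `readings` table and index name, and asks SQLite for its query plan to confirm that a time-range query uses the index rather than scanning every row.

```python
import sqlite3


def build_db(n_rows=5000):
    """Create an in-memory table of readings with an index on the timestamp."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE readings ("
        "ts INTEGER NOT NULL, sensor_id TEXT NOT NULL, value REAL NOT NULL)"
    )
    conn.execute("CREATE INDEX idx_readings_ts ON readings (ts)")
    rows = [(i, "sensor-1", 20.0 + (i % 10)) for i in range(n_rows)]
    with conn:                         # one transaction for the whole batch
        conn.executemany("INSERT INTO readings VALUES (?, ?, ?)", rows)
    return conn


def range_average(conn, start, end):
    """Average value for start <= ts < end."""
    row = conn.execute(
        "SELECT avg(value) FROM readings WHERE ts >= ? AND ts < ?", (start, end)
    ).fetchone()
    return row[0]


def range_query_plan(conn, start, end):
    """Human-readable plan text for the same range query."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN "
        "SELECT avg(value) FROM readings WHERE ts >= ? AND ts < ?",
        (start, end),
    ).fetchall()
    return " ".join(str(r[-1]) for r in rows)   # last column holds the detail text


conn = build_db()
print(range_average(conn, 0, 10))
print(range_query_plan(conn, 0, 10))
```

The exact plan wording varies between SQLite versions, but it should name `idx_readings_ts`. If it reports a full scan instead, the index is missing or the query is not filtering on the indexed column.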
The Real Trade-Offs
With most purpose-built time series engines you don't get the relational features you're used to (TimescaleDB, which runs inside PostgreSQL, is the main exception):
- No foreign keys: data is denormalized by design
- Limited joins: most queries are against a single metric
- Restricted filtering: queries almost always filter by time first, and filtering on unindexed fields is slow or unsupported
You also pick up operational baggage. Time series databases often require their own query language (InfluxQL or Flux for InfluxDB, depending on the version; PromQL for Prometheus), their own clustering setup, and their own backup strategies.
The Shortcut Decision Tree
- Do you write more than 10,000 data points per second? → Yes
- Do you need to query "last 7 days" faster than 100ms? → Yes
- Is your query pattern "give me the average per minute, not the individual rows"? → Yes
- Are you okay learning a new database and query language? → Yes
Four yeses? Look at InfluxDB or TimescaleDB. Otherwise, your existing PostgreSQL with a well-placed timestamp index will do just fine and save you a lot of operational headache.
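The checklist is simple enough to write down as a function. This is only a restatement of the four questions above, using the article's thresholds; the `Workload` fields and the `recommend` function are illustrative names, and the numbers are rules of thumb rather than hard limits.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    writes_per_second: int
    needs_week_query_under_100ms: bool   # "last 7 days" faster than 100ms
    queries_are_aggregates: bool         # averages per minute, not individual rows
    team_accepts_new_tooling: bool       # new database and query language


def recommend(w):
    """Return a recommendation string based on the four-question checklist."""
    answers = [
        w.writes_per_second > 10_000,
        w.needs_week_query_under_100ms,
        w.queries_are_aggregates,
        w.team_accepts_new_tooling,
    ]
    if all(answers):
        return "time series database (InfluxDB or TimescaleDB)"
    return "PostgreSQL with a timestamp index"


metrics_pipeline = Workload(50_000, True, True, True)
small_app = Workload(200, False, True, True)
print(recommend(metrics_pipeline))
print(recommend(small_app))
```

A single "no" is enough to fall back to the simpler option, which matches the article's bias toward keeping the existing database unless every condition holds.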