
InfluxDB vs TimescaleDB vs QuestDB: Which Time-Series Database for High-Volume Telemetry?


The Write Rate Problem

When we were designing the Stima data backend, we needed to answer a concrete question: at what write rate does each database candidate start showing cracks? The number we were targeting at scale was 50,000 writes per second — roughly 5,000 vehicles each reporting 10 telemetry measurements every second. That's not a theoretical ceiling; it's the load profile we'd hit with a mid-size fleet operator running a combined vehicle count in the thousands across multiple cities.

We tested three candidates: InfluxDB 2.7 (OSS), TimescaleDB 2.11 on PostgreSQL 15, and QuestDB 7.3. All tests ran on identically provisioned AWS c5.4xlarge instances (16 vCPU, 32GB RAM, 1TB gp3 SSD at 16,000 IOPS). We used synthetic vehicle telemetry data: 12 fields per measurement (voltage, current, temperature, SOC, SOH, latitude, longitude, speed, odometer, charge_cycles, pack_id, vehicle_id), timestamped at second-level resolution.
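For concreteness, the 12-field payload can be sketched as a Go struct. The field list comes from the description above; the Go types are our assumption, not the harness's exact representation:

```go
package main

import (
	"fmt"
	"reflect"
)

// Measurement mirrors the 12 per-measurement fields listed above.
// Types are illustrative guesses, not the harness's exact schema.
type Measurement struct {
	Voltage      float64
	Current      float64
	Temperature  float64
	SOC          float64 // state of charge, percent
	SOH          float64 // state of health, percent
	Latitude     float64
	Longitude    float64
	Speed        float64
	Odometer     float64
	ChargeCycles int64
	PackID       string
	VehicleID    string
}

// fieldCount reports how many fields one measurement carries.
func fieldCount() int {
	return reflect.TypeOf(Measurement{}).NumField()
}

func main() {
	fmt.Println("fields per measurement:", fieldCount())
}
```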

The test harness was a Go program that batched writes into 1,000-row payloads and varied the number of concurrent writer goroutines from 10 to 200. We measured p50, p95, and p99 write latency, sustained throughput over 60 minutes of continuous load, and disk usage after 24 hours of simulated data at 10,000 vehicles.
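A minimal sketch of that fan-out pattern, with the database send stubbed out (the real harness recorded per-request latency at this point; names here are ours):

```go
package main

import (
	"fmt"
	"sync"
)

// batch splits rows into payloads of at most size rows each,
// mirroring the 1,000-row batches the harness used.
func batch(rows []string, size int) [][]string {
	var out [][]string
	for len(rows) > size {
		out = append(out, rows[:size])
		rows = rows[size:]
	}
	if len(rows) > 0 {
		out = append(out, rows)
	}
	return out
}

func main() {
	rows := make([]string, 2500)
	payloads := batch(rows, 1000)

	// Fan payloads out to a fixed pool of writer goroutines,
	// as the harness did with 10-200 concurrent writers.
	work := make(chan []string)
	var wg sync.WaitGroup
	for w := 0; w < 10; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for p := range work {
				_ = p // send p to the database here and record latency
			}
		}()
	}
	for _, p := range payloads {
		work <- p
	}
	close(work)
	wg.Wait()
	fmt.Println("payloads sent:", len(payloads)) // prints 3
}
```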

InfluxDB 2.7: The Expected Choice That Disappointed

InfluxDB is the name most people reach for when they think about IoT time-series storage, and it performed well at moderate load. At 10,000 writes/second, p50 latency was 8ms, p99 was 24ms. Sustained throughput was comfortable and consistent. The line protocol ingestion API is well-documented and easy to implement against.
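Line protocol is a plain-text format: measurement name plus comma-separated tags, a space, comma-separated fields, a space, and a nanosecond timestamp. A hedged sketch of a formatter for our telemetry shape — the tag/field split and measurement name are our assumptions:

```go
package main

import (
	"fmt"
	"strings"
)

// lineProtocol renders one telemetry point in InfluxDB line protocol:
//   measurement,tag=v,... field=v,... timestamp_ns
// Treating vehicle_id and pack_id as tags is our choice for illustration.
func lineProtocol(vehicleID, packID string, fields map[string]float64, tsNanos int64) string {
	// Emit fields in a fixed order so output is deterministic.
	parts := make([]string, 0, len(fields))
	for _, k := range []string{"voltage", "current", "soc"} {
		if v, ok := fields[k]; ok {
			parts = append(parts, fmt.Sprintf("%s=%g", k, v))
		}
	}
	return fmt.Sprintf("telemetry,vehicle_id=%s,pack_id=%s %s %d",
		vehicleID, packID, strings.Join(parts, ","), tsNanos)
}

func main() {
	fmt.Println(lineProtocol("veh-001", "pack-17",
		map[string]float64{"voltage": 51.2, "current": 12.4, "soc": 81.5},
		1700000000000000000))
}
```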

The problems appeared above 30,000 writes/second. At that point, InfluxDB's WAL (write-ahead log) began to create backpressure. The compaction process — which InfluxDB runs to consolidate TSM files — competed with ingest for I/O bandwidth. p99 latency climbed to 180ms. At 50,000 writes/second, we saw sustained periods where the HTTP ingest API returned 503 errors with "hinted handoff" messages — InfluxDB's mechanism for handling write overflow, which stores rejected writes in a local queue for later retry.

The hinted handoff queue grew faster than it drained during our 60-minute stress test. After 60 minutes at 50,000 writes/second, the queue contained approximately 3.2 million unprocessed data points. InfluxDB's OSS tier doesn't include the clustering features needed to scale write throughput horizontally — that capability lives in InfluxDB Cloud or the commercial InfluxDB Enterprise product.

For a startup running on a seed budget, paying for InfluxDB Enterprise at the write rates we needed didn't pencil out. InfluxDB OSS is an excellent database for smaller deployments — up to roughly 20,000 writes/second on our test hardware — but it's not the right foundation if you're planning to scale aggressively on a single node.

TimescaleDB 2.11: SQL Familiarity Costs You Throughput

TimescaleDB runs as a PostgreSQL extension, which means your telemetry data lives in a PostgreSQL table and you query it with standard SQL. For engineers who already know Postgres, the operational familiarity is real — you can use pg_dump, existing ORM libraries, and standard monitoring tools without modification. That matters for a small team.

TimescaleDB's hypertable abstraction automatically partitions data into time-based chunks, which helps query performance on time-range lookups. The continuous aggregates feature — which pre-computes rollups at configurable intervals — is particularly useful for our degradation analysis queries, which frequently aggregate per-vehicle metrics over 7-day or 30-day windows.

The write throughput ceiling was lower than InfluxDB's. TimescaleDB hit noticeable latency degradation at around 15,000 writes/second, largely because PostgreSQL's MVCC architecture and WAL logging overhead are not optimized for high-frequency append-only workloads. At 30,000 writes/second, we saw lock contention on the chunk insertion paths and p99 latencies above 400ms.

TimescaleDB with batch inserts (we tested INSERT with 5,000-row batches instead of 1,000) was meaningfully better — p99 at 120ms at 30,000 writes/second. But the Postgres connection pool itself became a bottleneck above 40,000 writes/second regardless of batch size. If our write volume stayed under 15,000 writes/second permanently, TimescaleDB would be our choice for the SQL ecosystem benefits. Above that, the throughput limits are a real constraint.
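The batch-insert pattern is a single multi-row INSERT with numbered placeholders. A sketch of the statement builder (table and column names are illustrative):

```go
package main

import (
	"fmt"
	"strings"
)

// multiRowInsert builds one INSERT covering `rows` rows of `cols` columns
// with numbered placeholders ($1, $2, ...), the pattern behind our
// 5,000-row batches. Names here are illustrative, not Stima's schema.
func multiRowInsert(table string, cols []string, rows int) string {
	var b strings.Builder
	fmt.Fprintf(&b, "INSERT INTO %s (%s) VALUES ", table, strings.Join(cols, ", "))
	n := 1
	for r := 0; r < rows; r++ {
		if r > 0 {
			b.WriteString(", ")
		}
		ph := make([]string, len(cols))
		for c := range cols {
			ph[c] = fmt.Sprintf("$%d", n)
			n++
		}
		fmt.Fprintf(&b, "(%s)", strings.Join(ph, ", "))
	}
	return b.String()
}

func main() {
	fmt.Println(multiRowInsert("telemetry", []string{"ts", "vehicle_id", "voltage"}, 2))
	// Caveat: the Postgres wire protocol caps bind parameters at 65,535 per
	// statement, so 5,000-row batches with a dozen or so columns sit just
	// under the limit.
}
```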

QuestDB 7.3: The Unexpected Winner

QuestDB was the least familiar option on our shortlist, which made it easy to overlook. It turned out to be the database the benchmark was built for. QuestDB is written in Java with a low-level columnar storage engine that uses memory-mapped I/O to bypass the JVM heap for most data operations. It supports both InfluxDB line protocol ingestion (same client libraries, no code changes) and a SQL query interface.

At 50,000 writes/second, QuestDB's p50 latency was 6ms, p95 was 18ms, and p99 was 41ms. No dropped writes during the 60-minute sustained test. Disk usage after 24 hours of 10,000-vehicle data was 34% smaller than InfluxDB for equivalent data, because QuestDB stores columns independently and applies dictionary encoding to high-cardinality string fields like vehicle_id and pack_id automatically.

We pushed the test to 100,000 writes/second. QuestDB sustained it for the full 60-minute test at p99 under 80ms. The single-node throughput was surprising enough that we ran the test three times to verify it wasn't an artifact of our test setup. It wasn't.

QuestDB's tradeoffs are real. The SQL dialect is close to standard but not identical — some Postgres functions are absent, and time-series joins (QuestDB's ASOF JOIN, for example) behave differently enough to require some learning. The community is smaller than InfluxDB's, which means fewer Stack Overflow answers when you're debugging something unusual. For our use case — high-frequency writes, time-range queries, aggregate computations — the performance characteristics outweighed the ecosystem size.

Query Performance: The Other Half of the Problem

Write throughput is only one dimension. The degradation analysis queries that power Stima's prediction engine are computationally expensive — they compute rolling standard deviations of internal resistance estimates over 50-point windows, join that output against a vehicle metadata table, and filter results by fleet ID. On a dataset of 10,000 vehicles with 90 days of history, that's a large scan.

InfluxDB's Flux query language can express the computation, but the execution time was 12–18 seconds for the full-fleet degradation scan. TimescaleDB with continuous aggregates pre-computed the rolling statistics and returned results in 2–3 seconds, but setting up the continuous aggregate configuration required non-trivial Postgres expertise. QuestDB's SAMPLE BY clause and built-in window functions ran the equivalent query in 1.1–2.4 seconds without pre-computation.

The query performance difference shapes the product experience. Our degradation scoring pipeline runs every 15 minutes for each fleet. At 10,000 vehicles, an 18-second scan per fleet caps a sequential pipeline at roughly 50 fleets per cycle — fine for early-stage, problematic at scale. QuestDB's sub-3-second scan meant the pipeline could comfortably handle 200+ fleet scans per 15-minute cycle on a single node.

What We Chose and Why

We run QuestDB as our primary telemetry store, with TimescaleDB as a secondary store for structured vehicle and customer metadata that benefits from full relational semantics. The split architecture gives us QuestDB's write throughput and query speed for time-series data, and TimescaleDB's rich SQL tooling for the relational data that doesn't need microsecond write latency.

If you're building IoT telemetry infrastructure and write rate is your primary concern, benchmark QuestDB before committing to InfluxDB on name recognition alone. The gap in sustained throughput on commodity hardware is large enough to materially affect your infrastructure cost at scale. For a team our size, the difference between a database that needs a second node at 30,000 writes/second and one that runs comfortably to 100,000+ on a single node translates directly to operational simplicity and cost.

The benchmark methodology and raw results are available on request. Email engineering@stimaboda.org if you want the full data file.

Filed under: Engineering, Data Infrastructure