JSONL Compression: gzip vs zstd vs Brotli
A practical guide to compressing JSONL files: compare compression ratios and speed benchmarks, and learn when to use gzip, zstd, or Brotli for your data pipelines, cloud storage, and web delivery.
Last updated: February 2026
Why Compress JSONL Files?
JSONL files grow fast. A single day of application logs can produce gigabytes of line-delimited JSON, and machine learning datasets routinely reach tens of gigabytes. Without compression, you pay more for storage, transfers take longer, and I/O becomes the bottleneck in your data pipeline. Compression is not optional at scale; it is a fundamental part of working with JSONL data efficiently.
The good news is that JSONL compresses exceptionally well. Because JSON is repetitive text with recurring keys, delimiters, and structural patterns, compression algorithms can exploit this redundancy to achieve 5x to 15x size reduction. The challenge is choosing the right algorithm for your use case: gzip offers universal compatibility, zstd delivers the best speed-to-ratio tradeoff, and Brotli achieves the highest compression for static assets. This guide compares all three with real benchmarks, working code examples, and clear recommendations.
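The effect is easy to reproduce with the standard library alone. The sketch below builds a synthetic log-style JSONL payload (the field names and values are made up for illustration) and measures the gzip ratio; real data with higher-entropy values will compress less:

```python
import gzip
import json
import random

# Synthetic JSONL: repetitive keys plus a few low-entropy values,
# the shape that compresses best.
random.seed(0)
lines = [
    json.dumps({
        'timestamp': f'2026-02-01T12:00:{i % 60:02d}Z',
        'level': random.choice(['info', 'warn', 'error']),
        'status': random.choice([200, 404, 500]),
        'message': 'request completed',
    })
    for i in range(10_000)
]
raw = ('\n'.join(lines) + '\n').encode('utf-8')
packed = gzip.compress(raw, compresslevel=6)

# Highly repetitive JSONL typically lands well above 5x.
print(f'{len(raw) / len(packed):.1f}x')
```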
Compression Algorithms Overview
Three algorithms dominate the JSONL compression landscape. Each uses different strategies and is optimized for different scenarios. Understanding their tradeoffs helps you make the right choice for your specific workload.
gzip (DEFLATE)
The universal standard. gzip has been around since 1992 and is supported everywhere: every programming language, every operating system, every cloud provider, and every web browser. It uses the DEFLATE algorithm, combining LZ77 and Huffman coding. While not the fastest or most efficient, its ubiquity makes it the safe default choice when compatibility matters most.
Zstandard (zstd)
Developed by Facebook in 2016, zstd is the modern workhorse of data compression. It compresses and decompresses significantly faster than gzip while achieving similar or better ratios. Zstd also supports dictionary compression, which is especially powerful for JSONL files where every line shares the same key structure. It is the best choice for data pipelines and real-time processing.
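Dictionary compression works by seeding the compressor with bytes the data is known to contain, so shared key names cost almost nothing per record. The sketch below illustrates the idea with zlib's preset-dictionary support (`zdict`), which ships in the Python standard library; zstd's trained dictionaries generalize this, and the record fields here are invented for the example:

```python
import json
import zlib

# Preset dictionary seeded with the keys every record shares.
# (Stand-in for a trained zstd dictionary, using stdlib zlib.)
ZDICT = b'{"timestamp": "", "level": "", "message": "", "service": ""}'

record = json.dumps({
    'timestamp': '2026-02-01T12:00:00Z',
    'level': 'info',
    'message': 'request completed',
    'service': 'api',
}).encode('utf-8')

# Compress one small record without and with the dictionary.
plain = zlib.compress(record, 6)
comp = zlib.compressobj(level=6, zdict=ZDICT)
with_dict = comp.compress(record) + comp.flush()

# Decompression must be given the same dictionary.
decomp = zlib.decompressobj(zdict=ZDICT)
assert decomp.decompress(with_dict) == record
```

The gain is largest for exactly the single-record case that streaming JSONL pipelines hit: each line is too short to compress well on its own, but shares nearly all of its structure with every other line.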
Brotli
Created by Google, Brotli achieves the highest compression ratios among the three, especially at maximum compression levels. It uses a combination of LZ77, Huffman coding, and a built-in static dictionary of common web content. Brotli excels at compressing JSONL for HTTP delivery and static storage, but its compression speed at high levels is notably slower than gzip or zstd.
Head-to-Head Comparison
The following table summarizes the key differences between gzip, zstd, and Brotli across the metrics that matter most when compressing JSONL files. These are general characteristics at default settings; actual performance varies with data and compression level.
| Metric | gzip | zstd | Brotli |
|---|---|---|---|
| Compression Ratio | Good (5-8x) | Very Good (6-10x) | Excellent (7-12x) |
| Compression Speed | Moderate | Fast | Slow to Moderate |
| Decompression Speed | Moderate | Very Fast | Fast |
| CPU Usage | Moderate | Low to Moderate | High (at max level) |
| Browser Support | All browsers | Chrome 123+, Firefox 126+ | All modern browsers |
| Streaming Support | Yes (native) | Yes (native) | Limited |
Benchmark Results: 100 MB JSONL File
To give concrete numbers, here are benchmark results from compressing a 100 MB JSONL file containing application log records. Each record has 12 fields including timestamps, log levels, message strings, and nested metadata objects. Tests were run on an AMD Ryzen 7 with 32 GB RAM and NVMe storage.
| Algorithm & Level | Compressed Size | Ratio | Compress Time | Decompress Time |
|---|---|---|---|---|
| gzip (level 6) | 14.2 MB | 7.0x | 2.8s | 0.9s |
| gzip (level 9) | 13.1 MB | 7.6x | 8.4s | 0.9s |
| zstd (level 3) | 12.8 MB | 7.8x | 0.6s | 0.3s |
| zstd (level 1) | 15.1 MB | 6.6x | 0.3s | 0.3s |
| Brotli (level 6) | 11.5 MB | 8.7x | 3.2s | 0.5s |
| Brotli (level 11) | 9.8 MB | 10.2x | 42.1s | 0.4s |
Benchmarks are representative of typical JSONL log data. Results vary depending on field cardinality, value entropy, and record structure. Files with highly repetitive keys and low-entropy values (such as log levels or status codes) compress better than those with unique high-entropy strings.
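Because results depend so heavily on the data, it is worth measuring on your own files. A minimal timing harness using only the standard library (the file path in the comment is a placeholder; swap in `pyzstd` or `brotli` calls to benchmark the other algorithms):

```python
import gzip
import time

def bench_gzip(data: bytes, level: int):
    """Time one gzip compress/decompress pass; return (ratio, c_time, d_time)."""
    t0 = time.perf_counter()
    packed = gzip.compress(data, compresslevel=level)
    t1 = time.perf_counter()
    gzip.decompress(packed)
    t2 = time.perf_counter()
    return len(data) / len(packed), t1 - t0, t2 - t1

# Example: compare levels on your own JSONL sample.
# data = open('data.jsonl', 'rb').read()
# for level in (1, 6, 9):
#     print(level, bench_gzip(data, level))
```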
Compression Code Examples
Here are practical examples for compressing and decompressing JSONL files in Python, Node.js, and from the command line. Each example shows how to work with all three algorithms.
Python has built-in gzip support. For zstd and Brotli, install the pyzstd and brotli packages. All three follow the same pattern: open a compressed file handle, then read or write JSONL lines through it.
```python
import gzip
import json

# === gzip (built-in) ===
# Write compressed JSONL
with gzip.open('data.jsonl.gz', 'wt', encoding='utf-8') as f:
    for record in records:
        f.write(json.dumps(record, ensure_ascii=False) + '\n')

# Read compressed JSONL
with gzip.open('data.jsonl.gz', 'rt', encoding='utf-8') as f:
    for line in f:
        record = json.loads(line)

# === zstd (pip install pyzstd) ===
import pyzstd

# Write compressed JSONL
with pyzstd.open('data.jsonl.zst', 'wt', encoding='utf-8') as f:
    for record in records:
        f.write(json.dumps(record, ensure_ascii=False) + '\n')

# Read compressed JSONL
with pyzstd.open('data.jsonl.zst', 'rt', encoding='utf-8') as f:
    for line in f:
        record = json.loads(line)

# === Brotli (pip install brotli) ===
import brotli

# Compress an entire JSONL file
with open('data.jsonl', 'rb') as f:
    raw = f.read()
compressed = brotli.compress(raw, quality=6)
with open('data.jsonl.br', 'wb') as f:
    f.write(compressed)

# Decompress
with open('data.jsonl.br', 'rb') as f:
    raw = brotli.decompress(f.read())
for line in raw.decode('utf-8').splitlines():
    record = json.loads(line)
```
Node.js includes built-in support for both gzip and Brotli through the zlib module. For zstd, use the @aspect-build/zstd or fzstd npm package. The stream-based API is ideal for processing large JSONL files without loading them entirely into memory.
```javascript
import { createReadStream, createWriteStream } from 'fs';
import {
  createGzip, createGunzip,
  createBrotliCompress, createBrotliDecompress,
} from 'zlib';
import { createInterface } from 'readline';
import { pipeline } from 'stream/promises';

// === gzip compress ===
await pipeline(
  createReadStream('data.jsonl'),
  createGzip({ level: 6 }),
  createWriteStream('data.jsonl.gz')
);

// === gzip decompress & parse ===
const gunzip = createGunzip();
const rl = createInterface({
  input: createReadStream('data.jsonl.gz').pipe(gunzip),
});
for await (const line of rl) {
  if (line.trim()) {
    const record = JSON.parse(line);
    // process record
  }
}

// === Brotli compress ===
await pipeline(
  createReadStream('data.jsonl'),
  createBrotliCompress(),
  createWriteStream('data.jsonl.br')
);

// === Brotli decompress & parse ===
const br = createBrotliDecompress();
const rl2 = createInterface({
  input: createReadStream('data.jsonl.br').pipe(br),
});
for await (const line of rl2) {
  if (line.trim()) {
    const record = JSON.parse(line);
    // process record
  }
}
```
Command-line tools are the fastest way to compress JSONL files. gzip is pre-installed on all Unix systems. Install zstd and brotli via your package manager for the other two algorithms.
```shell
# === gzip ===
# Compress (-k keeps the original file)
gzip -k data.jsonl        # -> data.jsonl.gz
gzip -9 -k data.jsonl     # max compression
# Decompress
gzip -d data.jsonl.gz
# or: gunzip data.jsonl.gz

# === zstd ===
# Install: brew install zstd / apt install zstd
zstd data.jsonl           # -> data.jsonl.zst
zstd -3 data.jsonl        # level 3 (the default)
zstd --fast data.jsonl    # fastest compression
# Decompress
zstd -d data.jsonl.zst
# or: unzstd data.jsonl.zst

# === Brotli ===
# Install: brew install brotli / apt install brotli
brotli data.jsonl         # -> data.jsonl.br
brotli -q 6 data.jsonl    # quality 6
brotli -q 11 data.jsonl   # max compression
# Decompress
brotli -d data.jsonl.br

# === Piping with jq ===
# Compress filtered JSONL
jq -c 'select(.level == "error")' data.jsonl | gzip > errors.jsonl.gz
# Decompress and count lines
zstd -dc data.jsonl.zst | wc -l
```
Cloud Storage Compression Strategies
When storing JSONL files in cloud object storage, compression reduces both storage costs and transfer time. Most cloud providers support transparent decompression for gzip and Brotli through their CDN layers, but the upload and storage strategies differ.
Upload compressed JSONL to S3 with the correct Content-Encoding header. S3 stores the compressed bytes, and CloudFront can serve them with automatic decompression. For data lake workloads, tools like AWS Athena and Spark natively read gzip and zstd compressed JSONL.
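A minimal boto3 sketch of that upload (the bucket, key, and helper names are illustrative; the essential detail is passing `ContentEncoding='gzip'` to `put_object` so downstream consumers know the bytes are compressed):

```python
import gzip
import json

def pack_jsonl_gz(records):
    """Serialize records to JSONL and gzip the resulting bytes."""
    body = '\n'.join(json.dumps(r, ensure_ascii=False) for r in records) + '\n'
    return gzip.compress(body.encode('utf-8'))

def upload_to_s3(records, bucket, key):
    """Upload gzip-compressed JSONL with Content-Encoding set, so
    CloudFront can serve it transparently and Athena can query it."""
    import boto3  # assumed installed: pip install boto3
    s3 = boto3.client('s3')
    s3.put_object(
        Bucket=bucket,
        Key=f'{key}.jsonl.gz',
        Body=pack_jsonl_gz(records),
        ContentEncoding='gzip',
        ContentType='application/x-ndjson',
    )
```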
Google Cloud Storage supports gzip transcoding. When you upload a gzip-compressed object with the Content-Encoding: gzip header, GCS can serve the decompressed version automatically when clients send Accept-Encoding: gzip. For BigQuery imports, use gzip-compressed JSONL directly.
```python
from google.cloud import storage
import gzip
import json

client = storage.Client()
bucket = client.bucket('my-data-bucket')

# Upload gzip-compressed JSONL
def upload_compressed(records, blob_name):
    blob = bucket.blob(f'{blob_name}.jsonl.gz')
    blob.content_encoding = 'gzip'
    blob.content_type = 'application/x-ndjson'
    data = '\n'.join(
        json.dumps(r, ensure_ascii=False) for r in records
    ).encode('utf-8')
    blob.upload_from_string(
        gzip.compress(data),
        content_type='application/x-ndjson',
    )

# BigQuery: load compressed JSONL directly
# bq load --source_format=NEWLINE_DELIMITED_JSON \
#   my_dataset.my_table gs://bucket/data.jsonl.gz schema.json
```
Best Practices: When to Use Which Algorithm
There is no single best compression algorithm. The right choice depends on whether you prioritize storage size, processing speed, compatibility, or a balance of all three. Here are clear recommendations for common JSONL use cases.
Archival & Cold Storage
Use Brotli (quality 9-11) or zstd (level 19+) for maximum compression.
Compression time matters less for archival. You compress once and decompress rarely. Brotli at quality 11 can achieve 10x+ compression on JSONL data, significantly reducing long-term storage costs.
Real-time Data Pipelines
Use zstd (level 1-3) for the best speed-to-ratio tradeoff.
In streaming pipelines (Kafka, Kinesis, Flink), compression and decompression speed directly affect throughput and latency. Zstd at level 1 compresses faster than gzip while achieving better ratios. Its dictionary mode is ideal for JSONL with fixed schemas.
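Streaming compressors follow a common pattern: feed chunks in as they arrive, emit compressed bytes as soon as they are available, and flush at the end. The sketch below shows that pattern with stdlib zlib at level 1 as a stand-in (zstd streaming APIs in `pyzstd` work the same way, with better speed and ratio):

```python
import zlib

def stream_compress(chunks):
    """Incrementally DEFLATE-compress an iterable of JSONL byte chunks,
    yielding compressed output as it becomes available."""
    comp = zlib.compressobj(1)  # low level: favor latency over ratio
    for chunk in chunks:
        out = comp.compress(chunk)
        if out:
            yield out
    yield comp.flush()
```

Because output is yielded incrementally, a producer can push compressed batches to Kafka or Kinesis without ever holding the whole stream in memory.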
Web Delivery & APIs
Use Brotli for static files, gzip as fallback for maximum compatibility.
All modern browsers support Brotli via Accept-Encoding: br. CDNs like Cloudflare and CloudFront can automatically compress with Brotli. Use gzip as fallback for older clients. Zstd browser support is growing but not yet universal.
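Server-side, the fallback logic amounts to inspecting the `Accept-Encoding` header and preferring Brotli when the client advertises it. A minimal sketch of that negotiation (`negotiate_encoding` is a hypothetical helper; it deliberately ignores q-values for brevity):

```python
def negotiate_encoding(accept_encoding: str) -> str:
    """Pick the best content encoding the client advertises,
    preferring Brotli, then gzip, then no compression."""
    offered = {
        token.split(';')[0].strip()
        for token in accept_encoding.split(',')
    }
    for enc in ('br', 'gzip'):
        if enc in offered:
            return enc
    return 'identity'
```

For example, a modern browser sending `Accept-Encoding: gzip, deflate, br` gets Brotli, while an older client advertising only gzip falls back cleanly.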
ETL & Batch Processing
Use gzip for maximum compatibility, or zstd for better performance.
Most data tools (Spark, Athena, BigQuery, pandas) support gzip natively. Zstd support is improving rapidly. If your toolchain supports zstd, prefer it for 3-5x faster compression with comparable ratios.
Try Our Free JSONL Tools
Compress your JSONL files before uploading, or validate and convert them using our free online tools. No installation required.