DeltaGlider 🛸
Store 4TB of similar files in 5GB. No, that's not a typo.
DeltaGlider is a drop-in S3 replacement that achieves 99.9% compression for versioned artifacts, backups, and release archives through intelligent binary delta compression.
The Problem We Solved
You're storing hundreds of versions of your releases. Each 100MB build differs by <1% from the previous version. You're paying to store 100GB of what's essentially 100MB of unique data.
Sound familiar?
Real-World Impact
From our ReadOnlyREST case study:
- Before: 201,840 files, 3.96TB storage, $1,120/year
- After: Same files, 4.9GB storage, $1.32/year
- Compression: 99.9% (not a typo)
- Integration time: 5 minutes
How It Works
Traditional S3:
v1.0.0.zip (100MB) → S3: 100MB
v1.0.1.zip (100MB) → S3: 100MB (200MB total)
v1.0.2.zip (100MB) → S3: 100MB (300MB total)
With DeltaGlider:
v1.0.0.zip (100MB) → S3: 100MB reference + 0KB delta
v1.0.1.zip (100MB) → S3: 98KB delta (100.1MB total)
v1.0.2.zip (100MB) → S3: 97KB delta (100.2MB total)
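To make the idea concrete, here is a toy sketch of delta storage in Python. It is not how DeltaGlider or xdelta3 work internally: it only matches fixed, block-aligned slices, and the names `make_delta`, `apply_delta`, `literal_bytes`, and `BLOCK` are made up for this illustration. It shows why a near-identical second version needs only a small amount of new data.

```python
import random

BLOCK = 4096  # toy block size, chosen arbitrarily for this sketch


def make_delta(reference: bytes, target: bytes) -> list:
    """Encode target as copy/insert operations against reference."""
    # Index each block-aligned slice of the reference by its content.
    index = {}
    for off in range(0, len(reference), BLOCK):
        index.setdefault(reference[off:off + BLOCK], off)

    ops = []
    for off in range(0, len(target), BLOCK):
        chunk = target[off:off + BLOCK]
        ref_off = index.get(chunk)
        if ref_off is not None:
            ops.append(("copy", ref_off, len(chunk)))  # bytes already in the reference
        else:
            ops.append(("insert", chunk))              # new bytes, stored literally
    return ops


def apply_delta(reference: bytes, ops: list) -> bytes:
    """Rebuild the target from the reference plus the recorded operations."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            _, off, length = op
            out += reference[off:off + length]
        else:
            out += op[1]
    return bytes(out)


def literal_bytes(ops: list) -> int:
    """Count the bytes that had to be stored because no reference block matched."""
    return sum(len(op[1]) for op in ops if op[0] == "insert")


# Two 1 MB "releases" that differ in a single byte.
v1 = random.Random(0).randbytes(1_000_000)
v2 = bytearray(v1)
v2[500_000] ^= 0xFF
v2 = bytes(v2)

delta = make_delta(v1, v2)
assert apply_delta(v1, delta) == v2
print(f"full copy: {len(v2)} bytes, literal bytes in delta: {literal_bytes(delta)}")
```

Only the one block containing the changed byte is stored literally; everything else becomes a reference into v1. A real delta encoder such as xdelta3 also finds matches at arbitrary offsets, so it copes with insertions and deletions that shift the remaining bytes.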
Quick Start
Installation
# Via pip (Python 3.11+)
pip install deltaglider
# Via uv (faster)
uv pip install deltaglider
# Via Docker
docker run -v ~/.aws:/root/.aws deltaglider/deltaglider --help
Your First Upload
# Upload a file - DeltaGlider automatically handles compression
deltaglider put my-app-v1.0.0.zip s3://releases/
# Upload v1.0.1 - automatically creates a 99% smaller delta
deltaglider put my-app-v1.0.1.zip s3://releases/
# ↑ This 100MB file takes only ~100KB in S3
# Download - automatically reconstructs from delta
deltaglider get s3://releases/my-app-v1.0.1.zip
# ↑ Seamless reconstruction, SHA256 verified
Intelligent File Type Detection
DeltaGlider automatically detects file types and applies the optimal strategy:
| File Type | Strategy | Typical Compression |
|---|---|---|
| .zip, .tar, .gz | Binary delta | 99%+ for similar versions |
| .dmg, .deb, .rpm | Binary delta | 95%+ for similar versions |
| .jar, .war, .ear | Binary delta | 90%+ for similar builds |
| .exe, .dll, .so | Direct upload | 0% (no delta benefit) |
| .txt, .json, .xml | Direct upload | 0% (use gzip instead) |
| .sha1, .sha512, .md5 | Direct upload | 0% (already minimal) |
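The routing in the table can be sketched as a simple extension lookup. This is an illustration only, not DeltaGlider's actual detection code: the function name `choose_strategy` and the constant `DELTA_EXTENSIONS` are hypothetical, and the extension set is copied from the table above.

```python
from pathlib import PurePosixPath

# Extensions the table lists under "Binary delta" (hypothetical constant name).
DELTA_EXTENSIONS = {
    ".zip", ".tar", ".gz",
    ".dmg", ".deb", ".rpm",
    ".jar", ".war", ".ear",
}


def choose_strategy(key: str) -> str:
    """Return "delta" for archive-like files and "direct" for everything else."""
    # Only the final suffix is checked, so "backup.tar.gz" is judged by ".gz".
    suffix = PurePosixPath(key).suffix.lower()
    return "delta" if suffix in DELTA_EXTENSIONS else "direct"


for name in ("my-app-v1.0.1.zip", "backup-20240101.tar.gz", "plugin.dll", "notes.txt"):
    print(f"{name}: {choose_strategy(name)}")
```

Anything not on the list, including files with no extension, falls through to a direct upload, which matches the table's default for non-archive types.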
Performance Benchmarks
Testing with real software releases:
# 513 Elasticsearch plugin releases (82.5MB each)
Original size: 42.3 GB
DeltaGlider size: 115 MB
Compression: 99.7%
Upload speed: 3-4 files/second
Reconstruction time: <100ms
Integration Examples
CI/CD Pipeline (GitHub Actions)
- name: Upload Release with 99% compression
  run: |
    pip install deltaglider
    deltaglider put dist/*.zip s3://releases/${{ github.ref_name }}/
Backup Script
#!/bin/bash
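# Daily backup with automatic deduplication
BACKUP="backup-$(date +%Y%m%d).tar.gz"
tar -czf "$BACKUP" /data
# Upload only today's archive; a backup-*.tar.gz glob would also match older files
deltaglider put "$BACKUP" s3://backups/
# Only the changes since the previous backup are stored, not a full copy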
Python SDK
from deltaglider import DeltaService
service = DeltaService(
    bucket="releases",
    storage_backend="s3",  # or "minio", "r2", etc
)
# Upload with automatic compression
summary = service.put("my-app-v2.0.0.zip", "v2.0.0/")
print(f"Stored {summary.original_size} as {summary.stored_size}")
# Output: Stored 104857600 as 98304
# (sizes in bytes: a 99.9% reduction)
# Download with automatic reconstruction
service.get("v2.0.0/my-app-v2.0.0.zip", "local-copy.zip")
Architecture
DeltaGlider uses a clean hexagonal architecture:
┌─────────────┐     ┌──────────────┐     ┌─────────────┐
│  Your App   │────▶│ DeltaGlider  │────▶│  S3/MinIO   │
│  (CLI/SDK)  │     │    Core      │     │   Storage   │
└─────────────┘     └──────────────┘     └─────────────┘
                           │
                    ┌──────▼───────┐
                    │ Local Cache  │
                    │ (References) │
                    └──────────────┘
Key Components:
- Binary diff engine: xdelta3 for optimal compression
- Intelligent routing: Automatic file type detection
- Integrity verification: SHA256 on every operation
- Local caching: Fast repeated operations
- No extra infrastructure: No database, no manifest files
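As a rough illustration of the integrity check listed above, the sketch below hashes a file in chunks with Python's standard `hashlib` and compares the result with a stored digest. The helper names `sha256_file` and `verify` are assumptions for this example and are not part of DeltaGlider's API.

```python
import hashlib


def sha256_file(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Return the hex SHA256 of a file, reading it in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # iter() with a sentinel keeps calling fh.read until it returns b"" (EOF).
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: str, expected_hex: str) -> bool:
    """Check a reconstructed file against the digest recorded at upload time."""
    return sha256_file(path) == expected_hex.lower()
```

Hashing after reconstruction catches corruption in either the reference or the delta, because a damaged input produces a different output file and therefore a different digest.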
When to Use DeltaGlider
✅ Perfect for:
- Software releases and versioned artifacts
- Container images and layers
- Database backups and snapshots
- Machine learning model checkpoints
- Game assets and updates
- Any versioned binary data
❌ Not ideal for:
- Already compressed unique files
- Streaming media files
- Frequently changing unstructured data
- Files smaller than 1MB
Comparison
| Solution | Compression | Speed | Integration | Cost |
|---|---|---|---|---|
| DeltaGlider | 99%+ | Fast | Drop-in | Open source |
| S3 Versioning | 0% | Native | Built-in | $$ per version |
| Deduplication | 30-50% | Slow | Complex | Enterprise $$$ |
| Git LFS | Good | Slow | Git-only | $ per GB |
| Restic/Borg | 80-90% | Medium | Backup-only | Open source |
Production Ready
- ✅ Battle tested: 200K+ files in production
- ✅ Data integrity: SHA256 verification on every operation
- ✅ S3 compatible: Works with AWS, MinIO, Cloudflare R2, etc.
- ✅ Atomic operations: No partial states
- ✅ Concurrent safe: Multiple clients supported
- ✅ Well tested: 95%+ code coverage
Development
# Clone the repo
git clone https://github.com/your-org/deltaglider
cd deltaglider
# Install with dev dependencies
uv pip install -e ".[dev]"
# Run tests
uv run pytest
# Run with local MinIO
docker-compose up -d
export AWS_ENDPOINT_URL=http://localhost:9000
deltaglider put test.zip s3://test/
FAQ
Q: What if my reference file gets corrupted?
A: Every operation includes SHA256 verification. Corruption is detected immediately.
Q: How fast is reconstruction?
A: Sub-100ms for typical files. The delta is applied in-memory using xdelta3.
Q: Can I use this with existing S3 data?
A: Yes! DeltaGlider can start optimizing new uploads immediately. Old data remains accessible.
Q: What's the overhead for unique files?
A: Zero. Files without similarity are uploaded directly.
Q: Is this compatible with S3 encryption?
A: Yes, DeltaGlider respects all S3 settings including SSE, KMS, and bucket policies.
The Math
For N versions of an S MB file with D% difference between versions:
Traditional S3: N × S MB
DeltaGlider: S + (N-1) × S × D% MB
Example: 100 versions of 100MB files with 1% difference:
- Traditional: 10,000 MB
- DeltaGlider: 199 MB
- Savings: 98%
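The same arithmetic as a small Python helper, in case you want to plug in your own numbers. The function name `storage_estimate` is just for this sketch, and it uses the simplified model above (one full reference plus one delta per later version), so treat the result as an estimate.

```python
def storage_estimate(n_versions: int, size_mb: float, diff_pct: float) -> dict:
    """Compare full-copy storage with reference-plus-deltas storage, in MB."""
    traditional = n_versions * size_mb
    # One full reference, then (N-1) deltas of S x D% each.
    delta = size_mb + (n_versions - 1) * size_mb * diff_pct / 100
    savings_pct = 100 * (1 - delta / traditional)
    return {
        "traditional_mb": traditional,
        "deltaglider_mb": delta,
        "savings_pct": savings_pct,
    }


result = storage_estimate(n_versions=100, size_mb=100, diff_pct=1)
print(f"Traditional: {result['traditional_mb']:,.0f} MB")
print(f"DeltaGlider: {result['deltaglider_mb']:,.0f} MB")
print(f"Savings: {result['savings_pct']:.0f}%")
```

With 100 versions of 100MB at a 1% difference this reproduces the figures above: 10,000 MB, 199 MB, and 98% savings.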
Contributing
We welcome contributions! See CONTRIBUTING.md for guidelines.
Key areas we're exploring:
- Cloud-native reference management
- Rust implementation for 10x speed
- Automatic similarity detection
- Multi-threaded delta generation
- WASM support for browser usage
License
MIT - Use it freely in your projects.
Success Stories
"We reduced our artifact storage from 4TB to 5GB. This isn't hyperbole—it's math." — ReadOnlyREST Case Study
"Our CI/CD pipeline now uploads 100x faster. Deploys that took minutes now take seconds." — Platform Engineer at [redacted]
"We were about to buy expensive deduplication storage. DeltaGlider saved us $50K/year." — CTO at [stealth startup]
Try it now: Got versioned files in S3? See your potential savings:
# Analyze your S3 bucket
deltaglider analyze s3://your-bucket/
# Output: "Potential savings: 95.2% (4.8TB → 237GB)"
Built with ❤️ by engineers who were tired of paying to store the same bytes over and over.