The sha tag template `prefix={{branch}}-` produces `:-hash` on tag
pushes because {{branch}} is empty, resulting in an invalid Docker
tag like `beshultd/deltaglider:-482f45f`. Only emit sha tags on
branch pushes.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add boto3-compatible bucket ACL operations as pure S3 passthroughs,
following the existing create_bucket/delete_bucket pattern. Includes
CLI commands (put-bucket-acl, get-bucket-acl), 7 integration tests,
and documentation updates (method count 21→23).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Replace dict[str,Any] returns in delete/delete_recursive with DeleteResult
and RecursiveDeleteResult dataclasses for type safety
- Extract _delete_reference/_delete_delta/_classify_objects_for_deletion
helper methods from oversized delete methods in service.py
- Centralize metadata key aliases in METADATA_KEY_ALIASES dict with
resolve_metadata() replacing duplicated _meta_value() lookups
- Add DeltaGliderConfig dataclass with from_env() for centralized config
- Add ObjectKey.full_key property, remove dead _multipart_uploads dict
- Update all consumers (client, CLI, tests) for dataclass access patterns
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Added `rehydrate_for_download` method to download and decompress deltaglider-compressed files, re-uploading them with expiration metadata.
- Introduced `generate_presigned_url_with_rehydration` method to generate presigned URLs that automatically handle rehydration for both regular and deltaglider files.
- Implemented `purge_temp_files` command in CLI to delete expired temporary files from the .deltaglider/tmp/ directory, with options for dry run and JSON output.
- Enhanced service methods to support the new rehydration and purging features, including detailed logging and metrics tracking.
- Added object_limit_reached attribute to BucketStats for tracking limits.
- Introduced QUICK_LIST_LIMIT and SAMPLED_LIST_LIMIT constants to manage listing limits.
- Implemented _first_metadata_value helper function for improved metadata retrieval.
- Updated get_bucket_stats to log when listing is capped due to limits.
- Refactored DeltaMeta to streamline metadata extraction with error handling.
- Enhanced object listing to support max_objects parameter and limit tracking.
This document describes the old metadata format without dg- prefix.
Since v6.0.0 uses the new dg- prefixed format and requires all files
to be re-uploaded (greenfield approach), this diagnosis doc is no longer
relevant.
This is a major release with breaking changes to metadata format.
BREAKING CHANGES:
- All metadata keys now use 'dg-' namespace prefix (becomes 'x-amz-meta-dg-*' in S3)
- Old metadata format is not supported - all files must be re-uploaded
- Stats behavior changed: quick mode no longer shows misleading warnings
Features:
- Metadata now uses real package version (dg-tool: deltaglider/VERSION)
- All metadata keys properly namespaced with 'dg-' prefix
- Clean stats output in quick mode (no per-file warning spam)
- Fixed nonsensical negative compression ratios in quick mode
Fixes:
- Stats now correctly handles delta files without metadata
- Space saved shows 0 instead of negative numbers when metadata unavailable
- Removed misleading warnings in quick mode (metadata not fetched is expected)
- Fixed metadata keys to use hyphens instead of underscores
Documentation:
- Added comprehensive metadata documentation
- Added stats calculation behavior guide
- Added real version tracking documentation
Tests:
- Updated all tests to use new dg- prefixed metadata keys
- All 73 unit tests passing
- All quality checks passing (ruff, mypy)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Fix pagination bug using continuation_token instead of start_after
- Add stats caching to prevent blocking web apps
- Improve code formatting and type checking
- Add comprehensive unit tests for new features
- Fix test mock usage in object_listing tests
**Performance Issues Fixed:**
1. aws_compat.py: Changed to use cached stats only (no bucket scans after uploads)
2. stats.py: Added safety mechanisms to prevent infinite hangs
- Max 10k iterations (10M object limit)
- 10 min timeout on metadata fetching
- Missing pagination token detection
- Graceful error recovery with partial stats
**Refactoring:**
- Reduced nesting in get_bucket_stats from 5 levels to 2 levels
- Extracted 5 helper functions for better maintainability
- Main function reduced from 300+ lines to 33 lines
- 100% backward compatible - no API changes
**Benefits:**
- Web apps no longer hang on upload/delete operations
- Explicit get_bucket_stats() calls complete within bounded time
- Better error handling and logging
- Easier to test and maintain
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Remove unnecessary concurrent.futures patches in tests
- Update test_detailed_stats_flag to match current implementation behavior
- Tests now properly handle parallel metadata fetching without mocking
Created extensive documentation for the DG_MAX_RATIO parameter, which
controls delta compression efficiency thresholds.
New Documentation:
- docs/DG_MAX_RATIO.md (526 lines)
* Complete explanation of how DG_MAX_RATIO works
* Real-world scenarios and use cases
* Decision trees for choosing optimal values
* Industry-specific recommendations
* Monitoring and tuning strategies
* Advanced usage patterns
* Comprehensive FAQ
Updates to Existing Documentation:
- README.md: Added link to DG_MAX_RATIO guide with tip callout
- CLAUDE.md: Added detailed DG_MAX_RATIO explanation and guide link
- Dockerfile: Added inline comments explaining DG_MAX_RATIO tuning
- docs/sdk/getting-started.md: Added DG_MAX_RATIO guide reference
Key Topics Covered:
- What DG_MAX_RATIO does and why it exists
- How to choose the right value (0.2-0.7 range)
- Real-world scenarios (nightly builds, major versions, etc.)
- Industry-specific use cases (SaaS, mobile apps, backups, etc.)
- Configuration examples (Docker, SDK, CLI)
- Monitoring and optimization strategies
- Advanced usage patterns (dynamic ratios, A/B testing)
- FAQ addressing common questions
Examples Included:
- Conservative (0.2-0.3): For dissimilar files or expensive storage
- Default (0.5): Balanced approach for most use cases
- Permissive (0.6-0.7): For very similar files or cheap storage
Value Proposition:
- Helps users optimize compression for their specific use case
- Prevents inefficient delta compression
- Provides data-driven tuning methodology
- Reduces support questions about compression behavior
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
BREAKING CHANGES:
- Encryption is now ALWAYS enabled (cannot be disabled)
- Removed DG_CACHE_ENCRYPTION environment variable
Security Enhancements:
- Encryption is mandatory for all cache operations
- Ephemeral encryption keys per process (forward secrecy)
- Automatic deletion of corrupted cache files on decryption failures
- Auto-cleanup on both decryption failures and SHA mismatches
Changes:
- Removed DG_CACHE_ENCRYPTION toggle from CLI and SDK
- Updated EncryptedCache to auto-delete corrupted files
- Simplified cache initialization (always wrapped with encryption)
- DG_CACHE_ENCRYPTION_KEY remains optional for persistent keys
Documentation:
- Updated CLAUDE.md with encryption always-on behavior
- Updated CHANGELOG.md with breaking changes
- Clarified security model and auto-cleanup behavior
Testing:
- All 119 tests passing with encryption always-on
- Type checking: 0 errors (mypy)
- Linting: All checks passed (ruff)
Rationale:
- Zero-trust cache architecture requires encryption
- Corrupted cache is security risk - auto-deletion prevents exploitation
- Ephemeral keys provide maximum security by default
- Users who need cross-process sharing can opt-in with persistent keys
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>