# Changelog

All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [6.2.0] - 2026-05-22 — Final release; project deprecated

### Deprecated

- **The `deltaglider` Python package is deprecated as of this release.**
  The canonical implementation is now
  [`deltaglider_proxy`](https://github.com/beshu-tech/deltaglider_proxy), a
  single Rust binary that ships the S3-compatible proxy, the `s3` CLI
  (every Python subcommand has a 1:1 Rust equivalent), and the web UI.
  Wire format is byte-identical: data written by this tool is readable
  by `deltaglider_proxy` and vice versa.
- **Every CLI invocation now prints a deprecation banner to stderr.**
  Set `DG_SUPPRESS_DEPRECATION=1` to silence it for CI/automation that
  hasn't migrated yet.
- **PyPI classifier bumped to `Development Status :: 7 - Inactive`.**
- **Repo will be archived approximately one week after this release.**
  PyPI installs continue to work indefinitely (PyPI never deletes
  published versions), but no further updates or security fixes will
  land. File new issues against
  [`deltaglider_proxy`](https://github.com/beshu-tech/deltaglider_proxy/issues).

Migration:

```bash
brew install beshu-tech/tap/deltaglider_proxy
# or grab a binary from
# https://github.com/beshu-tech/deltaglider_proxy/releases

alias dg='deltaglider_proxy s3'
dg cp foo s3://bucket/foo
dg ls s3://bucket
dg migrate s3://src s3://dest
```

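The stderr banner and its `DG_SUPPRESS_DEPRECATION` opt-out, listed under Deprecated above, can be pictured with the following sketch. It is an illustration under assumptions, not the shipped code: the function name `emit_deprecation_notice`, the module-level guard, the banner wording, and the rule that only the value `1` suppresses output are all hypothetical. Only the environment variable name and the stderr destination come from this changelog.

```python
import os
import sys

# Hypothetical banner text; the real wording may differ.
_BANNER = (
    "deltaglider (Python) is deprecated; see "
    "https://github.com/beshu-tech/deltaglider_proxy\n"
)
_already_printed = False  # module-level guard: at most one banner per process


def emit_deprecation_notice(stream=None, environ=None) -> bool:
    """Write the banner once; return True only if this call wrote it."""
    global _already_printed
    stream = sys.stderr if stream is None else stream
    environ = os.environ if environ is None else environ
    # Assumption: only the literal value "1" suppresses the banner.
    if _already_printed or environ.get("DG_SUPPRESS_DEPRECATION") == "1":
        return False
    stream.write(_BANNER)
    _already_printed = True
    return True
```

Writing to stderr keeps the notice out of any stdout that scripts may be parsing, which is why a suppression switch is only needed for log-sensitive CI.
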
### Fixed (carried from Unreleased)

- **Direct-upload metadata now uses the canonical `dg-*` dashed namespace.** Pre-fix, files routed through `_upload_direct` (non-delta-eligible extensions: `.sha1`, `.sha512`, etc.) wrote metadata with bare underscored keys (`original_name`, `file_sha256`, `compression`) while delta and reference uploads correctly used the namespaced form (`dg-original-name`, `dg-file-sha256`, `dg-compression`). Downstream consumers, most visibly the [DeltaGlider Proxy](https://github.com/beshu-tech/deltaglider_proxy), only recognised the dashed form, so every `.sha1`/`.sha512` listing triggered a `PATHOLOGICAL | Missing/corrupt DG metadata` warning. Aligned the writer to the canonical scheme so new uploads stop producing log spam.

### Changed

- **Read path now resolves both schemes uniformly.** The historical bare keys (`original_name`, `compression`, etc.) stay in `METADATA_KEY_ALIASES` so already-stored objects keep being recognised on read, with no migration required. Replaced ad-hoc `metadata.get("compression")` / `metadata.get("original_name")` / `metadata.get("file_size")` / `metadata.get("ref_key")` lookups in `DeltaService.get`, `DeltaService.delete`, `_delete_delta`, the recursive-delete listing path, `client.list_objects_v2`, and `client_operations.stats.get_object_info` with `resolve_metadata(meta, field)` calls so both schemes work transparently for the lifetime of the bucket. New `compression` and `source_name` entries added to the alias table.
- **`DeltaService.get` "regular S3 vs DeltaGlider-managed" dispatch** now uses `resolve_metadata` for the `file_sha256` presence check. Pre-fix, this check looked for the literal string `"dg-file-sha256"` in `obj_head.metadata`, which silently misclassified legacy bare-keyed direct uploads (`file_sha256` without the `dg-` prefix) as "regular S3 objects". They still served correctly because both branches call `_get_direct`, but the wrong log line fired and the wrong `file_size` value was recorded for telemetry. Caught during adversarial PR review.

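The dual-scheme read path described above can be sketched as follows. The names `METADATA_KEY_ALIASES` and `resolve_metadata` appear in this changelog, but the table shape (a tuple with the dashed key first), the three entries shown, and the function signature are assumptions made for illustration; the real table in `core/models.py` may look different.

```python
from typing import Dict, Mapping, Optional, Tuple

# Illustrative subset of the alias table: dashed `dg-*` key first, legacy bare key second.
METADATA_KEY_ALIASES: Dict[str, Tuple[str, ...]] = {
    "original_name": ("dg-original-name", "original_name"),
    "file_sha256": ("dg-file-sha256", "file_sha256"),
    "compression": ("dg-compression", "compression"),
}


def resolve_metadata(meta: Mapping[str, str], field: str) -> Optional[str]:
    """Return the first non-empty value among the aliases of `field`, else None."""
    for key in METADATA_KEY_ALIASES.get(field, (field,)):
        value = meta.get(key)
        if value:  # empty strings count as missing
            return value
    return None
```

Because the dashed key is checked first, an object carrying both forms resolves to the new value, while an object written before the fix still resolves through its bare key.
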
### Added

- **Regression tests for the dual-scheme contract** (`tests/unit/test_metadata_aliases.py`, 11 tests): every alias resolves, new dashed keys win when both are present, empty strings count as missing, the alias-table shape is pinned (first alias dashed, bare underscored alias always present, `compression` + `source_name` present).
- **`test_direct_upload_emits_dashed_namespace`** in `test_core_service.py` pins the writer to emit `dg-*`-only metadata so the original underscored regression cannot return.
- **`test_get_legacy_direct_upload_not_misclassified_as_regular_s3`** in `test_core_service.py` pins the `get()` dispatch to route bare-keyed legacy direct uploads through the DeltaGlider-managed branch (not the "regular S3 object" passthrough). Demonstrated to fail without the corresponding `resolve_metadata` swap, pass with it.

## [6.1.1] - 2026-03-23

### Fixed

- **S3-Compatible Endpoint Support**: Disabled boto3 automatic request checksums (CRC32/CRC64) that were added in boto3 1.36+. S3-compatible stores like Hetzner Object Storage reject these headers with `BadRequest`, breaking direct (non-delta) file uploads. Sets `request_checksum_calculation="when_required"` to restore compatibility while still working with AWS S3.
- **CI: LocalStack pinned to 4.4**: `localstack/localstack:latest` now requires a paid license; pinned to the last free version across all workflows and docker-compose files.

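A minimal sketch of the checksum setting from the first fix above, assuming botocore 1.36 or newer. The helper names `checksum_compatible_config_kwargs` and `make_s3_client` are hypothetical and do not come from the DeltaGlider code base; only the `request_checksum_calculation="when_required"` option is taken from this changelog.

```python
def checksum_compatible_config_kwargs() -> dict:
    """Keyword arguments for botocore.config.Config (botocore 1.36 or newer)."""
    # "when_required" sends a checksum only for operations that mandate one,
    # instead of attaching one to every upload.
    return {"request_checksum_calculation": "when_required"}


def make_s3_client(endpoint_url=None):
    """Build an S3 client with automatic request checksums turned off."""
    # Imported lazily so the helper above stays usable without boto3 installed.
    import boto3
    from botocore.config import Config

    config = Config(**checksum_compatible_config_kwargs())
    return boto3.client("s3", endpoint_url=endpoint_url, config=config)
```

Passing the `endpoint_url` of an S3-compatible store to `make_s3_client` would then produce a client that omits the headers those stores reject.
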
### Changed

- **Dependency Pinning**: All runtime dependencies now use major-version upper bounds (`boto3>=1.35.0,<2.0.0`, etc.) to prevent surprise breaking changes in Docker builds.

### Added

- **S3 Compatibility Tests**: New `test_s3_compat.py` unit tests verifying the boto3 client disables automatic checksums and `put_object` doesn't pass checksum kwargs, as regression protection for non-AWS S3 endpoints.
- **Dependency Management Guide**: Added quarterly dependency refresh checklist and known compatibility constraints to CLAUDE.md.

## [6.1.0] - 2025-02-07

### Added

- **Bucket ACL Management**: New `put_bucket_acl()` and `get_bucket_acl()` methods
  - boto3-compatible passthrough to native S3 ACL operations
  - Supports canned ACLs (`private`, `public-read`, `public-read-write`, `authenticated-read`)
  - Supports grant-based ACLs (`GrantRead`, `GrantWrite`, `GrantFullControl`, etc.)
  - Supports full `AccessControlPolicy` dict for fine-grained control
  - SDK method count increased from 21 to 23
- **New CLI Commands**: `deltaglider put-bucket-acl` and `deltaglider get-bucket-acl`
  - Mirrors `aws s3api put-bucket-acl` / `get-bucket-acl` syntax
  - Accepts bucket name or `s3://bucket` URL format
  - JSON output for `get-bucket-acl` (compatible with AWS CLI)
  - Supports `--endpoint-url`, `--region`, `--profile` flags
- **Docker Publishing**: Added GitHub Actions workflow for multi-arch Docker image builds (amd64/arm64)

### Changed

- **Refactor**: Extracted `DeltaGliderConfig` dataclass for centralized configuration management
- **Refactor**: Introduced typed `DeleteResult` and `RecursiveDeleteResult` dataclasses replacing raw dicts
- **Refactor**: Centralized S3 metadata key aliases into `core/models.py` constants
- **Refactor**: Extracted helper methods in `DeltaService` for improved readability

### Fixed

- Removed unused imports flagged by ruff in test files

### Documentation

- Updated BOTO3_COMPATIBILITY.md (coverage 20% → 23%)
- Updated AWS S3 CLI compatibility docs with ACL command examples
- Refreshed README with dark mode logo and streamlined content
- Cleaned up SDK documentation and examples

## [6.0.0] - 2025-10-17

### Added

- **EC2 Region Detection & Cost Optimization**
  - Automatic detection of EC2 instance region using IMDSv2
  - Warns when EC2 region ≠ S3 client region (potential cross-region charges)
  - Different warnings for auto-detected vs. explicit `--region` flag mismatches
  - Green checkmark when regions are aligned (optimal configuration)
  - Can be disabled with `DG_DISABLE_EC2_DETECTION=true` environment variable
  - Helps users optimize for cost and performance before migration starts
- **New CLI Command**: `deltaglider migrate` for S3-to-S3 bucket migration with compression
  - Supports resume capability (skips already migrated files)
  - Real-time progress tracking with file count and statistics
  - Interactive confirmation prompt (use `--yes` to skip)
  - Prefix preservation by default (use `--no-preserve-prefix` to disable)
  - Dry run mode with `--dry-run` flag
  - Include/exclude pattern filtering
  - Shows compression statistics after migration
  - **EC2-aware region logging**: Detects EC2 instance and warns about cross-region charges
  - **FIXED**: Now correctly preserves original filenames during migration
- **S3-to-S3 Recursive Copy**: `deltaglider cp -r s3://source/ s3://dest/` now supported
  - Automatically uses migration functionality with prefix preservation
  - Applies delta compression during transfer
  - Preserves original filenames correctly
- **Version Command**: Added `--version` flag to show deltaglider version
  - Usage: `deltaglider --version`
- **DeltaService API Enhancement**: Added `override_name` parameter to `put()` method
  - Allows specifying destination filename independently of source filesystem path
  - Enables proper S3-to-S3 transfers without filesystem renaming tricks
- **Rehydration & Purge**: Automatic rehydration of delta-compressed files for presigned URL access
  - New `deltaglider purge` CLI command to clean expired temporary files
- **Metadata Namespace**: Centralized `dg-` prefixed metadata keys for all DeltaGlider metadata
- **S3-Based Stats Caching**: Bucket statistics cached in S3 with automatic invalidation

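The EC2 region check listed first above can be illustrated with the standard IMDSv2 two-step exchange: a `PUT` for a session token, then a `GET` for the region. This is a sketch under assumptions, not DeltaGlider's implementation: the function names, the 60-second token TTL, the one-second timeout, and the warning text are all hypothetical.

```python
import urllib.request
from typing import Optional

IMDS_BASE = "http://169.254.169.254/latest"


def detect_ec2_region(timeout: float = 1.0) -> Optional[str]:
    """Return the instance region via IMDSv2, or None when not on EC2."""
    try:
        # Step 1: request a short-lived session token (IMDSv2 requires a PUT).
        token_req = urllib.request.Request(
            f"{IMDS_BASE}/api/token",
            method="PUT",
            headers={"X-aws-ec2-metadata-token-ttl-seconds": "60"},
        )
        with urllib.request.urlopen(token_req, timeout=timeout) as resp:
            token = resp.read().decode()
        # Step 2: read the region, presenting the token.
        region_req = urllib.request.Request(
            f"{IMDS_BASE}/meta-data/placement/region",
            headers={"X-aws-ec2-metadata-token": token},
        )
        with urllib.request.urlopen(region_req, timeout=timeout) as resp:
            return resp.read().decode().strip()
    except OSError:
        return None  # unreachable endpoint, timeout, or HTTP error


def region_warning(ec2_region: Optional[str], s3_region: str) -> Optional[str]:
    """Return a warning when the regions differ, None when aligned or unknown."""
    if ec2_region is None or ec2_region == s3_region:
        return None
    return (
        f"EC2 region {ec2_region} != S3 client region {s3_region}: "
        "cross-region transfer charges may apply"
    )
```

A caller would typically run `region_warning(detect_ec2_region(), client_region)` once before a migration starts and print the result if it is not `None`.
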
### Fixed

- **Critical**: S3-to-S3 migration now preserves original filenames
  - Previously created files with temp names like `tmp1b9cpdsn.zip`
  - Now correctly uses original filenames from source S3 keys
  - Fixed by adding `override_name` parameter to `DeltaService.put()`
- **CLI Region Support**: `--region` flag now properly passes region to boto3 client
  - Previously only set environment variable, relied on boto3 auto-detection
  - Now explicitly passes `region_name` to `boto3.client()` via `boto3_kwargs`
  - Ensures consistent behavior with `DeltaGliderClient` SDK

### Changed

- Recursive S3-to-S3 copy operations now preserve source prefix structure by default
- Migration operations show formatted output with source and destination paths

### Documentation

- Added comprehensive migration guide in README.md
- Updated CLI reference with migrate command examples
- Added prefix preservation behavior documentation

## [5.1.1] - 2025-01-10

### Fixed

- **Stats Command**: Fixed incorrect compression ratio calculations
  - Now correctly counts ALL files including reference.bin in compressed size
  - Fixed handling of orphaned reference.bin files (reference files with no delta files)
  - Added prominent warnings for orphaned reference files with cleanup commands
  - Fixed stats for buckets with no compression (now shows 0% instead of negative)
  - SHA1 checksum files are now properly included in calculations

### Improved

- **Stats Performance**: Optimized metadata fetching with parallel requests
  - 5-10x faster for buckets with many delta files
  - Uses ThreadPoolExecutor for concurrent HEAD requests
  - Single-pass calculation algorithm for better efficiency

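The concurrent HEAD pattern mentioned under Improved can be sketched with the standard library alone. The function name `fetch_metadata_parallel`, the `head` callback, and the default of ten workers are assumptions for illustration; in practice `head` would wrap an S3 HEAD request.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable


def fetch_metadata_parallel(
    keys: Iterable[str],
    head: Callable[[str], Dict[str, str]],
    max_workers: int = 10,
) -> Dict[str, Dict[str, str]]:
    """Call `head(key)` for every key concurrently and map key -> metadata."""
    key_list = list(keys)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map preserves input order, so zip pairs each key with its result.
        return dict(zip(key_list, pool.map(head, key_list)))
```

Threads suit this workload because each HEAD request spends most of its time waiting on the network rather than holding the interpreter.
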
## [5.1.0] - 2025-10-10

### Added

- **New CLI Command**: `deltaglider stats <bucket>` for bucket statistics and compression metrics
  - Supports `--detailed` flag for comprehensive analysis
  - Supports `--json` flag for machine-readable output
  - Accepts multiple formats: `s3://bucket/`, `s3://bucket`, `bucket`
- **Session-Level Statistics Caching**: Bucket stats now cached per client instance
  - Automatic cache invalidation on mutations (put, delete, bucket operations)
  - Intelligent cache reuse (detailed stats serve quick stat requests)
  - Enhanced `list_buckets()` includes cached stats when available
- **Programmatic Cache Management**: Added cache management APIs for long-running applications
  - `clear_cache()`: Clear all cached references
  - `evict_cache()`: Remove specific cached reference
  - Session-scoped cache lifecycle management

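The session-level stats caching rules above (invalidate on mutation, let detailed stats answer quick requests) can be pictured with a small class. `StatsCache` and its method names are hypothetical and chosen for this sketch; the actual client keeps its cache in its own internal structures.

```python
from typing import Dict, Optional, Tuple


class StatsCache:
    """Per-client cache of bucket stats, keyed by bucket name."""

    def __init__(self) -> None:
        # bucket -> (stats, was_detailed)
        self._entries: Dict[str, Tuple[dict, bool]] = {}

    def get(self, bucket: str, detailed: bool) -> Optional[dict]:
        entry = self._entries.get(bucket)
        if entry is None:
            return None
        stats, was_detailed = entry
        # Detailed stats can answer a quick request, but not the other way round.
        if detailed and not was_detailed:
            return None
        return stats

    def put(self, bucket: str, stats: dict, detailed: bool) -> None:
        self._entries[bucket] = (stats, detailed)

    def invalidate(self, bucket: str) -> None:
        """Call after any mutation (put, delete, bucket operation)."""
        self._entries.pop(bucket, None)
```

Scoping the cache to one client instance means a stale entry can outlive a mutation only if another process made that mutation, which is the usual trade-off of session-level caching.
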
### Changed

- Bucket statistics are now cached within client session for performance
- `list_buckets()` response includes `DeltaGliderStats` metadata when cached

### Documentation

- Added comprehensive DG_MAX_RATIO tuning guide in docs/
- Updated CLI command reference in CLAUDE.md and README.md
- Added detailed cache management documentation

## [5.0.3] - 2025-10-10

### Security

- **BREAKING**: Removed all legacy shared cache code for security
- **BREAKING**: Encryption is now ALWAYS ON (cannot be disabled)
- Ephemeral process-isolated cache is now the ONLY mode (no opt-out)
- **Content-Addressed Storage (CAS)**: Implemented SHA256-based cache storage
  - Negligible collision risk (SHA256 digests are unique in practice)
  - Automatic deduplication (same content = same filename)
  - Tampering protection (changing content changes SHA, breaks lookup)
  - Two-level directory structure for filesystem optimization
- **Encrypted Cache**: All cache data encrypted at rest using Fernet (AES-128-CBC + HMAC)
  - Ephemeral encryption keys per process (forward secrecy)
  - Optional persistent keys via `DG_CACHE_ENCRYPTION_KEY` for shared filesystems
  - Automatic cleanup of corrupted cache files on decryption failures
- Fixed TOCTOU vulnerabilities with atomic SHA validation at use-time
- Added `get_validated_ref()` method to prevent cache poisoning
- Eliminated multi-user data exposure through mandatory cache isolation

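The content-addressed layout and use-time SHA validation described above can be sketched as follows. This is an assumption-laden illustration rather than the code in `adapters/cache_cas.py`: the function names are hypothetical, and the exact two-level split (two hex characters per level) is a guess at what "two-level directory structure" means. Encryption is left out to keep the example dependency-free.

```python
import hashlib
from pathlib import Path
from typing import Optional


def cas_path(root: Path, sha256_hex: str) -> Path:
    """Two-level layout: <root>/<first 2 hex chars>/<next 2>/<full digest>."""
    return root / sha256_hex[:2] / sha256_hex[2:4] / sha256_hex


def store(root: Path, data: bytes) -> str:
    """Write `data` under its own SHA256 and return the digest."""
    digest = hashlib.sha256(data).hexdigest()
    path = cas_path(root, digest)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(data)  # identical content maps to the same file
    return digest


def load_validated(root: Path, sha256_hex: str) -> Optional[bytes]:
    """Return the cached bytes only if they still hash to `sha256_hex`."""
    path = cas_path(root, sha256_hex)
    if not path.is_file():
        return None
    data = path.read_bytes()
    # Hash the bytes actually read, so tampering after the write is detected.
    if hashlib.sha256(data).hexdigest() != sha256_hex:
        return None
    return data
```

Validating the bytes that were read, rather than checking the file and then reopening it, is what closes the check-then-use gap that a TOCTOU attack relies on.
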
### Removed

- **BREAKING**: Removed `DG_UNSAFE_SHARED_CACHE` environment variable
- **BREAKING**: Removed `DG_CACHE_DIR` environment variable
- **BREAKING**: Removed `DG_CACHE_ENCRYPTION` environment variable (encryption always on)
- **BREAKING**: Removed `cache_dir` parameter from `create_client()`

### Changed

- Cache is now auto-created in `/tmp/deltaglider-*` and cleaned on exit
- All cache operations use file locking (Unix) and SHA validation
- Added `CacheMissError` and `CacheCorruptionError` exceptions

### Added

- New `ContentAddressedCache` adapter in `adapters/cache_cas.py`
- New `EncryptedCache` wrapper in `adapters/cache_encrypted.py`
- New `MemoryCache` adapter in `adapters/cache_memory.py` with LRU eviction
- Self-describing cache structure with SHA256-based filenames
- Configurable cache backends via `DG_CACHE_BACKEND` (filesystem or memory)
- Memory cache size limit via `DG_CACHE_MEMORY_SIZE_MB` (default: 100MB)

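LRU eviction under a byte budget, as mentioned for the memory cache above, can be sketched with `collections.OrderedDict`. The class name `LruBytesCache` and its interface are hypothetical and are not the `MemoryCache` adapter itself; the choice to skip entries larger than the whole budget is also an assumption.

```python
from collections import OrderedDict
from typing import Optional


class LruBytesCache:
    """In-memory cache that evicts least-recently-used entries past a byte budget."""

    def __init__(self, max_bytes: int) -> None:
        self.max_bytes = max_bytes
        self._items: "OrderedDict[str, bytes]" = OrderedDict()
        self._used = 0

    def get(self, key: str) -> Optional[bytes]:
        data = self._items.get(key)
        if data is not None:
            self._items.move_to_end(key)  # mark as most recently used
        return data

    def put(self, key: str, data: bytes) -> None:
        if len(data) > self.max_bytes:
            return  # would never fit; skip caching
        if key in self._items:
            self._used -= len(self._items.pop(key))
        self._items[key] = data
        self._used += len(data)
        while self._used > self.max_bytes:
            _, evicted = self._items.popitem(last=False)  # oldest entry first
            self._used -= len(evicted)
```

A budget matching the documented default would be `LruBytesCache(max_bytes=100 * 1024 * 1024)`, with the number of megabytes read from `DG_CACHE_MEMORY_SIZE_MB` when that variable is set.
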
### Internal

- Updated all tests to use Content-Addressed Storage and encryption
- All 119 tests passing with zero errors (99 original + 20 new cache tests)
- Type checking: 0 errors (mypy)
- Linting: All checks passed (ruff)
- Completed Phase 1, 2, and 7 of SECURITY_FIX_ROADMAP.md
- Added comprehensive test suites for encryption (13 tests) and memory cache (10 tests)

## [5.0.1] - 2025-01-10

### Changed

- **Code Organization**: Refactored client.py from 1560 to 1154 lines (26% reduction)
  - Extracted client operations into modular `client_operations/` package:
    - `bucket.py` - S3 bucket management operations
    - `presigned.py` - Presigned URL generation
    - `batch.py` - Batch upload/download operations
    - `stats.py` - Analytics and statistics operations
  - Improved code maintainability with logical separation of concerns
  - Better developer experience with cleaner module structure

### Internal

- Full type safety maintained with mypy (0 errors)
- All 99 tests passing
- Code quality checks passing (ruff)
- No breaking changes - all public APIs remain unchanged

## [5.0.0] - 2025-01-10

### Added

- boto3-compatible TypedDict types for S3 responses (no boto3 import needed)
- Complete boto3 compatibility vision document
- Type-safe response builders using TypedDict patterns

### Changed

- **BREAKING**: `list_objects()` now returns boto3-compatible dict instead of custom dataclass
  - Use `response['Contents']` instead of `response.contents`
  - Use `response.get('IsTruncated')` instead of `response.is_truncated`
  - Use `response.get('NextContinuationToken')` instead of `response.next_continuation_token`
  - DeltaGlider metadata now in `Metadata` field of each object
- Internal response building now uses TypedDict for compile-time type safety
- All S3 responses are dicts at runtime (TypedDict is a dict!)

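A pagination loop over the dict-based response described above might look like the sketch below. The helper `iter_keys` is hypothetical, and it assumes the listing call accepts `Bucket` and `ContinuationToken` keyword arguments in the boto3 style; the response keys `Contents`, `IsTruncated`, and `NextContinuationToken` are the ones named in this entry.

```python
from typing import Any, Callable, Dict, Iterator


def iter_keys(list_page: Callable[..., Dict[str, Any]], bucket: str) -> Iterator[str]:
    """Yield every object key, following NextContinuationToken across pages."""
    kwargs: Dict[str, Any] = {"Bucket": bucket}
    while True:
        response = list_page(**kwargs)
        # 'Contents' is absent on an empty page, so use .get with a default.
        for obj in response.get("Contents", []):
            yield obj["Key"]
        if not response.get("IsTruncated"):
            break
        kwargs["ContinuationToken"] = response["NextContinuationToken"]
```

Under that assumption, `for key in iter_keys(client.list_objects, "my-bucket")` would walk the whole bucket one page at a time.
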
### Fixed

- Updated all documentation examples to use dict-based responses
- Fixed pagination examples in README and API docs
- Corrected SDK documentation with accurate method signatures

## [4.2.4] - 2025-01-10

### Fixed

- Show only filename in `ls` output instead of full path for cleaner display
- Correct `ls` command path handling and prefix display logic

## [4.2.3] - 2025-01-07

### Added

- Comprehensive test coverage for `delete_objects_recursive()` method with 19 thorough tests
- Tests cover delta suffix handling, error/warning aggregation, statistics tracking, and edge cases
- Better code organization with separate `client_models.py` and `client_delete_helpers.py` modules

### Fixed

- Fixed all mypy type errors using proper `cast()` for type safety
- Improved type hints for dictionary operations in client code

### Changed

- Refactored client code into logical modules for better maintainability
- Enhanced code quality with comprehensive linting and type checking
- All 99 integration/unit tests passing with zero type errors

### Internal

- Better separation of concerns in client module
- Improved developer experience with clearer code structure

## [4.2.2] - 2024-10-06

### Fixed

- Add .delta suffix fallback for `delete_object()` method
- Handle regular S3 objects without DeltaGlider metadata
- Update mypy type ignore comment for compatibility

## [4.2.1] - 2024-10-06

### Fixed

- Make GitHub release creation non-blocking in workflows

## [4.2.0] - 2024-10-03

### Added

- AWS credential parameters to `create_client()` function
- Support for custom endpoint URLs
- Enhanced boto3 compatibility

## [4.1.0] - 2024-09-29

### Added

- boto3-compatible client API
- Bucket management methods
- Comprehensive SDK documentation

## [4.0.0] - 2024-09-21

### Added

- Initial public release
- CLI with AWS S3 compatibility
- Delta compression for versioned artifacts
- 99%+ compression for similar files

[6.1.0]: https://github.com/beshu-tech/deltaglider/compare/v6.0.2...v6.1.0
[6.0.0]: https://github.com/beshu-tech/deltaglider/compare/v5.1.1...v6.0.0
[5.1.0]: https://github.com/beshu-tech/deltaglider/compare/v5.0.3...v5.1.0
[5.0.3]: https://github.com/beshu-tech/deltaglider/compare/v5.0.1...v5.0.3
[5.0.1]: https://github.com/beshu-tech/deltaglider/compare/v5.0.0...v5.0.1
[5.0.0]: https://github.com/beshu-tech/deltaglider/compare/v4.2.4...v5.0.0
[4.2.4]: https://github.com/beshu-tech/deltaglider/compare/v4.2.3...v4.2.4
[4.2.3]: https://github.com/beshu-tech/deltaglider/compare/v4.2.2...v4.2.3
[4.2.2]: https://github.com/beshu-tech/deltaglider/compare/v4.2.1...v4.2.2
[4.2.1]: https://github.com/beshu-tech/deltaglider/compare/v4.2.0...v4.2.1
[4.2.0]: https://github.com/beshu-tech/deltaglider/compare/v4.1.0...v4.2.0
[4.1.0]: https://github.com/beshu-tech/deltaglider/compare/v4.0.0...v4.1.0
[4.0.0]: https://github.com/beshu-tech/deltaglider/releases/tag/v4.0.0