Changelog
All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
[Unreleased]
[5.0.3] - 2025-10-10
Security
- BREAKING: Removed all legacy shared cache code for security
- Ephemeral process-isolated cache is now the ONLY mode (no opt-out)
- Content-Addressed Storage (CAS): Implemented SHA256-based cache storage
  - Negligible collision risk (SHA256 collisions are computationally infeasible)
  - Automatic deduplication (same content = same filename)
  - Tampering protection (changing content changes the SHA and breaks the lookup)
  - Two-level directory structure for filesystem optimization
- Fixed TOCTOU vulnerabilities with atomic SHA validation at use-time
- Added get_validated_ref() method to prevent cache poisoning
- Eliminated multi-user data exposure through mandatory cache isolation
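The content-addressed layout described above can be sketched as follows. This is a minimal illustration, not the code in adapters/cache_cas.py: the helper names cas_path, store and is_intact are hypothetical, and using the full digest as the filename is an assumption based on the ab/cd/abcdef... layout.

```python
import hashlib
import tempfile
from pathlib import Path


def cas_path(root: Path, sha256_hex: str) -> Path:
    """Map a SHA256 hex digest to root/ab/cd/<full digest> (assumed layout)."""
    return root / sha256_hex[:2] / sha256_hex[2:4] / sha256_hex


def store(root: Path, content: bytes) -> Path:
    """Write content under its own digest; identical content reuses one file."""
    digest = hashlib.sha256(content).hexdigest()
    target = cas_path(root, digest)
    if not target.exists():
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(content)
    return target


def is_intact(path: Path) -> bool:
    """The filename is the checksum, so no external metadata is needed."""
    return hashlib.sha256(path.read_bytes()).hexdigest() == path.name


root = Path(tempfile.mkdtemp(prefix="cas-demo-"))
first = store(root, b"reference payload")
second = store(root, b"reference payload")  # deduplicated: same path as first
```

Storing the same bytes twice yields a single file, and any later modification of that file makes is_intact() return False because the recomputed digest no longer matches the filename.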
Removed
- BREAKING: Removed DG_UNSAFE_SHARED_CACHE environment variable
- BREAKING: Removed DG_CACHE_DIR environment variable
- BREAKING: Removed cache_dir parameter from create_client()
Changed
- Cache is now auto-created in /tmp/deltaglider-* and cleaned on exit
- All cache operations use file locking (Unix) and SHA validation
- Added CacheMissError and CacheCorruptionError exceptions
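A rough sketch of how an ephemeral cache directory and use-time SHA validation might fit together is shown below. The two exception classes are local stand-ins that only borrow the names above, read_validated is a hypothetical helper, and file locking is omitted for brevity.

```python
import atexit
import hashlib
import shutil
import tempfile
from pathlib import Path


class CacheMissError(Exception):
    """Stand-in: the requested entry is not in the cache."""


class CacheCorruptionError(Exception):
    """Stand-in: the cached bytes no longer match the expected SHA256."""


# Private per-process directory in the system temp dir (often /tmp),
# removed when the interpreter exits.
cache_dir = Path(tempfile.mkdtemp(prefix="deltaglider-"))
atexit.register(shutil.rmtree, cache_dir, ignore_errors=True)


def read_validated(path: Path, expected_sha256: str) -> bytes:
    """Return the cached bytes only if they hash to expected_sha256."""
    try:
        data = path.read_bytes()
    except FileNotFoundError as exc:
        raise CacheMissError(str(path)) from exc
    # Hash the bytes that will actually be returned, so there is no gap
    # between the check and the use.
    if hashlib.sha256(data).hexdigest() != expected_sha256:
        raise CacheCorruptionError(str(path))
    return data
```

Because the digest is computed over the same bytes that are handed back to the caller, a file swapped after an earlier check cannot slip through unnoticed.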
Added
- New ContentAddressedCache adapter in adapters/cache_cas.py
- Self-describing cache structure with SHA256-based filenames
Internal
- Updated all tests to use Content-Addressed Storage
- All 99 tests passing with zero errors
- Type checking: 0 errors (mypy)
- Linting: All checks passed (ruff)
- Completed Phase 1 & Phase 2 of SECURITY_FIX_ROADMAP.md
5.0.1 - 2025-01-10
Changed
- Code Organization: Refactored client.py from 1560 to 1154 lines (26% reduction)
- Extracted client operations into modular client_operations/ package:
  - bucket.py - S3 bucket management operations
  - presigned.py - Presigned URL generation
  - batch.py - Batch upload/download operations
  - stats.py - Analytics and statistics operations
- Improved code maintainability with logical separation of concerns
- Better developer experience with cleaner module structure
Internal
- Full type safety maintained with mypy (0 errors)
- All 99 tests passing
- Code quality checks passing (ruff)
- No breaking changes - all public APIs remain unchanged
5.0.0 - 2025-01-10
Added
- boto3-compatible TypedDict types for S3 responses (no boto3 import needed)
- Complete boto3 compatibility vision document
- Type-safe response builders using TypedDict patterns
Changed
- BREAKING: list_objects() now returns boto3-compatible dict instead of custom dataclass
  - Use response['Contents'] instead of response.contents
  - Use response.get('IsTruncated') instead of response.is_truncated
  - Use response.get('NextContinuationToken') instead of response.next_continuation_token
  - DeltaGlider metadata now in Metadata field of each object
- Internal response building now uses TypedDict for compile-time type safety
- All S3 responses are dicts at runtime (TypedDict is a dict!)
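The migration above might look like this in practice. The TypedDict classes, the build_response helper and the sample values are illustrative assumptions that only mirror the field names listed in this entry; they are not the SDK's actual type definitions.

```python
from typing import TypedDict


class ObjectEntry(TypedDict, total=False):
    Key: str
    Size: int
    Metadata: dict[str, str]


class ListObjectsResponse(TypedDict, total=False):
    Contents: list[ObjectEntry]
    IsTruncated: bool
    NextContinuationToken: str


def build_response() -> ListObjectsResponse:
    # Hand-built stand-in for what list_objects() might return.
    return {
        "Contents": [
            {
                "Key": "releases/app-1.0.zip",
                "Size": 1024,
                "Metadata": {"example-key": "example-value"},
            },
        ],
        "IsTruncated": False,
    }


response = build_response()
keys = [obj["Key"] for obj in response["Contents"]]  # was response.contents
truncated = response.get("IsTruncated")              # was response.is_truncated
token = response.get("NextContinuationToken")        # None when absent
metadata = response["Contents"][0].get("Metadata", {})
```

The type checker sees the declared fields, while at runtime the response is an ordinary dict, so .get() and key access behave exactly as they do on any other dict.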
Fixed
- Updated all documentation examples to use dict-based responses
- Fixed pagination examples in README and API docs
- Corrected SDK documentation with accurate method signatures
4.2.4 - 2025-01-10
Fixed
- Show only filename in ls output instead of full path for cleaner display
- Correct ls command path handling and prefix display logic
4.2.3 - 2025-01-07
Added
- Comprehensive test coverage for delete_objects_recursive() method with 19 thorough tests
- Tests cover delta suffix handling, error/warning aggregation, statistics tracking, and edge cases
- Better code organization with separate client_models.py and client_delete_helpers.py modules
Fixed
- Fixed all mypy type errors using proper cast() for type safety
- Improved type hints for dictionary operations in client code
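As an illustration of the kind of fix described above, the following sketch uses typing.cast() on a loosely typed dictionary. The function object_size and the ContentLength key are hypothetical examples, not code taken from the client.

```python
from typing import cast


def object_size(head: dict[str, object]) -> int:
    # Without cast(), mypy rejects returning "object" where "int" is
    # expected. cast() only informs the type checker; it performs no
    # runtime check or conversion.
    return cast(int, head["ContentLength"])


size = object_size({"ContentLength": 42})
```

Since cast() does nothing at runtime, it is only appropriate where the value's type is already guaranteed by other means.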
Changed
- Refactored client code into logical modules for better maintainability
- Enhanced code quality with comprehensive linting and type checking
- All 99 integration/unit tests passing with zero type errors
Internal
- Better separation of concerns in client module
- Improved developer experience with clearer code structure
4.2.2 - 2024-10-06
Fixed
- Add .delta suffix fallback for delete_object() method
- Handle regular S3 objects without DeltaGlider metadata
- Update mypy type ignore comment for compatibility
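The .delta fallback can be pictured with a small in-memory sketch. The helper delete_with_fallback and the dict used as a bucket are hypothetical, and the lookup order (exact key first, then key plus .delta) is an assumption rather than the documented behaviour of delete_object().

```python
from typing import Optional


def delete_with_fallback(store: dict[str, bytes], key: str) -> Optional[str]:
    """Delete key, or key + '.delta' if only the delta form exists.

    Returns the key that was actually removed, or None if neither exists.
    """
    for candidate in (key, key + ".delta"):
        if candidate in store:
            del store[candidate]
            return candidate
    return None


bucket = {"builds/app-2.0.zip.delta": b"...", "notes.txt": b"plain object"}
removed_delta = delete_with_fallback(bucket, "builds/app-2.0.zip")
removed_plain = delete_with_fallback(bucket, "notes.txt")
```

Callers keep addressing objects by their original names, whether the stored object is a delta or a regular S3 object without DeltaGlider metadata.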
4.2.1 - 2024-10-06
Fixed
- Make GitHub release creation non-blocking in workflows
4.2.0 - 2024-10-03
Added
- AWS credential parameters to create_client() function
- Support for custom endpoint URLs
- Enhanced boto3 compatibility
4.1.0 - 2024-09-29
Added
- boto3-compatible client API
- Bucket management methods
- Comprehensive SDK documentation
4.0.0 - 2024-09-21
Added
- Initial public release
- CLI with AWS S3 compatibility
- Delta compression for versioned artifacts
- 99%+ compression for similar files