Commit Graph

138 Commits

Author SHA1 Message Date
Simone Scarduzio
a98fc7c178 style: format storage_s3.py for ruff format compliance
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
v6.1.1
2026-03-23 19:15:09 +01:00
Simone Scarduzio
82e00623de fix: verbose diagnostic logging on put_object retries
On retry: logs bucket, key, body size, content type, metadata keys,
endpoint URL, HTTP status, error code/message, request ID, and full
HTTP response headers. Enables botocore DEBUG logging for wire-level
HTTP traces on subsequent retry attempts.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-23 19:08:35 +01:00
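A minimal sketch of turning on botocore's DEBUG logging, as the commit above describes. The helper name `enable_wire_logging` is hypothetical and not taken from the project. Output only appears if a logging handler is configured (for example via `logging.basicConfig()`).

```python
import logging


def enable_wire_logging() -> logging.Logger:
    """Raise the 'botocore' logger to DEBUG so HTTP-level details are emitted."""
    logger = logging.getLogger("botocore")
    logger.setLevel(logging.DEBUG)
    return logger
```

Enabling this only after a first failed attempt, as the commit does, keeps normal runs quiet.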
Simone Scarduzio
e8c76f1dc7 fix: add retry with backoff for put_object on transient S3 failures
S3-compatible endpoints (Hetzner) occasionally return transient
BadRequest errors. Retries up to 3 times with exponential backoff
(1s, 2s) before giving up.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-23 19:02:51 +01:00
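The retry policy described above (up to 3 attempts, waiting 1s and then 2s) could look roughly like this. It is a sketch under assumptions: `put_with_retry`, `TransientError`, and the injectable `sleep` parameter are hypothetical names, not the project's actual code.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical stand-in for a transient BadRequest from the endpoint."""


def put_with_retry(
    operation: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation, retrying on TransientError with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Delay doubles each time: 1s after attempt 1, 2s after attempt 2.
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter lets tests record the delays instead of actually waiting.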
Simone Scarduzio
c492a5087b feat: log deltaglider version on every CLI invocation
Helps verify which version is running in Docker containers.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-23 13:58:15 +01:00
Simone Scarduzio
85af5a95c8 docs: update CHANGELOG for v6.1.1 release
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-23 11:52:49 +01:00
Simone Scarduzio
60b70309fa fix: pin LocalStack to 4.4 (latest now requires paid license)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-23 11:50:05 +01:00
Simone Scarduzio
b0699f952a fix: disable boto3 auto-checksums for S3-compatible endpoint support
boto3 1.36+ sends CRC32/CRC64 checksums by default on PUT requests.
S3-compatible stores like Hetzner Object Storage reject these with
BadRequest, breaking direct (non-delta) file uploads. This sets
request_checksum_calculation="when_required" to restore compatibility
while still working with AWS S3.

Also pins runtime deps to major version ranges and adds S3 compat tests.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-23 11:45:05 +01:00
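The setting named in the commit above can be passed through botocore's `Config`. The sketch below assumes boto3 1.36 or later. The helper names `s3_compat_config_kwargs` and `make_s3_client` are hypothetical, and the project may wire the option differently.

```python
from typing import Any, Dict, Optional


def s3_compat_config_kwargs() -> Dict[str, Any]:
    """Config options that make boto3 send checksums only when an operation requires them."""
    return {"request_checksum_calculation": "when_required"}


def make_s3_client(endpoint_url: Optional[str] = None):
    """Build an S3 client with the options above (needs boto3 1.36+ installed)."""
    # Imported lazily so the helper above stays usable without boto3.
    import boto3
    from botocore.config import Config

    return boto3.client(
        "s3",
        endpoint_url=endpoint_url,
        config=Config(**s3_compat_config_kwargs()),
    )
```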
Simone Scarduzio
9bfe121f44 style: format files for ruff format --check compliance
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
v6.1.0
2026-02-07 16:02:40 +01:00
Simone Scarduzio
6cab3de9a0 fix: disable sha tag on tag pushes to avoid invalid Docker tag
The sha tag template `prefix={{branch}}-` produces `:-hash` on tag
pushes because {{branch}} is empty, resulting in an invalid Docker
tag like `beshultd/deltaglider:-482f45f`. Only emit sha tags on
branch pushes.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 15:57:37 +01:00
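To illustrate why an empty `{{branch}}` breaks the tag, the sketch below renders the prefix the way the template would and checks the result against Docker's tag grammar (a word character first, then up to 127 word characters, dots, or dashes). The function names are hypothetical.

```python
import re

# Docker tag grammar: [A-Za-z0-9_] first, then up to 127 of [A-Za-z0-9_.-].
TAG_PATTERN = re.compile(r"[A-Za-z0-9_][A-Za-z0-9_.-]{0,127}")


def sha_tag(branch: str, short_sha: str) -> str:
    """Render the tag as a `prefix={{branch}}-` template would."""
    return f"{branch}-{short_sha}"


def is_valid_tag(tag: str) -> bool:
    """Return True if tag matches Docker's tag grammar."""
    return TAG_PATTERN.fullmatch(tag) is not None
```

On a tag push the branch is empty, so `sha_tag("", "482f45f")` yields `-482f45f`, which fails the check because a tag cannot start with a dash.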
Simone Scarduzio
482f45fc02 docs: update CHANGELOG for v6.1.0 release
Add v6.1.0 section with bucket ACL support, Docker publishing,
config/model refactoring. Backfill v6.0.0 section from previously
unreleased entries.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 15:55:50 +01:00
Simone Scarduzio
6b3245266e feat: add put_bucket_acl and get_bucket_acl support
Add boto3-compatible bucket ACL operations as pure S3 passthroughs,
following the existing create_bucket/delete_bucket pattern. Includes
CLI commands (put-bucket-acl, get-bucket-acl), 7 integration tests,
and documentation updates (method count 21→23).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 15:53:33 +01:00
Simone Scarduzio
20053acb5f fix: remove unused imports flagged by ruff
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 08:48:22 +01:00
Simone Scarduzio
87f425734f refactor: typed result dataclasses, centralized metadata aliases, config extraction
- Replace dict[str,Any] returns in delete/delete_recursive with DeleteResult
  and RecursiveDeleteResult dataclasses for type safety
- Extract _delete_reference/_delete_delta/_classify_objects_for_deletion
  helper methods from oversized delete methods in service.py
- Centralize metadata key aliases in METADATA_KEY_ALIASES dict with
  resolve_metadata() replacing duplicated _meta_value() lookups
- Add DeltaGliderConfig dataclass with from_env() for centralized config
- Add ObjectKey.full_key property, remove dead _multipart_uploads dict
- Update all consumers (client, CLI, tests) for dataclass access patterns

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-06 23:16:57 +01:00
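A centralized alias table with a single lookup function, as described in the refactor above, might be shaped like this. Only the names `METADATA_KEY_ALIASES` and `resolve_metadata` come from the commit message. The table contents, the signature, and the return type are assumptions for illustration.

```python
from typing import Dict, Mapping, Optional, Tuple

# Hypothetical alias table: logical field -> keys to try, in priority order.
METADATA_KEY_ALIASES: Dict[str, Tuple[str, ...]] = {
    "file_sha256": ("dg-file-sha256", "file-sha256", "file_sha256"),
    "ref_key": ("dg-ref-key", "ref-key", "ref_key"),
}


def resolve_metadata(metadata: Mapping[str, str], field: str) -> Optional[str]:
    """Return the first non-empty value stored under any alias of field."""
    for key in METADATA_KEY_ALIASES.get(field, (field,)):
        value = metadata.get(key)
        if value:
            return value
    return None
```

Keeping the aliases in one dict means a new spelling is added in one place rather than in every lookup.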
Simone Scarduzio
012662c377 updates 2025-11-11 17:20:43 +01:00
Simone Scarduzio
284f030fae updates to docs 2025-11-11 17:05:50 +01:00
Simone Scarduzio
7a4d30a007 freshen up 2025-11-11 11:18:06 +01:00
Simone Scarduzio
0d46283ff0 width 2025-11-11 09:55:52 +01:00
Simone Scarduzio
805e2967bc dark mode 2025-11-11 09:53:54 +01:00
Simone Scarduzio
2ef1741d51 freshen up readme 2025-11-11 09:48:34 +01:00
Simone Scarduzio
2c1d756e7b tweak readme 2025-11-06 16:14:29 +01:00
Simone Scarduzio
c6cee7ae26 docker 2025-11-06 15:56:15 +01:00
Simone Scarduzio
cee9a9fd2d higher limits why not v6.0.2 2025-10-17 18:43:46 +02:00
Simone Scarduzio
0507e6ebcd format 2025-10-16 17:14:37 +02:00
Simone Scarduzio
fa9c4fa42d feat: Implement rehydration and purge functionality for deltaglider files
- Added `rehydrate_for_download` method to download and decompress deltaglider-compressed files, re-uploading them with expiration metadata.
- Introduced `generate_presigned_url_with_rehydration` method to generate presigned URLs that automatically handle rehydration for both regular and deltaglider files.
- Implemented `purge_temp_files` command in CLI to delete expired temporary files from the .deltaglider/tmp/ directory, with options for dry run and JSON output.
- Enhanced service methods to support the new rehydration and purging features, including detailed logging and metrics tracking.
2025-10-16 17:02:00 +02:00
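The purge step with a dry-run option can be sketched as below. This is an illustration only: `purge_expired`, the ISO-8601 expiry strings, and the `delete` callback are assumptions, and the commit does not say how expiration metadata is actually encoded.

```python
from datetime import datetime, timezone
from typing import Callable, Dict, List, Optional


def purge_expired(
    expirations: Dict[str, str],
    delete: Callable[[str], None],
    now: Optional[datetime] = None,
    dry_run: bool = False,
) -> List[str]:
    """Return keys whose expiry has passed, deleting them unless dry_run is set."""
    now = now or datetime.now(timezone.utc)
    expired = [
        key
        for key, expires_at in sorted(expirations.items())
        if datetime.fromisoformat(expires_at) <= now
    ]
    if not dry_run:
        for key in expired:
            delete(key)
    return expired
```

Returning the list in both modes lets a caller print or serialize the same report whether or not anything was deleted.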
Simone Scarduzio
934d83975c fix: format models.py v6.0.1 2025-10-16 11:21:33 +02:00
Simone Scarduzio
c32d5265d9 feat: Enhance metadata handling and bucket statistics
- Added object_limit_reached attribute to BucketStats for tracking limits.
- Introduced QUICK_LIST_LIMIT and SAMPLED_LIST_LIMIT constants to manage listing limits.
- Implemented _first_metadata_value helper function for improved metadata retrieval.
- Updated get_bucket_stats to log when listing is capped due to limits.
- Refactored DeltaMeta to streamline metadata extraction with error handling.
- Enhanced object listing to support max_objects parameter and limit tracking.
2025-10-16 11:17:13 +02:00
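A capped listing that records whether the limit was hit, in the spirit of the `max_objects` parameter and `object_limit_reached` attribute above, could be sketched as follows. `ListingResult`, `list_capped`, and the `fetch_page` callback are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

Page = Tuple[List[str], Optional[str]]  # (keys, next continuation token)


@dataclass
class ListingResult:
    keys: List[str] = field(default_factory=list)
    limit_reached: bool = False


def list_capped(
    fetch_page: Callable[[Optional[str]], Page],
    max_objects: Optional[int] = None,
) -> ListingResult:
    """Collect keys page by page, stopping once max_objects have been gathered."""
    result = ListingResult()
    token: Optional[str] = None
    while True:
        keys, token = fetch_page(token)
        for key in keys:
            if max_objects is not None and len(result.keys) >= max_objects:
                # At least one more object exists beyond the cap.
                result.limit_reached = True
                return result
            result.keys.append(key)
        if token is None:
            return result
```

The flag is set only when an object beyond the cap is actually seen, so a bucket holding exactly `max_objects` objects is not reported as truncated.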
Simone Scarduzio
1cf7e3ad21 import 2025-10-15 18:52:56 +02:00
Simone Scarduzio
9b36087438 not mandatory to have the command metadata field set 2025-10-15 18:16:43 +02:00
Simone Scarduzio
60877966f2 docs: Remove outdated METADATA_ISSUE_DIAGNOSIS.md
This document describes the old metadata format without dg- prefix.
Since v6.0.0 uses the new dg- prefixed format and requires all files
to be re-uploaded (greenfield approach), this diagnosis doc is no longer
relevant.
2025-10-15 11:45:52 +02:00
Simone Scarduzio
fbd44ea3c3 style: Format integration test files with ruff v6.0.0 2025-10-15 11:38:17 +02:00
Simone Scarduzio
3f689fc601 fix: Update integration tests for new metadata format and caching behavior
- Fix sync tests: Add list_objects.side_effect = NotImplementedError() to mock
- Fix sync tests: Add side_effect for put() to avoid hanging
- Fix MockStorage: Add continuation_token parameter to list_objects()
- Fix stats tests: Update assertions to include use_cache and refresh_cache params
- Fix bucket management test: Update caching expectations for S3-based cache

All 97 integration tests now pass.
2025-10-15 11:34:43 +02:00
Simone Scarduzio
3753212f96 style: Format test file with ruff
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-15 11:22:00 +02:00
Simone Scarduzio
db7d14f8a8 feat: Add metadata namespace and fix stats calculation
This is a major release with breaking changes to metadata format.

BREAKING CHANGES:
- All metadata keys now use 'dg-' namespace prefix (becomes 'x-amz-meta-dg-*' in S3)
- Old metadata format is not supported - all files must be re-uploaded
- Stats behavior changed: quick mode no longer shows misleading warnings

Features:
- Metadata now uses real package version (dg-tool: deltaglider/VERSION)
- All metadata keys properly namespaced with 'dg-' prefix
- Clean stats output in quick mode (no per-file warning spam)
- Fixed nonsensical negative compression ratios in quick mode

Fixes:
- Stats now correctly handles delta files without metadata
- Space saved shows 0 instead of negative numbers when metadata unavailable
- Removed misleading warnings in quick mode (metadata not fetched is expected)
- Fixed metadata keys to use hyphens instead of underscores

Documentation:
- Added comprehensive metadata documentation
- Added stats calculation behavior guide
- Added real version tracking documentation

Tests:
- Updated all tests to use new dg- prefixed metadata keys
- All 73 unit tests passing
- All quality checks passing (ruff, mypy)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-15 11:19:10 +02:00
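Two of the changes above, the `dg-` key prefix with hyphens and the non-negative space-saved figure, can be illustrated with a short sketch. The function names and the exact field names are assumptions, not the project's code.

```python
from typing import Dict, Mapping, Optional


def namespaced_metadata(fields: Mapping[str, object], prefix: str = "dg-") -> Dict[str, str]:
    """Prefix every key with dg- and use hyphens instead of underscores."""
    return {f"{prefix}{key.replace('_', '-')}": str(value) for key, value in fields.items()}


def space_saved(original_size: Optional[int], stored_size: int) -> int:
    """Report 0 rather than a negative number when the original size is unknown."""
    if original_size is None:
        return 0
    return max(0, original_size - stored_size)
```

S3 adds its own `x-amz-meta-` prefix on the wire, so a key such as `dg-tool` is stored as `x-amz-meta-dg-tool`.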
Simone Scarduzio
e1259b7ea8 fix: Code quality improvements for v5.2.2 release
- Fix pagination bug using continuation_token instead of start_after
- Add stats caching to prevent blocking web apps
- Improve code formatting and type checking
- Add comprehensive unit tests for new features
- Fix test mock usage in object_listing tests
v5.2.2
2025-10-14 23:54:49 +02:00
Simone Scarduzio
ff05e77c24 fix: Prevent get_bucket_stats from blocking web apps indefinitely
**Performance Issues Fixed:**
1. aws_compat.py: Changed to use cached stats only (no bucket scans after uploads)
2. stats.py: Added safety mechanisms to prevent infinite hangs
   - Max 10k iterations (10M object limit)
   - 10 min timeout on metadata fetching
   - Missing pagination token detection
   - Graceful error recovery with partial stats

**Refactoring:**
- Reduced nesting in get_bucket_stats from 5 levels to 2 levels
- Extracted 5 helper functions for better maintainability
- Main function reduced from 300+ lines to 33 lines
- 100% backward compatible - no API changes

**Benefits:**
- Web apps no longer hang on upload/delete operations
- Explicit get_bucket_stats() calls complete within bounded time
- Better error handling and logging
- Easier to test and maintain

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
v5.2.1
2025-10-14 14:47:39 +02:00
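The safety mechanisms listed above (an iteration cap, missing pagination token detection, and partial results instead of a hang) can be sketched as a bounded pagination loop. `list_all_bounded` and the `fetch_page` callback are hypothetical. The sketch leaves out the metadata-fetch timeout.

```python
from typing import Callable, List, Optional, Tuple

# (keys, next continuation token, is_truncated)
Page = Tuple[List[str], Optional[object], bool]


def list_all_bounded(
    fetch_page: Callable[[Optional[object]], Page],
    max_iterations: int = 10_000,
) -> Tuple[List[str], bool]:
    """Return (keys, complete); complete is False when the listing stopped early."""
    keys: List[str] = []
    token: Optional[object] = None
    for _ in range(max_iterations):
        page_keys, next_token, truncated = fetch_page(token)
        keys.extend(page_keys)
        if not truncated:
            return keys, True
        if next_token is None or next_token == token:
            # Truncated response without a usable token: stop instead of looping forever.
            return keys, False
        token = next_token
    # Iteration cap reached: hand back what was collected so far.
    return keys, False
```

With 1,000 keys per page, the default cap of 10,000 iterations corresponds to the 10M object limit mentioned in the commit.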
Simone Scarduzio
c3d385bf18 fix tests v5.2.0 2025-10-13 17:26:35 +02:00
Simone Scarduzio
aea5cb5d9a feat: Enhance S3 migration CLI with new commands and EC2 detection option 2025-10-12 23:12:32 +02:00
Simone Scarduzio
b2ca59490b feat: Add EC2 region detection and cost optimization features 2025-10-12 22:41:48 +02:00
Simone Scarduzio
4f56c4b600 fix: Preserve original filenames during S3-to-S3 migration 2025-10-12 18:10:04 +02:00
Simone Scarduzio
14c6af0f35 handle version in cli 2025-10-12 17:47:05 +02:00
Simone Scarduzio
67792b2031 migrate CLI support 2025-10-12 17:37:44 +02:00
Simone Scarduzio
a9a1396e6e style: Format test_stats_algorithm.py with ruff v5.1.1 2025-10-11 14:17:49 +02:00
Simone Scarduzio
52eb5bba21 fix: Fix unit test import issues for concurrent.futures
- Remove unnecessary concurrent.futures patches in tests
- Update test_detailed_stats_flag to match current implementation behavior
- Tests now properly handle parallel metadata fetching without mocking
2025-10-11 14:13:40 +02:00
Simone Scarduzio
f75db142e8 fix: Correct logging message formatting in get_bucket_stats and update test assertions for clarity. 2025-10-11 14:05:54 +02:00
Simone Scarduzio
35d34d4862 chore: Update CHANGELOG for v5.1.1 release
- Document stats command fixes
- Document performance improvements
2025-10-10 19:57:11 +02:00
Simone Scarduzio
9230cbd762 test 2025-10-10 19:52:15 +02:00
Simone Scarduzio
2eba6e8d38 optimisation 2025-10-10 19:50:33 +02:00
Simone Scarduzio
656726b57b algorithm correctness 2025-10-10 19:46:39 +02:00
Simone Scarduzio
85dd315424 ruff v5.1.0 v5.0.4 2025-10-10 18:44:46 +02:00
Simone Scarduzio
dbd2632cae docs: Update SDK documentation for v5.1.0 features
- Add session-level caching documentation to API reference
- Document clear_cache() and evict_cache() methods
- Add comprehensive bucket statistics examples
- Update list_buckets() with DeltaGliderStats metadata
- Add cache management patterns and best practices
- Update CHANGELOG comparison links

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-10 18:34:44 +02:00