docs: Update SDK documentation for v5.1.0 features

- Add session-level caching documentation to API reference
- Document clear_cache() and evict_cache() methods
- Add comprehensive bucket statistics examples
- Update list_buckets() with DeltaGliderStats metadata
- Add cache management patterns and best practices
- Update CHANGELOG comparison links

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2026-04-27 02:39:27 +02:00 · 2025-10-10 18:34:44 +02:00
parent 3d04a407c0
commit dbd2632cae
3 changed files with 498 additions and 14 deletions
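
Reviewer note: the caching rules this commit documents (session scope, invalidation on mutation, detailed results answering quick requests) can be modeled with a small in-memory class. This is only an illustrative sketch under those stated rules. `Stats`, `StatsCache`, and `fake_compute` are hypothetical names and are not DeltaGlider's implementation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Stats:
    """Hypothetical stand-in for a bucket statistics record."""
    object_count: int
    detailed: bool


class StatsCache:
    """Toy session-scoped cache keyed by bucket name (assumption, not the SDK)."""

    def __init__(self, compute: Callable[[str, bool], Stats]) -> None:
        self._compute = compute
        self._entries: Dict[str, Stats] = {}

    def get(self, bucket: str, detailed: bool = False) -> Stats:
        cached = self._entries.get(bucket)
        # A detailed entry can answer a quick request, but not the reverse.
        if cached is not None and (cached.detailed or not detailed):
            return cached
        fresh = self._compute(bucket, detailed)
        self._entries[bucket] = fresh
        return fresh

    def invalidate(self, bucket: str) -> None:
        # Models "cache cleared on bucket mutations" for a single bucket.
        self._entries.pop(bucket, None)

    def clear(self) -> None:
        # Models a manual, cache-wide reset.
        self._entries.clear()


# Record every real computation so cache hits are observable.
calls: List[Tuple[str, bool]] = []


def fake_compute(bucket: str, detailed: bool) -> Stats:
    calls.append((bucket, detailed))
    return Stats(object_count=3, detailed=detailed)


cache = StatsCache(fake_compute)
cache.get("releases")                  # computed (quick)
cache.get("releases")                  # served from cache
cache.get("releases", detailed=True)   # computed (quick entry cannot serve detailed)
cache.get("releases")                  # served from the detailed entry
cache.invalidate("releases")           # simulated mutation
cache.get("releases")                  # computed again
print(calls)
```

Only three of the five `get` calls reach `fake_compute`; the other two are answered from the cached entry.
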
--- a/docs/sdk/api.md
+++ b/docs/sdk/api.md
@@ -156,7 +156,7 @@ for obj in response['Contents']:

 #### `get_bucket_stats`

-Get statistics for a bucket with optional detailed compression metrics.
+Get statistics for a bucket with optional detailed compression metrics. Results are cached per client session for performance.

 ```python
 def get_bucket_stats(
@@ -173,16 +173,46 @@ def get_bucket_stats(
  - With `detailed_stats=False`: ~50ms for any bucket size (LIST calls only)
  - With `detailed_stats=True`: ~2-3s per 1000 objects (adds HEAD calls for delta files)

+##### Caching Behavior
+
+- **Session-scoped cache**: Results cached within client instance lifetime
+- **Automatic invalidation**: Cache cleared on bucket mutations (put, delete, bucket operations)
+- **Intelligent reuse**: Cached detailed stats also satisfy later quick (`detailed_stats=False`) requests
+- **Manual cache control**: Use `clear_cache()` to invalidate all cached stats
+
+##### Returns
+
+`BucketStats`: Dataclass containing:
+- **bucket** (`str`): Bucket name
+- **object_count** (`int`): Total number of objects
+- **total_size** (`int`): Original size in bytes (before compression)
+- **compressed_size** (`int`): Actual stored size in bytes
+- **space_saved** (`int`): Bytes saved through compression
+- **average_compression_ratio** (`float`): Average compression ratio (0.0-1.0)
+- **delta_objects** (`int`): Number of delta-compressed objects
+- **direct_objects** (`int`): Number of directly stored objects
+
 ##### Examples

 ```python
-# Quick stats for dashboard display
+# Quick stats for dashboard display (cached after first call)
 stats = client.get_bucket_stats('releases')
 print(f"Objects: {stats.object_count}, Size: {stats.total_size}")

-# Detailed stats for analytics (slower but accurate)
+# Second call hits cache (instant response)
+stats = client.get_bucket_stats('releases')
+print(f"Space saved: {stats.space_saved} bytes")
+
+# Detailed stats for analytics (slower but accurate, also cached)
 stats = client.get_bucket_stats('releases', detailed_stats=True)
 print(f"Compression ratio: {stats.average_compression_ratio:.1%}")
+
+# Quick call after detailed call reuses detailed cache (more accurate)
+quick_stats = client.get_bucket_stats('releases')  # Uses detailed cache
+
+# Clear cache to force refresh
+client.clear_cache()
+stats = client.get_bucket_stats('releases')  # Fresh computation
 ```

 #### `put_object`
@@ -304,7 +334,7 @@ client.delete_bucket(Bucket='old-releases')

 #### `list_buckets`

-List all S3 buckets (boto3-compatible).
+List all S3 buckets (boto3-compatible). Includes cached statistics when available.

 ```python
 def list_buckets(
@@ -315,7 +345,32 @@ def list_buckets(

 ##### Returns

-Dict with list of buckets and owner information (identical to boto3).
+Dict with list of buckets and owner information (identical to boto3). Each bucket entry additionally carries `DeltaGliderStats` metadata when statistics for that bucket are already cached.
+
+##### Response Structure
+
+```python
+{
+    'Buckets': [
+        {
+            'Name': 'bucket-name',
+            'CreationDate': datetime(2025, 1, 1),
+            'DeltaGliderStats': {  # Optional, only if cached
+                'Cached': True,
+                'Detailed': bool,  # Whether detailed stats were fetched
+                'ObjectCount': int,
+                'TotalSize': int,
+                'CompressedSize': int,
+                'SpaceSaved': int,
+                'AverageCompressionRatio': float,
+                'DeltaObjects': int,
+                'DirectObjects': int
+            }
+        }
+    ],
+    'Owner': {...}
+}
+```

 ##### Examples

@@ -324,6 +379,17 @@ Dict with list of buckets and owner information (identical to boto3).
 response = client.list_buckets()
 for bucket in response['Buckets']:
    print(f"{bucket['Name']} - Created: {bucket['CreationDate']}")
+
+    # Check if stats are cached
+    if 'DeltaGliderStats' in bucket:
+        stats = bucket['DeltaGliderStats']
+        print(f"  Cached stats: {stats['ObjectCount']} objects, "
+              f"{stats['AverageCompressionRatio']:.1%} compression")
+
+# Fetch stats first, then list buckets to see cached data
+client.get_bucket_stats('my-bucket', detailed_stats=True)
+response = client.list_buckets()
+# Now 'my-bucket' will include DeltaGliderStats in response
 ```

 ### Simple API Methods
@@ -460,6 +526,104 @@ else:
    # Re-upload or investigate
 ```

+### Cache Management Methods
+
+DeltaGlider maintains two types of caches for performance optimization:
+1. **Reference cache**: Binary reference files used for delta reconstruction
+2. **Statistics cache**: Bucket statistics (session-scoped)
+
+#### `clear_cache`
+
+Clear all cached data including reference files and bucket statistics.
+
+```python
+def clear_cache(self) -> None
+```
+
+##### Description
+
+Removes all cached reference files from the local filesystem and invalidates all bucket statistics. Useful for:
+- Forcing fresh statistics computation
+- Freeing disk space in long-running applications
+- Ensuring latest data after external bucket modifications
+- Testing and development workflows
+
+##### Cache Types Cleared
+
+1. **Reference Cache**: Binary reference files stored in `/tmp/deltaglider-*/`
+   - Encrypted at rest with ephemeral keys
+   - Content-addressed storage (SHA256-based filenames)
+   - Automatically cleaned up on process exit
+
+2. **Statistics Cache**: Bucket statistics cached per client session
+   - Metadata about compression ratios and object counts
+   - Session-scoped (not persisted to disk)
+   - Automatically invalidated on bucket mutations
+
+##### Examples
+
+```python
+# Long-running application
+client = create_client()
+
+# Work with files
+for i in range(1000):
+    client.upload(f"file_{i}.zip", "s3://bucket/")
+
+    # Periodic cache cleanup to prevent disk buildup
+    if (i + 1) % 100 == 0:
+        client.clear_cache()
+
+# Force fresh statistics after external changes
+stats_before = client.get_bucket_stats('releases')  # Cached
+# ... external tool modifies bucket ...
+client.clear_cache()
+stats_after = client.get_bucket_stats('releases')  # Fresh data
+
+# Development workflow
+client.clear_cache()  # Start with clean state
+```
+
+#### `evict_cache`
+
+Remove a specific cached reference file from the local cache.
+
+```python
+def evict_cache(self, s3_url: str) -> None
+```
+
+##### Parameters
+
+- **s3_url** (`str`): S3 URL of the reference file to evict (e.g., `s3://bucket/prefix/reference.bin`)
+
+##### Description
+
+Removes a specific reference file from the cache without affecting other cached files or statistics. Useful for:
+- Selective cache invalidation when specific references are updated
+- Limiting cache size in applications with many delta spaces
+- Testing specific delta compression scenarios
+
+##### Examples
+
+```python
+# Evict specific reference after update
+client.upload("new-reference.zip", "s3://releases/v2.0.0/")
+client.evict_cache("s3://releases/v2.0.0/reference.bin")
+
+# Next upload will fetch fresh reference
+client.upload("similar-file.zip", "s3://releases/v2.0.0/")
+
+# Selective eviction for specific delta spaces
+delta_spaces = ["v1.0.0", "v1.1.0", "v1.2.0"]
+for space in delta_spaces:
+    client.evict_cache(f"s3://releases/{space}/reference.bin")
+```
+
+##### See Also
+
+- [docs/CACHE_MANAGEMENT.md](../CACHE_MANAGEMENT.md): Complete cache management guide
+- `clear_cache()`: Clear all caches
+
 #### `lifecycle_policy`

 Set lifecycle policy for S3 prefix (placeholder for future implementation).