mirror of
https://github.com/beshu-tech/deltaglider.git
synced 2026-04-19 23:21:16 +02:00
docs: Update SDK documentation for v5.1.0 features
- Add session-level caching documentation to API reference - Document clear_cache() and evict_cache() methods - Add comprehensive bucket statistics examples - Update list_buckets() with DeltaGliderStats metadata - Add cache management patterns and best practices - Update CHANGELOG comparison links 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
This commit is contained in:
174
docs/sdk/api.md
174
docs/sdk/api.md
@@ -156,7 +156,7 @@ for obj in response['Contents']:
|
||||
|
||||
#### `get_bucket_stats`
|
||||
|
||||
Get statistics for a bucket with optional detailed compression metrics.
|
||||
Get statistics for a bucket with optional detailed compression metrics. Results are cached per client session for performance.
|
||||
|
||||
```python
|
||||
def get_bucket_stats(
|
||||
@@ -173,16 +173,46 @@ def get_bucket_stats(
|
||||
- With `detailed_stats=False`: ~50ms for any bucket size (LIST calls only)
|
||||
- With `detailed_stats=True`: ~2-3s per 1000 objects (adds HEAD calls for delta files)
|
||||
|
||||
##### Caching Behavior
|
||||
|
||||
- **Session-scoped cache**: Results cached within client instance lifetime
|
||||
- **Automatic invalidation**: Cache cleared on bucket mutations (put, delete, bucket operations)
|
||||
- **Intelligent reuse**: Detailed stats can serve quick stat requests
|
||||
- **Manual cache control**: Use `clear_cache()` to invalidate all cached stats
|
||||
|
||||
##### Returns
|
||||
|
||||
`BucketStats`: Dataclass containing:
|
||||
- **bucket** (`str`): Bucket name
|
||||
- **object_count** (`int`): Total number of objects
|
||||
- **total_size** (`int`): Original size in bytes (before compression)
|
||||
- **compressed_size** (`int`): Actual stored size in bytes
|
||||
- **space_saved** (`int`): Bytes saved through compression
|
||||
- **average_compression_ratio** (`float`): Average compression ratio (0.0-1.0)
|
||||
- **delta_objects** (`int`): Number of delta-compressed objects
|
||||
- **direct_objects** (`int`): Number of directly stored objects
|
||||
|
||||
##### Examples
|
||||
|
||||
```python
|
||||
# Quick stats for dashboard display
|
||||
# Quick stats for dashboard display (cached after first call)
|
||||
stats = client.get_bucket_stats('releases')
|
||||
print(f"Objects: {stats.object_count}, Size: {stats.total_size}")
|
||||
|
||||
# Detailed stats for analytics (slower but accurate)
|
||||
# Second call hits cache (instant response)
|
||||
stats = client.get_bucket_stats('releases')
|
||||
print(f"Space saved: {stats.space_saved} bytes")
|
||||
|
||||
# Detailed stats for analytics (slower but accurate, also cached)
|
||||
stats = client.get_bucket_stats('releases', detailed_stats=True)
|
||||
print(f"Compression ratio: {stats.average_compression_ratio:.1%}")
|
||||
|
||||
# Quick call after detailed call reuses detailed cache (more accurate)
|
||||
quick_stats = client.get_bucket_stats('releases') # Uses detailed cache
|
||||
|
||||
# Clear cache to force refresh
|
||||
client.clear_cache()
|
||||
stats = client.get_bucket_stats('releases') # Fresh computation
|
||||
```
|
||||
|
||||
#### `put_object`
|
||||
@@ -304,7 +334,7 @@ client.delete_bucket(Bucket='old-releases')
|
||||
|
||||
#### `list_buckets`
|
||||
|
||||
List all S3 buckets (boto3-compatible).
|
||||
List all S3 buckets (boto3-compatible). Includes cached statistics when available.
|
||||
|
||||
```python
|
||||
def list_buckets(
|
||||
@@ -315,7 +345,32 @@ def list_buckets(
|
||||
|
||||
##### Returns
|
||||
|
||||
Dict with list of buckets and owner information (identical to boto3).
|
||||
Dict with list of buckets and owner information (identical to boto3). Each bucket may include optional `DeltaGliderStats` metadata if statistics have been previously cached.
|
||||
|
||||
##### Response Structure
|
||||
|
||||
```python
|
||||
{
|
||||
'Buckets': [
|
||||
{
|
||||
'Name': 'bucket-name',
|
||||
'CreationDate': datetime(2025, 1, 1),
|
||||
'DeltaGliderStats': { # Optional, only if cached
|
||||
'Cached': True,
|
||||
'Detailed': bool, # Whether detailed stats were fetched
|
||||
'ObjectCount': int,
|
||||
'TotalSize': int,
|
||||
'CompressedSize': int,
|
||||
'SpaceSaved': int,
|
||||
'AverageCompressionRatio': float,
|
||||
'DeltaObjects': int,
|
||||
'DirectObjects': int
|
||||
}
|
||||
}
|
||||
],
|
||||
'Owner': {...}
|
||||
}
|
||||
```
|
||||
|
||||
##### Examples
|
||||
|
||||
@@ -324,6 +379,17 @@ Dict with list of buckets and owner information (identical to boto3).
|
||||
response = client.list_buckets()
|
||||
for bucket in response['Buckets']:
|
||||
print(f"{bucket['Name']} - Created: {bucket['CreationDate']}")
|
||||
|
||||
# Check if stats are cached
|
||||
if 'DeltaGliderStats' in bucket:
|
||||
stats = bucket['DeltaGliderStats']
|
||||
print(f" Cached stats: {stats['ObjectCount']} objects, "
|
||||
f"{stats['AverageCompressionRatio']:.1%} compression")
|
||||
|
||||
# Fetch stats first, then list buckets to see cached data
|
||||
client.get_bucket_stats('my-bucket', detailed_stats=True)
|
||||
response = client.list_buckets()
|
||||
# Now 'my-bucket' will include DeltaGliderStats in response
|
||||
```
|
||||
|
||||
### Simple API Methods
|
||||
@@ -460,6 +526,104 @@ else:
|
||||
# Re-upload or investigate
|
||||
```
|
||||
|
||||
### Cache Management Methods
|
||||
|
||||
DeltaGlider maintains two types of caches for performance optimization:
|
||||
1. **Reference cache**: Binary reference files used for delta reconstruction
|
||||
2. **Statistics cache**: Bucket statistics (session-scoped)
|
||||
|
||||
#### `clear_cache`
|
||||
|
||||
Clear all cached data including reference files and bucket statistics.
|
||||
|
||||
```python
|
||||
def clear_cache(self) -> None
|
||||
```
|
||||
|
||||
##### Description
|
||||
|
||||
Removes all cached reference files from the local filesystem and invalidates all bucket statistics. Useful for:
|
||||
- Forcing fresh statistics computation
|
||||
- Freeing disk space in long-running applications
|
||||
- Ensuring latest data after external bucket modifications
|
||||
- Testing and development workflows
|
||||
|
||||
##### Cache Types Cleared
|
||||
|
||||
1. **Reference Cache**: Binary reference files stored in `/tmp/deltaglider-*/`
|
||||
- Encrypted at rest with ephemeral keys
|
||||
- Content-addressed storage (SHA256-based filenames)
|
||||
- Automatically cleaned up on process exit
|
||||
|
||||
2. **Statistics Cache**: Bucket statistics cached per client session
|
||||
- Metadata about compression ratios and object counts
|
||||
- Session-scoped (not persisted to disk)
|
||||
- Automatically invalidated on bucket mutations
|
||||
|
||||
##### Examples
|
||||
|
||||
```python
|
||||
# Long-running application
|
||||
client = create_client()
|
||||
|
||||
# Work with files
|
||||
for i in range(1000):
|
||||
client.upload(f"file_{i}.zip", "s3://bucket/")
|
||||
|
||||
# Periodic cache cleanup to prevent disk buildup
|
||||
if i % 100 == 0:
|
||||
client.clear_cache()
|
||||
|
||||
# Force fresh statistics after external changes
|
||||
stats_before = client.get_bucket_stats('releases') # Cached
|
||||
# ... external tool modifies bucket ...
|
||||
client.clear_cache()
|
||||
stats_after = client.get_bucket_stats('releases') # Fresh data
|
||||
|
||||
# Development workflow
|
||||
client.clear_cache() # Start with clean state
|
||||
```
|
||||
|
||||
#### `evict_cache`
|
||||
|
||||
Remove a specific cached reference file from the local cache.
|
||||
|
||||
```python
|
||||
def evict_cache(self, s3_url: str) -> None
|
||||
```
|
||||
|
||||
##### Parameters
|
||||
|
||||
- **s3_url** (`str`): S3 URL of the reference file to evict (e.g., `s3://bucket/prefix/reference.bin`)
|
||||
|
||||
##### Description
|
||||
|
||||
Removes a specific reference file from the cache without affecting other cached files or statistics. Useful for:
|
||||
- Selective cache invalidation when specific references are updated
|
||||
- Memory management in applications with many delta spaces
|
||||
- Testing specific delta compression scenarios
|
||||
|
||||
##### Examples
|
||||
|
||||
```python
|
||||
# Evict specific reference after update
|
||||
client.upload("new-reference.zip", "s3://releases/v2.0.0/")
|
||||
client.evict_cache("s3://releases/v2.0.0/reference.bin")
|
||||
|
||||
# Next upload will fetch fresh reference
|
||||
client.upload("similar-file.zip", "s3://releases/v2.0.0/")
|
||||
|
||||
# Selective eviction for specific delta spaces
|
||||
delta_spaces = ["v1.0.0", "v1.1.0", "v1.2.0"]
|
||||
for space in delta_spaces:
|
||||
client.evict_cache(f"s3://releases/{space}/reference.bin")
|
||||
```
|
||||
|
||||
##### See Also
|
||||
|
||||
- [docs/CACHE_MANAGEMENT.md](../../CACHE_MANAGEMENT.md): Complete cache management guide
|
||||
- `clear_cache()`: Clear all caches
|
||||
|
||||
#### `lifecycle_policy`
|
||||
|
||||
Set lifecycle policy for S3 prefix (placeholder for future implementation).
|
||||
|
||||
Reference in New Issue
Block a user