mirror of https://github.com/beshu-tech/deltaglider.git

Commit: updates to docs

README.md (12 lines changed)
@@ -6,13 +6,13 @@
[Python](https://www.python.org/downloads/)
[xdelta3](https://github.com/jmacd/xdelta)

-> 🌟 Star if you like this! 🙏
-> Leave a message in [Issues](https://github.com/beshu-tech/deltaglider/issues) - we are listening!

**Store 4TB of similar files in 5GB. No, that's not a typo.**

DeltaGlider is a drop-in S3 replacement that may achieve 99.9% size reduction for versioned compressed artifacts, backups, and release archives through intelligent binary delta compression (via xdelta3).

+> 🌟 Star if you like this! Or leave a message in [Issues](https://github.com/beshu-tech/deltaglider/issues) - we are listening!

## The Problem We Solved

You're storing hundreds of versions of your software releases. Each 100MB build differs by <1% from the previous version. You're paying to store 100GB of what's essentially 100MB of unique data.
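The fix is classic binary delta encoding. As an illustration of the underlying idea — not DeltaGlider's internal code, only plain xdelta3 driven from Python, with hypothetical file names — storing a new version as a delta against a stored reference looks like this:

```python
import subprocess

# Hypothetical inputs: two similar release archives.
reference = "app-v1.0.0.zip"  # stored once, at full size
new_build = "app-v1.0.1.zip"  # differs by <1% from the reference

# Encode new_build as a delta against reference (-e = encode, -s = source).
subprocess.run(["xdelta3", "-e", "-s", reference, new_build, "v1.0.1.delta"], check=True)

# Later, reconstruct the exact original bytes from reference + delta (-d = decode).
subprocess.run(["xdelta3", "-d", "-s", reference, "v1.0.1.delta", "restored.zip"], check=True)
```

The delta holds only the bytes that changed, so for near-identical builds it is a tiny fraction of the archive size.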
@@ -33,7 +33,7 @@ We don't expect significant benefit for multimedia content like videos, but we n

## Quick Start

The quickest way to start is the GUI. DeltaGlider ships as an SDK and a CLI, but we also provide a GUI:
* https://github.com/beshu-tech/deltaglider_commander/

<div align="center">
@@ -210,6 +210,12 @@ deltaglider stats my-bucket --refresh # Force cache refresh
deltaglider stats my-bucket --no-cache # Skip caching entirely
deltaglider stats my-bucket --json # JSON output for automation

# Integrity verification & maintenance
deltaglider verify s3://releases/file.zip # Validate stored SHA256
deltaglider purge my-bucket # Clean expired .deltaglider/tmp files
deltaglider purge my-bucket --dry-run # Preview purge results
deltaglider purge my-bucket --json # Machine-readable purge stats

# Migrate existing S3 buckets to DeltaGlider compression
deltaglider migrate s3://old-bucket/ s3://new-bucket/ # Interactive migration
deltaglider migrate s3://old-bucket/ s3://new-bucket/ --yes # Skip confirmation
@@ -206,10 +206,17 @@ from deltaglider import create_client
client = create_client(
    endpoint_url="http://minio.internal:9000",  # Custom S3 endpoint
    log_level="DEBUG",                          # Detailed logging
    cache_dir="/var/cache/deltaglider",         # Custom cache location
    aws_access_key_id="minio",
    aws_secret_access_key="minio",
    region_name="eu-west-1",
    max_ratio=0.3,                              # Stricter delta acceptance
)
```

+> ℹ️ The SDK now manages an encrypted, process-isolated cache automatically in `/tmp/deltaglider-*`.
+> Tune cache behavior via environment variables such as `DG_CACHE_BACKEND`,
+> `DG_CACHE_MEMORY_SIZE_MB`, and `DG_CACHE_ENCRYPTION_KEY` instead of passing a `cache_dir` argument.
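For example, a minimal sketch of the environment-variable route — the `DG_CACHE_*` names come from the note above, while the values shown are illustrative assumptions (check the cache documentation for the supported settings):

```python
import os

# Assumed illustrative values; only the variable names are documented above.
os.environ["DG_CACHE_BACKEND"] = "memory"      # hypothetical backend name
os.environ["DG_CACHE_MEMORY_SIZE_MB"] = "512"  # cap the cache size
os.environ["DG_CACHE_ENCRYPTION_KEY"] = "..."  # supply your own key material

from deltaglider import create_client

client = create_client(endpoint_url="http://minio.internal:9000")
```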
## Real-World Example

```python
@@ -299,4 +306,4 @@ url = client.generate_presigned_url(

## License

MIT License - See [LICENSE](https://github.com/beshu-tech/deltaglider/blob/main/LICENSE) for details.

docs/sdk/api.md (182 lines changed)
@@ -156,29 +156,34 @@ for obj in response['Contents']:

#### `get_bucket_stats`

-Get statistics for a bucket with optional detailed compression metrics. Results are cached per client session for performance.
+Get statistics for a bucket with optional detailed compression metrics. Results are cached inside the bucket for performance.

```python
def get_bucket_stats(
    self,
    bucket: str,
-    detailed_stats: bool = False
+    mode: Literal["quick", "sampled", "detailed"] = "quick",
+    use_cache: bool = True,
+    refresh_cache: bool = False,
) -> BucketStats
```

##### Parameters

- **bucket** (`str`): S3 bucket name.
-- **detailed_stats** (`bool`): If True, fetch accurate compression ratios for delta files. Default: False.
-  - With `detailed_stats=False`: ~50ms for any bucket size (LIST calls only)
-  - With `detailed_stats=True`: ~2-3s per 1000 objects (adds HEAD calls for delta files)
+- **mode** (`Literal[...]`): Accuracy/cost trade-off:
+  - `"quick"` (default): LIST-only scan; compression ratios for deltas are estimated.
+  - `"sampled"`: HEAD one delta per deltaspace and reuse the ratio.
+  - `"detailed"`: HEAD every delta object; slowest but exact.
+- **use_cache** (`bool`): If True, read/write `.deltaglider/stats_{mode}.json` in the bucket for reuse.
+- **refresh_cache** (`bool`): Force recomputation even if a cache file is valid.

##### Caching Behavior

-- **Session-scoped cache**: Results cached within client instance lifetime
-- **Automatic invalidation**: Cache cleared on bucket mutations (put, delete, bucket operations)
-- **Intelligent reuse**: Detailed stats can serve quick stat requests
-- **Manual cache control**: Use `clear_cache()` to invalidate all cached stats
+- Stats are cached per mode directly inside the bucket at `.deltaglider/stats_{mode}.json`.
+- Every call validates cache freshness via a quick LIST (object count + compressed size).
+- `refresh_cache=True` skips cache validation and recomputes immediately.
+- `use_cache=False` bypasses both reading and writing cache artifacts.
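Because the cache is a plain object in the bucket, you can inspect it with the same client. A small sketch — assuming the boto3-compatible `get_object` surface, and treating the JSON layout as opaque since it is not specified here:

```python
import json

# Read the quick-mode stats cache that get_bucket_stats() maintains in-bucket.
obj = client.get_object(Bucket="releases", Key=".deltaglider/stats_quick.json")
cached = json.loads(obj["Body"].read())
print(sorted(cached))  # top-level field names are an implementation detail
```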
##### Returns

@@ -195,24 +200,20 @@ def get_bucket_stats(
##### Examples

```python
-# Quick stats for dashboard display (cached after first call)
+# Quick stats (fast LIST-only)
stats = client.get_bucket_stats('releases')
print(f"Objects: {stats.object_count}, Size: {stats.total_size}")

-# Second call hits cache (instant response)
-stats = client.get_bucket_stats('releases')
-print(f"Space saved: {stats.space_saved} bytes")
+# Sampled/detailed modes for analytics
+sampled = client.get_bucket_stats('releases', mode='sampled')
+detailed = client.get_bucket_stats('releases', mode='detailed')
+print(f"Compression ratio: {detailed.average_compression_ratio:.1%}")

-# Detailed stats for analytics (slower but accurate, also cached)
-stats = client.get_bucket_stats('releases', detailed_stats=True)
-print(f"Compression ratio: {stats.average_compression_ratio:.1%}")
+# Force refresh if an external tool modified the bucket
+fresh = client.get_bucket_stats('releases', mode='quick', refresh_cache=True)

-# Quick call after detailed call reuses detailed cache (more accurate)
-quick_stats = client.get_bucket_stats('releases')  # Uses detailed cache

-# Clear cache to force refresh
-client.clear_cache()
-stats = client.get_bucket_stats('releases')  # Fresh computation
+# Skip cache entirely when running ad-hoc diagnostics
+uncached = client.get_bucket_stats('releases', use_cache=False)
```

#### `put_object`
@@ -334,7 +335,7 @@ client.delete_bucket(Bucket='old-releases')

#### `list_buckets`

-List all S3 buckets (boto3-compatible). Includes cached statistics when available.
+List all S3 buckets (boto3-compatible).

```python
def list_buckets(
@@ -345,51 +346,18 @@ def list_buckets(

##### Returns

-Dict with list of buckets and owner information (identical to boto3). Each bucket may include optional `DeltaGliderStats` metadata if statistics have been previously cached.

-##### Response Structure

-```python
-{
-    'Buckets': [
-        {
-            'Name': 'bucket-name',
-            'CreationDate': datetime(2025, 1, 1),
-            'DeltaGliderStats': {  # Optional, only if cached
-                'Cached': True,
-                'Detailed': bool,  # Whether detailed stats were fetched
-                'ObjectCount': int,
-                'TotalSize': int,
-                'CompressedSize': int,
-                'SpaceSaved': int,
-                'AverageCompressionRatio': float,
-                'DeltaObjects': int,
-                'DirectObjects': int
-            }
-        }
-    ],
-    'Owner': {...}
-}
-```
+Dict with the same structure boto3 returns (`Buckets`, `Owner`, `ResponseMetadata`). DeltaGlider does not inject additional metadata; use `get_bucket_stats()` for compression data.

##### Examples

```python
# List all buckets
response = client.list_buckets()
for bucket in response['Buckets']:
    print(f"{bucket['Name']} - Created: {bucket['CreationDate']}")

-    # Check if stats are cached
-    if 'DeltaGliderStats' in bucket:
-        stats = bucket['DeltaGliderStats']
-        print(f" Cached stats: {stats['ObjectCount']} objects, "
-              f"{stats['AverageCompressionRatio']:.1%} compression")

-# Fetch stats first, then list buckets to see cached data
-client.get_bucket_stats('my-bucket', detailed_stats=True)
-response = client.list_buckets()
-# Now 'my-bucket' will include DeltaGliderStats in response
+# Combine with get_bucket_stats for deeper insights
+stats = client.get_bucket_stats('releases', mode='detailed')
+print(f"releases -> {stats.object_count} objects, {stats.space_saved/(1024**3):.2f} GB saved")
```

### Simple API Methods
@@ -528,13 +496,9 @@ else:

### Cache Management Methods

-DeltaGlider maintains two types of caches for performance optimization:
-1. **Reference cache**: Binary reference files used for delta reconstruction
-2. **Statistics cache**: Bucket statistics (session-scoped)

#### `clear_cache`

-Clear all cached data including reference files and bucket statistics.
+Clear all locally cached reference files.

```python
def clear_cache(self) -> None
@@ -542,23 +506,20 @@ def clear_cache(self) -> None

##### Description

-Removes all cached reference files from the local filesystem and invalidates all bucket statistics. Useful for:
-- Forcing fresh statistics computation
+Removes all cached reference files from the local filesystem. Useful for:
- Freeing disk space in long-running applications
-- Ensuring latest data after external bucket modifications
+- Ensuring the next upload/download fetches fresh references from S3
- Resetting cache after configuration or credential changes
- Testing and development workflows

-##### Cache Types Cleared
+##### Cache Scope

-1. **Reference Cache**: Binary reference files stored in `/tmp/deltaglider-*/`
-   - Encrypted at rest with ephemeral keys
-   - Content-addressed storage (SHA256-based filenames)
-   - Automatically cleaned up on process exit

-2. **Statistics Cache**: Bucket statistics cached per client session
-   - Metadata about compression ratios and object counts
-   - Session-scoped (not persisted to disk)
-   - Automatically invalidated on bucket mutations
+- **Reference Cache**: Binary reference files stored in `/tmp/deltaglider-*/`
+  - Encrypted at rest with ephemeral keys
+  - Content-addressed storage (SHA256-based filenames)
+  - Automatically cleaned up on process exit
+- **Statistics Cache**: Stored inside the bucket as `.deltaglider/stats_{mode}.json`.
+  - `clear_cache()` does *not* remove these S3 objects; use `refresh_cache=True` or delete the objects manually if needed.
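If you do need to drop the in-bucket stats caches by hand, a sketch of the manual route (assuming the boto3-compatible `delete_object` surface; the mode names mirror the list above):

```python
# Remove persisted stats caches so the next get_bucket_stats() call recomputes.
for mode in ("quick", "sampled", "detailed"):
    client.delete_object(Bucket="releases", Key=f".deltaglider/stats_{mode}.json")
```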

##### Examples

@@ -574,71 +535,14 @@ for i in range(1000):
    if i % 100 == 0:
        client.clear_cache()

-# Force fresh statistics after external changes
-stats_before = client.get_bucket_stats('releases')  # Cached
-# ... external tool modifies bucket ...
-client.clear_cache()
-stats_after = client.get_bucket_stats('releases')  # Fresh data
+# Force fresh statistics after external changes (skip cache instead of clearing)
+stats_before = client.get_bucket_stats('releases')
+stats_after = client.get_bucket_stats('releases', refresh_cache=True)

# Development workflow
client.clear_cache()  # Start with clean state
```

-#### `evict_cache`

-Remove a specific cached reference file from the local cache.

-```python
-def evict_cache(self, s3_url: str) -> None
-```

-##### Parameters

-- **s3_url** (`str`): S3 URL of the reference file to evict (e.g., `s3://bucket/prefix/reference.bin`)

-##### Description

-Removes a specific reference file from the cache without affecting other cached files or statistics. Useful for:
-- Selective cache invalidation when specific references are updated
-- Memory management in applications with many delta spaces
-- Testing specific delta compression scenarios

-##### Examples

-```python
-# Evict specific reference after update
-client.upload("new-reference.zip", "s3://releases/v2.0.0/")
-client.evict_cache("s3://releases/v2.0.0/reference.bin")

-# Next upload will fetch fresh reference
-client.upload("similar-file.zip", "s3://releases/v2.0.0/")

-# Selective eviction for specific delta spaces
-delta_spaces = ["v1.0.0", "v1.1.0", "v1.2.0"]
-for space in delta_spaces:
-    client.evict_cache(f"s3://releases/{space}/reference.bin")
-```

-##### See Also

-- [docs/CACHE_MANAGEMENT.md](../../CACHE_MANAGEMENT.md): Complete cache management guide
-- `clear_cache()`: Clear all caches

-#### `lifecycle_policy`

-Set lifecycle policy for S3 prefix (placeholder for future implementation).

-```python
-def lifecycle_policy(
-    self,
-    s3_prefix: str,
-    days_before_archive: int = 30,
-    days_before_delete: int = 90
-) -> None
-```

-**Note**: This method is a placeholder for future S3 lifecycle policy management.

## UploadSummary

Data class containing upload operation results.
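The exact fields live on the data class itself; as a usage sketch (assuming, as in the examples above, that `upload()` is the call that returns an `UploadSummary`):

```python
# Hypothetical call; consult UploadSummary's definition for the exact fields.
summary = client.upload("app-v1.0.1.zip", "s3://releases/v1.0.1/")
print(summary)  # a dataclass prints its fields in repr form
```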
@@ -995,4 +899,4 @@ client = create_client(log_level="DEBUG")

- **GitHub Issues**: [github.com/beshu-tech/deltaglider/issues](https://github.com/beshu-tech/deltaglider/issues)
- **Documentation**: [github.com/beshu-tech/deltaglider](https://github.com/beshu-tech/deltaglider)
- **PyPI Package**: [pypi.org/project/deltaglider](https://pypi.org/project/deltaglider)

@@ -41,19 +41,19 @@ def fast_bucket_listing(bucket: str):

    # Process objects for display
    items = []
-    for obj in response.contents:
+    for obj in response['Contents']:
+        metadata = obj.get("Metadata", {})
        items.append({
-            "key": obj.key,
-            "size": obj.size,
-            "last_modified": obj.last_modified,
-            "is_delta": obj.is_delta,  # Determined from filename
-            # No compression_ratio - would require HEAD request
+            "key": obj["Key"],
+            "size": obj["Size"],
+            "last_modified": obj["LastModified"],
+            "is_delta": metadata.get("deltaglider-is-delta") == "true",
        })

    elapsed = time.time() - start
    print(f"Listed {len(items)} objects in {elapsed*1000:.0f}ms")

-    return items, response.next_continuation_token
+    return items, response.get("NextContinuationToken")

# Example: List first page
items, next_token = fast_bucket_listing('releases')
@@ -75,12 +75,12 @@ def paginated_listing(bucket: str, page_size: int = 50):
            FetchMetadata=False  # Keep it fast
        )

-        all_objects.extend(response.contents)
+        all_objects.extend(response["Contents"])

-        if not response.is_truncated:
+        if not response.get("IsTruncated"):
            break

-        continuation_token = response.next_continuation_token
+        continuation_token = response.get("NextContinuationToken")
        print(f"Fetched {len(all_objects)} objects so far...")

    return all_objects
@@ -96,8 +96,8 @@ print(f"Total objects: {len(all_objects)}")
def dashboard_with_stats(bucket: str):
    """Dashboard view with optional detailed stats."""

-    # Quick overview (fast - no metadata)
-    stats = client.get_bucket_stats(bucket, detailed_stats=False)
+    # Quick overview (fast LIST-only)
+    stats = client.get_bucket_stats(bucket)

    print(f"Quick Stats for {bucket}:")
    print(f" Total Objects: {stats.object_count}")
@@ -108,7 +108,7 @@ def dashboard_with_stats(bucket: str):

    # Detailed compression analysis (slower - fetches metadata for deltas only)
    if stats.delta_objects > 0:
-        detailed_stats = client.get_bucket_stats(bucket, detailed_stats=True)
+        detailed_stats = client.get_bucket_stats(bucket, mode='detailed')
        print(f"\nDetailed Compression Stats:")
        print(f" Average Compression: {detailed_stats.average_compression_ratio:.1%}")
        print(f" Space Saved: {detailed_stats.space_saved / (1024**3):.2f} GB")
@@ -131,11 +131,25 @@ def compression_analysis(bucket: str, prefix: str = ""):
    )

    # Analyze compression effectiveness
-    delta_files = [obj for obj in response.contents if obj.is_delta]
+    delta_files: list[dict[str, float | int | str]] = []
+    for obj in response["Contents"]:
+        metadata = obj.get("Metadata", {})
+        if metadata.get("deltaglider-is-delta") != "true":
+            continue
+        original_size = int(metadata.get("deltaglider-original-size", obj["Size"]))
+        compression_ratio = float(metadata.get("deltaglider-compression-ratio", 0.0))
+        delta_files.append(
+            {
+                "key": obj["Key"],
+                "original": original_size,
+                "compressed": obj["Size"],
+                "ratio": compression_ratio,
+            }
+        )

    if delta_files:
-        total_original = sum(obj.original_size for obj in delta_files)
-        total_compressed = sum(obj.compressed_size for obj in delta_files)
+        total_original = sum(obj["original"] for obj in delta_files)
+        total_compressed = sum(obj["compressed"] for obj in delta_files)
        avg_ratio = (total_original - total_compressed) / total_original

        print(f"Compression Analysis for {prefix or 'all files'}:")
@@ -145,11 +159,11 @@ def compression_analysis(bucket: str, prefix: str = ""):
        print(f" Average Compression: {avg_ratio:.1%}")

        # Find best and worst compression
-        best = max(delta_files, key=lambda x: x.compression_ratio or 0)
-        worst = min(delta_files, key=lambda x: x.compression_ratio or 1)
+        best = max(delta_files, key=lambda x: x["ratio"])
+        worst = min(delta_files, key=lambda x: x["ratio"])

-        print(f" Best Compression: {best.key} ({best.compression_ratio:.1%})")
-        print(f" Worst Compression: {worst.key} ({worst.compression_ratio:.1%})")
+        print(f" Best Compression: {best['key']} ({best['ratio']:.1%})")
+        print(f" Worst Compression: {worst['key']} ({worst['ratio']:.1%})")

# Example: Analyze v2.0 releases
compression_analysis('releases', 'v2.0/')
@@ -180,7 +194,11 @@ def performance_comparison(bucket: str):
    )
    time_detailed = (time.time() - start) * 1000

-    delta_count = sum(1 for obj in response_fast.contents if obj.is_delta)
+    delta_count = sum(
+        1
+        for obj in response_fast["Contents"]
+        if obj.get("Metadata", {}).get("deltaglider-is-delta") == "true"
+    )

    print(f"Performance Comparison for {bucket}:")
    print(f" Fast Listing: {time_fast:.0f}ms (1 API call)")
@@ -203,7 +221,7 @@ performance_comparison('releases')

## Bucket Statistics and Monitoring

-DeltaGlider provides powerful bucket statistics with session-level caching for performance.
+DeltaGlider provides powerful bucket statistics with S3-backed caching for performance.

### Quick Dashboard Stats (Cached)

@@ -244,7 +262,7 @@ def detailed_compression_report(bucket: str):
    """Generate detailed compression report with accurate ratios."""

    # Detailed stats fetch metadata for delta files (slower, accurate)
-    stats = client.get_bucket_stats(bucket, detailed_stats=True)
+    stats = client.get_bucket_stats(bucket, mode='detailed')

    efficiency = (stats.space_saved / stats.total_size * 100) if stats.total_size > 0 else 0

@@ -286,7 +304,7 @@ def list_buckets_with_stats():
    # Pre-fetch stats for important buckets
    important_buckets = ['releases', 'backups']
    for bucket_name in important_buckets:
-        client.get_bucket_stats(bucket_name, detailed_stats=True)
+        client.get_bucket_stats(bucket_name, mode='detailed')

    # List all buckets (includes cached stats automatically)
    response = client.list_buckets()
@@ -357,7 +375,7 @@ except KeyboardInterrupt:

## Session-Level Cache Management

-DeltaGlider maintains session-level caches for optimal performance in long-running applications.
+DeltaGlider maintains an encrypted reference cache for optimal performance in long-running applications.

### Long-Running Application Pattern

@@ -410,11 +428,8 @@ def handle_external_bucket_changes(bucket: str):
    print("External backup tool running...")
    run_external_backup_tool(bucket)  # Your external tool

-    # Clear cache to get fresh data
-    client.clear_cache()

-    # Get updated stats
-    stats_after = client.get_bucket_stats(bucket)
+    # Force a recompute of the cached stats
+    stats_after = client.get_bucket_stats(bucket, refresh_cache=True)
    print(f"After: {stats_after.object_count} objects")
    print(f"Added: {stats_after.object_count - stats_before.object_count} objects")

@@ -422,35 +437,6 @@ def handle_external_bucket_changes(bucket: str):
handle_external_bucket_changes('backups')
```

-### Selective Cache Eviction

-```python
-def selective_cache_management():
-    """Manage cache for specific delta spaces."""

-    client = create_client()

-    # Upload to multiple delta spaces
-    versions = ['v1.0.0', 'v1.1.0', 'v1.2.0']

-    for version in versions:
-        client.upload(f"app-{version}.zip", f"s3://releases/{version}/")

-    # Update reference for specific version
-    print("Updating v1.1.0 reference...")
-    client.upload("new-reference.zip", "s3://releases/v1.1.0/")

-    # Evict only v1.1.0 cache (others remain cached)
-    client.evict_cache("s3://releases/v1.1.0/reference.bin")

-    # Next upload to v1.1.0 fetches fresh reference
-    # v1.0.0 and v1.2.0 still use cached references
-    client.upload("similar-file.zip", "s3://releases/v1.1.0/")

-# Example: Selective eviction
-selective_cache_management()
-```

### Testing with Clean Cache

```python
@@ -491,19 +477,18 @@ def measure_cache_performance(bucket: str):
    client = create_client()

    # Test 1: Cold cache
-    client.clear_cache()
    start = time.time()
-    stats1 = client.get_bucket_stats(bucket, detailed_stats=True)
+    stats1 = client.get_bucket_stats(bucket, mode='detailed', refresh_cache=True)
    cold_time = (time.time() - start) * 1000

    # Test 2: Warm cache
    start = time.time()
-    stats2 = client.get_bucket_stats(bucket, detailed_stats=True)
+    stats2 = client.get_bucket_stats(bucket, mode='detailed')
    warm_time = (time.time() - start) * 1000

    # Test 3: Quick stats from detailed cache
    start = time.time()
-    stats3 = client.get_bucket_stats(bucket, detailed_stats=False)
+    stats3 = client.get_bucket_stats(bucket, mode='quick')
    reuse_time = (time.time() - start) * 1000

    print(f"Cache Performance for {bucket}:")
@@ -1707,4 +1692,4 @@ files_to_upload = [
results = uploader.upload_batch(files_to_upload)
```

These examples demonstrate real-world usage patterns for DeltaGlider across various domains. Each example includes error handling, monitoring, and best practices for production deployments.