DeltaGlider API Reference
Complete API documentation for the DeltaGlider Python SDK.
Table of Contents
- Client Creation
- DeltaGliderClient
- UploadSummary
- DeltaService
- Models
- Exceptions
- Environment Variables
- Thread Safety
- Performance Considerations
- Logging
- Version Compatibility
- Support
Client Creation
create_client
Factory function to create a configured DeltaGlider client with sensible defaults.
def create_client(
    endpoint_url: Optional[str] = None,
    log_level: str = "INFO",
    cache_dir: str = "/tmp/.deltaglider/cache",
    **kwargs
) -> DeltaGliderClient
Parameters
- endpoint_url (Optional[str]): S3 endpoint URL for MinIO, R2, or other S3-compatible storage. If None, uses AWS S3.
- log_level (str): Logging verbosity level. Options: "DEBUG", "INFO", "WARNING", "ERROR". Default: "INFO".
- cache_dir (str): Directory for local reference cache. Default: "/tmp/.deltaglider/cache".
- kwargs: Additional arguments passed to DeltaService:
  - tool_version (str): Version string for metadata. Default: "deltaglider/0.1.0"
  - max_ratio (float): Maximum acceptable delta/file ratio. Default: 0.5
Returns
DeltaGliderClient: Configured client instance ready for use.
Examples
# Default AWS S3 configuration
client = create_client()
# Custom endpoint for MinIO
client = create_client(endpoint_url="http://localhost:9000")
# Debug mode with custom cache
client = create_client(
    log_level="DEBUG",
    cache_dir="/var/cache/deltaglider"
)
# Custom delta ratio threshold
client = create_client(max_ratio=0.3) # Only use delta if <30% of original
DeltaGliderClient
Main client class for interacting with DeltaGlider.
Constructor
class DeltaGliderClient:
    def __init__(
        self,
        service: DeltaService,
        endpoint_url: Optional[str] = None
    )
Note: Use create_client() instead of instantiating directly.
Methods
upload
Upload a file to S3 with automatic delta compression.
def upload(
    self,
    file_path: str | Path,
    s3_url: str,
    tags: Optional[Dict[str, str]] = None,
    max_ratio: float = 0.5
) -> UploadSummary
Parameters
- file_path (str | Path): Local file path to upload.
- s3_url (str): S3 destination URL in format s3://bucket/prefix/.
- tags (Optional[Dict[str, str]]): S3 object tags to attach. (Future feature)
- max_ratio (float): Maximum acceptable delta/file size ratio. Default: 0.5.
Returns
UploadSummary: Object containing upload statistics and compression details.
Raises
- FileNotFoundError: If local file doesn't exist.
- ValueError: If S3 URL is invalid.
- PermissionError: If S3 access is denied.
Examples
# Simple upload
summary = client.upload("app.zip", "s3://releases/v1.0.0/")
# With custom compression threshold
summary = client.upload(
    "large-file.tar.gz",
    "s3://backups/",
    max_ratio=0.3  # Only use delta if compression > 70%
)
# Check results
if summary.is_delta:
    print(f"Stored as delta: {summary.stored_size_mb:.1f} MB")
else:
    print(f"Stored as full file: {summary.original_size_mb:.1f} MB")
download
Download and reconstruct a file from S3.
def download(
    self,
    s3_url: str,
    output_path: str | Path
) -> None
Parameters
- s3_url (str): S3 source URL in format s3://bucket/key.
- output_path (str | Path): Local destination path.
Returns
None. File is written to output_path.
Raises
- ValueError: If S3 URL is invalid or missing key.
- FileNotFoundError: If S3 object doesn't exist.
- PermissionError: If local path is not writable or S3 access denied.
Examples
# Download a file
client.download("s3://releases/v1.0.0/app.zip", "downloaded.zip")
# Auto-detects .delta suffix if needed
client.download("s3://releases/v1.0.0/app.zip", "app.zip")
# Will try app.zip first, then app.zip.delta if not found
# Download to specific directory
from pathlib import Path
output = Path("/tmp/downloads/app.zip")
output.parent.mkdir(parents=True, exist_ok=True)
client.download("s3://releases/v1.0.0/app.zip", output)
verify
Verify the integrity of a stored file using SHA256 checksums.
def verify(
    self,
    s3_url: str
) -> bool
Parameters
- s3_url (str): S3 URL of the file to verify.
Returns
bool: True if verification passed, False if corrupted.
Raises
- ValueError: If S3 URL is invalid.
- FileNotFoundError: If S3 object doesn't exist.
Examples
# Verify file integrity
is_valid = client.verify("s3://releases/v1.0.0/app.zip")
if is_valid:
    print("✓ File integrity verified")
else:
    print("✗ File is corrupted!")
    # Re-upload or investigate
lifecycle_policy
Set lifecycle policy for S3 prefix (placeholder for future implementation).
def lifecycle_policy(
    self,
    s3_prefix: str,
    days_before_archive: int = 30,
    days_before_delete: int = 90
) -> None
Note: This method is a placeholder for future S3 lifecycle policy management.
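The signature above is already final, so the call below only illustrates how it is expected to be invoked once lifecycle management lands; today it applies no actual S3 lifecycle rules, and the s3:// prefix format shown is an assumption based on the other methods.
client = create_client()
# Placeholder call: no lifecycle policy is actually applied yet
client.lifecycle_policy(
    "s3://releases/old-builds/",  # prefix format assumed to match the other methods
    days_before_archive=30,
    days_before_delete=90
)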
UploadSummary
Data class containing upload operation results.
@dataclass
class UploadSummary:
    operation: str            # Operation type: "PUT" or "PUT_DELTA"
    bucket: str               # S3 bucket name
    key: str                  # S3 object key
    original_size: int        # Original file size in bytes
    stored_size: int          # Actual stored size in bytes
    is_delta: bool            # Whether delta compression was used
    delta_ratio: float = 0.0  # Ratio of delta size to original
Properties
original_size_mb
Original file size in megabytes.
@property
def original_size_mb(self) -> float
stored_size_mb
Stored size in megabytes (after compression if applicable).
@property
def stored_size_mb(self) -> float
savings_percent
Percentage saved through compression.
@property
def savings_percent(self) -> float
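These properties are thin conveniences over the byte counts stored on the dataclass. The stand-in below is only a sketch of the underlying arithmetic (it assumes "MB" means 1024 * 1024 bytes and that a zero-byte original yields 0% savings; the actual implementation may differ):
from dataclasses import dataclass

@dataclass
class SummarySketch:  # illustrative stand-in, not the real UploadSummary
    original_size: int
    stored_size: int

    @property
    def original_size_mb(self) -> float:
        return self.original_size / (1024 * 1024)

    @property
    def stored_size_mb(self) -> float:
        return self.stored_size / (1024 * 1024)

    @property
    def savings_percent(self) -> float:
        if self.original_size == 0:
            return 0.0
        return (1 - self.stored_size / self.original_size) * 100.0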
Example Usage
summary = client.upload("app.zip", "s3://releases/")
print(f"Operation: {summary.operation}")
print(f"Location: s3://{summary.bucket}/{summary.key}")
print(f"Original: {summary.original_size_mb:.1f} MB")
print(f"Stored: {summary.stored_size_mb:.1f} MB")
print(f"Saved: {summary.savings_percent:.0f}%")
print(f"Delta used: {summary.is_delta}")
if summary.is_delta:
    print(f"Delta ratio: {summary.delta_ratio:.2%}")
DeltaService
Core service class handling delta compression logic.
class DeltaService:
    def __init__(
        self,
        storage: StoragePort,
        diff: DiffPort,
        hasher: HashPort,
        cache: CachePort,
        clock: ClockPort,
        logger: LoggerPort,
        metrics: MetricsPort,
        tool_version: str = "deltaglider/0.1.0",
        max_ratio: float = 0.5
    )
Methods
put
Upload a file with automatic delta compression.
def put(
    self,
    file: Path,
    delta_space: DeltaSpace,
    max_ratio: Optional[float] = None
) -> PutSummary
get
Download and reconstruct a file.
def get(
    self,
    object_key: ObjectKey,
    output_path: Path
) -> GetSummary
verify
Verify file integrity.
def verify(
    self,
    object_key: ObjectKey
) -> VerifyResult
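For most applications the high-level client is enough, but the same operations can be driven through DeltaService using the DeltaSpace and ObjectKey models described in the next section. The sketch below is hedged: the model import path and the client.service attribute are assumptions about the package layout, not documented API.
from pathlib import Path

from deltaglider import create_client
# Assumption: DeltaSpace and ObjectKey are importable from the package root
from deltaglider import DeltaSpace, ObjectKey

client = create_client()
service = client.service  # assumption: the wrapped DeltaService is reachable this way

# Upload into a delta space (bucket + prefix of related files)
put_summary = service.put(
    Path("app-v2.zip"),
    DeltaSpace(bucket="releases", prefix="v2.0.0"),
)
print(put_summary.operation, put_summary.delta_ratio)

# Download/reconstruct and verify by object key
get_summary = service.get(
    ObjectKey(bucket="releases", key="v2.0.0/app-v2.zip.delta"),
    Path("restored.zip"),
)
print(get_summary.reconstructed)

result = service.verify(ObjectKey(bucket="releases", key="v2.0.0/app-v2.zip.delta"))
print(result.valid, result.details)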
Models
DeltaSpace
Represents a compression space in S3.
@dataclass(frozen=True)
class DeltaSpace:
    bucket: str  # S3 bucket name
    prefix: str  # S3 prefix for related files
ObjectKey
Represents an S3 object location.
@dataclass(frozen=True)
class ObjectKey:
    bucket: str  # S3 bucket name
    key: str     # S3 object key
PutSummary
Detailed upload operation results.
@dataclass
class PutSummary:
    operation: str                 # "PUT" or "PUT_DELTA"
    bucket: str                    # S3 bucket
    key: str                       # S3 key
    file_size: int                 # Original file size
    file_hash: str                 # SHA256 of original file
    delta_size: Optional[int]      # Size of delta (if used)
    delta_hash: Optional[str]      # SHA256 of delta
    delta_ratio: Optional[float]   # Delta/original ratio
    reference_hash: Optional[str]  # Reference file hash
GetSummary
Download operation results.
@dataclass
class GetSummary:
    operation: str       # "GET" or "GET_DELTA"
    bucket: str          # S3 bucket
    key: str             # S3 key
    size: int            # Downloaded size
    hash: str            # SHA256 hash
    reconstructed: bool  # Whether reconstruction was needed
VerifyResult
Verification operation results.
@dataclass
class VerifyResult:
    valid: bool                 # Verification result
    operation: str              # "VERIFY" or "VERIFY_DELTA"
    expected_hash: str          # Expected SHA256
    actual_hash: Optional[str]  # Actual SHA256 (if computed)
    details: Optional[str]      # Error details if invalid
Exceptions
DeltaGlider uses standard Python exceptions with descriptive messages:
Common Exceptions
- FileNotFoundError: Local file or S3 object not found
- PermissionError: Access denied (S3 or local filesystem)
- ValueError: Invalid parameters (malformed URLs, invalid ratios)
- IOError: I/O operations failed
- RuntimeError: xdelta3 binary not found or failed
Exception Handling Example
from deltaglider import create_client
client = create_client()
try:
    summary = client.upload("app.zip", "s3://bucket/path/")
except FileNotFoundError as e:
    print(f"File not found: {e}")
except PermissionError as e:
    print(f"Permission denied: {e}")
    print("Check AWS credentials and S3 bucket permissions")
except ValueError as e:
    print(f"Invalid parameters: {e}")
except RuntimeError as e:
    print(f"System error: {e}")
    print("Ensure xdelta3 is installed: apt-get install xdelta3")
except Exception as e:
    print(f"Unexpected error: {e}")
    # Log for investigation
    import traceback
    traceback.print_exc()
Environment Variables
DeltaGlider respects these environment variables:
AWS Configuration
- AWS_ACCESS_KEY_ID: AWS access key
- AWS_SECRET_ACCESS_KEY: AWS secret key
- AWS_DEFAULT_REGION: AWS region (default: us-east-1)
- AWS_ENDPOINT_URL: Custom S3 endpoint (for MinIO/R2)
- AWS_PROFILE: AWS profile to use
DeltaGlider Configuration
- DG_LOG_LEVEL: Logging level (DEBUG, INFO, WARNING, ERROR)
- DG_CACHE_DIR: Local cache directory
- DG_MAX_RATIO: Default maximum delta ratio
Example
# Configure for MinIO
export AWS_ENDPOINT_URL=http://localhost:9000
export AWS_ACCESS_KEY_ID=minioadmin
export AWS_SECRET_ACCESS_KEY=minioadmin
# Configure DeltaGlider
export DG_LOG_LEVEL=DEBUG
export DG_CACHE_DIR=/var/cache/deltaglider
export DG_MAX_RATIO=0.3
# Now use normally
python my_script.py
Thread Safety
DeltaGlider clients are thread-safe for read operations but should not be shared across threads for write operations. For multi-threaded applications:
import threading
from deltaglider import create_client
# Create separate client per thread
def worker(file_path, s3_url):
    client = create_client()  # Each thread gets its own client
    summary = client.upload(file_path, s3_url)
    print(f"Thread {threading.current_thread().name}: {summary.savings_percent:.0f}%")

# Create threads
threads = []
for i, (file, url) in enumerate(files_to_upload):
    t = threading.Thread(target=worker, args=(file, url), name=f"Worker-{i}")
    threads.append(t)
    t.start()

# Wait for completion
for t in threads:
    t.join()
Performance Considerations
Upload Performance
- First file: No compression overhead (becomes reference)
- Similar files: 3-4 files/second with compression
- Network bound: Limited by S3 upload speed
- CPU bound: xdelta3 compression for large files
Download Performance
- Direct files: Limited by S3 download speed
- Delta files: <100ms reconstruction overhead
- Cache hits: Near-instant for cached references
Optimization Tips
- Group related files: Upload similar files to same prefix
- Batch operations: Use concurrent uploads for independent files (see the sketch after this list)
- Cache management: Don't clear cache during operations
- Compression threshold: Tune max_ratio for your use case
- Network optimization: Use S3 Transfer Acceleration if available
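To illustrate the batch-operations tip above, the sketch below fans independent uploads out over a small thread pool, creating one client per task as the Thread Safety section recommends. The file names and bucket layout are illustrative only.
from concurrent.futures import ThreadPoolExecutor

from deltaglider import create_client

# Illustrative job list: (local file, destination prefix)
files_to_upload = [
    ("build-1.0.0.zip", "s3://releases/1.0.0/"),
    ("build-1.0.1.zip", "s3://releases/1.0.1/"),
    ("build-1.0.2.zip", "s3://releases/1.0.2/"),
]

def upload_one(job):
    file_path, s3_url = job
    client = create_client()  # one client per task; clients are not shared across threads
    return client.upload(file_path, s3_url)

with ThreadPoolExecutor(max_workers=4) as pool:
    for summary in pool.map(upload_one, files_to_upload):
        print(f"{summary.key}: saved {summary.savings_percent:.0f}%")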
Logging
DeltaGlider uses Python's standard logging framework:
import logging
# Configure logging before creating client
logging.basicConfig(
    level=logging.DEBUG,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
    handlers=[
        logging.FileHandler('deltaglider.log'),
        logging.StreamHandler()
    ]
)
# Create client (will use configured logging)
client = create_client(log_level="DEBUG")
Log Levels
- DEBUG: Detailed operations, xdelta3 commands
- INFO: Normal operations, compression statistics
- WARNING: Non-critical issues, fallbacks
- ERROR: Operation failures, exceptions
Version Compatibility
- Python: 3.11 or higher required
- boto3: 1.35.0 or higher
- xdelta3: System binary required
- S3 API: Compatible with S3 API v4
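A small pre-flight check along these lines can confirm the interpreter version and the xdelta3 binary before running uploads. This is a convenience sketch, not part of the SDK.
import shutil
import sys

# Python 3.11 or higher is required
if sys.version_info < (3, 11):
    raise RuntimeError("DeltaGlider requires Python 3.11 or higher")

# The xdelta3 system binary must be on PATH
if shutil.which("xdelta3") is None:
    raise RuntimeError("xdelta3 not found; install it, e.g. apt-get install xdelta3")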
Support
- GitHub Issues: github.com/beshu-tech/deltaglider/issues
- Documentation: github.com/beshu-tech/deltaglider
- PyPI Package: pypi.org/project/deltaglider