diff --git a/agent/cmd/README.md b/agent/cmd/README.md new file mode 100644 index 00000000..de459b38 --- /dev/null +++ b/agent/cmd/README.md @@ -0,0 +1,52 @@ +# agent/cmd + +The main entry point for the GoDoxy Agent, a secure monitoring and proxy agent that runs alongside Docker containers. + +## Overview + +This package contains the `main.go` entry point for the GoDoxy Agent. The agent is a TLS-enabled server that provides: + +- Secure Docker socket proxying with client certificate authentication +- HTTP proxy capabilities for container traffic +- System metrics collection and monitoring +- Health check endpoints + +## Architecture + +```mermaid +graph TD + A[main] --> B[Logger Init] + A --> C[Load CA Certificate] + A --> D[Load Server Certificate] + A --> E[Log Version Info] + A --> F[Start Agent Server] + A --> G[Start Socket Proxy] + A --> H[Start System Info Poller] + A --> I[Wait Exit] + + F --> F1[TLS with mTLS] + F --> F2[Agent Handler] + G --> G1[Docker Socket Proxy] +``` + +## Main Function Flow + +1. **Logger Setup**: Configures zerolog with console output +1. **Certificate Loading**: Loads CA and server certificates for TLS/mTLS +1. **Version Logging**: Logs agent version and configuration +1. **Agent Server**: Starts the main HTTPS server with agent handlers +1. **Socket Proxy**: Starts Docker socket proxy if configured +1. **System Monitoring**: Starts system info polling +1. **Graceful Shutdown**: Waits for exit signal (3 second timeout) + +## Configuration + +See `agent/pkg/env/README.md` for configuration options. 
+ +## Dependencies + +- `agent/pkg/agent` - Core agent types and constants +- `agent/pkg/env` - Environment configuration +- `agent/pkg/server` - Server implementation +- `socketproxy/pkg` - Docker socket proxy +- `internal/metrics/systeminfo` - System metrics diff --git a/agent/pkg/agent/README.md b/agent/pkg/agent/README.md new file mode 100644 index 00000000..8d9800e4 --- /dev/null +++ b/agent/pkg/agent/README.md @@ -0,0 +1,108 @@ +# Agent Package + +The `agent` package provides the client-side implementation for interacting with GoDoxy agents. It handles agent configuration, secure communication via TLS, and provides utilities for agent deployment and management. + +## Architecture Overview + +```mermaid +graph TD + subgraph GoDoxy Server + AP[Agent Pool] --> AC[AgentConfig] + end + + subgraph Agent Communication + AC -->|HTTPS| AI[Agent Info API] + AC -->|TLS| ST[Stream Tunneling] + end + + subgraph Deployment + G[Generator] --> DC[Docker Compose] + G --> IS[Install Script] + end + + subgraph Security + NA[NewAgent] --> Certs[Certificates] + end +``` + +## File Structure + +| File | Purpose | +| -------------------------------------------------------- | --------------------------------------------------------- | +| [`config.go`](agent/pkg/agent/config.go) | Core configuration, initialization, and API client logic. | +| [`new_agent.go`](agent/pkg/agent/new_agent.go) | Agent creation and certificate generation logic. | +| [`docker_compose.go`](agent/pkg/agent/docker_compose.go) | Generator for agent Docker Compose configurations. | +| [`bare_metal.go`](agent/pkg/agent/bare_metal.go) | Generator for bare metal installation scripts. | +| [`env.go`](agent/pkg/agent/env.go) | Environment configuration types and constants. | +| [`common/`](agent/pkg/agent/common) | Shared constants and utilities for agents. | + +## Core Types + +### [`AgentConfig`](agent/pkg/agent/config.go:29) + +The primary struct used by the GoDoxy server to manage a connection to an agent. 
It stores the agent's address, metadata, and TLS configuration. + +### [`AgentInfo`](agent/pkg/agent/config.go:45) + +Contains basic metadata about the agent, including its version, name, and container runtime (Docker or Podman). + +### [`PEMPair`](agent/pkg/agent/new_agent.go:53) + +A utility struct for handling PEM-encoded certificate and key pairs, supporting encryption, decryption, and conversion to `tls.Certificate`. + +## Agent Creation and Certificate Management + +### Certificate Generation + +The [`NewAgent`](agent/pkg/agent/new_agent.go:147) function creates a complete certificate infrastructure for an agent: + +- **CA Certificate**: Self-signed root certificate with 1000-year validity. +- **Server Certificate**: For the agent's HTTPS server, signed by the CA. +- **Client Certificate**: For the GoDoxy server to authenticate with the agent. + +All certificates use ECDSA with P-256 curve and SHA-256 signatures. + +### Certificate Security + +- Certificates are encrypted using AES-GCM with a provided encryption key. +- The [`PEMPair`](agent/pkg/agent/new_agent.go:53) struct provides methods for encryption, decryption, and conversion to `tls.Certificate`. +- Base64 encoding is used for certificate storage and transmission. + +## Key Features + +### 1. Secure Communication + +All communication between the GoDoxy server and agents is secured using mutual TLS (mTLS). The [`AgentConfig`](agent/pkg/agent/config.go:29) handles the loading of CA and client certificates to establish secure connections. + +### 2. Agent Discovery and Initialization + +The [`Init`](agent/pkg/agent/config.go:231) and [`InitWithCerts`](agent/pkg/agent/config.go:110) methods allow the server to: + +- Fetch agent metadata (version, name, runtime). +- Verify compatibility between server and agent versions. +- Test support for TCP and UDP stream tunneling. + +### 3. 
Deployment Generators + +The package provides interfaces and implementations for generating deployment artifacts: + +- **Docker Compose**: Generates a `docker-compose.yml` for running the agent as a container via [`AgentComposeConfig.Generate()`](agent/pkg/agent/docker_compose.go:21). +- **Bare Metal**: Generates a shell script to install and run the agent as a systemd service via [`AgentEnvConfig.Generate()`](agent/pkg/agent/bare_metal.go:27). + +### 4. Fake Docker Host + +The package supports a "fake" Docker host scheme (`agent://`) to identify containers managed by an agent, allowing the GoDoxy server to route requests appropriately. See [`IsDockerHostAgent`](agent/pkg/agent/config.go:90) and [`GetAgentAddrFromDockerHost`](agent/pkg/agent/config.go:94). + +## Usage Example + +```go +cfg := &agent.AgentConfig{} +cfg.Parse("192.168.1.100:8081") + +ctx := context.Background() +if err := cfg.Init(ctx); err != nil { + log.Fatal(err) +} + +fmt.Printf("Connected to agent: %s (Version: %s)\n", cfg.Name, cfg.Version) +``` diff --git a/agent/pkg/agentproxy/README.md b/agent/pkg/agentproxy/README.md new file mode 100644 index 00000000..b95e54e8 --- /dev/null +++ b/agent/pkg/agentproxy/README.md @@ -0,0 +1,122 @@ +# agent/pkg/agentproxy + +Package for configuring HTTP proxy connections through the GoDoxy Agent using HTTP headers. + +## Overview + +This package provides types and functions for parsing and setting agent proxy configuration via HTTP headers. It supports both a modern base64-encoded JSON format and a legacy header-based format for backward compatibility. 
+ +## Architecture + +```mermaid +graph LR + A[HTTP Request] --> B[ConfigFromHeaders] + B --> C{Modern Format?} + C -->|Yes| D[Parse X-Proxy-Config Base64 JSON] + C -->|No| E[Parse Legacy Headers] + D --> F[Config] + E --> F + + F --> G[SetAgentProxyConfigHeaders] + G --> H[Modern Headers] + G --> I[Legacy Headers] +``` + +## Public Types + +### Config + +```go +type Config struct { + Scheme string // Proxy scheme (http or https) + Host string // Proxy host (hostname or hostname:port) + HTTPConfig // Extended HTTP configuration +} +``` + +The `HTTPConfig` embedded type (from `internal/route/types`) includes: + +- `NoTLSVerify` - Skip TLS certificate verification +- `ResponseHeaderTimeout` - Timeout for response headers +- `DisableCompression` - Disable gzip compression + +## Public Functions + +### ConfigFromHeaders + +```go +func ConfigFromHeaders(h http.Header) (Config, error) +``` + +Parses proxy configuration from HTTP request headers. Tries modern format first, falls back to legacy format if not present. + +### proxyConfigFromHeaders + +```go +func proxyConfigFromHeaders(h http.Header) (Config, error) +``` + +Parses the modern base64-encoded JSON format from `X-Proxy-Config` header. + +### proxyConfigFromHeadersLegacy + +```go +func proxyConfigFromHeadersLegacy(h http.Header) Config +``` + +Parses the legacy header format: + +- `X-Proxy-Host` - Proxy host +- `X-Proxy-Https` - Whether to use HTTPS +- `X-Proxy-Skip-Tls-Verify` - Skip TLS verification +- `X-Proxy-Response-Header-Timeout` - Response timeout in seconds + +### SetAgentProxyConfigHeaders + +```go +func (cfg *Config) SetAgentProxyConfigHeaders(h http.Header) +``` + +Sets headers for modern format with base64-encoded JSON config. + +### SetAgentProxyConfigHeadersLegacy + +```go +func (cfg *Config) SetAgentProxyConfigHeadersLegacy(h http.Header) +``` + +Sets headers for legacy format with individual header fields. 
+ +## Header Constants + +Modern headers: + +- `HeaderXProxyScheme` - Proxy scheme +- `HeaderXProxyHost` - Proxy host +- `HeaderXProxyConfig` - Base64-encoded JSON config + +Legacy headers (deprecated): + +- `HeaderXProxyHTTPS` +- `HeaderXProxySkipTLSVerify` +- `HeaderXProxyResponseHeaderTimeout` + +## Usage Example + +```go +// Reading configuration from incoming request headers +func handleRequest(w http.ResponseWriter, r *http.Request) { + cfg, err := agentproxy.ConfigFromHeaders(r.Header) + if err != nil { + http.Error(w, "Invalid proxy config", http.StatusBadRequest) + return + } + + // Use cfg.Scheme and cfg.Host to proxy the request + // ... +} +``` + +## Integration + +This package is used by `agent/pkg/handler/proxy_http.go` to configure reverse proxy connections based on request headers. diff --git a/agent/pkg/certs/README.md b/agent/pkg/certs/README.md new file mode 100644 index 00000000..058cd62c --- /dev/null +++ b/agent/pkg/certs/README.md @@ -0,0 +1,131 @@ +# agent/pkg/certs + +Certificate management package for creating and extracting certificate archives. + +## Overview + +This package provides utilities for packaging SSL certificates into ZIP archives and extracting them. It is used by the GoDoxy Agent to distribute certificates to clients in a convenient format. 
+ +## Architecture + +```mermaid +graph LR + A[Raw Certs] --> B[ZipCert] + B --> C[ZIP Archive] + C --> D[ca.pem] + C --> E[cert.pem] + C --> F[key.pem] + + G[ZIP Archive] --> H[ExtractCert] + H --> I[ca, crt, key] +``` + +## Public Functions + +### ZipCert + +```go +func ZipCert(ca, crt, key []byte) ([]byte, error) +``` + +Creates a ZIP archive containing three PEM files: + +- `ca.pem` - CA certificate +- `cert.pem` - Server/client certificate +- `key.pem` - Private key + +**Parameters:** + +- `ca` - CA certificate in PEM format +- `crt` - Certificate in PEM format +- `key` - Private key in PEM format + +**Returns:** + +- ZIP archive bytes +- Error if packing fails + +### ExtractCert + +```go +func ExtractCert(data []byte) (ca, crt, key []byte, err error) +``` + +Extracts certificates from a ZIP archive created by `ZipCert`. + +**Parameters:** + +- `data` - ZIP archive bytes + +**Returns:** + +- `ca` - CA certificate bytes +- `crt` - Certificate bytes +- `key` - Private key bytes +- Error if extraction fails + +### AgentCertsFilepath + +```go +func AgentCertsFilepath(host string) (filepathOut string, ok bool) +``` + +Generates the file path for storing agent certificates. + +**Parameters:** + +- `host` - Agent hostname + +**Returns:** + +- Full file path within `certs/` directory +- `false` if host is invalid (contains path separators or special characters) + +### isValidAgentHost + +```go +func isValidAgentHost(host string) bool +``` + +Validates that a host string is safe for use in file paths. + +## Constants + +```go +const AgentCertsBasePath = "certs" +``` + +Base directory for storing certificate archives. 
## Usage Example
+ +```go +package main + +import ( + "os" + "github.com/yusing/godoxy/agent/pkg/certs" +) + +func main() { + // Read certificate files + caData, _ := os.ReadFile("ca.pem") + certData, _ := os.ReadFile("cert.pem") + keyData, _ := os.ReadFile("key.pem") + + // Create ZIP archive + zipData, err := certs.ZipCert(caData, certData, keyData) + if err != nil { + panic(err) + } + + // Save to file + os.WriteFile("agent-certs.zip", zipData, 0644) + + // Extract from archive + ca, crt, key, err := certs.ExtractCert(zipData) + // ... +} +``` + +## File Format + +The ZIP archive uses `zip.Store` compression (no compression) for fast creation and extraction. Each file is stored with its standard name (`ca.pem`, `cert.pem`, `key.pem`). diff --git a/agent/pkg/env/README.md b/agent/pkg/env/README.md new file mode 100644 index 00000000..be610be6 --- /dev/null +++ b/agent/pkg/env/README.md @@ -0,0 +1,52 @@ +# agent/pkg/env + +Environment configuration package for the GoDoxy Agent. + +## Overview + +This package manages environment variable parsing and provides a centralized location for all agent configuration options. It is automatically initialized on import. 
+ +## Variables + +| Variable | Type | Default | Description | +| -------------------------- | ---------------- | ---------------------- | --------------------------------------- | +| `DockerSocket` | string | `/var/run/docker.sock` | Path to Docker socket | +| `AgentName` | string | System hostname | Agent identifier | +| `AgentPort` | int | `8890` | Agent server port | +| `AgentSkipClientCertCheck` | bool | `false` | Skip mTLS certificate verification | +| `AgentCACert` | string | (empty) | Base64 Encoded CA certificate + key | +| `AgentSSLCert` | string | (empty) | Base64 Encoded server certificate + key | +| `Runtime` | ContainerRuntime | `docker` | Container runtime (docker or podman) | + +## ContainerRuntime Type + +```go +type ContainerRuntime string + +const ( + ContainerRuntimeDocker ContainerRuntime = "docker" + ContainerRuntimePodman ContainerRuntime = "podman" +) +``` + +## Public Functions + +### DefaultAgentName + +```go +func DefaultAgentName() string +``` + +Returns the system hostname as the default agent name. Falls back to `"agent"` if hostname cannot be determined. + +### Load + +```go +func Load() +``` + +Reloads all environment variables from the environment. Called automatically on package init, but can be called again to refresh configuration. + +## Validation + +The `Load()` function validates that `Runtime` is either `docker` or `podman`. An invalid runtime causes a fatal error. diff --git a/agent/pkg/handler/README.md b/agent/pkg/handler/README.md new file mode 100644 index 00000000..1b887f4d --- /dev/null +++ b/agent/pkg/handler/README.md @@ -0,0 +1,127 @@ +# agent/pkg/handler + +HTTP request handler package for the GoDoxy Agent. 
+ +## Overview + +This package provides the HTTP handler for the GoDoxy Agent server, including endpoints for: + +- Version information +- Agent name and runtime +- Health checks +- System metrics (via SSE) +- HTTP proxy routing +- Docker socket proxying + +## Architecture + +```mermaid +graph TD + A[HTTP Request] --> B[NewAgentHandler] + B --> C{ServeMux Router} + + C --> D[GET /version] + C --> E[GET /name] + C --> F[GET /runtime] + C --> G[GET /health] + C --> H[GET /system-info] + C --> I[GET /proxy/http/#123;path...#125;] + C --> J[ /#42; Docker Socket] + + H --> K[Gin Router] + K --> L[WebSocket Upgrade] + L --> M[SystemInfo Poller] +``` + +## Public Types + +### ServeMux + +```go +type ServeMux struct{ *http.ServeMux } +``` + +Wrapper around `http.ServeMux` with agent-specific endpoint helpers. + +**Methods:** + +- `HandleEndpoint(method, endpoint string, handler http.HandlerFunc)` - Registers handler with API base path +- `HandleFunc(endpoint string, handler http.HandlerFunc)` - Registers GET handler with API base path + +## Public Functions + +### NewAgentHandler + +```go +func NewAgentHandler() http.Handler +``` + +Creates and configures the HTTP handler for the agent server. 
Sets up: + +- Gin-based metrics handler with WebSocket support for SSE +- All standard agent endpoints +- HTTP proxy endpoint +- Docker socket proxy fallback + +## Endpoints + +| Endpoint | Method | Description | +| ----------------------- | -------- | ------------------------------------ | +| `/version` | GET | Returns agent version | +| `/name` | GET | Returns agent name | +| `/runtime` | GET | Returns container runtime | +| `/health` | GET | Health check with scheme query param | +| `/system-info` | GET | System metrics via SSE or WebSocket | +| `/proxy/http/{path...}` | GET/POST | HTTP proxy with config from headers | +| `/*` | \* | Docker socket proxy | + +## Sub-packages + +### proxy_http.go + +Handles HTTP proxy requests by reading configuration from request headers and proxying to the configured upstream. + +**Key Function:** + +- `ProxyHTTP(w, r)` - Proxies HTTP requests based on `X-Proxy-*` headers + +### check_health.go + +Handles health check requests for various schemes. + +**Key Function:** + +- `CheckHealth(w, r)` - Performs health checks with configurable scheme + +**Supported Schemes:** + +- `http`, `https` - HTTP health check +- `h2c` - HTTP/2 cleartext health check +- `tcp`, `udp`, `tcp4`, `udp4`, `tcp6`, `udp6` - TCP/UDP health check +- `fileserver` - File existence check + +## Usage Example + +```go +package main + +import ( + "net/http" + "github.com/yusing/godoxy/agent/pkg/handler" +) + +func main() { + mux := http.NewServeMux() + mux.Handle("/", handler.NewAgentHandler()) + + http.ListenAndServe(":8890", mux) +} +``` + +## WebSocket Support + +The handler includes a permissive WebSocket upgrader for internal use (no origin check). This enables real-time system metrics streaming via Server-Sent Events (SSE). + +## Docker Socket Integration + +All unmatched requests fall through to the Docker socket handler, allowing the agent to proxy Docker API calls when configured. 
diff --git a/cmd/README.md b/cmd/README.md new file mode 100644 index 00000000..60606cc0 --- /dev/null +++ b/cmd/README.md @@ -0,0 +1,73 @@ +# cmd + +Main entry point package for GoDoxy, a lightweight reverse proxy with WebUI for Docker containers. + +## Overview + +This package contains the `main.go` entry point that initializes and starts the GoDoxy server. It coordinates the initialization of all core components including configuration loading, API server, authentication, and monitoring services. + +## Architecture + +```mermaid +graph TD + A[main] --> B[Init Profiling] + A --> C[Init Logger] + A --> D[Parallel Init] + D --> D1[DNS Providers] + D --> D2[Icon Cache] + D --> D3[System Info Poller] + D --> D4[Middleware Compose Files] + A --> E[JWT Secret Setup] + A --> F[Create Directories] + A --> G[Load Config] + A --> H[Start Proxy Servers] + A --> I[Init Auth] + A --> J[Start API Server] + A --> K[Debug Server] + A --> L[Uptime Poller] + A --> M[Watch Changes] + A --> N[Wait Exit] +``` + +## Main Function Flow + +The `main()` function performs the following initialization steps: + +1. **Profiling Setup**: Initializes pprof endpoints for performance monitoring +1. **Logger Initialization**: Configures zerolog with memory logging +1. **Parallel Initialization**: Starts DNS providers, icon cache, system info poller, and middleware +1. **JWT Secret**: Ensures API JWT secret is set (generates random if not provided) +1. **Directory Preparation**: Creates required directories for logs, certificates, etc. +1. **Configuration Loading**: Loads YAML configuration and reports any errors +1. **Proxy Servers**: Starts HTTP/HTTPS proxy servers based on configuration +1. **Authentication**: Initializes authentication system with access control +1. **API Server**: Starts the REST API server with all configured routes +1. **Debug Server**: Starts the debug page server (development mode) +1. **Monitoring**: Starts uptime and system info polling +1. 
**Change Watcher**: Starts watching for Docker container and configuration changes +1. **Graceful Shutdown**: Waits for exit signal with configured timeout + +## Configuration + +The main configuration is loaded from `config/config.yml`. Required directories include: + +- `logs/` - Log files +- `config/` - Configuration directory +- `certs/` - SSL certificates +- `proxy/` - Proxy-related files + +## Environment Variables + +- `API_JWT_SECRET` - Secret key for JWT authentication (optional, auto-generated if not set) + +## Dependencies + +- `internal/api` - REST API handlers +- `internal/auth` - Authentication and ACL +- `internal/config` - Configuration management +- `internal/dnsproviders` - DNS provider integration +- `internal/homepage` - WebUI dashboard +- `internal/logging` - Logging infrastructure +- `internal/metrics` - System metrics collection +- `internal/route` - HTTP routing and middleware +- `github.com/yusing/goutils/task` - Task lifecycle management diff --git a/internal/acl/README.md b/internal/acl/README.md new file mode 100644 index 00000000..91426367 --- /dev/null +++ b/internal/acl/README.md @@ -0,0 +1,282 @@ +# ACL (Access Control List) + +Access control at the TCP connection level with IP/CIDR, timezone, and country-based filtering. + +## Overview + +The ACL package provides network-level access control by wrapping TCP listeners and validating incoming connections against configurable allow/deny rules. It integrates with MaxMind GeoIP for geographic-based filtering and supports access logging with notification batching. + +### Primary consumers + +- `internal/entrypoint` - Wraps the main TCP listener for connection filtering +- Operators - Configure rules via YAML configuration + +### Non-goals + +- HTTP request-level filtering (handled by middleware) +- Authentication or authorization (see `internal/auth`) +- VPN or tunnel integration + +### Stability + +Stable internal package. The public API is the `Config` struct and its methods. 
+ +## Public API + +### Exported types + +```go +type Config struct { + Default string // "allow" or "deny" (default: "allow") + AllowLocal *bool // Allow private/loopback IPs (default: true) + Allow Matchers // Allow rules + Deny Matchers // Deny rules + Log *accesslog.ACLLoggerConfig // Access logging configuration + + Notify struct { + To []string // Notification providers + Interval time.Duration // Notification frequency (default: 1m) + IncludeAllowed *bool // Include allowed in notifications (default: false) + } +} +``` + +```go +type Matcher struct { + match MatcherFunc +} +``` + +```go +type Matchers []Matcher +``` + +### Exported functions and methods + +```go +func (c *Config) Validate() gperr.Error +``` + +Validates configuration and sets defaults. Must be called before `Start`. + +```go +func (c *Config) Start(parent task.Parent) gperr.Error +``` + +Initializes the ACL, starts the logger and notification goroutines. + +```go +func (c *Config) IPAllowed(ip net.IP) bool +``` + +Returns true if the IP is allowed based on configured rules. Performs caching and GeoIP lookup if needed. + +```go +func (c *Config) WrapTCP(lis net.Listener) net.Listener +``` + +Wraps a `net.Listener` to filter connections by IP. + +```go +func (matcher *Matcher) Parse(s string) error +``` + +Parses a matcher string in the format `{type}:{value}`. Supported types: `ip`, `cidr`, `tz`, `country`. 
+ +## Architecture + +### Core components + +```mermaid +graph TD + A[TCP Listener] --> B[TCPListener Wrapper] + B --> C{IP Allowed?} + C -->|Yes| D[Accept Connection] + C -->|No| E[Close Connection] + + F[Config] --> G[Validate] + G --> H[Start] + H --> I[Matcher Evaluation] + I --> C + + J[MaxMind] -.-> K[IP Lookup] + K -.-> I + + L[Access Logger] -.-> M[Log & Notify] + M -.-> B +``` + +### Connection filtering flow + +```mermaid +sequenceDiagram + participant Client + participant TCPListener + participant Config + participant MaxMind + participant Logger + + Client->>TCPListener: Connection Request + TCPListener->>Config: IPAllowed(clientIP) + + alt Loopback IP + Config-->>TCPListener: true + else Private IP (allow_local) + Config-->>TCPListener: true + else Cached Result + Config-->>TCPListener: Cached Result + else Evaluate Allow Rules + Config->>Config: Check Allow list + alt Matches + Config->>Config: Cache true + Config-->>TCPListener: Allowed + else Evaluate Deny Rules + Config->>Config: Check Deny list + alt Matches + Config->>Config: Cache false + Config-->>TCPListener: Denied + else Default Action + Config->>MaxMind: Lookup GeoIP + MaxMind-->>Config: IPInfo + Config->>Config: Apply default rule + Config->>Config: Cache result + Config-->>TCPListener: Result + end + end + end + + alt Logging enabled + Config->>Logger: Log access attempt + end +``` + +### Matcher types + +| Type | Format | Example | +| -------- | ----------------- | --------------------- | +| IP | `ip:address` | `ip:192.168.1.1` | +| CIDR | `cidr:network` | `cidr:192.168.0.0/16` | +| TimeZone | `tz:timezone` | `tz:Asia/Shanghai` | +| Country | `country:ISOCode` | `country:GB` | + +## Configuration Surface + +### Config sources + +Configuration is loaded from `config/config.yml` under the `acl` key. 
+ +### Schema + +```yaml +acl: + default: "allow" # "allow" or "deny" + allow_local: true # Allow private/loopback IPs + log: + log_allowed: false # Log allowed connections + notify: + to: ["gotify"] # Notification providers + interval: "1m" # Notification interval + include_allowed: false # Include allowed in notifications +``` + +### Hot-reloading + +Configuration requires restart. The ACL does not support dynamic rule updates. + +## Dependency and Integration Map + +### Internal dependencies + +- `internal/maxmind` - IP geolocation lookup +- `internal/logging/accesslog` - Access logging +- `internal/notif` - Notifications +- `internal/task/task.go` - Lifetime management + +### Integration points + +```go +// Entrypoint uses ACL to wrap the TCP listener +aclListener := config.ACL.WrapTCP(listener) +http.Server.Serve(aclListener, entrypoint) +``` + +## Observability + +### Logs + +- `ACL started` - Configuration summary on start +- `log_notify_loop` - Access attempts (allowed/denied) + +Log levels: `Info` for startup, `Debug` for client closure. + +### Metrics + +No metrics are currently exposed. 
+ +## Security Considerations + +- Loopback and private IPs are always allowed unless explicitly denied +- Cache TTL is 1 minute to limit memory usage +- Notification channel has a buffer of 100 to prevent blocking +- Failed connections are immediately closed without response + +## Failure Modes and Recovery + +| Failure | Behavior | Recovery | +| --------------------------------- | ------------------------------------- | --------------------------------------------- | +| Invalid matcher syntax | Validation fails on startup | Fix configuration syntax | +| MaxMind database unavailable | GeoIP lookups return unknown location | Default action applies; cache hit still works | +| Notification provider unavailable | Notification dropped | Error logged, continues operation | +| Cache full | No eviction, uses Go map | No action needed | + +## Usage Examples + +### Basic configuration + +```go +aclConfig := &acl.Config{ + Default: "allow", + AllowLocal: ptr(true), + Allow: acl.Matchers{ + {match: matchIP(net.ParseIP("192.168.1.0/24"))}, + }, + Deny: acl.Matchers{ + {match: matchISOCode("CN")}, + }, +} +if err := aclConfig.Validate(); err != nil { + log.Fatal(err) +} +if err := aclConfig.Start(parent); err != nil { + log.Fatal(err) +} +``` + +### Wrapping a TCP listener + +```go +listener, err := net.Listen("tcp", ":443") +if err != nil { + log.Fatal(err) +} + +// Wrap with ACL +aclListener := aclConfig.WrapTCP(listener) + +// Use with HTTP server +server := &http.Server{} +server.Serve(aclListener) +``` + +### Creating custom matchers + +```go +matcher := &acl.Matcher{} +err := matcher.Parse("country:US") +if err != nil { + log.Fatal(err) +} + +// Use the matcher +allowed := matcher.match(ipInfo) +``` diff --git a/internal/agentpool/README.md b/internal/agentpool/README.md new file mode 100644 index 00000000..08b260a8 --- /dev/null +++ b/internal/agentpool/README.md @@ -0,0 +1,281 @@ +# Agent Pool + +Thread-safe pool for managing remote Docker agent connections. 
+ +## Overview + +The agentpool package provides a centralized pool for storing and retrieving remote agent configurations. It enables GoDoxy to connect to Docker hosts via agent connections instead of direct socket access, enabling secure remote container management. + +### Primary consumers + +- `internal/route/provider` - Creates agent-based route providers +- `internal/docker` - Manages agent-based Docker client connections +- Configuration loading during startup + +### Non-goals + +- Agent lifecycle management (handled by `agent/pkg/agent`) +- Agent health monitoring +- Agent authentication/authorization + +### Stability + +Stable internal package. The pool uses `xsync.Map` for lock-free concurrent access. + +## Public API + +### Exported types + +```go +type Agent struct { + *agent.AgentConfig + httpClient *http.Client + fasthttpHcClient *fasthttp.Client +} +``` + +### Exported functions + +```go +func Add(cfg *agent.AgentConfig) (added bool) +``` + +Adds an agent to the pool. Returns `true` if added, `false` if already exists. Uses `LoadOrCompute` to prevent duplicates. + +```go +func Has(cfg *agent.AgentConfig) bool +``` + +Checks if an agent exists in the pool. + +```go +func Remove(cfg *agent.AgentConfig) +``` + +Removes an agent from the pool. + +```go +func RemoveAll() +``` + +Removes all agents from the pool. Called during configuration reload. + +```go +func Get(agentAddrOrDockerHost string) (*Agent, bool) +``` + +Retrieves an agent by address or Docker host URL. Automatically detects if the input is an agent address or Docker host URL and resolves accordingly. + +```go +func GetAgent(name string) (*Agent, bool) +``` + +Retrieves an agent by name. O(n) iteration over pool contents. + +```go +func List() []*Agent +``` + +Returns all agents as a slice. Creates a new copy for thread safety. + +```go +func Iter() iter.Seq2[string, *Agent] +``` + +Returns an iterator over all agents. Uses `xsync.Map.Range`. 
+ +```go +func Num() int +``` + +Returns the number of agents in the pool. + +```go +func (agent *Agent) HTTPClient() *http.Client +``` + +Returns an HTTP client configured for the agent. + +## Architecture + +### Core components + +```mermaid +graph TD + A[Agent Config] --> B[Add to Pool] + B --> C[xsync.Map Storage] + C --> D{Get Request} + D -->|By Address| E[Load from map] + D -->|By Docker Host| F[Resolve agent addr] + D -->|By Name| G[Iterate & match] + + H[Docker Client] --> I[Get Agent] + I --> C + I --> J[HTTP Client] + J --> K[Agent Connection] + + L[Route Provider] --> M[List Agents] + M --> C +``` + +### Thread safety model + +The pool uses `xsync.Map[string, *Agent]` for concurrent-safe operations: + +- `Add`: `LoadOrCompute` prevents race conditions and duplicates +- `Get`: Lock-free read operations +- `Iter`: Consistent snapshot iteration via `Range` +- `Remove`: Thread-safe deletion + +### Test mode + +When running tests (binary ends with `.test`), a test agent is automatically added: + +```go +func init() { + if strings.HasSuffix(os.Args[0], ".test") { + agentPool.Store("test-agent", &Agent{ + AgentConfig: &agent.AgentConfig{ + Addr: "test-agent", + }, + }) + } +} +``` + +## Configuration Surface + +No direct configuration. 
Agents are added via configuration loading from `config/config.yml`: + +```yaml +providers: + agents: + - addr: agent.example.com:443 + name: remote-agent + tls: + ca_file: /path/to/ca.pem + cert_file: /path/to/cert.pem + key_file: /path/to/key.pem +``` + +## Dependency and Integration Map + +### Internal dependencies + +- `agent/pkg/agent` - Agent configuration and connection settings +- `xsync/v4` - Concurrent map implementation + +### External dependencies + +- `valyala/fasthttp` - Fast HTTP client for agent communication + +### Integration points + +```go +// Docker package uses agent pool for remote connections +if agent.IsDockerHostAgent(host) { + a, ok := agentpool.Get(host) + if !ok { + panic(fmt.Errorf("agent %q not found", host)) + } + opt := []client.Opt{ + client.WithHost(agent.DockerHost), + client.WithHTTPClient(a.HTTPClient()), + } +} +``` + +## Observability + +### Logs + +No specific logging in the agentpool package. Client creation/destruction is logged in the docker package. + +### Metrics + +No metrics are currently exposed. 
+ +## Security Considerations + +- TLS configuration is loaded from agent configuration +- Connection credentials are not stored in the pool after agent creation +- HTTP clients are created per-request to ensure credential freshness + +## Failure Modes and Recovery + +| Failure | Behavior | Recovery | +| -------------------- | -------------------- | ---------------------------- | +| Agent not found | Returns `nil, false` | Add agent to pool before use | +| Duplicate add | Returns `false` | Existing agent is preserved | +| Test mode activation | Test agent added | Only during test binaries | + +## Performance Characteristics + +- O(1) lookup by address +- O(n) iteration for name-based lookup +- Pre-sized to 10 entries via `xsync.WithPresize(10)` +- No locks required for read operations +- HTTP clients are created per-call to ensure fresh connections + +## Usage Examples + +### Adding an agent + +```go +agentConfig := &agent.AgentConfig{ + Addr: "agent.example.com:443", + Name: "my-agent", +} + +added := agentpool.Add(agentConfig) +if !added { + log.Println("Agent already exists") +} +``` + +### Retrieving an agent + +```go +// By address +agent, ok := agentpool.Get("agent.example.com:443") +if !ok { + log.Fatal("Agent not found") +} + +// By Docker host URL +agent, ok := agentpool.Get("http://docker-host:2375") +if !ok { + log.Fatal("Agent not found") +} + +// By name +agent, ok := agentpool.GetAgent("my-agent") +if !ok { + log.Fatal("Agent not found") +} +``` + +### Iterating over all agents + +```go +for addr, agent := range agentpool.Iter() { + log.Printf("Agent: %s at %s", agent.Name, addr) +} +``` + +### Using with Docker client + +```go +// When creating a Docker client with an agent host +if agent.IsDockerHostAgent(host) { + a, ok := agentpool.Get(host) + if !ok { + panic(fmt.Errorf("agent %q not found", host)) + } + opt := []client.Opt{ + client.WithHost(agent.DockerHost), + client.WithHTTPClient(a.HTTPClient()), + } + dockerClient, err := client.New(opt...) 
+} +``` diff --git a/internal/api/v1/README.md b/internal/api/v1/README.md new file mode 100644 index 00000000..658e3d4e --- /dev/null +++ b/internal/api/v1/README.md @@ -0,0 +1,197 @@ +# API v1 Package + +Implements the v1 REST API handlers for GoDoxy, exposing endpoints for managing routes, Docker containers, certificates, metrics, and system configuration. + +## Overview + +The `internal/api/v1` package implements the HTTP handlers that power GoDoxy's REST API. It uses the Gin web framework and provides endpoints for route management, container operations, certificate handling, system metrics, and configuration. + +### Primary Consumers + +- **WebUI**: The homepage dashboard and admin interface consume these endpoints + +### Non-goals + +- Authentication and authorization logic (delegated to `internal/auth`) +- Route proxying and request handling (handled by `internal/route`) +- Docker container lifecycle management (delegated to `internal/docker`) +- Certificate issuance and storage (handled by `internal/autocert`) + +### Stability + +This package is stable. Public API endpoints follow semantic versioning for request/response contracts. Internal implementation may change between minor versions. 
+ +## Public API + +### Exported Types + +Types are defined in `goutils/apitypes`: + +| Type | Purpose | +| -------------------------- | -------------------------------- | +| `apitypes.ErrorResponse` | Standard error response format | +| `apitypes.SuccessResponse` | Standard success response format | + +### Handler Subpackages + +| Package | Purpose | +| ---------- | ---------------------------------------------- | +| `route` | Route listing, details, and playground testing | +| `docker` | Docker container management and monitoring | +| `cert` | Certificate information and renewal | +| `metrics` | System metrics and uptime information | +| `homepage` | Homepage items and category management | +| `file` | Configuration file read/write operations | +| `auth` | Authentication and session management | +| `agent` | Remote agent creation and management | + +## Architecture + +### Handler Organization + +Package structure mirrors the API endpoint paths (e.g., `auth/login.go` handles `/auth/login`). + +### Request Flow + +```mermaid +sequenceDiagram + participant Client + participant GinRouter + participant Handler + participant Service + participant Response + + Client->>GinRouter: HTTP Request + GinRouter->>Handler: Route to handler + Handler->>Service: Call service layer + Service-->>Handler: Data or error + Handler->>Response: Format JSON response + Response-->>Client: JSON or redirect +``` + +## Configuration Surface + +API listening address is configured with `GODOXY_API_ADDR` environment variable. 
+ +## Dependency and Integration Map + +### Internal Dependencies + +| Package | Purpose | +| ----------------------- | --------------------------- | +| `internal/route/routes` | Route storage and iteration | +| `internal/docker` | Docker client management | +| `internal/config` | Configuration access | +| `internal/metrics` | System metrics collection | +| `internal/homepage` | Homepage item generation | +| `internal/agentpool` | Remote agent management | +| `internal/auth` | Authentication services | + +### External Dependencies + +| Package | Purpose | +| ------------------------------ | --------------------------- | +| `github.com/gin-gonic/gin` | HTTP routing and middleware | +| `github.com/gorilla/websocket` | WebSocket support | +| `github.com/moby/moby/client` | Docker API client | + +## Observability + +### Logs + +Handlers log at `INFO` level for requests and `ERROR` level for failures. Logs include: + +- Request path and method +- Response status code +- Error details (when applicable) + +### Metrics + +No dedicated metrics exposed by handlers. Request metrics collected by middleware. 
+ +## Security Considerations + +- All endpoints (except `/api/v1/version`) require authentication +- Input validation using Gin binding tags +- Path traversal prevention in file operations +- WebSocket connections use same auth middleware as HTTP + +## Failure Modes and Recovery + +| Failure | Behavior | +| ----------------------------------- | ------------------------------------------ | +| Docker host unreachable | Returns partial results with errors logged | +| Certificate provider not configured | Returns 404 | +| Invalid request body | Returns 400 with error details | +| Authentication failure | Returns 302 redirect to login | +| Agent not found | Returns 404 | + +## Usage Examples + +### Listing All Routes via WebSocket + +```go +import ( + "github.com/gorilla/websocket" +) + +func watchRoutes(provider string) error { + url := "ws://localhost:8888/api/v1/route/list" + if provider != "" { + url += "?provider=" + provider + } + + conn, _, err := websocket.DefaultDialer.Dial(url, nil) + if err != nil { + return err + } + defer conn.Close() + + for { + _, message, err := conn.ReadMessage() + if err != nil { + return err + } + // message contains JSON array of routes + processRoutes(message) + } +} +``` + +### Getting Container Status + +```go +import ( + "encoding/json" + "net/http" +) + +type Container struct { + Server string `json:"server"` + Name string `json:"name"` + ID string `json:"id"` + Image string `json:"image"` +} + +func listContainers() ([]Container, error) { + resp, err := http.Get("http://localhost:8888/api/v1/docker/containers") + if err != nil { + return nil, err + } + defer resp.Body.Close() + + var containers []Container + if err := json.NewDecoder(resp.Body).Decode(&containers); err != nil { + return nil, err + } + return containers, nil +} +``` + +### Health Check + +```bash +curl http://localhost:8888/health +``` + +) diff --git a/internal/api/v1/metrics/upime.go b/internal/api/v1/metrics/uptime.go similarity index 100% rename from 
internal/api/v1/metrics/upime.go rename to internal/api/v1/metrics/uptime.go diff --git a/internal/auth/README.md b/internal/auth/README.md new file mode 100644 index 00000000..aa340207 --- /dev/null +++ b/internal/auth/README.md @@ -0,0 +1,349 @@ +# Authentication + +Authentication providers supporting OIDC and username/password authentication with JWT-based sessions. + +## Overview + +The auth package implements authentication middleware and login handlers that integrate with GoDoxy's HTTP routing system. It provides flexible authentication that can be enabled/disabled based on configuration and supports multiple authentication providers. + +### Primary consumers + +- `internal/route/rules` - Authentication middleware for routes +- `internal/api/v1/auth` - Login and session management endpoints +- `internal/homepage` - WebUI login page + +### Non-goals + +- ACL or authorization (see `internal/acl`) +- User management database +- Multi-factor authentication +- Rate limiting (basic OIDC rate limiting only) + +### Stability + +Stable internal package. Public API consists of the `Provider` interface and initialization functions. + +## Public API + +### Exported types + +```go +type Provider interface { + CheckToken(r *http.Request) error + LoginHandler(w http.ResponseWriter, r *http.Request) + PostAuthCallbackHandler(w http.ResponseWriter, r *http.Request) + LogoutHandler(w http.ResponseWriter, r *http.Request) +} +``` + +### OIDC Provider + +```go +type OIDCProvider struct { + oauthConfig *oauth2.Config + oidcProvider *oidc.Provider + oidcVerifier *oidc.IDTokenVerifier + endSessionURL *url.URL + allowedUsers []string + allowedGroups []string + rateLimit *rate.Limiter +} +``` + +### Username/Password Provider + +```go +type UserPassAuth struct { + username string + pwdHash []byte + secret []byte + tokenTTL time.Duration +} +``` + +### Exported functions + +```go +func Initialize() error +``` + +Sets up authentication providers based on environment configuration. 
Returns error if OIDC issuer is configured but cannot be reached.
+
+```go
+func IsEnabled() bool
+```
+
+Returns whether authentication is enabled. Checks `DEBUG_DISABLE_AUTH`, `API_JWT_SECRET`, and `OIDC_ISSUER_URL`.
+
+```go
+func IsOIDCEnabled() bool
+```
+
+Returns whether OIDC authentication is configured.
+
+```go
+func GetDefaultAuth() Provider
+```
+
+Returns the configured authentication provider.
+
+```go
+func AuthCheckHandler(w http.ResponseWriter, r *http.Request)
+```
+
+HTTP handler that checks whether the request carries a valid token. Returns 200 if valid; otherwise invokes the login handler.
+
+```go
+func AuthOrProceed(w http.ResponseWriter, r *http.Request) bool
+```
+
+Authenticates the request. Returns `true` if the request is authenticated; otherwise invokes the login handler and returns `false`.
+
+```go
+func ProceedNext(w http.ResponseWriter, r *http.Request)
+```
+
+Continues to the next handler after successful authentication.
+
+```go
+func NewUserPassAuth(username, password string, secret []byte, tokenTTL time.Duration) (*UserPassAuth, error)
+```
+
+Creates a new username/password auth provider with bcrypt password hashing.
+
+```go
+func NewUserPassAuthFromEnv() (*UserPassAuth, error)
+```
+
+Creates a username/password auth provider from the environment variables `API_USER`, `API_PASSWORD`, and `API_JWT_SECRET`.
+
+```go
+func NewOIDCProvider(issuerURL, clientID, clientSecret string, allowedUsers, allowedGroups []string) (*OIDCProvider, error)
+```
+
+Creates a new OIDC provider. Returns an error if the issuer cannot be reached or no allowed users/groups are configured.
+
+```go
+func NewOIDCProviderFromEnv() (*OIDCProvider, error)
+```
+
+Creates an OIDC provider from the environment variables `OIDC_ISSUER_URL`, `OIDC_CLIENT_ID`, `OIDC_CLIENT_SECRET`, etc.
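Per the Security Considerations below, the userpass provider signs session tokens as HS512 JWTs via `github.com/golang-jwt/jwt/v5`. As a self-contained illustration of what HS512 signing and verification involve (not the package's implementation), a compact JWT can be built with the standard library's HMAC:

```go
package main

import (
	"crypto/hmac"
	"crypto/sha512"
	"encoding/base64"
	"fmt"
	"strings"
)

func b64(b []byte) string { return base64.RawURLEncoding.EncodeToString(b) }

// signHS512 builds a compact JWT:
// base64url(header) + "." + base64url(claims) + "." + base64url(HMAC-SHA512).
func signHS512(claims string, secret []byte) string {
	header := `{"alg":"HS512","typ":"JWT"}`
	signingInput := b64([]byte(header)) + "." + b64([]byte(claims))
	mac := hmac.New(sha512.New, secret)
	mac.Write([]byte(signingInput))
	return signingInput + "." + b64(mac.Sum(nil))
}

// verifyHS512 recomputes the MAC over header.claims and compares in constant time.
func verifyHS512(token string, secret []byte) bool {
	i := strings.LastIndex(token, ".")
	if i < 0 {
		return false
	}
	mac := hmac.New(sha512.New, secret)
	mac.Write([]byte(token[:i]))
	return hmac.Equal([]byte(b64(mac.Sum(nil))), []byte(token[i+1:]))
}

func main() {
	secret := []byte("jwt-secret-key") // illustrative; in GoDoxy this comes from API_JWT_SECRET
	tok := signHS512(`{"sub":"admin"}`, secret)
	fmt.Println(verifyHS512(tok, secret))        // true
	fmt.Println(verifyHS512(tok, []byte("bad"))) // false
}
```

A production implementation additionally validates the `exp` claim against the configured token TTL, which the sketch omits.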
+ +## Architecture + +### Core components + +```mermaid +graph TD + A[HTTP Request] --> B{Auth Enabled?} + B -->|No| C[Proceed Direct] + B -->|Yes| D[Check Token] + D -->|Valid| E[Proceed] + D -->|Invalid| F[Login Handler] + + G[OIDC Provider] --> H[Token Validation] + I[UserPass Provider] --> J[Credential Check] + + F --> K{OIDC Configured?} + K -->|Yes| G + K -->|No| I + + subgraph Cookie Management + L[Token Cookie] + M[State Cookie] + N[Session Cookie] + end +``` + +### OIDC authentication flow + +```mermaid +sequenceDiagram + participant User + participant App + participant IdP + + User->>App: Access Protected Resource + App->>App: Check Token + alt No valid token + App-->>User: Redirect to /auth/ + User->>IdP: Login & Authorize + IdP-->>User: Redirect with Code + User->>App: /auth/callback?code=... + App->>IdP: Exchange Code for Token + IdP-->>App: Access Token + ID Token + App->>App: Validate Token + App->>App: Check allowed users/groups + App-->>User: Protected Resource + else Valid token exists + App-->>User: Protected Resource + end +``` + +### Username/password flow + +```mermaid +sequenceDiagram + participant User + participant App + + User->>App: POST /auth/callback + App->>App: Validate credentials + alt Valid + App->>App: Generate JWT + App-->>User: Set token cookie, redirect to / + else Invalid + App-->>User: 401 Unauthorized + end +``` + +## Configuration Surface + +### Environment variables + +| Variable | Description | +| ------------------------ | ----------------------------------------------------------- | +| `DEBUG_DISABLE_AUTH` | Set to "true" to disable auth for debugging | +| `API_JWT_SECRET` | Secret key for JWT token validation (enables userpass auth) | +| `API_USER` | Username for userpass authentication | +| `API_PASSWORD` | Password for userpass authentication | +| `API_JWT_TOKEN_TTL` | Token TTL duration (default: 24h) | +| `OIDC_ISSUER_URL` | OIDC provider URL (enables OIDC) | +| `OIDC_CLIENT_ID` | OIDC client ID | +| 
`OIDC_CLIENT_SECRET` | OIDC client secret | +| `OIDC_REDIRECT_URL` | OIDC redirect URL | +| `OIDC_ALLOWED_USERS` | Comma-separated list of allowed users | +| `OIDC_ALLOWED_GROUPS` | Comma-separated list of allowed groups | +| `OIDC_SCOPES` | Comma-separated OIDC scopes (default: openid,profile,email) | +| `OIDC_RATE_LIMIT` | Rate limit requests (default: 10) | +| `OIDC_RATE_LIMIT_PERIOD` | Rate limit period (default: 1m) | + +### Hot-reloading + +Authentication configuration requires restart. No dynamic reconfiguration is supported. + +## Dependency and Integration Map + +### Internal dependencies + +- `internal/common` - Environment variable access + +### External dependencies + +- `golang.org/x/crypto/bcrypt` - Password hashing +- `github.com/coreos/go-oidc/v3/oidc` - OIDC protocol +- `golang.org/x/oauth2` - OAuth2/OIDC implementation +- `github.com/golang-jwt/jwt/v5` - JWT token handling +- `golang.org/x/time/rate` - OIDC rate limiting + +### Integration points + +```go +// Route middleware uses AuthOrProceed +routeHandler := func(w http.ResponseWriter, r *http.Request) { + if !auth.AuthOrProceed(w, r) { + return // Auth failed, login handler was invoked + } + // Continue with authenticated request +} +``` + +## Observability + +### Logs + +- OIDC provider initialization errors +- Token validation failures +- Rate limit exceeded events + +### Metrics + +No metrics are currently exposed. 
+ +## Security Considerations + +- JWT tokens use HS512 signing for userpass auth +- OIDC tokens are validated against the issuer +- Session tokens are scoped by client ID to prevent conflicts +- Passwords are hashed with bcrypt (cost 10) +- OIDC rate limiting prevents brute-force attacks +- State parameter prevents CSRF attacks +- Refresh tokens are stored and invalidated on logout + +## Failure Modes and Recovery + +| Failure | Behavior | Recovery | +| ------------------------ | ------------------------------ | ----------------------------- | +| OIDC issuer unreachable | Initialize returns error | Fix network/URL configuration | +| Invalid JWT secret | Initialize uses API_JWT_SECRET | Provide correct secret | +| Token expired | CheckToken returns error | User must re-authenticate | +| User not in allowed list | Returns ErrUserNotAllowed | Add user to allowed list | +| Rate limit exceeded | Returns 429 Too Many Requests | Wait for rate limit reset | + +## Usage Examples + +### Basic setup + +```go +// Initialize authentication during startup +err := auth.Initialize() +if err != nil { + log.Fatal(err) +} + +// Check if auth is enabled +if auth.IsEnabled() { + log.Println("Authentication is enabled") +} + +// Check OIDC status +if auth.IsOIDCEnabled() { + log.Println("OIDC authentication configured") +} +``` + +### Using AuthOrProceed middleware + +```go +func protectedHandler(w http.ResponseWriter, r *http.Request) { + if !auth.AuthOrProceed(w, r) { + return // Auth failed, login handler was invoked + } + // Continue with authenticated request +} +``` + +### Using AuthCheckHandler + +```go +http.HandleFunc("/api/", auth.AuthCheckHandler(apiHandler)) +``` + +### Custom OIDC provider + +```go +provider, err := auth.NewOIDCProvider( + "https://your-idp.com", + "your-client-id", + "your-client-secret", + []string{"user1", "user2"}, + []string{"group1"}, +) +if err != nil { + log.Fatal(err) +} +``` + +### Custom userpass provider + +```go +provider, err := 
auth.NewUserPassAuth( + "admin", + "password123", + []byte("jwt-secret-key"), + 24*time.Hour, +) +if err != nil { + log.Fatal(err) +} +``` diff --git a/internal/autocert/README.md b/internal/autocert/README.md index bc19feb3..c69d3fde 100644 --- a/internal/autocert/README.md +++ b/internal/autocert/README.md @@ -2,42 +2,116 @@ Automated SSL certificate management using the ACME protocol (Let's Encrypt and compatible CAs). -## Architecture Overview +## Overview -``` -┌────────────────────────────────────────────────────────────────────────────┐ -│ GoDoxy Proxy │ -├────────────────────────────────────────────────────────────────────────────┤ -│ ┌──────────────────────┐ ┌─────────────────────────────────────────┐ │ -│ │ Config.State │────▶│ autocert.Provider │ │ -│ │ (config loading) │ │ ┌───────────────────────────────────┐ │ │ -│ └──────────────────────┘ │ │ main Provider │ │ │ -│ │ │ - Primary certificate │ │ │ -│ │ │ - SNI matcher │ │ │ -│ │ │ - Renewal scheduler │ │ │ -│ │ └───────────────────────────────────┘ │ │ -│ │ ┌───────────────────────────────────┐ │ │ -│ │ │ extraProviders[] │ │ │ -│ │ │ - Additional certifictes │ │ │ -│ │ │ - Different domains/A │ │ │ -│ │ └───────────────────────────────────┘ │ │ -│ └─────────────────────────────────────────┘ │ -│ │ │ -│ ▼ │ -│ ┌────────────────────────────────┐ │ -│ │ TLS Handshake │ │ -│ │ GetCert(ClientHelloInf) │ │ -│ └────────────────────────────────┘ │ -└────────────────────────────────────────────────────────────────────────────┘ +### Purpose + +This package provides complete SSL certificate lifecycle management: + +- ACME account registration and management +- Certificate issuance via DNS-01 challenge +- Automatic renewal scheduling (30 days before expiry) +- SNI-based certificate selection for multi-domain setups + +### Primary Consumers + +- `internal/net/gphttp/` - TLS handshake certificate provider +- `internal/api/v1/cert/` - REST API for certificate management +- Configuration loading via 
`internal/config/` + +### Non-goals + +- HTTP-01 challenge support +- Certificate transparency log monitoring +- OCSP stapling +- Private CA support (except via custom CADirURL) + +### Stability + +Internal package with stable public APIs. ACME protocol compliance depends on lego library. + +## Public API + +### Config (`config.go`) + +```go +type Config struct { + Email string // ACME account email + Domains []string // Domains to certify + CertPath string // Output cert path + KeyPath string // Output key path + Extra []ConfigExtra // Additional cert configs + ACMEKeyPath string // ACME account private key + Provider string // DNS provider name + Options map[string]strutils.Redacted // Provider options + Resolvers []string // DNS resolvers + CADirURL string // Custom ACME CA directory + CACerts []string // Custom CA certificates + EABKid string // External Account Binding Key ID + EABHmac string // External Account Binding HMAC +} + +// Merge extra config with main provider +func MergeExtraConfig(mainCfg *Config, extraCfg *ConfigExtra) ConfigExtra ``` -## Certificate Lifecycle +### Provider (`provider.go`) + +```go +type Provider struct { + logger zerolog.Logger + cfg *Config + user *User + legoCfg *lego.Config + client *lego.Client + lastFailure time.Time + legoCert *certificate.Resource + tlsCert *tls.Certificate + certExpiries CertExpiries + extraProviders []*Provider + sniMatcher sniMatcher +} + +// Create new provider (initializes extras atomically) +func NewProvider(cfg *Config, user *User, legoCfg *lego.Config) (*Provider, error) + +// TLS certificate getter for SNI +func (p *Provider) GetCert(hello *tls.ClientHelloInfo) (*tls.Certificate, error) + +// Certificate info for API +func (p *Provider) GetCertInfos() ([]CertInfo, error) + +// Provider name ("main" or "extra[N]") +func (p *Provider) GetName() string + +// Obtain certificate if not exists +func (p *Provider) ObtainCertIfNotExistsAll() error + +// Force immediate renewal +func (p *Provider) 
ForceExpiryAll() bool + +// Schedule automatic renewal +func (p *Provider) ScheduleRenewalAll(parent task.Parent) + +// Print expiry dates +func (p *Provider) PrintCertExpiriesAll() +``` + +### User (`user.go`) + +```go +type User struct { + Email string // Account email + Registration *registration.Resource // ACME registration + Key crypto.PrivateKey // Account key +} +``` + +## Architecture + +### Certificate Lifecycle ```mermaid ---- -config: - theme: redux-dark-color ---- flowchart TD A[Start] --> B[Load Existing Cert] B --> C{Cert Exists?} @@ -70,16 +144,9 @@ flowchart TD T --> V[Update SNI Matcher] V --> G - - style E fill:#22553F,color:#fff - style I fill:#8B8000,color:#fff - style N fill:#22553F,color:#fff - style U fill:#84261A,color:#fff ``` -## SNI Matching Flow - -When a TLS client connects with Server Name Indication (SNI), the proxy needs to select the correct certificate. +### SNI Matching Flow ```mermaid flowchart LR @@ -96,183 +163,48 @@ flowchart LR F -->|Yes| D F -->|No| G[Return default cert] end - - style C fill:#27632A,color:#fff - style E fill:#18597A,color:#fff - style F fill:#836C03,color:#fff ``` ### Suffix Tree Structure -The `sniMatcher` uses an optimized suffix tree for efficient wildcard matching: - ``` Certificate: *.example.com, example.com, *.api.example.com exact: - "example.com" → Provider_A + "example.com" -> Provider_A root: └── "com" └── "example" - ├── "*" → Provider_A [wildcard at *.example.com] + ├── "*" -> Provider_A [wildcard at *.example.com] └── "api" - └── "*" → Provider_B [wildcard at *.api.example.com] + └── "*" -> Provider_B [wildcard at *.api.example.com] ``` -## Key Components +## Configuration Surface -### Config +### Provider Types -Configuration for certificate management, loaded from `config/autocert.yml`. 
+| Type | Description | Use Case | +| -------------- | ---------------------------- | ------------------------- | +| `local` | No ACME, use existing cert | Pre-existing certificates | +| `pseudo` | Mock provider for testing | Development | +| ACME providers | Let's Encrypt, ZeroSSL, etc. | Production | -```go -type Config struct { - Email string // ACME account email - Domains []string // Domains to certifiy - CertPath string // Output cert path - KeyPath string // Output key path - Extra []ConfigExtra // Additional cert configs - ACMEKeyPath string // ACME account private key (shared by all extras) - Provider string // DNS provider name - Options map[string]strutils.Redacted // Provider-specific options - Resolvers []string // DNS resolvers for DNS-01 - CADirURL string // Custom ACME CA directory - CACerts []string // Custom CA certificates - EABKid string // External Account Binding Key ID - EABHmac string // External Account Binding HMAC +### Supported DNS Providers - idx int // 0: main, 1+: extra[i] -} +| Provider | Name | Required Options | +| ------------ | -------------- | ----------------------------------- | +| Cloudflare | `cloudflare` | `CF_API_TOKEN` | +| Route 53 | `route53` | AWS credentials | +| DigitalOcean | `digitalocean` | `DO_API_TOKEN` | +| GoDaddy | `godaddy` | `GD_API_KEY`, `GD_API_SECRET` | +| OVH | `ovh` | `OVH_ENDPOINT`, `OVH_APP_KEY`, etc. | +| CloudDNS | `clouddns` | GCP credentials | +| AzureDNS | `azuredns` | Azure credentials | +| DuckDNS | `duckdns` | `DUCKDNS_TOKEN` | -type ConfigExtra Config -``` - -**Extra Provider Merging:** Extra configurations are merged with the main config using `MergeExtraConfig()`, inheriting most settings from the main provider while allowing per-certificate overrides for `Provider`, `Email`, `Domains`, `Options`, `Resolvers`, `CADirURL`, `CACerts`, `EABKid`, `EABHmac`, and `HTTPClient`. The `ACMEKeyPath` is shared across all providers. 
- -**Validation:** - -- Extra configs must have unique `cert_path` and `key_path` values (no duplicates across main or any extra provider) - -### ConfigExtra - -Extra certificate configuration type. Uses `MergeExtraConfig()` to inherit settings from the main provider: - -```go -func MergeExtraConfig(mainCfg *Config, extraCfg *ConfigExtra) ConfigExtra -``` - -Fields that can be overridden per extra provider: - -- `Provider` - DNS provider name -- `Email` - ACME account email -- `Domains` - Certificate domains -- `Options` - Provider-specific options -- `Resolvers` - DNS resolvers -- `CADirURL` - Custom ACME CA directory -- `CACerts` - Custom CA certificates -- `EABKid` / `EABHmac` - External Account Binding credentials -- `HTTPClient` - Custom HTTP client - -Fields inherited from main config (shared): - -- `ACMEKeyPath` - ACME account private key (same for all) - -**Provider Types:** - -- `local` - No ACME, use existing certificate (default) -- `pseudo` - Mock provider for testing -- `custom` - Custom ACME CA with `CADirURL` - -### Provider - -Main certificate management struct that handles: - -- Certificate issuance and renewal -- Loading certificates from disk -- SNI-based certificate selection -- Renewal scheduling - -```go -type Provider struct { - logger zerolog.Logger // Provider-scoped logger - - cfg *Config // Configuration - user *User // ACME account - legoCfg *lego.Config // LEGO client config - client *lego.Client // ACME client - lastFailure time.Time // Last renewal failure - legoCert *certificate.Resource // Cached cert resource - tlsCert *tls.Certificate // Parsed TLS certificate - certExpiries CertExpiries // Domain → expiry map - extraProviders []*Provider // Additional certificates - sniMatcher sniMatcher // SNI → Provider mapping - forceRenewalCh chan struct{} // Force renewal trigger channel - scheduleRenewalOnce sync.Once // Prevents duplicate renewal scheduling -} -``` - -**Logging:** Each provider has a scoped logger with provider name 
("main" or "extra[N]") for consistent log context. - -**Key Methods:** - -- `NewProvider(cfg *Config, user *User, legoCfg *lego.Config) (*Provider, error)` - Creates provider and initializes extra providers atomically -- `GetCert(hello *tls.ClientHelloInfo)` - Returns certificate for TLS handshake -- `GetName()` - Returns provider name ("main" or "extra[N]") -- `ObtainCert()` - Obtains new certificate via ACME -- `ObtainCertAll()` - Renews/obtains certificates for main and all extra providers -- `ObtainCertIfNotExistsAll()` - Obtains certificates only if they don't exist on disk -- `ForceExpiryAll()` - Triggers forced certificate renewal for main and all extra providers -- `ScheduleRenewalAll(parent task.Parent)` - Schedules automatic renewal for all providers -- `PrintCertExpiriesAll()` - Logs certificate expiry dates for all providers - -### User - -ACME account representation implementing lego's `acme.User` interface. - -```go -type User struct { - Email string // Account email - Registration *registration.Resource // ACME registration - Key crypto.PrivateKey // Account key -} -``` - -### sniMatcher - -Efficient SNI-to-Provider lookup with exact and wildcard matching. 
- -```go -type sniMatcher struct { - exact map[string]*Provider // Exact domain matches - root sniTreeNode // Wildcard suffix tree -} - -type sniTreeNode struct { - children map[string]*sniTreeNode // DNS label → child node - wildcard *Provider // Wildcard match at this level -} -``` - -## DNS Providers - -Supported DNS providers for DNS-01 challenge validation: - -| Provider | Name | Description | -| ------------ | -------------- | ---------------------------------------- | -| Cloudflare | `cloudflare` | Cloudflare DNS | -| Route 53 | `route53` | AWS Route 53 | -| DigitalOcean | `digitalocean` | DigitalOcean DNS | -| GoDaddy | `godaddy` | GoDaddy DNS | -| OVH | `ovh` | OVHcloud DNS | -| CloudDNS | `clouddns` | Google Cloud DNS | -| AzureDNS | `azuredns` | Azure DNS | -| DuckDNS | `duckdns` | DuckDNS | -| and more... | | See `internal/dnsproviders/providers.go` | - -### Provider Configuration - -Each provider accepts configuration via the `options` map: +### Example Configuration ```yaml autocert: @@ -280,53 +212,14 @@ autocert: email: admin@example.com domains: - example.com - - '*.example.com' + - "*.example.com" options: - CF_API_TOKEN: your-api-token - CF_ZONE_API_TOKEN: your-zone-token + auth_token: ${CF_API_TOKEN} resolvers: - 1.1.1.1:53 ``` -## ACME Integration - -### Account Registration - -```mermaid -flowchart TD - A[Load or Generate ACME Key] --> B[Init LEGO Client] - B --> C[Resolve Account by Key] - C --> D{Account Exists?} - D -->|Yes| E[Continue with existing] - D -->|No| F{Has EAB?} - F -->|Yes| G[Register with EAB] - F -->|No| H[Register with TOS Agreement] - G --> I[Save Registration] - H --> I -``` - -### DNS-01 Challenge - -```mermaid -sequenceDiagram - participant C as ACME CA - participant P as GoDoxy - participant D as DNS Provider - - P->>C: Request certificate for domain - C->>P: Present DNS-01 challenge - P->>D: Create TXT record _acme-challenge.domain - D-->>P: Record created - P->>C: Challenge ready - C->>D: Verify DNS TXT record - 
D-->>C: Verification success - C->>P: Issue certificate - P->>D: Clean up TXT record -``` - -## Multi-Certificate Support - -The package supports multiple certificates through the `extra` configuration: +### Extra Providers ```yaml autocert: @@ -334,212 +227,81 @@ autocert: email: admin@example.com domains: - example.com - - '*.example.com' + - "*.example.com" cert_path: certs/example.com.crt key_path: certs/example.com.key + options: + auth_token: ${CF_API_TOKEN} extra: - domains: - api.example.com - - '*.api.example.com' + - "*.api.example.com" cert_path: certs/api.example.com.crt key_path: certs/api.example.com.key - provider: cloudflare - email: admin@api.example.com ``` -### Extra Provider Setup +## Dependency and Integration Map -Extra providers are initialized atomically within `NewProvider()`: +### External Dependencies -```mermaid -flowchart TD - A[NewProvider] --> B{Merge Config with Extra} - B --> C[Create Provider per Extra] - C --> D[Build SNI Matcher] - D --> E[Register in SNI Tree] +- `github.com/go-acme/lego/v4` - ACME protocol implementation +- `github.com/rs/zerolog` - Structured logging - style B fill:#1a2639,color:#fff - style C fill:#423300,color:#fff -``` +### Internal Dependencies -## Renewal Scheduling +- `internal/task/task.go` - Lifetime management +- `internal/notif/` - Renewal notifications +- `internal/config/` - Configuration loading +- `internal/dnsproviders/` - DNS provider implementations -### Renewal Timing +## Observability -- **Initial Check**: Certificate expiry is checked at startup -- **Renewal Window**: Renewal scheduled for 1 month before expiry -- **Cooldown on Failure**: 1-hour cooldown after failed renewal -- **Request Cooldown**: 15-second cooldown after startup (prevents rate limiting) -- **Force Renewal**: `forceRenewalCh` channel allows triggering immediate renewal +### Logs -### Force Renewal +| Level | When | +| ------- | ----------------------------- | +| `Info` | Certificate obtained/renewed | +| `Info` | 
Registration reused | +| `Warn` | Renewal failure | +| `Error` | Certificate retrieval failure | -The `forceRenewalCh` channel (buffered size 1) enables immediate certificate renewal on demand: +### Notifications -```go -// Trigger forced renewal for main and all extra providers -provider.ForceExpiryAll() -``` +- Certificate renewal success/failure +- Service startup with expiry dates -```mermaid -flowchart TD - A[Start] --> B[Calculate Renewal Time] - B --> C[expiry - 30 days] - C --> D[Start Timer] +## Security Considerations - D --> E{Event?} - E -->|forceRenewalCh| F[Force Renewal] - E -->|Timer| G[Check Failure Cooldown] - E -->|Context Done| H[Exit] +- Account private key stored at `certs/acme.key` (mode 0600) +- Certificate private keys stored at configured paths (mode 0600) +- Certificate files world-readable (mode 0644) +- ACME account email used for Let's Encrypt ToS +- EAB credentials for zero-touch enrollment - G --> H1{Recently Failed?} - H1 -->|Yes| I[Skip, Wait Next Event] - H1 -->|No| J[Attempt Renewal] +## Failure Modes and Recovery - J --> K{Renewal Success?} - K -->|Yes| L[Reset Failure, Notify Success] - K -->|No| M[Update Failure Time, Notify Failure] - - L --> N[Reset Timer] - I --> N - M --> D - - N --> D - - style F fill:#423300,color:#fff - style J fill:#423300,color:#fff - style K fill:#174014,color:#fff - style M fill:#432829,color:#fff -``` - -**Notifications:** Renewal success/failure triggers system notifications with provider name. 
- -### CertState - -Certificate state tracking: - -```go -const ( - CertStateValid // Certificate is valid and up-to-date - CertStateExpired // Certificate has expired or needs renewal - CertStateMismatch // Certificate domains don't match config -) -``` - -### RenewMode - -Controls renewal behavior: - -```go -const ( - renewModeForce // Force renewal, bypass cooldown and state check - renewModeIfNeeded // Renew only if expired or domain mismatch -) -``` - -## File Structure - -``` -internal/autocert/ -├── README.md # This file -├── config.go # Config struct and validation -├── provider.go # Provider implementation -├── setup.go # Extra provider setup -├── sni_matcher.go # SNI matching logic -├── providers.go # DNS provider registration -├── state.go # Certificate state enum -├── user.go # ACME user/account -├── paths.go # Default paths -└── types/ - └── provider.go # Provider interface -``` - -## Default Paths - -| Constant | Default Value | Description | -| -------------------- | ---------------- | ------------------------ | -| `CertFileDefault` | `certs/cert.crt` | Default certificate path | -| `KeyFileDefault` | `certs/priv.key` | Default private key path | -| `ACMEKeyFileDefault` | `certs/acme.key` | Default ACME account key | - -Failure tracking file is generated per-certificate: `/.last_failure-` - -## Error Handling - -The package uses structured error handling with `gperr`: - -- **ErrMissingField** - Required configuration field missing -- **ErrDuplicatedPath** - Duplicate certificate/key paths in extras -- **ErrInvalidDomain** - Invalid domain format -- **ErrUnknownProvider** - Unknown DNS provider -- **ErrGetCertFailure** - Certificate retrieval failed - -**Error Context:** All errors are prefixed with provider name ("main" or "extra[N]") via `fmtError()` for clear attribution. 
+| Failure Mode | Impact | Recovery | +| ------------------------------ | -------------------------- | ----------------------------- | +| DNS-01 challenge timeout | Certificate issuance fails | Check DNS provider API | +| Rate limiting (too many certs) | 1-hour cooldown | Wait or use different account | +| DNS provider API error | Renewal fails | 1-hour cooldown, retry | +| Certificate domains mismatch | Must re-obtain | Force renewal via API | +| Account key corrupted | Must register new account | New key, may lose certs | ### Failure Tracking -Last failure is persisted per-certificate to prevent rate limiting: +Last failure persisted per-certificate to prevent rate limiting: + +``` +File: /.last_failure- +Where hash = SHA256(certPath|keyPath)[:6] +``` + +## Usage Examples + +### Initial Setup ```go -// File: /.last_failure- where hash is SHA256(certPath|keyPath)[:6] -``` - -**Cooldown Checks:** Last failure is checked in `obtainCertIfNotExists()` (15-second startup cooldown) and `renew()` (1-hour failure cooldown). The `renewModeForce` bypasses cooldown checks entirely. - -## Integration with GoDoxy - -The autocert package integrates with GoDoxy's configuration system: - -```mermaid -flowchart LR - subgraph Config - direction TB - A[config.yml] --> B[Parse Config] - B --> C[AutoCert Config] - end - - subgraph State - C --> D[NewProvider] - D --> E[Schedule Renewal] - E --> F[Set Active Provider] - end - - subgraph Server - F --> G[TLS Handshake] - G --> H[GetCert via SNI] - H --> I[Return Certificate] - end -``` - -### REST API - -Force certificate renewal via WebSocket endpoint: - -| Endpoint | Method | Description | -| -------------------- | ------ | ----------------------------------------- | -| `/api/v1/cert/renew` | GET | Triggers `ForceExpiryAll()` via WebSocket | - -The endpoint streams live logs during the renewal process. 
- -## Usage Example - -```yaml -# config/config.yml -autocert: - provider: cloudflare - email: admin@example.com - domains: - - example.com - - '*.example.com' - options: - CF_API_TOKEN: ${CF_API_TOKEN} - resolvers: - - 1.1.1.1:53 - - 8.8.8.8:53 -``` - -```go -// In config initialization autocertCfg := state.AutoCert user, legoCfg, err := autocertCfg.GetLegoConfig() if err != nil { @@ -558,3 +320,21 @@ if err := provider.ObtainCertIfNotExistsAll(); err != nil { provider.ScheduleRenewalAll(state.Task()) provider.PrintCertExpiriesAll() ``` + +### Force Renewal via API + +```go +// WebSocket endpoint: GET /api/v1/cert/renew +if provider.ForceExpiryAll() { + // Wait for renewal to complete + provider.WaitRenewalDone(ctx) +} +``` + +## Testing Notes + +- `config_test.go` - Configuration validation +- `provider_test/` - Provider functionality tests +- `sni_test.go` - SNI matching tests +- `multi_cert_test.go` - Extra provider tests +- Integration tests require mock DNS provider diff --git a/internal/config/README.md b/internal/config/README.md new file mode 100644 index 00000000..5ad73378 --- /dev/null +++ b/internal/config/README.md @@ -0,0 +1,316 @@ +# Configuration Management + +Centralized YAML configuration management with thread-safe state access and provider initialization. + +## Overview + +The config package implements the core configuration management system for GoDoxy, handling YAML configuration loading, provider initialization, route loading, and state transitions. It uses atomic pointers for thread-safe state access and integrates all configuration components. 
+ +### Primary consumers + +- `cmd/main.go` - Initializes configuration state on startup +- `internal/route/provider` - Accesses configuration for route creation +- `internal/api/v1` - Exposes configuration via REST API +- All packages that need to access active configuration + +### Non-goals + +- Dynamic provider registration after initialization (require config reload) + +### Stability + +Stable internal package. Public API consists of `State` interface and state management functions. + +## Public API + +### Exported types + +```go +type Config struct { + ACL *acl.Config + AutoCert *autocert.Config + Entrypoint entrypoint.Config + Providers Providers + MatchDomains []string + Homepage homepage.Config + Defaults Defaults + TimeoutShutdown int +} + +type Providers struct { + Files []string + Docker map[string]types.DockerProviderConfig + Agents []*agent.AgentConfig + Notification []*notif.NotificationConfig + Proxmox []proxmox.Config + MaxMind *maxmind.Config +} +``` + +### State interface + +```go +type State interface { + Task() *task.Task + Context() context.Context + Value() *Config + EntrypointHandler() http.Handler + ShortLinkMatcher() config.ShortLinkMatcher + AutoCertProvider() server.CertProvider + LoadOrStoreProvider(key string, value types.RouteProvider) (actual types.RouteProvider, loaded bool) + DeleteProvider(key string) + IterProviders() iter.Seq2[string, types.RouteProvider] + StartProviders() error + NumProviders() int +} +``` + +### Exported functions + +```go +func NewState() config.State +``` + +Creates a new configuration state with empty providers map. + +```go +func GetState() config.State +``` + +Returns the active configuration state. Thread-safe via atomic load. + +```go +func SetState(state config.State) +``` + +Sets the active configuration state. Also updates active configs for ACL, entrypoint, homepage, and autocert. + +```go +func HasState() bool +``` + +Returns true if a state is currently active. 
+ +```go +func Value() *config.Config +``` + +Returns the current configuration values. + +```go +func (state *state) InitFromFile(filename string) error +``` + +Initializes state from a YAML file. Uses default config if file doesn't exist. + +```go +func (state *state) Init(data []byte) error +``` + +Initializes state from raw YAML data. Validates, then initializes MaxMind, Proxmox, providers, AutoCert, notifications, access logger, and entrypoint. + +```go +func (state *state) StartProviders() error +``` + +Starts all route providers concurrently. + +```go +func (state *state) IterProviders() iter.Seq2[string, types.RouteProvider] +``` + +Returns an iterator over all providers. + +## Architecture + +### Core components + +```mermaid +graph TD + A[config.yml] --> B[State] + B --> C{Initialize} + C --> D[Validate YAML] + C --> E[Init MaxMind] + C --> F[Init Proxmox] + C --> G[Load Route Providers] + C --> H[Init AutoCert] + C --> I[Init Notifications] + C --> J[Init Entrypoint] + + K[ActiveConfig] -.-> B + + subgraph Providers + G --> L[Docker Provider] + G --> M[File Provider] + G --> N[Agent Provider] + end + + subgraph State Management + B --> O[xsync.Map Providers] + B --> P[Entrypoint] + B --> Q[AutoCert Provider] + B --> R[task.Task] + end +``` + +### Initialization pipeline + +```mermaid +sequenceDiagram + participant YAML + participant State + participant MaxMind + participant Proxmox + participant Providers + participant AutoCert + participant Notif + participant Entrypoint + + YAML->>State: Parse & Validate + par Initialize in parallel + State->>MaxMind: Initialize + State->>Proxmox: Initialize + and + State->>Providers: Load Route Providers + Providers->>State: Store Providers + end + State->>AutoCert: Initialize + State->>Notif: Initialize + State->>Entrypoint: Configure + State->>State: Start Providers +``` + +### Thread safety model + +```go +var stateMu sync.RWMutex + +func GetState() config.State { + return config.ActiveState.Load() +} + +func 
SetState(state config.State) { + stateMu.Lock() + defer stateMu.Unlock() + config.ActiveState.Store(state) +} +``` + +Uses `sync.RWMutex` for write synchronization and `sync/atomic` for read operations. + +## Configuration Surface + +### Config sources + +Configuration is loaded from `config/config.yml`. + +### Hot-reloading + +Configuration supports hot-reloading via editing `config/config.yml`. + +## Dependency and Integration Map + +### Internal dependencies + +- `internal/acl` - Access control configuration +- `internal/autocert` - SSL certificate management +- `internal/entrypoint` - HTTP entrypoint setup +- `internal/route/provider` - Route providers (Docker, file, agent) +- `internal/maxmind` - GeoIP configuration +- `internal/notif` - Notification providers +- `internal/proxmox` - LXC container management +- `internal/homepage/types` - Dashboard configuration +- `github.com/yusing/goutils/task` - Object lifecycle management + +### External dependencies + +- `github.com/goccy/go-yaml` - YAML parsing +- `github.com/puzpuzpuz/xsync/v4` - Concurrent map + +### Integration points + +```go +// API uses config/query to access state +providers := statequery.RouteProviderList() + +// Route providers access config state +for _, p := range config.GetState().IterProviders() { + // Process provider +} +``` + +## Observability + +### Logs + +- Configuration parsing and validation errors +- Provider initialization results +- Route loading summary +- Full configuration dump (at debug level) + +### Metrics + +No metrics are currently exposed. 
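The lock-free read / serialized write pattern from the thread-safety model above can be reduced to a runnable sketch. Types and names here are simplified stand-ins, not the package's actual code:

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

type State struct{ Version int }

var (
	active  atomic.Pointer[State]
	writeMu sync.Mutex
)

// GetState is lock-free: readers only pay an atomic load.
func GetState() *State { return active.Load() }

// SetState serializes writers; the swap itself is atomic,
// so readers never observe a partially updated state.
func SetState(s *State) {
	writeMu.Lock()
	defer writeMu.Unlock()
	active.Store(s)
}

func HasState() bool { return active.Load() != nil }

func main() {
	fmt.Println(HasState()) // false: nothing stored yet
	SetState(&State{Version: 1})
	fmt.Println(GetState().Version) // 1
}
```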
+ +## Security Considerations + +- Configuration file permissions should be restricted (contains secrets) +- TLS certificates are loaded from files specified in config +- Agent credentials are passed via configuration +- No secrets are logged (except in debug mode with full config dump) + +## Failure Modes and Recovery + +| Failure | Behavior | Recovery | +| ----------------------------- | ----------------------------------- | -------------------------- | +| Invalid YAML | Init returns error | Fix YAML syntax | +| Missing required fields | Validation fails | Add required fields | +| Provider initialization fails | Error aggregated and returned | Fix provider configuration | +| Duplicate provider key | Error logged, first provider kept | Rename provider | +| Route loading fails | Error aggregated, other routes load | Fix route configuration | + +## Performance Characteristics + +- Providers are loaded concurrently +- Routes are loaded concurrently per provider +- State access is lock-free for reads +- Atomic pointer for state swap + +## Usage Examples + +### Loading configuration + +```go +state := config.NewState() +err := state.InitFromFile("config.yml") +if err != nil { + log.Fatal(err) +} + +config.SetState(state) +``` + +### Accessing configuration + +```go +if config.HasState() { + cfg := config.Value() + log.Printf("Entrypoint middleware count: %d", len(cfg.Entrypoint.Middlewares)) + log.Printf("Docker providers: %d", len(cfg.Providers.Docker)) +} +``` + +### Iterating providers + +```go +for name, provider := range config.GetState().IterProviders() { + log.Printf("Provider: %s, Routes: %d", name, provider.NumRoutes()) +} +``` + +### Accessing entrypoint handler + +```go +state := config.GetState() +http.Handle("/", state.EntrypointHandler()) +``` diff --git a/internal/config/query/README.md b/internal/config/query/README.md new file mode 100644 index 00000000..52427323 --- /dev/null +++ b/internal/config/query/README.md @@ -0,0 +1,226 @@ +# Configuration 
Query + +Read-only access to the active configuration state, including route providers and system statistics. + +## Overview + +The `internal/config/query` package offers read-only access to the active configuration state. It provides functions to dump route providers, list providers, search for routes, and retrieve system statistics. This package is primarily used by the API layer to expose configuration information. + +### Primary consumers + +- `internal/api/v1` - REST API endpoints for configuration queries +- `internal/homepage` - Dashboard statistics display +- Operators - CLI tools and debugging interfaces + +### Non-goals + +- Configuration modification (see `internal/config`) +- Provider lifecycle management +- Dynamic state updates + +### Stability + +Stable internal package. Functions are simple read-only accessors. + +## Public API + +### Exported types + +```go +type RouteProviderListResponse struct { + ShortName string `json:"short_name"` + FullName string `json:"full_name"` +} +``` + +```go +type Statistics struct { + Total uint16 `json:"total"` + ReverseProxies types.RouteStats `json:"reverse_proxies"` + Streams types.RouteStats `json:"streams"` + Providers map[string]types.ProviderStats `json:"providers"` +} +``` + +### Exported functions + +```go +func DumpRouteProviders() map[string]types.RouteProvider +``` + +Returns all route providers as a map keyed by their short name. Thread-safe access via `config.ActiveState.Load()`. + +```go +func RouteProviderList() []RouteProviderListResponse +``` + +Returns a list of route providers with their short and full names. Useful for API responses. + +```go +func SearchRoute(alias string) types.Route +``` + +Searches for a route by alias across all providers. Returns `nil` if not found. + +```go +func GetStatistics() Statistics +``` + +Aggregates statistics from all route providers, including total routes, reverse proxies, streams, and per-provider stats. 
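`SearchRoute` scans each provider in turn and returns `nil` when no provider knows the alias. A minimal sketch of that lookup with simplified stand-in types (the real `types.Route` and provider interfaces carry more fields):

```go
package main

import "fmt"

type Route struct{ Alias string }

type Provider struct{ routes map[string]*Route }

func (p *Provider) GetRoute(alias string) *Route { return p.routes[alias] }

// searchRoute mirrors SearchRoute: the first provider that knows the
// alias wins; nil means "not found" and callers must check for it.
func searchRoute(providers map[string]*Provider, alias string) *Route {
	for _, p := range providers {
		if r := p.GetRoute(alias); r != nil {
			return r
		}
	}
	return nil
}

func main() {
	providers := map[string]*Provider{
		"docker": {routes: map[string]*Route{"my-service": {Alias: "my-service"}}},
	}
	fmt.Println(searchRoute(providers, "my-service").Alias) // my-service
	fmt.Println(searchRoute(providers, "missing") == nil)   // true
}
```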
+ +## Architecture + +### Core components + +``` +config/query/ +├── query.go # Provider and route queries +└── stats.go # Statistics aggregation +``` + +### Data flow + +```mermaid +graph TD + A[API Request] --> B[config/query Functions] + B --> C{Query Type} + C -->|Provider List| D[ActiveState.Load] + C -->|Route Search| E[Iterate Providers] + C -->|Statistics| F[Aggregate from All Providers] + D --> G[Return Provider Data] + E --> H[Return Found Route or nil] + F --> I[Return Statistics] +``` + +### Thread safety model + +All functions use `config.ActiveState.Load()` for thread-safe read access: + +```go +func DumpRouteProviders() map[string]types.RouteProvider { + state := config.ActiveState.Load() + entries := make(map[string]types.RouteProvider, state.NumProviders()) + for _, p := range state.IterProviders() { + entries[p.ShortName()] = p + } + return entries +} +``` + +## Configuration Surface + +No configuration. This package only reads from the active state. + +## Dependency and Integration Map + +### Internal dependencies + +- `internal/config/types` - `ActiveState` atomic pointer and `State` interface +- `internal/types` - Route provider and route types + +### Integration points + +```go +// API endpoint uses query functions +func ListProviders(w http.ResponseWriter, r *http.Request) { + providers := statequery.RouteProviderList() + json.NewEncoder(w).Encode(providers) +} +``` + +## Observability + +### Logs + +No logging in the query package itself. + +### Metrics + +No metrics are currently exposed. 
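The statistics aggregation above is a single pass over the providers, summing per-provider counters into one total. A simplified sketch (field names are flattened here; the real `Statistics` splits reverse proxies and streams into `types.RouteStats`):

```go
package main

import "fmt"

type ProviderStats struct{ RPs, Streams int }

type Statistics struct {
	Total     int
	Providers map[string]ProviderStats
}

// getStatistics sums reverse proxies and streams across all providers,
// mirroring the O(n) aggregation performed by GetStatistics.
func getStatistics(providers map[string]ProviderStats) Statistics {
	stats := Statistics{Providers: providers}
	for _, p := range providers {
		stats.Total += p.RPs + p.Streams
	}
	return stats
}

func main() {
	stats := getStatistics(map[string]ProviderStats{
		"docker": {RPs: 5, Streams: 2},
		"file":   {RPs: 3, Streams: 0},
	})
	fmt.Println(stats.Total) // 10
}
```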
+ +## Security Considerations + +- Read-only access prevents state corruption +- No sensitive data is exposed beyond what the configuration already contains +- Caller should handle nil state gracefully + +## Failure Modes and Recovery + +| Failure | Behavior | Recovery | +| -------------------- | -------------------------- | ------------------------------ | +| No active state | Functions return empty/nil | Initialize config first | +| Provider returns nil | Skipped in iteration | Provider should not return nil | +| Route not found | Returns nil | Expected behavior | + +## Performance Characteristics + +- O(n) where n is number of providers for provider queries +- O(n * m) where m is routes per provider for route search +- O(n) for statistics aggregation +- No locking required (uses atomic load) + +## Usage Examples + +### Listing all providers + +```go +providers := statequery.RouteProviderList() +for _, p := range providers { + fmt.Printf("Short: %s, Full: %s\n", p.ShortName, p.FullName) +} +``` + +### Getting all providers as a map + +```go +providers := statequery.DumpRouteProviders() +for shortName, provider := range providers { + fmt.Printf("%s: %s\n", shortName, provider.String()) +} +``` + +### Searching for a route + +```go +route := statequery.SearchRoute("my-service") +if route != nil { + fmt.Printf("Found route: %s\n", route.Alias()) +} +``` + +### Getting system statistics + +```go +stats := statequery.GetStatistics() +fmt.Printf("Total routes: %d\n", stats.Total) +fmt.Printf("Reverse proxies: %d\n", stats.ReverseProxies.Total) +for name, providerStats := range stats.Providers { + fmt.Printf("Provider %s: %d routes\n", name, providerStats.RPs.Total) +} +``` + +### Integration with API + +```go +func handleGetProviders(w http.ResponseWriter, r *http.Request) { + providers := statequery.RouteProviderList() + w.Header().Set("Content-Type", "application/json") + json.NewEncoder(w).Encode(providers) +} + +func handleGetStats(w http.ResponseWriter, r 
*http.Request) { + stats := statequery.GetStatistics() + w.Header().Set("Content-Type", "application/json") + json.NewEncoder(w).Encode(stats) +} + +func handleFindRoute(w http.ResponseWriter, r *http.Request) { + alias := r.URL.Query().Get("alias") + route := statequery.SearchRoute(alias) + if route == nil { + http.NotFound(w, r) + return + } + json.NewEncoder(w).Encode(route) +} +``` diff --git a/internal/dnsproviders/README.md b/internal/dnsproviders/README.md new file mode 100644 index 00000000..0d79a718 --- /dev/null +++ b/internal/dnsproviders/README.md @@ -0,0 +1,257 @@ +# DNS Providers + +DNS provider integrations for Let's Encrypt certificate management via the lego library. + +## Overview + +The dnsproviders package registers and initializes DNS providers supported by the ACME protocol implementation (lego). It provides a unified interface for configuring DNS-01 challenge providers for SSL certificate issuance. + +### Primary consumers + +- `internal/autocert` - Uses registered providers for certificate issuance +- Operators - Configure DNS providers via YAML + +### Non-goals + +- DNS zone management +- Record creation/deletion outside ACME challenges +- Provider-specific features beyond DNS-01 + +### Stability + +Stable internal package. Provider registry is extensible. + +## Public API + +### Exported constants + +```go +const ( + Local = "local" // Dummy local provider for static certificates + Pseudo = "pseudo" // Pseudo provider for testing +) +``` + +### Exported functions + +```go +func InitProviders() +``` + +Registers all available DNS providers with the autocert package. Called during initialization. + +```go +func NewDummyDefaultConfig() *Config +``` + +Creates a dummy default config for testing providers. + +```go +func NewDummyDNSProviderConfig() map[string]any +``` + +Creates a dummy provider configuration for testing. 
+
+## Architecture
+
+### Core components
+
+```mermaid
+graph TD
+    A[AutoCert] --> B[DNS Provider Registry]
+    B --> C[Provider Factory]
+    C --> D[Lego DNS Provider]
+
+    subgraph Supported Providers
+        E[Cloudflare]
+        F[AWS Route53]
+        G[DigitalOcean]
+        H[Google Cloud DNS]
+        I[And 20+ more...]
+    end
+
+    B --> E
+    B --> F
+    B --> G
+    B --> H
+    B --> I
+```
+
+### Supported providers
+
+| Provider       | Key             | Description           |
+| -------------- | --------------- | --------------------- |
+| ACME DNS       | `acmedns`       | ACME DNS server       |
+| Azure DNS      | `azuredns`      | Microsoft Azure DNS   |
+| Cloudflare     | `cloudflare`    | Cloudflare DNS        |
+| CloudNS        | `cloudns`       | ClouDNS               |
+| CloudDNS       | `clouddns`      | vshosting CloudDNS    |
+| DigitalOcean   | `digitalocean`  | DigitalOcean DNS      |
+| DuckDNS        | `duckdns`       | DuckDNS               |
+| EdgeDNS        | `edgedns`       | Akamai EdgeDNS        |
+| GoDaddy        | `godaddy`       | GoDaddy DNS           |
+| Google Domains | `googledomains` | Google Domains DNS    |
+| Hetzner        | `hetzner`       | Hetzner DNS           |
+| Hostinger      | `hostinger`     | Hostinger DNS         |
+| HTTP Request   | `httpreq`       | Generic HTTP provider |
+| INWX           | `inwx`          | INWX DNS              |
+| IONOS          | `ionos`         | IONOS DNS             |
+| Linode         | `linode`        | Linode DNS            |
+| Namecheap      | `namecheap`     | Namecheap DNS         |
+| Netcup         | `netcup`        | netcup DNS            |
+| Netlify        | `netlify`       | Netlify DNS           |
+| OVH            | `ovh`           | OVHcloud DNS          |
+| Oracle Cloud   | `oraclecloud`   | Oracle Cloud DNS      |
+| Porkbun        | `porkbun`       | Porkbun DNS           |
+| RFC 2136       | `rfc2136`       | BIND/named (RFC 2136) |
+| Scaleway       | `scaleway`      | Scaleway DNS          |
+| SpaceShip      | `spaceship`     | SpaceShip DNS         |
+| Timeweb Cloud  | `timewebcloud`  | Timeweb Cloud DNS     |
+| Vercel         | `vercel`        | Vercel DNS            |
+| Vultr          | `vultr`         | Vultr DNS             |
+| Google Cloud   | `gcloud`        | Google Cloud DNS      |
+
+## Configuration Surface
+
+### Config sources
+
+Configuration is loaded from `config/config.yml` under the `autocert` key.
+ +### Schema + +```yaml +autocert: + provider: cloudflare + email: admin@example.com + domains: + - example.com + - "*.example.com" + options: # provider-specific options + auth_token: your-api-token +``` + +### Hot-reloading + +Supports hot-reloading via editing `config/config.yml`. + +## Dependency and Integration Map + +### Internal dependencies + +- `internal/autocert` - Provider registry and certificate issuance + +### External dependencies + +- `github.com/go-acme/lego/v4/providers/dns/*` - All lego DNS providers + +### Integration points + +```go +// In autocert package +var Providers = map[string]DNSProvider{ + "local": dnsproviders.NewDummyDefaultConfig, + "pseudo": dnsproviders.NewDummyDefaultConfig, + // ... registered providers +} + +type DNSProvider func(*any, ...any) (provider.Config, error) +``` + +## Observability + +### Logs + +- Provider initialization messages from lego +- DNS challenge validation attempts +- Certificate issuance progress + +### Metrics + +No metrics are currently exposed. 
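Independent of which provider is configured, the DNS-01 challenge all of them fulfill boils down to publishing one TXT record. Per RFC 8555, the record lives at `_acme-challenge.<domain>` and holds the base64url-encoded SHA-256 digest of the key authorization; a sketch of that derivation (lego performs this internally):

```go
package main

import (
	"crypto/sha256"
	"encoding/base64"
	"fmt"
)

// dns01Record computes the TXT record a DNS-01 challenge requires:
// the FQDN where the record must be created and its expected value.
func dns01Record(domain, keyAuth string) (fqdn, value string) {
	sum := sha256.Sum256([]byte(keyAuth))
	return "_acme-challenge." + domain + ".", base64.RawURLEncoding.EncodeToString(sum[:])
}

func main() {
	fqdn, value := dns01Record("example.com", "token.accountThumbprint")
	fmt.Println(fqdn)       // _acme-challenge.example.com.
	fmt.Println(len(value)) // 43: 256 bits, base64url, unpadded
}
```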
+ +## Security Considerations + +- API credentials are passed to provider configuration +- Credentials are stored in configuration files (should be protected) +- DNS-01 challenge requires TXT record creation capability +- Provider should have minimal DNS permissions (only TXT records) + +## Failure Modes and Recovery + +| Failure | Behavior | Recovery | +| --------------------- | --------------------------- | -------------------------------------- | +| Invalid credentials | Provider returns error | Verify credentials | +| DNS propagation delay | Challenge fails temporarily | Retry with longer propagation time | +| Provider unavailable | Certificate issuance fails | Use alternative provider | +| Unsupported provider | Key not found in registry | Register provider or use supported one | + +## Performance Characteristics + +- Provider initialization is O(1) per provider +- DNS-01 challenge depends on DNS propagation time +- Certificate issuance may take several seconds + +## Usage Examples + +### Initialization + +```go +import "github.com/yusing/godoxy/internal/dnsproviders" + +func init() { + dnsproviders.InitProviders() +} +``` + +### Using with AutoCert + +```go +import "github.com/yusing/godoxy/internal/autocert" + +// Providers are automatically registered +providers := autocert.Providers + +provider, ok := providers["cloudflare"] +if !ok { + log.Fatal("Cloudflare provider not available") +} + +config := provider.DefaultConfig() +``` + +### Getting provider configuration + +```go +// Access registered providers +for name, factory := range autocert.Providers { + cfg := factory.DefaultConfig() + log.Printf("Provider %s: %+v", name, cfg) +} +``` + +### Certificate issuance flow + +```mermaid +sequenceDiagram + participant User + participant AutoCert + participant DNSProvider + participant DNS + participant LetsEncrypt + + User->>AutoCert: Request Certificate + AutoCert->>DNSProvider: Get DNS Config + DNSProvider-->>AutoCert: Provider Config + + 
AutoCert->>LetsEncrypt: DNS-01 Challenge + LetsEncrypt->>DNS: Verify TXT Record + DNS-->>LetsEncrypt: Verification Result + + alt Verification Successful + LetsEncrypt-->>AutoCert: Certificate + AutoCert-->>User: TLS Certificate + else Verification Failed + LetsEncrypt-->>AutoCert: Error + AutoCert-->>User: Error + end +``` diff --git a/internal/docker/README.md b/internal/docker/README.md new file mode 100644 index 00000000..3cb2e89f --- /dev/null +++ b/internal/docker/README.md @@ -0,0 +1,433 @@ +# Docker Integration + +Docker container discovery, connection management, and label-based route configuration. + +## Overview + +The docker package implements Docker container integration, providing shared client connections, container parsing from Docker API responses, label processing for route configuration, and container filtering capabilities. + +### Primary consumers + +- `internal/route/provider` - Creates Docker-based route providers +- `internal/idlewatcher` - Container idle detection +- Operators - Configure routes via Docker labels + +### Non-goals + +- Docker image building or management +- Container lifecycle operations (start/stop) +- Volume management +- Docker Swarm orchestration + +### Stability + +Stable internal package. Public API consists of client management and container parsing functions. 
+ +## Public API + +### Exported types + +```go +type SharedClient struct { + *client.Client + cfg types.DockerProviderConfig + refCount atomic.Int32 + closedOn atomic.Int64 + key string + addr string + dial func(ctx context.Context) (net.Conn, error) + unique bool +} +``` + +```go +type Container struct { + DockerCfg types.DockerProviderConfig + Image Image + ContainerName string + ContainerID string + Labels map[string]string + ActualLabels map[string]string + Mounts []Mount + Network string + PublicPortMapping map[int]PortSummary + PrivatePortMapping map[int]PortSummary + Aliases []string + IsExcluded bool + IsExplicit bool + IsHostNetworkMode bool + Running bool + State string + PublicHostname string + PrivateHostname string + Agent *agentpool.Agent + IdlewatcherConfig *IdlewatcherConfig +} +``` + +### Exported functions + +```go +func NewClient(cfg types.DockerProviderConfig, unique ...bool) (*SharedClient, error) +``` + +Creates or returns a Docker client. Reuses existing clients for the same URL. Thread-safe. + +```go +func Clients() map[string]*SharedClient +``` + +Returns all currently connected clients. Callers must close returned clients. + +```go +func FromDocker(c *container.Summary, dockerCfg types.DockerProviderConfig) *types.Container +``` + +Converts Docker API container summary to internal container type. Parses labels for route configuration. + +```go +func UpdatePorts(ctx context.Context, c *Container) error +``` + +Refreshes port mappings from container inspect. + +```go +func DockerComposeProject(c *Container) string +``` + +Returns the Docker Compose project name. + +```go +func DockerComposeService(c *Container) string +``` + +Returns the Docker Compose service name. + +```go +func Dependencies(c *Container) []string +``` + +Returns container dependencies from labels. + +```go +func IsBlacklisted(c *Container) bool +``` + +Checks if container should be excluded from routing. 
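`NewClient`'s reuse of connections to the same URL is reference counting over a mutex-guarded map. A stripped-down sketch of that mechanism (the real pool also tracks a close timestamp and lets a background cleaner close clients idle for 10s, omitted here):

```go
package main

import (
	"fmt"
	"sync"
)

type sharedClient struct {
	key      string
	refCount int
}

var (
	pool   = make(map[string]*sharedClient)
	poolMu sync.Mutex
)

// acquire returns the pooled client for a key, creating it on first
// use, mirroring NewClient's reuse of clients for the same URL.
func acquire(key string) *sharedClient {
	poolMu.Lock()
	defer poolMu.Unlock()
	c, ok := pool[key]
	if !ok {
		c = &sharedClient{key: key}
		pool[key] = c
	}
	c.refCount++
	return c
}

// release decrements the refcount; the background cleaner (omitted)
// would only close clients that stay at zero past the idle timeout.
func release(c *sharedClient) {
	poolMu.Lock()
	defer poolMu.Unlock()
	c.refCount--
}

func main() {
	a := acquire("unix:///var/run/docker.sock")
	b := acquire("unix:///var/run/docker.sock")
	fmt.Println(a == b, a.refCount) // true 2: same client is shared
	release(b)
	fmt.Println(a.refCount) // 1
}
```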
+ +## Architecture + +### Core components + +```mermaid +graph TD + A[Docker API] --> B[SharedClient Pool] + B --> C{Client Request} + C -->|New Client| D[Create Connection] + C -->|Existing| E[Increment RefCount] + + F[Container List] --> G[FromDocker Parser] + G --> H[Container Struct] + H --> I[Route Builder] + + J[Container Labels] --> K[Label Parser] + K --> L[Route Config] + + subgraph Client Pool + B --> M[clientMap] + N[Cleaner Goroutine] + end +``` + +### Client lifecycle + +```mermaid +stateDiagram-v2 + [*] --> New: NewClient() called + New --> Shared: Refcount = 1, stored in pool + Shared --> Shared: Same URL, increment refcount + Shared --> Idle: Close() called, refcount = 0 + Idle --> Closed: 10s timeout elapsed + Idle --> Shared: NewClient() for same URL + Closed --> [*]: Client closed + Unique --> [*]: Close() immediately +``` + +### Container parsing flow + +```mermaid +sequenceDiagram + participant Provider + participant SharedClient + participant DockerAPI + participant ContainerParser + participant RouteBuilder + + Provider->>SharedClient: NewClient(cfg) + SharedClient->>SharedClient: Check Pool + alt Existing Client + SharedClient->>SharedClient: Increment RefCount + else New Client + SharedClient->>DockerAPI: Connect + DockerAPI-->>SharedClient: Client + end + + Provider->>SharedClient: ListContainers() + SharedClient->>DockerAPI: GET /containers/json + DockerAPI-->>SharedClient: Container List + SharedClient-->>Provider: Container List + + loop For Each Container + Provider->>ContainerParser: FromDocker() + ContainerParser->>ContainerParser: Parse Labels + ContainerParser->>ContainerParser: Resolve Hostnames + ContainerParser-->>Provider: *Container + end + + Provider->>RouteBuilder: Create Routes + RouteBuilder-->>Provider: Routes +``` + +### Client pool management + +The docker package maintains a pool of shared clients: + +```go +var ( + clientMap = make(map[string]*SharedClient, 10) + clientMapMu sync.RWMutex +) + +func initClientCleaner() 
{
+	cleaner := task.RootTask("docker_clients_cleaner", true)
+	go func() {
+		ticker := time.NewTicker(cleanInterval)
+		for {
+			select {
+			case <-ticker.C:
+				closeTimedOutClients()
+			case <-cleaner.Context().Done():
+				// Cleanup all clients
+			}
+		}
+	}()
+}
+```
+
+## Configuration Surface
+
+### Docker provider configuration
+
+```yaml
+providers:
+  docker:
+    local: ${DOCKER_HOST}
+    remote1:
+      scheme: tcp
+      host: docker1.local
+      port: 2375
+    remote2:
+      scheme: tls
+      host: docker2.local
+      port: 2375
+      tls:
+        ca_file: /path/to/ca.pem
+        cert_file: /path/to/cert.pem
+        key_file: /path/to/key.pem
+```
+
+### Route configuration labels
+
+Route labels use the format `proxy.<alias>.<field>`, where `<alias>` is the route alias (or `*` for wildcard). The base labels apply to all routes.
+
+| Label                   | Description                     | Example                         |
+| ----------------------- | ------------------------------- | ------------------------------- |
+| `proxy.aliases`         | Route aliases (comma-separated) | `proxy.aliases: www,app`        |
+| `proxy.exclude`         | Exclude from routing            | `proxy.exclude: true`           |
+| `proxy.network`         | Docker network                  | `proxy.network: frontend`       |
+| `proxy.<alias>.host`    | Override hostname               | `proxy.app.host: 192.168.1.100` |
+| `proxy.<alias>.port`    | Target port                     | `proxy.app.port: 8080`          |
+| `proxy.<alias>.scheme`  | HTTP scheme                     | `proxy.app.scheme: https`       |
+| `proxy.<alias>.*`       | Any route-specific setting      | `proxy.app.no_tls_verify: true` |
+
+#### Wildcard alias
+
+Use `proxy.*.<field>` to apply settings to all routes:
+
+```yaml
+labels:
+  proxy.aliases: app1,app2
+  proxy.*.scheme: https
+  proxy.app1.port: 3000 # overrides wildcard
+```
+
+### Idle watcher labels
+
+| Label                   | Description                     | Example                            |
+| ----------------------- | ------------------------------- | ---------------------------------- |
+| `proxy.idle_timeout`    | Idle timeout duration           | `proxy.idle_timeout: 30m`          |
+| `proxy.wake_timeout`    | Max time to wait for wake       | `proxy.wake_timeout: 10s`          |
+| `proxy.stop_method`     | Stop method (pause, stop, kill) | `proxy.stop_method: stop`          |
+| 
`proxy.stop_signal`     | Signal to send (e.g., SIGTERM)  | `proxy.stop_signal: SIGTERM`       |
+| `proxy.stop_timeout`    | Stop timeout in seconds         | `proxy.stop_timeout: 30`           |
+| `proxy.depends_on`      | Container dependencies          | `proxy.depends_on: database`       |
+| `proxy.start_endpoint`  | Optional path restriction       | `proxy.start_endpoint: /api/ready` |
+| `proxy.no_loading_page` | Skip loading page               | `proxy.no_loading_page: true`      |
+
+### Docker Compose labels
+
+These labels are set automatically by Docker Compose.
+
+| Label                           | Description          |
+| ------------------------------- | -------------------- |
+| `com.docker.compose.project`    | Compose project name |
+| `com.docker.compose.service`    | Service name         |
+| `com.docker.compose.depends_on` | Dependencies         |
+
+## Dependency and Integration Map
+
+### Internal dependencies
+
+- `internal/agentpool` - Agent-based Docker host connections
+- `internal/maxmind` - Container geolocation
+- `internal/types` - Container and provider types
+- `internal/task/task.go` - Lifetime management
+
+### External dependencies
+
+- `github.com/docker/cli/cli/connhelper` - Connection helpers
+- `github.com/moby/moby/client` - Docker API client
+- `github.com/docker/go-connections/nat` - Port parsing
+
+### Integration points
+
+```go
+// Route provider uses docker for container discovery
+client, err := docker.NewClient(cfg)
+containers, err := client.ContainerList(ctx, container.ListOptions{})
+
+for _, c := range containers {
+    ctr := docker.FromDocker(c, cfg) // "ctr" avoids shadowing the container package
+    // Create routes from ctr
+}
+```
+
+## Observability
+
+### Logs
+
+- Client initialization and cleanup
+- Connection errors
+- Container parsing errors
+
+### Metrics
+
+No metrics are currently exposed.
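+
+Putting the route and idle watcher labels together, a compose service might look like this (illustrative values only; the service name, image, and port are placeholders):
+
+```yaml
+services:
+  myapp:
+    image: myorg/myapp:latest
+    labels:
+      proxy.aliases: app
+      proxy.app.port: 8080
+      proxy.idle_timeout: 30m
+      proxy.wake_timeout: 10s
+      proxy.stop_method: stop
+      proxy.stop_timeout: 30
+```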
+ +## Security Considerations + +- Docker socket access requires proper permissions +- TLS certificates for remote connections +- Agent-based connections are authenticated via TLS +- Database containers are automatically blacklisted + +### Blacklist detection + +Containers are automatically blacklisted if they: + +- Mount database directories: + - `/var/lib/postgresql/data` + - `/var/lib/mysql` + - `/var/lib/mongodb` + - `/var/lib/mariadb` + - `/var/lib/memcached` + - `/var/lib/rabbitmq` +- Expose database ports: + - 5432 (PostgreSQL) + - 3306 (MySQL/MariaDB) + - 6379 (Redis) + - 11211 (Memcached) + - 27017 (MongoDB) + +## Failure Modes and Recovery + +| Failure | Behavior | Recovery | +| -------------------------- | ---------------------------- | ------------------------ | +| Docker socket inaccessible | NewClient returns error | Fix socket permissions | +| Remote connection failed | NewClient returns error | Check network/tls config | +| Container inspect failed | UpdatePorts returns error | Container may be stopped | +| Invalid labels | Container created with error | Fix label syntax | +| Agent not found | Panic during client creation | Add agent to pool | + +## Performance Characteristics + +- Client pooling reduces connection overhead +- Reference counting prevents premature cleanup +- Background cleaner removes idle clients after 10s +- O(n) container parsing where n is container count + +## Usage Examples + +### Creating a Docker client + +```go +dockerCfg := types.DockerProviderConfig{ + URL: "unix:///var/run/docker.sock", +} + +client, err := docker.NewClient(dockerCfg) +if err != nil { + log.Fatal(err) +} +defer client.Close() +``` + +### Using unique client + +```go +// Create a unique client that won't be shared +client, err := docker.NewClient(cfg, true) +if err != nil { + log.Fatal(err) +} +// Remember to close when done +client.Close() +``` + +### Getting all clients + +```go +clients := docker.Clients() +for host, client := range clients { + 
log.Printf("Connected to: %s", host)
+}
+// Use clients...
+// Close all clients when done
+for _, client := range clients {
+    client.Close()
+}
+```
+
+### Parsing containers
+
+```go
+containers, err := dockerClient.ContainerList(ctx, container.ListOptions{})
+if err != nil {
+    log.Fatal(err)
+}
+for _, c := range containers {
+    ctr := docker.FromDocker(c, dockerCfg) // "ctr" avoids shadowing the container package
+    if ctr.Errors != nil {
+        log.Printf("Container %s has errors: %v", ctr.ContainerName, ctr.Errors)
+        continue
+    }
+    log.Printf("Container: %s, Aliases: %v", ctr.ContainerName, ctr.Aliases)
+}
+```
+
+### Checking if container is blacklisted
+
+```go
+// Inside the container discovery loop:
+ctr := docker.FromDocker(c, dockerCfg)
+if docker.IsBlacklisted(ctr) {
+    log.Printf("Container %s is blacklisted, skipping", ctr.ContainerName)
+    continue
+}
+```
diff --git a/internal/entrypoint/README.md b/internal/entrypoint/README.md
new file mode 100644
index 00000000..bef905a1
--- /dev/null
+++ b/internal/entrypoint/README.md
@@ -0,0 +1,308 @@
+# Entrypoint
+
+The entrypoint package provides the main HTTP entry point for GoDoxy, handling domain-based routing, middleware application, short link matching, and access logging.
+
+## Overview
+
+The entrypoint package implements the primary HTTP handler that receives all incoming requests, determines the target route based on hostname, applies middleware, and forwards requests to the appropriate route handler.
+ +### Key Features + +- Domain-based route lookup with subdomain support +- Short link (`go/` domain) handling +- Middleware chain application +- Access logging for all requests +- Configurable not-found handling +- Per-domain route resolution + +## Architecture + +```mermaid +graph TD + A[HTTP Request] --> B[Entrypoint Handler] + B --> C{Access Logger?} + C -->|Yes| D[Wrap Response Recorder] + C -->|No| E[Skip Logging] + + D --> F[Find Route by Host] + E --> F + + F --> G{Route Found?} + G -->|Yes| H{Middleware?} + G -->|No| I{Short Link?} + I -->|Yes| J[Short Link Handler] + I -->|No| K{Not Found Handler?} + K -->|Yes| L[Not Found Handler] + K -->|No| M[Serve 404] + + H -->|Yes| N[Apply Middleware] + H -->|No| O[Direct Route] + N --> O + + O --> P[Route ServeHTTP] + P --> Q[Response] + + L --> R[404 Response] + J --> Q + M --> R +``` + +## Core Components + +### Entrypoint Structure + +```go +type Entrypoint struct { + middleware *middleware.Middleware + notFoundHandler http.Handler + accessLogger accesslog.AccessLogger + findRouteFunc func(host string) types.HTTPRoute + shortLinkTree *ShortLinkMatcher +} +``` + +### Active Config + +```go +var ActiveConfig atomic.Pointer[entrypoint.Config] +``` + +## Public API + +### Creation + +```go +// NewEntrypoint creates a new entrypoint instance. +func NewEntrypoint() Entrypoint +``` + +### Configuration + +```go +// SetFindRouteDomains configures domain-based route lookup. +func (ep *Entrypoint) SetFindRouteDomains(domains []string) + +// SetMiddlewares loads and configures middleware chain. +func (ep *Entrypoint) SetMiddlewares(mws []map[string]any) error + +// SetNotFoundRules configures the not-found handler. +func (ep *Entrypoint) SetNotFoundRules(rules rules.Rules) + +// SetAccessLogger initializes access logging. +func (ep *Entrypoint) SetAccessLogger(parent task.Parent, cfg *accesslog.RequestLoggerConfig) error + +// ShortLinkMatcher returns the short link matcher. 
+func (ep *Entrypoint) ShortLinkMatcher() *ShortLinkMatcher +``` + +### Request Handling + +```go +// ServeHTTP is the main HTTP handler. +func (ep *Entrypoint) ServeHTTP(w http.ResponseWriter, r *http.Request) + +// FindRoute looks up a route by hostname. +func (ep *Entrypoint) FindRoute(s string) types.HTTPRoute +``` + +## Usage + +### Basic Setup + +```go +ep := entrypoint.NewEntrypoint() + +// Configure domain matching +ep.SetFindRouteDomains([]string{".example.com", "example.com"}) + +// Configure middleware +err := ep.SetMiddlewares([]map[string]any{ + {"rate_limit": map[string]any{"requests_per_second": 100}}, +}) +if err != nil { + log.Fatal(err) +} + +// Configure access logging +err = ep.SetAccessLogger(parent, &accesslog.RequestLoggerConfig{ + Path: "/var/log/godoxy/access.log", +}) +if err != nil { + log.Fatal(err) +} + +// Start server +http.ListenAndServe(":80", &ep) +``` + +### Route Lookup Logic + +The entrypoint uses multiple strategies to find routes: + +1. **Subdomain Matching**: For `sub.domain.com`, looks for `sub` +1. **Exact Match**: Looks for the full hostname +1. 
**Port Stripping**: Strips port from host if present + +```go +func findRouteAnyDomain(host string) types.HTTPRoute { + // Try subdomain (everything before first dot) + idx := strings.IndexByte(host, '.') + if idx != -1 { + target := host[:idx] + if r, ok := routes.HTTP.Get(target); ok { + return r + } + } + + // Try exact match + if r, ok := routes.HTTP.Get(host); ok { + return r + } + + // Try stripping port + if before, _, ok := strings.Cut(host, ":"); ok { + if r, ok := routes.HTTP.Get(before); ok { + return r + } + } + + return nil +} +``` + +### Short Links + +Short links use a special `.short` domain: + +```go +// Request to: https://abc.short.example.com +// Looks for route with alias "abc" +if strings.EqualFold(host, common.ShortLinkPrefix) { + // Handle short link + ep.shortLinkTree.ServeHTTP(w, r) +} +``` + +## Data Flow + +```mermaid +sequenceDiagram + participant Client + participant Entrypoint + participant Middleware + participant Route + participant Logger + + Client->>Entrypoint: GET /path + Entrypoint->>Entrypoint: FindRoute(host) + alt Route Found + Entrypoint->>Logger: Get ResponseRecorder + Logger-->>Entrypoint: Recorder + Entrypoint->>Middleware: ServeHTTP(routeHandler) + alt Has Middleware + Middleware->>Middleware: Process Chain + end + Middleware->>Route: Forward Request + Route-->>Middleware: Response + Middleware-->>Entrypoint: Response + else Short Link + Entrypoint->>ShortLinkTree: Match short code + ShortLinkTree-->>Entrypoint: Redirect + else Not Found + Entrypoint->>NotFoundHandler: Serve 404 + NotFoundHandler-->>Entrypoint: 404 Page + end + + Entrypoint->>Logger: Log Request + Logger-->>Entrypoint: Complete + Entrypoint-->>Client: Response +``` + +## Not-Found Handling + +When no route is found, the entrypoint: + +1. Attempts to serve a static error page file +1. Logs the 404 request +1. Falls back to the configured error page +1. 
Returns 404 status code
+
+```go
+func (ep *Entrypoint) serveNotFound(w http.ResponseWriter, r *http.Request) {
+    if served := middleware.ServeStaticErrorPageFile(w, r); !served {
+        log.Error().
+            Str("method", r.Method).
+            Str("url", r.URL.String()).
+            Str("remote", r.RemoteAddr).
+            Msgf("not found: %s", r.Host)
+
+        errorPage, ok := errorpage.GetErrorPageByStatus(http.StatusNotFound)
+        if ok {
+            // Headers must be set before WriteHeader, or they are ignored.
+            w.Header().Set("Content-Type", "text/html; charset=utf-8")
+            w.WriteHeader(http.StatusNotFound)
+            w.Write(errorPage)
+        } else {
+            http.NotFound(w, r)
+        }
+    }
+}
+```
+
+## Configuration Structure
+
+```go
+type Config struct {
+    Middlewares []map[string]any               `json:"middlewares"`
+    Rules       rules.Rules                    `json:"rules"`
+    AccessLog   *accesslog.RequestLoggerConfig `json:"access_log"`
+}
+```
+
+## Middleware Integration
+
+The entrypoint supports middleware chains configured via YAML:
+
+```yaml
+entrypoint:
+  middlewares:
+    - use: rate_limit
+      average: 100
+      burst: 200
+      bypass:
+        - remote 192.168.1.0/24
+    - use: redirect_http
+```
+
+## Access Logging
+
+Access logging wraps the response recorder to capture:
+
+- Request method and URL
+- Response status code
+- Response size
+- Request duration
+- Client IP address
+
+```go
+func (ep *Entrypoint) ServeHTTP(w http.ResponseWriter, r *http.Request) {
+    if ep.accessLogger != nil {
+        rec := accesslog.GetResponseRecorder(w)
+        w = rec
+        defer func() {
+            ep.accessLogger.Log(r, rec.Response())
+            accesslog.PutResponseRecorder(rec)
+        }()
+    }
+    // ... 
handle request +} +``` + +## Integration Points + +The entrypoint integrates with: + +- **Route Registry**: HTTP route lookup +- **Middleware**: Request processing chain +- **AccessLog**: Request logging +- **ErrorPage**: 404 error pages +- **ShortLink**: Short link handling diff --git a/internal/health/check/README.md b/internal/health/check/README.md index 21a7d778..bd27f498 100644 --- a/internal/health/check/README.md +++ b/internal/health/check/README.md @@ -1,14 +1,128 @@ -# Health Check +# Health Check Package -This package provides low-level health check implementations for different protocols and services in GoDoxy. +Low-level health check implementations for different protocols and services in GoDoxy. -## Health Check Types +## Overview -### Docker Health Check +### Purpose -Checks the health status of Docker containers using the Docker API. +This package provides health check implementations for various protocols: -**Flow:** +- **HTTP/HTTPS** - Standard HTTP health checks with fasthttp +- **H2C** - HTTP/2 cleartext health checks +- **Docker** - Container health status via Docker API +- **FileServer** - Directory accessibility checks +- **Stream** - Generic network connection checks + +### Primary Consumers + +- `internal/health/monitor/` - Route health monitoring +- `internal/metrics/uptime/` - Uptime poller integration + +### Non-goals + +- Complex health check logic (response body validation, etc.) +- Authentication/authorization in health checks +- Multi-step health checks (login then check) + +### Stability + +Internal package. Public functions are stable but may be extended with new parameters. 
+ +## Public API + +### HTTP Health Check (`http.go`) + +```go +func HTTP( + url *url.URL, + method string, + path string, + timeout time.Duration, +) (types.HealthCheckResult, error) +``` + +### H2C Health Check (`http.go`) + +```go +func H2C( + ctx context.Context, + url *url.URL, + method string, + path string, + timeout time.Duration, +) (types.HealthCheckResult, error) +``` + +### Docker Health Check (`docker.go`) + +```go +func Docker( + ctx context.Context, + containerID string, +) (types.HealthCheckResult, error) +``` + +### FileServer Health Check (`fileserver.go`) + +```go +func FileServer( + url *url.URL, +) (types.HealthCheckResult, error) +``` + +### Stream Health Check (`stream.go`) + +```go +func Stream( + url *url.URL, +) (types.HealthCheckResult, error) +``` + +### Common Types (`internal/types/`) + +```go +type HealthCheckResult struct { + Healthy bool + Latency time.Duration + Detail string +} + +type HealthStatus int + +const ( + StatusHealthy HealthStatus = 0 + StatusUnhealthy HealthStatus = 1 + StatusError HealthStatus = 2 +) +``` + +## Architecture + +### HTTP Health Check Flow + +```mermaid +flowchart TD + A[HTTP Health Check] --> B[Create FastHTTP Request] + B --> C[Set Headers and Method] + C --> D[Execute Request with Timeout] + D --> E{Request Successful?} + + E -->|no| F{Error Type} + F -->|TLS Error| G[Healthy: TLS Error Ignored] + F -->|Other Error| H[Unhealthy: Error Details] + + E -->|yes| I{Status Code} + I -->|5xx| J[Unhealthy: Server Error] + I -->|Other| K[Healthy] + + G --> L[Return Result with Latency] + H --> L + J --> L + K --> L +``` + +### Docker Health Check Flow ```mermaid flowchart TD @@ -36,53 +150,7 @@ flowchart TD P --> Q ``` -**Key Features:** - -- Intercepts Docker API responses to extract container state -- Tracks failure count with configurable threshold (3 failures) -- Supports containers with and without health check configurations -- Returns detailed error information from Docker health check logs - -### HTTP 
Health Check - -Performs HTTP/HTTPS health checks using fasthttp for optimal performance. - -**Flow:** - -```mermaid -flowchart TD - A[HTTP Health Check] --> B[Create FastHTTP Request] - B --> C[Set Headers and Method] - C --> D[Execute Request with Timeout] - D --> E{Request Successful?} - - E -->|no| F{Error Type} - F -->|TLS Error| G[Healthy: TLS Error Ignored] - F -->|Other Error| H[Unhealthy: Error Details] - - E -->|yes| I{Status Code} - I -->|5xx| J[Unhealthy: Server Error] - I -->|Other| K[Healthy] - - G --> L[Return Result with Latency] - H --> L - J --> L - K --> L -``` - -**Key Features:** - -- Uses fasthttp for high-performance HTTP requests -- Supports both GET and HEAD methods -- Configurable timeout and path -- Handles TLS certificate verification errors gracefully -- Returns latency measurements - -### H2C Health Check - -Performs HTTP/2 cleartext (h2c) health checks for services that support HTTP/2 without TLS. - -**Flow:** +### H2C Health Check Flow ```mermaid flowchart TD @@ -104,18 +172,7 @@ flowchart TD L --> M ``` -**Key Features:** - -- Uses HTTP/2 transport with cleartext support -- Supports both GET and HEAD methods -- Configurable timeout and path -- Returns latency measurements - -### FileServer Health Check - -Checks if a file server root directory exists and is accessible. - -**Flow:** +### FileServer Health Check Flow ```mermaid flowchart TD @@ -132,18 +189,7 @@ flowchart TD G --> I[Return Error] ``` -**Key Features:** - -- Simple directory existence check -- Measures latency of filesystem operation -- Distinguishes between "not found" and other errors -- Returns detailed error information - -### Stream Health Check - -Checks stream endpoint connectivity by attempting to establish a network connection. 
- -**Flow:** +### Stream Health Check Flow ```mermaid flowchart TD @@ -164,35 +210,144 @@ flowchart TD K --> L ``` -**Key Features:** +## Configuration Surface -- Generic network connection check -- Supports any stream protocol (TCP, UDP, etc.) -- Handles common connection errors gracefully -- Measures connection establishment latency -- Automatically closes connections +No explicit configuration per health check. Parameters are passed directly: -## Common Features +| Check Type | Parameters | +| ---------- | ----------------------------------- | +| HTTP | URL, Method, Path, Timeout | +| H2C | Context, URL, Method, Path, Timeout | +| Docker | Context, ContainerID | +| FileServer | URL (path component used) | +| Stream | URL (scheme, host, port used) | -### Error Handling +### HTTP Headers -All health checks implement consistent error handling: +All HTTP/H2C checks set: -- **Temporary Errors**: Network timeouts, connection failures -- **Permanent Errors**: Invalid configurations, missing resources -- **Graceful Degradation**: Returns health status even when errors occur +- `User-Agent: GoDoxy/` +- `Accept: text/plain,text/html,*/*;q=0.8` +- `Accept-Encoding: identity` +- `Cache-Control: no-cache` +- `Pragma: no-cache` -### Performance Monitoring +## Dependency and Integration Map -- **Latency Measurement**: All checks measure execution time -- **Timeout Support**: Configurable timeouts prevent hanging -- **Resource Cleanup**: Proper cleanup of connections and resources +### External Dependencies -### Integration +- `github.com/valyala/fasthttp` - High-performance HTTP client +- `golang.org/x/net/http2` - HTTP/2 transport +- Docker socket (for Docker health check) -These health checks are used by the monitor package to implement route-specific health monitoring: +### Internal Dependencies -- HTTP/HTTPS routes use HTTP health checks -- File server routes use FileServer health checks -- Stream routes use Stream health checks -- Docker containers use Docker health 
checks with fallbacks +- `internal/types/` - Health check result types +- `goutils/version/` - User-Agent version + +## Observability + +### Logs + +No direct logging in health check implementations. Errors are returned as part of `HealthCheckResult.Detail`. + +### Metrics + +- Check latency (returned in result) +- Success/failure rates (tracked by caller) + +## Security Considerations + +- TLS certificate verification skipped (`InsecureSkipVerify: true`) +- Docker socket access required for Docker health check +- No authentication in health check requests +- User-Agent identifies GoDoxy for server-side filtering + +## Failure Modes and Recovery + +### HTTP/H2C + +| Failure Mode | Result | Notes | +| --------------------- | --------- | ------------------------------- | +| Connection timeout | Unhealthy | Detail: timeout message | +| TLS certificate error | Healthy | Handled gracefully | +| 5xx response | Unhealthy | Detail: status text | +| 4xx response | Healthy | Client error considered healthy | + +### Docker + +| Failure Mode | Result | Notes | +| -------------------------- | --------- | ------------------------------ | +| API call failure | Error | Throws error to caller | +| Container not running | Unhealthy | State: "Not Started" | +| Container dead/exited | Unhealthy | State logged | +| No health check configured | Error | Requires health check in image | + +### FileServer + +| Failure Mode | Result | Notes | +| ----------------- | --------- | ------------------------ | +| Path not found | Unhealthy | Detail: "path not found" | +| Permission denied | Error | Returned to caller | +| Other OS error | Error | Returned to caller | + +### Stream + +| Failure Mode | Result | Notes | +| ---------------------- | --------- | --------------------- | +| Connection refused | Unhealthy | Detail: error message | +| Network unreachable | Unhealthy | Detail: error message | +| DNS resolution failure | Unhealthy | Detail: error message | +| Context deadline | Unhealthy | 
Detail: timeout | + +## Usage Examples + +### HTTP Health Check + +```go +url, _ := url.Parse("http://localhost:8080/health") +result, err := healthcheck.HTTP(url, "GET", "/health", 10*time.Second) +if err != nil { + fmt.Printf("Error: %v\n", err) +} +fmt.Printf("Healthy: %v, Latency: %v, Detail: %s\n", + result.Healthy, result.Latency, result.Detail) +``` + +### H2C Health Check + +```go +ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second) +defer cancel() + +url, _ := url.Parse("h2c://localhost:8080") +result, err := healthcheck.H2C(ctx, url, "GET", "/health", 10*time.Second) +``` + +### Docker Health Check + +```go +ctx := context.Background() +result, err := healthcheck.Docker(ctx, "abc123def456") +``` + +### FileServer Health Check + +```go +url, _ := url.Parse("file:///var/www/html") +result, err := healthcheck.FileServer(url) +``` + +### Stream Health Check + +```go +url, _ := url.Parse("tcp://localhost:5432") +result, err := healthcheck.Stream(url) +``` + +## Testing Notes + +- Unit tests for each health check type +- Mock Docker server for Docker health check tests +- Integration tests require running services +- Timeout handling tests diff --git a/internal/health/monitor/README.md b/internal/health/monitor/README.md index ccd5dc26..51841d89 100644 --- a/internal/health/monitor/README.md +++ b/internal/health/monitor/README.md @@ -1,33 +1,317 @@ -# Health Monitor +# Health Monitor Package -This package provides health monitoring functionality for different types of routes in GoDoxy. +Route health monitoring with configurable check intervals, retry policies, and notification integration. 
-## Health Check Flow +## Overview + +### Purpose + +This package provides health monitoring for different route types in GoDoxy: + +- Monitors service health via configurable check functions +- Tracks consecutive failures with configurable thresholds +- Sends notifications on status changes +- Provides last-seen tracking for idle detection + +### Primary Consumers + +- `internal/route/` - Route health monitoring +- `internal/api/v1/metrics/` - Uptime poller integration +- WebUI - Health status display + +### Non-goals + +- Health check execution itself (delegated to `internal/health/check/`) +- Alert routing (handled by `internal/notif/`) +- Automatic remediation + +### Stability + +Internal package with stable public interfaces. `HealthMonitor` interface is stable. + +## Public API + +### Types + +```go +type HealthCheckFunc func(url *url.URL) (result types.HealthCheckResult, err error) +``` + +### HealthMonitor Interface + +```go +type HealthMonitor interface { + Start(parent task.Parent) gperr.Error + Task() *task.Task + Finish(reason any) + UpdateURL(url *url.URL) + URL() *url.URL + Config() *types.HealthCheckConfig + Status() types.HealthStatus + Uptime() time.Duration + Latency() time.Duration + Detail() string + Name() string + String() string + CheckHealth() (types.HealthCheckResult, error) +} +``` + +### Monitor Creation (`new.go`) + +```go +// Create monitor for agent-proxied routes +func NewAgentProxiedMonitor( + ctx context.Context, + cfg types.HealthCheckConfig, + url *url.URL, +) (HealthMonitor, error) + +// Create monitor for Docker containers +func NewDockerHealthMonitor( + ctx context.Context, + cfg types.HealthCheckConfig, + url *url.URL, + containerID string, +) (HealthMonitor, error) + +// Create monitor for HTTP routes +func NewHTTPMonitor( + ctx context.Context, + cfg types.HealthCheckConfig, + url *url.URL, +) HealthMonitor + +// Create monitor for H2C (HTTP/2 cleartext) routes +func NewH2CMonitor( + ctx context.Context, + cfg 
types.HealthCheckConfig, + url *url.URL, +) HealthMonitor + +// Create monitor for file server routes +func NewFileServerMonitor( + cfg types.HealthCheckConfig, + url *url.URL, +) HealthMonitor + +// Create monitor for stream routes +func NewStreamMonitor( + cfg types.HealthCheckConfig, + url *url.URL, +) HealthMonitor + +// Unified monitor factory (routes to appropriate type) +func NewMonitor( + ctx context.Context, + cfg types.HealthCheckConfig, + url *url.URL, +) (HealthMonitor, error) +``` + +## Architecture + +### Monitor Selection Flow ```mermaid flowchart TD - A[NewMonitor route] --> B{IsAgent route} + A[NewMonitor route] --> B{IsAgent route?} B -->|true| C[NewAgentProxiedMonitor] - B -->|false| D{IsDocker route} + B -->|false| D{IsDocker route?} D -->|true| E[NewDockerHealthMonitor] - D -->|false| F[Route Type Switch] - - F --> G[HTTP Monitor] - F --> H[FileServer Monitor] - F --> I[Stream Monitor] - - E --> J[Selected Monitor] - - C --> K[Agent Health Check] - G --> L{Scheme h2c?} - L -->|true| M[H2C Health Check] - L -->|false| N[HTTP Health Check] - H --> O[FileServer Health Check] - I --> P[Stream Health Check] - - K --> Q{IsDocker route} - Q -->|true| R[NewDockerHealthMonitor with Agent as Fallback] - Q -->|false| K - - R --> K + D -->|false| F{Has h2c scheme?} + F -->|true| G[NewH2CMonitor] + F -->|false| H{Has http/https scheme?} + H -->|true| I[NewHTTPMonitor] + H -->|false| J{Is file:// scheme?} + J -->|true| K[NewFileServerMonitor] + J -->|false| L[NewStreamMonitor] ``` + +### Monitor State Machine + +```mermaid +stateDiagram-v2 + [*] --> Starting: First check + Starting --> Healthy: Check passes + Starting --> Unhealthy: Check fails + Healthy --> Unhealthy: 5 consecutive failures + Healthy --> Error: Check error + Error --> Healthy: Check passes + Error --> Unhealthy: 5 consecutive failures + Unhealthy --> Healthy: Check passes + Unhealthy --> Error: Check error + [*] --> Stopped: Task cancelled +``` + +### Component Structure + +```mermaid 
+classDiagram
+    class monitor {
+        -service string
+        -config types.HealthCheckConfig
+        -url synk.Value~*url.URL~
+        -status synk.Value~HealthStatus~
+        -lastResult synk.Value~HealthCheckResult~
+        -checkHealth HealthCheckFunc
+        -startTime time.Time
+        -task *task.Task
+        +Start(parent task.Parent)
+        +CheckHealth() (HealthCheckResult, error)
+        +Status() HealthStatus
+        +Uptime() time.Duration
+        +Latency() time.Duration
+        +Detail() string
+    }
+
+    class HealthMonitor {
+        <<interface>>
+        +Start(parent task.Parent)
+        +Task() *task.Task
+        +Status() HealthStatus
+    }
+```
+
+## Configuration Surface
+
+### HealthCheckConfig
+
+```go
+type HealthCheckConfig struct {
+    Interval    time.Duration // Check interval (default: 30s)
+    Timeout     time.Duration // Check timeout (default: 10s)
+    Path        string        // Health check path
+    Method      string        // HTTP method (GET/HEAD)
+    Retries     int           // Consecutive failures before notification (-1 for immediate)
+    BaseContext func() context.Context
+}
+```
+
+### Defaults
+
+| Field    | Default |
+| -------- | ------- |
+| Interval | 30s     |
+| Timeout  | 10s     |
+| Method   | GET     |
+| Path     | "/"     |
+| Retries  | 3       |
+
+### Applying Defaults
+
+```go
+cfg.ApplyDefaults(state.Value().Defaults.HealthCheck)
+```
+
+## Dependency and Integration Map
+
+### Internal Dependencies
+
+- `internal/task/task.go` - Lifetime management
+- `internal/notif/` - Status change notifications
+- `internal/health/check/` - Health check implementations
+- `internal/types/` - Health status types
+- `internal/config/types/` - Working state
+
+### External Dependencies
+
+- `github.com/puzpuzpuz/xsync/v4` - Atomic values
+
+## Observability
+
+### Logs
+
+| Level   | When                           |
+| ------- | ------------------------------ |
+| `Info`  | Service comes up               |
+| `Warn`  | Service goes down              |
+| `Error` | Health check error             |
+| `Error` | Monitor stopped after 5 trials |
+
+### Notifications
+
+- Service up notification (with latency)
+- Service down notification (with last seen time)
+- Immediate notification when 
`Retries < 0`
+
+### Metrics
+
+- Consecutive failure count
+- Last check latency
+- Monitor uptime
+
+## Failure Modes and Recovery
+
+| Failure Mode                | Impact                                 | Recovery                |
+| --------------------------- | -------------------------------------- | ----------------------- |
+| 5 consecutive check errors  | Monitor enters Error state, task stops | Manual restart required |
+| Health check function panic | Monitor crashes                        | Automatic cleanup       |
+| Context cancellation        | Monitor stops gracefully               | Stopped state           |
+| URL update to invalid       | Check will fail                        | Manual URL fix          |
+
+### Status Transitions
+
+| From      | To        | Condition                      |
+| --------- | --------- | ------------------------------ |
+| Starting  | Healthy   | Check passes                   |
+| Starting  | Unhealthy | Check fails                    |
+| Healthy   | Unhealthy | `Retries` consecutive failures |
+| Healthy   | Error     | Check returns error            |
+| Unhealthy | Healthy   | Check passes                   |
+| Error     | Healthy   | Check passes                   |
+
+## Usage Examples
+
+### Creating an HTTP Monitor
+
+```go
+cfg := types.HealthCheckConfig{
+    Interval: 15 * time.Second,
+    Timeout:  5 * time.Second,
+    Path:     "/health",
+    Retries:  3,
+}
+url, _ := url.Parse("http://localhost:8080")
+
+mon := monitor.NewHTTPMonitor(context.Background(), cfg, url) // "mon" avoids shadowing the monitor package
+if err := mon.Start(parent); err != nil {
+    return err
+}
+
+// Check status
+fmt.Printf("Status: %s\n", mon.Status())
+fmt.Printf("Latency: %v\n", mon.Latency())
+```
+
+### Creating a Docker Monitor
+
+```go
+mon, err := monitor.NewDockerHealthMonitor(
+    context.Background(),
+    cfg,
+    url,
+    containerID,
+)
+if err != nil {
+    return err
+}
+mon.Start(parent)
+```
+
+### Unified Factory
+
+```go
+mon, err := monitor.NewMonitor(ctx, cfg, url)
+if err != nil {
+    return err
+}
+mon.Start(parent)
+```
+
+## Testing Notes
+
+- `monitor_test.go` - Monitor lifecycle tests
+- Mock health check functions for deterministic testing
+- Status transition coverage tests
+- Notification trigger tests
diff --git 
a/internal/homepage/README.md b/internal/homepage/README.md new file mode 100644 index 00000000..86642102 --- /dev/null +++ b/internal/homepage/README.md @@ -0,0 +1,358 @@ +# Homepage + +The homepage package provides the GoDoxy WebUI dashboard with support for categories, favorites, widgets, and dynamic item configuration. + +## Overview + +The homepage package implements the WebUI dashboard, managing homepage items, categories, sorting methods, and widget integration for monitoring container status and providing interactive features. + +### Key Features + +- Dynamic homepage item management +- Category-based organization (All, Favorites, Hidden, Others) +- Multiple sort methods (clicks, alphabetical, custom) +- Widget support for live data display +- Icon URL handling with favicon integration +- Item override configuration +- Click tracking and statistics + +## Architecture + +```mermaid +graph TD + A[HomepageMap] --> B{Category Management} + B --> C[All] + B --> D[Favorites] + B --> E[Hidden] + B --> F[Others] + + G[Item] --> H[ItemConfig] + H --> I[Widget Config] + H --> J[Icon] + H --> K[Category] + + L[Widgets] --> M[HTTP Widget] + N[Sorting] --> O[Clicks] + N --> P[Alphabetical] + N --> Q[Custom] +``` + +## Core Types + +### Homepage Structure + +```go +type HomepageMap struct { + ordered.Map[string, *Category] +} + +type Homepage []*Category + +type Category struct { + Items []*Item + Name string +} + +type Item struct { + ItemConfig + SortOrder int + FavSortOrder int + AllSortOrder int + Clicks int + Widgets []Widget + Alias string + Provider string + OriginURL string + ContainerID string +} + +type ItemConfig struct { + Show bool + Name string + Icon *IconURL + Category string + Description string + URL string + Favorite bool + WidgetConfig *widgets.Config +} +``` + +### Sort Methods + +```go +const ( + SortMethodClicks = "clicks" + SortMethodAlphabetical = "alphabetical" + SortMethodCustom = "custom" +) +``` + +### Categories + +```go +const ( + 
CategoryAll = "All" + CategoryFavorites = "Favorites" + CategoryHidden = "Hidden" + CategoryOthers = "Others" +) +``` + +## Public API + +### Creation + +```go +// NewHomepageMap creates a new homepage map with default categories. +func NewHomepageMap(total int) *HomepageMap +``` + +### Item Management + +```go +// Add adds an item to appropriate categories. +func (c *HomepageMap) Add(item *Item) + +// GetOverride returns the override configuration for an item. +func (cfg Item) GetOverride() Item +``` + +### Sorting + +```go +// Sort sorts a category by the specified method. +func (c *Category) Sort(method SortMethod) +``` + +## Usage + +### Creating a Homepage Map + +```go +homepageMap := homepage.NewHomepageMap(100) // Reserve space for 100 items +``` + +### Adding Items + +```go +item := &homepage.Item{ + Alias: "my-app", + Provider: "docker", + OriginURL: "http://myapp.local", + ItemConfig: homepage.ItemConfig{ + Name: "My Application", + Show: true, + Favorite: true, + Category: "Docker", + Description: "My Docker application", + }, +} + +homepageMap.Add(item) +``` + +### Sorting Categories + +```go +allCategory := homepageMap.Get(homepage.CategoryAll) +if allCategory != nil { + allCategory.Sort(homepage.SortMethodClicks) +} +``` + +### Filtering by Category + +```go +favorites := homepageMap.Get(homepage.CategoryFavorites) +for _, item := range favorites.Items { + fmt.Printf("Favorite: %s\n", item.Name) +} +``` + +## Widgets + +The homepage supports widgets for each item: + +```go +type Widget struct { + Label string + Value string +} + +type Config struct { + // Widget configuration +} +``` + +### Widget Types + +Widgets can display various types of information: + +- **Status**: Container health status +- **Stats**: Usage statistics +- **Links**: Quick access links +- **Custom**: Provider-specific data + +## Icon Handling + +Icons are handled via `IconURL` type: + +```go +type IconURL struct { + // Icon URL with various sources +} + +// Automatic favicon 
fetching from item URL +``` + +## Categories + +### Default Categories + +| Category | Description | +| --------- | ------------------------ | +| All | Contains all items | +| Favorites | User-favorited items | +| Hidden | Items with `Show: false` | +| Others | Uncategorized items | + +### Custom Categories + +Custom categories are created dynamically: + +```go +// Adding to custom category +item := &homepage.Item{ + ItemConfig: homepage.ItemConfig{ + Name: "App", + Category: "Development", + }, +} +homepageMap.Add(item) +// "Development" category is auto-created +``` + +## Override Configuration + +Items can have override configurations for customization: + +```go +// GetOverride returns the effective configuration +func (cfg Item) GetOverride() Item { + return overrideConfigInstance.GetOverride(cfg) +} +``` + +## Sorting Methods + +### Clicks Sort + +Sorts by click count (most clicked first): + +```go +func (c *Category) sortByClicks() { + slices.SortStableFunc(c.Items, func(a, b *Item) int { + if a.Clicks > b.Clicks { + return -1 + } + if a.Clicks < b.Clicks { + return 1 + } + return strings.Compare(title(a.Name), title(b.Name)) + }) +} +``` + +### Alphabetical Sort + +Sorts alphabetically by name: + +```go +func (c *Category) sortByAlphabetical() { + slices.SortStableFunc(c.Items, func(a, b *Item) int { + return strings.Compare(title(a.Name), title(b.Name)) + }) +} +``` + +### Custom Sort + +Sorts by predefined sort order: + +```go +func (c *Category) sortByCustom() { + // Uses SortOrder, FavSortOrder, AllSortOrder fields +} +``` + +## Data Flow + +```mermaid +sequenceDiagram + participant RouteProvider + participant HomepageMap + participant Category + participant Widget + + RouteProvider->>HomepageMap: Add(Item) + HomepageMap->>HomepageMap: Add to All + HomepageMap->>HomepageMap: Add to Category + alt Item.Favorite + HomepageMap->>CategoryFavorites: Add item + else !Item.Show + HomepageMap->>CategoryHidden: Add item + end + + User->>HomepageMap: Get Category 
+ HomepageMap-->>User: Items + + User->>Category: Sort(method) + Category-->>User: Sorted Items + + User->>Item: Get Widgets + Item->>Widget: Fetch Data + Widget-->>Item: Widget Data + Item-->>User: Display Widgets +``` + +## Integration Points + +The homepage package integrates with: + +- **Route Provider**: Item discovery from routes +- **Container**: Container status and metadata +- **Widgets**: Live data display +- **API**: Frontend data API +- **Configuration**: Default and override configs + +## Configuration + +### Active Configuration + +```go +var ActiveConfig atomic.Pointer[Config] +``` + +### Configuration Structure + +```go +type Config struct { + UseDefaultCategories bool + // ... other options +} +``` + +## Serialization + +The package registers default value factories for serialization: + +```go +func init() { + serialization.RegisterDefaultValueFactory(func() *ItemConfig { + return &ItemConfig{ + Show: true, + } + }) +} +``` diff --git a/internal/homepage/integrations/qbittorrent/README.md b/internal/homepage/integrations/qbittorrent/README.md new file mode 100644 index 00000000..5d06ee7a --- /dev/null +++ b/internal/homepage/integrations/qbittorrent/README.md @@ -0,0 +1,227 @@ +# qBittorrent Integration Package + +This package provides a qBittorrent widget for the GoDoxy homepage dashboard, enabling real-time monitoring of torrent status and transfer statistics. + +> [!WARNING] +> +> This package is a work in progress and is not stable. + +## Overview + +The `internal/homepage/integrations/qbittorrent` package implements the `widgets.Widget` interface for qBittorrent. It provides functionality to connect to a qBittorrent instance and fetch transfer information. 
+
+## Architecture
+
+### Core Components
+
+```
+integrations/qbittorrent/
+├── client.go        # Client and API methods
+├── transfer_info.go # Transfer info widget data
+├── version.go       # Version checking
+└── logs.go          # Log fetching
+```
+
+### Main Types
+
+```go
+type Client struct {
+    URL      string
+    Username string
+    Password string
+}
+```
+
+## API Reference
+
+### Client Methods
+
+#### Initialize
+
+Connects to the qBittorrent API and verifies authentication.
+
+```go
+func (c *Client) Initialize(ctx context.Context, url string, cfg map[string]any) error
+```
+
+**Parameters:**
+
+- `ctx` - Context for the HTTP request
+- `url` - Base URL of the qBittorrent instance
+- `cfg` - Configuration map containing `username` and `password`
+
+**Returns:**
+
+- `error` - Connection or authentication error
+
+**Example:**
+
+```go
+client := &qbittorrent.Client{}
+err := client.Initialize(ctx, "http://localhost:8080", map[string]any{
+    "username": "admin",
+    "password": "your-password",
+})
+if err != nil {
+    log.Fatalf("Failed to connect: %v", err)
+}
+```
+
+#### Data
+
+Returns current transfer statistics as name-value pairs.
+
+```go
+func (c *Client) Data(ctx context.Context) ([]widgets.NameValue, error)
+```
+
+**Returns:**
+
+- `[]widgets.NameValue` - Transfer statistics
+- `error` - API request error
+
+**Example:**
+
+```go
+data, err := client.Data(ctx)
+if err != nil {
+    log.Fatal(err)
+}
+for _, nv := range data {
+    fmt.Printf("%s: %s\n", nv.Name, nv.Value)
+}
+// Output:
+// Status: connected
+// Download: 1.5 GB
+// Upload: 256 MB
+// Download Speed: 5.2 MB/s
+// Upload Speed: 1.1 MB/s
+```
+
+### Internal Methods
+
+#### doRequest
+
+Performs an HTTP request to the qBittorrent API.
+
+```go
+func (c *Client) doRequest(ctx context.Context, method, endpoint string, query url.Values, body io.Reader) (*http.Response, error)
+```
+
+#### jsonRequest
+
+Performs a JSON API request and unmarshals the response.
+
+```go
+func jsonRequest[T any](ctx context.Context, client *Client, endpoint string, query url.Values) (result T, err error)
+```
+
+## Data Types
+
+### TransferInfo
+
+Represents transfer statistics from qBittorrent.
+
+```go
+type TransferInfo struct {
+    ConnectionStatus string `json:"connection_status"`
+    SessionDownloads uint64 `json:"dl_info_data"`
+    SessionUploads   uint64 `json:"up_info_data"`
+    DownloadSpeed    uint64 `json:"dl_info_speed"`
+    UploadSpeed      uint64 `json:"up_info_speed"`
+}
+```
+
+## API Endpoints
+
+| Endpoint                | Method | Description             |
+| ----------------------- | ------ | ----------------------- |
+| `/api/v2/transfer/info` | GET    | Get transfer statistics |
+| `/api/v2/app/version`   | GET    | Get qBittorrent version |
+
+## Usage Example
+
+### Complete Widget Usage
+
+```go
+package main
+
+import (
+    "context"
+    "fmt"
+    "strings"
+
+    "github.com/yusing/godoxy/internal/homepage/integrations/qbittorrent"
+)
+
+func main() {
+    ctx := context.Background()
+
+    // Create and initialize client
+    client := &qbittorrent.Client{}
+    err := client.Initialize(ctx, "http://localhost:8080", map[string]any{
+        "username": "admin",
+        "password": "password123",
+    })
+    if err != nil {
+        fmt.Printf("Connection failed: %v\n", err)
+        return
+    }
+
+    // Get transfer data
+    data, err := client.Data(ctx)
+    if err != nil {
+        fmt.Printf("Failed to get data: %v\n", err)
+        return
+    }
+
+    // Display in dashboard format
+    fmt.Println("qBittorrent Status:")
+    fmt.Println(strings.Repeat("-", 30))
+    for _, nv := range data {
+        fmt.Printf("  %-15s %s\n", nv.Name+":", nv.Value)
+    }
+}
+```
+
+## Integration with Homepage Widgets
+
+```mermaid
+graph TD
+    A[Homepage Dashboard] --> B[Widget Config]
+    B --> C{qBittorrent Provider}
+    C --> D[Create Client]
+    D --> E[Initialize with credentials]
+    E --> F[Fetch Transfer Info]
+    F --> G[Format as NameValue pairs]
+    G --> H[Render in UI]
+```
+
+### Widget Configuration
+
+```yaml
+widgets:
+  - 
provider: qbittorrent + config: + url: http://localhost:8080 + username: admin + password: password123 +``` + +## Error Handling + +```go +// Handle HTTP errors +resp, err := client.doRequest(ctx, http.MethodGet, endpoint, query, body) +if err != nil { + return nil, err +} +if resp.StatusCode != http.StatusOK { + return nil, widgets.ErrHTTPStatus.Subject(resp.Status) +} +``` + +## Related Packages + +- `internal/homepage/widgets` - Widget framework and interface +- `github.com/bytedance/sonic` - JSON serialization +- `github.com/yusing/goutils/strings` - String utilities for formatting diff --git a/internal/homepage/widgets/README.md b/internal/homepage/widgets/README.md new file mode 100644 index 00000000..c5aefa39 --- /dev/null +++ b/internal/homepage/widgets/README.md @@ -0,0 +1,188 @@ +# Homepage Widgets Package + +> [!WARNING] +> +> This package is a work in progress and is not stable. + +This package provides a widget framework for the GoDoxy homepage dashboard, enabling integration with various service providers to display real-time data. + +## Overview + +The `internal/homepage/widgets` package defines the widget interface and common utilities for building homepage widgets. It provides a standardized way to integrate external services into the homepage dashboard. 
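Widget configuration reaches `Initialize` as an untyped `map[string]any`, so implementations should guard their type assertions rather than assert directly. The `getString` helper and its defaults below are an illustrative sketch, not part of this package:

```go
package main

import "fmt"

// getString safely pulls a string out of a widget's config map,
// falling back to a default instead of panicking on a missing key
// or a bad type assertion.
func getString(cfg map[string]any, key, def string) string {
	if v, ok := cfg[key]; ok {
		if s, ok := v.(string); ok {
			return s
		}
	}
	return def
}

func main() {
	cfg := map[string]any{"username": "admin", "timeout": 10}
	fmt.Println(getString(cfg, "username", ""))      // present and a string
	fmt.Println(getString(cfg, "password", "unset")) // missing key -> default
	fmt.Println(getString(cfg, "timeout", "n/a"))    // wrong type -> default
}
```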
+ +## Architecture + +### Core Components + +``` +widgets/ +├── widgets.go # Widget interface and config +└── http.go # HTTP client and error definitions +``` + +### Data Types + +```go +type Config struct { + Provider string `json:"provider"` + Config Widget `json:"config"` +} + +type Widget interface { + Initialize(ctx context.Context, url string, cfg map[string]any) error + Data(ctx context.Context) ([]NameValue, error) +} + +type NameValue struct { + Name string `json:"name"` + Value string `json:"value"` +} +``` + +### Constants + +```go +const ( + WidgetProviderQbittorrent = "qbittorrent" +) +``` + +### Errors + +```go +var ErrInvalidProvider = gperr.New("invalid provider") +var ErrHTTPStatus = gperr.New("http status") +``` + +## API Reference + +### Widget Interface + +```go +type Widget interface { + // Initialize sets up the widget with connection configuration + Initialize(ctx context.Context, url string, cfg map[string]any) error + + // Data returns current widget data as name-value pairs + Data(ctx context.Context) ([]NameValue, error) +} +``` + +### Configuration + +#### Config.UnmarshalMap + +Parses widget configuration from a map. 
+ +```go +func (cfg *Config) UnmarshalMap(m map[string]any) error +``` + +**Parameters:** + +- `m` - Map containing `provider` and `config` keys + +**Returns:** + +- `error` - Parsing or validation error + +**Example:** + +```go +widgetCfg := widgets.Config{} +err := widgetCfg.UnmarshalMap(map[string]any{ + "provider": "qbittorrent", + "config": map[string]any{ + "username": "admin", + "password": "password123", + }, +}) +``` + +### HTTP Client + +```go +var HTTPClient = &http.Client{ + Timeout: 10 * time.Second, +} +``` + +### Available Providers + +- **qbittorrent** - qBittorrent torrent client integration (WIP) + +## Usage Example + +### Creating a Custom Widget + +```go +package mywidget + +import ( + "context" + "github.com/yusing/godoxy/internal/homepage/widgets" +) + +type MyWidget struct { + URL string + APIKey string +} + +func (m *MyWidget) Initialize(ctx context.Context, url string, cfg map[string]any) error { + m.URL = url + m.APIKey = cfg["api_key"].(string) + return nil +} + +func (m *MyWidget) Data(ctx context.Context) ([]widgets.NameValue, error) { + // Fetch data and return as name-value pairs + return []widgets.NameValue{ + {Name: "Status", Value: "Online"}, + {Name: "Uptime", Value: "24h"}, + }, nil +} +``` + +### Registering the Widget + +```go +// In widgets initialization +widgetProviders["mywidget"] = struct{}{} +``` + +### Using the Widget in Homepage + +```go +// Fetch widget data +widget := getWidget("qbittorrent") +data, err := widget.Data(ctx) +if err != nil { + log.Fatal(err) +} + +// Display data +for _, nv := range data { + fmt.Printf("%s: %s\n", nv.Name, nv.Value) +} +``` + +## Integration with Homepage + +```mermaid +graph TD + A[Homepage Dashboard] --> B[Widget Config] + B --> C[Widget Factory] + C --> D{Provider Type} + D -->|qbittorrent| E[qBittorrent Widget] + D -->|custom| F[Custom Widget] + E --> G[Initialize] + F --> G + G --> H[Data Fetch] + H --> I[Render UI] +``` + +## Related Packages + +- 
`internal/homepage/integrations/qbittorrent` - qBittorrent widget implementation +- `internal/serialization` - Configuration unmarshaling utilities +- `github.com/yusing/goutils/errs` - Error handling diff --git a/internal/idlewatcher/README.md b/internal/idlewatcher/README.md index 785d3295..56af1d86 100644 --- a/internal/idlewatcher/README.md +++ b/internal/idlewatcher/README.md @@ -1,378 +1,293 @@ # Idlewatcher -Idlewatcher manages container lifecycle based on idle timeout. When a container is idle for a configured duration, it can be automatically stopped, paused, or killed. When a request comes in, the container is woken up automatically. +Manages container lifecycle based on idle timeout, automatically stopping/pausing containers and waking them on request. -Idlewatcher also serves a small loading page (HTML + JS + CSS) and an SSE endpoint under [`internal/idlewatcher/types/paths.go`](internal/idlewatcher/types/paths.go:1) (prefixed with `/$godoxy/`) to provide wake events to browsers. +## Overview -## Architecture Overview +The `internal/idlewatcher` package implements idle-based container lifecycle management for GoDoxy. When a container is idle for a configured duration, it can be automatically stopped, paused, or killed. When a request arrives, the container is woken up automatically. 
-```mermaid -graph TB - subgraph Request Flow - HTTP[HTTP Request] -->|Intercept| W[Watcher] - Stream[Stream Request] -->|Intercept| W - end +### Primary Consumers - subgraph Wake Process - W -->|Wake| Wake[Wake Container] - Wake -->|Check Status| State[Container State] - Wake -->|Wait Ready| Health[Health Check] - Wake -->|Events| SSE[SSE Events] - end +- **Route layer**: Routes with idlewatcher config integrate with this package to manage container lifecycle +- **HTTP handlers**: Serve loading pages and SSE events during wake-up +- **Stream handlers**: Handle stream connections with idle detection - subgraph Idle Management - Timer[Idle Timer] -->|Timeout| Stop[Stop Container] - State -->|Running| Timer - State -->|Stopped| Timer - end +### Non-goals - subgraph Providers - Docker[DockerProvider] --> DockerAPI[Docker API] - Proxmox[ProxmoxProvider] --> ProxmoxAPI[Proxmox API] - end +- Does not implement container runtime operations directly (delegates to providers) +- Does not manage container dependencies beyond wake ordering +- Does not provide health checking (delegates to `internal/health/monitor`) - W -->|Uses| Providers +### Stability + +Internal package with stable public API. Changes to exported types require backward compatibility. 
+ +## Public API + +### Exported Types + +```go +// Watcher manages lifecycle of a single container +type Watcher struct { + // Embedded route helper for proxy/stream/health + routeHelper + + cfg *types.IdlewatcherConfig + + // Thread-safe state containers + provider synk.Value[idlewatcher.Provider] + state synk.Value[*containerState] + lastReset synk.Value[time.Time] + + // Timers and channels + idleTicker *time.Ticker + healthTicker *time.Ticker + readyNotifyCh chan struct{} + + // SSE event broadcasting (HTTP routes only) + eventChs *xsync.Map[chan *WakeEvent, struct{}] + eventHistory []WakeEvent +} ``` -## Directory Structure - -``` -idlewatcher/ -├── debug.go # Debug utilities for watcher inspection -├── errors.go # Error types and conversion -├── events.go # Wake event types and broadcasting -├── handle_http.go # HTTP request handling and loading page -├── handle_http_debug.go # Debug HTTP handler (!production builds) -├── handle_stream.go # Stream connection handling -├── health.go # Health monitor implementation + readiness tracking -├── loading_page.go # Loading page HTML/CSS/JS templates -├── state.go # Container state management -├── watcher.go # Core Watcher implementation -├── provider/ # Container provider implementations -│ ├── docker.go # Docker container management -│ └── proxmox.go # Proxmox LXC management -├── types/ -│ ├── container_status.go # ContainerStatus enum -│ ├── paths.go # Loading page + SSE paths -│ ├── provider.go # Provider interface definition -│ └── waker.go # Waker interface (http + stream + health) -└── html/ - ├── loading_page.html # Loading page template - ├── style.css # Loading page styles - └── loading.js # Loading page JavaScript +```go +// WakeEvent is broadcast via SSE during wake-up +type WakeEvent struct { + Type WakeEventType + Message string + Timestamp time.Time + Error string +} ``` -## Core Components +### Exported Functions/Methods -### Watcher +```go +// NewWatcher creates or reuses a watcher for the given route 
and config +func NewWatcher(parent task.Parent, r types.Route, cfg *types.IdlewatcherConfig) (*Watcher, error) -The main component that manages a single container's lifecycle: +// Wake wakes the container, blocking until ready +func (w *Watcher) Wake(ctx context.Context) error + +// Start begins the idle watcher loop +func (w *Watcher) Start(parent task.Parent) gperr.Error + +// ServeHTTP serves the loading page and SSE events +func (w *Watcher) ServeHTTP(rw http.ResponseWriter, r *http.Request) + +// ListenAndServe handles stream connections with idle detection +func (w *Watcher) ListenAndServe(ctx context.Context, preDial, onRead nettypes.HookFunc) + +// Key returns the unique key for this watcher +func (w *Watcher) Key() string +``` + +### Package-level Variables + +```go +var ( + // watcherMap is a global registry keyed by config.Key() + watcherMap map[string]*Watcher + watcherMapMu sync.RWMutex + + // singleFlight prevents duplicate wake calls for the same container + singleFlight singleflight.Group +) +``` + +## Architecture + +### Core Components ```mermaid classDiagram class Watcher { - +string Key() string - +Wake(ctx context.Context) error - +Start(parent task.Parent) gperr.Error - +ServeHTTP(rw ResponseWriter, r *Request) - +ListenAndServe(ctx context.Context, predial, onRead HookFunc) - -idleTicker: *time.Ticker - -healthTicker: *time.Ticker - -state: synk.Value~*containerState~ - -provider: synk.Value~Provider~ - -readyNotifyCh: chan struct{} - -eventChs: *xsync.Map~chan *WakeEvent, struct{}~ - -eventHistory: []WakeEvent - -dependsOn: []*dependency + +Wake(ctx) error + +Start(parent) gperr.Error + +ServeHTTP(ResponseWriter, *Request) + +ListenAndServe(ctx, preDial, onRead) + +Key() string } class containerState { - +status: ContainerStatus - +ready: bool - +err: error - +startedAt: time.Time - +healthTries: int + status ContainerStatus + ready bool + err error + startedAt time.Time + healthTries int } - class dependency { - +*Watcher - +waitHealthy: 
bool + class idlewatcher.Provider { + <> + +ContainerPause(ctx) error + +ContainerStart(ctx) error + +ContainerStop(ctx, signal, timeout) error + +ContainerStatus(ctx) (ContainerStatus, error) + +Watch(ctx) (eventCh, errCh) } Watcher --> containerState : manages - Watcher --> dependency : depends on + Watcher --> idlewatcher.Provider : uses ``` -Package-level helpers: - -- `watcherMap` is a global registry of watchers keyed by [`types.IdlewatcherConfig.Key()`](internal/types/idlewatcher.go:60), guarded by `watcherMapMu`. -- `singleFlight` is a global `singleflight.Group` keyed by container name to prevent duplicate wake calls. - -### Provider Interface - -Abstraction for different container backends: +### Component Interactions ```mermaid -classDiagram - class Provider { - <> - +ContainerPause(ctx) error - +ContainerUnpause(ctx) error - +ContainerStart(ctx) error - +ContainerStop(ctx, signal, timeout) error - +ContainerKill(ctx, signal) error - +ContainerStatus(ctx) (ContainerStatus, error) - +Watch(ctx) (eventCh, errCh) - +Close() - } - - class DockerProvider { - +client: *docker.SharedClient - +watcher: watcher.DockerWatcher - +containerID: string - } - - class ProxmoxProvider { - +*proxmox.Node - +vmid: int - +lxcName: string - +running: bool - } - - Provider <|-- DockerProvider - Provider <|-- ProxmoxProvider +flowchart TD + A[HTTP Request] --> B{Container Ready?} + B -->|Yes| C[Proxy Request] + B -->|No| D[Wake Container] + D --> E[SingleFlight Check] + E --> F[Wake Dependencies] + F --> G[Start Container] + G --> H[Health Check] + H -->|Pass| I[Notify Ready] + I --> J[SSE Event] + J --> K[Loading Page] + K --> L[Retry Request] ``` -### Container Status +### State Machine ```mermaid stateDiagram-v2 - [*] --> Napping: status=stopped|paused + [*] --> Napping: Container stopped/paused - Napping --> Starting: provider start/unpause event - Starting --> Ready: health check passes - Starting --> Error: health check error / startup timeout - - Ready --> Napping: idle 
timeout (pause/stop/kill) - Ready --> Error: health check error - - Error --> Napping: provider stop/pause event - Error --> Starting: provider start/unpause event -``` - -Implementation notes: - -- `Starting` is represented by `containerState{status: running, ready: false, startedAt: non-zero}`. -- `Ready` is represented by `containerState{status: running, ready: true}`. -- `Error` is represented by `containerState{status: error, err: non-nil}`. -- State is updated primarily from provider events in [`(*Watcher).watchUntilDestroy()`](internal/idlewatcher/watcher.go:553) and health checks in [`(*Watcher).checkUpdateState()`](internal/idlewatcher/health.go:104). - -## Lifecycle Flow - -### Wake Flow (HTTP) - -```mermaid -sequenceDiagram - participant C as Client - participant W as Watcher - participant P as Provider - participant SSE as SSE (/\$godoxy/wake-events) - - C->>W: HTTP Request - W->>W: resetIdleTimer() - Note over W: Handles /favicon.ico and /\$godoxy/* assets first - - alt Container already ready - W->>C: Reverse-proxy upstream (same request) - else - W->>W: Wake() (singleflight + deps) - - alt Non-HTML request OR NoLoadingPage=true - W->>C: 100 Continue - W->>W: waitForReady() (readyNotifyCh) - W->>C: Reverse-proxy upstream (same request) - else HTML + loading page - W->>C: Serve loading page (HTML) - C->>SSE: Connect (EventSource) - Note over SSE: Streams history + live wake events - C->>W: Retry original request when WakeEventReady - end - end -``` - -### Stream Wake Flow - -```mermaid -sequenceDiagram - participant C as Client - participant W as Watcher - - C->>W: Connect to stream - W->>W: preDial hook - W->>W: wakeFromStream() - alt Container ready - W->>W: Pass through - else - W->>W: Wake() (singleflight + deps) - W->>W: waitStarted() (wait for route to be started) - W->>W: waitForReady() (readyNotifyCh) - W->>C: Stream connected - end -``` - -### Idle Timeout Flow - -```mermaid -sequenceDiagram - participant Client as Client - participant T as 
Idle Timer - participant W as Watcher - participant P as Provider - participant D as Dependencies - - loop Every request - Client->>W: HTTP/Stream - W->>W: resetIdleTimer() - end - - T->>W: Timeout - W->>W: stopByMethod() - alt stop method = pause - W->>P: ContainerPause() - else stop method = stop - W->>P: ContainerStop(signal, timeout) - else kill method = kill - W->>P: ContainerKill(signal) - end - P-->>W: Result - W->>D: Stop dependencies - D-->>W: Done -``` - -## Dependency Management - -Watchers can depend on other containers being started first: - -```mermaid -graph LR - A[App] -->|depends on| B[Database] - A -->|depends on| C[Redis] - B -->|depends on| D[Cache] -``` - -```mermaid -sequenceDiagram - participant A as App Watcher - participant B as DB Watcher - participant P as Provider - - A->>B: Wake() - Note over B: SingleFlight prevents
duplicate wake - B->>P: ContainerStart() - P-->>B: Started - B->>B: Wait healthy - B-->>A: Ready - A->>P: ContainerStart() - P-->>A: Started -``` - -## Event System - -Wake events are broadcast via Server-Sent Events (SSE): - -```mermaid -classDiagram - class WakeEvent { - +Type: WakeEventType - +Message: string - +Timestamp: time.Time - +Error: string - +WriteSSE(w io.Writer) error - } - - class WakeEventType { - <> - WakeEventStarting - WakeEventWakingDep - WakeEventDepReady - WakeEventContainerWoke - WakeEventWaitingReady - WakeEventReady - WakeEventError - } - - WakeEvent --> WakeEventType -``` - -Notes: - -- The SSE endpoint is [`idlewatcher.WakeEventsPath`](internal/idlewatcher/types/paths.go:3). -- Each SSE subscriber gets a dedicated buffered channel; the watcher also keeps an in-memory `eventHistory` that is sent to new subscribers first. -- `eventHistory` is cleared when the container transitions to napping (stop/pause). - -## State Machine - -```mermaid -stateDiagram-v2 - Napping --> Starting: provider start/unpause event + Napping --> Starting: Wake() called Starting --> Ready: Health check passes - Starting --> Error: Health check fails / startup timeout - Error --> Napping: provider stop/pause event - Error --> Starting: provider start/unpause event + Starting --> Error: Health check fails / timeout + Ready --> Napping: Idle timeout Ready --> Napping: Manual stop - note right of Napping - Container is stopped or paused - Idle timer stopped - end note - - note right of Starting - Container is running but not ready - Health checking active - Events broadcasted - end note - - note right of Ready - Container healthy - Idle timer running - end note + Error --> Starting: Retry wake + Error --> Napping: Container stopped externally ``` -## Key Files +## Configuration Surface -| File | Purpose | -| --------------------- | ----------------------------------------------------- | -| `watcher.go` | Core Watcher implementation with lifecycle management | -| 
`handle_http.go` | HTTP interception and loading page serving | -| `handle_stream.go` | Stream connection wake handling | -| `provider/docker.go` | Docker container operations | -| `provider/proxmox.go` | Proxmox LXC container operations | -| `state.go` | Container state transitions | -| `events.go` | Event broadcasting via SSE | -| `health.go` | Health monitor implementation + readiness tracking | +Configuration is defined in `types.IdlewatcherConfig`: -## Configuration +```go +type IdlewatcherConfig struct { + IdlewatcherConfigBase + Docker *types.DockerProviderConfig // Exactly one required + Proxmox *types.ProxmoxProviderConfig // Exactly one required +} -See [`types.IdlewatcherConfig`](internal/types/idlewatcher.go:27) for configuration options: +type IdlewatcherConfigBase struct { + IdleTimeout time.Duration // Duration before container is stopped + StopMethod types.ContainerMethod // pause, stop, or kill + StopSignal types.ContainerSignal // Signal to send + StopTimeout int // Timeout in seconds + WakeTimeout time.Duration // Max time to wait for wake + DependsOn []string // Container dependencies + StartEndpoint string // Optional path restriction + NoLoadingPage bool // Skip loading page +} +``` -- `IdleTimeout`: Duration before container is put to sleep -- `StopMethod`: pause, stop, or kill -- `StopSignal`: Signal to send when stopping -- `StopTimeout`: Timeout for stop operation -- `WakeTimeout`: Timeout for wake operation -- `DependsOn`: List of dependent containers -- `StartEndpoint`: Optional HTTP path restriction for wake requests -- `NoLoadingPage`: Skip loading page, wait directly +### Docker Labels -Provider config (exactly one must be set): +```yaml +labels: + proxy.idle_timeout: 5m + proxy.idle_stop_method: stop + proxy.idle_depends_on: database:redis +``` -- `Docker`: container id/name + docker connection info -- `Proxmox`: `node` + `vmid` +### Path Constants -## Thread Safety +```go +const ( + LoadingPagePath = "/$godoxy/loading" + 
WakeEventsPath = "/$godoxy/wake-events" +) +``` -- Uses `synk.Value` for atomic state updates -- Uses `xsync.Map` for SSE subscriber management -- Uses `sync.RWMutex` for watcher map (`watcherMapMu`) and SSE event history (`eventHistoryMu`) -- Uses `singleflight.Group` to prevent duplicate wake calls +## Dependency and Integration Map + +| Dependency | Purpose | +| -------------------------------- | --------------------------- | +| `internal/health/monitor` | Health checking during wake | +| `internal/route/routes` | Route registry lookup | +| `internal/docker` | Docker client connection | +| `internal/proxmox` | Proxmox LXC management | +| `internal/watcher/events` | Container event watching | +| `pkg/gperr` | Error handling | +| `xsync/v4` | Concurrent maps | +| `golang.org/x/sync/singleflight` | Duplicate wake suppression | + +## Observability + +### Logs + +- **INFO**: Wake start, container started, ready notification +- **DEBUG**: State transitions, health check details +- **ERROR**: Wake failures, health check errors + +Log context includes: `alias`, `key`, `provider`, `method` + +### Metrics + +No metrics exposed directly; health check metrics available via `internal/health/monitor`. 
+
+## Security Considerations
+
+- Loading page and SSE endpoints are mounted under `/$godoxy/` path
+- No authentication on loading page; assumes internal network trust
+- SSE event history may contain container names (visible to connected clients)
+
+## Failure Modes and Recovery
+
+| Failure                       | Behavior                                           | Recovery                       |
+| ----------------------------- | -------------------------------------------------- | ------------------------------ |
+| Wake timeout                  | Returns error, container remains in current state  | Retry wake with longer timeout |
+| Health check fails repeatedly | Container marked as error, retries on next request | External fix required          |
+| Provider connection lost      | SSE disconnects, next request retries wake         | Reconnect on next request      |
+| Dependencies fail to start    | Wake fails with dependency error                   | Fix dependency container       |
+
+## Usage Examples
+
+### Basic HTTP Route with Idlewatcher
+
+```go
+route := &route.Route{
+    Alias: "myapp",
+    Idlewatcher: &types.IdlewatcherConfig{
+        IdlewatcherConfigBase: types.IdlewatcherConfigBase{
+            IdleTimeout: 5 * time.Minute,
+            StopMethod:  types.ContainerMethodStop,
+            StopTimeout: 30,
+        },
+        Docker: &types.DockerProviderConfig{
+            ContainerID: "abc123",
+        },
+    },
+}
+
+w, err := idlewatcher.NewWatcher(parent, route, route.Idlewatcher)
+if err != nil {
+    return err
+}
+return w.Start(parent)
+```
+
+### Watching Wake Events
+
+```js
+// Events are automatically served at /$godoxy/wake-events
+// Client connects via EventSource:
+
+const eventSource = new EventSource("/$godoxy/wake-events");
+eventSource.onmessage = (e) => {
+  const event = JSON.parse(e.data);
+  console.log(`Wake event: ${event.type}`, event.message);
+};
+```
+
+## Testing Notes
+
+- Unit tests cover state machine transitions
+- Integration tests with Docker daemon for provider operations
+- Mock provider for testing wake flow without real containers
diff --git a/internal/idlewatcher/provider/README.md b/internal/idlewatcher/provider/README.md
new
file mode 100644 index 00000000..5224098e --- /dev/null +++ b/internal/idlewatcher/provider/README.md @@ -0,0 +1,219 @@ +# Idlewatcher Provider + +Implements container runtime abstractions for Docker and Proxmox LXC backends. + +## Overview + +The `internal/idlewatcher/provider` package implements the `idlewatcher.Provider` interface for different container runtimes. It enables the idlewatcher to manage containers regardless of the underlying runtime (Docker or Proxmox LXC). + +### Primary Consumers + +- **idlewatcher.Watcher**: Uses providers to perform container lifecycle operations +- **Package tests**: Verify provider contract compliance + +### Non-goals + +- Does not implement idle detection logic +- Does not manage route configuration +- Does not handle health checking + +### Stability + +Internal package implementing stable `idlewatcher.Provider` interface. + +## Public API + +### Provider Interface + +```go +type Provider interface { + // Lifecycle operations + ContainerPause(ctx context.Context) error + ContainerUnpause(ctx context.Context) error + ContainerStart(ctx context.Context) error + ContainerStop(ctx context.Context, signal types.ContainerSignal, timeout int) error + ContainerKill(ctx context.Context, signal types.ContainerSignal) error + + // Status and monitoring + ContainerStatus(ctx context.Context) (ContainerStatus, error) + Watch(ctx context.Context) (eventCh <-chan events.Event, errCh <-chan gperr.Error) + + // Cleanup + Close() +} +``` + +### Container Status + +```go +type ContainerStatus string + +const ( + ContainerStatusRunning ContainerStatus = "running" + ContainerStatusStopped ContainerStatus = "stopped" + ContainerStatusPaused ContainerStatus = "paused" + ContainerStatusError ContainerStatus = "error" +) +``` + +### Exported Functions + +```go +// NewDockerProvider creates a provider for Docker containers +func NewDockerProvider(dockerCfg types.DockerProviderConfig, containerID string) (idlewatcher.Provider, error) + +// 
NewProxmoxProvider creates a provider for Proxmox LXC containers +func NewProxmoxProvider(ctx context.Context, nodeName string, vmid int) (idlewatcher.Provider, error) +``` + +## Architecture + +### Core Components + +```mermaid +classDiagram + class Provider { + <<interface>> + +ContainerPause(ctx) error + +ContainerStart(ctx) error + +ContainerStop(ctx, signal, timeout) error + +ContainerStatus(ctx) (ContainerStatus, error) + +Watch(ctx) (eventCh, errCh) + +Close() + } + + class DockerProvider { + +client *docker.SharedClient + +watcher watcher.DockerWatcher + +containerID string + +ContainerPause(ctx) error + +ContainerStart(ctx) error + +ContainerStatus(ctx) (ContainerStatus, error) + } + + class ProxmoxProvider { + +*proxmox.Node + +vmid int + +lxcName string + +running bool + +ContainerStart(ctx) error + +ContainerStop(ctx, signal, timeout) error + } + + Provider <|-- DockerProvider + Provider <|-- ProxmoxProvider +``` + +### Component Interactions + +```mermaid +flowchart TD + A[Watcher] --> B{Provider Type} + B -->|Docker| C[DockerProvider] + B -->|Proxmox| D[ProxmoxProvider] + + C --> E[Docker API] + D --> F[Proxmox API] + + E --> G[Container Events] + F --> H[LXC Events] + + G --> A + H --> A +``` + +## Configuration Surface + +### Docker Provider Config + +```go +type DockerProviderConfig struct { + URL string // Docker socket URL (unix:///var/run/docker.sock) + SocketPath string // Alternative socket path +} +``` + +### Proxmox Provider Config + +Provided via `NewProxmoxProvider` parameters: + +- `nodeName`: Proxmox node name +- `vmid`: LXC container ID + +## Dependency and Integration Map + +| Dependency | Purpose | +| ------------------------- | -------------------------------------- | +| `internal/docker` | Docker client and container operations | +| `internal/proxmox` | Proxmox API client | +| `internal/watcher` | Event watching for container changes | +| `internal/watcher/events` | Event types | +| `pkg/gperr` | Error handling | + +## Observability + +### 
Logs + +- **DEBUG**: API calls and responses +- **ERROR**: Operation failures with context + +Log context includes: `container`, `vmid`, `action` + +## Security Considerations + +- Docker provider requires access to Docker socket +- Proxmox provider requires API credentials +- Both handle sensitive container operations + +## Failure Modes and Recovery + +| Failure | Behavior | Recovery | +| ------------------------- | ------------------------ | --------------------------- | +| Docker socket unavailable | Returns connection error | Fix socket permissions/path | +| Container not found | Returns not found error | Verify container ID | +| Proxmox node unavailable | Returns API error | Check network/node | +| Operation timeout | Returns timeout error | Increase timeout or retry | + +## Usage Examples + +### Creating a Docker Provider + +```go +provider, err := provider.NewDockerProvider(types.DockerProviderConfig{ + SocketPath: "/var/run/docker.sock", +}, "abc123def456") +if err != nil { + return err +} +defer provider.Close() + +// Check container status +status, err := provider.ContainerStatus(ctx) +if err != nil { + return err +} + +// Start container if stopped +if status == idlewatcher.ContainerStatusStopped { + if err := provider.ContainerStart(ctx); err != nil { + return err + } +} +``` + +### Watching for Container Events + +```go +eventCh, errCh := provider.Watch(ctx) + +for { + select { + case <-ctx.Done(): + return + case event := <-eventCh: + log.Printf("Container %s: %s", event.ActorName, event.Action) + case err := <-errCh: + log.Printf("Watch error: %v", err) + } +} +``` diff --git a/internal/jsonstore/README.md b/internal/jsonstore/README.md new file mode 100644 index 00000000..7c091173 --- /dev/null +++ b/internal/jsonstore/README.md @@ -0,0 +1,364 @@ +# JSON Store + +The jsonstore package provides persistent JSON storage with namespace support, using thread-safe concurrent maps and automatic loading/saving. 
+ +## Overview + +The jsonstore package implements a simple yet powerful JSON storage system for GoDoxy, supporting both key-value stores (MapStore) and single object stores (ObjectStore) with automatic persistence to JSON files. + +### Key Features + +- Namespace-based storage +- Thread-safe concurrent map operations (xsync) +- Automatic JSON loading on initialization +- Automatic JSON saving on program exit +- Generic type support +- Marshal/Unmarshal integration + +## Architecture + +```mermaid +graph TD + A[JSON Store] --> B{Namespace} + B --> C[MapStore] + B --> D[ObjectStore] + + C --> E[xsync.Map] + D --> F[Single Object] + + G[Storage File] --> H[Load on Init] + H --> I[Parse JSON] + I --> J[xsync.Map or Object] + + K[Program Exit] --> L[Save All] + L --> M[Serialize to JSON] + M --> N[Write Files] +``` + +## Core Components + +### MapStore + +```go +type MapStore[VT any] struct { + *xsync.Map[string, VT] +} + +// Implements: +// - Initialize() - initializes the internal map +// - MarshalJSON() - serializes to JSON +// - UnmarshalJSON() - deserializes from JSON +``` + +### ObjectStore + +```go +type ObjectStore[Pointer Initializer] struct { + ptr Pointer +} + +// Initializer interface requires: +// - Initialize() +``` + +### Store Interface + +```go +type store interface { + Initialize() + json.Marshaler + json.Unmarshaler +} +``` + +## Public API + +### MapStore Creation + +```go +// Store creates a new namespace map store. +func Store[VT any](namespace namespace) MapStore[VT] +``` + +### ObjectStore Creation + +```go +// Object creates a new namespace object store. 
+func Object[Ptr Initializer](namespace namespace) Ptr +``` + +## Usage + +### MapStore Example + +```go +// Define the value type stored under this namespace +type SessionToken string + +// Create a store for user sessions (keys are strings) +var sessions = jsonstore.Store[SessionToken]("sessions") + +// Store a value +sessions.Store("user123", "session-token-abc") + +// Load a value +token, ok := sessions.Load("user123") +if ok { + fmt.Println("Session:", token) +} + +// Iterate over all entries +for id, token := range sessions.Range { + fmt.Printf("%s: %s\n", id, token) +} + +// Delete a value +sessions.Delete("user123") +``` + +### ObjectStore Example + +```go +// Define a struct that implements Initialize +type AppConfig struct { + Name string + Version int +} + +func (c *AppConfig) Initialize() { + c.Name = "MyApp" + c.Version = 1 +} + +// Create an object store +var config = jsonstore.Object[*AppConfig]("app_config") + +// Access the object +fmt.Printf("App: %s v%d\n", config.Name, config.Version) + +// Modify and save (automatic on exit) +config.Version = 2 +``` + +### Complete Example + +```go +package main + +import ( + "github.com/yusing/godoxy/internal/jsonstore" +) + +type Settings struct { + Theme string + Lang string +} + +func (s *Settings) Initialize() { + s.Theme = "dark" + s.Lang = "en" +} + +func main() { + // Create stores + var settings = jsonstore.Object[*Settings]("settings") + var cache = jsonstore.Store[string]("cache") + + // Use stores + settings.Theme = "light" + cache.Store("key1", "value1") + + // On program exit, all stores are automatically saved +} +``` + +## Data Flow + +```mermaid +sequenceDiagram + participant Application + participant Store + participant xsync.Map + participant File + + Application->>Store: Store(key, value) + Store->>xsync.Map: Store(key, value) + xsync.Map-->>Store: Done + + Application->>Store: Load(key) + Store->>xsync.Map: Load(key) + xsync.Map-->>Store: value + Store-->>Application: value + + 
Application->>Store: Save() + Store->>File: Marshal JSON + File-->>Store: Success + + Note over Store,File: On program exit + Store->>File: Save all stores + File-->>Store: Complete +``` + +## Namespace + +Namespaces are string identifiers for different storage areas: + +```go +type namespace string + +// Create namespaces +var ( + users = jsonstore.Store[User]("users") + sessions = jsonstore.Store[Session]("sessions") + config = jsonstore.Object[*Config]("config") + metadata = jsonstore.Store[string]("metadata") +) +``` + +### Reserved Names + +None + +## File Storage + +### File Location + +```go +var storesPath = common.DataDir // Typically ./data/.{namespace}.json +``` + +### File Format + +Stores are saved as `{namespace}.json`: + +```json +{ + "key1": "value1", + "key2": "value2" +} +``` + +### Automatic Loading + +On initialization, stores are loaded from disk: + +```go +func loadNS[T store](ns namespace) T { + store := reflect.New(reflect.TypeFor[T]().Elem()).Interface().(T) + store.Initialize() + + path := filepath.Join(storesPath, string(ns)+".json") + file, err := os.Open(path) + if err != nil { + if !os.IsNotExist(err) { + log.Err(err).Msg("failed to load store") + } + return store + } + defer file.Close() + + if err := sonic.ConfigDefault.NewDecoder(file).Decode(&store); err != nil { + log.Err(err).Msg("failed to decode store") + } + + stores[ns] = store + return store +} +``` + +### Automatic Saving + +On program exit, all stores are saved: + +```go +func init() { + task.OnProgramExit("save_stores", func() { + if err := save(); err != nil { + log.Error().Err(err).Msg("failed to save stores") + } + }) +} + +func save() error { + for ns, store := range stores { + path := filepath.Join(storesPath, string(ns)+".json") + if err := serialization.SaveJSON(path, &store, 0644); err != nil { + return err + } + } + return nil +} +``` + +## Thread Safety + +The MapStore uses `xsync.Map` for thread-safe operations: + +```go +type MapStore[VT any] struct { + 
*xsync.Map[string, VT] +} + +// All operations are safe: +// - Load, Store, Delete +// - Range iteration +// - LoadAndDelete +// - LoadOrCompute +``` + +## JSON Serialization + +### MarshalJSON + +```go +func (s MapStore[VT]) MarshalJSON() ([]byte, error) { + return sonic.Marshal(xsync.ToPlainMap(s.Map)) +} +``` + +### UnmarshalJSON + +```go +func (s *MapStore[VT]) UnmarshalJSON(data []byte) error { + tmp := make(map[string]VT) + if err := sonic.Unmarshal(data, &tmp); err != nil { + return err + } + s.Map = xsync.NewMap[string, VT](xsync.WithPresize(len(tmp))) + for k, v := range tmp { + s.Store(k, v) + } + return nil +} +``` + +## Integration Points + +The jsonstore package integrates with: + +- **Serialization**: JSON marshaling/unmarshaling +- **Task Management**: Program exit callbacks +- **Common**: Data directory configuration + +## Error Handling + +Errors are logged but don't prevent store usage: + +```go +if err := sonic.Unmarshal(data, &tmp); err != nil { + log.Err(err). + Str("path", path). + Msg("failed to load store") +} +``` + +## Performance Considerations + +- Uses `xsync.Map` for lock-free reads +- Presizes maps based on input data +- Sonic library for fast JSON parsing +- Background save on program exit (non-blocking) diff --git a/internal/logging/README.md b/internal/logging/README.md index 457a6386..b9c204fb 100644 --- a/internal/logging/README.md +++ b/internal/logging/README.md @@ -1,30 +1,46 @@ # Logging Package -This package provides structured logging capabilities for GoDoxy, including application logging, HTTP access logging, and in-memory log streaming. +Structured logging capabilities for GoDoxy, including application logging, HTTP access logging, and in-memory log streaming. 
-## Structure +## Overview -``` -internal/logging/ -├── logging.go # Main logger initialization using zerolog -├── accesslog/ # HTTP access logging with rotation and filtering -│ ├── access_logger.go # Core logging logic and buffering -│ ├── multi_access_logger.go # Fan-out to multiple writers -│ ├── config.go # Configuration types and defaults -│ ├── formatter.go # Log format implementations -│ ├── file_logger.go # File I/O with reference counting -│ ├── rotate.go # Log rotation based on retention policy -│ ├── writer.go # Buffered/unbuffered writer abstractions -│ ├── back_scanner.go # Backward line scanning for rotation -│ ├── filter.go # Request filtering by status/method/header -│ ├── retention.go # Retention policy definitions -│ ├── response_recorder.go # HTTP response recording middleware -│ └── ... # Tests and utilities -└── memlogger/ # In-memory circular buffer with WebSocket streaming - └── mem_logger.go # Ring buffer with WebSocket event notifications -``` +This package provides structured logging for GoDoxy with three distinct subsystems: -## Architecture Overview +- **Application Logger**: Zerolog-based console logger with level-aware formatting +- **Access Logger**: HTTP request/response logging with configurable formats, filters, and destinations +- **In-Memory Logger**: Circular buffer with WebSocket streaming for real-time log viewing + +### Primary Consumers + +- `internal/api/` - HTTP request logging +- `internal/route/` - Route-level access logging +- WebUI - Real-time log streaming via WebSocket + +### Non-goals + +- Log aggregation across multiple GoDoxy instances +- Persistent storage of application logs (access logs only) +- Structured logging output to external systems (Datadog, etc.) + +### Stability + +Internal package with stable APIs. Exported interfaces (`AccessLogger`, `MemLogger`) are stable. + +## Packages + +### `accesslog/` + +HTTP request/response logging with configurable formats, filters, and destinations. 
+ +See [accesslog/README.md](./accesslog/README.md) for full documentation. + +### `memlogger/` + +In-memory circular buffer with WebSocket streaming for real-time log viewing. + +See [memlogger/README.md](./memlogger/README.md) for full documentation. + +## Architecture ```mermaid graph TB @@ -43,13 +59,6 @@ graph TB W --> S[Stdout] end - subgraph "Log Rotation" - B --> RT[Rotate Timer] - RT --> BS[BackScanner] - BS --> T[Truncate/Move] - T --> F1 - end - subgraph "In-Memory Logger" WB[Write Buffer] WB --> RB[Circular Buffer
16KB max] @@ -58,206 +67,51 @@ graph TB end ``` -## Components +## Configuration Surface -### 1. Application Logger (`logging.go`) +### Access Log Configuration -Initializes a zerolog-based console logger with level-aware formatting: +See [accesslog/README.md](./accesslog/README.md) for configuration options. -- **Levels**: Trace → Debug → Info (determined by `common.IsTrace`/`common.IsDebug`) -- **Time Format**: 04:05 (trace) or 01-02 15:04 (debug/info) -- **Multi-line Handling**: Automatically indents continuation lines +### In-Memory Logger -```go -// Auto-initialized on import -func InitLogger(out ...io.Writer) +See [memlogger/README.md](./memlogger/README.md) for configuration options. -// Create logger with fixed level -NewLoggerWithFixedLevel(level zerolog.Level, out ...io.Writer) -``` +## Dependency and Integration Map -### 2. Access Logging (`accesslog/`) +### Internal Dependencies -Logs HTTP requests/responses with configurable formats, filters, and destinations. +- `internal/task/task.go` - Lifetime management +- `internal/maxmind/` - IP geolocation for ACL logging +- `pkg/gperr` - Error handling -#### Core Interface +### External Dependencies -```go -type AccessLogger interface { - Log(req *http.Request, res *http.Response) - LogError(req *http.Request, err error) - LogACL(info *maxmind.IPInfo, blocked bool) - Config() *Config - Flush() - Close() error -} -``` +- `github.com/rs/zerolog` - Structured logging +- `github.com/puzpuzpuz/xsync/v4` - Concurrent maps +- `golang.org/x/time/rate` - Error rate limiting -#### Log Formats +## Observability -| Format | Description | -| ---------- | --------------------------------- | -| `common` | Basic Apache Common format | -| `combined` | Common + Referer + User-Agent | -| `json` | Structured JSON with full details | +### Logs -#### Example Output +| Level | When | +| ------- | ---------------------------------------- | +| `Debug` | Buffer size adjustments, rotation checks | +| `Info` | Log rotation events, file 
opens/closes | +| `Error` | Write failures (rate-limited) | -``` -common: localhost 127.0.0.1 - - [01-04 10:30:45] "GET /api HTTP/1.1" 200 1234 -combined: localhost 127.0.0.1 - - [01-04 10:30:45] "GET /api HTTP/1.1" 200 1234 "https://example.com" "Mozilla/5.0" -json: {"time":"04/Jan/2025:10:30:45 +0000","ip":"127.0.0.1","method":"GET",...} -``` +## Failure Modes and Recovery -#### Filters +| Failure Mode | Impact | Recovery | +| --------------------------- | ------------------------ | ----------------------------------------------------------- | +| File write failure | Log entries dropped | Rate-limited error logging; task termination after 5 errors | +| Disk full | Rotation fails | Continue logging until space available | +| WebSocket client disconnect | Client misses logs | Client reconnects to receive new logs | +| Buffer overflow (memlogger) | Oldest entries truncated | Automatic truncation at 50% threshold | -Filter incoming requests before logging: +## Testing Notes -- **StatusCodes**: Keep/drop by HTTP status code range -- **Method**: Keep/drop by HTTP method -- **Headers**: Match header existence or value -- **CIDR**: Match client IP against CIDR ranges - -#### Multi-Destination Support - -```mermaid -graph LR - A[Request] --> B[MultiAccessLogger] - B --> C[AccessLogger 1] --> F[File] - B --> D[AccessLogger 2] --> S[Stdout] -``` - -### 3. File Management (`file_logger.go`) - -- **Reference Counting**: Multiple loggers can share the same file -- **Auto-Close**: File closes when ref count reaches zero -- **Thread-Safe**: Shared mutex per file path - -### 4. Log Rotation (`rotate.go`) - -Rotates logs based on retention policy: - -| Policy | Description | -| ---------- | ----------------------------------- | -| `Days` | Keep logs within last N days | -| `Last` | Keep last N log lines | -| `KeepSize` | Keep last N bytes (simple truncate) | - -**Algorithm** (for Days/Last): - -1. Scan file backward line-by-line using `BackScanner` -2. 
Parse timestamps to find cutoff point -3. Move retained lines to file front -4. Truncate excess - -```mermaid -flowchart LR - A[File End] --> B[BackScanner] - B --> C{Valid timestamp?} - C -->|No| D[Skip line] - C -->|Yes| E{Within retention?} - E -->|No| F[Keep line] - E -->|Yes| G[Stop scanning] - F --> H[Move to front] - G --> I[Truncate rest] -``` - -### 5. Buffering (`access_logger.go`) - -- **Dynamic Sizing**: Adjusts buffer size based on write throughput -- **Initial**: 4KB → **Max**: 8MB -- **Adjustment**: Every 5 seconds based on writes-per-second - -### 6. In-Memory Logger (`memlogger/`) - -Circular buffer for real-time log streaming via WebSocket: - -- **Size**: 16KB maximum, auto-truncates old entries -- **Streaming**: WebSocket connection receives live updates -- **Events API**: Subscribe to log events - -```go -// HTTP handler for WebSocket streaming -HandlerFunc() gin.HandlerFunc - -// Subscribe to log events -Events() (<-chan []byte, func()) - -// Write to buffer (implements io.Writer) -Write(p []byte) (n int, err error) -``` - -## Configuration - -```yaml -access_log: - path: /var/log/godoxy/access.log # File path (optional) - stdout: true # Also log to stdout (optional) - format: combined # common | combined | json - rotate_interval: 1h # How often to check rotation - retention: - days: 30 # Keep last 30 days - # OR - last: 10000 # Keep last 10000 lines - # OR - keep_size: 100MB # Keep last 100MB - filters: - status_codes: [400-599] # Only log errors - method: [GET, POST] - headers: - - name: X-Internal - value: "true" - cidr: - - 10.0.0.0/8 - fields: - headers: drop # keep | drop | redacted - query: keep # keep | drop | redacted - cookies: drop # keep | drop | redacted -``` - -## Data Flow - -```mermaid -sequenceDiagram - participant C as Client - participant M as Middleware - participant R as ResponseRecorder - participant F as Formatter - participant B as BufferedWriter - participant W as Writer - - C->>M: HTTP Request - M->>R: Capture request 
- R-->>M: Continue - - M->>M: Process request - - C->>M: HTTP Response - M->>R: Capture response - R->>F: Format log line - F->>B: Write formatted line - B->>W: Flush when needed - - par File Writer - W->>File: Append line - and Stdout Writer - W->>Stdout: Print line - end - - Note over B,W: Periodic rotation check - W->>File: Rotate if needed -``` - -## Key Design Patterns - -1. **Interface Segregation**: Small, focused interfaces (`AccessLogger`, `Writer`, `BufferedWriter`) - -2. **Dependency Injection**: Writers injected at creation for flexibility - -3. **Reference Counting**: Shared file handles prevent too-many-open-files - -4. **Dynamic Buffering**: Adapts to write throughput automatically - -5. **Backward Scanning**: Efficient rotation without loading entire file - -6. **Zero-Allocation Formatting**: Build log lines in pre-allocated buffers +- `access_logger_test.go` - Integration tests with mock file system +- `file_logger_test.go` - Reference counting tests +- `back_scanner_test.go` - Rotation boundary tests diff --git a/internal/logging/accesslog/README.md b/internal/logging/accesslog/README.md new file mode 100644 index 00000000..66ee8511 --- /dev/null +++ b/internal/logging/accesslog/README.md @@ -0,0 +1,493 @@ +# Access Logging + +Provides HTTP access logging with file rotation, log filtering, and multiple output formats for request and ACL event logging. + +## Overview + +The accesslog package captures HTTP request/response information and writes it to files or stdout. It includes configurable log formats, filtering rules, and automatic log rotation with retention policies. 
+ +### Primary Consumers + +- `internal/route` - Middleware for logging proxied requests +- `internal/acl` - ACL decision logging +- `internal/api` - Request audit trails + +### Non-goals + +- Does not provide log parsing or analysis +- Does not implement log aggregation across services +- Does not provide log shipping to external systems +- Does not implement access control (use `internal/acl`) + +### Stability + +Internal package. Public interfaces are stable. + +## Public API + +### Exported Types + +#### AccessLogger Interface + +```go +type AccessLogger interface { + // Log records an HTTP request and response + Log(req *http.Request, res *http.Response) + + // LogError logs a request with an error status code + LogError(req *http.Request, err error) + + // LogACL logs an ACL block/allow event + LogACL(info *maxmind.IPInfo, blocked bool) + + // Config returns the logger configuration + Config() *Config + + // Flush forces any buffered log data to be written + Flush() + + // Close closes the logger and releases resources + Close() error +} +``` + +Main interface for logging HTTP requests and ACL events. + +#### Writer Interface + +```go +type Writer interface { + io.WriteCloser + ShouldBeBuffered() bool + Name() string // file name or path +} +``` + +Output destination interface. + +#### Format Type + +```go +type Format string + +const ( + FormatCommon Format = "common" + FormatCombined Format = "combined" + FormatJSON Format = "json" +) +``` + +Log format constants. + +### Configuration Types + +#### RequestLoggerConfig + +```go +type RequestLoggerConfig struct { + ConfigBase + Format Format `json:"format" validate:"oneof=common combined json"` + Filters Filters `json:"filters"` + Fields Fields `json:"fields"` +} +``` + +Configuration for request/response logging. + +#### ACLLoggerConfig + +```go +type ACLLoggerConfig struct { + ConfigBase + LogAllowed bool `json:"log_allowed"` +} +``` + +Configuration for ACL event logging. 
+ +#### ConfigBase + +```go +type ConfigBase struct { + B int `json:"buffer_size"` // Deprecated: buffer size is adjusted dynamically + Path string `json:"path"` + Stdout bool `json:"stdout"` + Retention *Retention `json:"retention" aliases:"keep"` + RotateInterval time.Duration `json:"rotate_interval,omitempty" swaggertype:"primitive,integer"` +} +``` + +Common configuration for all loggers. + +#### Filters + +```go +type Filters struct { + StatusCodes LogFilter[*StatusCodeRange] `json:"status_codes"` + Method LogFilter[HTTPMethod] `json:"method"` + Host LogFilter[Host] `json:"host"` + Headers LogFilter[*HTTPHeader] `json:"headers"` + CIDR LogFilter[*CIDR] `json:"cidr"` +} +``` + +Filtering rules for what to log. + +#### Fields + +```go +type Fields struct { + Headers FieldConfig `json:"headers" aliases:"header"` + Query FieldConfig `json:"query" aliases:"queries"` + Cookies FieldConfig `json:"cookies" aliases:"cookie"` +} +``` + +Field configuration for what data to include. + +### Exported Functions + +#### Constructor + +```go +func NewAccessLogger(parent task.Parent, cfg AnyConfig) (AccessLogger, error) +func NewMockAccessLogger(parent task.Parent, cfg *RequestLoggerConfig) AccessLogger +func NewAccessLoggerWithIO(parent task.Parent, writer Writer, anyCfg AnyConfig) AccessLogger +``` + +Create access loggers from configurations. + +#### Default Configurations + +```go +func DefaultRequestLoggerConfig() *RequestLoggerConfig +func DefaultACLLoggerConfig() *ACLLoggerConfig +``` + +Returns default configurations. 
+ +## Architecture + +### Core Components + +```mermaid +graph TD + subgraph Request Flow + Req[HTTP Request] -->|Passed to| Log[AccessLogger.Log] + Res[HTTP Response] -->|Passed to| Log + Log -->|Formats| Fmt[RequestFormatter] + Fmt -->|Writes to| Writer[BufferedWriter] + Writer -->|Outputs to| Output[File/Stdout] + end + + subgraph Background Tasks + Rotator[Rotation Task] -->|Triggers| Rotate[ShouldRotate] + Adjuster[Buffer Adjuster] -->|Adjusts| Buffer[Buffer Size] + end +``` + +| Component | Responsibility | +| ------------------ | ------------------------------------ | +| `AccessLogger` | Main logging interface | +| `RequestFormatter` | Formats request/response logs | +| `ACLFormatter` | Formats ACL decision logs | +| `Writer` | Output destination (file/stdout) | +| `BufferedWriter` | Efficient I/O with dynamic buffering | + +### Log Flow + +```mermaid +sequenceDiagram + participant Request + participant AccessLogger + participant Formatter + participant BufferedWriter + participant File + + Request->>AccessLogger: Log(req, res) + AccessLogger->>AccessLogger: shouldLog() filter check + alt Passes filters + AccessLogger->>Formatter: AppendRequestLog(line, req, res) + Formatter->>AccessLogger: Formatted line + AccessLogger->>BufferedWriter: Write(line) + BufferedWriter->>BufferedWriter: Buffer if needed + BufferedWriter->>File: Flush when full/rotating + else Fails filters + AccessLogger->>Request: Skip logging + end +``` + +### Buffer Management + +The logger dynamically adjusts buffer size based on write throughput: + +| Parameter | Value | +| ------------------- | --------- | +| Initial Buffer Size | 4 KB | +| Maximum Buffer Size | 8 MB | +| Adjustment Interval | 5 seconds | + +Buffer size adjustment formula: + +```go +newBufSize = origBufSize +/- step +step = max(|wps - origBufSize|/2, wps/2) +``` + +### Rotation Logic + +```mermaid +stateDiagram-v2 + [*] --> Logging + Logging --> Logging: Normal writes + Logging --> Rotating: Interval reached + Rotating 
--> Logging: New file created + Rotating --> [*]: Logger closed +``` + +Rotation checks: + +1. Is rotation enabled (supportRotate + valid retention)? +1. Is retention period valid? +1. Create new file with timestamp suffix +1. Delete old files beyond retention + +## Log Formats + +### Common Format + +``` +127.0.0.1 - - [10/Jan/2024:12:00:00 +0000] "GET /api HTTP/1.1" 200 1234 +``` + +### Combined Format + +``` +127.0.0.1 - - [10/Jan/2024:12:00:00 +0000] "GET /api HTTP/1.1" 200 1234 "https://example.com" "Mozilla/5.0" +``` + +### JSON Format + +```json +{ + "level": "info", + "time": "10/Jan/2024:12:00:00 +0000", + "ip": "127.0.0.1", + "method": "GET", + "scheme": "http", + "host": "example.com", + "path": "/api", + "protocol": "HTTP/1.1", + "status": 200, + "type": "application/json", + "size": 1234, + "referer": "https://example.com", + "useragent": "Mozilla/5.0" +} +``` + +## Configuration Surface + +### YAML Configuration + +```yaml +access_log: + path: /var/log/godoxy/access.log + stdout: false + rotate_interval: 1h + retention: + days: 30 + format: combined + filters: + status_codes: + keep: + - min: 200 + max: 599 + method: + keep: + - GET + - POST + headers: + keep: + - name: Authorization +``` + +### Configuration Fields + +| Field | Type | Default | Description | +| ---------------------- | -------- | -------- | ------------------- | +| `path` | string | - | Log file path | +| `stdout` | bool | false | Also log to stdout | +| `rotate_interval` | duration | 1h | Rotation interval | +| `retention.days` | int | 30 | Days to retain logs | +| `format` | string | combined | Log format | +| `filters.status_codes` | range[] | all | Status code filter | +| `filters.method` | string[] | all | HTTP method filter | +| `filters.cidr` | CIDR[] | none | IP range filter | + +### Reloading + +Configuration is fixed at construction time. Create a new logger to apply changes. 
+ +## Dependency and Integration Map + +### Internal Dependencies + +| Package | Purpose | +| ------------------------ | ---------------------------------- | +| `internal/maxmind/types` | IP geolocation for ACL logs | +| `internal/serialization` | Default value factory registration | + +### External Dependencies + +| Dependency | Purpose | +| -------------------------------- | --------------------------- | +| `github.com/rs/zerolog` | JSON formatting and logging | +| `github.com/yusing/goutils/task` | Lifetime management | +| `github.com/puzpuzpuz/xsync/v4` | Concurrent map operations | +| `golang.org/x/time/rate` | Error rate limiting | + +## Observability + +### Logs + +| Level | When | +| ----- | ----------------------------- | +| Debug | Buffer size adjustments | +| Info | Log file rotation | +| Error | Write failures (rate limited) | + +### Metrics + +None exposed directly. Write throughput tracked internally. + +## Security Considerations + +- Log files should have appropriate permissions (644) +- Sensitive headers can be filtered via `Filters.Headers` +- Query parameters and cookies are configurable via `Fields` +- Rate limiting prevents error log flooding + +## Failure Modes and Recovery + +| Failure | Detection | Recovery | +| ----------------------- | ------------------------ | -------------------------------------- | +| Write error | `Write()` returns error | Rate-limited logging, then task finish | +| File deleted while open | Write failure | Logger continues with error | +| Disk full | Write failure | Error logged, may terminate | +| Rotation error | `Rotate()` returns error | Continue with current file | + +### Error Rate Limiting + +```go +const ( + errRateLimit = 200 * time.Millisecond + errBurst = 5 +) +``` + +Errors are rate-limited to prevent log flooding. After burst exceeded, task is finished. 
+
+## Usage Examples
+
+### Basic Request Logger
+
+```go
+import "github.com/yusing/godoxy/internal/logging/accesslog"
+
+cfg := accesslog.DefaultRequestLoggerConfig()
+cfg.Path = "/var/log/godoxy/access.log"
+cfg.RotateInterval = time.Hour
+cfg.Retention = &accesslog.Retention{Days: 30}
+
+logger, err := accesslog.NewAccessLogger(parent, cfg)
+if err != nil {
+    log.Fatal(err)
+}
+defer logger.Close()
+
+// Log a request
+logger.Log(req, res)
+```
+
+### JSON Format with Filters
+
+```go
+cfg := accesslog.RequestLoggerConfig{
+    ConfigBase: accesslog.ConfigBase{
+        Path:      "/var/log/godoxy/requests.json.log",
+        Retention: &accesslog.Retention{Days: 7},
+    },
+    Format: accesslog.FormatJSON,
+    Filters: accesslog.Filters{
+        StatusCodes: accesslog.LogFilter[*accesslog.StatusCodeRange]{
+            Keep: []accesslog.StatusCodeRange{{Min: 400, Max: 599}},
+        },
+    },
+}
+
+logger, err := accesslog.NewAccessLogger(parent, &cfg)
+if err != nil {
+    log.Fatal(err)
+}
+```
+
+### ACL Logger
+
+```go
+aclCfg := accesslog.DefaultACLLoggerConfig()
+aclCfg.Path = "/var/log/godoxy/acl.log"
+aclCfg.LogAllowed = false // Only log blocked requests
+
+aclLogger, err := accesslog.NewAccessLogger(parent, aclCfg)
+if err != nil {
+    log.Fatal(err)
+}
+
+// Log ACL decision
+aclLogger.LogACL(ipInfo, true)  // blocked
+aclLogger.LogACL(ipInfo, false) // allowed (if LogAllowed is true)
+```
+
+### Custom Writer
+
+```go
+type customWriter struct {
+    *os.File
+}
+
+func (w *customWriter) ShouldBeBuffered() bool { return true }
+func (w *customWriter) Name() string           { return "custom" }
+
+writer := &customWriter{File: myFile}
+logger := accesslog.NewAccessLoggerWithIO(parent, writer, cfg)
+```
+
+### Integration with Route Middleware
+
+```go
+func accessLogMiddleware(logger accesslog.AccessLogger) gin.HandlerFunc {
+    return func(c *gin.Context) {
+        c.Next()
+        // gin exposes no *http.Response; rebuild one from the writer state
+        logger.Log(c.Request, &http.Response{
+            StatusCode:    c.Writer.Status(),
+            ContentLength: int64(c.Writer.Size()),
+            Header:        c.Writer.Header(),
+        })
+    }
+}
+```
+
+## Performance Characteristics
+
+- Buffered writes reduce I/O operations
+- Dynamic buffer sizing adapts to throughput
+- Per-writer locks
allow parallel writes to different files +- Byte pools reduce GC pressure +- Efficient log rotation with back scanning + +## Testing Notes + +- `NewMockAccessLogger` for testing without file I/O +- Mock file implementation via `NewMockFile` +- Filter tests verify predicate logic +- Rotation tests verify retention cleanup + +## Related Packages + +- `internal/route` - Route middleware integration +- `internal/acl` - ACL decision logging +- `internal/maxmind` - IP geolocation for ACL logs diff --git a/internal/logging/memlogger/README.md b/internal/logging/memlogger/README.md new file mode 100644 index 00000000..980b6bd1 --- /dev/null +++ b/internal/logging/memlogger/README.md @@ -0,0 +1,330 @@ +# In-Memory Logger + +Provides a thread-safe in-memory circular buffer logger with WebSocket-based real-time streaming for log data. + +## Overview + +The memlogger package implements a thread-safe in-memory log buffer with WebSocket streaming capabilities. It stores log data in memory and pushes new entries to connected WebSocket clients and event subscribers. + +### Primary Consumers + +- `internal/api/v1/cert/renew` - Provides WebSocket endpoint for certificate renewal logs +- Diagnostic and debugging interfaces + +### Non-goals + +- Does not persist logs to disk +- Does not provide log rotation or retention policies +- Does not support structured/log levels +- Does not provide authentication for WebSocket connections + +### Stability + +Internal package. Public interfaces are stable. + +## Public API + +### Exported Types + +#### MemLogger Interface + +```go +type MemLogger io.Writer +``` + +The `MemLogger` is an `io.Writer` interface. Any data written to it is stored in the circular buffer and broadcast to subscribers. + +### Exported Functions + +#### GetMemLogger + +```go +func GetMemLogger() MemLogger +``` + +Returns the global singleton `MemLogger` instance. 
+ +**Example:** + +```go +logger := memlogger.GetMemLogger() +logger.Write([]byte("log message")) +``` + +#### HandlerFunc + +```go +func HandlerFunc() gin.HandlerFunc +``` + +Returns a Gin middleware handler that upgrades HTTP connections to WebSocket and streams log data. + +**Example:** + +```go +router.GET("/logs/ws", memlogger.HandlerFunc()) +``` + +#### Events + +```go +func Events() (<-chan []byte, func()) +``` + +Returns a channel for receiving log events and a cancel function to unsubscribe. + +**Returns:** + +- `<-chan []byte` - Channel receiving log entry slices +- `func()` - Cleanup function that unsubscribes and closes the channel + +**Example:** + +```go +ch, cancel := memlogger.Events() +defer cancel() + +for event := range ch { + fmt.Println(string(event)) +} +``` + +## Architecture + +### Core Components + +```mermaid +flowchart LR + subgraph In-Memory Buffer + LB[bytes.Buffer] -->|Stores| Logs[Log Entries 16KB cap] + end + + subgraph Notification System + Notify[notifyWS] -->|Notifies| WS[WebSocket Clients] + Notify -->|Notifies| Ch[Event Channels] + end + + subgraph External Clients + HTTP[HTTP Request] -->|Upgrades to| WS + API[Events API] -->|Subscribes to| Ch + end +``` + +| Component | Responsibility | +| -------------- | ------------------------------------------------ | +| `memLogger` | Main struct holding buffer and subscription maps | +| `bytes.Buffer` | Circular buffer for log storage (16KB max) | +| `connChans` | xsync.Map of WebSocket channels | +| `listeners` | xsync.Map of event channels | + +### Write Flow + +```mermaid +sequenceDiagram + participant Writer + participant MemLogger + participant Buffer + participant Subscribers + + Writer->>MemLogger: Write(p) + MemLogger->>Buffer: truncateIfNeeded(n) + Buffer->>Buffer: Truncate to 8KB if needed + Buffer->>Buffer: Write(p) + MemLogger->>MemLogger: writeBuf returns position + MemLogger->>Subscribers: notifyWS(pos, n) + Subscribers->>Subscribers: Send to WebSocket/Listeners +``` + +### 
Buffer Behavior + +The circular buffer has fixed maximum size: + +| Property | Value | +| ------------------ | ---------- | +| Maximum Size | 16 KB | +| Truncate Threshold | 8 KB (50%) | +| Write Chunk Size | 4 KB | +| Write Timeout | 10 seconds | + +**Truncation Logic:** +When the buffer exceeds the maximum size: + +1. The buffer is truncated to 8 KB (half the maximum) +1. Oldest entries are removed first +1. Recent logs are always preserved + +### Thread Safety + +Multiple synchronization mechanisms ensure thread safety: + +| Field | Mutex Type | Purpose | +| ------------ | -------------- | ------------------------------------- | +| `Buffer` | `sync.RWMutex` | Protecting buffer operations | +| `notifyLock` | `sync.RWMutex` | Protecting notification maps | +| `connChans` | `xsync.Map` | Thread-safe WebSocket channel storage | +| `listeners` | `xsync.Map` | Thread-safe event listener storage | + +## Configuration Surface + +No explicit configuration. The singleton instance uses fixed constants: + +```go +const ( + maxMemLogSize = 16 * 1024 // 16KB buffer + truncateSize = maxMemLogSize / 2 // 8KB + initialWriteChunkSize = 4 * 1024 + writeTimeout = 10 * time.Second +) +``` + +## Dependency and Integration Map + +### Internal Dependencies + +| Dependency | Purpose | +| ------------------------------------------ | -------------------- | +| `github.com/yusing/goutils/http/websocket` | WebSocket management | + +### External Dependencies + +| Dependency | Purpose | +| ------------------------------- | ------------------------- | +| `github.com/gin-gonic/gin` | HTTP/WebSocket handling | +| `github.com/puzpuzpuz/xsync/v4` | Concurrent map operations | + +## Observability + +### Logs + +No logging in this package. Errors are returned via WebSocket write failures. + +### Metrics + +None exposed. 
+ +## Failure Modes and Recovery + +| Failure | Detection | Recovery | +| ----------------------- | ------------------------ | ------------------------- | +| WebSocket write timeout | 3-second timer | Skip subscriber, continue | +| Buffer write error | `writeBuf` returns error | Logged but not returned | +| Subscriber channel full | Channel send timeout | Skip subscriber, continue | +| Buffer exceeds max size | `truncateIfNeeded` | Truncate to 8KB | + +### Concurrency Guarantees + +- Multiple goroutines can write concurrently +- Multiple WebSocket connections supported +- Subscriptions can be added/removed during operation +- Buffer truncation is atomic + +## Usage Examples + +### Basic Log Writing + +```go +import "github.com/yusing/godoxy/internal/logging/memlogger" + +logger := memlogger.GetMemLogger() + +// Write a simple message +logger.Write([]byte("Application started\n")) + +// Write formatted logs +logger.Write([]byte(fmt.Sprintf("[INFO] Request received: %s\n", path))) +``` + +### WebSocket Endpoint + +```go +import ( + "github.com/gin-gonic/gin" + "github.com/yusing/godoxy/internal/logging/memlogger" +) + +func setupRouter(r *gin.Engine) { + // Real-time log streaming via WebSocket + r.GET("/api/logs/stream", memlogger.HandlerFunc()) +} +``` + +### Subscribing to Log Events + +```go +func monitorLogs(ctx context.Context) { + ch, cancel := memlogger.Events() + defer cancel() + + for { + select { + case <-ctx.Done(): + return + case event := <-ch: + processLogEvent(event) + } + } +} + +func processLogEvent(event []byte) { + // Handle the log event + fmt.Printf("Log: %s", string(event)) +} +``` + +### WebSocket Client + +```javascript +// Client-side JavaScript +const ws = new WebSocket("ws://localhost:8080/api/logs/stream"); + +ws.onmessage = (event) => { + console.log("New log entry:", event.data); +}; + +ws.onclose = () => { + console.log("Log stream disconnected"); +}; + +ws.onerror = (error) => { + console.error("Log stream error:", error); +}; +``` + 
+
+### Complete Integration
+
+```go
+func setupLogging(r *gin.Engine) memlogger.MemLogger {
+    logger := memlogger.GetMemLogger()
+
+    // WebSocket endpoint for real-time logs
+    r.GET("/ws/logs", memlogger.HandlerFunc())
+
+    return logger
+}
+
+// Elsewhere in the application
+func recordRequest(logger memlogger.MemLogger, path string, status int) {
+    logger.Write([]byte(fmt.Sprintf("[%s] %s - %d\n",
+        time.Now().Format(time.RFC3339), path, status)))
+}
+```
+
+## Performance Characteristics
+
+- O(1) write operations (amortized)
+- O(n) for truncation where n is buffer size
+- WebSocket notifications are non-blocking (3-second timeout)
+- Memory usage is bounded at 16KB
+
+## Testing Notes
+
+- Mock by providing a custom `io.Writer` implementation
+- Test concurrent writes with goroutines
+- Verify truncation behavior
+- Test WebSocket upgrade failures
+
+## Related Packages
+
+- `internal/api` - HTTP API endpoints
+- `github.com/gin-gonic/gin` - HTTP framework
+- `github.com/yusing/goutils/http/websocket` - WebSocket utilities
diff --git a/internal/maxmind/README.md b/internal/maxmind/README.md
new file mode 100644
index 00000000..2791e799
--- /dev/null
+++ b/internal/maxmind/README.md
@@ -0,0 +1,337 @@
+# MaxMind
+
+The maxmind package provides MaxMind GeoIP database integration for IP geolocation, including automatic database downloading and updates.
+
+## Overview
+
+The maxmind package implements MaxMind GeoIP database management, providing IP geolocation lookups for country and city information. It supports automatic database downloading, scheduled updates, and thread-safe access.
+ +### Key Features + +- MaxMind GeoIP database loading +- Automatic database downloading from MaxMind +- Scheduled updates every 24 hours +- City lookup with cache support +- IP geolocation (country, city, timezone) +- Thread-safe access + +## Architecture + +```mermaid +graph TD + A[MaxMind Config] --> B[Load Database] + B --> C{Exists?} + C -->|No| D[Download] + C -->|Yes| E[Load] + D --> F[Extract from TarGz] + E --> G[Open Reader] + + H[IP Lookup] --> I[City Lookup] + I --> J{Cache Hit?} + J -->|Yes| K[Return Cached] + J -->|No| L[Database Query] + L --> M[Cache Result] + M --> K + + N[Update Scheduler] --> O[Check Daily] + O --> P{Different?} + P -->|Yes| Q[Download Update] + P -->|No| O +``` + +## Core Components + +### MaxMind Structure + +```go +type MaxMind struct { + *Config + lastUpdate time.Time + db struct { + *maxminddb.Reader + sync.RWMutex + } +} +``` + +### Configuration + +```go +type Config struct { + Database string // Database type (GeoLite2 or GeoIP2) + AccountID int + LicenseKey Secret +} +``` + +### IP Information + +```go +type IPInfo struct { + IP net.IP + Str string + Country *Country + City *City + Location *Location +} + +type Country struct { + IsoCode string + Name string +} + +type City struct { + Country *Country + Name string + Location *Location +} + +type Location struct { + TimeZone string + Latitude float64 + Longitude float64 +} +``` + +## Public API + +### Initialization + +```go +// LoadMaxMindDB loads or downloads the MaxMind database. +func (cfg *MaxMind) LoadMaxMindDB(parent task.Parent) gperr.Error +``` + +### Lookup + +```go +// LookupCity looks up city information for an IP. 
+func LookupCity(info *IPInfo) (city *City, ok bool) +``` + +## Usage + +### Basic Setup + +```go +maxmindCfg := &maxmind.Config{ + Database: maxmind.MaxMindGeoLite, + AccountID: 123456, + LicenseKey: "your-license-key", +} + +err := maxmindCfg.LoadMaxMindDB(parent) +if err != nil { + log.Fatal(err) +} +``` + +### IP Lookup + +```go +// Create IP info +ipInfo := &maxmind.IPInfo{ + IP: net.ParseIP("8.8.8.8"), + Str: "8.8.8.8", +} + +// Lookup city +city, ok := maxmind.LookupCity(ipInfo) +if ok { + fmt.Printf("Country: %s\n", city.Country.IsoCode) + fmt.Printf("City: %s\n", city.Name) + fmt.Printf("Timezone: %s\n", city.Location.TimeZone) +} +``` + +### Database Types + +```go +const ( + MaxMindGeoLite = "GeoLite2-Country" + MaxMindGeoIP2 = "GeoIP2-Country" +) +``` + +## Data Flow + +```mermaid +sequenceDiagram + participant Config + participant MaxMind + participant Database + participant Cache + participant UpdateScheduler + + Config->>MaxMind: LoadMaxMindDB() + MaxMind->>Database: Open() + alt Database Missing + MaxMind->>MaxMind: Download() + MaxMind->>Database: Extract & Create + end + Database-->>MaxMind: Reader + + Note over MaxMind: Start Update Scheduler + + loop Every 24 Hours + UpdateScheduler->>MaxMind: Check Update + MaxMind->>MaxMind: Check Last-Modified + alt Update Available + MaxMind->>MaxMind: Download + MaxMind->>Database: Replace + end + end + + participant Lookup + Lookup->>MaxMind: LookupCity(ip) + MaxMind->>Cache: Check + alt Cache Hit + Cache-->>Lookup: City Info + else Cache Miss + MaxMind->>Database: Query + Database-->>MaxMind: City Info + MaxMind->>Cache: Store + MaxMind-->>Lookup: City Info + end +``` + +## Database Download + +### Download Process + +```go +func (cfg *MaxMind) download() error { + resp, err := cfg.doReq(http.MethodGet) + if err != nil { + return err + } + + // Read response + databaseGZ, err := io.ReadAll(resp.Body) + if err != nil { + return err + } + + // Extract from tar.gz + err = extractFileFromTarGz(databaseGZ, 
cfg.dbFilename(), tmpDBPath) + if err != nil { + return err + } + + // Validate + db, err := maxmindDBOpen(tmpDBPath) + if err != nil { + os.Remove(tmpDBPath) + return err + } + db.Close() + + // Rename to final location + os.Rename(tmpDBPath, dbFile) + return nil +} +``` + +### Security Checks + +The download process includes tar bomb protection: + +```go +sumSize := int64(0) +for { + hdr, err := tr.Next() + if err == io.EOF { + break + } + sumSize += hdr.Size + if sumSize > 30*1024*1024 { + return errors.New("file size exceeds 30MB") + } +} +``` + +## Update Scheduling + +```go +func (cfg *MaxMind) scheduleUpdate(parent task.Parent) { + task := parent.Subtask("maxmind_schedule_update", true) + ticker := time.NewTicker(updateInterval) // 24 hours + + cfg.loadLastUpdate() + cfg.update() + + for { + select { + case <-task.Context().Done(): + return + case <-ticker.C: + cfg.update() + } + } +} +``` + +## Thread Safety + +The database uses a read-write mutex: + +```go +type MaxMind struct { + *Config + db struct { + *maxminddb.Reader + sync.RWMutex + } +} + +// Lookups use RLock +func (cfg *MaxMind) lookup(ip net.IP) (*maxminddb.City, error) { + cfg.db.RLock() + defer cfg.db.RUnlock() + return cfg.db.Lookup(ip) +} +``` + +## Configuration + +### Environment Variables + +| Variable | Description | +| --------------------- | ------------------- | +| `MAXMIND_ACCOUNT_ID` | MaxMind account ID | +| `MAXMIND_LICENSE_KEY` | MaxMind license key | + +### YAML Configuration + +```yaml +providers: + maxmind: + database: geolite2 + account_id: 123456 + license_key: your-license-key +``` + +## Integration Points + +The maxmind package integrates with: + +- **ACL**: IP-based access control (country/timezone matching) +- **Config**: Configuration management +- **Logging**: Update notifications +- **City Cache**: IP geolocation caching + +## Error Handling + +```go +var ( + ErrResponseNotOK = gperr.New("response not OK") + ErrDownloadFailure = gperr.New("download failure") +) +``` + 
+## Performance Considerations + +- 24-hour update interval reduces unnecessary downloads +- Database size ~10-30MB +- City lookup cache reduces database queries +- RLock for concurrent reads diff --git a/internal/metrics/README.md b/internal/metrics/README.md index a6ab5aa6..60b3a740 100644 --- a/internal/metrics/README.md +++ b/internal/metrics/README.md @@ -1,285 +1,118 @@ # Metrics Package -System monitoring and metrics collection for GoDoxy. +System monitoring and metrics collection for GoDoxy with time-series storage and REST/WebSocket APIs. ## Overview -This package provides a unified metrics collection system that polls system and route data at regular intervals, stores historical data across multiple time periods, and exposes both REST and WebSocket APIs for consumption. +This package provides a unified metrics collection system that: + +- Polls system and route data at regular intervals +- Stores historical data across multiple time periods +- Exposes both REST and WebSocket APIs for consumption + +### Primary Consumers + +- `internal/api/v1/metrics/` - REST API endpoints +- WebUI - Real-time charts +- `internal/metrics/uptime/` - Route health monitoring + +### Non-goals + +- Metric aggregation from external sources +- Alerting (handled by `internal/notif/`) +- Long-term storage (30-day retention only) + +### Stability + +Internal package. See `internal/metrics/period/README.md` for the core framework documentation. + +## Packages + +### `period/` + +Generic time-bucketed metrics storage framework with: + +- `Period[T]` - Multi-timeframe container +- `Poller[T, A]` - Background data collector +- `Entries[T]` - Circular buffer for time-series data + +See [period/README.md](./period/README.md) for full documentation. + +### `uptime/` + +Route health status monitoring using the period framework. + +### `systeminfo/` + +System metrics collection (CPU, memory, disk, network, sensors) using the period framework. 
## Architecture ```mermaid graph TB - subgraph "Core Framework" - P[Period Generic] - E[Entries Ring Buffer] - PL[Poller Orchestrator] - end - subgraph "Data Sources" SI[SystemInfo Poller] UP[Uptime Poller] end - subgraph "Utilities" - UT[Utils] + subgraph "Period Framework" + P[Period Generic] + E[Entries Ring Buffer] + PL[Poller Orchestrator] + H[Handler HTTP API] + end + + subgraph "Storage" + JSON[(data/metrics/*.json)] end P --> E PL --> P PL --> SI PL --> UP - UT -.-> PL - UT -.-> SI - UT -.-> UP + H --> PL + PL --> JSON ``` -## Directory Structure +## Configuration Surface -``` -internal/metrics/ -├── period/ # Core polling and storage framework -│ ├── period.go # Period[T] - multi-timeframe container -│ ├── entries.go # Entries[T] - ring buffer implementation -│ ├── poller.go # Poller[T, A] - orchestration and lifecycle -│ └── handler.go # HTTP handler for data access -├── systeminfo/ # System metrics (CPU, memory, disk, network, sensors) -├── uptime/ # Route health and uptime monitoring -└── utils/ # Shared utilities (query parsing, pagination) -``` +No explicit configuration. Pollers respect `common.MetricsDisable*` flags: -## Core Components +| Flag | Disables | +| ----------------------- | ------------------------- | +| `MetricsDisableCPU` | CPU percentage collection | +| `MetricsDisableMemory` | Memory statistics | +| `MetricsDisableDisk` | Disk usage and I/O | +| `MetricsDisableNetwork` | Network counters | +| `MetricsDisableSensors` | Temperature sensors | -### 1. Period[T] (`period/period.go`) +## Dependency and Integration Map -A generic container that manages multiple time periods for the same data type. 
+### Internal Dependencies -```go -type Period[T any] struct { - Entries map[Filter]*Entries[T] // 5m, 15m, 1h, 1d, 1mo - mu sync.RWMutex -} -``` +- `github.com/yusing/goutils/task` - Lifetime management +- `internal/types` - Health check types -**Time Periods:** +### External Dependencies -| Filter | Duration | Entries | Interval | -| ------ | -------- | ------- | -------- | -| `5m` | 5 min | 100 | 3s | -| `15m` | 15 min | 100 | 9s | -| `1h` | 1 hour | 100 | 36s | -| `1d` | 1 day | 100 | 14.4m | -| `1mo` | 30 days | 100 | 7.2h | +- `github.com/shirou/gopsutil/v4` - System metrics collection +- `github.com/puzpuzpuz/xsync/v4` - Atomic value storage +- `github.com/bytedance/sonic` - JSON serialization -### 2. Entries[T] (`period/entries.go`) +## Observability -A fixed-size ring buffer (100 entries) with time-aware sampling. +### Logs -```go -type Entries[T any] struct { - entries [100]T // Fixed-size array - index int // Current position - count int // Number of entries - interval time.Duration // Sampling interval - lastAdd time.Time // Last write timestamp -} -``` +| Level | When | +| ------- | ------------------------------------------- | +| `Debug` | Poller start, data load/save | +| `Error` | Data source failures (aggregated every 30s) | -**Features:** +## Failure Modes and Recovery -- Circular buffer for efficient memory usage -- Rate-limited adds (respects configured interval) -- JSON serialization/deserialization with temporal spacing - -### 3. Poller[T, A] (`period/poller.go`) - -The orchestrator that ties together polling, storage, and HTTP serving. 
- -```go -type Poller[T any, A any] struct { - name string - poll PollFunc[T] // Data collection - aggregate AggregateFunc[T, A] // Data aggregation - resultFilter FilterFunc[T] // Query filtering - period *Period[T] // Data storage - lastResult synk.Value[T] // Latest snapshot -} -``` - -**Poll Cycle (1 second interval):** - -```mermaid -sequenceDiagram - participant T as Task - participant P as Poller - participant D as Data Source - participant S as Storage (Period) - participant F as File - - T->>P: Start() - P->>F: Load historical data - F-->>P: Period[T] state - - loop Every 1 second - P->>D: Poll(ctx, lastResult) - D-->>P: New data point - P->>S: Add to all periods - P->>P: Update lastResult - - alt Every 30 seconds - P->>P: Gather & log errors - end - - alt Every 5 minutes - P->>F: Persist to JSON - end - end -``` - -### 4. HTTP Handler (`period/handler.go`) - -Provides REST and WebSocket endpoints for data access. - -**Endpoints:** - -- `GET /metrics?period=5m&aggregate=cpu_average` - Historical data -- `WS /metrics?period=5m&interval=5s` - Streaming updates - -**Query Parameters:** -| Parameter | Type | Default | Description | -|-----------|------|---------|-------------| -| `period` | Filter | (none) | Time range (5m, 15m, 1h, 1d, 1mo) | -| `aggregate` | string | (varies) | Aggregation mode | -| `interval` | duration | 1s | WebSocket update interval | -| `limit` | int | 0 | Max results (0 = all) | -| `offset` | int | 0 | Pagination offset | -| `keyword` | string | "" | Fuzzy search filter | - -## Implementations - -### SystemInfo Poller - -Collects system metrics using `gopsutil`: - -```go -type SystemInfo struct { - Timestamp int64 - CPUAverage *float64 - Memory mem.VirtualMemoryStat - Disks map[string]disk.UsageStat - DisksIO map[string]*disk.IOCountersStat - Network net.IOCountersStat - Sensors Sensors -} -``` - -**Aggregation Modes:** - -- `cpu_average` - CPU usage percentage -- `memory_usage` - Memory used in bytes -- `memory_usage_percent` - Memory 
usage percentage -- `disks_read_speed` - Disk read speed (bytes/s) -- `disks_write_speed` - Disk write speed (bytes/s) -- `disks_iops` - Disk I/O operations per second -- `disk_usage` - Disk usage in bytes -- `network_speed` - Upload/download speed (bytes/s) -- `network_transfer` - Total bytes transferred -- `sensor_temperature` - Temperature sensor readings - -### Uptime Poller - -Monitors route health and calculates uptime statistics: - -```go -type RouteAggregate struct { - Alias string - DisplayName string - Uptime float32 // Percentage healthy - Downtime float32 // Percentage unhealthy - Idle float32 // Percentage napping/starting - AvgLatency float32 // Average latency in ms - CurrentStatus HealthStatus - Statuses []Status // Historical statuses -} -``` - -## Data Flow - -```mermaid -flowchart TD - A[Data Source] -->|PollFunc| B[Poller] - B -->|Add| C[Period.Entries] - C -->|Ring Buffer| D[(Memory)] - D -->|Every 5min| E[(data/metrics/*.json)] - - B -->|HTTP Request| F[ServeHTTP] - F -->|Filter| G[Get] - G -->|Aggregate| H[Response] - - F -->|WebSocket| I[PeriodicWrite] - I -->|interval| J[Push Updates] -``` - -## Persistence - -Data is persisted to `data/metrics/` as JSON files: - -```json -{ - "entries": { - "5m": { - "entries": [...], - "interval": "3s" - }, - "15m": {...}, - "1h": {...}, - "1d": {...}, - "1mo": {...} - } -} -``` - -**On Load:** - -- Validates and fixes interval mismatches -- Reconstructs temporal spacing for historical entries - -## Thread Safety - -- `Period[T]` uses `sync.RWMutex` for concurrent access -- `Entries[T]` is append-only (safe for single writer) -- `Poller` uses `synk.Value[T]` for atomic last result storage - -## Creating a New Poller - -```go -type MyData struct { - Value int -} - -type MyAggregate struct { - Values []int -} - -var MyPoller = period.NewPoller( - "my_poll_name", - func(ctx context.Context, last *MyData) (*MyData, error) { - // Fetch data - return &MyData{Value: 42}, nil - }, - func(entries []*MyData, query 
url.Values) (int, MyAggregate) { - // Aggregate for API response - return len(entries), MyAggregate{Values: [...]} - }, -) - -func init() { - MyPoller.Start() -} -``` - -## Error Handling - -- Poll errors are aggregated over 30-second windows -- Errors are logged with frequency counts -- Individual sensor warnings (e.g., ENODATA) are suppressed gracefully +| Failure Mode | Impact | Recovery | +| ------------------------- | -------------------- | -------------------------------- | +| Data source timeout | Missing data point | Logged, aggregated, continues | +| Disk read failure | No historical data | Starts fresh, warns | +| Disk write failure | Data loss on restart | Continues, retries next interval | +| Memory allocation failure | OOM risk | Go runtime handles | diff --git a/internal/metrics/period/README.md b/internal/metrics/period/README.md new file mode 100644 index 00000000..c2058808 --- /dev/null +++ b/internal/metrics/period/README.md @@ -0,0 +1,470 @@ +# Period Metrics + +Provides time-bucketed metrics storage with configurable periods, enabling historical data aggregation and real-time streaming. + +## Overview + +The period package implements a generic metrics collection system with time-bucketed storage. It collects data points at regular intervals and stores them in predefined time windows (5m, 15m, 1h, 1d, 1mo) with automatic persistence and HTTP/WebSocket APIs. + +### Primary Consumers + +- `internal/metrics/uptime` - Route health status storage +- `internal/metrics/systeminfo` - System metrics storage +- `internal/api/v1/metrics` - HTTP API endpoints + +### Non-goals + +- Does not provide data visualization +- Does not implement alerting or anomaly detection +- Does not support custom time periods (fixed set only) +- Does not provide data aggregation across multiple instances + +### Stability + +Internal package. Public interfaces are stable. 
+ +## Public API + +### Exported Types + +#### Period[T] Struct + +```go +type Period[T any] struct { + Entries map[Filter]*Entries[T] + mu sync.RWMutex +} +``` + +Container for all time-bucketed entries. Maps each filter to its corresponding `Entries`. + +**Methods:** + +- `Add(info T)` - Adds a data point to all periods +- `Get(filter Filter) ([]T, bool)` - Gets entries for a specific period +- `Total() int` - Returns total number of entries across all periods +- `ValidateAndFixIntervals()` - Validates and fixes intervals after loading + +#### Entries[T] Struct + +```go +type Entries[T any] struct { + entries [maxEntries]T + index int + count int + interval time.Duration + lastAdd time.Time +} +``` + +Circular buffer holding up to 100 entries for a single time period. + +**Methods:** + +- `Add(now time.Time, info T)` - Adds an entry with interval checking +- `Get() []T` - Returns all entries in chronological order + +#### Filter Type + +```go +type Filter string +``` + +Time period filter. + +```go +const ( + MetricsPeriod5m Filter = "5m" + MetricsPeriod15m Filter = "15m" + MetricsPeriod1h Filter = "1h" + MetricsPeriod1d Filter = "1d" + MetricsPeriod1mo Filter = "1mo" +) +``` + +#### Poller[T, A] Struct + +```go +type Poller[T any, A any] struct { + name string + poll PollFunc[T] + aggregate AggregateFunc[T, A] + resultFilter FilterFunc[T] + period *Period[T] + lastResult synk.Value[T] + errs []pollErr +} +``` + +Generic poller that collects data at regular intervals. + +**Type Aliases:** + +```go +type PollFunc[T any] func(ctx context.Context, lastResult T) (T, error) +type AggregateFunc[T any, A any] func(entries []T, query url.Values) (total int, result A) +type FilterFunc[T any] func(entries []T, keyword string) (filtered []T) +``` + +#### ResponseType[AggregateT] + +```go +type ResponseType[AggregateT any] struct { + Total int `json:"total"` + Data AggregateT `json:"data"` +} +``` + +Standard response format for API endpoints. 
+ +### Exported Functions + +#### Period Constructors + +```go +func NewPeriod[T any]() *Period[T] +``` + +Creates a new `Period[T]` with all time buckets initialized. + +#### Poller Constructors + +```go +func NewPoller[T any, A any]( + name string, + poll PollFunc[T], + aggregator AggregateFunc[T, A], +) *Poller[T, A] +``` + +Creates a new poller with the specified name, poll function, and aggregator. + +```go +func (p *Poller[T, A]) WithResultFilter(filter FilterFunc[T]) *Poller[T, A] +``` + +Adds a result filter to the poller for keyword-based filtering. + +#### Poller Methods + +```go +func (p *Poller[T, A]) Get(filter Filter) ([]T, bool) +``` + +Gets entries for a specific time period. + +```go +func (p *Poller[T, A]) GetLastResult() T +``` + +Gets the most recently collected data point. + +```go +func (p *Poller[T, A]) Start() +``` + +Starts the poller. Launches a background goroutine that: + +1. Polls for data at 1-second intervals +1. Stores data in all time buckets +1. Saves data to disk every 5 minutes +1. Reports errors every 30 seconds + +```go +func (p *Poller[T, A]) ServeHTTP(c *gin.Context) +``` + +HTTP handler for data retrieval. 
+ +## Architecture + +### Core Components + +```mermaid +flowchart TD + subgraph Poller + Poll[PollFunc] -->|Collects| Data[Data Point T] + Data -->|Adds to| Period[Period T] + Period -->|Stores in| Buckets[Time Buckets] + end + + subgraph Time Buckets + Bucket5m[5m Bucket] -->|Holds| Entries5m[100 Entries] + Bucket15m[15m Bucket] -->|Holds| Entries15m[100 Entries] + Bucket1h[1h Bucket] -->|Holds| Entries1h[100 Entries] + Bucket1d[1d Bucket] -->|Holds| Entries1d[100 Entries] + Bucket1mo[1mo Bucket] -->|Holds| Entries1mo[100 Entries] + end + + subgraph API + Handler[ServeHTTP] -->|Queries| Period + Period -->|Returns| Aggregate[Aggregated Data] + WebSocket[WebSocket] -->|Streams| Periodic[Periodic Updates] + end + + subgraph Persistence + Save[save] -->|Writes| File[JSON File] + File -->|Loads| Load[load] + end +``` + +### Data Flow + +```mermaid +sequenceDiagram + participant Collector + participant Poller + participant Period + participant Entries as Time Bucket + participant Storage + + Poller->>Poller: Start background goroutine + + loop Every 1 second + Poller->>Collector: poll(ctx, lastResult) + Collector-->>Poller: data, error + Poller->>Period: Add(data) + Period->>Entries: Add(now, data) + Entries->>Entries: Circular buffer write + + Poller->>Poller: Check save interval (every 5min) + alt Save interval reached + Poller->>Storage: Save to JSON + end + + alt Error interval reached (30s) + Poller->>Poller: Gather and log errors + end + end +``` + +### Time Periods + +| Filter | Duration | Interval | Max Entries | +| ------ | ---------- | ------------ | ----------- | +| `5m` | 5 minutes | 3 seconds | 100 | +| `15m` | 15 minutes | 9 seconds | 100 | +| `1h` | 1 hour | 36 seconds | 100 | +| `1d` | 1 day | 14.4 minutes | 100 | +| `1mo` | 30 days | 7.2 hours | 100 | + +### Circular Buffer Behavior + +```mermaid +stateDiagram-v2 + [*] --> Empty: NewEntries() + Empty --> Filling: Add(entry 1) + Filling --> Filling: Add(entry 2..N) + Filling --> Full: count == 
maxEntries + Full --> Overwrite: Add(new entry) + Overwrite --> Overwrite: index = (index + 1) % max +``` + +When full, new entries overwrite oldest entries (FIFO). + +## Configuration Surface + +### Poller Configuration + +| Parameter | Type | Default | Description | +| -------------------- | ------------- | -------------- | -------------------------- | +| `PollInterval` | time.Duration | 1s | How often to poll for data | +| `saveInterval` | time.Duration | 5m | How often to save to disk | +| `gatherErrsInterval` | time.Duration | 30s | Error aggregation interval | +| `saveBaseDir` | string | `data/metrics` | Persistence directory | + +### HTTP Query Parameters + +| Parameter | Description | +| ------------------ | ----------------------------------- | +| `period` | Time filter (5m, 15m, 1h, 1d, 1mo) | +| `aggregate` | Aggregation mode (package-specific) | +| `interval` | WebSocket update interval | +| `limit` / `offset` | Pagination parameters | + +## Dependency and Integration Map + +### Internal Dependencies + +None. + +### External Dependencies + +| Dependency | Purpose | +| ------------------------------------------ | ------------------------ | +| `github.com/gin-gonic/gin` | HTTP handling | +| `github.com/yusing/goutils/http/websocket` | WebSocket streaming | +| `github.com/bytedance/sonic` | JSON serialization | +| `github.com/yusing/goutils/task` | Lifetime management | +| `github.com/puzpuzpuz/xsync/v4` | Concurrent value storage | + +### Integration Points + +- Poll function collects data from external sources +- Aggregate function transforms data for visualization +- Filter function enables keyword-based filtering +- HTTP handler provides REST/WebSocket endpoints + +## Observability + +### Logs + +| Level | When | +| ----- | ------------------------------------- | +| Debug | Poller start/stop, buffer adjustments | +| Error | Load/save failures | +| Info | Data loaded from disk | + +### Metrics + +None exposed directly. 
Poll errors are accumulated and logged periodically. + +## Security Considerations + +- HTTP endpoint should be protected via authentication +- Data files contain potentially sensitive metrics +- No input validation beyond basic query parsing +- WebSocket connections have configurable intervals + +## Failure Modes and Recovery + +| Failure | Detection | Recovery | +| -------------------- | ---------------------- | ----------------------------------- | +| Poll function error | `poll()` returns error | Error accumulated, logged every 30s | +| JSON load failure | `os.ReadFile` error | Continue with empty period | +| JSON save failure | `Encode` error | Error accumulated, logged | +| Context cancellation | `<-ctx.Done()` | Goroutine exits, final save | +| Disk full | Write error | Error logged, continue | + +### Persistence Behavior + +1. On startup, attempts to load existing data from `data/metrics/{name}.json` +1. If file doesn't exist, starts with empty data +1. On load, validates and fixes intervals +1. Saves every 5 minutes during operation +1. 
Final save on goroutine exit

## Usage Examples

### Defining a Custom Poller

```go
import (
	"context"
	"net/url"
	"time"

	"github.com/yusing/godoxy/internal/metrics/period"
)

type CustomMetric struct {
	Timestamp int64   `json:"timestamp"`
	Value     float64 `json:"value"`
	Name      string  `json:"name"`
}

// Aggregated is the chart-friendly output shape of this poller.
type Aggregated []map[string]any

func pollCustomMetric(ctx context.Context, last CustomMetric) (CustomMetric, error) {
	return CustomMetric{
		Timestamp: time.Now().Unix(),
		Value:     readSensorValue(), // readSensorValue is your own data source
		Name:      "sensor_1",
	}, nil
}

func aggregateCustomMetric(entries []CustomMetric, query url.Values) (int, Aggregated) {
	aggregated := make(Aggregated, 0, len(entries))
	for _, e := range entries {
		aggregated = append(aggregated, map[string]any{
			"timestamp": e.Timestamp,
			"value":     e.Value,
		})
	}
	return len(aggregated), aggregated
}

var CustomPoller = period.NewPoller("custom", pollCustomMetric, aggregateCustomMetric)
```

### Starting the Poller

```go
// In your main initialization
CustomPoller.Start()
```

### Accessing Data

```go
// Get all entries from the last hour
entries, ok := CustomPoller.Get(period.MetricsPeriod1h)
if ok {
	for _, entry := range entries {
		fmt.Printf("Value: %.2f at %d\n", entry.Value, entry.Timestamp)
	}
}

// Get the most recent value
latest := CustomPoller.GetLastResult()
```

### HTTP Integration

```go
import "github.com/gin-gonic/gin"

func setupMetricsAPI(r *gin.Engine) {
	r.GET("/api/metrics/custom", CustomPoller.ServeHTTP)
}
```

**API Examples:**

```bash
# Get last collected data
curl http://localhost:8080/api/metrics/custom

# Get 1-hour history
curl "http://localhost:8080/api/metrics/custom?period=1h"

# Get 1-day history with aggregation
curl "http://localhost:8080/api/metrics/custom?period=1d&aggregate=cpu_average"
```

### WebSocket Integration

```go
import "github.com/gorilla/websocket" // assumed WebSocket client library

// WebSocket connections automatically receive updates
// at the specified interval
ws, _, _ := websocket.DefaultDialer.Dial("ws://localhost/api/metrics/custom?interval=5s", nil)

for {
	_, msg, err := ws.ReadMessage()
	if err != nil {
		break
	}
	// Process the update
	fmt.Println(string(msg))
}
```

### Data Persistence Format

```json
{
  "entries": {
    "5m": {
      "entries": [...],
      "interval": 3000000000
    },
    "15m": {...},
"1h": {...}, + "1d": {...}, + "1mo": {...} + } +} +``` + +## Performance Characteristics + +- O(1) add to circular buffer +- O(1) get (returns slice view) +- O(n) serialization where n = total entries +- Memory: O(5 * 100 * sizeof(T)) = fixed overhead +- JSON load/save: O(n) where n = total entries + +## Testing Notes + +- Test circular buffer overflow behavior +- Test interval validation after load +- Test aggregation with various query parameters +- Test concurrent access to period +- Test error accumulation and reporting + +## Related Packages + +- `internal/metrics/uptime` - Uses period for health status +- `internal/metrics/systeminfo` - Uses period for system metrics diff --git a/internal/metrics/systeminfo/README.md b/internal/metrics/systeminfo/README.md new file mode 100644 index 00000000..7573d668 --- /dev/null +++ b/internal/metrics/systeminfo/README.md @@ -0,0 +1,439 @@ +# System Info + +Collects and aggregates system metrics including CPU, memory, disk, network, and sensor data with configurable aggregation modes. + +## Overview + +The systeminfo package a custom fork of the [gopsutil](https://github.com/shirou/gopsutil) library to collect system metrics and integrates with the `period` package for time-bucketed storage. It supports collecting CPU, memory, disk, network, and sensor data with configurable collection intervals and aggregation modes for visualization. + +### Primary Consumers + +- `internal/api/v1/metrics` - HTTP endpoint for system metrics +- `internal/homepage` - Dashboard system monitoring widgets +- Monitoring and alerting systems + +### Non-goals + +- Does not provide alerting on metric thresholds +- Does not persist metrics beyond the period package retention +- Does not provide data aggregation across multiple instances +- Does not support custom metric collectors + +### Stability + +Internal package. Data format and API are stable. 
+ +## Public API + +### Exported Types + +#### SystemInfo Struct + +```go +type SystemInfo struct { + Timestamp int64 `json:"timestamp"` + CPUAverage *float64 `json:"cpu_average"` + Memory mem.VirtualMemoryStat `json:"memory"` + Disks map[string]disk.UsageStat `json:"disks"` + DisksIO map[string]*disk.IOCountersStat `json:"disks_io"` + Network net.IOCountersStat `json:"network"` + Sensors Sensors `json:"sensors"` +} +``` + +Container for all system metrics at a point in time. + +**Fields:** + +- `Timestamp` - Unix timestamp of collection +- `CPUAverage` - Average CPU usage percentage (0-100) +- `Memory` - Virtual memory statistics (used, total, percent, etc.) +- `Disks` - Disk usage by partition mountpoint +- `DisksIO` - Disk I/O counters by device name +- `Network` - Network I/O counters for primary interface +- `Sensors` - Hardware temperature sensor readings + +#### Sensors Type + +```go +type Sensors []sensors.TemperatureStat +``` + +Slice of temperature sensor readings. + +#### Aggregated Type + +```go +type Aggregated []map[string]any +``` + +Aggregated data suitable for charting libraries like Recharts. Each entry is a map with timestamp and values. 
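To make the `Aggregated` shape concrete, here is an illustrative sketch of building `cpu_average` entries from raw samples (simplified; the package's real `aggregate` function handles all modes):

```go
package main

import "fmt"

// Aggregated mirrors the package's chart-friendly shape:
// one map per sample point.
type Aggregated []map[string]any

// sample is a simplified stand-in for SystemInfo.
type sample struct {
	Timestamp  int64
	CPUAverage *float64
}

// aggregateCPU turns raw samples into Aggregated entries,
// skipping samples that carry no CPU data.
func aggregateCPU(entries []sample) Aggregated {
	out := make(Aggregated, 0, len(entries))
	for _, e := range entries {
		if e.CPUAverage == nil {
			continue
		}
		out = append(out, map[string]any{
			"timestamp":   e.Timestamp,
			"cpu_average": *e.CPUAverage,
		})
	}
	return out
}

func main() {
	v := 45.5
	agg := aggregateCPU([]sample{
		{Timestamp: 1704892800, CPUAverage: &v},
		{Timestamp: 1704892810}, // no CPU reading: skipped
	})
	fmt.Println(len(agg), agg[0]["cpu_average"]) // → 1 45.5
}
```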
+ +#### SystemInfoAggregateMode Type + +```go +type SystemInfoAggregateMode string +``` + +Aggregation mode constants: + +```go +const ( + SystemInfoAggregateModeCPUAverage SystemInfoAggregateMode = "cpu_average" + SystemInfoAggregateModeMemoryUsage SystemInfoAggregateMode = "memory_usage" + SystemInfoAggregateModeMemoryUsagePercent SystemInfoAggregateMode = "memory_usage_percent" + SystemInfoAggregateModeDisksReadSpeed SystemInfoAggregateMode = "disks_read_speed" + SystemInfoAggregateModeDisksWriteSpeed SystemInfoAggregateMode = "disks_write_speed" + SystemInfoAggregateModeDisksIOPS SystemInfoAggregateMode = "disks_iops" + SystemInfoAggregateModeDiskUsage SystemInfoAggregateMode = "disk_usage" + SystemInfoAggregateModeNetworkSpeed SystemInfoAggregateMode = "network_speed" + SystemInfoAggregateModeNetworkTransfer SystemInfoAggregateMode = "network_transfer" + SystemInfoAggregateModeSensorTemperature SystemInfoAggregateMode = "sensor_temperature" +) +``` + +### Exported Variables + +#### Poller + +```go +var Poller = period.NewPoller("system_info", getSystemInfo, aggregate) +``` + +Pre-configured poller for system info metrics. Start with `Poller.Start()`. + +### Exported Functions + +#### getSystemInfo + +```go +func getSystemInfo(ctx context.Context, lastResult *SystemInfo) (*SystemInfo, error) +``` + +Collects current system metrics. This is the poll function passed to the period poller. 
+ +**Features:** + +- Concurrent collection of all metric categories +- Handles partial failures gracefully +- Calculates rates based on previous result (for speed metrics) +- Logs warnings for non-critical errors + +**Rate Calculations:** + +- Disk read/write speed: `(currentBytes - lastBytes) / interval` +- Disk IOPS: `(currentCount - lastCount) / interval` +- Network speed: `(currentBytes - lastBytes) / interval` + +#### aggregate + +```go +func aggregate(entries []*SystemInfo, query url.Values) (total int, result Aggregated) +``` + +Aggregates system info entries for a specific mode. Called by the period poller. + +**Query Parameters:** + +- `aggregate` - The aggregation mode (see constants above) + +**Returns:** + +- `total` - Number of aggregated entries +- `result` - Slice of maps suitable for charting + +## Architecture + +### Core Components + +```mermaid +flowchart TD + subgraph Collection + G[gopsutil] -->|CPU| CPU[CPU Percent] + G -->|Memory| Mem[Virtual Memory] + G -->|Disks| Disk[Partitions & IO] + G -->|Network| Net[Network Counters] + G -->|Sensors| Sens[Temperature] + end + + subgraph Poller + Collect[getSystemInfo] -->|Aggregates| Info[SystemInfo] + Info -->|Stores in| Period[Period SystemInfo] + end + + subgraph Aggregation Modes + CPUAvg[cpu_average] + MemUsage[memory_usage] + MemPercent[memory_usage_percent] + DiskRead[disks_read_speed] + DiskWrite[disks_write_speed] + DiskIOPS[disks_iops] + DiskUsage[disk_usage] + NetSpeed[network_speed] + NetTransfer[network_transfer] + SensorTemp[sensor_temperature] + end + + Period -->|Query with| Aggregate[aggregate function] + Aggregate --> CPUAvg + Aggregate --> MemUsage + Aggregate --> DiskRead +``` + +### Data Flow + +```mermaid +sequenceDiagram + participant gopsutil + participant Poller + participant Period + participant API + + Poller->>Poller: Start background goroutine + + loop Every 1 second + Poller->>gopsutil: Collect CPU (500ms timeout) + Poller->>gopsutil: Collect Memory + Poller->>gopsutil: 
Collect Disks (partition + IO) + Poller->>gopsutil: Collect Network + Poller->>gopsutil: Collect Sensors + + gopsutil-->>Poller: SystemInfo + Poller->>Period: Add(SystemInfo) + end + + API->>Period: Get(filter) + Period-->>API: Entries + API->>API: aggregate(entries, mode) + API-->>Client: Chart data +``` + +### Collection Categories + +| Category | Data Source | Optional | Rate Metrics | +| -------- | ------------------------------------------------------ | -------- | --------------------- | +| CPU | `cpu.PercentWithContext` | Yes | No | +| Memory | `mem.VirtualMemoryWithContext` | Yes | No | +| Disks | `disk.PartitionsWithContext` + `disk.UsageWithContext` | Yes | Yes (read/write/IOPS) | +| Network | `net.IOCountersWithContext` | Yes | Yes (upload/download) | +| Sensors | `sensors.TemperaturesWithContext` | Yes | No | + +### Aggregation Modes + +Each mode produces chart-friendly output: + +**CPU Average:** + +```json +[ + { "timestamp": 1704892800, "cpu_average": 45.5 }, + { "timestamp": 1704892810, "cpu_average": 52.3 } +] +``` + +**Memory Usage:** + +```json +[ + { "timestamp": 1704892800, "memory_usage": 8388608000 }, + { "timestamp": 1704892810, "memory_usage": 8453440000 } +] +``` + +**Disk Read/Write Speed:** + +```json +[ + { "timestamp": 1704892800, "sda": 10485760, "sdb": 5242880 }, + { "timestamp": 1704892810, "sda": 15728640, "sdb": 4194304 } +] +``` + +## Configuration Surface + +### Disabling Metrics Categories + +Metrics categories can be disabled via environment variables: + +| Variable | Purpose | +| ------------------------- | ------------------------------------------- | +| `METRICS_DISABLE_CPU` | Set to "true" to disable CPU collection | +| `METRICS_DISABLE_MEMORY` | Set to "true" to disable memory collection | +| `METRICS_DISABLE_DISK` | Set to "true" to disable disk collection | +| `METRICS_DISABLE_NETWORK` | Set to "true" to disable network collection | +| `METRICS_DISABLE_SENSORS` | Set to "true" to disable sensor collection | + +## 
Dependency and Integration Map + +### Internal Dependencies + +| Package | Purpose | +| -------------------------------- | --------------------- | +| `internal/metrics/period` | Time-bucketed storage | +| `internal/common` | Configuration flags | +| `github.com/yusing/goutils/errs` | Error handling | + +### External Dependencies + +| Dependency | Purpose | +| ------------------------------- | ------------------------- | +| `github.com/shirou/gopsutil/v4` | System metrics collection | +| `github.com/rs/zerolog` | Logging | + +### Integration Points + +- gopsutil provides raw system metrics +- period package handles storage and persistence +- HTTP API provides query interface + +## Observability + +### Logs + +| Level | When | +| ----- | ------------------------------------------ | +| Warn | Non-critical errors (e.g., no sensor data) | +| Error | Other errors | + +### Metrics + +No metrics exposed directly. Collection errors are logged. + +## Failure Modes and Recovery + +| Failure | Detection | Recovery | +| --------------- | ------------------------------------ | -------------------------------- | +| No CPU data | `cpu.Percent` returns error | Skip and log later with warning | +| No memory data | `mem.VirtualMemory` returns error | Skip and log later with warning | +| No disk data | `disk.Usage` returns error for all | Skip and log later with warning | +| No network data | `net.IOCounters` returns error | Skip and log later with warning | +| No sensor data | `sensors.Temperatures` returns error | Skip and log later with warning | +| Context timeout | Context deadline exceeded | Return partial data with warning | + +### Partial Collection + +The package uses `gperr.NewGroup` to collect errors from concurrent operations: + +```go +errs := gperr.NewGroup("failed to get system info") +errs.Go(func() error { return s.collectCPUInfo(ctx) }) +errs.Go(func() error { return s.collectMemoryInfo(ctx) }) +// ... 
+result := errs.Wait() +``` + +Warnings (like `ENODATA`) are logged but don't fail the collection. +Critical errors cause the function to return an error. + +## Usage Examples + +### Starting the Poller + +```go +import "github.com/yusing/godoxy/internal/metrics/systeminfo" + +func init() { + systeminfo.Poller.Start() +} +``` + +### HTTP Endpoint + +```go +import "github.com/gin-gonic/gin" + +func setupMetricsAPI(r *gin.Engine) { + r.GET("/api/metrics/system", systeminfo.Poller.ServeHTTP) +} +``` + +**API Examples:** + +```bash +# Get latest metrics +curl http://localhost:8080/api/metrics/system + +# Get 1-hour history with CPU aggregation +curl "http://localhost:8080/api/metrics/system?period=1h&aggregate=cpu_average" + +# Get 24-hour memory usage history +curl "http://localhost:8080/api/metrics/system?period=1d&aggregate=memory_usage_percent" + +# Get disk I/O for the last hour +curl "http://localhost:8080/api/metrics/system?period=1h&aggregate=disks_read_speed" +``` + +### WebSocket Streaming + +```javascript +const ws = new WebSocket( + "ws://localhost:8080/api/metrics/system?period=1m&interval=5s&aggregate=cpu_average" +); + +ws.onmessage = (event) => { + const data = JSON.parse(event.data); + console.log("CPU:", data.data); +}; +``` + +### Direct Data Access + +```go +// Get entries for the last hour +entries, ok := systeminfo.Poller.Get(period.MetricsPeriod1h) +for _, entry := range entries { + if entry.CPUAverage != nil { + fmt.Printf("CPU: %.1f%% at %d\n", *entry.CPUAverage, entry.Timestamp) + } +} + +// Get the most recent metrics +latest := systeminfo.Poller.GetLastResult() +``` + +### Disabling Metrics at Runtime + +```go +import ( + "github.com/yusing/godoxy/internal/common" + "github.com/yusing/godoxy/internal/metrics/systeminfo" +) + +func init() { + // Disable expensive sensor collection + common.MetricsDisableSensors = true + systeminfo.Poller.Start() +} +``` + +## Performance Characteristics + +- O(1) per metric collection (gopsutil handles 
complexity)
- Concurrent collection of all categories
- Rate calculations O(n) where n = number of disks/interfaces
- Memory: O(5 * 100 * sizeof(SystemInfo))
- JSON serialization O(n) for API responses

### Collection Latency

| Category | Typical Latency                        |
| -------- | -------------------------------------- |
| CPU      | ~10-50ms                               |
| Memory   | ~5-10ms                                |
| Disks    | ~10-100ms (depends on partition count) |
| Network  | ~5-10ms                                |
| Sensors  | ~10-50ms                               |

## Testing Notes

- Mock gopsutil calls for unit tests
- Test with real metrics to verify rate calculations
- Test aggregation modes with various data sets
- Verify disable flags work correctly
- Test partial failure scenarios

## Related Packages

- `internal/metrics/period` - Time-bucketed storage
- `internal/api/v1/metrics` - HTTP API endpoints
- `github.com/shirou/gopsutil/v4` - System metrics library

diff --git a/internal/metrics/uptime/README.md b/internal/metrics/uptime/README.md new file mode 100644 index 00000000..4c5b985c --- /dev/null +++ b/internal/metrics/uptime/README.md @@ -0,0 +1,402 @@

# Uptime

Tracks and aggregates route health status over time, providing uptime/downtime statistics and latency metrics.

## Overview

The uptime package monitors route health status and calculates uptime percentages over configurable time periods. It integrates with the `period` package for historical storage and provides aggregated statistics for visualization.

### Primary Consumers

- `internal/api/v1/metrics` - HTTP endpoint for uptime data
- `internal/homepage` - Dashboard uptime widgets
- Monitoring and alerting systems

### Non-goals

- Does not perform health checks (handled by `internal/route/routes`)
- Does not provide alerting on downtime
- Does not persist data beyond the period package retention
- Does not aggregate across multiple GoDoxy instances

### Stability

Internal package. Data format and API are stable.
+ +## Public API + +### Exported Types + +#### StatusByAlias + +```go +type StatusByAlias struct { + Map map[string]routes.HealthInfoWithoutDetail `json:"statuses"` + Timestamp int64 `json:"timestamp"` +} +``` + +Container for health status of all routes at a specific time. + +#### Status + +```go +type Status struct { + Status types.HealthStatus `json:"status" swaggertype:"string" enums:"healthy,unhealthy,unknown,napping,starting"` + Latency int32 `json:"latency"` + Timestamp int64 `json:"timestamp"` +} +``` + +Individual route status at a point in time. + +#### RouteAggregate + +```go +type RouteAggregate struct { + Alias string `json:"alias"` + DisplayName string `json:"display_name"` + Uptime float32 `json:"uptime"` + Downtime float32 `json:"downtime"` + Idle float32 `json:"idle"` + AvgLatency float32 `json:"avg_latency"` + IsDocker bool `json:"is_docker"` + IsExcluded bool `json:"is_excluded"` + CurrentStatus types.HealthStatus `json:"current_status" swaggertype:"string" enums:"healthy,unhealthy,unknown,napping,starting"` + Statuses []Status `json:"statuses"` +} +``` + +Aggregated statistics for a single route. + +#### Aggregated + +```go +type Aggregated []RouteAggregate +``` + +Slice of route aggregates, sorted alphabetically by alias. + +### Exported Variables + +#### Poller + +```go +var Poller = period.NewPoller("uptime", getStatuses, aggregateStatuses) +``` + +Pre-configured poller for uptime metrics. Start with `Poller.Start()`. + +### Unexported Functions + +#### getStatuses + +```go +func getStatuses(ctx context.Context, _ StatusByAlias) (StatusByAlias, error) +``` + +Collects current status of all routes. Called by the period poller every second. 
+ +**Returns:** + +- `StatusByAlias` - Map of all route statuses with current timestamp +- `error` - Always nil (errors are logged internally) + +#### aggregateStatuses + +```go +func aggregateStatuses(entries []StatusByAlias, query url.Values) (int, Aggregated) +``` + +Aggregates status entries into route statistics. + +**Query Parameters:** + +- `period` - Time filter (5m, 15m, 1h, 1d, 1mo) +- `limit` - Maximum number of routes to return (0 = all) +- `offset` - Offset for pagination +- `keyword` - Fuzzy search keyword for filtering routes + +**Returns:** + +- `int` - Total number of routes matching the query +- `Aggregated` - Slice of route aggregates + +## Architecture + +### Core Components + +```mermaid +flowchart TD + subgraph Health Monitoring + Routes[Routes] -->|GetHealthInfoWithoutDetail| Status[Status Map] + Status -->|Polls every| Second[1 Second] + end + + subgraph Poller + Poll[getStatuses] -->|Collects| StatusByAlias + StatusByAlias -->|Stores in| Period[Period StatusByAlias] + end + + subgraph Aggregation + Query[Query Params] -->|Filters| Aggregate[aggregateStatuses] + Aggregate -->|Calculates| RouteAggregate + RouteAggregate -->|Uptime| UP[Uptime %] + RouteAggregate -->|Downtime| DOWN[Downtime %] + RouteAggregate -->|Idle| IDLE[Idle %] + RouteAggregate -->|Latency| LAT[Avg Latency] + end + + subgraph Response + RouteAggregate -->|JSON| Client[API Client] + end +``` + +### Data Flow + +```mermaid +sequenceDiagram + participant Routes as Route Registry + participant Poller as Uptime Poller + participant Period as Period Storage + participant API as HTTP API + + Routes->>Poller: GetHealthInfoWithoutDetail() + Poller->>Period: Add(StatusByAlias) + + loop Every second + Poller->>Routes: Collect status + Poller->>Period: Store status + end + + API->>Period: Get(filter) + Period-->>API: Entries + API->>API: aggregateStatuses() + API-->>Client: Aggregated JSON +``` + +### Status Types + +| Status | Description | Counted as Uptime? 
| +| ----------- | ------------------------------ | ------------------ | +| `healthy` | Route is responding normally | Yes | +| `unhealthy` | Route is not responding | No | +| `unknown` | Status could not be determined | Excluded | +| `napping` | Route is in idle/sleep state | Idle (separate) | +| `starting` | Route is starting up | Idle (separate) | + +### Calculation Formula + +For a set of status entries: + +``` +Uptime = healthy_count / total_count +Downtime = unhealthy_count / total_count +Idle = (napping_count + starting_count) / total_count +AvgLatency = sum(latency) / count +``` + +Note: `unknown` statuses are excluded from all calculations. + +## Configuration Surface + +No explicit configuration. The poller uses period package defaults: + +| Parameter | Value | +| ------------- | ---------------------------- | +| Poll Interval | 1 second | +| Retention | 5m, 15m, 1h, 1d, 1mo periods | + +## Dependency and Integration Map + +### Internal Dependencies + +| Package | Purpose | +| ------------------------- | --------------------- | +| `internal/route/routes` | Health info retrieval | +| `internal/metrics/period` | Time-bucketed storage | +| `internal/types` | HealthStatus enum | +| `internal/metrics/utils` | Query utilities | + +### External Dependencies + +| Dependency | Purpose | +| ---------------------------------------- | ---------------- | +| `github.com/lithammer/fuzzysearch/fuzzy` | Keyword matching | +| `github.com/bytedance/sonic` | JSON marshaling | + +### Integration Points + +- Route health monitors provide status via `routes.GetHealthInfoWithoutDetail()` +- Period poller handles data collection and storage +- HTTP API provides query interface via `Poller.ServeHTTP` + +## Observability + +### Logs + +Poller lifecycle and errors are logged via zerolog. + +### Metrics + +No metrics exposed directly. Status data available via API. 

## Failure Modes and Recovery

| Failure                          | Detection                         | Recovery                       |
| -------------------------------- | --------------------------------- | ------------------------------ |
| Route health monitor unavailable | Empty map returned                | Log warning, continue          |
| Invalid query parameters         | `aggregateStatuses` returns empty | Return empty result            |
| Poller panic                     | Goroutine crash                   | Process terminates             |
| Persistence failure              | Load/save error                   | Log, continue with empty state |

### Fuzzy Search

The package uses `fuzzy.MatchFold` for keyword matching:

- Case-insensitive matching
- Fuzzy (subsequence) matching, so the keyword need not be a contiguous substring

## Usage Examples

### Starting the Poller

```go
import "github.com/yusing/godoxy/internal/metrics/uptime"

func init() {
	uptime.Poller.Start()
}
```

### HTTP Endpoint

```go
import (
	"github.com/gin-gonic/gin"
	"github.com/yusing/godoxy/internal/metrics/uptime"
)

func setupUptimeAPI(r *gin.Engine) {
	r.GET("/api/uptime", uptime.Poller.ServeHTTP)
}
```

**API Examples:**

```bash
# Get latest status
curl http://localhost:8080/api/uptime

# Get 1-hour history
curl "http://localhost:8080/api/uptime?period=1h"

# Get with limit and offset (pagination)
curl "http://localhost:8080/api/uptime?limit=10&offset=0"

# Search for routes containing "api"
curl "http://localhost:8080/api/uptime?keyword=api"

# Combined query
curl "http://localhost:8080/api/uptime?period=1d&limit=20&offset=0&keyword=docker"
```

### WebSocket Streaming

```javascript
const ws = new WebSocket(
  "ws://localhost:8080/api/uptime?period=1m&interval=5s"
);

ws.onmessage = (event) => {
  const data = JSON.parse(event.data);
  data.data.forEach((route) => {
    console.log(`${route.display_name}: ${route.uptime * 100}% uptime`);
  });
};
```

### Direct Data Access

```go
// Get entries for the last hour (the boolean return reports whether
// the requested period exists; ignored here for brevity)
entries, _ := uptime.Poller.Get(period.MetricsPeriod1h)
for _, entry := range entries {
	for alias, status := range 
entry.Map {
		fmt.Printf("Route %s: %s (latency: %dms)\n",
			alias, status.Status, status.Latency) // Latency is int32 milliseconds
	}
}

// Get aggregated statistics.
// Note: aggregateStatuses is unexported, so this call is illustrative
// and only compiles inside the uptime package; external consumers
// should query aggregated data via the HTTP endpoint instead.
_, agg := uptime.aggregateStatuses(entries, url.Values{
	"period": []string{"1h"},
})

for _, route := range agg {
	fmt.Printf("%s: %.1f%% uptime, %.1fms avg latency\n",
		route.DisplayName, route.Uptime*100, route.AvgLatency)
}
```

### Response Format

**Latest Status Response:**

```json
{
  "alias1": {
    "status": "healthy",
    "latency": 45
  },
  "alias2": {
    "status": "unhealthy",
    "latency": 0
  }
}
```

**Aggregated Response:**

```json
{
  "total": 5,
  "data": [
    {
      "alias": "api-server",
      "display_name": "API Server",
      "uptime": 0.98,
      "downtime": 0.02,
      "idle": 0.0,
      "avg_latency": 45.5,
      "is_docker": true,
      "is_excluded": false,
      "current_status": "healthy",
      "statuses": [
        { "status": "healthy", "latency": 45, "timestamp": 1704892800 }
      ]
    }
  ]
}
```

## Performance Characteristics

- O(n) status collection per poll where n = number of routes
- O(m * k) aggregation where m = entries, k = routes
- Memory: O(p * r * s) where p = periods, r = routes, s = status size
- Fuzzy search is O(routes * keyword_length)

## Testing Notes

- Mock `routes.GetHealthInfoWithoutDetail()` for testing
- Test aggregation with known status sequences
- Verify pagination and filtering logic
- Test fuzzy search matching

## Related Packages

- `internal/route/routes` - Route health monitoring
- `internal/metrics/period` - Time-bucketed metrics storage
- `internal/types` - Health status types

diff --git a/internal/net/README.md b/internal/net/README.md new file mode 100644 index 00000000..f2fa745b --- /dev/null +++ b/internal/net/README.md @@ -0,0 +1,144 @@

# Network Utilities

The net package provides network utility functions for GoDoxy, including TCP connection testing and related helpers.

## Overview

The net package implements helpers used throughout GoDoxy for connectivity testing and other TCP-level operations.

### Key Features

- TCP connection testing (ping)
- Connection utilities

## Core Functions

### TCP Ping

```go
// PingTCP pings a TCP endpoint by attempting a connection.
func PingTCP(ctx context.Context, ip net.IP, port int) error
```

## Usage

### Basic Usage

```go
import (
	"context"
	"fmt"
	"net"

	// The internal package is also named "net", so alias it to avoid
	// clashing with the standard library import above.
	gonet "github.com/yusing/godoxy/internal/net"
)

func checkService(ctx context.Context, ip string, port int) error {
	addr := net.ParseIP(ip) // standard library
	if addr == nil {
		return fmt.Errorf("invalid IP: %s", ip)
	}

	err := gonet.PingTCP(ctx, addr, port) // internal package
	if err != nil {
		return fmt.Errorf("service %s:%d unreachable: %w", ip, port, err)
	}

	fmt.Printf("Service %s:%d is reachable\n", ip, port)
	return nil
}
```

### Timeout Usage

```go
ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
defer cancel()

ip := net.ParseIP("192.168.1.100") // standard library net
err := gonet.PingTCP(ctx, ip, 8080) // aliased internal package (see above)

if err != nil {
	if errors.Is(err, context.DeadlineExceeded) {
		log.Println("Connection timed out")
	} else {
		log.Printf("Connection failed: %v", err)
	}
}
```

## Implementation

```go
func PingTCP(ctx context.Context, ip net.IP, port int) error {
	var dialer net.Dialer
	// Note: "%s:%d" does not bracket IPv6 literals;
	// net.JoinHostPort would handle that case.
	conn, err := dialer.DialContext(ctx, "tcp", fmt.Sprintf("%s:%d", ip, port))
	if err != nil {
		return err
	}
	conn.Close()
	return nil
}
```

## Data Flow

```mermaid
sequenceDiagram
    participant Caller
    participant Dialer
    participant TCPEndpoint
    participant Connection

    Caller->>Dialer: DialContext("tcp", "ip:port")
    Dialer->>TCPEndpoint: SYN
    TCPEndpoint-->>Dialer: SYN-ACK
    Dialer->>Connection: Create connection
    Connection-->>Dialer: Connection
    Dialer-->>Caller: nil error

    Note over Caller,Connection: Connection immediately closed
    Connection->>TCPEndpoint: FIN
    TCPEndpoint-->>Connection: 
FIN-ACK
```

## Use Cases

### Service Health Check

```go
// gonet is assumed to alias the internal package
// (gonet "github.com/yusing/godoxy/internal/net"),
// avoiding a clash with the standard library net import.
func checkServices(ctx context.Context, services []Service) error {
	for _, svc := range services {
		ip := net.ParseIP(svc.IP)
		if ip == nil {
			return fmt.Errorf("invalid IP for %s: %s", svc.Name, svc.IP)
		}

		if err := gonet.PingTCP(ctx, ip, svc.Port); err != nil {
			return fmt.Errorf("service %s (%s:%d) unreachable: %w",
				svc.Name, svc.IP, svc.Port, err)
		}
	}
	return nil
}
```

### Proxmox Container Reachability

```go
// Check if a Proxmox container is reachable on its proxy port.
// gonet aliases the internal net package as in the previous example.
func checkContainerReachability(ctx context.Context, node *proxmox.Node, vmid int, port int) error {
	ips, err := node.LXCGetIPs(ctx, vmid)
	if err != nil {
		return err
	}

	for _, ip := range ips {
		if err := gonet.PingTCP(ctx, ip, port); err == nil {
			return nil // Found reachable IP
		}
	}

	return fmt.Errorf("no reachable IP found for container %d", vmid)
}
```

## Related Packages

- **Route**: Uses TCP ping for load balancing health checks
- **Proxmox**: Uses TCP ping to verify container reachability
- **Idlewatcher**: Uses TCP ping to check idle status

diff --git a/internal/net/gphttp/README.md b/internal/net/gphttp/README.md new file mode 100644 index 00000000..0115d176 --- /dev/null +++ b/internal/net/gphttp/README.md @@ -0,0 +1,146 @@

# gphttp

HTTP utilities package providing transport configuration, default HTTP client, and a wrapper around `http.ServeMux` with panic recovery.
+
+## Overview
+
+This package provides shared HTTP utilities used throughout GoDoxy:
+
+- **Default HTTP Client**: Pre-configured `http.Client` with conservative defaults (short timeouts, no keep-alives)
+- **Transport Factory**: Functions to create optimized `http.Transport` configurations
+- **ServeMux Wrapper**: Extended `http.ServeMux` with panic recovery for handler registration
+
+## Architecture
+
+```mermaid
+graph TD
+    A[HTTP Request] --> B[gphttp.Client]
+    B --> C[Transport]
+    C --> D[Network Connection]
+
+    E[Server Setup] --> F[gphttp.ServeMux]
+    F --> G[http.ServeMux]
+    G --> H[HTTP Handlers]
+```
+
+## Core Components
+
+### HTTP Client
+
+The package exports a pre-configured `http.Client`. Note that it disables keep-alives and HTTP/2, and skips TLS certificate verification (`InsecureSkipVerify: true`):
+
+```go
+var (
+    httpClient = &http.Client{
+        Timeout: 5 * time.Second,
+        Transport: &http.Transport{
+            DisableKeepAlives: true,
+            ForceAttemptHTTP2: false,
+            DialContext: (&net.Dialer{
+                Timeout:   3 * time.Second,
+                KeepAlive: 60 * time.Second,
+            }).DialContext,
+            TLSClientConfig: &tls.Config{InsecureSkipVerify: true},
+        },
+    }
+
+    Get  = httpClient.Get
+    Post = httpClient.Post
+    Head = httpClient.Head
+    Do   = httpClient.Do
+)
+```
+
+### Transport Factory
+
+Functions for creating optimized HTTP transports:
+
+```go
+// NewTransport creates an http.Transport with proxy support and optimized settings
+func NewTransport() *http.Transport
+
+// NewTransportWithTLSConfig creates an http.Transport with custom TLS configuration
+func NewTransportWithTLSConfig(tlsConfig *tls.Config) *http.Transport
+```
+
+Default transport settings:
+
+- `MaxIdleConnsPerHost`: 1000
+- `IdleConnTimeout`: 90 seconds
+- `TLSHandshakeTimeout`: 10 seconds
+- `ResponseHeaderTimeout`: 60 seconds
+- `WriteBufferSize` / `ReadBufferSize`: 16KB
+
+### ServeMux Wrapper
+
+Extended `http.ServeMux` with panic recovery:
+
+```go
+type ServeMux struct {
+    *http.ServeMux
+}
+
+func NewServeMux() ServeMux
+func (mux ServeMux) Handle(pattern string, handler http.Handler) (err error)
+func (mux ServeMux) HandleFunc(pattern string, handler http.HandlerFunc) (err error)
+```
+
+The `Handle` and `HandleFunc` methods recover from panics and return them as errors, preventing one bad handler registration from crashing the entire server.
+
+## Usage Examples
+
+### Using the Default Client
+
+```go
+import "github.com/yusing/godoxy/internal/net/gphttp"
+
+// Simple GET request
+resp, err := gphttp.Get("https://example.com")
+if err != nil {
+    log.Fatal(err)
+}
+defer resp.Body.Close()
+
+// POST request (reuses the variables declared above)
+resp, err = gphttp.Post("https://example.com", "application/json", body)
+```
+
+### Creating Custom Transports
+
+```go
+import (
+    "crypto/tls"
+
+    "github.com/yusing/godoxy/internal/net/gphttp"
+)
+
+// Default transport with environment proxy
+transport := gphttp.NewTransport()
+
+// Custom TLS configuration
+tlsConfig := &tls.Config{
+    ServerName: "example.com",
+}
+tlsTransport := gphttp.NewTransportWithTLSConfig(tlsConfig)
+```
+
+### Using ServeMux with Panic Recovery
+
+```go
+mux := gphttp.NewServeMux()
+
+// Register handlers - panics are converted to errors
+if err := mux.HandleFunc("/api", apiHandler); err != nil {
+    log.Printf("handler registration failed: %v", err)
+}
+```
+
+## Integration Points
+
+- Used by `internal/net/gphttp/middleware` for HTTP request/response processing
+- Used by `internal/net/gphttp/loadbalancer` for backend connections
+- Used throughout the route handling system
+
+## Configuration
+
+The default client disables HTTP/2 (`ForceAttemptHTTP2: false`) and keep-alives (`DisableKeepAlives: true`) for security and compatibility reasons. The transport uses environment proxy settings via `http.ProxyFromEnvironment`.
diff --git a/internal/net/gphttp/loadbalancer/README.md b/internal/net/gphttp/loadbalancer/README.md new file mode 100644 index 00000000..97432657 --- /dev/null +++ b/internal/net/gphttp/loadbalancer/README.md @@ -0,0 +1,304 @@ +# Load Balancer + +Load balancing package providing multiple distribution algorithms, sticky sessions, and server health management. + +## Overview + +This package implements a flexible load balancer for distributing HTTP requests across multiple backend servers. It supports multiple balancing algorithms and integrates with GoDoxy's task management and health monitoring systems. + +## Architecture + +```mermaid +graph TD + A[HTTP Request] --> B[LoadBalancer] + B --> C{Algorithm} + C -->|Round Robin| D[RoundRobin] + C -->|Least Connections| E[LeastConn] + C -->|IP Hash| F[IPHash] + + D --> G[Available Servers] + E --> G + F --> G + + G --> H[Server Selection] + H --> I{Sticky Session?} + I -->|Yes| J[Set Cookie] + I -->|No| K[Continue] + + J --> L[ServeHTTP] + K --> L +``` + +## Algorithms + +### Round Robin + +Distributes requests evenly across all available servers in sequence. + +```mermaid +sequenceDiagram + participant C as Client + participant LB as LoadBalancer + participant S1 as Server 1 + participant S2 as Server 2 + participant S3 as Server 3 + + C->>LB: Request 1 + LB->>S1: Route to Server 1 + C->>LB: Request 2 + LB->>S2: Route to Server 2 + C->>LB: Request 3 + LB->>S3: Route to Server 3 + C->>LB: Request 4 + LB->>S1: Route to Server 1 +``` + +### Least Connections + +Routes requests to the server with the fewest active connections. + +```mermaid +flowchart LR + subgraph LB["Load Balancer"] + direction TB + A["Server A
3 connections"] + B["Server B
1 connection"] + C["Server C
5 connections"] + end + + New["New Request"] --> B +``` + +### IP Hash + +Consistently routes requests from the same client IP to the same server using hash-based distribution. + +```mermaid +graph LR + Client1["Client IP: 192.168.1.10"] -->|Hash| ServerA + Client2["Client IP: 192.168.1.20"] -->|Hash| ServerB + Client3["Client IP: 192.168.1.30"] -->|Hash| ServerA +``` + +## Core Components + +### LoadBalancer + +```go +type LoadBalancer struct { + *types.LoadBalancerConfig + task *task.Task + pool pool.Pool[types.LoadBalancerServer] + poolMu sync.Mutex + sumWeight int + startTime time.Time +} +``` + +**Key Methods:** + +```go +// Create a new load balancer from configuration +func New(cfg *types.LoadBalancerConfig) *LoadBalancer + +// Start the load balancer as a background task +func (lb *LoadBalancer) Start(parent task.Parent) gperr.Error + +// Update configuration dynamically +func (lb *LoadBalancer) UpdateConfigIfNeeded(cfg *types.LoadBalancerConfig) + +// Add a backend server +func (lb *LoadBalancer) AddServer(srv types.LoadBalancerServer) + +// Remove a backend server +func (lb *LoadBalancer) RemoveServer(srv types.LoadBalancerServer) + +// ServeHTTP implements http.Handler +func (lb *LoadBalancer) ServeHTTP(rw http.ResponseWriter, r *http.Request) +``` + +### Server + +```go +type server struct { + name string + url *nettypes.URL + weight int + http.Handler + types.HealthMonitor +} + +// Create a new backend server +func NewServer(name string, url *nettypes.URL, weight int, handler http.Handler, healthMon types.HealthMonitor) types.LoadBalancerServer +``` + +**Server Interface:** + +```go +type LoadBalancerServer interface { + Name() string + URL() *nettypes.URL + Key() string + Weight() int + SetWeight(weight int) + Status() types.HealthStatus + Latency() time.Duration + ServeHTTP(rw http.ResponseWriter, r *http.Request) + TryWake() error +} +``` + +### Sticky Sessions + +The load balancer supports sticky sessions via cookies: + +```mermaid +flowchart TD + 
A[Client Request] --> B{Cookie exists?} + B -->|No| C[Select Server] + B -->|Yes| D[Extract Server Hash] + D --> E[Find Matching Server] + C --> F[Set Cookie
godoxy_lb_sticky] + E --> G[Route to Server] + F --> G +``` + +```go +// Cookie settings +Name: "godoxy_lb_sticky" +MaxAge: Configurable (default: 24 hours) +HttpOnly: true +SameSite: Lax +Secure: Based on TLS/Forwarded-Proto +``` + +## Balancing Modes + +```go +const ( + LoadbalanceModeUnset = "" + LoadbalanceModeRoundRobin = "round_robin" + LoadbalanceModeLeastConn = "least_conn" + LoadbalanceModeIPHash = "ip_hash" +) +``` + +## Configuration + +```go +type LoadBalancerConfig struct { + Link string // Link name + Mode LoadbalanceMode // Balancing algorithm + Sticky bool // Enable sticky sessions + StickyMaxAge time.Duration // Cookie max age + Options map[string]any // Algorithm-specific options +} +``` + +## Usage Examples + +### Basic Round Robin Load Balancer + +```go +config := &types.LoadBalancerConfig{ + Link: "my-service", + Mode: types.LoadbalanceModeRoundRobin, +} + +lb := loadbalancer.New(config) +lb.Start(parentTask) + +// Add backend servers +lb.AddServer(loadbalancer.NewServer("backend-1", url1, 10, handler1, health1)) +lb.AddServer(loadbalancer.NewServer("backend-2", url2, 10, handler2, health2)) + +// Use as HTTP handler +http.Handle("/", lb) +``` + +### Least Connections with Sticky Sessions + +```go +config := &types.LoadBalancerConfig{ + Link: "api-service", + Mode: types.LoadbalanceModeLeastConn, + Sticky: true, + StickyMaxAge: 1 * time.Hour, +} + +lb := loadbalancer.New(config) +lb.Start(parentTask) + +for _, srv := range backends { + lb.AddServer(srv) +} +``` + +### IP Hash Load Balancer with Real IP + +```go +config := &types.LoadBalancerConfig{ + Link: "user-service", + Mode: types.LoadbalanceModeIPHash, + Options: map[string]any{ + "header": "X-Real-IP", + "from": []string{"10.0.0.0/8", "172.16.0.0/12"}, + "recursive": true, + }, +} + +lb := loadbalancer.New(config) +``` + +### Server Weight Management + +```go +// Servers are balanced based on weight (max total: 100) +lb.AddServer(NewServer("server1", url1, 30, handler, health)) 
+lb.AddServer(NewServer("server2", url2, 50, handler, health)) +lb.AddServer(NewServer("server3", url3, 20, handler, health)) + +// Weights are auto-rebalanced if total != 100 +``` + +## Idlewatcher Integration + +The load balancer integrates with the idlewatcher system: + +- Wake events path (`/api/wake`): Wakes all idle servers +- Favicon and loading page paths: Bypassed for sticky session handling +- Server wake support via `TryWake()` interface + +## Health Monitoring + +The load balancer implements `types.HealthMonitor`: + +```go +func (lb *LoadBalancer) Status() types.HealthStatus +func (lb *LoadBalancer) Detail() string +func (lb *LoadBalancer) Uptime() time.Duration +func (lb *LoadBalancer) Latency() time.Duration +``` + +Health JSON representation: + +```json +{ + "name": "my-service", + "status": "healthy", + "detail": "3/3 servers are healthy", + "started": "2024-01-01T00:00:00Z", + "uptime": "1h2m3s", + "latency": "10ms", + "extra": { + "config": {...}, + "pool": {...} + } +} +``` + +## Thread Safety + +- Server pool operations are protected by `poolMu` mutex +- Algorithm-specific state uses atomic operations or dedicated synchronization +- Least connections uses `xsync.Map` for thread-safe connection counting diff --git a/internal/net/gphttp/middleware/README.md b/internal/net/gphttp/middleware/README.md new file mode 100644 index 00000000..b649c4fc --- /dev/null +++ b/internal/net/gphttp/middleware/README.md @@ -0,0 +1,336 @@ +# Middleware + +HTTP middleware framework providing request/response processing, middleware chaining, and composition from YAML files. + +## Overview + +This package implements a flexible HTTP middleware system for GoDoxy. Middleware can modify requests before they reach the backend and modify responses before they return to the client. 
The system supports: + +- **Request Modifiers**: Process requests before forwarding +- **Response Modifiers**: Modify responses before returning to client +- **Middleware Chaining**: Compose multiple middleware in priority order +- **YAML Composition**: Define middleware chains in configuration files +- **Bypass Rules**: Skip middleware based on request properties +- **Dynamic Loading**: Load middleware definitions from files at runtime + +## Architecture + +```mermaid +graph TD + A[HTTP Request] --> B[Middleware Chain] + + subgraph Chain [Middleware Pipeline] + direction LR + B1[RedirectHTTP] --> B2[RealIP] + B2 --> B3[RateLimit] + B3 --> B4[OIDC] + B4 --> B5[CustomErrorPage] + end + + Chain --> C[Backend Handler] + C --> D[Response Modifier] + + subgraph ResponseChain [Response Pipeline] + direction LR + D1[CustomErrorPage] --> D2[ModifyResponse] + D2 --> D3[ModifyHTML] + end + + ResponseChain --> E[HTTP Response] +``` + +## Middleware Flow + +```mermaid +sequenceDiagram + participant C as Client + participant M as Middleware Chain + participant B as Backend + participant R as Response Chain + participant C2 as Client + + C->>M: HTTP Request + M->>M: before() - RequestModifier + M->>M: Check Bypass Rules + M->>M: Sort by Priority + + par Request Modifiers + M->>M: Middleware 1 (before) + M->>M: Middleware 2 (before) + end + + M->>B: Forward Request + + B-->>M: HTTP Response + + par Response Modifiers + M->>R: ResponseModifier 1 + M->>R: ResponseModifier 2 + end + + R-->>C2: Modified Response +``` + +## Core Components + +### Middleware + +```go +type Middleware struct { + name string + construct ImplNewFunc + impl any + commonOptions +} + +type commonOptions struct { + Priority int `json:"priority"` // Default: 10, 0 is highest + Bypass Bypass `json:"bypass"` +} +``` + +**Interfaces:** + +```go +// RequestModifier - modify or filter requests +type RequestModifier interface { + before(w http.ResponseWriter, r *http.Request) (proceed bool) +} + +// ResponseModifier 
- modify responses +type ResponseModifier interface { + modifyResponse(r *http.Response) error +} + +// MiddlewareWithSetup - one-time setup after construction +type MiddlewareWithSetup interface { + setup() +} + +// MiddlewareFinalizer - finalize after options applied +type MiddlewareFinalizer interface { + finalize() +} + +// MiddlewareFinalizerWithError - finalize with error handling +type MiddlewareFinalizerWithError interface { + finalize() error +} +``` + +### Middleware Chain + +```go +type middlewareChain struct { + beforess []RequestModifier + modResps []ResponseModifier +} + +func NewMiddlewareChain(name string, chain []*Middleware) *Middleware +``` + +### Bypass Rules + +```go +type Bypass []rules.RuleOn + +// ShouldBypass checks if request should skip middleware +func (b Bypass) ShouldBypass(w http.ResponseWriter, r *http.Request) bool +``` + +## Available Middleware + +| Name | Type | Description | +| ------------------------------- | -------- | ------------------------------------------ | +| `redirecthttp` | Request | Redirect HTTP to HTTPS | +| `oidc` | Request | OIDC authentication | +| `forwardauth` | Request | Forward authentication to external service | +| `modifyrequest` / `request` | Request | Modify request headers and path | +| `modifyresponse` / `response` | Response | Modify response headers | +| `setxforwarded` | Request | Set X-Forwarded headers | +| `hidexforwarded` | Request | Remove X-Forwarded headers | +| `modifyhtml` | Response | Inject HTML into responses | +| `themed` | Response | Apply theming to HTML | +| `errorpage` / `customerrorpage` | Response | Serve custom error pages | +| `realip` | Request | Extract real client IP from headers | +| `cloudflarerealip` | Request | Cloudflare-specific real IP extraction | +| `cidrwhitelist` | Request | Allow only specific IP ranges | +| `ratelimit` | Request | Rate limiting by IP | +| `hcaptcha` | Request | hCAPTCHA verification | + +## Usage Examples + +### Creating a Middleware + +```go 
+import "github.com/yusing/godoxy/internal/net/gphttp/middleware" + +type myMiddleware struct { + SomeOption string `json:"some_option"` +} + +func (m *myMiddleware) before(w http.ResponseWriter, r *http.Request) bool { + // Process request + r.Header.Set("X-Custom", m.SomeOption) + return true // false would block the request +} + +var MyMiddleware = middleware.NewMiddleware[myMiddleware]() +``` + +### Building Middleware from Map + +```go +middlewaresMap := map[string]middleware.OptionsRaw{ + "realip": { + "priority": 5, + "header": "X-Real-IP", + "from": []string{"10.0.0.0/8"}, + }, + "ratelimit": { + "priority": 10, + "average": 10, + "burst": 20, + }, +} + +mid, err := middleware.BuildMiddlewareFromMap("my-chain", middlewaresMap) +if err != nil { + log.Fatal(err) +} +``` + +### YAML Composition + +```yaml +# config/middlewares/my-chain.yml +- use: realip + header: X-Real-IP + from: + - 10.0.0.0/8 + - 172.16.0.0/12 + bypass: + - path glob("/public/*") + +- use: ratelimit + average: 100 + burst: 200 + +- use: oidc + allowed_users: + - user@example.com +``` + +```go +// Load from file +eb := &gperr.Builder{} +middlewares := middleware.BuildMiddlewaresFromComposeFile( + "config/middlewares/my-chain.yml", + eb, +) +``` + +### Applying Middleware to Reverse Proxy + +```go +import "github.com/yusing/goutils/http/reverseproxy" + +rp := &reverseproxy.ReverseProxy{ + Target: backendURL, +} + +err := middleware.PatchReverseProxy(rp, middlewaresMap) +if err != nil { + log.Fatal(err) +} +``` + +### Bypass Rules + +```go +bypassRules := middleware.Bypass{ + { + Type: rules.RuleOnTypePathPrefix, + Value: "/public", + }, + { + Type: rules.RuleOnTypePath, + Value: "/health", + }, +} + +mid, _ := middleware.RateLimiter.New(middleware.OptionsRaw{ + "bypass": bypassRules, + "average": 10, + "burst": 20, +}) +``` + +## Priority + +Middleware are executed in priority order (lower number = higher priority): + +```mermaid +graph LR + A[Priority 0] --> B[Priority 5] + B --> C[Priority 
10] + C --> D[Priority 20] + + style A fill:#14532d,stroke:#fff,color:#fff + style B fill:#14532d,stroke:#fff,color:#fff + style C fill:#44403c,stroke:#fff,color:#fff + style D fill:#44403c,stroke:#fff,color:#fff +``` + +## Request Processing + +```mermaid +flowchart TD + A[Request] --> B{Has Bypass Rules?} + B -->|Yes| C{Match Bypass?} + B -->|No| D[Execute before#40;#41;] + + C -->|Match| E[Skip Middleware
Proceed to Next] + C -->|No Match| D + + D --> F{before#40;#41; Returns?} + F -->|true| G[Continue to Next] + F -->|false| H[Stop Pipeline] + + G --> I[Backend Handler] + I --> J[Response] + J --> K{Has Response Modifier?} + K -->|Yes| L[Execute modifyResponse] + K -->|No| M[Return Response] + L --> M +``` + +## Integration Points + +- **Error Pages**: Uses `errorpage` package for custom error responses +- **Authentication**: Integrates with `internal/auth` for OIDC +- **Rate Limiting**: Uses `golang.org/x/time/rate` +- **IP Processing**: Uses `internal/net/types` for CIDR handling + +## Error Handling + +Errors during middleware construction are collected and reported: + +```go +var errs gperr.Builder +for name, opts := range middlewaresMap { + m, err := middleware.Get(name) + if err != nil { + errs.Add(err) + continue + } + mid, err := m.New(opts) + if err != nil { + errs.AddSubjectf(err, "middlewares.%s", name) + continue + } +} +if errs.HasError() { + log.Error().Err(errs.Error()).Msg("middleware compilation failed") +} +``` diff --git a/internal/net/gphttp/middleware/captcha/README.md b/internal/net/gphttp/middleware/captcha/README.md new file mode 100644 index 00000000..eeb0d57a --- /dev/null +++ b/internal/net/gphttp/middleware/captcha/README.md @@ -0,0 +1,264 @@ +# Captcha Middleware + +CAPTCHA verification middleware package providing session-based captcha challenge and verification. + +## Overview + +This package implements CAPTCHA verification middleware that protects routes by requiring users to complete a CAPTCHA challenge before accessing the protected resource. It supports pluggable providers (currently hCAPTCHA) and uses encrypted sessions for verification state. + +## Architecture + +```mermaid +graph TD + A[Client Request] --> B{Captcha Session?} + B -->|Valid| C[Proceed to Backend] + B -->|Invalid| D[Show CAPTCHA Page] + + D --> E{POST with Token?} + E -->|Valid| F[Create Session
Set Cookie] + E -->|Invalid| G[Show Error] + F --> C + + subgraph Captcha Provider + H[hCAPTCHA API] + D -->|Script/Form HTML| H + F -->|Verify Token| H + end + + subgraph Session Store + I[CaptchaSessions
jsonstore] + end + + F --> I + I -.->|Session Check| B +``` + +## Captcha Flow + +```mermaid +sequenceDiagram + participant C as Client + participant M as Middleware + participant P as Provider + participant S as Session Store + participant B as Backend + + C->>M: Request (no session) + M->>M: Check cookie + M->>M: Session not found/expired + M->>C: Send CAPTCHA Page + + C->>M: POST with captcha response + M->>P: Verify token + P-->>M: Verification result + + alt Verification successful + M->>S: Store session + M->>C: Set session cookie
Redirect to protected path + C->>M: Request (with session cookie) + M->>S: Validate session + M->>B: Forward request + else Verification failed + M->>C: Error: verification failed + end +``` + +## Core Components + +### Provider Interface + +```go +type Provider interface { + // CSP directives for the captcha provider + CSPDirectives() []string + // CSP sources for the captcha provider + CSPSources() []string + // Verify the captcha response from the request + Verify(r *http.Request) error + // Session expiry duration after successful verification + SessionExpiry() time.Duration + // Script HTML to include in the page + ScriptHTML() string + // Form HTML to render the captcha widget + FormHTML() string +} +``` + +### ProviderBase + +```go +type ProviderBase struct { + Expiry time.Duration `json:"session_expiry"` // Default: 24 hours +} + +func (p *ProviderBase) SessionExpiry() time.Duration +``` + +### hCAPTCHA Provider + +```go +type HcaptchaProvider struct { + ProviderBase + SiteKey string `json:"site_key" validate:"required"` + Secret string `json:"secret" validate:"required"` +} + +// CSP Directives: script-src, frame-src, style-src, connect-src +// CSP Sources: https://hcaptcha.com, https://*.hcaptcha.com +``` + +### Captcha Session + +```go +type CaptchaSession struct { + ID string `json:"id"` + Expiry time.Time `json:"expiry"` +} + +var CaptchaSessions = jsonstore.Store[*CaptchaSession]("captcha_sessions") + +func newCaptchaSession(p Provider) *CaptchaSession +func (s *CaptchaSession) expired() bool +``` + +## Middleware Integration + +```go +type hCaptcha struct { + captcha.HcaptchaProvider +} + +func (h *hCaptcha) before(w http.ResponseWriter, r *http.Request) bool { + return captcha.PreRequest(h, w, r) +} + +var HCaptcha = NewMiddleware[hCaptcha]() +``` + +### PreRequest Handler + +```go +func PreRequest(p Provider, w http.ResponseWriter, r *http.Request) (proceed bool) +``` + +This function: + +1. Checks for valid session cookie +1. 
Validates session expiry +1. Returns true if session is valid +1. For non-HTML requests, returns 403 Forbidden +1. For POST requests, verifies the captcha token +1. For GET requests, renders the CAPTCHA challenge page + +## Configuration + +### hCAPTCHA Configuration + +```yaml +middleware: + my-captcha: + use: hcaptcha + site_key: "YOUR_SITE_KEY" + secret: "YOUR_SECRET" + session_expiry: 24h # optional, default 24h +``` + +### Route Configuration + +```yaml +routes: + - host: example.com + path: /admin + middlewares: + - my-captcha +``` + +## Usage Examples + +### Basic Setup + +```go +import "github.com/yusing/godoxy/internal/net/gphttp/middleware" + +hcaptchaMiddleware := middleware.HCaptcha.New(middleware.OptionsRaw{ + "site_key": "your-site-key", + "secret": "your-secret", +}) +``` + +### Using in Middleware Chain + +```yaml +# config/middlewares/admin-protection.yml +- use: captcha + site_key: "${HCAPTCHA_SITE_KEY}" + secret: "${HCAPTCHA_SECRET}" + bypass: + - type: CIDR + value: 10.0.0.0/8 +``` + +## Session Management + +Sessions are stored in a JSON-based store with the following properties: + +- **Session ID**: 32-byte CRNG (`crypto/rand.Read`) random hex string +- **Expiry**: Configurable duration (default 24 hours) +- **Cookie**: `godoxy_captcha_session` with HttpOnly flag + +```mermaid +flowchart TD + A[Session Created] --> B[Cookie Set] + B --> C[Client Sends Cookie] + C --> D{Session Valid?} + D -->|Yes| E[Proceed] + D -->|No| F{HTML Request?} + F -->|Yes| G[Show CAPTCHA] + F -->|No| H[403 Forbidden] +``` + +## CSP Integration + +The CAPTCHA provider supplies CSP directives that should be added to the response: + +```go +// hCAPTCHA CSP Directives +CSPDirectives() []string +// Returns: ["script-src", "frame-src", "style-src", "connect-src"] + +CSPSources() []string +// Returns: ["https://hcaptcha.com", "https://*.hcaptcha.com"] +``` + +## HTML Template + +The package includes an embedded HTML template (`captcha.html`) that renders the CAPTCHA 
challenge page with: + +- Provider script (`