Health Monitor Package

Route health monitoring with configurable check intervals, retry policies, and notification integration.

Overview

Purpose

This package provides health monitoring for different route types in GoDoxy:

  • Monitors service health via configurable check functions
  • Tracks consecutive failures with configurable thresholds
  • Sends notifications on status changes
  • Provides last-seen tracking for idle detection

Primary Consumers

  • internal/route/ - Route health monitoring
  • internal/api/v1/metrics/ - Uptime poller integration
  • WebUI - Health status display

Non-goals

  • Health check execution itself (delegated to internal/health/check/)
  • Alert routing (handled by internal/notif/)
  • Automatic remediation

Stability

Internal package. Its public interfaces, including the HealthMonitor interface, are stable.

Public API

Types

type HealthCheckFunc func(url *url.URL) (result types.HealthCheckResult, err error)

HealthMonitor Interface

type HealthMonitor interface {
    Start(parent task.Parent) gperr.Error
    Task() *task.Task
    Finish(reason any)
    UpdateURL(url *url.URL)
    URL() *url.URL
    Config() *types.HealthCheckConfig
    Status() types.HealthStatus
    Uptime() time.Duration
    Latency() time.Duration
    Detail() string
    Name() string
    String() string
    CheckHealth() (types.HealthCheckResult, error)
}

Monitor Creation (new.go)

// Create monitor for agent-proxied routes
func NewAgentProxiedMonitor(
    ctx context.Context,
    cfg types.HealthCheckConfig,
    url *url.URL,
) (HealthMonitor, error)

// Create monitor for Docker containers
func NewDockerHealthMonitor(
    ctx context.Context,
    cfg types.HealthCheckConfig,
    url *url.URL,
    containerID string,
) (HealthMonitor, error)

// Create monitor for HTTP routes
func NewHTTPMonitor(
    ctx context.Context,
    cfg types.HealthCheckConfig,
    url *url.URL,
) HealthMonitor

// Create monitor for H2C (HTTP/2 cleartext) routes
func NewH2CMonitor(
    ctx context.Context,
    cfg types.HealthCheckConfig,
    url *url.URL,
) HealthMonitor

// Create monitor for file server routes
func NewFileServerMonitor(
    cfg types.HealthCheckConfig,
    url *url.URL,
) HealthMonitor

// Create monitor for stream routes
func NewStreamMonitor(
    cfg types.HealthCheckConfig,
    url *url.URL,
) HealthMonitor

// Unified monitor factory (routes to appropriate type)
func NewMonitor(
    ctx context.Context,
    cfg types.HealthCheckConfig,
    url *url.URL,
) (HealthMonitor, error)

Architecture

Monitor Selection Flow

flowchart TD
    A[NewMonitor route] --> B{IsAgent route?}
    B -->|true| C[NewAgentProxiedMonitor]
    B -->|false| D{IsDocker route?}
    D -->|true| E[NewDockerHealthMonitor]
    D -->|false| F{Has h2c scheme?}
    F -->|true| G[NewH2CMonitor]
    F -->|false| H{Has http/https scheme?}
    H -->|true| I[NewHTTPMonitor]
    H -->|false| J{Is file:// scheme?}
    J -->|true| K[NewFileServerMonitor]
    J -->|false| L[NewStreamMonitor]

Monitor State Machine

stateDiagram-v2
    [*] --> Starting: First check
    Starting --> Healthy: Check passes
    Starting --> Unhealthy: Check fails
    Healthy --> Unhealthy: 5 consecutive failures
    Healthy --> Error: Check error
    Error --> Healthy: Check passes
    Error --> Unhealthy: 5 consecutive failures
    Unhealthy --> Healthy: Check passes
    Unhealthy --> Error: Check error
    [*] --> Stopped: Task cancelled

Component Structure

classDiagram
    class monitor {
        -service string
        -config types.HealthCheckConfig
        -url synk.Value~*url.URL~
        -status synk.Value~HealthStatus~
        -lastResult synk.Value~HealthCheckResult~
        -checkHealth HealthCheckFunc
        -startTime time.Time
        -task *task.Task
        +Start(parent task.Parent)
        +CheckHealth() (HealthCheckResult, error)
        +Status() HealthStatus
        +Uptime() time.Duration
        +Latency() time.Duration
        +Detail() string
    }

    class HealthMonitor {
        <<interface>>
        +Start(parent task.Parent)
        +Task() *task.Task
        +Status() HealthStatus
    }

Configuration Surface

HealthCheckConfig

type HealthCheckConfig struct {
    Interval    time.Duration // Check interval (default: 30s)
    Timeout     time.Duration // Check timeout (default: 10s)
    Path        string        // Health check path
    Method      string        // HTTP method (GET/HEAD)
    Retries     int           // Consecutive failures before notification (-1 for immediate)
    BaseContext func() context.Context
}

Defaults

Field     Default
Interval  30s
Timeout   10s
Method    GET
Path      "/"
Retries   3

Applying Defaults

cfg.ApplyDefaults(state.Value().Defaults.HealthCheck)

Dependency and Integration Map

Internal Dependencies

  • internal/task/task.go - Lifetime management
  • internal/notif/ - Status change notifications
  • internal/health/check/ - Health check implementations
  • internal/types/ - Health status types
  • internal/config/types/ - Working state

External Dependencies

  • github.com/puzpuzpuz/xsync/v4 - Atomic values

Observability

Logs

Level  When
Info   Service comes up
Warn   Service goes down
Error  Health check error
Error  Monitor stopped after 5 trials

Notifications

  • Service up notification (with latency)
  • Service down notification (with last seen time)
  • Immediate notification when Retries < 0
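
The notification rules above can be sketched as a small counter. This is a simplified model under stated assumptions: downNotifier and observe are hypothetical names, a "down" notification fires once when consecutive failures reach Retries (or on the first failure when Retries < 0), and an "up" notification is only sent after a "down" was sent.

```go
package main

import "fmt"

// downNotifier is a hypothetical model of the notification threshold.
type downNotifier struct {
	retries      int  // configured Retries; negative means notify immediately
	failures     int  // consecutive failed checks so far
	downNotified bool // whether a "down" notification is outstanding
}

// observe records one check outcome and returns "down", "up", or ""
// when no notification should be sent.
func (n *downNotifier) observe(healthy bool) string {
	if healthy {
		n.failures = 0
		if n.downNotified {
			n.downNotified = false
			return "up"
		}
		return ""
	}
	n.failures++
	if !n.downNotified && (n.retries < 0 || n.failures >= n.retries) {
		n.downNotified = true
		return "down"
	}
	return ""
}

func main() {
	n := &downNotifier{retries: 3}
	for _, healthy := range []bool{false, false, false, false, true} {
		fmt.Printf("healthy=%v notify=%q\n", healthy, n.observe(healthy))
	}
}
```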

Metrics

  • Consecutive failure count
  • Last check latency
  • Monitor uptime

Failure Modes and Recovery

Failure Mode                 Impact                                  Recovery
5 consecutive check errors   Monitor enters Error state, task stops  Manual restart required
Health check function panic  Monitor crashes                         Automatic cleanup
Context cancellation         Monitor stops gracefully                Stopped state
URL update to invalid        Check will fail                         Manual URL fix

Status Transitions

From       To         Condition
Starting   Healthy    Check passes
Starting   Unhealthy  Check fails
Healthy    Unhealthy  Retries consecutive failures
Healthy    Error      Check returns error
Unhealthy  Healthy    Check passes
Error      Healthy    Check passes
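
The table can be read as a pure transition function. The sketch below encodes only the rows listed above; combinations the table does not list leave the status unchanged, which is an assumption. Status, Outcome, and next are hypothetical names, and failures is the consecutive failure count including the current check.

```go
package main

import "fmt"

// Status is a hypothetical stand-in for types.HealthStatus.
type Status string

const (
	Starting    Status = "starting"
	Healthy     Status = "healthy"
	Unhealthy   Status = "unhealthy"
	StatusError Status = "error"
)

// Outcome is the result of a single check.
type Outcome int

const (
	Pass Outcome = iota // check ran and reported healthy
	Fail                // check ran and reported unhealthy
	Err                 // check itself returned an error
)

// next applies one row of the transition table.
func next(cur Status, out Outcome, failures, retries int) Status {
	switch out {
	case Pass:
		return Healthy
	case Err:
		if cur == Healthy {
			return StatusError
		}
	case Fail:
		switch cur {
		case Starting:
			return Unhealthy
		case Healthy:
			if failures >= retries {
				return Unhealthy
			}
		}
	}
	return cur // unlisted combinations keep the current status (assumption)
}

func main() {
	s := Starting
	s = next(s, Pass, 0, 3)
	fmt.Println(s)
	for i := 1; i <= 3; i++ {
		s = next(s, Fail, i, 3)
		fmt.Println(i, s)
	}
}
```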

Usage Examples

Creating an HTTP Monitor

cfg := types.HealthCheckConfig{
    Interval: 15 * time.Second,
    Timeout:  5 * time.Second,
    Path:     "/health",
    Retries:  3,
}
// Name the value target so it does not shadow the net/url package.
target, err := url.Parse("http://localhost:8080")
if err != nil {
    return err
}

// Name the value mon so it does not shadow the monitor package.
mon := monitor.NewHTTPMonitor(context.Background(), cfg, target)
if err := mon.Start(parent); err != nil {
    return err
}

// Check status
fmt.Printf("Status: %s\n", mon.Status())
fmt.Printf("Latency: %v\n", mon.Latency())

Creating a Docker Monitor

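// cfg is a types.HealthCheckConfig and target is a parsed *url.URL.
mon, err := monitor.NewDockerHealthMonitor(
    context.Background(),
    cfg,
    target,
    containerID,
)
if err != nil {
    return err
}
// Start can fail, so check its result as well.
if err := mon.Start(parent); err != nil {
    return err
}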

Unified Factory

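// cfg is a types.HealthCheckConfig and target is a parsed *url.URL.
mon, err := monitor.NewMonitor(ctx, cfg, target)
if err != nil {
    return err
}
if err := mon.Start(parent); err != nil {
    return err
}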

Testing Notes

  • monitor_test.go - Monitor lifecycle tests
  • Mock health check functions for deterministic testing
  • Status transition coverage tests
  • Notification trigger tests