mirror of https://github.com/yusing/godoxy.git synced 2026-03-21 17:10:14 +01:00

Files

yusing 3001417a37 fix(health): only send recovery notification after down notification

Previously, up notifications were sent whenever a service recovered,
even if no down notification had been sent (e.g., when recovering
before the failure threshold was met). This could confuse users who
would receive "service is up" notifications without ever being
notified of a problem.

Now, recovery notifications are only sent when a prior down
notification exists, ensuring notification pairs are always complete.

2026-02-23 11:05:19 +08:00

| File | Last commit | Date |
| --- | --- | --- |
| last_seen.go | refactor: move internal/watcher/health to internal/health | 2026-01-08 15:08:02 +08:00 |
| monitor_test.go | fix(health): only send recovery notification after down notification | 2026-02-23 11:05:19 +08:00 |
| monitor.go | fix(health): only send recovery notification after down notification | 2026-02-23 11:05:19 +08:00 |
| new.go | fix(lint): improve styling and fix lint errors | 2026-02-10 16:57:41 +08:00 |
| README.md | docs: unify header to import path for package docs | 2026-02-18 03:25:32 +08:00 |

README.md

internal/health/monitor

Route health monitoring with configurable check intervals, retry policies, and notification integration.

Overview

Purpose

This package provides health monitoring for different route types in GoDoxy:

Monitors service health via configurable check functions
Tracks consecutive failures with configurable thresholds
Sends notifications on status changes
Provides last-seen tracking for idle detection
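The last-seen tracking mentioned above can be pictured as a small time-stamped map. The sketch below is only an illustration: the `lastSeen` type, its methods, and the `epoch` helper are hypothetical names and do not reflect the actual contents of `last_seen.go`.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// epoch is a fixed reference time so the example is deterministic.
var epoch = time.Unix(0, 0).UTC()

// lastSeen is a hypothetical tracker; it is not the code in last_seen.go.
type lastSeen struct {
	mu   sync.RWMutex
	seen map[string]time.Time
}

func newLastSeen() *lastSeen {
	return &lastSeen{seen: make(map[string]time.Time)}
}

// touch records that service was observed healthy at the given time.
func (l *lastSeen) touch(service string, at time.Time) {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.seen[service] = at
}

// idleFor reports how long service has gone unseen as of now.
// The boolean is false when the service was never recorded.
func (l *lastSeen) idleFor(service string, now time.Time) (time.Duration, bool) {
	l.mu.RLock()
	defer l.mu.RUnlock()
	at, ok := l.seen[service]
	if !ok {
		return 0, false
	}
	return now.Sub(at), true
}

func main() {
	tracker := newLastSeen()
	tracker.touch("api", epoch)

	idle, ok := tracker.idleFor("api", epoch.Add(90*time.Second))
	fmt.Println(idle, ok)

	_, ok = tracker.idleFor("db", epoch)
	fmt.Println(ok)
}
```

Passing the current time in as an argument, rather than calling `time.Now()` inside the tracker, keeps idle detection easy to test.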

Primary Consumers

internal/route/ - Route health monitoring
internal/api/v1/metrics/ - Uptime poller integration
WebUI - Health status display

Non-goals

Health check execution itself (delegated to internal/health/check/)
Alert routing (handled by internal/notif/)
Automatic remediation

Stability

Internal package with stable public interfaces, including the HealthMonitor interface.

Public API

Types

type HealthCheckFunc func(url *url.URL) (result types.HealthCheckResult, err error)

HealthMonitor Interface

type HealthMonitor interface {
    Start(parent task.Parent) error
    Task() *task.Task
    Finish(reason any)
    UpdateURL(url *url.URL)
    URL() *url.URL
    Config() *types.HealthCheckConfig
    Status() types.HealthStatus
    Uptime() time.Duration
    Latency() time.Duration
    Detail() string
    Name() string
    String() string
    CheckHealth() (types.HealthCheckResult, error)
}

Monitor Creation (`new.go`)

// Create monitor for agent-proxied routes
func NewAgentProxiedMonitor(
    ctx context.Context,
    cfg types.HealthCheckConfig,
    url *url.URL,
) (HealthMonitor, error)

// Create monitor for Docker containers
func NewDockerHealthMonitor(
    ctx context.Context,
    cfg types.HealthCheckConfig,
    url *url.URL,
    containerID string,
) (HealthMonitor, error)

// Create monitor for HTTP routes
func NewHTTPMonitor(
    ctx context.Context,
    cfg types.HealthCheckConfig,
    url *url.URL,
) HealthMonitor

// Create monitor for H2C (HTTP/2 cleartext) routes
func NewH2CMonitor(
    ctx context.Context,
    cfg types.HealthCheckConfig,
    url *url.URL,
) HealthMonitor

// Create monitor for file server routes
func NewFileServerMonitor(
    cfg types.HealthCheckConfig,
    url *url.URL,
) HealthMonitor

// Create monitor for stream routes
func NewStreamMonitor(
    cfg types.HealthCheckConfig,
    url *url.URL,
) HealthMonitor

// Unified monitor factory (routes to appropriate type)
func NewMonitor(
    ctx context.Context,
    cfg types.HealthCheckConfig,
    url *url.URL,
) (HealthMonitor, error)

Architecture

Monitor Selection Flow

flowchart TD
    A[NewMonitor route] --> B{IsAgent route?}
    B -->|true| C[NewAgentProxiedMonitor]
    B -->|false| D{IsDocker route?}
    D -->|true| E[NewDockerHealthMonitor]
    D -->|false| F{Has h2c scheme?}
    F -->|true| G[NewH2CMonitor]
    F -->|false| H{Has http/https scheme?}
    H -->|true| I[NewHTTPMonitor]
    H -->|false| J{Is file:// scheme?}
    J -->|true| K[NewFileServerMonitor]
    J -->|false| L[NewStreamMonitor]
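The decision order in the flowchart can be sketched as a plain function. This is an illustration of the branching only, not the body of `NewMonitor`: `selectMonitor`, `monitorKind`, and the `isAgent`/`isDocker` flags are hypothetical stand-ins for whatever route inspection the real factory performs.

```go
package main

import (
	"fmt"
	"net/url"
)

// monitorKind names the constructor the flowchart would pick.
type monitorKind string

const (
	kindAgent      monitorKind = "NewAgentProxiedMonitor"
	kindDocker     monitorKind = "NewDockerHealthMonitor"
	kindH2C        monitorKind = "NewH2CMonitor"
	kindHTTP       monitorKind = "NewHTTPMonitor"
	kindFileServer monitorKind = "NewFileServerMonitor"
	kindStream     monitorKind = "NewStreamMonitor"
)

// selectMonitor follows the flowchart top to bottom: agent routes first,
// then Docker routes, then the URL scheme. Anything else is a stream.
func selectMonitor(isAgent, isDocker bool, scheme string) monitorKind {
	switch {
	case isAgent:
		return kindAgent
	case isDocker:
		return kindDocker
	}
	switch scheme {
	case "h2c":
		return kindH2C
	case "http", "https":
		return kindHTTP
	case "file":
		return kindFileServer
	default:
		return kindStream
	}
}

func main() {
	for _, raw := range []string{"h2c://app:8080", "https://app", "file:///srv/www", "tcp://db:5432"} {
		u, err := url.Parse(raw)
		if err != nil {
			panic(err)
		}
		fmt.Println(raw, "->", selectMonitor(false, false, u.Scheme))
	}
}
```

Because the agent and Docker checks come first, an agent-proxied route with an `http` URL still selects the agent monitor.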

Monitor State Machine

stateDiagram-v2
    [*] --> Starting: First check
    Starting --> Healthy: Check passes
    Starting --> Unhealthy: Check fails
    Healthy --> Unhealthy: 5 consecutive failures
    Healthy --> Error: Check error
    Error --> Healthy: Check passes
    Error --> Unhealthy: 5 consecutive failures
    Unhealthy --> Healthy: Check passes
    Unhealthy --> Error: Check error
    [*] --> Stopped: Task cancelled

Component Structure

classDiagram
    class monitor {
        -service string
        -config types.HealthCheckConfig
        -url synk.Value~*url.URL~
        -status synk.Value~HealthStatus~
        -lastResult synk.Value~HealthCheckResult~
        -checkHealth HealthCheckFunc
        -startTime time.Time
        -task *task.Task
        +Start(parent task.Parent)
        +CheckHealth() (HealthCheckResult, error)
        +Status() HealthStatus
        +Uptime() time.Duration
        +Latency() time.Duration
        +Detail() string
    }

    class HealthMonitor {
        <<interface>>
        +Start(parent task.Parent)
        +Task() *task.Task
        +Status() HealthStatus
    }

Configuration Surface

HealthCheckConfig

type HealthCheckConfig struct {
    Interval    time.Duration // Check interval (default: 30s)
    Timeout     time.Duration // Check timeout (default: 10s)
    Path        string        // Health check path
    Method      string        // HTTP method (GET/HEAD)
    Retries     int           // Consecutive failures before notification (-1 for immediate)
    BaseContext func() context.Context
}

Defaults

| Field | Default |
| --- | --- |
| Interval | 30s |
| Timeout | 10s |
| Method | GET |
| Path | "/" |
| Retries | 3 |

Applying Defaults

cfg.ApplyDefaults(state.Value().Defaults.HealthCheck)
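One plausible reading of applying defaults is "fill every unset field from the defaults". The sketch below assumes exactly that and uses a trimmed, hypothetical `healthCheckConfig` stand-in; the real `ApplyDefaults` method on `types.HealthCheckConfig` may behave differently. It also assumes a negative `Retries` is a deliberate setting and must not be overwritten.

```go
package main

import (
	"fmt"
	"time"
)

// healthCheckConfig is a trimmed stand-in for types.HealthCheckConfig.
type healthCheckConfig struct {
	Interval time.Duration
	Timeout  time.Duration
	Path     string
	Method   string
	Retries  int
}

// tableDefaults holds the values from the Defaults table.
var tableDefaults = healthCheckConfig{
	Interval: 30 * time.Second,
	Timeout:  10 * time.Second,
	Path:     "/",
	Method:   "GET",
	Retries:  3,
}

// applyDefaults copies a default into each field that is still unset.
// Assumption: only Retries == 0 counts as unset, so -1 (immediate) survives.
func (cfg *healthCheckConfig) applyDefaults(d healthCheckConfig) {
	if cfg.Interval <= 0 {
		cfg.Interval = d.Interval
	}
	if cfg.Timeout <= 0 {
		cfg.Timeout = d.Timeout
	}
	if cfg.Path == "" {
		cfg.Path = d.Path
	}
	if cfg.Method == "" {
		cfg.Method = d.Method
	}
	if cfg.Retries == 0 {
		cfg.Retries = d.Retries
	}
}

func main() {
	cfg := healthCheckConfig{Timeout: 2 * time.Second, Path: "/health", Retries: -1}
	cfg.applyDefaults(tableDefaults)
	fmt.Printf("%+v\n", cfg)
}
```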

Dependency and Integration Map

Internal Dependencies

internal/task/task.go - Lifetime management
internal/notif/ - Status change notifications
internal/health/check/ - Health check implementations
internal/types/ - Health status types
internal/config/types/ - Working state

External Dependencies

github.com/puzpuzpuz/xsync/v4 - Atomic values

Observability

Logs

| Level | When |
| --- | --- |
| `Info` | Service comes up |
| `Warn` | Service goes down |
| `Error` | Health check error |
| `Error` | Monitor stopped after 5 trials |

Notifications

Service up notification (with latency)
Service down notification (with last seen time)
Immediate notification when Retries < 0
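The notification rules above, together with the rule from the latest commit that a recovery notification is sent only after a down notification, can be sketched as a small gate. The `notifier` type is hypothetical, and treating the threshold as "at least `Retries` consecutive failures" is an assumption; the monitor's actual comparison may differ.

```go
package main

import "fmt"

// notifier decides when down and up notifications fire.
type notifier struct {
	retries      int  // threshold; a negative value means notify on the first failure
	failures     int  // consecutive failed checks so far
	downNotified bool // true once a down notification has been sent
}

// observe records one check result and returns the notification to send:
// "down", "up", or "" when nothing should be sent.
func (n *notifier) observe(healthy bool) string {
	if healthy {
		n.failures = 0
		if n.downNotified {
			// Recovery is only announced if the outage was announced.
			n.downNotified = false
			return "up"
		}
		return ""
	}
	n.failures++
	if !n.downNotified && (n.retries < 0 || n.failures >= n.retries) {
		n.downNotified = true
		return "down"
	}
	return ""
}

func main() {
	n := &notifier{retries: 3}
	// Two failures followed by a recovery stay silent; three failures do not.
	for _, healthy := range []bool{false, false, true, false, false, false, true} {
		fmt.Printf("healthy=%v notify=%q\n", healthy, n.observe(healthy))
	}
}
```

Tracking `downNotified` separately from the failure count is what keeps notification pairs complete: a service that recovers before the threshold never produces a stray "up".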

Metrics

Consecutive failure count
Last check latency
Monitor uptime

Failure Modes and Recovery

| Failure Mode | Impact | Recovery |
| --- | --- | --- |
| 5 consecutive check errors | Monitor enters Error state, task stops | Manual restart required |
| Health check function panic | Monitor crashes | Automatic cleanup |
| Context cancellation | Monitor stops gracefully | Stopped state |
| URL update to invalid | Check will fail | Manual URL fix |
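The first row of the table, stopping after 5 consecutive check errors, can be sketched as a counter that resets on every successful check. This is a simplified illustration: `runChecks` and `errCheckFailed` are hypothetical names, and a fixed number of rounds stands in for the interval ticker a real monitor would use.

```go
package main

import (
	"errors"
	"fmt"
)

// maxConsecutiveErrors is the limit quoted in the table above.
const maxConsecutiveErrors = 5

// errCheckFailed is a placeholder error for the example.
var errCheckFailed = errors.New("check failed")

// runChecks calls check up to rounds times and gives up early once
// maxConsecutiveErrors errors occur in a row. It returns how many checks
// ran and whether it stopped because of the error limit.
func runChecks(check func() error, rounds int) (ran int, stopped bool) {
	consecutive := 0
	for ran < rounds {
		ran++
		if err := check(); err != nil {
			consecutive++
			if consecutive >= maxConsecutiveErrors {
				return ran, true
			}
			continue
		}
		consecutive = 0 // any success resets the streak
	}
	return ran, false
}

func main() {
	alwaysFail := func() error { return errCheckFailed }
	ran, stopped := runChecks(alwaysFail, 10)
	fmt.Println(ran, stopped)
}
```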

Status Transitions

| From | To | Condition |
| --- | --- | --- |
| Starting | Healthy | Check passes |
| Starting | Unhealthy | Check fails |
| Healthy | Unhealthy | `Retries` consecutive failures |
| Healthy | Error | Check returns error |
| Unhealthy | Healthy | Check passes |
| Error | Healthy | Check passes |
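The table above can be transcribed into a pure transition function, which is convenient for table-driven tests. The sketch encodes only the rows listed in the table. The `status` and `outcome` types and the `next` function are hypothetical, and comparing with `failures >= retries` is an assumption about how the threshold is applied.

```go
package main

import "fmt"

type status string

const (
	starting  status = "starting"
	healthy   status = "healthy"
	unhealthy status = "unhealthy"
	errored   status = "error"
)

// outcome is the result of a single health check.
type outcome int

const (
	pass     outcome = iota // check succeeded
	fail                    // check ran and reported the service as down
	checkErr                // the check itself returned an error
)

// next applies one row of the transition table. failures is the number of
// consecutive failed checks, including the current one.
func next(cur status, o outcome, failures, retries int) status {
	switch {
	case o == pass && (cur == starting || cur == unhealthy || cur == errored):
		return healthy
	case o == fail && cur == starting:
		return unhealthy
	case o == fail && cur == healthy && failures >= retries:
		return unhealthy
	case o == checkErr && cur == healthy:
		return errored
	}
	return cur // no row matches: the status is unchanged
}

func main() {
	s := starting
	s = next(s, pass, 0, 3)
	fmt.Println(s)
	for failures := 1; failures <= 3; failures++ {
		s = next(s, fail, failures, 3)
		fmt.Println(failures, s)
	}
}
```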

Usage Examples

Creating an HTTP Monitor

cfg := types.HealthCheckConfig{
    Interval: 15 * time.Second,
    Timeout:  5 * time.Second,
    Path:     "/health",
    Retries:  3,
}
// Naming the variable "url" would shadow the net/url package.
target, err := url.Parse("http://localhost:8080")
if err != nil {
    return err
}

// Likewise, "mon" keeps the monitor package name usable afterwards.
mon := monitor.NewHTTPMonitor(context.Background(), cfg, target)
if err := mon.Start(parent); err != nil {
    return err
}

// Check status
fmt.Printf("Status: %s\n", mon.Status())
fmt.Printf("Latency: %v\n", mon.Latency())

Creating a Docker Monitor

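// cfg, target (*url.URL), containerID, and parent are assumed to be defined by the caller.
mon, err := monitor.NewDockerHealthMonitor(
    context.Background(),
    cfg,
    target,
    containerID,
)
if err != nil {
    return err
}
// Start returns an error as well, so check it.
if err := mon.Start(parent); err != nil {
    return err
}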

Unified Factory

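// ctx, cfg, target (*url.URL), and parent are assumed to be defined by the caller.
mon, err := monitor.NewMonitor(ctx, cfg, target)
if err != nil {
    return err
}
// Start returns an error as well, so check it.
if err := mon.Start(parent); err != nil {
    return err
}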

Testing Notes

monitor_test.go - Monitor lifecycle tests
Mock health check functions for deterministic testing
Status transition coverage tests
Notification trigger tests
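A mock check function for deterministic testing might replay a fixed script of results. The sketch below is an assumption about how such a mock could look, not the helper used in `monitor_test.go`: `checkResult` is a stand-in for `types.HealthCheckResult` (whose fields are not shown here), and `step`, `scriptedCheck`, and `errBoom` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
)

// checkResult is a stand-in for types.HealthCheckResult.
type checkResult struct {
	Healthy bool
	Detail  string
}

// step is one scripted answer from the mock.
type step struct {
	res checkResult
	err error
}

var (
	errBoom    = errors.New("boom")
	errNoSteps = errors.New("scriptedCheck: no steps given")
)

// scriptedCheck returns a function with the same shape as HealthCheckFunc
// that replays steps in order and keeps repeating the last one.
// It is not safe for concurrent use.
func scriptedCheck(steps ...step) func(*url.URL) (checkResult, error) {
	i := 0
	return func(*url.URL) (checkResult, error) {
		if len(steps) == 0 {
			return checkResult{}, errNoSteps
		}
		s := steps[i]
		if i < len(steps)-1 {
			i++
		}
		return s.res, s.err
	}
}

func main() {
	check := scriptedCheck(
		step{res: checkResult{Healthy: true}},
		step{res: checkResult{Detail: "connection refused"}},
		step{err: errBoom},
	)
	for n := 0; n < 4; n++ {
		res, err := check(nil)
		fmt.Printf("%+v err=%v\n", res, err)
	}
}
```

Because the script is fixed, a test can assert the exact status after each tick without real network access or timing dependencies.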
