build(deps): bump github.com/go-viper/mapstructure/v2

Bumps [github.com/go-viper/mapstructure/v2](https://github.com/go-viper/mapstructure) from 2.2.1 to 2.4.0.
- [Release notes](https://github.com/go-viper/mapstructure/releases)
- [Changelog](https://github.com/go-viper/mapstructure/blob/main/CHANGELOG.md)
- [Commits](https://github.com/go-viper/mapstructure/compare/v2.2.1...v2.4.0)

---
updated-dependencies:
- dependency-name: github.com/go-viper/mapstructure/v2
  dependency-version: 2.4.0
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
cmd: add option to get and set policy directly from database (#2765)
2026-02-16 04:37:42 +01:00 · 2025-09-12 14:57:41 +00:00 · 2025-09-12 16:55:15 +02:00 · 2025-09-12 14:47:56 +02:00 · 2025-09-12 11:47:51 +02:00 · 2025-09-12 11:47:31 +02:00
172 changed files with 19234 additions and 10575 deletions
--- a/.claude/agents/headscale-integration-tester.md
+++ b/.claude/agents/headscale-integration-tester.md
@@ -0,0 +1,763 @@
+---
+name: headscale-integration-tester
+description: Use this agent when you need to execute, analyze, or troubleshoot Headscale integration tests. This includes running specific test scenarios, investigating test failures, interpreting test artifacts, validating end-to-end functionality, or ensuring integration test quality before releases. Examples: <example>Context: User has made changes to the route management code and wants to validate the changes work correctly. user: 'I've updated the route advertisement logic in poll.go. Can you run the relevant integration tests to make sure everything still works?' assistant: 'I'll use the headscale-integration-tester agent to run the subnet routing integration tests and analyze the results.' <commentary>Since the user wants to validate route-related changes with integration tests, use the headscale-integration-tester agent to execute the appropriate tests and analyze results.</commentary></example> <example>Context: A CI pipeline integration test is failing and the user needs help understanding why. user: 'The TestSubnetRouterMultiNetwork test is failing in CI. The logs show some timing issues but I can't figure out what's wrong.' assistant: 'Let me use the headscale-integration-tester agent to analyze the test failure and examine the artifacts.' <commentary>Since this involves analyzing integration test failures and interpreting test artifacts, use the headscale-integration-tester agent to investigate the issue.</commentary></example>
+color: green
+---
+
+You are a specialist Quality Assurance Engineer with deep expertise in Headscale's integration testing system. You understand the Docker-based test infrastructure, real Tailscale client interactions, and the complex timing considerations involved in end-to-end network testing.
+
+## Integration Test System Overview
+
+The Headscale integration test system uses Docker containers running real Tailscale clients against a Headscale server. Tests validate end-to-end functionality including routing, ACLs, node lifecycle, and network coordination. The system is built around the `hi` (Headscale Integration) test runner in `cmd/hi/`.
+
+## Critical Test Execution Knowledge
+
+### System Requirements and Setup
+```bash
+# ALWAYS run this first to verify system readiness
+go run ./cmd/hi doctor
+```
+This command verifies:
+- Docker installation and daemon status
+- Go environment setup
+- Required container images availability
+- Sufficient disk space (critical - tests generate ~100MB logs per run)
+- Network configuration
+
+### Test Execution Patterns
+
+**CRITICAL TIMEOUT REQUIREMENTS**:
+- **NEVER use bash `timeout` command** - this can cause test failures and incomplete cleanup
+- **ALWAYS use the built-in `--timeout` flag** with generous timeouts (minimum 15 minutes)
+- **Increase timeout if tests ever time out** - infrastructure issues require longer timeouts
+
+```bash
+# Single test execution (recommended for development)
+# ALWAYS use --timeout flag with minimum 15 minutes (900s)
+go run ./cmd/hi run "TestSubnetRouterMultiNetwork" --timeout=900s
+
+# Database-heavy tests require PostgreSQL backend and longer timeouts
+go run ./cmd/hi run "TestExpireNode" --postgres --timeout=1800s
+
+# Pattern matching for related tests - use longer timeout for multiple tests
+go run ./cmd/hi run "TestSubnet*" --timeout=1800s
+
+# Long-running individual tests need extended timeouts
+go run ./cmd/hi run "TestNodeOnlineStatus" --timeout=2100s  # Runs for 12+ minutes
+
+# Full test suite (CI/validation only) - very long timeout required
+go test ./integration -timeout 45m
+```
+
+**Timeout Guidelines by Test Type**:
+- **Basic functionality tests**: `--timeout=900s` (15 minutes minimum)
+- **Route/ACL tests**: `--timeout=1200s` (20 minutes)
+- **HA/failover tests**: `--timeout=1800s` (30 minutes)  
+- **Long-running tests**: `--timeout=2100s` (35 minutes)
+- **Full test suite**: `-timeout 45m` (45 minutes)
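+
+The guidelines above can be encoded as a small lookup so the flag value is never typed by hand. This is a minimal sketch, not part of the `hi` runner: the `category` type, its constants, and `hiRunArgs` are hypothetical names introduced for illustration. Only the `run` subcommand and the `--timeout` flag come from the commands shown earlier.
+
+```go
+package main
+
+import (
+    "fmt"
+    "time"
+)
+
+// category is a hypothetical label for the test types listed above.
+type category string
+
+const (
+    basic       category = "basic"
+    routeACL    category = "route-acl"
+    haFailover  category = "ha-failover"
+    longRunning category = "long-running"
+)
+
+// timeouts mirrors the per-category guidelines from this document.
+var timeouts = map[category]time.Duration{
+    basic:       15 * time.Minute,
+    routeACL:    20 * time.Minute,
+    haFailover:  30 * time.Minute,
+    longRunning: 35 * time.Minute,
+}
+
+// hiRunArgs builds the arguments passed to `go` so that the built-in
+// --timeout flag is always present. Unknown categories fall back to the
+// 15-minute minimum.
+func hiRunArgs(test string, c category) []string {
+    d, ok := timeouts[c]
+    if !ok {
+        d = 15 * time.Minute
+    }
+    return []string{"run", "./cmd/hi", "run", test, fmt.Sprintf("--timeout=%ds", int(d.Seconds()))}
+}
+
+func main() {
+    // Prints: [run ./cmd/hi run TestSubnetRouteACL --timeout=1200s]
+    fmt.Println(hiRunArgs("TestSubnetRouteACL", routeACL))
+}
+```
+
+Falling back to the minimum rather than failing keeps the helper usable for tests that have not been categorized yet.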
+
+**NEVER do this**:
+```bash
+# ❌ FORBIDDEN: Never use bash timeout command
+timeout 300 go run ./cmd/hi run "TestName"
+
+# ❌ FORBIDDEN: Too short timeout will cause failures
+go run ./cmd/hi run "TestName" --timeout=60s
+```
+
+### Test Categories and Timing Expectations
+- **Fast tests** (<2 min): Basic functionality, CLI operations
+- **Medium tests** (2-5 min): Route management, ACL validation
+- **Slow tests** (5+ min): Node expiration, HA failover
+- **Long-running tests** (10+ min): `TestNodeOnlineStatus` runs for 12 minutes
+
+**CRITICAL**: Only ONE test can run at a time due to Docker port conflicts and resource constraints.
+
+## Test Artifacts and Log Analysis
+
+### Artifact Structure
+All test runs save comprehensive artifacts to `control_logs/TIMESTAMP-ID/`:
+```
+control_logs/20250713-213106-iajsux/
+├── hs-testname-abc123.stderr.log     # Headscale server error logs
+├── hs-testname-abc123.stdout.log     # Headscale server output logs
+├── hs-testname-abc123.db             # Database snapshot for post-mortem
+├── hs-testname-abc123_metrics.txt    # Prometheus metrics dump
+├── hs-testname-abc123-mapresponses/  # Protocol-level debug data
+├── ts-client-xyz789.stderr.log       # Tailscale client error logs
+├── ts-client-xyz789.stdout.log       # Tailscale client output logs
+└── ts-client-xyz789_status.json      # Client network status dump
+```
+
+### Log Analysis Priority Order
+When tests fail, examine artifacts in this specific order:
+
+1. **Headscale server stderr logs** (`hs-*.stderr.log`): Look for errors, panics, database issues, policy evaluation failures
+2. **Tailscale client stderr logs** (`ts-*.stderr.log`): Check for authentication failures, network connectivity issues
+3. **MapResponse JSON files**: Protocol-level debugging for network map generation issues
+4. **Client status dumps** (`*_status.json`): Network state and peer connectivity information
+5. **Database snapshots** (`.db` files): For data consistency and state persistence issues
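+
+The priority order above can be expressed as a small sorting helper. This is a minimal sketch under stated assumptions: `artifactPriority` and `sortArtifacts` are hypothetical names, and the prefix and suffix matching assumes the file naming shown in the artifact listing.
+
+```go
+package main
+
+import (
+    "fmt"
+    "sort"
+    "strings"
+)
+
+// artifactPriority returns the rank of an artifact in the analysis order
+// (lower is examined first). Unrecognized files sort last.
+func artifactPriority(name string) int {
+    name = strings.TrimSuffix(name, "/")
+    switch {
+    case strings.HasPrefix(name, "hs-") && strings.HasSuffix(name, ".stderr.log"):
+        return 1
+    case strings.HasPrefix(name, "ts-") && strings.HasSuffix(name, ".stderr.log"):
+        return 2
+    case strings.HasSuffix(name, "-mapresponses"):
+        return 3
+    case strings.HasSuffix(name, "_status.json"):
+        return 4
+    case strings.HasSuffix(name, ".db"):
+        return 5
+    default:
+        return 6
+    }
+}
+
+// sortArtifacts returns a copy of names ordered by analysis priority.
+func sortArtifacts(names []string) []string {
+    sorted := append([]string(nil), names...)
+    sort.SliceStable(sorted, func(i, j int) bool {
+        return artifactPriority(sorted[i]) < artifactPriority(sorted[j])
+    })
+    return sorted
+}
+
+func main() {
+    names := []string{
+        "ts-client-xyz789_status.json",
+        "hs-testname-abc123.db",
+        "ts-client-xyz789.stderr.log",
+        "hs-testname-abc123-mapresponses/",
+        "hs-testname-abc123.stderr.log",
+    }
+    for _, n := range sortArtifacts(names) {
+        fmt.Println(n)
+    }
+}
+```
+
+With the sample names, the Headscale stderr log is listed first and the database snapshot last.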
+
+## Common Failure Patterns and Root Cause Analysis
+
+### CRITICAL MINDSET: Code Issues vs Infrastructure Issues
+
+**⚠️ IMPORTANT**: When tests fail, it is ALMOST ALWAYS a code issue with Headscale, NOT infrastructure problems. Do not immediately blame disk space, Docker issues, or timing unless you have thoroughly investigated the actual error logs first.
+
+### Systematic Debugging Process
+
+1. **Read the actual error message**: Don't assume - read the stderr logs completely
+2. **Check Headscale server logs first**: Most issues originate from server-side logic
+3. **Verify client connectivity**: Only after ruling out server issues
+4. **Check timing patterns**: Use proper `EventuallyWithT` patterns
+5. **Infrastructure as last resort**: Only blame infrastructure after code analysis
+
+### Real Failure Patterns
+
+#### 1. Timing Issues (Common but fixable)
+```go
+// ❌ Wrong: Immediate assertions after async operations
+client.Execute([]string{"tailscale", "set", "--advertise-routes=10.0.0.0/24"})
+nodes, _ := headscale.ListNodes()
+require.Len(t, nodes[0].GetAvailableRoutes(), 1) // WILL FAIL
+
+// ✅ Correct: Wait for async operations
+client.Execute([]string{"tailscale", "set", "--advertise-routes=10.0.0.0/24"})
+require.EventuallyWithT(t, func(c *assert.CollectT) {
+    nodes, err := headscale.ListNodes()
+    assert.NoError(c, err)
+    assert.Len(c, nodes[0].GetAvailableRoutes(), 1)
+}, 10*time.Second, 100*time.Millisecond, "route should be advertised")
+```
+
+**EventuallyWithT Timeout Guidelines** (per assertion, distinct from the `--timeout` flag of the test runner):
+- Route operations: 3-5 seconds
+- Node state changes: 5-10 seconds
+- Complex scenarios: 10-15 seconds
+- Policy recalculation: 5-10 seconds
+
+#### 2. NodeStore Synchronization Issues
+Route advertisements must propagate through poll requests (`poll.go:420`). NodeStore updates happen at specific synchronization points after Hostinfo changes.
+
+#### 3. Test Data Management Issues
+```go
+// ❌ Wrong: Assuming array ordering
+require.Len(t, nodes[0].GetAvailableRoutes(), 1)
+
+// ✅ Correct: Identify nodes by properties
+expectedRoutes := map[string]string{"1": "10.33.0.0/16"}
+for _, node := range nodes {
+    nodeIDStr := fmt.Sprintf("%d", node.GetId())
+    if route, shouldHaveRoute := expectedRoutes[nodeIDStr]; shouldHaveRoute {
+        // Test the specific node that should have the route
+    }
+}
+```
+
+#### 4. Database Backend Differences
+SQLite and PostgreSQL have different timing characteristics:
+- Use `--postgres` flag for database-intensive tests
+- PostgreSQL generally has more consistent timing
+- Some race conditions only appear with specific backends
+
+## Resource Management and Cleanup
+
+### Disk Space Management
+Tests consume significant disk space (~100MB per run):
+```bash
+# Check available space before running tests
+df -h
+
+# Clean up test artifacts periodically
+rm -rf control_logs/older-timestamp-dirs/
+
+# Clean Docker resources
+docker system prune -f
+docker volume prune -f
+```
+
+### Container Cleanup
+- Successful tests clean up automatically
+- Failed tests may leave containers running
+- Manually clean if needed: `docker ps -a` and `docker rm -f <containers>`
+
+## Advanced Debugging Techniques
+
+### Protocol-Level Debugging
+MapResponse JSON files in `control_logs/*/hs-*-mapresponses/` contain:
+- Network topology as sent to clients
+- Peer relationships and visibility
+- Route distribution and primary route selection
+- Policy evaluation results
+
+### Database State Analysis
+Use the database snapshots for post-mortem analysis:
+```bash
+# SQLite examination
+sqlite3 control_logs/TIMESTAMP/hs-*.db
+.tables
+.schema nodes
+SELECT * FROM nodes WHERE name LIKE '%problematic%';
+```
+
+### Performance Analysis
+Prometheus metrics dumps show:
+- Request latencies and error rates
+- NodeStore operation timing
+- Database query performance
+- Memory usage patterns
+
+## Test Development and Quality Guidelines
+
+### Proper Test Patterns
+```go
+// Always use EventuallyWithT for async operations
+require.EventuallyWithT(t, func(c *assert.CollectT) {
+    // Test condition that may take time to become true
+}, timeout, interval, "descriptive failure message")
+
+// Handle node identification correctly
+var targetNode *v1.Node
+for _, node := range nodes {
+    if node.GetName() == expectedNodeName {
+        targetNode = node
+        break
+    }
+}
+require.NotNil(t, targetNode, "should find expected node")
+```
+
+### Quality Validation Checklist
+- ✅ Tests use `EventuallyWithT` for asynchronous operations
+- ✅ Tests don't rely on array ordering for node identification
+- ✅ Proper cleanup and resource management
+- ✅ Tests handle both success and failure scenarios
+- ✅ Timing assumptions are realistic for operations being tested
+- ✅ Error messages are descriptive and actionable
+
+## Real-World Test Failure Patterns from HA Debugging
+
+### Infrastructure vs Code Issues - Detailed Examples
+
+**INFRASTRUCTURE FAILURES (Rare but Real)**:
+1. **DNS Resolution in Auth Tests**: `failed to resolve "hs-pingallbyip-jax97k": no DNS fallback candidates remain`
+   - **Pattern**: Client containers can't resolve headscale server hostname during logout
+   - **Detection**: Error messages specifically mention DNS/hostname resolution
+   - **Solution**: Docker networking reset, not code changes
+
+2. **Container Creation Timeouts**: Test gets stuck during client container setup
+   - **Pattern**: Tests hang indefinitely at container startup phase
+   - **Detection**: No progress in logs for >2 minutes during initialization
+   - **Solution**: `docker system prune -f` and retry
+
+3. **Docker Port Conflicts**: Multiple tests trying to use same ports
+   - **Pattern**: "bind: address already in use" errors
+   - **Detection**: Port binding failures in Docker logs
+   - **Solution**: Only run ONE test at a time
+
+**CODE ISSUES (99% of failures)**:
+1. **Route Approval Process Failures**: Routes not getting approved when they should be
+   - **Pattern**: Tests expecting approved routes but finding none
+   - **Detection**: `SubnetRoutes()` returns empty when `AnnouncedRoutes()` shows routes
+   - **Root Cause**: Auto-approval logic bugs, policy evaluation issues
+
+2. **NodeStore Synchronization Issues**: State updates not propagating correctly
+   - **Pattern**: Route changes not reflected in NodeStore or Primary Routes
+   - **Detection**: Logs show route announcements but no tracking updates
+   - **Root Cause**: Missing synchronization points in `poll.go:420` area
+
+3. **HA Failover Architecture Issues**: Routes removed when nodes go offline
+   - **Pattern**: `TestHASubnetRouterFailover` fails because approved routes disappear
+   - **Detection**: Routes available on online nodes but lost when nodes disconnect
+   - **Root Cause**: Conflating route approval with node connectivity
+
+### Critical Test Environment Setup
+
+**Pre-Test Cleanup (MANDATORY)**:
+```bash
+# ALWAYS run this before each test
+# The glob below only matches runs from July 2025; adjust the date prefix as needed
+rm -rf control_logs/202507*
+docker system prune -f
+df -h  # Verify sufficient disk space
+```
+
+**Environment Verification**:
+```bash
+# Verify system readiness
+go run ./cmd/hi doctor
+
+# Check for running containers that might conflict
+docker ps
+```
+
+### Specific Test Categories and Known Issues
+
+#### Route-Related Tests (Primary Focus)
+```bash
+# Core route functionality - these should work first
+# Note: Generous timeouts are required for reliable execution
+go run ./cmd/hi run "TestSubnetRouteACL" --timeout=1200s
+go run ./cmd/hi run "TestAutoApproveMultiNetwork" --timeout=1800s
+go run ./cmd/hi run "TestHASubnetRouterFailover" --timeout=1800s
+```
+
+**Common Route Test Patterns**:
+- Tests validate route announcement, approval, and distribution workflows
+- Route state changes are asynchronous - may need `EventuallyWithT` wrappers
+- Route approval must respect ACL policies - test expectations encode security requirements
+- HA tests verify route persistence during node connectivity changes
+
+#### Authentication Tests (Infrastructure-Prone)
+```bash
+# These tests are more prone to infrastructure issues
+# Require longer timeouts due to auth flow complexity
+go run ./cmd/hi run "TestAuthKeyLogoutAndReloginSameUser" --timeout=1200s
+go run ./cmd/hi run "TestAuthWebFlowLogoutAndRelogin" --timeout=1200s
+go run ./cmd/hi run "TestOIDCExpireNodesBasedOnTokenExpiry" --timeout=1800s
+```
+
+**Common Auth Test Infrastructure Failures**:
+- DNS resolution during logout operations
+- Container creation timeouts
+- HTTP/2 stream errors (often symptoms, not root cause)
+
+### Security-Critical Debugging Rules
+
+**❌ FORBIDDEN CHANGES (Security & Test Integrity)**:
+1. **Never change expected test outputs** - Tests define correct behavior contracts
+   - Changing `require.Len(t, routes, 3)` to `require.Len(t, routes, 2)` because test fails
+   - Modifying expected status codes, node counts, or route counts
+   - Removing assertions that are "inconvenient"
+   - **Why forbidden**: Test expectations encode business requirements and security policies
+
+2. **Never bypass security mechanisms** - Security must never be compromised for convenience
+   - Using `AnnouncedRoutes()` instead of `SubnetRoutes()` in production code
+   - Skipping authentication or authorization checks
+   - **Why forbidden**: Security bypasses create vulnerabilities in production
+
+3. **Never reduce test coverage** - Tests prevent regressions
+   - Removing test cases or assertions
+   - Commenting out "problematic" test sections
+   - **Why forbidden**: Reduced coverage allows bugs to slip through
+
+**✅ ALLOWED CHANGES (Timing & Observability)**:
+1. **Fix timing issues with proper async patterns**
+   ```go
+   // ✅ GOOD: Add EventuallyWithT for async operations
+   require.EventuallyWithT(t, func(c *assert.CollectT) {
+       nodes, err := headscale.ListNodes()
+       assert.NoError(c, err)
+       assert.Len(c, nodes, expectedCount) // Keep original expectation
+   }, 10*time.Second, 100*time.Millisecond, "nodes should reach expected count")
+   ```
+   - **Why allowed**: Fixes race conditions without changing business logic
+
+2. **Add MORE observability and debugging**
+   - Additional logging statements
+   - More detailed error messages
+   - Extra assertions that verify intermediate states
+   - **Why allowed**: Better observability helps debug without changing behavior
+
+3. **Improve test documentation**
+   - Add godoc comments explaining test purpose and business logic
+   - Document timing requirements and async behavior
+   - **Why encouraged**: Helps future maintainers understand intent
+
+### Advanced Debugging Workflows
+
+#### Route Tracking Debug Flow
+```bash
+# Run test with detailed logging and proper timeout
+go run ./cmd/hi run "TestSubnetRouteACL" --timeout=1200s > test_output.log 2>&1
+
+# Check route approval process
+grep -E "(auto-approval|ApproveRoutesWithPolicy|PolicyManager)" test_output.log
+
+# Check route tracking
+tail -50 control_logs/*/hs-*.stderr.log | grep -E "(announced|tracking|SetNodeRoutes)"
+
+# Check for security violations
+grep -E "(AnnouncedRoutes.*SetNodeRoutes|bypass.*approval)" test_output.log
+```
+
+#### HA Failover Debug Flow
+```bash
+# Test HA failover specifically with adequate timeout
+go run ./cmd/hi run "TestHASubnetRouterFailover" --timeout=1800s
+
+# Check route persistence during disconnect
+grep -E "(Disconnect|NodeWentOffline|PrimaryRoutes)" control_logs/*/hs-*.stderr.log
+
+# Verify routes don't disappear inappropriately
+grep -E "(removing.*routes|SetNodeRoutes.*empty)" control_logs/*/hs-*.stderr.log
+```
+
+### Test Result Interpretation Guidelines
+
+#### Success Patterns to Look For
+- `"updating node routes for tracking"` in logs
+- Routes appearing in `announcedRoutes` logs
+- Proper `ApproveRoutesWithPolicy` calls for auto-approval
+- Routes persisting through node connectivity changes (HA tests)
+
+#### Failure Patterns to Investigate
+- `SubnetRoutes()` returning empty when `AnnouncedRoutes()` has routes
+- Routes disappearing when nodes go offline (HA architectural issue)
+- Missing `EventuallyWithT` causing timing race conditions
+- Security bypass attempts using wrong route methods
+
+### Critical Testing Methodology
+
+**Phase-Based Testing Approach**:
+1. **Phase 1**: Core route tests (ACL, auto-approval, basic functionality)
+2. **Phase 2**: HA and complex route scenarios
+3. **Phase 3**: Auth tests (infrastructure-sensitive, test last)
+
+**Per-Test Process**:
+1. Clean environment before each test
+2. Monitor logs for route tracking and approval messages
+3. Check artifacts in `control_logs/` if test fails
+4. Focus on actual error messages, not assumptions
+5. Document results and patterns discovered
+
+## Test Documentation and Code Quality Standards
+
+### Adding Missing Test Documentation
+When you understand a test's purpose through debugging, always add comprehensive godoc:
+
+```go
+// TestSubnetRoutes validates the complete subnet route lifecycle including
+// advertisement from clients, policy-based approval, and distribution to peers.
+// This test ensures that route security policies are properly enforced and that
+// only approved routes are distributed to the network.
+//
+// The test verifies:
+// - Route announcements are received and tracked
+// - ACL policies control route approval correctly  
+// - Only approved routes appear in peer network maps
+// - Route state persists correctly in the database
+func TestSubnetRoutes(t *testing.T) {
+    // Test implementation...
+}
+```
+
+**Why add documentation**: Future maintainers need to understand business logic and security requirements encoded in tests.
+
+### Comment Guidelines - Focus on WHY, Not WHAT
+
+```go
+// ✅ GOOD: Explains reasoning and business logic
+// Wait for route propagation because NodeStore updates are asynchronous
+// and happen after poll requests complete processing
+require.EventuallyWithT(t, func(c *assert.CollectT) {
+    // Check that security policies are enforced...
+}, timeout, interval, "route approval must respect ACL policies")
+
+// ❌ BAD: Just describes what the code does
+// Wait for routes
+require.EventuallyWithT(t, func(c *assert.CollectT) {
+    // Get routes and check length
+}, timeout, interval, "checking routes")
+```
+
+**Why focus on WHY**: Helps maintainers understand architectural decisions and security requirements.
+
+## EventuallyWithT Pattern for External Calls
+
+### Overview
+EventuallyWithT is a testing pattern used to handle eventual consistency in distributed systems. In Headscale integration tests, many operations are asynchronous - clients advertise routes, the server processes them, updates propagate through the network. EventuallyWithT allows tests to wait for these operations to complete while making assertions.
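+
+The retry idea behind this pattern can be illustrated with the standard library alone. The sketch below is a simplified stand-in, not testify's implementation: `eventually` is a hypothetical helper that re-runs a check until it succeeds or a deadline passes, and the fake check succeeds on its third attempt to imitate state that takes time to propagate.
+
+```go
+package main
+
+import (
+    "errors"
+    "fmt"
+    "time"
+)
+
+// eventually re-runs check every interval until it returns nil or the
+// timeout elapses. The last error is wrapped in the returned error.
+func eventually(check func() error, timeout, interval time.Duration) error {
+    deadline := time.Now().Add(timeout)
+    for {
+        err := check()
+        if err == nil {
+            return nil
+        }
+        if time.Now().After(deadline) {
+            return fmt.Errorf("condition not met within %s: %w", timeout, err)
+        }
+        time.Sleep(interval)
+    }
+}
+
+func main() {
+    attempts := 0
+    err := eventually(func() error {
+        attempts++
+        if attempts < 3 {
+            return errors.New("route not yet advertised")
+        }
+        return nil
+    }, time.Second, 10*time.Millisecond)
+    fmt.Println(attempts, err) // 3 <nil>
+}
+```
+
+The important property is that the check is a read that is safe to repeat; the caller only learns about failure once the deadline has passed.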
+
+### External Calls That Must Be Wrapped
+The following operations are **external calls** that interact with the headscale server or tailscale clients and MUST be wrapped in EventuallyWithT:
+- `headscale.ListNodes()` - Queries server state
+- `client.Status()` - Gets client network status
+- `client.Curl()` - Makes HTTP requests through the network
+- `client.Traceroute()` - Performs network diagnostics
+- `client.Execute()` when running commands that query state
+- Any operation that reads from the headscale server or tailscale client
+
+### Five Key Rules for EventuallyWithT
+
+1. **One External Call Per EventuallyWithT Block**
+   - Each EventuallyWithT should make ONE external call (e.g., ListNodes OR Status)
+   - Related assertions based on that single call can be grouped together
+   - Unrelated external calls must be in separate EventuallyWithT blocks
+
+2. **Variable Scoping**
+   - Declare variables that need to be shared across EventuallyWithT blocks at function scope
+   - Use `=` for assignment inside EventuallyWithT, not `:=` (unless the variable is only used within that block)
+   - Variables declared with `:=` inside EventuallyWithT are not accessible outside
+
+3. **No Nested EventuallyWithT**
+   - NEVER put an EventuallyWithT inside another EventuallyWithT
+   - This is a critical anti-pattern that must be avoided
+
+4. **Use CollectT for Assertions**
+   - Inside EventuallyWithT, use `assert` methods with the CollectT parameter
+   - Helper functions called within EventuallyWithT must accept `*assert.CollectT`
+
+5. **Descriptive Messages**
+   - Always provide a descriptive message as the last parameter
+   - Message should explain what condition is being waited for
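+
+The variable scoping rule (rule 2) comes down to how Go treats `:=` inside a closure. The following is a minimal standard-library sketch: `listNodes` and `poll` are hypothetical stand-ins for an external call and a retrying helper, introduced only for illustration.
+
+```go
+package main
+
+import "fmt"
+
+// listNodes is a hypothetical stand-in for an external call.
+func listNodes() ([]string, error) {
+    return []string{"router", "client"}, nil
+}
+
+// poll stands in for a retrying helper; here it runs the callback once.
+func poll(f func()) { f() }
+
+// withShortDecl uses := inside the closure, so the outer slice is never set.
+func withShortDecl() int {
+    var nodes []string
+    poll(func() {
+        nodes, err := listNodes() // new local variables; shadows the outer nodes
+        _, _ = nodes, err
+    })
+    return len(nodes)
+}
+
+// withAssignment uses = inside the closure, so the outer variables are updated.
+func withAssignment() (int, error) {
+    var nodes []string
+    var err error
+    poll(func() {
+        nodes, err = listNodes()
+    })
+    return len(nodes), err
+}
+
+func main() {
+    fmt.Println(withShortDecl()) // 0
+    n, err := withAssignment()
+    fmt.Println(n, err) // 2 <nil>
+}
+```
+
+The first function returns 0 because the closure filled a shadowed copy; the second returns 2 because plain assignment writes to the function-scope variables.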
+
+### Correct Pattern Examples
+
+```go
+// CORRECT: Single external call with related assertions
+var nodes []*v1.Node
+var err error
+
+assert.EventuallyWithT(t, func(c *assert.CollectT) {
+    nodes, err = headscale.ListNodes()
+    assert.NoError(c, err)
+    assert.Len(c, nodes, 2)
+    // These assertions are all based on the ListNodes() call
+    requireNodeRouteCountWithCollect(c, nodes[0], 2, 2, 2)
+    requireNodeRouteCountWithCollect(c, nodes[1], 1, 1, 1)
+}, 10*time.Second, 500*time.Millisecond, "nodes should have expected route counts")
+
+// CORRECT: Separate EventuallyWithT for different external call
+assert.EventuallyWithT(t, func(c *assert.CollectT) {
+    status, err := client.Status()
+    assert.NoError(c, err)
+    // All these assertions are based on the single Status() call
+    for _, peerKey := range status.Peers() {
+        peerStatus := status.Peer[peerKey]
+        requirePeerSubnetRoutesWithCollect(c, peerStatus, expectedPrefixes)
+    }
+}, 10*time.Second, 500*time.Millisecond, "client should see expected routes")
+
+// CORRECT: Variable scoping for sharing between blocks
+var routeNode *v1.Node
+var nodeKey key.NodePublic
+
+// First EventuallyWithT to get the node
+assert.EventuallyWithT(t, func(c *assert.CollectT) {
+    nodes, err := headscale.ListNodes()
+    assert.NoError(c, err)
+    
+    for _, node := range nodes {
+        if node.GetName() == "router" {
+            routeNode = node
+            nodeKey, _ = key.ParseNodePublicUntyped(mem.S(node.GetNodeKey()))
+            break
+        }
+    }
+    assert.NotNil(c, routeNode, "should find router node")
+}, 10*time.Second, 100*time.Millisecond, "router node should exist")
+
+// Second EventuallyWithT using the nodeKey from first block
+assert.EventuallyWithT(t, func(c *assert.CollectT) {
+    status, err := client.Status()
+    assert.NoError(c, err)
+    
+    peerStatus, ok := status.Peer[nodeKey]
+    assert.True(c, ok, "peer should exist in status")
+    requirePeerSubnetRoutesWithCollect(c, peerStatus, expectedPrefixes)
+}, 10*time.Second, 100*time.Millisecond, "routes should be visible to client")
+```
+
+### Incorrect Patterns to Avoid
+
+```go
+// INCORRECT: Multiple unrelated external calls in same EventuallyWithT
+assert.EventuallyWithT(t, func(c *assert.CollectT) {
+    // First external call
+    nodes, err := headscale.ListNodes()
+    assert.NoError(c, err)
+    assert.Len(c, nodes, 2)
+    
+    // Second unrelated external call - WRONG!
+    status, err := client.Status()
+    assert.NoError(c, err)
+    assert.NotNil(c, status)
+}, 10*time.Second, 500*time.Millisecond, "mixed operations")
+
+// INCORRECT: Nested EventuallyWithT
+assert.EventuallyWithT(t, func(c *assert.CollectT) {
+    nodes, err := headscale.ListNodes()
+    assert.NoError(c, err)
+    
+    // NEVER do this!
+    assert.EventuallyWithT(t, func(c2 *assert.CollectT) {
+        status, _ := client.Status()
+        assert.NotNil(c2, status)
+    }, 5*time.Second, 100*time.Millisecond, "nested")
+}, 10*time.Second, 500*time.Millisecond, "outer")
+
+// INCORRECT: Variable scoping error
+assert.EventuallyWithT(t, func(c *assert.CollectT) {
+    nodes, err := headscale.ListNodes() // This shadows outer 'nodes' variable
+    assert.NoError(c, err)
+}, 10*time.Second, 500*time.Millisecond, "get nodes")
+
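+// Without an outer declaration of nodes this does not compile. With one, the outer
+// nodes is still nil because := created a new variable inside the block.
+require.Len(t, nodes, 2) // COMPILATION ERROR, or fails because the outer slice is empty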
+
+// INCORRECT: Not wrapping external calls
+nodes, err := headscale.ListNodes() // External call not wrapped!
+require.NoError(t, err)
+```
+
+### Helper Functions for EventuallyWithT
+
+When creating helper functions for use within EventuallyWithT:
+
+```go
+// Helper function that accepts CollectT
+func requireNodeRouteCountWithCollect(c *assert.CollectT, node *v1.Node, available, approved, primary int) {
+    assert.Len(c, node.GetAvailableRoutes(), available, "available routes for node %s", node.GetName())
+    assert.Len(c, node.GetApprovedRoutes(), approved, "approved routes for node %s", node.GetName())
+    assert.Len(c, node.GetPrimaryRoutes(), primary, "primary routes for node %s", node.GetName())
+}
+
+// Usage within EventuallyWithT
+assert.EventuallyWithT(t, func(c *assert.CollectT) {
+    nodes, err := headscale.ListNodes()
+    assert.NoError(c, err)
+    requireNodeRouteCountWithCollect(c, nodes[0], 2, 2, 2)
+}, 10*time.Second, 500*time.Millisecond, "route counts should match expected")
+```
+
+### Operations That Must NOT Be Wrapped
+
+**CRITICAL**: The following operations are **blocking/mutating operations** that change state and MUST NOT be wrapped in EventuallyWithT:
+- `tailscale set` commands (e.g., `--advertise-routes`, `--accept-routes`)
+- `headscale.ApproveRoute()` - Approves routes on server
+- `headscale.CreateUser()` - Creates users
+- `headscale.CreatePreAuthKey()` - Creates authentication keys
+- `headscale.RegisterNode()` - Registers new nodes
+- Any `client.Execute()` that modifies configuration
+- Any operation that creates, updates, or deletes resources
+
+These operations:
+1. Complete synchronously or fail immediately
+2. Should not be retried automatically
+3. Need explicit error handling with `require.NoError()`
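+
+Why wrapping a mutating call is harmful can be shown with a counter. This is a sketch under stated assumptions, using only the standard library: `fakeServer`, `retry`, and both functions are hypothetical and do not reflect Headscale's API. The fake server makes an approved route visible only on the third read, imitating propagation delay.
+
+```go
+package main
+
+import "fmt"
+
+// fakeServer is a hypothetical stand-in that counts mutating calls.
+type fakeServer struct {
+    approveCalls int
+    approved     bool
+    reads        int
+}
+
+// approveRoute is the mutating operation; each call is recorded.
+func (s *fakeServer) approveRoute() {
+    s.approveCalls++
+    s.approved = true
+}
+
+// routeVisible is the read; the change becomes visible on the third read.
+func (s *fakeServer) routeVisible() bool {
+    s.reads++
+    return s.approved && s.reads >= 3
+}
+
+// retry calls f up to attempts times and stops at the first success.
+func retry(attempts int, f func() bool) {
+    for i := 0; i < attempts; i++ {
+        if f() {
+            return
+        }
+    }
+}
+
+// wrappedMutation puts the mutation inside the retried block.
+func wrappedMutation() int {
+    s := &fakeServer{}
+    retry(5, func() bool {
+        s.approveRoute() // repeated on every attempt
+        return s.routeVisible()
+    })
+    return s.approveCalls
+}
+
+// separatedMutation mutates once, then retries only the read.
+func separatedMutation() int {
+    s := &fakeServer{}
+    s.approveRoute()
+    retry(5, s.routeVisible)
+    return s.approveCalls
+}
+
+func main() {
+    fmt.Println(wrappedMutation(), separatedMutation()) // 3 1
+}
+```
+
+The wrapped variant issues the mutation three times before the read succeeds, while the separated variant issues it exactly once.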
+
+### Correct Pattern for Blocking Operations
+
+```go
+// CORRECT: Blocking operation NOT wrapped
+status := client.MustStatus()
+command := []string{"tailscale", "set", "--advertise-routes=" + expectedRoutes[string(status.Self.ID)]}
+_, _, err = client.Execute(command)
+require.NoErrorf(t, err, "failed to advertise route: %s", err)
+
+// Then wait for the result with EventuallyWithT
+assert.EventuallyWithT(t, func(c *assert.CollectT) {
+    nodes, err := headscale.ListNodes()
+    assert.NoError(c, err)
+    assert.Contains(c, nodes[0].GetAvailableRoutes(), expectedRoutes[string(status.Self.ID)])
+}, 10*time.Second, 100*time.Millisecond, "route should be advertised")
+
+// INCORRECT: Blocking operation wrapped (DON'T DO THIS)
+assert.EventuallyWithT(t, func(c *assert.CollectT) {
+    _, _, err = client.Execute([]string{"tailscale", "set", "--advertise-routes=10.0.0.0/24"})
+    assert.NoError(c, err) // This might retry the command multiple times!
+}, 10*time.Second, 100*time.Millisecond, "advertise routes")
+```
+
+### Assert vs Require Pattern
+
+When working within EventuallyWithT blocks where you need to prevent panics:
+
+```go
+assert.EventuallyWithT(t, func(c *assert.CollectT) {
+    nodes, err := headscale.ListNodes()
+    assert.NoError(c, err)
+    
+    // For array bounds - use require with t to prevent panic
+    assert.Len(c, nodes, 6)  // Test expectation
+    require.GreaterOrEqual(t, len(nodes), 3, "need at least 3 nodes to avoid panic")
+    
+    // For nil pointer access - use require with t before dereferencing
+    assert.NotNil(c, srs1PeerStatus.PrimaryRoutes)  // Test expectation
+    require.NotNil(t, srs1PeerStatus.PrimaryRoutes, "primary routes must be set to avoid panic")
+    assert.Contains(c,
+        srs1PeerStatus.PrimaryRoutes.AsSlice(),
+        pref,
+    )
+}, 5*time.Second, 200*time.Millisecond, "checking route state")
+```
+
+**Key Principle**: 
+- Use `assert` with `c` (*assert.CollectT) for test expectations that can be retried
+- Use `require` with `t` (*testing.T) for MUST conditions that prevent panics
+- Within EventuallyWithT, both are available - choose based on whether failure would cause a panic
+
+### Common Scenarios
+
+1. **Waiting for route advertisement**:
+```go
+_, _, err := client.Execute([]string{"tailscale", "set", "--advertise-routes=10.0.0.0/24"})
+require.NoError(t, err)
+
+assert.EventuallyWithT(t, func(c *assert.CollectT) {
+    nodes, err := headscale.ListNodes()
+    assert.NoError(c, err)
+    assert.Contains(c, nodes[0].GetAvailableRoutes(), "10.0.0.0/24")
+}, 10*time.Second, 100*time.Millisecond, "route should be advertised")
+```
+
+2. **Checking client sees routes**:
+```go
+assert.EventuallyWithT(t, func(c *assert.CollectT) {
+    status, err := client.Status()
+    assert.NoError(c, err)
+    
+    // Check all peers have expected routes
+    for _, peerKey := range status.Peers() {
+        peerStatus := status.Peer[peerKey]
+        assert.Contains(c, peerStatus.AllowedIPs, expectedPrefix)
+    }
+}, 10*time.Second, 100*time.Millisecond, "all peers should see route")
+```
+
+3. **Sequential operations**:
+```go
+// First wait for node to appear
+var nodeID uint64
+assert.EventuallyWithT(t, func(c *assert.CollectT) {
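+    nodes, err := headscale.ListNodes()
+    assert.NoError(c, err)
+    // assert does not stop execution, so return before indexing if the length is wrong
+    if !assert.Len(c, nodes, 1) {
+        return
+    }
+    nodeID = nodes[0].GetId()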
+}, 10*time.Second, 100*time.Millisecond, "node should register")
+
+// Then perform operation
+_, err := headscale.ApproveRoute(nodeID, "10.0.0.0/24")
+require.NoError(t, err)
+
+// Then wait for result
+assert.EventuallyWithT(t, func(c *assert.CollectT) {
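+    nodes, err := headscale.ListNodes()
+    assert.NoError(c, err)
+    // Stop this attempt if the list is empty; indexing nodes[0] would panic
+    if !assert.NotEmpty(c, nodes) {
+        return
+    }
+    assert.Contains(c, nodes[0].GetApprovedRoutes(), "10.0.0.0/24")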
+}, 10*time.Second, 100*time.Millisecond, "route should be approved")
+```
+
+## Your Core Responsibilities
+
+1. **Test Execution Strategy**: Execute integration tests with appropriate configurations, understanding when to use `--postgres` and the timing requirements of different test categories. Follow a phase-based testing approach that prioritizes route tests.
+   - **Why this priority**: Route tests are less infrastructure-sensitive and validate core security logic
+
+2. **Systematic Test Analysis**: When tests fail, systematically examine artifacts starting with Headscale server logs, then client logs, then protocol data. Focus on CODE ISSUES first (99% of cases), not infrastructure. Use real-world failure patterns to guide investigation.
+   - **Why this approach**: Most failures are logic bugs, not environment issues - efficient debugging saves time
+
+3. **Timing & Synchronization Expertise**: Understand asynchronous Headscale operations, particularly route advertisements, NodeStore synchronization at `poll.go:420`, and policy propagation. Fix timing with `EventuallyWithT` while preserving original test expectations.
+   - **Why preserve expectations**: Test assertions encode business requirements and security policies
+   - **Key Pattern**: Apply the EventuallyWithT pattern correctly for all external calls as documented above
+
+4. **Root Cause Analysis**: Distinguish between actual code regressions (route approval logic, HA failover architecture), timing issues requiring `EventuallyWithT` patterns, and genuine infrastructure problems (DNS, Docker, container issues).
+   - **Why this distinction matters**: Different problem types require completely different solution approaches
+   - **EventuallyWithT Issues**: Often manifest as flaky tests or immediate assertion failures after async operations
+
+5. **Security-Aware Quality Validation**: Ensure tests properly validate end-to-end functionality with realistic timing expectations and proper error handling. Never suggest security bypasses or test expectation changes. Add comprehensive godoc when you understand test business logic.
+   - **Why security focus**: Integration tests are the last line of defense against security regressions
+   - **EventuallyWithT Usage**: Proper use prevents race conditions without weakening security assertions
+
+**CRITICAL PRINCIPLE**: Test expectations are sacred contracts that define correct system behavior. When tests fail, fix the code to match the test, never change the test to match broken code. Only timing and observability improvements are allowed - business logic expectations are immutable.
+
+**EventuallyWithT PRINCIPLE**: Every external call that reads state from the headscale server or a tailscale client must be wrapped in EventuallyWithT. Follow the five key rules strictly: one external call per block, proper variable scoping, no nesting, use CollectT for assertions, and provide descriptive messages.
+
+**Remember**: Test failures are usually code issues in Headscale that need to be fixed, not infrastructure problems to be ignored. Use the specific debugging workflows and failure patterns documented above to efficiently identify root causes. Infrastructure issues have very specific signatures - everything else is code-related.
--- a/.dockerignore
+++ b/.dockerignore
@@ -21,4 +21,3 @@ LICENSE
 node_modules/
 package-lock.json
 package.json
-
--- a/.github/ISSUE_TEMPLATE/bug_report.yaml
+++ b/.github/ISSUE_TEMPLATE/bug_report.yaml
@@ -52,12 +52,15 @@ body:
        If you are using a container, always provide the headscale version and not only the Docker image version.
        Please do not put "latest".

+        Describe your "headscale network". Are there a lot of nodes? Are the nodes all interconnected? Are some of them subnet routers?
+
        If you are experiencing a problem during an upgrade, please provide the versions of the old and new versions of Headscale and Tailscale.

        examples:
          - **OS**: Ubuntu 24.04
          - **Headscale version**: 0.24.3
          - **Tailscale version**: 1.80.0
+          - **Number of nodes**: 20
      value: |
        - OS:
        - Headscale version:
@@ -77,6 +80,10 @@ body:
    attributes:
      label: Debug information
      description: |
+        Please have a look at our [Debugging and troubleshooting
+        guide](https://headscale.net/development/ref/debug/) to learn about
+        common debugging techniques.
+
        Links? References? Anything that will give us more context about the issue you are encountering.
        If **any** of these are omitted we will likely close your issue, do **not** ignore them.

@@ -92,7 +99,7 @@ body:
        `tailscale status --json > DESCRIPTIVE_NAME.json`

        Get the logs of a Tailscale client that is not working as expected.
-        `tailscale daemon-logs`
+        `tailscale debug daemon-logs`

        Tip: You can attach images or log files by clicking this area to highlight it and then dragging files in.
        **Ensure** you use formatting for files you attach.
--- a/.github/ISSUE_TEMPLATE/feature_request.yaml
+++ b/.github/ISSUE_TEMPLATE/feature_request.yaml
@@ -16,15 +16,13 @@ body:
  - type: textarea
    attributes:
      label: Description
-      description:
-        A clear and precise description of what new or changed feature you want.
+      description: A clear and precise description of what new or changed feature you want.
    validations:
      required: true
  - type: checkboxes
    attributes:
      label: Contribution
-      description:
-        Are you willing to contribute to the implementation of this feature?
+      description: Are you willing to contribute to the implementation of this feature?
      options:
        - label: I can write the design doc for this feature
          required: false
@@ -33,7 +31,6 @@ body:
  - type: textarea
    attributes:
      label: How can it be implemented?
-      description:
-        Free text for your ideas on how this feature could be implemented.
+      description: Free text for your ideas on how this feature could be implemented.
    validations:
      required: false
--- a/.github/workflows/build.yml
+++ b/.github/workflows/build.yml
@@ -79,11 +79,7 @@ jobs:
    strategy:
      matrix:
        env:
-          - "GOARCH=arm   GOOS=linux GOARM=5"
-          - "GOARCH=arm   GOOS=linux GOARM=6"
-          - "GOARCH=arm   GOOS=linux GOARM=7"
          - "GOARCH=arm64 GOOS=linux"
-          - "GOARCH=386   GOOS=linux"
          - "GOARCH=amd64 GOOS=linux"
          - "GOARCH=arm64 GOOS=darwin"
          - "GOARCH=amd64 GOOS=darwin"
--- a/.github/workflows/check-generated.yml
+++ b/.github/workflows/check-generated.yml
@@ -0,0 +1,55 @@
+name: Check Generated Files
+
+on:
+  push:
+    branches:
+      - main
+  pull_request:
+    branches:
+      - main
+
+concurrency:
+  group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
+  cancel-in-progress: true
+
+jobs:
+  check-generated:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
+        with:
+          fetch-depth: 2
+      - name: Get changed files
+        id: changed-files
+        uses: dorny/paths-filter@de90cc6fb38fc0963ad72b210f1f284cd68cea36 # v3.0.2
+        with:
+          filters: |
+            files:
+              - '*.nix'
+              - 'go.*'
+              - '**/*.go'
+              - '**/*.proto'
+              - 'buf.gen.yaml'
+              - 'tools/**'
+      - uses: nixbuild/nix-quick-install-action@889f3180bb5f064ee9e3201428d04ae9e41d54ad # v31
+        if: steps.changed-files.outputs.files == 'true'
+      - uses: nix-community/cache-nix-action@135667ec418502fa5a3598af6fb9eb733888ce6a # v6.1.3
+        if: steps.changed-files.outputs.files == 'true'
+        with:
+          primary-key: nix-${{ runner.os }}-${{ runner.arch }}-${{ hashFiles('**/*.nix', '**/flake.lock') }}
+          restore-prefixes-first-match: nix-${{ runner.os }}-${{ runner.arch }}
+
+      - name: Run make generate
+        if: steps.changed-files.outputs.files == 'true'
+        run: nix develop --command -- make generate
+
+      - name: Check for uncommitted changes
+        if: steps.changed-files.outputs.files == 'true'
+        run: |
+          if ! git diff --exit-code; then
+            echo "❌ Generated files are not up to date!"
+            echo "Please run 'make generate' and commit the changes."
+            exit 1
+          else
+            echo "✅ All generated files are up to date."
+          fi
--- a/.github/workflows/integration-test-template.yml
+++ b/.github/workflows/integration-test-template.yml
@@ -62,24 +62,10 @@ jobs:
            '**/flake.lock') }}
          restore-prefixes-first-match: nix-${{ runner.os }}-${{ runner.arch }}
      - name: Run Integration Test
-        uses: Wandalen/wretry.action@e68c23e6309f2871ca8ae4763e7629b9c258e1ea # v3.8.0
-        if: steps.changed-files.outputs.files == 'true'
-        with:
-          # Our integration tests are started like a thundering herd, often
-          # hitting limits of the various external repositories we depend on
-          # like docker hub. This will retry jobs every 5 min, 10 times,
-          # hopefully letting us avoid manual intervention and restarting jobs.
-          # One could of course argue that we should invest in trying to avoid
-          # this, but currently it seems like a larger investment to be cleverer
-          # about this.
-          # Some of the jobs might still require manual restart as they are really
-          # slow and this will cause them to eventually be killed by Github actions.
-          attempt_delay: 300000 # 5 min
-          attempt_limit: 2
-          command: |
-            nix develop --command -- hi run "^${{ inputs.test }}$" \
-              --timeout=120m \
-              ${{ inputs.postgres_flag }}
+        run: |
+          nix develop --command -- hi run --stats --ts-memory-limit=300 --hs-memory-limit=1500 "^${{ inputs.test }}$" \
+          --timeout=120m \
+          ${{ inputs.postgres_flag }}
      - uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2
        if: always() && steps.changed-files.outputs.files == 'true'
        with:
--- a/.github/workflows/lint.yml
+++ b/.github/workflows/lint.yml
@@ -38,7 +38,10 @@ jobs:
        if: steps.changed-files.outputs.files == 'true'
        run: nix develop --command -- golangci-lint run
          --new-from-rev=${{github.event.pull_request.base.sha}}
-          --format=colored-line-number
+          --output.text.path=stdout
+          --output.text.print-linter-name
+          --output.text.print-issued-lines
+          --output.text.colors

  prettier-lint:
    runs-on: ubuntu-latest
--- a/.gitignore
+++ b/.gitignore
@@ -1,6 +1,9 @@
 ignored/
 tailscale/
 .vscode/
+.claude/
+
+*.prof

 # Binaries for programs and plugins
 *.exe
@@ -47,8 +50,6 @@ integration_test/etc/config.dump.yaml

 __debug_bin

-
 node_modules/
 package-lock.json
 package.json
-
--- a/.goreleaser.yml
+++ b/.goreleaser.yml
@@ -19,18 +19,10 @@ builds:
      - darwin_amd64
      - darwin_arm64
      - freebsd_amd64
-      - linux_386
      - linux_amd64
      - linux_arm64
-      - linux_arm_5
-      - linux_arm_6
-      - linux_arm_7
    flags:
      - -mod=readonly
-    ldflags:
-      - -s -w
-      - -X github.com/juanfont/headscale/hscontrol/types.Version={{ .Version }}
-      - -X github.com/juanfont/headscale/hscontrol/types.GitCommitHash={{ .Commit }}
    tags:
      - ts2019

@@ -113,9 +105,7 @@ kos:
      - CGO_ENABLED=0
    platforms:
      - linux/amd64
-      - linux/386
      - linux/arm64
-      - linux/arm/v7
    tags:
      - "{{ if not .Prerelease }}latest{{ end }}"
      - "{{ if not .Prerelease }}{{ .Major }}.{{ .Minor }}.{{ .Patch }}{{ end }}"
@@ -142,9 +132,7 @@ kos:
      - CGO_ENABLED=0
    platforms:
      - linux/amd64
-      - linux/386
      - linux/arm64
-      - linux/arm/v7
    tags:
      - "{{ if not .Prerelease }}latest-debug{{ end }}"
      - "{{ if not .Prerelease }}{{ .Major }}.{{ .Minor }}.{{ .Patch }}-debug{{ end }}"
--- a/.mcp.json
+++ b/.mcp.json
@@ -0,0 +1,48 @@
+{
+  "mcpServers": {
+    "claude-code-mcp": {
+      "type": "stdio",
+      "command": "npx",
+      "args": [
+        "-y",
+        "@steipete/claude-code-mcp@latest"
+      ],
+      "env": {}
+    },
+    "sequential-thinking": {
+      "type": "stdio",
+      "command": "npx",
+      "args": [
+        "-y",
+        "@modelcontextprotocol/server-sequential-thinking"
+      ],
+      "env": {}
+    },
+    "nixos": {
+      "type": "stdio",
+      "command": "uvx",
+      "args": [
+        "mcp-nixos"
+      ],
+      "env": {}
+    },
+    "context7": {
+      "type": "stdio",
+      "command": "npx",
+      "args": [
+        "-y",
+        "@upstash/context7-mcp"
+      ],
+      "env": {}
+    },
+    "git": {
+      "type": "stdio",
+      "command": "npx",
+      "args": [
+        "-y",
+        "@cyanheads/git-mcp-server"
+      ],
+      "env": {}
+    }
+  }
+}
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -2,6 +2,8 @@

 ## Next

+**Minimum supported Tailscale client version: v1.64.0**
+
 ### Database integrity improvements

 This release includes a significant database migration that addresses longstanding
@@ -24,6 +26,7 @@ Please read the [PR description](https://github.com/juanfont/headscale/pull/2617
 for more technical details about the issues and solutions.

 **SQLite Database Backup Example:**
+
 ```bash
 # Stop headscale
 systemctl stop headscale
@@ -39,49 +42,17 @@ cp /var/lib/headscale/db.sqlite-shm /var/lib/headscale/db.sqlite-shm.backup
 systemctl start headscale
 ```

+### DERPMap update frequency
+
+The default DERPMap update frequency has been changed from 24 hours to 3 hours.
+If you have set the `derp.update_frequency` configuration option, it is recommended to change
+it to `3h` so that the headscale instance picks up the latest DERPMap when
+upstream changes.
+
 ### BREAKING

- **CLI: Remove deprecated flags**
-  - `--identifier` flag removed - use `--node` or `--user` instead
-  - `--namespace` flag removed - use `--user` instead
-  
-  **Command changes:**
-  ```bash
-  # Before
-  headscale nodes expire --identifier 123
-  headscale nodes rename --identifier 123 new-name
-  headscale nodes delete --identifier 123
-  headscale nodes move --identifier 123 --user 456
-  headscale nodes list-routes --identifier 123
-  
-  # After
-  headscale nodes expire --node 123
-  headscale nodes rename --node 123 new-name
-  headscale nodes delete --node 123
-  headscale nodes move --node 123 --user 456
-  headscale nodes list-routes --node 123
-  
-  # Before
-  headscale users destroy --identifier 123
-  headscale users rename --identifier 123 --new-name john
-  headscale users list --identifier 123
-  
-  # After
-  headscale users destroy --user 123
-  headscale users rename --user 123 --new-name john
-  headscale users list --user 123
-  
-  # Before
-  headscale nodes register --namespace myuser nodekey
-  headscale nodes list --namespace myuser
-  headscale preauthkeys create --namespace myuser
-  
-  # After
-  headscale nodes register --user myuser nodekey
-  headscale nodes list --user myuser
-  headscale preauthkeys create --user myuser
-  ```
-
+- Remove support for 32-bit binaries
+  [#2692](https://github.com/juanfont/headscale/pull/2692)
 - Policy: Zero or empty destination port is no longer allowed
  [#2606](https://github.com/juanfont/headscale/pull/2606)

@@ -92,7 +63,17 @@ systemctl start headscale
  - **IMPORTANT: Backup your SQLite database before upgrading**
  - Introduces safer table renaming migration strategy
  - Addresses longstanding database integrity issues
-
+- Add flag to directly manipulate the policy in the database
+  [#2765](https://github.com/juanfont/headscale/pull/2765)
+- DERPMap update frequency default changed from 24h to 3h
+  [#2741](https://github.com/juanfont/headscale/pull/2741)
+- DERPMap update mechanism has been improved with retry,
+  and now fails conservatively, preserving the old map upon failure.
+  [#2741](https://github.com/juanfont/headscale/pull/2741)
+- Add support for `autogroup:member`, `autogroup:tagged`
+  [#2572](https://github.com/juanfont/headscale/pull/2572)
+- Fix bug where return routes were being removed by policy
+  [#2767](https://github.com/juanfont/headscale/pull/2767)
 - Remove policy v1 code [#2600](https://github.com/juanfont/headscale/pull/2600)
 - Refactor Debian/Ubuntu packaging and drop support for Ubuntu 20.04.
  [#2614](https://github.com/juanfont/headscale/pull/2614)
@@ -104,6 +85,14 @@ systemctl start headscale
  [#2625](https://github.com/juanfont/headscale/pull/2625)
 - Don't crash if config file is missing
  [#2656](https://github.com/juanfont/headscale/pull/2656)
+- Add `/robots.txt` endpoint to discourage crawlers
+  [#2643](https://github.com/juanfont/headscale/pull/2643)
+- OIDC: Use group claim from UserInfo
+  [#2663](https://github.com/juanfont/headscale/pull/2663)
+- OIDC: Update user with claims from UserInfo _before_ comparing with allowed
+  groups, email and domain [#2663](https://github.com/juanfont/headscale/pull/2663)
+- Policy will now reject invalid fields, making it easier to spot spelling errors
+  [#2764](https://github.com/juanfont/headscale/pull/2764)

 ## 0.26.1 (2025-06-06)

@@ -260,8 +249,6 @@ working in v1 and not tested might be broken in v2 (and vice versa).
  [#2438](https://github.com/juanfont/headscale/pull/2438)
 - Add documentation for routes
  [#2496](https://github.com/juanfont/headscale/pull/2496)
- Add support for `autogroup:member`, `autogroup:tagged`
-  [#2572](https://github.com/juanfont/headscale/pull/2572)

 ## 0.25.1 (2025-02-25)

--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -205,139 +205,46 @@ The architecture supports incremental development:
 - **Policy Tests**: ACL rule evaluation and edge cases
 - **Performance Tests**: NodeStore and high-frequency operation validation

-## Integration Test System
+## Integration Testing System

 ### Overview
-Integration tests use Docker containers running real Tailscale clients against a Headscale server. Tests validate end-to-end functionality including routing, ACLs, node lifecycle, and network coordination.
+Headscale uses Docker-based integration tests with real Tailscale clients to validate end-to-end functionality. The integration test system is complex and requires specialized knowledge for effective execution and debugging.

-### Running Integration Tests
+### **MANDATORY: Use the headscale-integration-tester Agent**
+
+**CRITICAL REQUIREMENT**: For ANY integration test execution, analysis, troubleshooting, or validation, you MUST use the `headscale-integration-tester` agent. This agent contains specialized knowledge about:
+
+- Test execution strategies and timing requirements  
+- Code vs infrastructure issue distinction (99% vs 1% failure patterns)
+- Security-critical debugging rules and forbidden practices
+- Comprehensive artifact analysis workflows
+- Real-world failure patterns from HA debugging experiences
+
+### Quick Reference Commands

-**System Requirements**
 ```bash
-# Check if your system is ready
+# Check system requirements (always run first)
 go run ./cmd/hi doctor
-```
-This verifies Docker, Go, required images, and disk space.

-**Test Execution Patterns**
-```bash
-# Run a single test (recommended for development)
-go run ./cmd/hi run "TestSubnetRouterMultiNetwork"
+# Run single test (recommended for development)  
+go run ./cmd/hi run "TestName"

-# Run with PostgreSQL backend (for database-heavy tests)
-go run ./cmd/hi run "TestExpireNode" --postgres
+# Use PostgreSQL for database-heavy tests
+go run ./cmd/hi run "TestName" --postgres

-# Run multiple tests with pattern matching
-go run ./cmd/hi run "TestSubnet*"
-
-# Run all integration tests (CI/full validation)
-go test ./integration -timeout 30m
+# Pattern matching for related tests
+go run ./cmd/hi run "TestPattern*"
 ```

-**Test Categories & Timing**
- **Fast tests** (< 2 min): Basic functionality, CLI operations
- **Medium tests** (2-5 min): Route management, ACL validation  
- **Slow tests** (5+ min): Node expiration, HA failover
- **Long-running tests** (10+ min): `TestNodeOnlineStatus` (12 min duration)
+**Critical Notes**:
+- Only ONE test can run at a time (Docker port conflicts)
+- Tests generate ~100MB of logs per run in `control_logs/`
+- Clean environment before each test: `rm -rf control_logs/202507* && docker system prune -f`

-### Test Infrastructure
+### Test Artifacts Location
+All test runs save comprehensive debugging artifacts to `control_logs/TIMESTAMP-ID/` including server logs, client logs, database dumps, MapResponse protocol data, and Prometheus metrics.

-**Docker Setup**
- Headscale server container with configurable database backend
- Multiple Tailscale client containers with different versions
- Isolated networks per test scenario
- Automatic cleanup after test completion
-
-**Test Artifacts**
-All test runs save artifacts to `control_logs/TIMESTAMP-ID/`:
-```
-control_logs/20250713-213106-iajsux/
-├── hs-testname-abc123.stderr.log     # Headscale server logs
-├── hs-testname-abc123.stdout.log
-├── hs-testname-abc123.db             # Database snapshot
-├── hs-testname-abc123_metrics.txt    # Prometheus metrics
-├── hs-testname-abc123-mapresponses/  # Protocol debug data
-├── ts-client-xyz789.stderr.log       # Tailscale client logs
-├── ts-client-xyz789.stdout.log
-└── ts-client-xyz789_status.json      # Client status dump
-```
-
-### Test Development Guidelines
-
-**Timing Considerations**
-Integration tests involve real network operations and Docker container lifecycle:
-
-```go
-// ❌ Wrong: Immediate assertions after async operations
-client.Execute([]string{"tailscale", "set", "--advertise-routes=10.0.0.0/24"})
-nodes, _ := headscale.ListNodes()
-require.Len(t, nodes[0].GetAvailableRoutes(), 1) // May fail due to timing
-
-// ✅ Correct: Wait for async operations to complete
-client.Execute([]string{"tailscale", "set", "--advertise-routes=10.0.0.0/24"})
-require.EventuallyWithT(t, func(c *assert.CollectT) {
-    nodes, err := headscale.ListNodes()
-    assert.NoError(c, err)
-    assert.Len(c, nodes[0].GetAvailableRoutes(), 1)
-}, 10*time.Second, 100*time.Millisecond, "route should be advertised")
-```
-
-**Common Test Patterns**
- **Route Advertisement**: Use `EventuallyWithT` for route propagation
- **Node State Changes**: Wait for NodeStore synchronization  
- **ACL Policy Changes**: Allow time for policy recalculation
- **Network Connectivity**: Use ping tests with retries
-
-**Test Data Management**
-```go
-// Node identification: Don't assume array ordering
-expectedRoutes := map[string]string{"1": "10.33.0.0/16"}
-for _, node := range nodes {
-    nodeIDStr := fmt.Sprintf("%d", node.GetId())
-    if route, shouldHaveRoute := expectedRoutes[nodeIDStr]; shouldHaveRoute {
-        // Test the node that should have the route
-    }
-}
-```
-
-### Troubleshooting Integration Tests
-
-**Common Failure Patterns**
-1. **Timing Issues**: Test assertions run before async operations complete
-   - **Solution**: Use `EventuallyWithT` with appropriate timeouts
-   - **Timeout Guidelines**: 3-5s for route operations, 10s for complex scenarios
-
-2. **Infrastructure Problems**: Disk space, Docker issues, network conflicts
-   - **Check**: `go run ./cmd/hi doctor` for system health
-   - **Clean**: Remove old test containers and networks
-
-3. **NodeStore Synchronization**: Tests expecting immediate data availability
-   - **Key Points**: Route advertisements must propagate through poll requests
-   - **Fix**: Wait for NodeStore updates after Hostinfo changes
-
-4. **Database Backend Differences**: SQLite vs PostgreSQL behavior differences
-   - **Use**: `--postgres` flag for database-intensive tests
-   - **Note**: Some timing characteristics differ between backends
-
-**Debugging Failed Tests**
-1. **Check test artifacts** in `control_logs/` for detailed logs
-2. **Examine MapResponse JSON** files for protocol-level debugging
-3. **Review Headscale stderr logs** for server-side error messages
-4. **Check Tailscale client status** for network-level issues
-
-**Resource Management**
- Tests require significant disk space (each run ~100MB of logs)
- Docker containers are cleaned up automatically on success
- Failed tests may leave containers running - clean manually if needed
- Use `docker system prune` periodically to reclaim space
-
-### Best Practices for Test Modifications
-
-1. **Always test locally** before committing integration test changes
-2. **Use appropriate timeouts** - too short causes flaky tests, too long slows CI
-3. **Clean up properly** - ensure tests don't leave persistent state
-4. **Handle both success and failure paths** in test scenarios
-5. **Document timing requirements** for complex test scenarios
+**For all integration test work, use the headscale-integration-tester agent - it contains the complete knowledge needed for effective testing and debugging.**

 ## NodeStore Implementation Details

@@ -352,14 +259,108 @@ for _, node := range nodes {
 ## Testing Guidelines

 ### Integration Test Patterns
+
+#### **CRITICAL: EventuallyWithT Pattern for External Calls**
+
+**All external calls in integration tests MUST be wrapped in EventuallyWithT blocks** to handle eventual consistency in distributed systems. External calls include:
+- `client.Status()` - Getting Tailscale client status
+- `client.Curl()` - Making HTTP requests through clients
+- `client.Traceroute()` - Running network diagnostics
+- `headscale.ListNodes()` - Querying headscale server state
+- Any other calls that interact with external systems or network operations
+
+**Key Rules**:
+1. **Never use bare `require.NoError(t, err)` with external calls** - Always wrap in EventuallyWithT
+2. **Keep related assertions together** - If multiple assertions depend on the same external call, keep them in the same EventuallyWithT block
+3. **Split unrelated external calls** - Different external calls should be in separate EventuallyWithT blocks
+4. **Never nest EventuallyWithT calls** - Each EventuallyWithT should be at the same level
+5. **Declare shared variables at function scope** - Variables used across multiple EventuallyWithT blocks must be declared before first use
+
+**Examples**:
+
 ```go
-// Use EventuallyWithT for async operations
-require.EventuallyWithT(t, func(c *assert.CollectT) {
+// CORRECT: External call wrapped in EventuallyWithT
+assert.EventuallyWithT(t, func(c *assert.CollectT) {
+    status, err := client.Status()
+    assert.NoError(c, err)
+    
+    // Related assertions using the same status call
+    for _, peerKey := range status.Peers() {
+        peerStatus := status.Peer[peerKey]
+        assert.NotNil(c, peerStatus.PrimaryRoutes)
+        requirePeerSubnetRoutesWithCollect(c, peerStatus, expectedRoutes)
+    }
+}, 5*time.Second, 200*time.Millisecond, "Verifying client status and routes")
+
+// INCORRECT: Bare external call without EventuallyWithT
+status, err := client.Status()  // ❌ Will fail intermittently
+require.NoError(t, err)
+
+// CORRECT: Separate EventuallyWithT for different external calls
+// First external call - headscale.ListNodes()
+assert.EventuallyWithT(t, func(c *assert.CollectT) {
    nodes, err := headscale.ListNodes()
    assert.NoError(c, err)
-    // Check expected state
-}, 10*time.Second, 100*time.Millisecond, "description")
+    assert.Len(c, nodes, 2)
+    requireNodeRouteCountWithCollect(c, nodes[0], 2, 2, 2)
+}, 10*time.Second, 500*time.Millisecond, "route state changes should propagate to nodes")

+// Second external call - client.Status() 
+assert.EventuallyWithT(t, func(c *assert.CollectT) {
+    status, err := client.Status()
+    assert.NoError(c, err)
+    
+    for _, peerKey := range status.Peers() {
+        peerStatus := status.Peer[peerKey]
+        requirePeerSubnetRoutesWithCollect(c, peerStatus, []netip.Prefix{tsaddr.AllIPv4(), tsaddr.AllIPv6()})
+    }
+}, 10*time.Second, 500*time.Millisecond, "routes should be visible to client")
+
+// INCORRECT: Multiple unrelated external calls in same EventuallyWithT
+assert.EventuallyWithT(t, func(c *assert.CollectT) {
+    nodes, err := headscale.ListNodes()  // ❌ First external call
+    assert.NoError(c, err)
+    
+    status, err := client.Status()  // ❌ Different external call - should be separate
+    assert.NoError(c, err)
+}, 10*time.Second, 500*time.Millisecond, "mixed calls")
+
+// CORRECT: Variable scoping for shared data
+var (
+    srs1, srs2, srs3       *ipnstate.Status
+    clientStatus           *ipnstate.Status
+    srs1PeerStatus         *ipnstate.PeerStatus
+)
+
+assert.EventuallyWithT(t, func(c *assert.CollectT) {
+    srs1 = subRouter1.MustStatus()  // = not :=
+    srs2 = subRouter2.MustStatus()
+    clientStatus = client.MustStatus()
+    
+    srs1PeerStatus = clientStatus.Peer[srs1.Self.PublicKey]
+    // assertions...
+}, 5*time.Second, 200*time.Millisecond, "checking router status")
+
+// CORRECT: Wrapping client operations
+assert.EventuallyWithT(t, func(c *assert.CollectT) {
+    result, err := client.Curl(weburl)
+    assert.NoError(c, err)
+    assert.Len(c, result, 13)
+}, 5*time.Second, 200*time.Millisecond, "Verifying HTTP connectivity")
+
+assert.EventuallyWithT(t, func(c *assert.CollectT) {
+    tr, err := client.Traceroute(webip)
+    assert.NoError(c, err)
+    assertTracerouteViaIPWithCollect(c, tr, expectedRouter.MustIPv4())
+}, 5*time.Second, 200*time.Millisecond, "Verifying network path")
+```
+
+**Helper Functions**:
+- Use `requirePeerSubnetRoutesWithCollect` instead of `requirePeerSubnetRoutes` inside EventuallyWithT
+- Use `requireNodeRouteCountWithCollect` instead of `requireNodeRouteCount` inside EventuallyWithT
+- Use `assertTracerouteViaIPWithCollect` instead of `assertTracerouteViaIP` inside EventuallyWithT
+
+```go
 // Node route checking by actual node properties, not array position
 var routeNode *v1.Node
 for _, node := range nodes {
@@ -375,21 +376,155 @@ for _, node := range nodes {
 - Infrastructure issues like disk space can cause test failures unrelated to code changes  
 - Use `--postgres` flag when testing database-heavy scenarios

+## Quality Assurance and Testing Requirements
+
+### **MANDATORY: Always Use Specialized Testing Agents**
+
+**CRITICAL REQUIREMENT**: For ANY task involving testing, quality assurance, review, or validation, you MUST use the appropriate specialized agent at the END of your task list. This ensures comprehensive quality validation and prevents regressions.
+
+**Required Agents for Different Task Types**:
+
+1. **Integration Testing**: Use `headscale-integration-tester` agent for:
+   - Running integration tests with `cmd/hi`
+   - Analyzing test failures and artifacts
+   - Troubleshooting Docker-based test infrastructure
+   - Validating end-to-end functionality changes
+
+2. **Quality Control**: Use `quality-control-enforcer` agent for:
+   - Code review and validation
+   - Ensuring best practices compliance  
+   - Preventing common pitfalls and anti-patterns
+   - Validating architectural decisions
+
+**Agent Usage Pattern**: Always add the appropriate agent as the FINAL step in any task list to ensure quality validation occurs after all work is complete.
+
+### Integration Test Debugging Reference
+
+Test artifacts are preserved in `control_logs/TIMESTAMP-ID/` including:
+- Headscale server logs (stderr/stdout)
+- Tailscale client logs and status  
+- Database dumps and network captures
+- MapResponse JSON files for protocol debugging
+
+**For integration test issues, ALWAYS use the headscale-integration-tester agent - do not attempt manual debugging.**
+
+## EventuallyWithT Pattern for Integration Tests
+
+### Overview
+`assert.EventuallyWithT` (from testify) retries a block of assertions until they all pass or a timeout expires, which is how the integration tests handle eventual consistency. Many Headscale operations are asynchronous: clients advertise routes, the server processes them, and updates propagate through the network. Wrapping assertions in EventuallyWithT lets a test wait for those operations to complete instead of failing on the first stale read.
+
+### External Calls That Must Be Wrapped
+The following operations are **external calls** that interact with the headscale server or tailscale clients and MUST be wrapped in EventuallyWithT:
+- `headscale.ListNodes()` - Queries server state
+- `client.Status()` - Gets client network status
+- `client.Curl()` - Makes HTTP requests through the network
+- `client.Traceroute()` - Performs network diagnostics
+- `client.Execute()` when running commands that query state
+- Any operation that reads from the headscale server or tailscale client
+
+### Operations That Must NOT Be Wrapped
+The following are **blocking operations** that modify state and should NOT be wrapped in EventuallyWithT:
+- `tailscale set` commands (e.g., `--advertise-routes`, `--exit-node`)
+- Any command that changes configuration or state
+- When a blocking operation only needs the client's ID, read it with `client.MustStatus()` outside EventuallyWithT rather than wrapping `client.Status()`
+
+### Five Key Rules for EventuallyWithT
+
+1. **One External Call Per EventuallyWithT Block**
+   - Each EventuallyWithT should make ONE external call (e.g., ListNodes OR Status)
+   - Related assertions based on that single call can be grouped together
+   - Unrelated external calls must be in separate EventuallyWithT blocks
+
+2. **Variable Scoping**
+   - Declare variables that need to be shared across EventuallyWithT blocks at function scope
+   - Use `=` for assignment inside EventuallyWithT, not `:=` (unless the variable is only used within that block)
+   - Variables declared with `:=` inside EventuallyWithT are not accessible outside
+
+3. **No Nested EventuallyWithT**
+   - NEVER put an EventuallyWithT inside another EventuallyWithT
+   - This is a critical anti-pattern that must be avoided
+
+4. **Use CollectT for Assertions**
+   - Inside EventuallyWithT, use `assert` methods with the CollectT parameter
+   - Helper functions called within EventuallyWithT must accept `*assert.CollectT`
+
+5. **Descriptive Messages**
+   - Always provide a descriptive message as the last parameter
+   - Message should explain what condition is being waited for
+
+### Correct Pattern Examples
+
+```go
+// CORRECT: Blocking operation NOT wrapped
+for _, client := range allClients {
+    status := client.MustStatus()
+    command := []string{
+        "tailscale",
+        "set",
+        "--advertise-routes=" + expectedRoutes[string(status.Self.ID)],
+    }
+    _, _, err = client.Execute(command)
+    require.NoErrorf(t, err, "failed to advertise route: %s", err)
+}
+
+// CORRECT: Single external call with related assertions
+var nodes []*v1.Node
+assert.EventuallyWithT(t, func(c *assert.CollectT) {
+    nodes, err = headscale.ListNodes()
+    assert.NoError(c, err)
+    assert.Len(c, nodes, 2)
+    requireNodeRouteCountWithCollect(c, nodes[0], 2, 2, 2)
+}, 10*time.Second, 500*time.Millisecond, "nodes should have expected route counts")
+
+// CORRECT: Separate EventuallyWithT for different external call
+assert.EventuallyWithT(t, func(c *assert.CollectT) {
+    status, err := client.Status()
+    assert.NoError(c, err)
+    for _, peerKey := range status.Peers() {
+        peerStatus := status.Peer[peerKey]
+        requirePeerSubnetRoutesWithCollect(c, peerStatus, expectedPrefixes)
+    }
+}, 10*time.Second, 500*time.Millisecond, "client should see expected routes")
+```
+
+### Incorrect Patterns to Avoid
+
+```go
+// INCORRECT: Blocking operation wrapped in EventuallyWithT
+assert.EventuallyWithT(t, func(c *assert.CollectT) {
+    status, err := client.Status()
+    assert.NoError(c, err)
+    
+    // This is a blocking operation - should NOT be in EventuallyWithT!
+    command := []string{
+        "tailscale",
+        "set",
+        "--advertise-routes=" + expectedRoutes[string(status.Self.ID)],
+    }
+    _, _, err = client.Execute(command)
+    assert.NoError(c, err)
+}, 5*time.Second, 200*time.Millisecond, "wrong pattern")
+
+// INCORRECT: Multiple unrelated external calls in same EventuallyWithT
+assert.EventuallyWithT(t, func(c *assert.CollectT) {
+    // First external call
+    nodes, err := headscale.ListNodes()
+    assert.NoError(c, err)
+    assert.Len(c, nodes, 2)
+    
+    // Second unrelated external call - WRONG!
+    status, err := client.Status()
+    assert.NoError(c, err)
+    assert.NotNil(c, status)
+}, 10*time.Second, 500*time.Millisecond, "mixed operations")
+```
+
 ## Important Notes

 - **Dependencies**: Use `nix develop` for consistent toolchain (Go, buf, protobuf tools, linting)
 - **Protocol Buffers**: Changes to `proto/` require `make generate` and should be committed separately
 - **Code Style**: Enforced via golangci-lint with golines (width 88) and gofumpt formatting
 - **Database**: Supports both SQLite (development) and PostgreSQL (production/testing)
-- **Integration Tests**: Require Docker and can consume significant disk space
+- **Integration Tests**: Require Docker and can consume significant disk space - use headscale-integration-tester agent
 - **Performance**: NodeStore optimizations are critical for scale - be careful with changes to state management
-
-## Debugging Integration Tests
-
-Test artifacts are preserved in `control_logs/TIMESTAMP-ID/` including:
-- Headscale server logs (stderr/stdout)
-- Tailscale client logs and status
-- Database dumps and network captures
-- MapResponse JSON files for protocol debugging
-
-When tests fail, check these artifacts first before assuming code issues.
+- **Quality Assurance**: Always use appropriate specialized agents for testing and validation tasks
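Review note on the "Variable Scoping" rule added above: the difference between `=` and `:=` inside a polled closure is easy to miss, so here is a minimal, stdlib-only sketch of it. The `eventually` helper and the fake `listNodes` call are hypothetical stand-ins for `assert.EventuallyWithT` and `headscale.ListNodes()`, not the real APIs; the sketch only illustrates why the rule asks for function-scope declarations.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// eventually is a hypothetical stand-in for assert.EventuallyWithT: it re-runs
// check every tick until check returns nil or the timeout elapses.
func eventually(check func() error, timeout, tick time.Duration) error {
	deadline := time.Now().Add(timeout)
	for {
		err := check()
		if err == nil {
			return nil
		}
		if time.Now().After(deadline) {
			return err
		}
		time.Sleep(tick)
	}
}

// newFakeLister returns a hypothetical external call that only succeeds on
// its third invocation, imitating state that takes a while to propagate.
func newFakeLister() func() ([]string, error) {
	calls := 0
	return func() ([]string, error) {
		calls++
		if calls < 3 {
			return nil, errors.New("not ready")
		}
		return []string{"node-a", "node-b"}, nil
	}
}

// pollShared assigns with "=", so the function-scope nodes is filled in.
func pollShared() ([]string, error) {
	listNodes := newFakeLister()
	var nodes []string
	err := eventually(func() error {
		var err error
		nodes, err = listNodes() // "=" writes to the outer nodes
		return err
	}, time.Second, time.Millisecond)
	return nodes, err
}

// pollShadowed uses ":=", so the result never leaves the closure.
func pollShadowed() ([]string, error) {
	listNodes := newFakeLister()
	var nodes []string
	err := eventually(func() error {
		nodes, err := listNodes() // ":=" declares a new, closure-local nodes
		_ = nodes
		return err
	}, time.Second, time.Millisecond)
	return nodes, err // outer nodes is still nil here
}

func main() {
	shared, _ := pollShared()
	shadowed, _ := pollShadowed()
	fmt.Println(len(shared), len(shadowed))
}
```

Both functions poll successfully, but only `pollShared` makes the result visible to later steps, which is the situation the rule is guarding against when a later EventuallyWithT block needs `nodes`.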
--- a/CLI_IMPROVEMENT_PLAN.md
+++ b/CLI_IMPROVEMENT_PLAN.md
--- a/CLI_STANDARDIZATION_SUMMARY.md
+++ b/CLI_STANDARDIZATION_SUMMARY.md
@@ -1,201 +0,0 @@
-# CLI Standardization Summary
-
-## Changes Made
-
-### 1. Command Naming Standardization
-- **Fixed**: `backfillips` → `backfill-ips` (with backward compat alias)
-- **Fixed**: `dumpConfig` → `dump-config` (with backward compat alias)
-- **Result**: All commands now use kebab-case consistently
-
-### 2. Flag Standardization
-
-#### Node Commands
-- **Added**: `--node` flag as primary way to specify nodes
-- **Deprecated**: `--identifier` flag (hidden, marked deprecated)
-- **Backward Compatible**: Both flags work, `--identifier` shows deprecation warning
-- **Smart Lookup Ready**: `--node` accepts strings for future name/hostname/IP lookup
-
-#### User Commands  
-- **Updated**: User identification flow prepared for `--user` flag
-- **Maintained**: Existing `--name` and `--identifier` flags for backward compatibility
-
-### 3. Description Consistency
-- **Fixed**: "Api" → "API" throughout
-- **Fixed**: Capitalization consistency in short descriptions
-- **Fixed**: Removed unnecessary periods from short descriptions
-- **Standardized**: "Handle/Manage the X of Headscale" pattern
-
-### 4. Type Consistency
-- **Standardized**: Node IDs use `uint64` consistently
-- **Maintained**: Backward compatibility with existing flag types
-
-## Current Status
-
-### ✅ Completed
-- Command naming (kebab-case)
-- Flag deprecation and aliasing
-- Description standardization
-- Backward compatibility preservation
-- Helper functions for flag processing
-- **SMART LOOKUP IMPLEMENTATION**:
-  - Enhanced `ListNodesRequest` proto with ID, name, hostname, IP filters
-  - Implemented smart filtering in `ListNodes` gRPC method
-  - Added CLI smart lookup functions for nodes and users
-  - Single match validation with helpful error messages
-  - Automatic detection: ID (numeric) vs IP vs name/hostname/email
-
-### ✅ Smart Lookup Features
-- **Node Lookup**: By ID, hostname, or IP address
-- **User Lookup**: By ID, username, or email address
-- **Single Match Enforcement**: Errors if 0 or >1 matches found
-- **Helpful Error Messages**: Shows all matches when ambiguous
-- **Full Backward Compatibility**: All existing flags still work
-- **Enhanced List Commands**: Both `nodes list` and `users list` support all filter types
-
-## Breaking Changes
-
-**None.** All changes maintain full backward compatibility through flag aliases and deprecation warnings.
-
-## Implementation Details
-
-### Smart Lookup Algorithm
-
-1. **Input Detection**:
-   ```go
-   if numeric && > 0 -> treat as ID
-   else if contains "@" -> treat as email (users only)  
-   else if valid IP address -> treat as IP (nodes only)
-   else -> treat as name/hostname
-   ```
-
-2. **gRPC Filtering**:
-   - Uses enhanced `ListNodes`/`ListUsers` with specific filters
-   - Server-side filtering for optimal performance
-   - Single transaction per lookup
-
-3. **Match Validation**:
-   - Exactly 1 match: Return ID
-   - 0 matches: Error with "not found" message
-   - >1 matches: Error listing all matches for disambiguation
-
-### Enhanced Proto Definitions
-
-```protobuf
-message ListNodesRequest { 
-  string user = 1;           // existing
-  uint64 id = 2;            // new: filter by ID
-  string name = 3;          // new: filter by hostname  
-  string hostname = 4;      // new: alias for name
-  repeated string ip_addresses = 5; // new: filter by IPs
-}
-```
-
-### Future Enhancements
-
-- **Fuzzy Matching**: Partial name matching with confirmation
-- **Recently Used**: Cache recently accessed nodes/users
-- **Tab Completion**: Shell completion for names/hostnames
-- **Bulk Operations**: Multi-select with pattern matching
-
-## Migration Path for Users
-
-### Now Available (Current Release)
-```bash
-# Old way (still works, shows deprecation warning)
-headscale nodes expire --identifier 123
-
-# New way with smart lookup:
-headscale nodes expire --node 123                    # by ID
-headscale nodes expire --node "my-laptop"           # by hostname  
-headscale nodes expire --node "100.64.0.1"          # by Tailscale IP
-headscale nodes expire --node "192.168.1.100"       # by real IP
-
-# User operations:
-headscale users destroy --user 123                   # by ID
-headscale users destroy --user "alice"               # by username
-headscale users destroy --user "alice@company.com"   # by email
-
-# Enhanced list commands with filtering:
-headscale nodes list --node "laptop"                 # filter nodes by name
-headscale nodes list --ip "100.64.0.1"              # filter nodes by IP
-headscale nodes list --user "alice"                  # filter nodes by user
-headscale users list --user "alice"                  # smart lookup user
-headscale users list --email "@company.com"          # filter by email domain
-headscale users list --name "alice"                  # filter by exact name
-
-# Error handling examples:
-headscale nodes expire --node "laptop"
-# Error: multiple nodes found matching 'laptop': ID=1 name=laptop-alice, ID=2 name=laptop-bob
-
-headscale nodes expire --node "nonexistent" 
-# Error: no node found matching 'nonexistent'
-```
-
-## Command Structure Overview
-
-```
-headscale [global-flags] <command> [command-flags] <subcommand> [subcommand-flags] [args]
-
-Global Flags:
-  --config, -c     config file path
-  --output, -o     output format (json, yaml, json-line)  
-  --force          disable prompts
-
-Commands:
-├── serve
-├── version  
-├── config-test
-├── dump-config (alias: dumpConfig)
-├── mockoidc
-├── generate/
-│   └── private-key
-├── nodes/
-│   ├── list (--user, --tags, --columns)
-│   ├── register (--user, --key) 
-│   ├── list-routes (--node)
-│   ├── expire (--node)
-│   ├── rename (--node) <new-name>
-│   ├── delete (--node)
-│   ├── move (--node, --user)
-│   ├── tag (--node, --tags)
-│   ├── approve-routes (--node, --routes)
-│   └── backfill-ips (alias: backfillips)
-├── users/
-│   ├── create <name> (--display-name, --email, --picture-url)
-│   ├── list (--user, --name, --email, --columns)
-│   ├── destroy (--user|--name|--identifier)
-│   └── rename (--user|--name|--identifier, --new-name)
-├── apikeys/
-│   ├── list
-│   ├── create (--expiration)
-│   ├── expire (--prefix)
-│   └── delete (--prefix)
-├── preauthkeys/
-│   ├── list (--user)
-│   ├── create (--user, --reusable, --ephemeral, --expiration, --tags)
-│   └── expire (--user) <key>
-├── policy/
-│   ├── get
-│   ├── set (--file)
-│   └── check (--file)
-└── debug/
-    └── create-node (--name, --user, --key, --route)
-```
-
-## Deprecated Flags
-
-All deprecated flags continue to work but show warnings:
-
-- `--identifier` → use `--node` (for node commands) or `--user` (for user commands)
-- `--namespace` → use `--user` (already implemented)
-- `dumpConfig` → use `dump-config`
-- `backfillips` → use `backfill-ips`
-
-## Error Handling
-
-Improved error messages provide clear guidance:
-```
-Error: node specifier must be a numeric ID (smart lookup by name/hostname/IP not yet implemented)
-Error: --node flag is required  
-Error: --user flag is required
-```
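Review note on the removed summary: its "Input Detection" step is written as pseudocode inside a `go` fence. For readers who want to see the stated order of checks as compilable Go, a minimal sketch follows. The function name `classify` and its string return values are hypothetical, chosen for illustration; this is not the code that was in the repository, and it ignores the "users only" / "nodes only" distinction the pseudocode mentions.

```go
package main

import (
	"fmt"
	"net/netip"
	"strconv"
	"strings"
)

// classify applies the detection order from the removed summary:
// positive number -> ID, contains "@" -> email, valid IP -> IP, else name.
// The name and return values are illustrative assumptions.
func classify(input string) string {
	if id, err := strconv.ParseUint(input, 10, 64); err == nil && id > 0 {
		return "id"
	}
	if strings.Contains(input, "@") {
		return "email"
	}
	if _, err := netip.ParseAddr(input); err == nil {
		return "ip"
	}
	return "name"
}

func main() {
	for _, in := range []string{"123", "alice@company.com", "100.64.0.1", "my-laptop"} {
		fmt.Printf("%-20s -> %s\n", in, classify(in))
	}
}
```

Checking the numeric case first matters because a bare number such as `123` does not parse as an IP address with `netip.ParseAddr`, so the order only changes the outcome for inputs that could satisfy more than one rule, such as a name that happens to contain `@`.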
--- a/Dockerfile.integration
+++ b/Dockerfile.integration
@@ -13,14 +13,18 @@ RUN apt-get update \
  && apt-get clean
 RUN mkdir -p /var/run/headscale

+# Install delve debugger
+RUN go install github.com/go-delve/delve/cmd/dlv@latest
+
 COPY go.mod go.sum /go/src/headscale/
 RUN go mod download

 COPY . .

-RUN CGO_ENABLED=0 GOOS=linux go install -a ./cmd/headscale && test -e /go/bin/headscale
+# Build debug binary with debug symbols for delve
+RUN CGO_ENABLED=0 GOOS=linux go build -gcflags="all=-N -l" -o /go/bin/headscale ./cmd/headscale

 # Need to reset the entrypoint or everything will run as a busybox script
 ENTRYPOINT []
-EXPOSE 8080/tcp
-CMD ["headscale"]
+EXPOSE 8080/tcp 40000/tcp
+CMD ["/go/bin/dlv", "--listen=0.0.0.0:40000", "--headless=true", "--api-version=2", "--accept-multiclient", "exec", "/go/bin/headscale", "--"]
--- a/Dockerfile.tailscale-HEAD
+++ b/Dockerfile.tailscale-HEAD
@@ -4,7 +4,7 @@
 # This Dockerfile is more or less lifted from tailscale/tailscale
 # to ensure a similar build process when testing the HEAD of tailscale.

-FROM golang:1.24-alpine AS build-env
+FROM golang:1.25-alpine AS build-env

 WORKDIR /go/src

--- a/7
+++ b/7
@@ -87,10 +87,9 @@ lint-proto: check-deps $(PROTO_SOURCES)

 # Code generation
 .PHONY: generate
-generate: check-deps $(PROTO_SOURCES)
-	@echo "Generating code from Protocol Buffers..."
-	rm -rf gen
-	buf generate proto
+generate: check-deps
+	@echo "Generating code..."
+	go generate ./...

 # Clean targets
 .PHONY: clean
--- a/cmd/headscale/cli/api_key.go
+++ b/cmd/headscale/cli/api_key.go
@@ -1,7 +1,6 @@
 package cli

 import (
-	"context"
 	"fmt"
 	"strconv"
 	"time"
@@ -15,6 +14,11 @@ import (
 	"google.golang.org/protobuf/types/known/timestamppb"
 )

+const (
+	// 90 days.
+	DefaultAPIKeyExpiry = "90d"
+)
+
 func init() {
 	rootCmd.AddCommand(apiKeysCmd)
 	apiKeysCmd.AddCommand(listAPIKeys)
@@ -39,80 +43,75 @@ func init() {

 var apiKeysCmd = &cobra.Command{
 	Use:     "apikeys",
-	Short:   "Handle the API keys in Headscale",
+	Short:   "Handle the Api keys in Headscale",
 	Aliases: []string{"apikey", "api"},
 }

 var listAPIKeys = &cobra.Command{
 	Use:     "list",
-	Short:   "List the API keys for Headscale",
+	Short:   "List the Api keys for headscale",
 	Aliases: []string{"ls", "show"},
 	Run: func(cmd *cobra.Command, args []string) {
-		output := GetOutputFlag(cmd)
+		output, _ := cmd.Flags().GetString("output")

-		err := WithClient(func(ctx context.Context, client v1.HeadscaleServiceClient) error {
-			request := &v1.ListApiKeysRequest{}
+		ctx, client, conn, cancel := newHeadscaleCLIWithConfig()
+		defer cancel()
+		defer conn.Close()

-			response, err := client.ListApiKeys(ctx, request)
-			if err != nil {
-				ErrorOutput(
-					err,
-					fmt.Sprintf("Error getting the list of keys: %s", err),
-					output,
-				)
-				return err
-			}
+		request := &v1.ListApiKeysRequest{}

-			if output != "" {
-				SuccessOutput(response.GetApiKeys(), "", output)
-				return nil
-			}
-
-			tableData := pterm.TableData{
-				{"ID", "Prefix", "Expiration", "Created"},
-			}
-			for _, key := range response.GetApiKeys() {
-				expiration := "-"
-
-				if key.GetExpiration() != nil {
-					expiration = ColourTime(key.GetExpiration().AsTime())
-				}
-
-				tableData = append(tableData, []string{
-					strconv.FormatUint(key.GetId(), util.Base10),
-					key.GetPrefix(),
-					expiration,
-					key.GetCreatedAt().AsTime().Format(HeadscaleDateTimeFormat),
-				})
-
-			}
-			err = pterm.DefaultTable.WithHasHeader().WithData(tableData).Render()
-			if err != nil {
-				ErrorOutput(
-					err,
-					fmt.Sprintf("Failed to render pterm table: %s", err),
-					output,
-				)
-				return err
-			}
-			return nil
-		})
+		response, err := client.ListApiKeys(ctx, request)
 		if err != nil {
-			return
+			ErrorOutput(
+				err,
+				fmt.Sprintf("Error getting the list of keys: %s", err),
+				output,
+			)
+		}
+
+		if output != "" {
+			SuccessOutput(response.GetApiKeys(), "", output)
+		}
+
+		tableData := pterm.TableData{
+			{"ID", "Prefix", "Expiration", "Created"},
+		}
+		for _, key := range response.GetApiKeys() {
+			expiration := "-"
+
+			if key.GetExpiration() != nil {
+				expiration = ColourTime(key.GetExpiration().AsTime())
+			}
+
+			tableData = append(tableData, []string{
+				strconv.FormatUint(key.GetId(), util.Base10),
+				key.GetPrefix(),
+				expiration,
+				key.GetCreatedAt().AsTime().Format(HeadscaleDateTimeFormat),
+			})
+
+		}
+		err = pterm.DefaultTable.WithHasHeader().WithData(tableData).Render()
+		if err != nil {
+			ErrorOutput(
+				err,
+				fmt.Sprintf("Failed to render pterm table: %s", err),
+				output,
+			)
 		}
 	},
 }

 var createAPIKeyCmd = &cobra.Command{
 	Use:   "create",
-	Short: "Create a new API key",
+	Short: "Creates a new Api key",
 	Long: `
 Creates a new Api key, the Api key is only visible on creation
 and cannot be retrieved again.
 If you loose a key, create a new one and revoke (expire) the old one.`,
 	Aliases: []string{"c", "new"},
 	Run: func(cmd *cobra.Command, args []string) {
-		output := GetOutputFlag(cmd)
+		output, _ := cmd.Flags().GetString("output")

 		request := &v1.CreateApiKeyRequest{}

@@ -125,101 +124,99 @@ If you loose a key, create a new one and revoke (expire) the old one.`,
 				fmt.Sprintf("Could not parse duration: %s\n", err),
 				output,
 			)
-			return
 		}

 		expiration := time.Now().UTC().Add(time.Duration(duration))

 		request.Expiration = timestamppb.New(expiration)

-		err = WithClient(func(ctx context.Context, client v1.HeadscaleServiceClient) error {
-			response, err := client.CreateApiKey(ctx, request)
-			if err != nil {
-				ErrorOutput(
-					err,
-					fmt.Sprintf("Cannot create Api Key: %s\n", err),
-					output,
-				)
-				return err
-			}
+		ctx, client, conn, cancel := newHeadscaleCLIWithConfig()
+		defer cancel()
+		defer conn.Close()

-			SuccessOutput(response.GetApiKey(), response.GetApiKey(), output)
-			return nil
-		})
+		response, err := client.CreateApiKey(ctx, request)
 		if err != nil {
-			return
+			ErrorOutput(
+				err,
+				fmt.Sprintf("Cannot create Api Key: %s\n", err),
+				output,
+			)
 		}
+
+		SuccessOutput(response.GetApiKey(), response.GetApiKey(), output)
 	},
 }

 var expireAPIKeyCmd = &cobra.Command{
 	Use:     "expire",
-	Short:   "Expire an API key",
+	Short:   "Expire an ApiKey",
 	Aliases: []string{"revoke", "exp", "e"},
 	Run: func(cmd *cobra.Command, args []string) {
-		output := GetOutputFlag(cmd)
+		output, _ := cmd.Flags().GetString("output")
+
 		prefix, err := cmd.Flags().GetString("prefix")
 		if err != nil {
-			ErrorOutput(err, fmt.Sprintf("Error getting prefix from CLI flag: %s", err), output)
-			return
+			ErrorOutput(
+				err,
+				fmt.Sprintf("Error getting prefix from CLI flag: %s", err),
+				output,
+			)
 		}

-		err = WithClient(func(ctx context.Context, client v1.HeadscaleServiceClient) error {
-			request := &v1.ExpireApiKeyRequest{
-				Prefix: prefix,
-			}
+		ctx, client, conn, cancel := newHeadscaleCLIWithConfig()
+		defer cancel()
+		defer conn.Close()

-			response, err := client.ExpireApiKey(ctx, request)
-			if err != nil {
-				ErrorOutput(
-					err,
-					fmt.Sprintf("Cannot expire Api Key: %s\n", err),
-					output,
-				)
-				return err
-			}
+		request := &v1.ExpireApiKeyRequest{
+			Prefix: prefix,
+		}

-			SuccessOutput(response, "Key expired", output)
-			return nil
-		})
+		response, err := client.ExpireApiKey(ctx, request)
 		if err != nil {
-			return
+			ErrorOutput(
+				err,
+				fmt.Sprintf("Cannot expire Api Key: %s\n", err),
+				output,
+			)
 		}
+
+		SuccessOutput(response, "Key expired", output)
 	},
 }

 var deleteAPIKeyCmd = &cobra.Command{
 	Use:     "delete",
-	Short:   "Delete an API key",
+	Short:   "Delete an ApiKey",
 	Aliases: []string{"remove", "del"},
 	Run: func(cmd *cobra.Command, args []string) {
-		output := GetOutputFlag(cmd)
+		output, _ := cmd.Flags().GetString("output")
+
 		prefix, err := cmd.Flags().GetString("prefix")
 		if err != nil {
-			ErrorOutput(err, fmt.Sprintf("Error getting prefix from CLI flag: %s", err), output)
-			return
+			ErrorOutput(
+				err,
+				fmt.Sprintf("Error getting prefix from CLI flag: %s", err),
+				output,
+			)
 		}

-		err = WithClient(func(ctx context.Context, client v1.HeadscaleServiceClient) error {
-			request := &v1.DeleteApiKeyRequest{
-				Prefix: prefix,
-			}
+		ctx, client, conn, cancel := newHeadscaleCLIWithConfig()
+		defer cancel()
+		defer conn.Close()

-			response, err := client.DeleteApiKey(ctx, request)
-			if err != nil {
-				ErrorOutput(
-					err,
-					fmt.Sprintf("Cannot delete Api Key: %s\n", err),
-					output,
-				)
-				return err
-			}
+		request := &v1.DeleteApiKeyRequest{
+			Prefix: prefix,
+		}

-			SuccessOutput(response, "Key deleted", output)
-			return nil
-		})
+		response, err := client.DeleteApiKey(ctx, request)
 		if err != nil {
-			return
+			ErrorOutput(
+				err,
+				fmt.Sprintf("Cannot delete Api Key: %s\n", err),
+				output,
+			)
 		}
+
+		SuccessOutput(response, "Key deleted", output)
 	},
 }
--- a/cmd/headscale/cli/client.go
+++ b/cmd/headscale/cli/client.go
@@ -1,16 +0,0 @@
-package cli
-
-import (
-	"context"
-
-	v1 "github.com/juanfont/headscale/gen/go/headscale/v1"
-)
-
-// WithClient handles gRPC client setup and cleanup, calls fn with client and context
-func WithClient(fn func(context.Context, v1.HeadscaleServiceClient) error) error {
-	ctx, client, conn, cancel := newHeadscaleCLIWithConfig()
-	defer cancel()
-	defer conn.Close()
-
-	return fn(ctx, client)
-}
--- a/cmd/headscale/cli/configtest.go
+++ b/cmd/headscale/cli/configtest.go
@@ -11,8 +11,8 @@ func init() {

 var configTestCmd = &cobra.Command{
 	Use:   "configtest",
-	Short: "Test the configuration",
-	Long:  "Run a test of the configuration and exit",
+	Short: "Test the configuration.",
+	Long:  "Run a test of the configuration and exit.",
 	Run: func(cmd *cobra.Command, args []string) {
 		_, err := newHeadscaleServerWithConfig()
 		if err != nil {
--- a/cmd/headscale/cli/configtest_test.go
+++ b/cmd/headscale/cli/configtest_test.go
@@ -1,46 +0,0 @@
-package cli
-
-import (
-	"testing"
-
-	"github.com/stretchr/testify/assert"
-	"github.com/stretchr/testify/require"
-)
-
-func TestConfigTestCommand(t *testing.T) {
-	// Test that the configtest command exists and is properly configured
-	assert.NotNil(t, configTestCmd)
-	assert.Equal(t, "configtest", configTestCmd.Use)
-	assert.Equal(t, "Test the configuration.", configTestCmd.Short)
-	assert.Equal(t, "Run a test of the configuration and exit.", configTestCmd.Long)
-	assert.NotNil(t, configTestCmd.Run)
-}
-
-func TestConfigTestCommandInRootCommand(t *testing.T) {
-	// Test that configtest is available as a subcommand of root
-	cmd, _, err := rootCmd.Find([]string{"configtest"})
-	require.NoError(t, err)
-	assert.Equal(t, "configtest", cmd.Name())
-	assert.Equal(t, configTestCmd, cmd)
-}
-
-func TestConfigTestCommandHelp(t *testing.T) {
-	// Test that the command has proper help text
-	assert.NotEmpty(t, configTestCmd.Short)
-	assert.NotEmpty(t, configTestCmd.Long)
-	assert.Contains(t, configTestCmd.Short, "configuration")
-	assert.Contains(t, configTestCmd.Long, "test")
-	assert.Contains(t, configTestCmd.Long, "configuration")
-}
-
-// Note: We can't easily test the actual execution of configtest because:
-// 1. It depends on configuration files being present
-// 2. It calls log.Fatal() which would exit the test process
-// 3. It tries to initialize a full Headscale server
-//
-// In a real refactor, we would:
-// 1. Extract the configuration validation logic to a testable function
-// 2. Return errors instead of calling log.Fatal()
-// 3. Accept configuration as a parameter instead of loading from global state
-//
-// For now, we test the command structure and that it's properly wired up.
--- a/cmd/headscale/cli/debug.go
+++ b/cmd/headscale/cli/debug.go
@@ -1,7 +1,6 @@
 package cli

 import (
-	"context"
 	"fmt"

 	v1 "github.com/juanfont/headscale/gen/go/headscale/v1"
@@ -15,6 +14,11 @@ const (
 	errPreAuthKeyMalformed = Error("key is malformed. expected 64 hex characters with `nodekey` prefix")
 )

+// Error is used to compare errors as per https://dave.cheney.net/2016/04/07/constant-errors
+type Error string
+
+func (e Error) Error() string { return string(e) }
+
 func init() {
 	rootCmd.AddCommand(debugCmd)

@@ -25,6 +29,11 @@ func init() {
 	}
 	createNodeCmd.Flags().StringP("user", "u", "", "User")

+	createNodeCmd.Flags().StringP("namespace", "n", "", "User")
+	createNodeNamespaceFlag := createNodeCmd.Flags().Lookup("namespace")
+	createNodeNamespaceFlag.Deprecated = deprecateNamespaceMessage
+	createNodeNamespaceFlag.Hidden = true
+
 	err = createNodeCmd.MarkFlagRequired("user")
 	if err != nil {
 		log.Fatal().Err(err).Msg("")
@@ -50,14 +59,17 @@ var createNodeCmd = &cobra.Command{
 	Use:   "create-node",
 	Short: "Create a node that can be registered with `nodes register <>` command",
 	Run: func(cmd *cobra.Command, args []string) {
-		output := GetOutputFlag(cmd)
+		output, _ := cmd.Flags().GetString("output")

 		user, err := cmd.Flags().GetString("user")
 		if err != nil {
 			ErrorOutput(err, fmt.Sprintf("Error getting user: %s", err), output)
-			return
 		}

+		ctx, client, conn, cancel := newHeadscaleCLIWithConfig()
+		defer cancel()
+		defer conn.Close()
+
 		name, err := cmd.Flags().GetString("name")
 		if err != nil {
 			ErrorOutput(
@@ -65,7 +77,6 @@ var createNodeCmd = &cobra.Command{
 				fmt.Sprintf("Error getting node from flag: %s", err),
 				output,
 			)
-			return
 		}

 		registrationID, err := cmd.Flags().GetString("key")
@@ -75,7 +86,6 @@ var createNodeCmd = &cobra.Command{
 				fmt.Sprintf("Error getting key from flag: %s", err),
 				output,
 			)
-			return
 		}

 		_, err = types.RegistrationIDFromString(registrationID)
@@ -85,7 +95,6 @@ var createNodeCmd = &cobra.Command{
 				fmt.Sprintf("Failed to parse machine key from flag: %s", err),
 				output,
 			)
-			return
 		}

 		routes, err := cmd.Flags().GetStringSlice("route")
@@ -95,32 +104,24 @@ var createNodeCmd = &cobra.Command{
 				fmt.Sprintf("Error getting routes from flag: %s", err),
 				output,
 			)
-			return
 		}

-		err = WithClient(func(ctx context.Context, client v1.HeadscaleServiceClient) error {
-			request := &v1.DebugCreateNodeRequest{
-				Key:    registrationID,
-				Name:   name,
-				User:   user,
-				Routes: routes,
-			}
+		request := &v1.DebugCreateNodeRequest{
+			Key:    registrationID,
+			Name:   name,
+			User:   user,
+			Routes: routes,
+		}

-			response, err := client.DebugCreateNode(ctx, request)
-			if err != nil {
-				ErrorOutput(
-					err,
-					"Cannot create node: "+status.Convert(err).Message(),
-					output,
-				)
-				return err
-			}
-
-			SuccessOutput(response.GetNode(), "Node created", output)
-			return nil
-		})
+		response, err := client.DebugCreateNode(ctx, request)
 		if err != nil {
-			return
+			ErrorOutput(
+				err,
+				"Cannot create node: "+status.Convert(err).Message(),
+				output,
+			)
 		}
+
+		SuccessOutput(response.GetNode(), "Node created", output)
 	},
 }
--- a/cmd/headscale/cli/debug_test.go
+++ b/cmd/headscale/cli/debug_test.go
@@ -1,144 +0,0 @@
-package cli
-
-import (
-	"testing"
-
-	"github.com/stretchr/testify/assert"
-	"github.com/stretchr/testify/require"
-)
-
-func TestDebugCommand(t *testing.T) {
-	// Test that the debug command exists and is properly configured
-	assert.NotNil(t, debugCmd)
-	assert.Equal(t, "debug", debugCmd.Use)
-	assert.Equal(t, "debug and testing commands", debugCmd.Short)
-	assert.Equal(t, "debug contains extra commands used for debugging and testing headscale", debugCmd.Long)
-}
-
-func TestDebugCommandInRootCommand(t *testing.T) {
-	// Test that debug is available as a subcommand of root
-	cmd, _, err := rootCmd.Find([]string{"debug"})
-	require.NoError(t, err)
-	assert.Equal(t, "debug", cmd.Name())
-	assert.Equal(t, debugCmd, cmd)
-}
-
-func TestCreateNodeCommand(t *testing.T) {
-	// Test that the create-node command exists and is properly configured
-	assert.NotNil(t, createNodeCmd)
-	assert.Equal(t, "create-node", createNodeCmd.Use)
-	assert.Equal(t, "Create a node that can be registered with `nodes register <>` command", createNodeCmd.Short)
-	assert.NotNil(t, createNodeCmd.Run)
-}
-
-func TestCreateNodeCommandInDebugCommand(t *testing.T) {
-	// Test that create-node is available as a subcommand of debug
-	cmd, _, err := rootCmd.Find([]string{"debug", "create-node"})
-	require.NoError(t, err)
-	assert.Equal(t, "create-node", cmd.Name())
-	assert.Equal(t, createNodeCmd, cmd)
-}
-
-func TestCreateNodeCommandFlags(t *testing.T) {
-	// Test that create-node has the required flags
-
-	// Test name flag
-	nameFlag := createNodeCmd.Flags().Lookup("name")
-	assert.NotNil(t, nameFlag)
-	assert.Equal(t, "", nameFlag.Shorthand) // No shorthand for name
-	assert.Equal(t, "", nameFlag.DefValue)
-
-	// Test user flag
-	userFlag := createNodeCmd.Flags().Lookup("user")
-	assert.NotNil(t, userFlag)
-	assert.Equal(t, "u", userFlag.Shorthand)
-
-	// Test key flag
-	keyFlag := createNodeCmd.Flags().Lookup("key")
-	assert.NotNil(t, keyFlag)
-	assert.Equal(t, "k", keyFlag.Shorthand)
-
-	// Test route flag
-	routeFlag := createNodeCmd.Flags().Lookup("route")
-	assert.NotNil(t, routeFlag)
-	assert.Equal(t, "r", routeFlag.Shorthand)
-
-}
-
-func TestCreateNodeCommandRequiredFlags(t *testing.T) {
-	// Test that required flags are marked as required
-	// We can't easily test the actual requirement enforcement without executing the command
-	// But we can test that the flags exist and have the expected properties
-
-	// These flags should be required based on the init() function
-	requiredFlags := []string{"name", "user", "key"}
-
-	for _, flagName := range requiredFlags {
-		flag := createNodeCmd.Flags().Lookup(flagName)
-		assert.NotNil(t, flag, "Required flag %s should exist", flagName)
-	}
-}
-
-func TestErrorType(t *testing.T) {
-	// Test the Error type implementation
-	err := errPreAuthKeyMalformed
-	assert.Equal(t, "key is malformed. expected 64 hex characters with `nodekey` prefix", err.Error())
-	assert.Equal(t, "key is malformed. expected 64 hex characters with `nodekey` prefix", string(err))
-
-	// Test that it implements the error interface
-	var genericErr error = err
-	assert.Equal(t, "key is malformed. expected 64 hex characters with `nodekey` prefix", genericErr.Error())
-}
-
-func TestErrorConstants(t *testing.T) {
-	// Test that error constants are defined properly
-	assert.Equal(t, Error("key is malformed. expected 64 hex characters with `nodekey` prefix"), errPreAuthKeyMalformed)
-}
-
-func TestDebugCommandStructure(t *testing.T) {
-	// Test that debug has create-node as a subcommand
-	found := false
-	for _, subcmd := range debugCmd.Commands() {
-		if subcmd.Name() == "create-node" {
-			found = true
-			break
-		}
-	}
-	assert.True(t, found, "create-node should be a subcommand of debug")
-}
-
-func TestCreateNodeCommandHelp(t *testing.T) {
-	// Test that the command has proper help text
-	assert.NotEmpty(t, createNodeCmd.Short)
-	assert.Contains(t, createNodeCmd.Short, "Create a node")
-	assert.Contains(t, createNodeCmd.Short, "nodes register")
-}
-
-func TestCreateNodeCommandFlagDescriptions(t *testing.T) {
-	// Test that flags have appropriate usage descriptions
-	nameFlag := createNodeCmd.Flags().Lookup("name")
-	assert.Equal(t, "Name", nameFlag.Usage)
-
-	userFlag := createNodeCmd.Flags().Lookup("user")
-	assert.Equal(t, "User", userFlag.Usage)
-
-	keyFlag := createNodeCmd.Flags().Lookup("key")
-	assert.Equal(t, "Key", keyFlag.Usage)
-
-	routeFlag := createNodeCmd.Flags().Lookup("route")
-	assert.Contains(t, routeFlag.Usage, "routes to advertise")
-
-}
-
-// Note: We can't easily test the actual execution of create-node because:
-// 1. It depends on gRPC client configuration
-// 2. It calls SuccessOutput/ErrorOutput which exit the process
-// 3. It requires valid registration keys and user setup
-//
-// In a real refactor, we would:
-// 1. Extract the business logic to testable functions
-// 2. Use dependency injection for the gRPC client
-// 3. Return errors instead of calling ErrorOutput/SuccessOutput
-// 4. Add validation functions that can be tested independently
-//
-// For now, we test the command structure and flag configuration.
--- a/cmd/headscale/cli/dump_config.go
+++ b/cmd/headscale/cli/dump_config.go
@@ -12,10 +12,9 @@ func init() {
 }

 var dumpConfigCmd = &cobra.Command{
-	Use:     "dump-config",
-	Short:   "Dump current config to /etc/headscale/config.dump.yaml, integration test only",
-	Aliases: []string{"dumpConfig"},
-	Hidden:  true,
+	Use:    "dumpConfig",
+	Short:  "dump current config to /etc/headscale/config.dump.yaml, integration test only",
+	Hidden: true,
 	Args: func(cmd *cobra.Command, args []string) error {
 		return nil
 	},
--- a/cmd/headscale/cli/generate.go
+++ b/cmd/headscale/cli/generate.go
@@ -22,7 +22,7 @@ var generatePrivateKeyCmd = &cobra.Command{
 	Use:   "private-key",
 	Short: "Generate a private key for the headscale server",
 	Run: func(cmd *cobra.Command, args []string) {
-		output := GetOutputFlag(cmd)
+		output, _ := cmd.Flags().GetString("output")
 		machineKey := key.NewMachine()

 		machineKeyStr, err := machineKey.MarshalText()
--- a/cmd/headscale/cli/generate_test.go
+++ b/cmd/headscale/cli/generate_test.go
@@ -1,230 +0,0 @@
-package cli
-
-import (
-	"bytes"
-	"encoding/json"
-	"strings"
-	"testing"
-
-	"github.com/spf13/cobra"
-	"github.com/stretchr/testify/assert"
-	"github.com/stretchr/testify/require"
-	"gopkg.in/yaml.v3"
-)
-
-func TestGenerateCommand(t *testing.T) {
-	// Test that the generate command exists and shows help
-	cmd := &cobra.Command{
-		Use:   "headscale",
-		Short: "headscale - a Tailscale control server",
-	}
-
-	cmd.AddCommand(generateCmd)
-
-	out := new(bytes.Buffer)
-	cmd.SetOut(out)
-	cmd.SetErr(out)
-	cmd.SetArgs([]string{"generate", "--help"})
-
-	err := cmd.Execute()
-	require.NoError(t, err)
-
-	outStr := out.String()
-	assert.Contains(t, outStr, "Generate commands")
-	assert.Contains(t, outStr, "private-key")
-	assert.Contains(t, outStr, "Aliases:")
-	assert.Contains(t, outStr, "gen")
-}
-
-func TestGenerateCommandAlias(t *testing.T) {
-	// Test that the "gen" alias works
-	cmd := &cobra.Command{
-		Use:   "headscale",
-		Short: "headscale - a Tailscale control server",
-	}
-
-	cmd.AddCommand(generateCmd)
-
-	out := new(bytes.Buffer)
-	cmd.SetOut(out)
-	cmd.SetErr(out)
-	cmd.SetArgs([]string{"gen", "--help"})
-
-	err := cmd.Execute()
-	require.NoError(t, err)
-
-	outStr := out.String()
-	assert.Contains(t, outStr, "Generate commands")
-}
-
-func TestGeneratePrivateKeyCommand(t *testing.T) {
-	tests := []struct {
-		name       string
-		args       []string
-		expectJSON bool
-		expectYAML bool
-	}{
-		{
-			name:       "default output",
-			args:       []string{"generate", "private-key"},
-			expectJSON: false,
-			expectYAML: false,
-		},
-		{
-			name:       "json output",
-			args:       []string{"generate", "private-key", "--output", "json"},
-			expectJSON: true,
-			expectYAML: false,
-		},
-		{
-			name:       "yaml output",
-			args:       []string{"generate", "private-key", "--output", "yaml"},
-			expectJSON: false,
-			expectYAML: true,
-		},
-	}
-
-	for _, tt := range tests {
-		t.Run(tt.name, func(t *testing.T) {
-			// Note: This command calls SuccessOutput which exits the process
-			// We can't test the actual execution easily without mocking
-			// Instead, we test the command structure and that it exists
-
-			cmd := &cobra.Command{
-				Use:   "headscale",
-				Short: "headscale - a Tailscale control server",
-			}
-
-			cmd.AddCommand(generateCmd)
-			cmd.PersistentFlags().StringP("output", "o", "", "Output format")
-
-			// Test that the command exists and can be found
-			privateKeyCmd, _, err := cmd.Find([]string{"generate", "private-key"})
-			require.NoError(t, err)
-			assert.Equal(t, "private-key", privateKeyCmd.Name())
-			assert.Equal(t, "Generate a private key for the headscale server", privateKeyCmd.Short)
-		})
-	}
-}
-
-func TestGeneratePrivateKeyHelp(t *testing.T) {
-	cmd := &cobra.Command{
-		Use:   "headscale",
-		Short: "headscale - a Tailscale control server",
-	}
-
-	cmd.AddCommand(generateCmd)
-
-	out := new(bytes.Buffer)
-	cmd.SetOut(out)
-	cmd.SetErr(out)
-	cmd.SetArgs([]string{"generate", "private-key", "--help"})
-
-	err := cmd.Execute()
-	require.NoError(t, err)
-
-	outStr := out.String()
-	assert.Contains(t, outStr, "Generate a private key for the headscale server")
-	assert.Contains(t, outStr, "Usage:")
-}
-
-// Test the key generation logic in isolation (without SuccessOutput/ErrorOutput)
-func TestPrivateKeyGeneration(t *testing.T) {
-	// We can't easily test the full command because it calls SuccessOutput which exits
-	// But we can test that the key generation produces valid output format
-
-	// This is testing the core logic that would be in the command
-	// In a real refactor, we'd extract this to a testable function
-
-	// For now, we can test that the command structure is correct
-	assert.NotNil(t, generatePrivateKeyCmd)
-	assert.Equal(t, "private-key", generatePrivateKeyCmd.Use)
-	assert.Equal(t, "Generate a private key for the headscale server", generatePrivateKeyCmd.Short)
-	assert.NotNil(t, generatePrivateKeyCmd.Run)
-}
-
-func TestGenerateCommandStructure(t *testing.T) {
-	// Test the command hierarchy
-	assert.Equal(t, "generate", generateCmd.Use)
-	assert.Equal(t, "Generate commands", generateCmd.Short)
-	assert.Contains(t, generateCmd.Aliases, "gen")
-
-	// Test that private-key is a subcommand
-	found := false
-	for _, subcmd := range generateCmd.Commands() {
-		if subcmd.Name() == "private-key" {
-			found = true
-			break
-		}
-	}
-	assert.True(t, found, "private-key should be a subcommand of generate")
-}
-
-// Helper function to test output formats (would be used if we refactored the command)
-func validatePrivateKeyOutput(t *testing.T, output string, format string) {
-	switch format {
-	case "json":
-		var result map[string]interface{}
-		err := json.Unmarshal([]byte(output), &result)
-		require.NoError(t, err, "Output should be valid JSON")
-
-		privateKey, exists := result["private_key"]
-		require.True(t, exists, "JSON should contain private_key field")
-
-		keyStr, ok := privateKey.(string)
-		require.True(t, ok, "private_key should be a string")
-		require.NotEmpty(t, keyStr, "private_key should not be empty")
-
-		// Basic validation that it looks like a machine key
-		assert.True(t, strings.HasPrefix(keyStr, "mkey:"), "Machine key should start with mkey:")
-
-	case "yaml":
-		var result map[string]interface{}
-		err := yaml.Unmarshal([]byte(output), &result)
-		require.NoError(t, err, "Output should be valid YAML")
-
-		privateKey, exists := result["private_key"]
-		require.True(t, exists, "YAML should contain private_key field")
-
-		keyStr, ok := privateKey.(string)
-		require.True(t, ok, "private_key should be a string")
-		require.NotEmpty(t, keyStr, "private_key should not be empty")
-
-		assert.True(t, strings.HasPrefix(keyStr, "mkey:"), "Machine key should start with mkey:")
-
-	default:
-		// Default format should just be the key itself
-		assert.True(t, strings.HasPrefix(output, "mkey:"), "Default output should be the machine key")
-		assert.NotContains(t, output, "{", "Default output should not contain JSON")
-		assert.NotContains(t, output, "private_key:", "Default output should not contain YAML structure")
-	}
-}
-
-func TestPrivateKeyOutputFormats(t *testing.T) {
-	// Test cases for different output formats
-	// These test the validation logic we would use after refactoring
-
-	tests := []struct {
-		format string
-		sample string
-	}{
-		{
-			format: "json",
-			sample: `{"private_key": "mkey:abcd1234567890abcd1234567890abcd1234567890abcd1234567890abcd1234"}`,
-		},
-		{
-			format: "yaml",
-			sample: "private_key: mkey:abcd1234567890abcd1234567890abcd1234567890abcd1234567890abcd1234\n",
-		},
-		{
-			format: "",
-			sample: "mkey:abcd1234567890abcd1234567890abcd1234567890abcd1234567890abcd1234",
-		},
-	}
-
-	for _, tt := range tests {
-		t.Run("format_"+tt.format, func(t *testing.T) {
-			validatePrivateKeyOutput(t, tt.sample, tt.format)
-		})
-	}
-}
--- a/cmd/headscale/cli/mockoidc.go
+++ b/cmd/headscale/cli/mockoidc.go
@@ -15,11 +15,6 @@ import (
 	"github.com/spf13/cobra"
 )

-// Error is used to compare errors as per https://dave.cheney.net/2016/04/07/constant-errors
-type Error string
-
-func (e Error) Error() string { return string(e) }
-
 const (
 	errMockOidcClientIDNotDefined     = Error("MOCKOIDC_CLIENT_ID not defined")
 	errMockOidcClientSecretNotDefined = Error("MOCKOIDC_CLIENT_SECRET not defined")
--- a/cmd/headscale/cli/nodes.go
+++ b/cmd/headscale/cli/nodes.go
@@ -1,7 +1,6 @@
 package cli

 import (
-	"context"
 	"fmt"
 	"log"
 	"net/netip"
@@ -10,7 +9,6 @@ import (
 	"strings"
 	"time"

-	survey "github.com/AlecAivazis/survey/v2"
 	v1 "github.com/juanfont/headscale/gen/go/headscale/v1"
 	"github.com/juanfont/headscale/hscontrol/util"
 	"github.com/pterm/pterm"
@@ -22,23 +20,25 @@ import (

 func init() {
 	rootCmd.AddCommand(nodeCmd)
-	// User filtering
 	listNodesCmd.Flags().StringP("user", "u", "", "Filter by user")
-	// Node filtering
-	listNodesCmd.Flags().StringP("node", "", "", "Filter by node (ID, name, hostname, or IP)")
-	listNodesCmd.Flags().Uint64P("id", "", 0, "Filter by node ID")
-	listNodesCmd.Flags().StringP("name", "", "", "Filter by node hostname")
-	listNodesCmd.Flags().StringP("ip", "", "", "Filter by node IP address")
-	// Display options
 	listNodesCmd.Flags().BoolP("tags", "t", false, "Show tags")
-	listNodesCmd.Flags().String("columns", "", "Comma-separated list of columns to display")
+
+	listNodesCmd.Flags().StringP("namespace", "n", "", "User")
+	listNodesNamespaceFlag := listNodesCmd.Flags().Lookup("namespace")
+	listNodesNamespaceFlag.Deprecated = deprecateNamespaceMessage
+	listNodesNamespaceFlag.Hidden = true
 	nodeCmd.AddCommand(listNodesCmd)

-	listNodeRoutesCmd.Flags().StringP("node", "n", "", "Node identifier (ID, name, hostname, or IP)")
+	listNodeRoutesCmd.Flags().Uint64P("identifier", "i", 0, "Node identifier (ID)")
 	nodeCmd.AddCommand(listNodeRoutesCmd)

 	registerNodeCmd.Flags().StringP("user", "u", "", "User")

+	registerNodeCmd.Flags().StringP("namespace", "n", "", "User")
+	registerNodeNamespaceFlag := registerNodeCmd.Flags().Lookup("namespace")
+	registerNodeNamespaceFlag.Deprecated = deprecateNamespaceMessage
+	registerNodeNamespaceFlag.Hidden = true
+
 	err := registerNodeCmd.MarkFlagRequired("user")
 	if err != nil {
 		log.Fatal(err.Error())
@@ -50,43 +50,54 @@ func init() {
 	}
 	nodeCmd.AddCommand(registerNodeCmd)

-	expireNodeCmd.Flags().StringP("node", "n", "", "Node identifier (ID, name, hostname, or IP)")
+	expireNodeCmd.Flags().Uint64P("identifier", "i", 0, "Node identifier (ID)")
+	err = expireNodeCmd.MarkFlagRequired("identifier")
 	if err != nil {
 		log.Fatal(err.Error())
 	}
 	nodeCmd.AddCommand(expireNodeCmd)

-	renameNodeCmd.Flags().StringP("node", "n", "", "Node identifier (ID, name, hostname, or IP)")
+	renameNodeCmd.Flags().Uint64P("identifier", "i", 0, "Node identifier (ID)")
+	err = renameNodeCmd.MarkFlagRequired("identifier")
 	if err != nil {
 		log.Fatal(err.Error())
 	}
 	nodeCmd.AddCommand(renameNodeCmd)

-	deleteNodeCmd.Flags().StringP("node", "n", "", "Node identifier (ID, name, hostname, or IP)")
+	deleteNodeCmd.Flags().Uint64P("identifier", "i", 0, "Node identifier (ID)")
+	err = deleteNodeCmd.MarkFlagRequired("identifier")
 	if err != nil {
 		log.Fatal(err.Error())
 	}
 	nodeCmd.AddCommand(deleteNodeCmd)

-	moveNodeCmd.Flags().StringP("node", "n", "", "Node identifier (ID, name, hostname, or IP)")
+	moveNodeCmd.Flags().Uint64P("identifier", "i", 0, "Node identifier (ID)")

+	err = moveNodeCmd.MarkFlagRequired("identifier")
 	if err != nil {
 		log.Fatal(err.Error())
 	}

-	moveNodeCmd.Flags().StringP("user", "u", "", "New user (ID, name, or email)")
-	moveNodeCmd.Flags().String("name", "", "New username")
+	moveNodeCmd.Flags().Uint64P("user", "u", 0, "New user")

-	// One of --user or --name is required (checked in GetUserIdentifier)
+	moveNodeCmd.Flags().StringP("namespace", "n", "", "User")
+	moveNodeNamespaceFlag := moveNodeCmd.Flags().Lookup("namespace")
+	moveNodeNamespaceFlag.Deprecated = deprecateNamespaceMessage
+	moveNodeNamespaceFlag.Hidden = true
+
+	err = moveNodeCmd.MarkFlagRequired("user")
+	if err != nil {
+		log.Fatal(err.Error())
+	}
 	nodeCmd.AddCommand(moveNodeCmd)

-	tagCmd.Flags().StringP("node", "n", "", "Node identifier (ID, name, hostname, or IP)")
-	tagCmd.MarkFlagRequired("node")
+	tagCmd.Flags().Uint64P("identifier", "i", 0, "Node identifier (ID)")
+	tagCmd.MarkFlagRequired("identifier")
 	tagCmd.Flags().StringSliceP("tags", "t", []string{}, "List of tags to add to the node")
 	nodeCmd.AddCommand(tagCmd)

-	approveRoutesCmd.Flags().StringP("node", "n", "", "Node identifier (ID, name, hostname, or IP)")
-	approveRoutesCmd.MarkFlagRequired("node")
+	approveRoutesCmd.Flags().Uint64P("identifier", "i", 0, "Node identifier (ID)")
+	approveRoutesCmd.MarkFlagRequired("identifier")
 	approveRoutesCmd.Flags().StringSliceP("routes", "r", []string{}, `List of routes that will be approved (comma-separated, e.g. "10.0.0.0/8,192.168.0.0/24" or empty string to remove all approved routes)`)
 	nodeCmd.AddCommand(approveRoutesCmd)

@@ -103,13 +114,16 @@ var registerNodeCmd = &cobra.Command{
 	Use:   "register",
 	Short: "Registers a node to your network",
 	Run: func(cmd *cobra.Command, args []string) {
-		output := GetOutputFlag(cmd)
+		output, _ := cmd.Flags().GetString("output")
 		user, err := cmd.Flags().GetString("user")
 		if err != nil {
 			ErrorOutput(err, fmt.Sprintf("Error getting user: %s", err), output)
-			return
 		}

+		ctx, client, conn, cancel := newHeadscaleCLIWithConfig()
+		defer cancel()
+		defer conn.Close()
+
 		registrationID, err := cmd.Flags().GetString("key")
 		if err != nil {
 			ErrorOutput(
@@ -117,36 +131,28 @@ var registerNodeCmd = &cobra.Command{
 				fmt.Sprintf("Error getting node key from flag: %s", err),
 				output,
 			)
-			return
 		}

-		err = WithClient(func(ctx context.Context, client v1.HeadscaleServiceClient) error {
-			request := &v1.RegisterNodeRequest{
-				Key:  registrationID,
-				User: user,
-			}
+		request := &v1.RegisterNodeRequest{
+			Key:  registrationID,
+			User: user,
+		}

-			response, err := client.RegisterNode(ctx, request)
-			if err != nil {
-				ErrorOutput(
-					err,
-					fmt.Sprintf(
-						"Cannot register node: %s\n",
-						status.Convert(err).Message(),
-					),
-					output,
-				)
-				return err
-			}
-
-			SuccessOutput(
-				response.GetNode(),
-				fmt.Sprintf("Node %s registered", response.GetNode().GetGivenName()), output)
-			return nil
-		})
+		response, err := client.RegisterNode(ctx, request)
 		if err != nil {
-			return
+			ErrorOutput(
+				err,
+				fmt.Sprintf(
+					"Cannot register node: %s\n",
+					status.Convert(err).Message(),
+				),
+				output,
+			)
 		}
+
+		SuccessOutput(
+			response.GetNode(),
+			fmt.Sprintf("Node %s registered", response.GetNode().GetGivenName()), output)
 	},
 }

@@ -155,79 +161,49 @@ var listNodesCmd = &cobra.Command{
 	Short:   "List nodes",
 	Aliases: []string{"ls", "show"},
 	Run: func(cmd *cobra.Command, args []string) {
-		output := GetOutputFlag(cmd)
+		output, _ := cmd.Flags().GetString("output")
+		user, err := cmd.Flags().GetString("user")
+		if err != nil {
+			ErrorOutput(err, fmt.Sprintf("Error getting user: %s", err), output)
+		}
 		showTags, err := cmd.Flags().GetBool("tags")
 		if err != nil {
 			ErrorOutput(err, fmt.Sprintf("Error getting tags flag: %s", err), output)
-			return
 		}

-		err = WithClient(func(ctx context.Context, client v1.HeadscaleServiceClient) error {
-			request := &v1.ListNodesRequest{}
+		ctx, client, conn, cancel := newHeadscaleCLIWithConfig()
+		defer cancel()
+		defer conn.Close()

-			// Handle user filtering (existing functionality)
-			if user, _ := cmd.Flags().GetString("user"); user != "" {
-				request.User = user
-			}
+		request := &v1.ListNodesRequest{
+			User: user,
+		}

-			// Handle node filtering (new functionality)
-			if nodeFlag, _ := cmd.Flags().GetString("node"); nodeFlag != "" {
-				// Use smart lookup to determine filter type
-				if id, err := strconv.ParseUint(nodeFlag, 10, 64); err == nil && id > 0 {
-					request.Id = id
-				} else if isIPAddress(nodeFlag) {
-					request.IpAddresses = []string{nodeFlag}
-				} else {
-					request.Name = nodeFlag
-				}
-			} else {
-				// Check specific filter flags
-				if id, _ := cmd.Flags().GetUint64("id"); id > 0 {
-					request.Id = id
-				} else if name, _ := cmd.Flags().GetString("name"); name != "" {
-					request.Name = name
-				} else if ip, _ := cmd.Flags().GetString("ip"); ip != "" {
-					request.IpAddresses = []string{ip}
-				}
-			}
-
-			response, err := client.ListNodes(ctx, request)
-			if err != nil {
-				ErrorOutput(
-					err,
-					"Cannot get nodes: "+status.Convert(err).Message(),
-					output,
-				)
-				return err
-			}
-
-			if output != "" {
-				SuccessOutput(response.GetNodes(), "", output)
-				return nil
-			}
-
-			// Get user for table display (if filtering by user)
-			userFilter := request.User
-			tableData, err := nodesToPtables(userFilter, showTags, response.GetNodes())
-			if err != nil {
-				ErrorOutput(err, fmt.Sprintf("Error converting to table: %s", err), output)
-				return err
-			}
-
-			tableData = FilterTableColumns(cmd, tableData)
-			err = pterm.DefaultTable.WithHasHeader().WithData(tableData).Render()
-			if err != nil {
-				ErrorOutput(
-					err,
-					fmt.Sprintf("Failed to render pterm table: %s", err),
-					output,
-				)
-				return err
-			}
-			return nil
-		})
+		response, err := client.ListNodes(ctx, request)
 		if err != nil {
-			return
+			ErrorOutput(
+				err,
+				"Cannot get nodes: "+status.Convert(err).Message(),
+				output,
+			)
+		}
+
+		if output != "" {
+			SuccessOutput(response.GetNodes(), "", output)
+		}
+
+		tableData, err := nodesToPtables(user, showTags, response.GetNodes())
+		if err != nil {
+			ErrorOutput(err, fmt.Sprintf("Error converting to table: %s", err), output)
+		}
+
+		err = pterm.DefaultTable.WithHasHeader().WithData(tableData).Render()
+		if err != nil {
+			ErrorOutput(
+				err,
+				fmt.Sprintf("Failed to render pterm table: %s", err),
+				output,
+			)
 		}
 	},
 }
@@ -237,68 +213,61 @@ var listNodeRoutesCmd = &cobra.Command{
 	Short:   "List routes available on nodes",
 	Aliases: []string{"lsr", "routes"},
 	Run: func(cmd *cobra.Command, args []string) {
-		output := GetOutputFlag(cmd)
-		identifier, err := GetNodeIdentifier(cmd)
+		output, _ := cmd.Flags().GetString("output")
+		identifier, err := cmd.Flags().GetUint64("identifier")
 		if err != nil {
 			ErrorOutput(
 				err,
-				fmt.Sprintf("Error getting node identifier: %s", err),
+				fmt.Sprintf("Error converting ID to integer: %s", err),
 				output,
 			)
-			return
 		}

-		err = WithClient(func(ctx context.Context, client v1.HeadscaleServiceClient) error {
-			request := &v1.ListNodesRequest{}
+		ctx, client, conn, cancel := newHeadscaleCLIWithConfig()
+		defer cancel()
+		defer conn.Close()

-			response, err := client.ListNodes(ctx, request)
-			if err != nil {
-				ErrorOutput(
-					err,
-					"Cannot get nodes: "+status.Convert(err).Message(),
-					output,
-				)
-				return err
-			}
+		request := &v1.ListNodesRequest{}

-			if output != "" {
-				SuccessOutput(response.GetNodes(), "", output)
-				return nil
-			}
+		response, err := client.ListNodes(ctx, request)
+		if err != nil {
+			ErrorOutput(
+				err,
+				"Cannot get nodes: "+status.Convert(err).Message(),
+				output,
+			)
+		}

-			nodes := response.GetNodes()
-			if identifier != 0 {
-				for _, node := range response.GetNodes() {
-					if node.GetId() == identifier {
-						nodes = []*v1.Node{node}
-						break
-					}
+		if output != "" {
+			SuccessOutput(response.GetNodes(), "", output)
+		}
+
+		nodes := response.GetNodes()
+		if identifier != 0 {
+			for _, node := range response.GetNodes() {
+				if node.GetId() == identifier {
+					nodes = []*v1.Node{node}
+					break
 				}
 			}
+		}

-			nodes = lo.Filter(nodes, func(n *v1.Node, _ int) bool {
-				return (n.GetSubnetRoutes() != nil && len(n.GetSubnetRoutes()) > 0) || (n.GetApprovedRoutes() != nil && len(n.GetApprovedRoutes()) > 0) || (n.GetAvailableRoutes() != nil && len(n.GetAvailableRoutes()) > 0)
-			})
-
-			tableData, err := nodeRoutesToPtables(nodes)
-			if err != nil {
-				ErrorOutput(err, fmt.Sprintf("Error converting to table: %s", err), output)
-				return err
-			}
-
-			err = pterm.DefaultTable.WithHasHeader().WithData(tableData).Render()
-			if err != nil {
-				ErrorOutput(
-					err,
-					fmt.Sprintf("Failed to render pterm table: %s", err),
-					output,
-				)
-				return err
-			}
-			return nil
+		nodes = lo.Filter(nodes, func(n *v1.Node, _ int) bool {
+			return (n.GetSubnetRoutes() != nil && len(n.GetSubnetRoutes()) > 0) || (n.GetApprovedRoutes() != nil && len(n.GetApprovedRoutes()) > 0) || (n.GetAvailableRoutes() != nil && len(n.GetAvailableRoutes()) > 0)
 		})
+
+		tableData, err := nodeRoutesToPtables(nodes)
 		if err != nil {
-			return
+			ErrorOutput(err, fmt.Sprintf("Error converting to table: %s", err), output)
+		}
+
+		err = pterm.DefaultTable.WithHasHeader().WithData(tableData).Render()
+		if err != nil {
+			ErrorOutput(
+				err,
+				fmt.Sprintf("Failed to render pterm table: %s", err),
+				output,
+			)
 		}
 	},
 }
@@ -309,42 +278,38 @@ var expireNodeCmd = &cobra.Command{
 	Long:    "Expiring a node will keep the node in the database and force it to reauthenticate.",
 	Aliases: []string{"logout", "exp", "e"},
 	Run: func(cmd *cobra.Command, args []string) {
-		output := GetOutputFlag(cmd)
+		output, _ := cmd.Flags().GetString("output")

-		identifier, err := GetNodeIdentifier(cmd)
+		identifier, err := cmd.Flags().GetUint64("identifier")
 		if err != nil {
 			ErrorOutput(
 				err,
-				fmt.Sprintf("Error getting node identifier: %s", err),
+				fmt.Sprintf("Error converting ID to integer: %s", err),
 				output,
 			)
-			return
 		}

-		err = WithClient(func(ctx context.Context, client v1.HeadscaleServiceClient) error {
-			request := &v1.ExpireNodeRequest{
-				NodeId: identifier,
-			}
+		ctx, client, conn, cancel := newHeadscaleCLIWithConfig()
+		defer cancel()
+		defer conn.Close()

-			response, err := client.ExpireNode(ctx, request)
-			if err != nil {
-				ErrorOutput(
-					err,
-					fmt.Sprintf(
-						"Cannot expire node: %s\n",
-						status.Convert(err).Message(),
-					),
-					output,
-				)
-				return err
-			}
+		request := &v1.ExpireNodeRequest{
+			NodeId: identifier,
+		}

-			SuccessOutput(response.GetNode(), "Node expired", output)
-			return nil
-		})
+		response, err := client.ExpireNode(ctx, request)
 		if err != nil {
-			return
+			ErrorOutput(
+				err,
+				fmt.Sprintf(
+					"Cannot expire node: %s\n",
+					status.Convert(err).Message(),
+				),
+				output,
+			)
 		}
+
+		SuccessOutput(response.GetNode(), "Node expired", output)
 	},
 }

@@ -352,48 +317,43 @@ var renameNodeCmd = &cobra.Command{
 	Use:   "rename NEW_NAME",
 	Short: "Renames a node in your network",
 	Run: func(cmd *cobra.Command, args []string) {
-		output := GetOutputFlag(cmd)
+		output, _ := cmd.Flags().GetString("output")

-		identifier, err := GetNodeIdentifier(cmd)
+		identifier, err := cmd.Flags().GetUint64("identifier")
 		if err != nil {
 			ErrorOutput(
 				err,
-				fmt.Sprintf("Error getting node identifier: %s", err),
+				fmt.Sprintf("Error converting ID to integer: %s", err),
 				output,
 			)
-			return
 		}

+		ctx, client, conn, cancel := newHeadscaleCLIWithConfig()
+		defer cancel()
+		defer conn.Close()
+
 		newName := ""
 		if len(args) > 0 {
 			newName = args[0]
 		}
-
-		err = WithClient(func(ctx context.Context, client v1.HeadscaleServiceClient) error {
-			request := &v1.RenameNodeRequest{
-				NodeId:  identifier,
-				NewName: newName,
-			}
-
-			response, err := client.RenameNode(ctx, request)
-			if err != nil {
-				ErrorOutput(
-					err,
-					fmt.Sprintf(
-						"Cannot rename node: %s\n",
-						status.Convert(err).Message(),
-					),
-					output,
-				)
-				return err
-			}
-
-			SuccessOutput(response.GetNode(), "Node renamed", output)
-			return nil
-		})
-		if err != nil {
-			return
+		request := &v1.RenameNodeRequest{
+			NodeId:  identifier,
+			NewName: newName,
 		}
+
+		response, err := client.RenameNode(ctx, request)
+		if err != nil {
+			ErrorOutput(
+				err,
+				fmt.Sprintf(
+					"Cannot rename node: %s\n",
+					status.Convert(err).Message(),
+				),
+				output,
+			)
+		}
+
+		SuccessOutput(response.GetNode(), "Node renamed", output)
 	},
 }

@@ -402,84 +362,66 @@ var deleteNodeCmd = &cobra.Command{
 	Short:   "Delete a node",
 	Aliases: []string{"del"},
 	Run: func(cmd *cobra.Command, args []string) {
-		output := GetOutputFlag(cmd)
+		output, _ := cmd.Flags().GetString("output")

-		identifier, err := GetNodeIdentifier(cmd)
+		identifier, err := cmd.Flags().GetUint64("identifier")
 		if err != nil {
 			ErrorOutput(
 				err,
-				fmt.Sprintf("Error getting node identifier: %s", err),
+				fmt.Sprintf("Error converting ID to integer: %s", err),
 				output,
 			)
-			return
 		}

-		var nodeName string
-		err = WithClient(func(ctx context.Context, client v1.HeadscaleServiceClient) error {
-			getRequest := &v1.GetNodeRequest{
-				NodeId: identifier,
-			}
+		ctx, client, conn, cancel := newHeadscaleCLIWithConfig()
+		defer cancel()
+		defer conn.Close()

-			getResponse, err := client.GetNode(ctx, getRequest)
-			if err != nil {
-				ErrorOutput(
-					err,
-					"Error getting node node: "+status.Convert(err).Message(),
-					output,
-				)
-				return err
-			}
-			nodeName = getResponse.GetNode().GetName()
-			return nil
-		})
+		getRequest := &v1.GetNodeRequest{
+			NodeId: identifier,
+		}
+
+		getResponse, err := client.GetNode(ctx, getRequest)
 		if err != nil {
-			return
+			ErrorOutput(
+				err,
+				"Error getting node: "+status.Convert(err).Message(),
+				output,
+			)
+		}
+
+		deleteRequest := &v1.DeleteNodeRequest{
+			NodeId: identifier,
 		}

 		confirm := false
 		force, _ := cmd.Flags().GetBool("force")
 		if !force {
-			prompt := &survey.Confirm{
-				Message: fmt.Sprintf(
-					"Do you want to remove the node %s?",
-					nodeName,
-				),
-			}
-			err = survey.AskOne(prompt, &confirm)
-			if err != nil {
-				return
-			}
+			confirm = util.YesNo(fmt.Sprintf(
+				"Do you want to remove the node %s?",
+				getResponse.GetNode().GetName(),
+			))
 		}

 		if confirm || force {
-			err = WithClient(func(ctx context.Context, client v1.HeadscaleServiceClient) error {
-				deleteRequest := &v1.DeleteNodeRequest{
-					NodeId: identifier,
-				}
+			response, err := client.DeleteNode(ctx, deleteRequest)
+			if output != "" {
+				SuccessOutput(response, "", output)

-				response, err := client.DeleteNode(ctx, deleteRequest)
-				if output != "" {
-					SuccessOutput(response, "", output)
-					return nil
-				}
-				if err != nil {
-					ErrorOutput(
-						err,
-						"Error deleting node: "+status.Convert(err).Message(),
-						output,
-					)
-					return err
-				}
-				SuccessOutput(
-					map[string]string{"Result": "Node deleted"},
-					"Node deleted",
-					output,
-				)
-				return nil
-			})
-			if err != nil {
 				return
 			}
+			if err != nil {
+				ErrorOutput(
+					err,
+					"Error deleting node: "+status.Convert(err).Message(),
+					output,
+				)
+			}
+			SuccessOutput(
+				map[string]string{"Result": "Node deleted"},
+				"Node deleted",
+				output,
+			)
 		} else {
 			SuccessOutput(map[string]string{"Result": "Node not deleted"}, "Node not deleted", output)
 		}
@@ -491,71 +433,64 @@ var moveNodeCmd = &cobra.Command{
 	Short:   "Move node to another user",
 	Aliases: []string{"mv"},
 	Run: func(cmd *cobra.Command, args []string) {
-		output := GetOutputFlag(cmd)
+		output, _ := cmd.Flags().GetString("output")

-		identifier, err := GetNodeIdentifier(cmd)
+		identifier, err := cmd.Flags().GetUint64("identifier")
 		if err != nil {
 			ErrorOutput(
 				err,
-				fmt.Sprintf("Error getting node identifier: %s", err),
+				fmt.Sprintf("Error converting ID to integer: %s", err),
 				output,
 			)
-			return
 		}

-		userID, err := GetUserIdentifier(cmd)
+		user, err := cmd.Flags().GetUint64("user")
 		if err != nil {
 			ErrorOutput(
 				err,
 				fmt.Sprintf("Error getting user: %s", err),
 				output,
 			)
-			return
 		}

-		err = WithClient(func(ctx context.Context, client v1.HeadscaleServiceClient) error {
-			getRequest := &v1.GetNodeRequest{
-				NodeId: identifier,
-			}
+		ctx, client, conn, cancel := newHeadscaleCLIWithConfig()
+		defer cancel()
+		defer conn.Close()

-			_, err := client.GetNode(ctx, getRequest)
-			if err != nil {
-				ErrorOutput(
-					err,
-					"Error getting node: "+status.Convert(err).Message(),
-					output,
-				)
-				return err
-			}
+		getRequest := &v1.GetNodeRequest{
+			NodeId: identifier,
+		}

-			moveRequest := &v1.MoveNodeRequest{
-				NodeId: identifier,
-				User:   userID,
-			}
-
-			moveResponse, err := client.MoveNode(ctx, moveRequest)
-			if err != nil {
-				ErrorOutput(
-					err,
-					"Error moving node: "+status.Convert(err).Message(),
-					output,
-				)
-				return err
-			}
-
-			SuccessOutput(moveResponse.GetNode(), "Node moved to another user", output)
-			return nil
-		})
+		_, err = client.GetNode(ctx, getRequest)
 		if err != nil {
-			return
+			ErrorOutput(
+				err,
+				"Error getting node: "+status.Convert(err).Message(),
+				output,
+			)
 		}
+
+		moveRequest := &v1.MoveNodeRequest{
+			NodeId: identifier,
+			User:   user,
+		}
+
+		moveResponse, err := client.MoveNode(ctx, moveRequest)
+		if err != nil {
+			ErrorOutput(
+				err,
+				"Error moving node: "+status.Convert(err).Message(),
+				output,
+			)
+		}
+
+		SuccessOutput(moveResponse.GetNode(), "Node moved to another user", output)
 	},
 }

 var backfillNodeIPsCmd = &cobra.Command{
-	Use:     "backfill-ips",
-	Short:   "Backfill IPs missing from nodes",
-	Aliases: []string{"backfillips"},
+	Use:   "backfillips",
+	Short: "Backfill IPs missing from nodes",
 	Long: `
 Backfill IPs can be used to add/remove IPs from nodes
 based on the current configuration of Headscale.
@@ -569,35 +504,30 @@ If you remove IPv4 or IPv6 prefixes from the config,
 it can be run to remove the IPs that should no longer
 be assigned to nodes.`,
 	Run: func(cmd *cobra.Command, args []string) {
-		var err error
-		output := GetOutputFlag(cmd)
+		output, _ := cmd.Flags().GetString("output")

 		confirm := false
-		prompt := &survey.Confirm{
-			Message: "Are you sure that you want to assign/remove IPs to/from nodes?",
-		}
-		err = survey.AskOne(prompt, &confirm)
-		if err != nil {
-			return
-		}
-		if confirm {
-			err = WithClient(func(ctx context.Context, client v1.HeadscaleServiceClient) error {
-				changes, err := client.BackfillNodeIPs(ctx, &v1.BackfillNodeIPsRequest{Confirmed: confirm})
-				if err != nil {
-					ErrorOutput(
-						err,
-						"Error backfilling IPs: "+status.Convert(err).Message(),
-						output,
-					)
-					return err
-				}

-				SuccessOutput(changes, "Node IPs backfilled successfully", output)
-				return nil
-			})
+		force, _ := cmd.Flags().GetBool("force")
+		if !force {
+			confirm = util.YesNo("Are you sure that you want to assign/remove IPs to/from nodes?")
+		}
+
+		if confirm || force {
+			ctx, client, conn, cancel := newHeadscaleCLIWithConfig()
+			defer cancel()
+			defer conn.Close()
+
+			changes, err := client.BackfillNodeIPs(ctx, &v1.BackfillNodeIPsRequest{Confirmed: confirm || force})
 			if err != nil {
-				return
+				ErrorOutput(
+					err,
+					"Error backfilling IPs: "+status.Convert(err).Message(),
+					output,
+				)
 			}
+
+			SuccessOutput(changes, "Node IPs backfilled successfully", output)
 		}
 	},
 }
@@ -640,14 +570,14 @@ func nodesToPtables(
 		var lastSeenTime string
 		if node.GetLastSeen() != nil {
 			lastSeen = node.GetLastSeen().AsTime()
-			lastSeenTime = lastSeen.Format(HeadscaleDateTimeFormat)
+			lastSeenTime = lastSeen.Format("2006-01-02 15:04:05")
 		}

 		var expiry time.Time
 		var expiryTime string
 		if node.GetExpiry() != nil {
 			expiry = node.GetExpiry().AsTime()
-			expiryTime = expiry.Format(HeadscaleDateTimeFormat)
+			expiryTime = expiry.Format("2006-01-02 15:04:05")
 		} else {
 			expiryTime = "N/A"
 		}
@@ -780,17 +710,19 @@ var tagCmd = &cobra.Command{
 	Short:   "Manage the tags of a node",
 	Aliases: []string{"tags", "t"},
 	Run: func(cmd *cobra.Command, args []string) {
-		output := GetOutputFlag(cmd)
+		output, _ := cmd.Flags().GetString("output")
+		ctx, client, conn, cancel := newHeadscaleCLIWithConfig()
+		defer cancel()
+		defer conn.Close()

 		// retrieve flags from CLI
-		identifier, err := GetNodeIdentifier(cmd)
+		identifier, err := cmd.Flags().GetUint64("identifier")
 		if err != nil {
 			ErrorOutput(
 				err,
-				fmt.Sprintf("Error getting node identifier: %s", err),
+				fmt.Sprintf("Error converting ID to integer: %s", err),
 				output,
 			)
-			return
 		}
 		tagsToSet, err := cmd.Flags().GetStringSlice("tags")
 		if err != nil {
@@ -799,36 +731,28 @@ var tagCmd = &cobra.Command{
 				fmt.Sprintf("Error retrieving list of tags to add to node, %v", err),
 				output,
 			)
-			return
 		}

-		err = WithClient(func(ctx context.Context, client v1.HeadscaleServiceClient) error {
-			// Sending tags to node
-			request := &v1.SetTagsRequest{
-				NodeId: identifier,
-				Tags:   tagsToSet,
-			}
-			resp, err := client.SetTags(ctx, request)
-			if err != nil {
-				ErrorOutput(
-					err,
-					fmt.Sprintf("Error while sending tags to headscale: %s", err),
-					output,
-				)
-				return err
-			}
-
-			if resp != nil {
-				SuccessOutput(
-					resp.GetNode(),
-					"Node updated",
-					output,
-				)
-			}
-			return nil
-		})
+		// Sending tags to node
+		request := &v1.SetTagsRequest{
+			NodeId: identifier,
+			Tags:   tagsToSet,
+		}
+		resp, err := client.SetTags(ctx, request)
 		if err != nil {
-			return
+			ErrorOutput(
+				err,
+				fmt.Sprintf("Error while sending tags to headscale: %s", err),
+				output,
+			)
+		}
+
+		if resp != nil {
+			SuccessOutput(
+				resp.GetNode(),
+				"Node updated",
+				output,
+			)
 		}
 	},
 }
@@ -837,17 +761,19 @@ var approveRoutesCmd = &cobra.Command{
 	Use:   "approve-routes",
 	Short: "Manage the approved routes of a node",
 	Run: func(cmd *cobra.Command, args []string) {
-		output := GetOutputFlag(cmd)
+		output, _ := cmd.Flags().GetString("output")
+		ctx, client, conn, cancel := newHeadscaleCLIWithConfig()
+		defer cancel()
+		defer conn.Close()

 		// retrieve flags from CLI
-		identifier, err := GetNodeIdentifier(cmd)
+		identifier, err := cmd.Flags().GetUint64("identifier")
 		if err != nil {
 			ErrorOutput(
 				err,
-				fmt.Sprintf("Error getting node identifier: %s", err),
+				fmt.Sprintf("Error converting ID to integer: %s", err),
 				output,
 			)
-			return
 		}
 		routes, err := cmd.Flags().GetStringSlice("routes")
 		if err != nil {
@@ -856,36 +782,28 @@ var approveRoutesCmd = &cobra.Command{
 				fmt.Sprintf("Error retrieving list of routes to add to node, %v", err),
 				output,
 			)
-			return
 		}

-		err = WithClient(func(ctx context.Context, client v1.HeadscaleServiceClient) error {
-			// Sending routes to node
-			request := &v1.SetApprovedRoutesRequest{
-				NodeId: identifier,
-				Routes: routes,
-			}
-			resp, err := client.SetApprovedRoutes(ctx, request)
-			if err != nil {
-				ErrorOutput(
-					err,
-					fmt.Sprintf("Error while sending routes to headscale: %s", err),
-					output,
-				)
-				return err
-			}
-
-			if resp != nil {
-				SuccessOutput(
-					resp.GetNode(),
-					"Node updated",
-					output,
-				)
-			}
-			return nil
-		})
+		// Sending routes to node
+		request := &v1.SetApprovedRoutesRequest{
+			NodeId: identifier,
+			Routes: routes,
+		}
+		resp, err := client.SetApprovedRoutes(ctx, request)
 		if err != nil {
-			return
+			ErrorOutput(
+				err,
+				fmt.Sprintf("Error while sending routes to headscale: %s", err),
+				output,
+			)
+		}
+
+		if resp != nil {
+			SuccessOutput(
+				resp.GetNode(),
+				"Node updated",
+				output,
+			)
 		}
 	},
 }
--- a/cmd/headscale/cli/policy.go
+++ b/cmd/headscale/cli/policy.go
@@ -1,27 +1,35 @@
 package cli

 import (
-	"context"
 	"fmt"
 	"io"
 	"os"

 	v1 "github.com/juanfont/headscale/gen/go/headscale/v1"
+	"github.com/juanfont/headscale/hscontrol/db"
 	"github.com/juanfont/headscale/hscontrol/policy"
 	"github.com/juanfont/headscale/hscontrol/types"
+	"github.com/juanfont/headscale/hscontrol/util"
 	"github.com/rs/zerolog/log"
 	"github.com/spf13/cobra"
 	"tailscale.com/types/views"
 )

+const (
+	bypassFlag = "bypass-grpc-and-access-database-directly"
+)
+
 func init() {
 	rootCmd.AddCommand(policyCmd)
+
+	getPolicy.Flags().BoolP(bypassFlag, "", false, "Use the headscale config to access the database directly, bypassing gRPC; does not require the server to be running")
 	policyCmd.AddCommand(getPolicy)

 	setPolicy.Flags().StringP("file", "f", "", "Path to a policy file in HuJSON format")
 	if err := setPolicy.MarkFlagRequired("file"); err != nil {
 		log.Fatal().Err(err).Msg("")
 	}
+	setPolicy.Flags().BoolP(bypassFlag, "", false, "Use the headscale config to access the database directly, bypassing gRPC; does not require the server to be running")
 	policyCmd.AddCommand(setPolicy)

 	checkPolicy.Flags().StringP("file", "f", "", "Path to a policy file in HuJSON format")
@@ -41,26 +49,59 @@ var getPolicy = &cobra.Command{
 	Short:   "Print the current ACL Policy",
 	Aliases: []string{"show", "view", "fetch"},
 	Run: func(cmd *cobra.Command, args []string) {
-		output := GetOutputFlag(cmd)
+		output, _ := cmd.Flags().GetString("output")
+		var policy string
+		if bypass, _ := cmd.Flags().GetBool(bypassFlag); bypass {
+			confirm := false
+			force, _ := cmd.Flags().GetBool("force")
+			if !force {
+				confirm = util.YesNo("DO NOT run this command if an instance of headscale is running. Are you sure headscale is not running?")
+			}
+
+			if !confirm && !force {
+				ErrorOutput(nil, "Aborting command", output)
+				return
+			}
+
+			cfg, err := types.LoadServerConfig()
+			if err != nil {
+				ErrorOutput(err, fmt.Sprintf("Failed loading config: %s", err), output)
+			}
+
+			d, err := db.NewHeadscaleDatabase(
+				cfg.Database,
+				cfg.BaseDomain,
+				nil,
+			)
+			if err != nil {
+				ErrorOutput(err, fmt.Sprintf("Failed to open database: %s", err), output)
+			}
+
+			pol, err := d.GetPolicy()
+			if err != nil {
+				ErrorOutput(err, fmt.Sprintf("Failed loading Policy from database: %s", err), output)
+			}
+
+			policy = pol.Data
+		} else {
+			ctx, client, conn, cancel := newHeadscaleCLIWithConfig()
+			defer cancel()
+			defer conn.Close()

-		err := WithClient(func(ctx context.Context, client v1.HeadscaleServiceClient) error {
 			request := &v1.GetPolicyRequest{}

 			response, err := client.GetPolicy(ctx, request)
 			if err != nil {
 				ErrorOutput(err, fmt.Sprintf("Failed loading ACL Policy: %s", err), output)
-				return err
 			}

-			// TODO(pallabpain): Maybe print this better?
-			// This does not pass output as we dont support yaml, json or json-line
-			// output for this command. It is HuJSON already.
-			SuccessOutput("", response.GetPolicy(), "")
-			return nil
-		})
-		if err != nil {
-			return
+			policy = response.GetPolicy()
 		}
+
+		// TODO(pallabpain): Maybe print this better?
+		// This does not pass output as we don't support yaml, json or json-line
+		// output for this command. It is HuJSON already.
+		SuccessOutput("", policy, "")
 	},
 }

@@ -72,57 +113,18 @@ var setPolicy = &cobra.Command{
 	This command only works when the acl.policy_mode is set to "db", and the policy will be stored in the database.`,
 	Aliases: []string{"put", "update"},
 	Run: func(cmd *cobra.Command, args []string) {
-		output := GetOutputFlag(cmd)
+		output, _ := cmd.Flags().GetString("output")
 		policyPath, _ := cmd.Flags().GetString("file")

 		f, err := os.Open(policyPath)
 		if err != nil {
 			ErrorOutput(err, fmt.Sprintf("Error opening the policy file: %s", err), output)
-			return
 		}
 		defer f.Close()

 		policyBytes, err := io.ReadAll(f)
 		if err != nil {
 			ErrorOutput(err, fmt.Sprintf("Error reading the policy file: %s", err), output)
-			return
-		}
-
-		request := &v1.SetPolicyRequest{Policy: string(policyBytes)}
-
-		err = WithClient(func(ctx context.Context, client v1.HeadscaleServiceClient) error {
-			if _, err := client.SetPolicy(ctx, request); err != nil {
-				ErrorOutput(err, fmt.Sprintf("Failed to set ACL Policy: %s", err), output)
-				return err
-			}
-
-			SuccessOutput(nil, "Policy updated.", "")
-			return nil
-		})
-		if err != nil {
-			return
-		}
-	},
-}
-
-var checkPolicy = &cobra.Command{
-	Use:   "check",
-	Short: "Check the Policy file for errors",
-	Run: func(cmd *cobra.Command, args []string) {
-		output := GetOutputFlag(cmd)
-		policyPath, _ := cmd.Flags().GetString("file")
-
-		f, err := os.Open(policyPath)
-		if err != nil {
-			ErrorOutput(err, fmt.Sprintf("Error opening the policy file: %s", err), output)
-			return
-		}
-		defer f.Close()
-
-		policyBytes, err := io.ReadAll(f)
-		if err != nil {
-			ErrorOutput(err, fmt.Sprintf("Error reading the policy file: %s", err), output)
-			return
 		}

 		_, err = policy.NewPolicyManager(policyBytes, nil, views.Slice[types.NodeView]{})
@@ -131,6 +133,75 @@ var checkPolicy = &cobra.Command{
 			return
 		}

+		if bypass, _ := cmd.Flags().GetBool(bypassFlag); bypass {
+			confirm := false
+			force, _ := cmd.Flags().GetBool("force")
+			if !force {
+				confirm = util.YesNo("DO NOT run this command if an instance of headscale is running. Are you sure headscale is not running?")
+			}
+
+			if !confirm && !force {
+				ErrorOutput(nil, "Aborting command", output)
+				return
+			}
+
+			cfg, err := types.LoadServerConfig()
+			if err != nil {
+				ErrorOutput(err, fmt.Sprintf("Failed loading config: %s", err), output)
+			}
+
+			d, err := db.NewHeadscaleDatabase(
+				cfg.Database,
+				cfg.BaseDomain,
+				nil,
+			)
+			if err != nil {
+				ErrorOutput(err, fmt.Sprintf("Failed to open database: %s", err), output)
+			}
+
+			_, err = d.SetPolicy(string(policyBytes))
+			if err != nil {
+				ErrorOutput(err, fmt.Sprintf("Failed to set ACL Policy: %s", err), output)
+			}
+		} else {
+			request := &v1.SetPolicyRequest{Policy: string(policyBytes)}
+
+			ctx, client, conn, cancel := newHeadscaleCLIWithConfig()
+			defer cancel()
+			defer conn.Close()
+
+			if _, err := client.SetPolicy(ctx, request); err != nil {
+				ErrorOutput(err, fmt.Sprintf("Failed to set ACL Policy: %s", err), output)
+			}
+		}
+
+		SuccessOutput(nil, "Policy updated.", "")
+	},
+}
+
+var checkPolicy = &cobra.Command{
+	Use:   "check",
+	Short: "Check the Policy file for errors",
+	Run: func(cmd *cobra.Command, args []string) {
+		output, _ := cmd.Flags().GetString("output")
+		policyPath, _ := cmd.Flags().GetString("file")
+
+		f, err := os.Open(policyPath)
+		if err != nil {
+			ErrorOutput(err, fmt.Sprintf("Error opening the policy file: %s", err), output)
+		}
+		defer f.Close()
+
+		policyBytes, err := io.ReadAll(f)
+		if err != nil {
+			ErrorOutput(err, fmt.Sprintf("Error reading the policy file: %s", err), output)
+		}
+
+		_, err = policy.NewPolicyManager(policyBytes, nil, views.Slice[types.NodeView]{})
+		if err != nil {
+			ErrorOutput(err, fmt.Sprintf("Error parsing the policy file: %s", err), output)
+		}
+
 		SuccessOutput(nil, "Policy is valid", "")
 	},
 }
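Review note: both `get` and `set` guard the database bypass with the same "confirm or force" gate. The sketch below isolates that decision so the three outcomes are easy to see. `shouldProceed` and the `ask` callback are hypothetical names used only for illustration; they are not part of headscale, and `ask` merely stands in for a yes/no prompt such as `util.YesNo`.

```go
package main

import "fmt"

// shouldProceed is a hypothetical helper (not headscale API). It mirrors the
// gate in the diff: --force skips the prompt, otherwise the prompt decides.
func shouldProceed(force bool, ask func() bool) bool {
	if force {
		return true
	}
	return ask()
}

func main() {
	declined := func() bool { return false }
	accepted := func() bool { return true }

	fmt.Println(shouldProceed(true, declined))  // forced: prompt is never consulted
	fmt.Println(shouldProceed(false, accepted)) // user confirmed
	fmt.Println(shouldProceed(false, declined)) // user declined: caller should abort
}
```

Folding the gate into one function would let `get`, `set`, and `backfillips` share it instead of repeating the `confirm`/`force` bookkeeping, though that is a suggestion rather than something this change does.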
--- a/cmd/headscale/cli/preauthkeys.go
+++ b/cmd/headscale/cli/preauthkeys.go
@@ -1,7 +1,6 @@
 package cli

 import (
-	"context"
 	"fmt"
 	"strconv"
 	"strings"
@@ -15,10 +14,19 @@ import (
 	"google.golang.org/protobuf/types/known/timestamppb"
 )

+const (
+	DefaultPreAuthKeyExpiry = "1h"
+)
+
 func init() {
 	rootCmd.AddCommand(preauthkeysCmd)
 	preauthkeysCmd.PersistentFlags().Uint64P("user", "u", 0, "User identifier (ID)")

+	preauthkeysCmd.PersistentFlags().StringP("namespace", "n", "", "User")
+	pakNamespaceFlag := preauthkeysCmd.PersistentFlags().Lookup("namespace")
+	pakNamespaceFlag.Deprecated = deprecateNamespaceMessage
+	pakNamespaceFlag.Hidden = true
+
 	err := preauthkeysCmd.MarkPersistentFlagRequired("user")
 	if err != nil {
 		log.Fatal().Err(err).Msg("")
@@ -47,85 +55,81 @@ var listPreAuthKeys = &cobra.Command{
 	Short:   "List the preauthkeys for this user",
 	Aliases: []string{"ls", "show"},
 	Run: func(cmd *cobra.Command, args []string) {
-		output := GetOutputFlag(cmd)
+		output, _ := cmd.Flags().GetString("output")

 		user, err := cmd.Flags().GetUint64("user")
 		if err != nil {
 			ErrorOutput(err, fmt.Sprintf("Error getting user: %s", err), output)
+		}
+
+		ctx, client, conn, cancel := newHeadscaleCLIWithConfig()
+		defer cancel()
+		defer conn.Close()
+
+		request := &v1.ListPreAuthKeysRequest{
+			User: user,
+		}
+
+		response, err := client.ListPreAuthKeys(ctx, request)
+		if err != nil {
+			ErrorOutput(
+				err,
+				fmt.Sprintf("Error getting the list of keys: %s", err),
+				output,
+			)
+
 			return
 		}

-		err = WithClient(func(ctx context.Context, client v1.HeadscaleServiceClient) error {
-			request := &v1.ListPreAuthKeysRequest{
-				User: user,
+		if output != "" {
+			SuccessOutput(response.GetPreAuthKeys(), "", output)
+		}
+
+		tableData := pterm.TableData{
+			{
+				"ID",
+				"Key",
+				"Reusable",
+				"Ephemeral",
+				"Used",
+				"Expiration",
+				"Created",
+				"Tags",
+			},
+		}
+		for _, key := range response.GetPreAuthKeys() {
+			expiration := "-"
+			if key.GetExpiration() != nil {
+				expiration = ColourTime(key.GetExpiration().AsTime())
 			}

-			response, err := client.ListPreAuthKeys(ctx, request)
-			if err != nil {
-				ErrorOutput(
-					err,
-					fmt.Sprintf("Error getting the list of keys: %s", err),
-					output,
-				)
-				return err
+			aclTags := ""
+
+			for _, tag := range key.GetAclTags() {
+				aclTags += "," + tag
 			}

-			if output != "" {
-				SuccessOutput(response.GetPreAuthKeys(), "", output)
-				return nil
-			}
+			aclTags = strings.TrimLeft(aclTags, ",")

-			tableData := pterm.TableData{
-				{
-					"ID",
-					"Key",
-					"Reusable",
-					"Ephemeral",
-					"Used",
-					"Expiration",
-					"Created",
-					"Tags",
-				},
-			}
-			for _, key := range response.GetPreAuthKeys() {
-				expiration := "-"
-				if key.GetExpiration() != nil {
-					expiration = ColourTime(key.GetExpiration().AsTime())
-				}
+			tableData = append(tableData, []string{
+				strconv.FormatUint(key.GetId(), 10),
+				key.GetKey(),
+				strconv.FormatBool(key.GetReusable()),
+				strconv.FormatBool(key.GetEphemeral()),
+				strconv.FormatBool(key.GetUsed()),
+				expiration,
+				key.GetCreatedAt().AsTime().Format("2006-01-02 15:04:05"),
+				aclTags,
+			})

-				aclTags := ""
-
-				for _, tag := range key.GetAclTags() {
-					aclTags += "," + tag
-				}
-
-				aclTags = strings.TrimLeft(aclTags, ",")
-
-				tableData = append(tableData, []string{
-					strconv.FormatUint(key.GetId(), 10),
-					key.GetKey(),
-					strconv.FormatBool(key.GetReusable()),
-					strconv.FormatBool(key.GetEphemeral()),
-					strconv.FormatBool(key.GetUsed()),
-					expiration,
-					key.GetCreatedAt().AsTime().Format(HeadscaleDateTimeFormat),
-					aclTags,
-				})
-
-			}
-			err = pterm.DefaultTable.WithHasHeader().WithData(tableData).Render()
-			if err != nil {
-				ErrorOutput(
-					err,
-					fmt.Sprintf("Failed to render pterm table: %s", err),
-					output,
-				)
-				return err
-			}
-			return nil
-		})
+		}
+		err = pterm.DefaultTable.WithHasHeader().WithData(tableData).Render()
 		if err != nil {
-			return
+			ErrorOutput(
+				err,
+				fmt.Sprintf("Failed to render pterm table: %s", err),
+				output,
+			)
 		}
 	},
 }
@@ -135,12 +139,11 @@ var createPreAuthKeyCmd = &cobra.Command{
 	Short:   "Creates a new preauthkey in the specified user",
 	Aliases: []string{"c", "new"},
 	Run: func(cmd *cobra.Command, args []string) {
-		output := GetOutputFlag(cmd)
+		output, _ := cmd.Flags().GetString("output")

 		user, err := cmd.Flags().GetUint64("user")
 		if err != nil {
 			ErrorOutput(err, fmt.Sprintf("Error getting user: %s", err), output)
-			return
 		}

 		reusable, _ := cmd.Flags().GetBool("reusable")
@@ -163,7 +166,6 @@ var createPreAuthKeyCmd = &cobra.Command{
 				fmt.Sprintf("Could not parse duration: %s\n", err),
 				output,
 			)
-			return
 		}

 		expiration := time.Now().UTC().Add(time.Duration(duration))
@@ -174,23 +176,20 @@ var createPreAuthKeyCmd = &cobra.Command{

 		request.Expiration = timestamppb.New(expiration)

-		err = WithClient(func(ctx context.Context, client v1.HeadscaleServiceClient) error {
-			response, err := client.CreatePreAuthKey(ctx, request)
-			if err != nil {
-				ErrorOutput(
-					err,
-					fmt.Sprintf("Cannot create Pre Auth Key: %s\n", err),
-					output,
-				)
-				return err
-			}
+		ctx, client, conn, cancel := newHeadscaleCLIWithConfig()
+		defer cancel()
+		defer conn.Close()

-			SuccessOutput(response.GetPreAuthKey(), response.GetPreAuthKey().GetKey(), output)
-			return nil
-		})
+		response, err := client.CreatePreAuthKey(ctx, request)
 		if err != nil {
-			return
+			ErrorOutput(
+				err,
+				fmt.Sprintf("Cannot create Pre Auth Key: %s\n", err),
+				output,
+			)
 		}
+
+		SuccessOutput(response.GetPreAuthKey(), response.GetPreAuthKey().GetKey(), output)
 	},
 }

@@ -206,34 +205,30 @@ var expirePreAuthKeyCmd = &cobra.Command{
 		return nil
 	},
 	Run: func(cmd *cobra.Command, args []string) {
-		output := GetOutputFlag(cmd)
+		output, _ := cmd.Flags().GetString("output")
 		user, err := cmd.Flags().GetUint64("user")
 		if err != nil {
 			ErrorOutput(err, fmt.Sprintf("Error getting user: %s", err), output)
-			return
 		}

-		err = WithClient(func(ctx context.Context, client v1.HeadscaleServiceClient) error {
-			request := &v1.ExpirePreAuthKeyRequest{
-				User: user,
-				Key:  args[0],
-			}
+		ctx, client, conn, cancel := newHeadscaleCLIWithConfig()
+		defer cancel()
+		defer conn.Close()

-			response, err := client.ExpirePreAuthKey(ctx, request)
-			if err != nil {
-				ErrorOutput(
-					err,
-					fmt.Sprintf("Cannot expire Pre Auth Key: %s\n", err),
-					output,
-				)
-				return err
-			}
+		request := &v1.ExpirePreAuthKeyRequest{
+			User: user,
+			Key:  args[0],
+		}

-			SuccessOutput(response, "Key expired", output)
-			return nil
-		})
+		response, err := client.ExpirePreAuthKey(ctx, request)
 		if err != nil {
-			return
+			ErrorOutput(
+				err,
+				fmt.Sprintf("Cannot expire Pre Auth Key: %s\n", err),
+				output,
+			)
 		}
+
+		SuccessOutput(response, "Key expired", output)
 	},
 }
--- a/cmd/headscale/cli/pterm_style.go
+++ b/cmd/headscale/cli/pterm_style.go
@@ -7,7 +7,7 @@ import (
 )

 func ColourTime(date time.Time) string {
-	dateStr := date.Format(HeadscaleDateTimeFormat)
+	dateStr := date.Format("2006-01-02 15:04:05")

 	if date.After(time.Now()) {
 		dateStr = pterm.LightGreen(dateStr)
--- a/cmd/headscale/cli/root.go
+++ b/cmd/headscale/cli/root.go
@@ -14,6 +14,10 @@ import (
 	"github.com/tcnksm/go-latest"
 )

+const (
+	deprecateNamespaceMessage = "use --user"
+)
+
 var cfgFile string = ""

 func init() {
@@ -67,19 +71,20 @@ func initConfig() {

 	disableUpdateCheck := viper.GetBool("disable_check_updates")
 	if !disableUpdateCheck && !machineOutput {
+		versionInfo := types.GetVersionInfo()
 		if (runtime.GOOS == "linux" || runtime.GOOS == "darwin") &&
-			types.Version != "dev" {
+			!versionInfo.Dirty {
 			githubTag := &latest.GithubTag{
 				Owner:      "juanfont",
 				Repository: "headscale",
 			}
-			res, err := latest.Check(githubTag, types.Version)
+			res, err := latest.Check(githubTag, versionInfo.Version)
 			if err == nil && res.Outdated {
 				//nolint
 				log.Warn().Msgf(
 					"An updated version of Headscale has been found (%s vs. your current %s). Check it out https://github.com/juanfont/headscale/releases\n",
 					res.Current,
-					types.Version,
+					versionInfo.Version,
 				)
 			}
 		}
--- a/cmd/headscale/cli/serve_test.go
+++ b/cmd/headscale/cli/serve_test.go
@@ -1,70 +0,0 @@
-package cli
-
-import (
-	"testing"
-
-	"github.com/stretchr/testify/assert"
-	"github.com/stretchr/testify/require"
-)
-
-func TestServeCommand(t *testing.T) {
-	// Test that the serve command exists and is properly configured
-	assert.NotNil(t, serveCmd)
-	assert.Equal(t, "serve", serveCmd.Use)
-	assert.Equal(t, "Launches the headscale server", serveCmd.Short)
-	assert.NotNil(t, serveCmd.Run)
-	assert.NotNil(t, serveCmd.Args)
-}
-
-func TestServeCommandInRootCommand(t *testing.T) {
-	// Test that serve is available as a subcommand of root
-	cmd, _, err := rootCmd.Find([]string{"serve"})
-	require.NoError(t, err)
-	assert.Equal(t, "serve", cmd.Name())
-	assert.Equal(t, serveCmd, cmd)
-}
-
-func TestServeCommandArgs(t *testing.T) {
-	// Test that the Args function is defined and accepts any arguments
-	// The current implementation always returns nil (accepts any args)
-	assert.NotNil(t, serveCmd.Args)
-
-	// Test the args function directly
-	err := serveCmd.Args(serveCmd, []string{})
-	assert.NoError(t, err, "Args function should accept empty arguments")
-
-	err = serveCmd.Args(serveCmd, []string{"extra", "args"})
-	assert.NoError(t, err, "Args function should accept extra arguments")
-}
-
-func TestServeCommandHelp(t *testing.T) {
-	// Test that the command has proper help text
-	assert.NotEmpty(t, serveCmd.Short)
-	assert.Contains(t, serveCmd.Short, "server")
-	assert.Contains(t, serveCmd.Short, "headscale")
-}
-
-func TestServeCommandStructure(t *testing.T) {
-	// Test basic command structure
-	assert.Equal(t, "serve", serveCmd.Name())
-	assert.Equal(t, "Launches the headscale server", serveCmd.Short)
-
-	// Test that it has no subcommands (it's a leaf command)
-	subcommands := serveCmd.Commands()
-	assert.Empty(t, subcommands, "Serve command should not have subcommands")
-}
-
-// Note: We can't easily test the actual execution of serve because:
-// 1. It depends on configuration files being present and valid
-// 2. It calls log.Fatal() which would exit the test process
-// 3. It tries to start an actual HTTP server which would block forever
-// 4. It requires database connections and other infrastructure
-//
-// In a real refactor, we would:
-// 1. Extract server initialization logic to a testable function
-// 2. Use dependency injection for configuration and dependencies
-// 3. Return errors instead of calling log.Fatal()
-// 4. Add graceful shutdown capabilities for testing
-// 5. Allow server startup to be cancelled via context
-//
-// For now, we test the command structure and basic properties.
--- a/cmd/headscale/cli/table_filter.go
+++ b/cmd/headscale/cli/table_filter.go
@@ -1,55 +0,0 @@
-package cli
-
-import (
-	"strings"
-
-	"github.com/pterm/pterm"
-	"github.com/spf13/cobra"
-)
-
-const (
-	HeadscaleDateTimeFormat = "2006-01-02 15:04:05"
-	DefaultAPIKeyExpiry     = "90d"
-	DefaultPreAuthKeyExpiry = "1h"
-)
-
-// FilterTableColumns filters table columns based on --columns flag
-func FilterTableColumns(cmd *cobra.Command, tableData pterm.TableData) pterm.TableData {
-	columns, _ := cmd.Flags().GetString("columns")
-	if columns == "" || len(tableData) == 0 {
-		return tableData
-	}
-
-	headers := tableData[0]
-	wantedColumns := strings.Split(columns, ",")
-
-	// Find column indices
-	var indices []int
-	for _, wanted := range wantedColumns {
-		wanted = strings.TrimSpace(wanted)
-		for i, header := range headers {
-			if strings.EqualFold(header, wanted) {
-				indices = append(indices, i)
-				break
-			}
-		}
-	}
-
-	if len(indices) == 0 {
-		return tableData
-	}
-
-	// Filter all rows
-	filtered := make(pterm.TableData, len(tableData))
-	for i, row := range tableData {
-		newRow := make([]string, len(indices))
-		for j, idx := range indices {
-			if idx < len(row) {
-				newRow[j] = row[idx]
-			}
-		}
-		filtered[i] = newRow
-	}
-
-	return filtered
-}
--- a/cmd/headscale/cli/users.go
+++ b/cmd/headscale/cli/users.go
@@ -1,15 +1,13 @@
 package cli

 import (
-	"context"
 	"errors"
 	"fmt"
 	"net/url"
 	"strconv"
-	"strings"

-	survey "github.com/AlecAivazis/survey/v2"
 	v1 "github.com/juanfont/headscale/gen/go/headscale/v1"
+	"github.com/juanfont/headscale/hscontrol/util"
 	"github.com/pterm/pterm"
 	"github.com/rs/zerolog/log"
 	"github.com/spf13/cobra"
@@ -17,23 +15,25 @@ import (
 )

 func usernameAndIDFlag(cmd *cobra.Command) {
-	cmd.Flags().StringP("user", "u", "", "User identifier (ID, name, or email)")
+	cmd.Flags().Int64P("identifier", "i", -1, "User identifier (ID)")
 	cmd.Flags().StringP("name", "n", "", "Username")
 }

-// userIDFromFlag returns the user ID using smart lookup.
-// If no user is specified, it will exit the program with an error.
-func userIDFromFlag(cmd *cobra.Command) uint64 {
-	userID, err := GetUserIdentifier(cmd)
-	if err != nil {
+// usernameAndIDFromFlag returns the username and ID from the flags of the command.
+// If both are empty, it will exit the program with an error.
+func usernameAndIDFromFlag(cmd *cobra.Command) (uint64, string) {
+	username, _ := cmd.Flags().GetString("name")
+	identifier, _ := cmd.Flags().GetInt64("identifier")
+	if username == "" && identifier < 0 {
+		err := errors.New("--name or --identifier flag is required")
 		ErrorOutput(
 			err,
-			"Cannot identify user: "+err.Error(),
-			GetOutputFlag(cmd),
+			"Cannot rename user: "+status.Convert(err).Message(),
+			"",
 		)
 	}

-	return userID
+	return uint64(identifier), username
 }

 func init() {
@@ -43,18 +43,14 @@ func init() {
 	createUserCmd.Flags().StringP("email", "e", "", "Email")
 	createUserCmd.Flags().StringP("picture-url", "p", "", "Profile picture URL")
 	userCmd.AddCommand(listUsersCmd)
-	// Smart lookup filters - can be used individually or combined
-	listUsersCmd.Flags().StringP("user", "u", "", "Filter by user (ID, name, or email)")
-	listUsersCmd.Flags().Uint64P("id", "", 0, "Filter by user ID")
-	listUsersCmd.Flags().StringP("name", "n", "", "Filter by username")
-	listUsersCmd.Flags().StringP("email", "e", "", "Filter by email address")
-	listUsersCmd.Flags().String("columns", "", "Comma-separated list of columns to display (ID,Name,Username,Email,Created)")
+	usernameAndIDFlag(listUsersCmd)
+	listUsersCmd.Flags().StringP("email", "e", "", "Email")
 	userCmd.AddCommand(destroyUserCmd)
 	usernameAndIDFlag(destroyUserCmd)
 	userCmd.AddCommand(renameUserCmd)
 	usernameAndIDFlag(renameUserCmd)
 	renameUserCmd.Flags().StringP("new-name", "r", "", "New username")
-	renameUserCmd.MarkFlagRequired("new-name")
+	renameNodeCmd.MarkFlagRequired("new-name")
 }

 var errMissingParameter = errors.New("missing parameters")
@@ -77,9 +73,16 @@ var createUserCmd = &cobra.Command{
 		return nil
 	},
 	Run: func(cmd *cobra.Command, args []string) {
-		output := GetOutputFlag(cmd)
+		output, _ := cmd.Flags().GetString("output")
+
 		userName := args[0]

+		ctx, client, conn, cancel := newHeadscaleCLIWithConfig()
+		defer cancel()
+		defer conn.Close()
+
+		log.Trace().Interface("client", client).Msg("Obtained gRPC client")
+
 		request := &v1.CreateUserRequest{Name: userName}

 		if displayName, _ := cmd.Flags().GetString("display-name"); displayName != "" {
@@ -100,109 +103,82 @@ var createUserCmd = &cobra.Command{
 					),
 					output,
 				)
-				return
 			}
 			request.PictureUrl = pictureURL
 		}

-		err := WithClient(func(ctx context.Context, client v1.HeadscaleServiceClient) error {
-			log.Trace().Interface("client", client).Msg("Obtained gRPC client")
-			log.Trace().Interface("request", request).Msg("Sending CreateUser request")
-
-			response, err := client.CreateUser(ctx, request)
-			if err != nil {
-				ErrorOutput(
-					err,
-					"Cannot create user: "+status.Convert(err).Message(),
-					output,
-				)
-				return err
-			}
-
-			SuccessOutput(response.GetUser(), "User created", output)
-			return nil
-		})
+		log.Trace().Interface("request", request).Msg("Sending CreateUser request")
+		response, err := client.CreateUser(ctx, request)
 		if err != nil {
-			return
+			ErrorOutput(
+				err,
+				"Cannot create user: "+status.Convert(err).Message(),
+				output,
+			)
 		}
+
+		SuccessOutput(response.GetUser(), "User created", output)
 	},
 }

 var destroyUserCmd = &cobra.Command{
-	Use:     "destroy --user USER",
+	Use:     "destroy --identifier ID or --name NAME",
 	Short:   "Destroys a user",
 	Aliases: []string{"delete"},
 	Run: func(cmd *cobra.Command, args []string) {
-		output := GetOutputFlag(cmd)
+		output, _ := cmd.Flags().GetString("output")

-		id := userIDFromFlag(cmd)
+		id, username := usernameAndIDFromFlag(cmd)
 		request := &v1.ListUsersRequest{
-			Id: id,
+			Name: username,
+			Id:   id,
 		}

-		var user *v1.User
-		err := WithClient(func(ctx context.Context, client v1.HeadscaleServiceClient) error {
-			users, err := client.ListUsers(ctx, request)
-			if err != nil {
-				ErrorOutput(
-					err,
-					"Error: "+status.Convert(err).Message(),
-					output,
-				)
-				return err
-			}
+		ctx, client, conn, cancel := newHeadscaleCLIWithConfig()
+		defer cancel()
+		defer conn.Close()

-			if len(users.GetUsers()) != 1 {
-				err := errors.New("Unable to determine user to delete, query returned multiple users, use ID")
-				ErrorOutput(
-					err,
-					"Error: "+status.Convert(err).Message(),
-					output,
-				)
-				return err
-			}
-
-			user = users.GetUsers()[0]
-			return nil
-		})
+		users, err := client.ListUsers(ctx, request)
 		if err != nil {
-			return
+			ErrorOutput(
+				err,
+				"Error: "+status.Convert(err).Message(),
+				output,
+			)
 		}

+		if len(users.GetUsers()) != 1 {
+			err := errors.New("Unable to determine user to delete: query did not return exactly one user, use ID")
+			ErrorOutput(
+				err,
+				"Error: "+status.Convert(err).Message(),
+				output,
+			)
+		}
+
+		user := users.GetUsers()[0]
+
 		confirm := false
 		force, _ := cmd.Flags().GetBool("force")
 		if !force {
-			prompt := &survey.Confirm{
-				Message: fmt.Sprintf(
-					"Do you want to remove the user %q (%d) and any associated preauthkeys?",
-					user.GetName(), user.GetId(),
-				),
-			}
-			err := survey.AskOne(prompt, &confirm)
-			if err != nil {
-				return
-			}
+			confirm = util.YesNo(fmt.Sprintf(
+				"Do you want to remove the user %q (%d) and any associated preauthkeys?",
+				user.GetName(), user.GetId(),
+			))
 		}

 		if confirm || force {
-			err = WithClient(func(ctx context.Context, client v1.HeadscaleServiceClient) error {
-				request := &v1.DeleteUserRequest{Id: user.GetId()}
+			request := &v1.DeleteUserRequest{Id: user.GetId()}

-				response, err := client.DeleteUser(ctx, request)
-				if err != nil {
-					ErrorOutput(
-						err,
-						"Cannot destroy user: "+status.Convert(err).Message(),
-						output,
-					)
-					return err
-				}
-				SuccessOutput(response, "User destroyed", output)
-				return nil
-			})
+			response, err := client.DeleteUser(ctx, request)
 			if err != nil {
-				return
+				ErrorOutput(
+					err,
+					"Cannot destroy user: "+status.Convert(err).Message(),
+					output,
+				)
 			}
+			SuccessOutput(response, "User destroyed", output)
 		} else {
 			SuccessOutput(map[string]string{"Result": "User not destroyed"}, "User not destroyed", output)
 		}
@@ -214,76 +190,61 @@ var listUsersCmd = &cobra.Command{
 	Short:   "List all the users",
 	Aliases: []string{"ls", "show"},
 	Run: func(cmd *cobra.Command, args []string) {
-		output := GetOutputFlag(cmd)
+		output, _ := cmd.Flags().GetString("output")

-		err := WithClient(func(ctx context.Context, client v1.HeadscaleServiceClient) error {
-			request := &v1.ListUsersRequest{}
+		ctx, client, conn, cancel := newHeadscaleCLIWithConfig()
+		defer cancel()
+		defer conn.Close()

-			// Check for smart lookup flag first
-			userFlag, _ := cmd.Flags().GetString("user")
-			if userFlag != "" {
-				// Use smart lookup to determine filter type
-				if id, err := strconv.ParseUint(userFlag, 10, 64); err == nil && id > 0 {
-					request.Id = id
-				} else if strings.Contains(userFlag, "@") {
-					request.Email = userFlag
-				} else {
-					request.Name = userFlag
-				}
-			} else {
-				// Check specific filter flags
-				if id, _ := cmd.Flags().GetUint64("id"); id > 0 {
-					request.Id = id
-				} else if name, _ := cmd.Flags().GetString("name"); name != "" {
-					request.Name = name
-				} else if email, _ := cmd.Flags().GetString("email"); email != "" {
-					request.Email = email
-				}
-			}
+		request := &v1.ListUsersRequest{}

-			response, err := client.ListUsers(ctx, request)
-			if err != nil {
-				ErrorOutput(
-					err,
-					"Cannot get users: "+status.Convert(err).Message(),
-					output,
-				)
-				return err
-			}
+		id, _ := cmd.Flags().GetInt64("identifier")
+		username, _ := cmd.Flags().GetString("name")
+		email, _ := cmd.Flags().GetString("email")

-			if output != "" {
-				SuccessOutput(response.GetUsers(), "", output)
-				return nil
-			}
+		// filter by one param at most
+		switch {
+		case id > 0:
+			request.Id = uint64(id)
+		case username != "":
+			request.Name = username
+		case email != "":
+			request.Email = email
+		}

-			tableData := pterm.TableData{{"ID", "Name", "Username", "Email", "Created"}}
-			for _, user := range response.GetUsers() {
-				tableData = append(
-					tableData,
-					[]string{
-						strconv.FormatUint(user.GetId(), 10),
-						user.GetDisplayName(),
-						user.GetName(),
-						user.GetEmail(),
-						user.GetCreatedAt().AsTime().Format(HeadscaleDateTimeFormat),
-					},
-				)
-			}
-			tableData = FilterTableColumns(cmd, tableData)
-			err = pterm.DefaultTable.WithHasHeader().WithData(tableData).Render()
-			if err != nil {
-				ErrorOutput(
-					err,
-					fmt.Sprintf("Failed to render pterm table: %s", err),
-					output,
-				)
-				return err
-			}
-			return nil
-		})
+		response, err := client.ListUsers(ctx, request)
 		if err != nil {
-			// Error already handled in closure
-			return
+			ErrorOutput(
+				err,
+				"Cannot get users: "+status.Convert(err).Message(),
+				output,
+			)
+		}
+
+		if output != "" {
+			SuccessOutput(response.GetUsers(), "", output)
+		}
+
+		tableData := pterm.TableData{{"ID", "Name", "Username", "Email", "Created"}}
+		for _, user := range response.GetUsers() {
+			tableData = append(
+				tableData,
+				[]string{
+					strconv.FormatUint(user.GetId(), 10),
+					user.GetDisplayName(),
+					user.GetName(),
+					user.GetEmail(),
+					user.GetCreatedAt().AsTime().Format(HeadscaleDateTimeFormat),
+				},
+			)
+		}
+		err = pterm.DefaultTable.WithHasHeader().WithData(tableData).Render()
+		if err != nil {
+			ErrorOutput(
+				err,
+				fmt.Sprintf("Failed to render pterm table: %s", err),
+				output,
+			)
 		}
 	},
 }
@@ -293,56 +254,52 @@ var renameUserCmd = &cobra.Command{
 	Short:   "Renames a user",
 	Aliases: []string{"mv"},
 	Run: func(cmd *cobra.Command, args []string) {
-		output := GetOutputFlag(cmd)
+		output, _ := cmd.Flags().GetString("output")
+
+		ctx, client, conn, cancel := newHeadscaleCLIWithConfig()
+		defer cancel()
+		defer conn.Close()
+
+		id, username := usernameAndIDFromFlag(cmd)
+		listReq := &v1.ListUsersRequest{
+			Name: username,
+			Id:   id,
+		}
+
+		users, err := client.ListUsers(ctx, listReq)
+		if err != nil {
+			ErrorOutput(
+				err,
+				"Error: "+status.Convert(err).Message(),
+				output,
+			)
+		}
+
+		if len(users.GetUsers()) != 1 {
+			err := errors.New("Unable to determine user to rename, query returned multiple users, use ID")
+			ErrorOutput(
+				err,
+				"Error: "+status.Convert(err).Message(),
+				output,
+			)
+		}

-		id := userIDFromFlag(cmd)
 		newName, _ := cmd.Flags().GetString("new-name")

-		err := WithClient(func(ctx context.Context, client v1.HeadscaleServiceClient) error {
-			listReq := &v1.ListUsersRequest{
-				Id: id,
-			}
-
-			users, err := client.ListUsers(ctx, listReq)
-			if err != nil {
-				ErrorOutput(
-					err,
-					"Error: "+status.Convert(err).Message(),
-					output,
-				)
-				return err
-			}
-
-			if len(users.GetUsers()) != 1 {
-				err := errors.New("Unable to determine user to delete, query returned multiple users, use ID")
-				ErrorOutput(
-					err,
-					"Error: "+status.Convert(err).Message(),
-					output,
-				)
-				return err
-			}
-
-			renameReq := &v1.RenameUserRequest{
-				OldId:   id,
-				NewName: newName,
-			}
-
-			response, err := client.RenameUser(ctx, renameReq)
-			if err != nil {
-				ErrorOutput(
-					err,
-					"Cannot rename user: "+status.Convert(err).Message(),
-					output,
-				)
-				return err
-			}
-
-			SuccessOutput(response.GetUser(), "User renamed", output)
-			return nil
-		})
-		if err != nil {
-			return
+		renameReq := &v1.RenameUserRequest{
+			OldId:   id,
+			NewName: newName,
 		}
+
+		response, err := client.RenameUser(ctx, renameReq)
+		if err != nil {
+			ErrorOutput(
+				err,
+				"Cannot rename user: "+status.Convert(err).Message(),
+				output,
+			)
+		}
+
+		SuccessOutput(response.GetUser(), "User renamed", output)
 	},
 }
--- a/cmd/headscale/cli/utils.go
+++ b/cmd/headscale/cli/utils.go
@@ -5,23 +5,24 @@ import (
 	"crypto/tls"
 	"encoding/json"
 	"fmt"
-	"net"
 	"os"
-	"strconv"
-	"strings"

 	v1 "github.com/juanfont/headscale/gen/go/headscale/v1"
 	"github.com/juanfont/headscale/hscontrol"
 	"github.com/juanfont/headscale/hscontrol/types"
 	"github.com/juanfont/headscale/hscontrol/util"
 	"github.com/rs/zerolog/log"
-	"github.com/spf13/cobra"
 	"google.golang.org/grpc"
 	"google.golang.org/grpc/credentials"
 	"google.golang.org/grpc/credentials/insecure"
 	"gopkg.in/yaml.v3"
 )

+const (
+	HeadscaleDateTimeFormat = "2006-01-02 15:04:05"
+	SocketWritePermissions  = 0o666
+)
+
 func newHeadscaleServerWithConfig() (*hscontrol.Headscale, error) {
 	cfg, err := types.LoadServerConfig()
 	if err != nil {
@@ -71,7 +72,7 @@ func newHeadscaleCLIWithConfig() (context.Context, v1.HeadscaleServiceClient, *g

 		// Try to give the user better feedback if we cannot write to the headscale
 		// socket.
-		socket, err := os.OpenFile(cfg.UnixSocket, os.O_WRONLY, 0o666) // nolint
+		socket, err := os.OpenFile(cfg.UnixSocket, os.O_WRONLY, SocketWritePermissions) // nolint
 		if err != nil {
 			if os.IsPermission(err) {
 				log.Fatal().
@@ -168,7 +169,14 @@ func ErrorOutput(errResult error, override string, outputFormat string) {
 		Error string `json:"error"`
 	}

-	fmt.Fprintf(os.Stderr, "%s\n", output(errOutput{errResult.Error()}, override, outputFormat))
+	var errorMessage string
+	if errResult != nil {
+		errorMessage = errResult.Error()
+	} else {
+		errorMessage = override
+	}
+
+	fmt.Fprintf(os.Stderr, "%s\n", output(errOutput{errorMessage}, override, outputFormat))
 	os.Exit(1)
 }

@@ -199,152 +207,3 @@ func (t tokenAuth) GetRequestMetadata(
 func (tokenAuth) RequireTransportSecurity() bool {
 	return true
 }
-
-// GetOutputFlag returns the output flag value (never fails)
-func GetOutputFlag(cmd *cobra.Command) string {
-	output, _ := cmd.Flags().GetString("output")
-	return output
-}
-
-
-// GetNodeIdentifier returns the node ID using smart lookup via gRPC ListNodes call
-func GetNodeIdentifier(cmd *cobra.Command) (uint64, error) {
-	nodeFlag, _ := cmd.Flags().GetString("node")
-
-	// Use --node flag
-	if nodeFlag == "" {
-		return 0, fmt.Errorf("--node flag is required")
-	}
-
-	// Use smart lookup via gRPC
-	return lookupNodeBySpecifier(nodeFlag)
-}
-
-// lookupNodeBySpecifier performs smart lookup of a node by ID, name, hostname, or IP
-func lookupNodeBySpecifier(specifier string) (uint64, error) {
-	var nodeID uint64
-
-	err := WithClient(func(ctx context.Context, client v1.HeadscaleServiceClient) error {
-		request := &v1.ListNodesRequest{}
-
-		// Detect what type of specifier this is and set appropriate filter
-		if id, err := strconv.ParseUint(specifier, 10, 64); err == nil && id > 0 {
-			// Looks like a numeric ID
-			request.Id = id
-		} else if isIPAddress(specifier) {
-			// Looks like an IP address
-			request.IpAddresses = []string{specifier}
-		} else {
-			// Treat as hostname/name
-			request.Name = specifier
-		}
-
-		response, err := client.ListNodes(ctx, request)
-		if err != nil {
-			return fmt.Errorf("failed to lookup node: %w", err)
-		}
-
-		nodes := response.GetNodes()
-		if len(nodes) == 0 {
-			return fmt.Errorf("node not found")
-		}
-
-		if len(nodes) > 1 {
-			var nodeInfo []string
-			for _, node := range nodes {
-				nodeInfo = append(nodeInfo, fmt.Sprintf("ID=%d name=%s", node.GetId(), node.GetName()))
-			}
-			return fmt.Errorf("multiple nodes found matching '%s': %s", specifier, strings.Join(nodeInfo, ", "))
-		}
-
-		// Exactly one match - this is what we want
-		nodeID = nodes[0].GetId()
-		return nil
-	})
-	if err != nil {
-		return 0, err
-	}
-
-	return nodeID, nil
-}
-
-// isIPAddress checks if a string looks like an IP address
-func isIPAddress(s string) bool {
-	// Try parsing as IP address (both IPv4 and IPv6)
-	if net.ParseIP(s) != nil {
-		return true
-	}
-	// Try parsing as CIDR
-	if _, _, err := net.ParseCIDR(s); err == nil {
-		return true
-	}
-	return false
-}
-
-// GetUserIdentifier returns the user ID using smart lookup via gRPC ListUsers call
-func GetUserIdentifier(cmd *cobra.Command) (uint64, error) {
-	userFlag, _ := cmd.Flags().GetString("user")
-	nameFlag, _ := cmd.Flags().GetString("name")
-
-	var specifier string
-
-	// Determine which flag was used (prefer --user, fall back to legacy flags)
-	if userFlag != "" {
-		specifier = userFlag
-	} else if nameFlag != "" {
-		specifier = nameFlag
-	} else {
-		return 0, fmt.Errorf("--user flag is required")
-	}
-
-	// Use smart lookup via gRPC
-	return lookupUserBySpecifier(specifier)
-}
-
-// lookupUserBySpecifier performs smart lookup of a user by ID, name, or email
-func lookupUserBySpecifier(specifier string) (uint64, error) {
-	var userID uint64
-
-	err := WithClient(func(ctx context.Context, client v1.HeadscaleServiceClient) error {
-		request := &v1.ListUsersRequest{}
-
-		// Detect what type of specifier this is and set appropriate filter
-		if id, err := strconv.ParseUint(specifier, 10, 64); err == nil && id > 0 {
-			// Looks like a numeric ID
-			request.Id = id
-		} else if strings.Contains(specifier, "@") {
-			// Looks like an email address
-			request.Email = specifier
-		} else {
-			// Treat as username
-			request.Name = specifier
-		}
-
-		response, err := client.ListUsers(ctx, request)
-		if err != nil {
-			return fmt.Errorf("failed to lookup user: %w", err)
-		}
-
-		users := response.GetUsers()
-		if len(users) == 0 {
-			return fmt.Errorf("user not found")
-		}
-
-		if len(users) > 1 {
-			var userInfo []string
-			for _, user := range users {
-				userInfo = append(userInfo, fmt.Sprintf("ID=%d name=%s email=%s", user.GetId(), user.GetName(), user.GetEmail()))
-			}
-			return fmt.Errorf("multiple users found matching '%s': %s", specifier, strings.Join(userInfo, ", "))
-		}
-
-		// Exactly one match - this is what we want
-		userID = users[0].GetId()
-		return nil
-	})
-	if err != nil {
-		return 0, err
-	}
-
-	return userID, nil
-}
--- a/cmd/headscale/cli/utils_test.go
+++ b/cmd/headscale/cli/utils_test.go
@@ -1,175 +0,0 @@
-package cli
-
-import (
-	"os"
-	"testing"
-
-	"github.com/stretchr/testify/assert"
-)
-
-func TestHasMachineOutputFlag(t *testing.T) {
-	tests := []struct {
-		name     string
-		args     []string
-		expected bool
-	}{
-		{
-			name:     "no machine output flags",
-			args:     []string{"headscale", "users", "list"},
-			expected: false,
-		},
-		{
-			name:     "json flag present",
-			args:     []string{"headscale", "users", "list", "json"},
-			expected: true,
-		},
-		{
-			name:     "json-line flag present",
-			args:     []string{"headscale", "nodes", "list", "json-line"},
-			expected: true,
-		},
-		{
-			name:     "yaml flag present",
-			args:     []string{"headscale", "apikeys", "list", "yaml"},
-			expected: true,
-		},
-		{
-			name:     "mixed flags with json",
-			args:     []string{"headscale", "--config", "/tmp/config.yaml", "users", "list", "json"},
-			expected: true,
-		},
-		{
-			name:     "flag as part of longer argument",
-			args:     []string{"headscale", "users", "create", "json-user@example.com"},
-			expected: false,
-		},
-	}
-
-	for _, tt := range tests {
-		t.Run(tt.name, func(t *testing.T) {
-			// Save original os.Args
-			originalArgs := os.Args
-			defer func() { os.Args = originalArgs }()
-
-			// Set os.Args to test case
-			os.Args = tt.args
-
-			result := HasMachineOutputFlag()
-			assert.Equal(t, tt.expected, result)
-		})
-	}
-}
-
-func TestOutput(t *testing.T) {
-	tests := []struct {
-		name         string
-		result       interface{}
-		override     string
-		outputFormat string
-		expected     string
-	}{
-		{
-			name:         "default format returns override",
-			result:       map[string]string{"test": "value"},
-			override:     "Human readable output",
-			outputFormat: "",
-			expected:     "Human readable output",
-		},
-		{
-			name:         "default format with empty override",
-			result:       map[string]string{"test": "value"},
-			override:     "",
-			outputFormat: "",
-			expected:     "",
-		},
-		{
-			name:         "json format",
-			result:       map[string]string{"name": "test", "id": "123"},
-			override:     "Human readable",
-			outputFormat: "json",
-			expected:     "{\n\t\"id\": \"123\",\n\t\"name\": \"test\"\n}",
-		},
-		{
-			name:         "json-line format",
-			result:       map[string]string{"name": "test", "id": "123"},
-			override:     "Human readable",
-			outputFormat: "json-line",
-			expected:     "{\"id\":\"123\",\"name\":\"test\"}",
-		},
-		{
-			name:         "yaml format",
-			result:       map[string]string{"name": "test", "id": "123"},
-			override:     "Human readable",
-			outputFormat: "yaml",
-			expected:     "id: \"123\"\nname: test\n",
-		},
-		{
-			name:         "invalid format returns override",
-			result:       map[string]string{"test": "value"},
-			override:     "Human readable output",
-			outputFormat: "invalid",
-			expected:     "Human readable output",
-		},
-	}
-
-	for _, tt := range tests {
-		t.Run(tt.name, func(t *testing.T) {
-			result := output(tt.result, tt.override, tt.outputFormat)
-			assert.Equal(t, tt.expected, result)
-		})
-	}
-}
-
-func TestOutputWithComplexData(t *testing.T) {
-	// Test with more complex data structures
-	complexData := struct {
-		Users []struct {
-			Name string `json:"name" yaml:"name"`
-			ID   int    `json:"id" yaml:"id"`
-		} `json:"users" yaml:"users"`
-	}{
-		Users: []struct {
-			Name string `json:"name" yaml:"name"`
-			ID   int    `json:"id" yaml:"id"`
-		}{
-			{Name: "user1", ID: 1},
-			{Name: "user2", ID: 2},
-		},
-	}
-
-	// Test JSON output
-	jsonResult := output(complexData, "override", "json")
-	assert.Contains(t, jsonResult, "\"users\":")
-	assert.Contains(t, jsonResult, "\"name\": \"user1\"")
-	assert.Contains(t, jsonResult, "\"id\": 1")
-
-	// Test YAML output
-	yamlResult := output(complexData, "override", "yaml")
-	assert.Contains(t, yamlResult, "users:")
-	assert.Contains(t, yamlResult, "name: user1")
-	assert.Contains(t, yamlResult, "id: 1")
-}
-
-func TestOutputWithNilData(t *testing.T) {
-	// Test with nil data
-	result := output(nil, "fallback", "json")
-	assert.Equal(t, "null", result)
-
-	result = output(nil, "fallback", "yaml")
-	assert.Equal(t, "null\n", result)
-
-	result = output(nil, "fallback", "")
-	assert.Equal(t, "fallback", result)
-}
-
-func TestOutputWithEmptyData(t *testing.T) {
-	// Test with empty slice
-	emptySlice := []string{}
-	result := output(emptySlice, "fallback", "json")
-	assert.Equal(t, "[]", result)
-
-	// Test with empty map
-	emptyMap := map[string]string{}
-	result = output(emptyMap, "fallback", "json")
-	assert.Equal(t, "{}", result)
-}
--- a/cmd/headscale/cli/version.go
+++ b/cmd/headscale/cli/version.go
@@ -7,17 +7,18 @@ import (

 func init() {
 	rootCmd.AddCommand(versionCmd)
+	versionCmd.Flags().StringP("output", "o", "", "Output format. Empty for human-readable, 'json', 'json-line' or 'yaml'")
 }

 var versionCmd = &cobra.Command{
 	Use:   "version",
-	Short: "Print the version",
-	Long:  "The version of headscale",
+	Short: "Print the version.",
+	Long:  "The version of headscale.",
 	Run: func(cmd *cobra.Command, args []string) {
-		output := GetOutputFlag(cmd)
-		SuccessOutput(map[string]string{
-			"version": types.Version,
-			"commit":  types.GitCommitHash,
-		}, types.Version, output)
+		output, _ := cmd.Flags().GetString("output")
+
+		info := types.GetVersionInfo()
+
+		SuccessOutput(info, info.String(), output)
 	},
 }
--- a/cmd/headscale/cli/version_test.go
+++ b/cmd/headscale/cli/version_test.go
@@ -1,45 +0,0 @@
-package cli
-
-import (
-	"testing"
-
-	"github.com/stretchr/testify/assert"
-)
-
-func TestVersionCommand(t *testing.T) {
-	// Test that version command exists
-	assert.NotNil(t, versionCmd)
-	assert.Equal(t, "version", versionCmd.Use)
-	assert.Equal(t, "Print the version.", versionCmd.Short)
-	assert.Equal(t, "The version of headscale.", versionCmd.Long)
-}
-
-func TestVersionCommandStructure(t *testing.T) {
-	// Test command is properly added to root
-	found := false
-	for _, cmd := range rootCmd.Commands() {
-		if cmd.Use == "version" {
-			found = true
-			break
-		}
-	}
-	assert.True(t, found, "version command should be added to root command")
-}
-
-func TestVersionCommandFlags(t *testing.T) {
-	// Version command should inherit output flag from root as persistent flag
-	outputFlag := versionCmd.Flag("output")
-	if outputFlag == nil {
-		// Try persistent flags from root
-		outputFlag = rootCmd.PersistentFlags().Lookup("output")
-	}
-	assert.NotNil(t, outputFlag, "version command should have access to output flag")
-}
-
-func TestVersionCommandRun(t *testing.T) {
-	// Test that Run function is set
-	assert.NotNil(t, versionCmd.Run)
-
-	// We can't easily test the actual execution without mocking SuccessOutput
-	// but we can verify the function exists and has the right signature
-}
--- a/cmd/hi/docker.go
+++ b/cmd/hi/docker.go
@@ -90,6 +90,32 @@ func runTestContainer(ctx context.Context, config *RunConfig) error {

 	log.Printf("Starting test: %s", config.TestPattern)

+	// Start stats collection for container resource monitoring (if enabled)
+	var statsCollector *StatsCollector
+	if config.Stats {
+		var err error
+		statsCollector, err = NewStatsCollector()
+		if err != nil {
+			if config.Verbose {
+				log.Printf("Warning: failed to create stats collector: %v", err)
+			}
+			statsCollector = nil
+		}
+
+		if statsCollector != nil {
+			defer statsCollector.Close()
+
+			// Start stats collection immediately - no need for complex retry logic
+			// The new implementation monitors Docker events and will catch containers as they start
+			if err := statsCollector.StartCollection(ctx, runID, config.Verbose); err != nil {
+				if config.Verbose {
+					log.Printf("Warning: failed to start stats collection: %v", err)
+				}
+			}
+			defer statsCollector.StopCollection()
+		}
+	}
+
 	exitCode, err := streamAndWait(ctx, cli, resp.ID)

 	// Ensure all containers have finished and logs are flushed before extracting artifacts
@@ -105,6 +131,21 @@ func runTestContainer(ctx context.Context, config *RunConfig) error {
 	// Always list control files regardless of test outcome
 	listControlFiles(logsDir)

+	// Print stats summary and check memory limits if enabled
+	if config.Stats && statsCollector != nil {
+		violations := statsCollector.PrintSummaryAndCheckLimits(config.HSMemoryLimit, config.TSMemoryLimit)
+		if len(violations) > 0 {
+			log.Printf("MEMORY LIMIT VIOLATIONS DETECTED:")
+			log.Printf("=================================")
+			for _, violation := range violations {
+				log.Printf("Container %s exceeded memory limit: %.1f MB > %.1f MB",
+					violation.ContainerName, violation.MaxMemoryMB, violation.LimitMB)
+			}
+
+			return fmt.Errorf("test failed: %d container(s) exceeded memory limits", len(violations))
+		}
+	}
+
 	shouldCleanup := config.CleanAfter && (!config.KeepOnFailure || exitCode == 0)
 	if shouldCleanup {
 		if config.Verbose {
@@ -379,10 +420,37 @@ func getDockerSocketPath() string {
 	return "/var/run/docker.sock"
 }

-// ensureImageAvailable pulls the specified Docker image to ensure it's available.
+// checkImageAvailableLocally checks if the specified Docker image is available locally.
+func checkImageAvailableLocally(ctx context.Context, cli *client.Client, imageName string) (bool, error) {
+	_, _, err := cli.ImageInspectWithRaw(ctx, imageName)
+	if err != nil {
+		if client.IsErrNotFound(err) {
+			return false, nil
+		}
+		return false, fmt.Errorf("failed to inspect image %s: %w", imageName, err)
+	}
+
+	return true, nil
+}
+
+// ensureImageAvailable checks if the image is available locally first, then pulls if needed.
 func ensureImageAvailable(ctx context.Context, cli *client.Client, imageName string, verbose bool) error {
+	// First check if image is available locally
+	available, err := checkImageAvailableLocally(ctx, cli, imageName)
+	if err != nil {
+		return fmt.Errorf("failed to check local image availability: %w", err)
+	}
+
+	if available {
+		if verbose {
+			log.Printf("Image %s is available locally", imageName)
+		}
+		return nil
+	}
+
+	// Image not available locally, try to pull it
 	if verbose {
-		log.Printf("Pulling image %s...", imageName)
+		log.Printf("Image %s not found locally, pulling...", imageName)
 	}

 	reader, err := cli.ImagePull(ctx, imageName, image.PullOptions{})
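Review note: `ensureImageAvailable` now inspects the local image store first and only pulls when the image is missing, which lets tests run offline once the image has been fetched. The control flow can be sketched without a Docker daemon as follows. The `imageStore` interface, `errNotFound`, `ensureImage`, and `fakeStore` are hypothetical stand-ins for the Docker client calls, not real Docker APIs.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound is a hypothetical sentinel standing in for the Docker client's
// "image not found" condition.
var errNotFound = errors.New("image not found")

// imageStore is a hypothetical abstraction over the inspect and pull calls.
type imageStore interface {
	Inspect(name string) error
	Pull(name string) error
}

// ensureImage reports whether a pull was needed. It pulls only when the
// inspect step says the image is absent; other inspect errors are returned.
func ensureImage(store imageStore, name string) (bool, error) {
	err := store.Inspect(name)
	switch {
	case err == nil:
		return false, nil // already available locally
	case !errors.Is(err, errNotFound):
		return false, fmt.Errorf("failed to inspect image %s: %w", name, err)
	}

	if err := store.Pull(name); err != nil {
		return false, fmt.Errorf("failed to pull image %s: %w", name, err)
	}
	return true, nil
}

// fakeStore is an in-memory imageStore used only for demonstration.
type fakeStore struct {
	local map[string]bool
	pulls int
}

func (f *fakeStore) Inspect(name string) error {
	if f.local[name] {
		return nil
	}
	return errNotFound
}

func (f *fakeStore) Pull(name string) error {
	f.pulls++
	f.local[name] = true
	return nil
}

func main() {
	store := &fakeStore{local: map[string]bool{"golang:local": true}}

	pulled, err := ensureImage(store, "golang:local")
	fmt.Println(pulled, err, store.pulls)

	pulled, err = ensureImage(store, "golang:remote")
	fmt.Println(pulled, err, store.pulls)
}
```

Treating only the not-found case as a reason to pull keeps daemon or permission errors visible instead of masking them behind a pull attempt.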
--- a/cmd/hi/doctor.go
+++ b/cmd/hi/doctor.go
@@ -190,7 +190,7 @@ func checkDockerSocket(ctx context.Context) DoctorResult {
 	}
 }

-// checkGolangImage verifies we can access the golang Docker image.
+// checkGolangImage verifies the golang Docker image is available locally or can be pulled.
 func checkGolangImage(ctx context.Context) DoctorResult {
 	cli, err := createDockerClient()
 	if err != nil {
@@ -205,17 +205,40 @@ func checkGolangImage(ctx context.Context) DoctorResult {
 	goVersion := detectGoVersion()
 	imageName := "golang:" + goVersion

-	// Check if we can pull the image
+	// First check if image is available locally
+	available, err := checkImageAvailableLocally(ctx, cli, imageName)
+	if err != nil {
+		return DoctorResult{
+			Name:    "Golang Image",
+			Status:  "FAIL",
+			Message: fmt.Sprintf("Cannot check golang image %s: %v", imageName, err),
+			Suggestions: []string{
+				"Check Docker daemon status",
+				"Try: docker images | grep golang",
+			},
+		}
+	}
+
+	if available {
+		return DoctorResult{
+			Name:    "Golang Image",
+			Status:  "PASS",
+			Message: fmt.Sprintf("Golang image %s is available locally", imageName),
+		}
+	}
+
+	// Image not available locally, try to pull it
 	err = ensureImageAvailable(ctx, cli, imageName, false)
 	if err != nil {
 		return DoctorResult{
 			Name:    "Golang Image",
 			Status:  "FAIL",
-			Message: fmt.Sprintf("Cannot pull golang image %s: %v", imageName, err),
+			Message: fmt.Sprintf("Golang image %s not available locally and cannot pull: %v", imageName, err),
 			Suggestions: []string{
 				"Check internet connectivity",
 				"Verify Docker Hub access",
 				"Try: docker pull " + imageName,
+				"Or run tests offline if image was pulled previously",
 			},
 		}
 	}
@@ -223,7 +246,7 @@ func checkGolangImage(ctx context.Context) DoctorResult {
 	return DoctorResult{
 		Name:    "Golang Image",
 		Status:  "PASS",
-		Message: fmt.Sprintf("Golang image %s is available", imageName),
+		Message: fmt.Sprintf("Golang image %s is now available", imageName),
 	}
 }

--- a/cmd/hi/run.go
+++ b/cmd/hi/run.go
@@ -24,6 +24,9 @@ type RunConfig struct {
 	KeepOnFailure bool          `flag:"keep-on-failure,default=false,Keep containers on test failure"`
 	LogsDir       string        `flag:"logs-dir,default=control_logs,Control logs directory"`
 	Verbose       bool          `flag:"verbose,default=false,Verbose output"`
+	Stats         bool          `flag:"stats,default=false,Collect and display container resource usage statistics"`
+	HSMemoryLimit float64       `flag:"hs-memory-limit,default=0,Fail test if any Headscale container exceeds this memory limit in MB (0 = disabled)"`
+	TSMemoryLimit float64       `flag:"ts-memory-limit,default=0,Fail test if any Tailscale container exceeds this memory limit in MB (0 = disabled)"`
 }

 // runIntegrationTest executes the integration test workflow.
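Review note: the two new limit flags are documented as "0 = disabled" and apply separately to Headscale and Tailscale containers. The body of `PrintSummaryAndCheckLimits` is not shown in this part of the diff, so the following is only a sketch of one plausible check under stated assumptions: peak memory per container is already known, and container kind is decided by the `hs-` and `ts-` name prefixes. `violation` and `checkLimits` are hypothetical names; the field names follow those logged in `docker.go`.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// violation mirrors the fields logged in docker.go; the type name is hypothetical.
type violation struct {
	ContainerName string
	MaxMemoryMB   float64
	LimitMB       float64
}

// checkLimits compares each container's peak memory (in MB) against the limit
// for its kind. A limit of 0 disables the check for that kind.
func checkLimits(peakMB map[string]float64, hsLimitMB, tsLimitMB float64) []violation {
	var out []violation
	for name, peak := range peakMB {
		var limit float64
		switch {
		case strings.HasPrefix(name, "hs-"):
			limit = hsLimitMB
		case strings.HasPrefix(name, "ts-"):
			limit = tsLimitMB
		}
		if limit > 0 && peak > limit {
			out = append(out, violation{ContainerName: name, MaxMemoryMB: peak, LimitMB: limit})
		}
	}
	// Map iteration order is unspecified, so sort for stable reporting.
	sort.Slice(out, func(i, j int) bool {
		return out[i].ContainerName < out[j].ContainerName
	})
	return out
}

func main() {
	peaks := map[string]float64{"hs-abc": 512.5, "ts-1": 90, "ts-2": 300}
	for _, v := range checkLimits(peaks, 500, 0) {
		fmt.Printf("Container %s exceeded memory limit: %.1f MB > %.1f MB\n",
			v.ContainerName, v.MaxMemoryMB, v.LimitMB)
	}
}
```

With the Tailscale limit left at 0, only the `hs-abc` container is reported in this example, regardless of how much memory the `ts-` containers used.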
--- a/cmd/hi/stats.go
+++ b/cmd/hi/stats.go
@@ -0,0 +1,471 @@
+package main
+
+import (
+	"context"
+	"encoding/json"
+	"errors"
+	"fmt"
+	"log"
+	"sort"
+	"strings"
+	"sync"
+	"time"
+
+	"github.com/docker/docker/api/types"
+	"github.com/docker/docker/api/types/container"
+	"github.com/docker/docker/api/types/events"
+	"github.com/docker/docker/api/types/filters"
+	"github.com/docker/docker/client"
+)
+
+// ContainerStats represents statistics for a single container.
+type ContainerStats struct {
+	ContainerID   string
+	ContainerName string
+	Stats         []StatsSample
+	mutex         sync.RWMutex
+}
+
+// StatsSample represents a single stats measurement.
+type StatsSample struct {
+	Timestamp time.Time
+	CPUUsage  float64 // CPU usage percentage
+	MemoryMB  float64 // Memory usage in MB
+}
+
+// StatsCollector manages collection of container statistics.
+type StatsCollector struct {
+	client            *client.Client
+	containers        map[string]*ContainerStats
+	stopChan          chan struct{}
+	wg                sync.WaitGroup
+	mutex             sync.RWMutex
+	collectionStarted bool
+}
+
+// NewStatsCollector creates a new stats collector instance.
+func NewStatsCollector() (*StatsCollector, error) {
+	cli, err := createDockerClient()
+	if err != nil {
+		return nil, fmt.Errorf("failed to create Docker client: %w", err)
+	}
+
+	return &StatsCollector{
+		client:     cli,
+		containers: make(map[string]*ContainerStats),
+		stopChan:   make(chan struct{}),
+	}, nil
+}
+
+// StartCollection begins monitoring all containers and collecting stats for hs- and ts- containers with matching run ID.
+func (sc *StatsCollector) StartCollection(ctx context.Context, runID string, verbose bool) error {
+	sc.mutex.Lock()
+	defer sc.mutex.Unlock()
+
+	if sc.collectionStarted {
+		return errors.New("stats collection already started")
+	}
+
+	sc.collectionStarted = true
+
+	// Start monitoring existing containers
+	sc.wg.Add(1)
+	go sc.monitorExistingContainers(ctx, runID, verbose)
+
+	// Start Docker events monitoring for new containers
+	sc.wg.Add(1)
+	go sc.monitorDockerEvents(ctx, runID, verbose)
+
+	if verbose {
+		log.Printf("Started container monitoring for run ID %s", runID)
+	}
+
+	return nil
+}
+
+// StopCollection stops all stats collection.
+func (sc *StatsCollector) StopCollection() {
+	// Check if already stopped without holding lock
+	sc.mutex.RLock()
+	if !sc.collectionStarted {
+		sc.mutex.RUnlock()
+		return
+	}
+	sc.mutex.RUnlock()
+
+	// Signal stop to all goroutines
+	close(sc.stopChan)
+
+	// Wait for all goroutines to finish
+	sc.wg.Wait()
+
+	// Mark as stopped
+	sc.mutex.Lock()
+	sc.collectionStarted = false
+	sc.mutex.Unlock()
+}
+
+// monitorExistingContainers checks for existing containers that match our criteria.
+func (sc *StatsCollector) monitorExistingContainers(ctx context.Context, runID string, verbose bool) {
+	defer sc.wg.Done()
+
+	containers, err := sc.client.ContainerList(ctx, container.ListOptions{})
+	if err != nil {
+		if verbose {
+			log.Printf("Failed to list existing containers: %v", err)
+		}
+		return
+	}
+
+	for _, cont := range containers {
+		if sc.shouldMonitorContainer(cont, runID) {
+			sc.startStatsForContainer(ctx, cont.ID, cont.Names[0], verbose)
+		}
+	}
+}
+
+// monitorDockerEvents listens for container start events and begins monitoring relevant containers.
+func (sc *StatsCollector) monitorDockerEvents(ctx context.Context, runID string, verbose bool) {
+	defer sc.wg.Done()
+
+	filter := filters.NewArgs()
+	filter.Add("type", "container")
+	filter.Add("event", "start")
+
+	eventOptions := events.ListOptions{
+		Filters: filter,
+	}
+
+	events, errs := sc.client.Events(ctx, eventOptions)
+
+	for {
+		select {
+		case <-sc.stopChan:
+			return
+		case <-ctx.Done():
+			return
+		case event := <-events:
+			if event.Type == "container" && event.Action == "start" {
+				// Get container details
+				containerInfo, err := sc.client.ContainerInspect(ctx, event.ID)
+				if err != nil {
+					continue
+				}
+
+				// Convert to types.Container format for consistency
+				cont := types.Container{
+					ID:     containerInfo.ID,
+					Names:  []string{containerInfo.Name},
+					Labels: containerInfo.Config.Labels,
+				}
+
+				if sc.shouldMonitorContainer(cont, runID) {
+					sc.startStatsForContainer(ctx, cont.ID, cont.Names[0], verbose)
+				}
+			}
+		case err := <-errs:
+			if verbose {
+				log.Printf("Error in Docker events stream: %v", err)
+			}
+			return
+		}
+	}
+}
+
+// shouldMonitorContainer determines if a container should be monitored.
+func (sc *StatsCollector) shouldMonitorContainer(cont types.Container, runID string) bool {
+	// Check if it has the correct run ID label
+	if cont.Labels == nil || cont.Labels["hi.run-id"] != runID {
+		return false
+	}
+
+	// Check if it's an hs- or ts- container
+	for _, name := range cont.Names {
+		containerName := strings.TrimPrefix(name, "/")
+		if strings.HasPrefix(containerName, "hs-") || strings.HasPrefix(containerName, "ts-") {
+			return true
+		}
+	}
+
+	return false
+}
+
+// startStatsForContainer begins stats collection for a specific container.
+func (sc *StatsCollector) startStatsForContainer(ctx context.Context, containerID, containerName string, verbose bool) {
+	containerName = strings.TrimPrefix(containerName, "/")
+
+	sc.mutex.Lock()
+	// Check if we're already monitoring this container
+	if _, exists := sc.containers[containerID]; exists {
+		sc.mutex.Unlock()
+		return
+	}
+
+	sc.containers[containerID] = &ContainerStats{
+		ContainerID:   containerID,
+		ContainerName: containerName,
+		Stats:         make([]StatsSample, 0),
+	}
+	sc.mutex.Unlock()
+
+	if verbose {
+		log.Printf("Starting stats collection for container %s (%s)", containerName, containerID[:12])
+	}
+
+	sc.wg.Add(1)
+	go sc.collectStatsForContainer(ctx, containerID, verbose)
+}
+
+// collectStatsForContainer collects stats for a specific container using Docker API streaming.
+func (sc *StatsCollector) collectStatsForContainer(ctx context.Context, containerID string, verbose bool) {
+	defer sc.wg.Done()
+
+	// Use Docker API streaming stats - much more efficient than CLI
+	statsResponse, err := sc.client.ContainerStats(ctx, containerID, true)
+	if err != nil {
+		if verbose {
+			log.Printf("Failed to get stats stream for container %s: %v", containerID[:12], err)
+		}
+		return
+	}
+	defer statsResponse.Body.Close()
+
+	decoder := json.NewDecoder(statsResponse.Body)
+	var prevStats *container.Stats
+
+	for {
+		select {
+		case <-sc.stopChan:
+			return
+		case <-ctx.Done():
+			return
+		default:
+			var stats container.Stats
+			if err := decoder.Decode(&stats); err != nil {
+				// EOF is expected when container stops or stream ends
+				if err.Error() != "EOF" && verbose {
+					log.Printf("Failed to decode stats for container %s: %v", containerID[:12], err)
+				}
+				return
+			}
+
+			// Calculate CPU percentage (only if we have previous stats)
+			var cpuPercent float64
+			if prevStats != nil {
+				cpuPercent = calculateCPUPercent(prevStats, &stats)
+			}
+
+			// Calculate memory usage in MB
+			memoryMB := float64(stats.MemoryStats.Usage) / (1024 * 1024)
+
+			// Store the sample (skip first sample since CPU calculation needs previous stats)
+			if prevStats != nil {
+				// Look up the container entry under a short read lock; the append below only takes the per-container mutex
+				var containerStats *ContainerStats
+				var exists bool
+
+				sc.mutex.RLock()
+				containerStats, exists = sc.containers[containerID]
+				sc.mutex.RUnlock()
+
+				if exists && containerStats != nil {
+					containerStats.mutex.Lock()
+					containerStats.Stats = append(containerStats.Stats, StatsSample{
+						Timestamp: time.Now(),
+						CPUUsage:  cpuPercent,
+						MemoryMB:  memoryMB,
+					})
+					containerStats.mutex.Unlock()
+				}
+			}
+
+			// Save current stats for next iteration
+			prevStats = &stats
+		}
+	}
+}
+
+// calculateCPUPercent calculates CPU usage percentage from Docker stats.
+func calculateCPUPercent(prevStats, stats *container.Stats) float64 {
+	// CPU calculation based on Docker's implementation
+	cpuDelta := float64(stats.CPUStats.CPUUsage.TotalUsage) - float64(prevStats.CPUStats.CPUUsage.TotalUsage)
+	systemDelta := float64(stats.CPUStats.SystemUsage) - float64(prevStats.CPUStats.SystemUsage)
+
+	if systemDelta > 0 && cpuDelta >= 0 {
+		// Calculate CPU percentage: (container CPU delta / system CPU delta) * number of CPUs * 100
+		numCPUs := float64(len(stats.CPUStats.CPUUsage.PercpuUsage))
+		if numCPUs == 0 {
+			// Fallback: if PercpuUsage is not available, assume 1 CPU
+			numCPUs = 1.0
+		}
+
+		return (cpuDelta / systemDelta) * numCPUs * 100.0
+	}
+
+	return 0.0
+}
+
+// ContainerStatsSummary represents summary statistics for a container.
+type ContainerStatsSummary struct {
+	ContainerName string
+	SampleCount   int
+	CPU           StatsSummary
+	Memory        StatsSummary
+}
+
+// MemoryViolation represents a container that exceeded the memory limit.
+type MemoryViolation struct {
+	ContainerName string
+	MaxMemoryMB   float64
+	LimitMB       float64
+}
+
+// StatsSummary represents min, max, and average for a metric.
+type StatsSummary struct {
+	Min     float64
+	Max     float64
+	Average float64
+}
+
+// GetSummary returns a summary of collected statistics.
+func (sc *StatsCollector) GetSummary() []ContainerStatsSummary {
+	// Take snapshot of container references without holding main lock long
+	sc.mutex.RLock()
+	containerRefs := make([]*ContainerStats, 0, len(sc.containers))
+	for _, containerStats := range sc.containers {
+		containerRefs = append(containerRefs, containerStats)
+	}
+	sc.mutex.RUnlock()
+
+	summaries := make([]ContainerStatsSummary, 0, len(containerRefs))
+
+	for _, containerStats := range containerRefs {
+		containerStats.mutex.RLock()
+		stats := make([]StatsSample, len(containerStats.Stats))
+		copy(stats, containerStats.Stats)
+		containerName := containerStats.ContainerName
+		containerStats.mutex.RUnlock()
+
+		if len(stats) == 0 {
+			continue
+		}
+
+		summary := ContainerStatsSummary{
+			ContainerName: containerName,
+			SampleCount:   len(stats),
+		}
+
+		// Collect CPU and memory values for the summary calculation
+		cpuValues := make([]float64, len(stats))
+		memoryValues := make([]float64, len(stats))
+
+		for i, sample := range stats {
+			cpuValues[i] = sample.CPUUsage
+			memoryValues[i] = sample.MemoryMB
+		}
+
+		summary.CPU = calculateStatsSummary(cpuValues)
+		summary.Memory = calculateStatsSummary(memoryValues)
+
+		summaries = append(summaries, summary)
+	}
+
+	// Sort by container name for consistent output
+	sort.Slice(summaries, func(i, j int) bool {
+		return summaries[i].ContainerName < summaries[j].ContainerName
+	})
+
+	return summaries
+}
+
+// calculateStatsSummary calculates min, max, and average for a slice of values.
+func calculateStatsSummary(values []float64) StatsSummary {
+	if len(values) == 0 {
+		return StatsSummary{}
+	}
+
+	min := values[0]
+	max := values[0]
+	sum := 0.0
+
+	for _, value := range values {
+		if value < min {
+			min = value
+		}
+		if value > max {
+			max = value
+		}
+		sum += value
+	}
+
+	return StatsSummary{
+		Min:     min,
+		Max:     max,
+		Average: sum / float64(len(values)),
+	}
+}
+
+// PrintSummary prints the statistics summary to the console.
+func (sc *StatsCollector) PrintSummary() {
+	summaries := sc.GetSummary()
+
+	if len(summaries) == 0 {
+		log.Printf("No container statistics collected")
+		return
+	}
+
+	log.Printf("Container Resource Usage Summary:")
+	log.Printf("================================")
+
+	for _, summary := range summaries {
+		log.Printf("Container: %s (%d samples)", summary.ContainerName, summary.SampleCount)
+		log.Printf("  CPU Usage:    Min: %6.2f%%  Max: %6.2f%%  Avg: %6.2f%%",
+			summary.CPU.Min, summary.CPU.Max, summary.CPU.Average)
+		log.Printf("  Memory Usage: Min: %6.1f MB Max: %6.1f MB Avg: %6.1f MB",
+			summary.Memory.Min, summary.Memory.Max, summary.Memory.Average)
+		log.Printf("")
+	}
+}
+
+// CheckMemoryLimits checks if any containers exceeded their memory limits.
+func (sc *StatsCollector) CheckMemoryLimits(hsLimitMB, tsLimitMB float64) []MemoryViolation {
+	if hsLimitMB <= 0 && tsLimitMB <= 0 {
+		return nil
+	}
+
+	summaries := sc.GetSummary()
+	var violations []MemoryViolation
+
+	for _, summary := range summaries {
+		var limitMB float64
+		if strings.HasPrefix(summary.ContainerName, "hs-") {
+			limitMB = hsLimitMB
+		} else if strings.HasPrefix(summary.ContainerName, "ts-") {
+			limitMB = tsLimitMB
+		} else {
+			continue // Skip containers that don't match our patterns
+		}
+
+		if limitMB > 0 && summary.Memory.Max > limitMB {
+			violations = append(violations, MemoryViolation{
+				ContainerName: summary.ContainerName,
+				MaxMemoryMB:   summary.Memory.Max,
+				LimitMB:       limitMB,
+			})
+		}
+	}
+
+	return violations
+}
+
+// PrintSummaryAndCheckLimits prints the statistics summary and returns memory violations if any.
+func (sc *StatsCollector) PrintSummaryAndCheckLimits(hsLimitMB, tsLimitMB float64) []MemoryViolation {
+	sc.PrintSummary()
+	return sc.CheckMemoryLimits(hsLimitMB, tsLimitMB)
+}
+
+// Close closes the stats collector and cleans up resources.
+func (sc *StatsCollector) Close() error {
+	sc.StopCollection()
+	return sc.client.Close()
+}
--- a/cmd/hi/tar_utils.go
+++ b/cmd/hi/tar_utils.go
@@ -68,7 +68,7 @@ func extractDirectoryFromTar(tarReader io.Reader, targetDir string) error {
 			continue // Skip potentially dangerous paths
 		}

-		targetPath := filepath.Join(targetDir, filepath.Base(cleanName))
+		targetPath := filepath.Join(targetDir, cleanName)

 		switch header.Typeflag {
 		case tar.TypeDir:
@@ -77,6 +77,11 @@ func extractDirectoryFromTar(tarReader io.Reader, targetDir string) error {
 				return fmt.Errorf("failed to create directory %s: %w", targetPath, err)
 			}
 		case tar.TypeReg:
+			// Ensure parent directories exist
+			if err := os.MkdirAll(filepath.Dir(targetPath), 0o755); err != nil {
+				return fmt.Errorf("failed to create parent directories for %s: %w", targetPath, err)
+			}
+
 			// Create file
 			outFile, err := os.Create(targetPath)
 			if err != nil {
--- a/cmd/mapresponses/main.go
+++ b/cmd/mapresponses/main.go
@@ -0,0 +1,61 @@
+package main
+
+import (
+	"encoding/json"
+	"fmt"
+	"os"
+
+	"github.com/creachadair/command"
+	"github.com/creachadair/flax"
+	"github.com/juanfont/headscale/hscontrol/mapper"
+	"github.com/juanfont/headscale/integration/integrationutil"
+)
+
+type MapConfig struct {
+	Directory string `flag:"directory,Directory to read map responses from"`
+}
+
+var mapConfig MapConfig
+
+func main() {
+	root := command.C{
+		Name: "mapresponses",
+		Help: "MapResponses is a tool to map and compare map responses from a directory",
+		Commands: []*command.C{
+			{
+				Name:     "online",
+				Help:     "",
+				Usage:    "run [test-pattern] [flags]",
+				SetFlags: command.Flags(flax.MustBind, &mapConfig),
+				Run:      runOnline,
+			},
+			command.HelpCommand(nil),
+		},
+	}
+
+	env := root.NewEnv(nil).MergeFlags(true)
+	command.RunOrFail(env, os.Args[1:])
+}
+
+// runOnline reads map responses from a directory and prints the expected online map as JSON.
+func runOnline(env *command.Env) error {
+	if mapConfig.Directory == "" {
+		return fmt.Errorf("directory is required")
+	}
+
+	resps, err := mapper.ReadMapResponsesFromDirectory(mapConfig.Directory)
+	if err != nil {
+		return fmt.Errorf("reading map responses from directory: %w", err)
+	}
+
+	expected := integrationutil.BuildExpectedOnlineMap(resps)
+
+	out, err := json.MarshalIndent(expected, "", "  ")
+	if err != nil {
+		return fmt.Errorf("marshaling expected online map: %w", err)
+	}
+
+	os.Stderr.Write(out)
+	os.Stderr.Write([]byte("\n"))
+	return nil
+}
--- a/config-example.yaml
+++ b/config-example.yaml
@@ -105,7 +105,7 @@ derp:

    # For better connection stability (especially when using an Exit-Node and DNS is not working),
    # it is possible to optionally add the public IPv4 and IPv6 address to the Derp-Map using:
-    ipv4: 1.2.3.4
+    ipv4: 198.51.100.1
    ipv6: 2001:db8::1

  # List of externally available DERP maps encoded in JSON
@@ -128,7 +128,7 @@ derp:
  auto_update_enabled: true

  # How often should we check for DERP updates?
-  update_frequency: 24h
+  update_frequency: 3h

 # Disables the automatic check for headscale updates on startup
 disable_check_updates: false
@@ -225,9 +225,11 @@ tls_cert_path: ""
 tls_key_path: ""

 log:
+  # Valid log levels: panic, fatal, error, warn, info, debug, trace
+  level: info
+
  # Output formatting for logs: text or json
  format: text
-  level: info

 ## Policy
 # headscale supports Tailscale's ACL policies.
@@ -273,9 +275,9 @@ dns:
  # `hostname.base_domain` (e.g., _myhost.example.com_).
  base_domain: example.com

-  # Whether to use the local DNS settings of a node (default) or override the
-  # local DNS settings and force the use of Headscale's DNS configuration.
-  override_local_dns: false
+  # Whether to use the local DNS settings of a node or override the local DNS
+  # settings (default) and force the use of Headscale's DNS configuration.
+  override_local_dns: true

  # List of DNS servers to expose to clients.
  nameservers:
@@ -291,8 +293,7 @@ dns:

    # Split DNS (see https://tailscale.com/kb/1054/dns/),
    # a map of domains and which DNS server to use for each.
-    split:
-      {}
+    split: {}
      # foo.bar.com:
      #   - 1.1.1.1
      # darp.headscale.net:
--- a/derp-example.yaml
+++ b/derp-example.yaml
@@ -1,5 +1,6 @@
 # If you plan to somehow use headscale, please deploy your own DERP infra: https://tailscale.com/kb/1118/custom-derp-servers/
 regions:
+  1: null   # Disable DERP region with ID 1
  900:
    regionid: 900
    regioncode: custom
@@ -7,9 +8,9 @@ regions:
    nodes:
      - name: 900a
        regionid: 900
-        hostname: myderp.mydomain.no
-        ipv4: 123.123.123.123
-        ipv6: "2604:a880:400:d1::828:b001"
+        hostname: myderp.example.com
+        ipv4: 198.51.100.1
+        ipv6: 2001:db8::1
        stunport: 0
        stunonly: false
        derpport: 0
--- a/docs/about/faq.md
+++ b/docs/about/faq.md
@@ -51,11 +51,11 @@ is homelabbers and self-hosters. Of course, we do not prevent people from using
 it in a commercial/professional setting and often get questions about scaling.

 Please note that when Headscale is developed, performance is not part of the
-consideration as the main audience is considered to be users with a moddest
+consideration as the main audience is considered to be users with a modest
 amount of devices. We focus on correctness and feature parity with Tailscale
 SaaS over time.

-To understand if you might be able to use Headscale for your usecase, I will
+To understand if you might be able to use Headscale for your use case, I will
 describe two scenarios in an effort to explain what is the central bottleneck
 of Headscale:

@@ -76,7 +76,7 @@ new "world map" is created for every node in the network.
 This means that under certain conditions, Headscale can likely handle 100s
 of devices (maybe more), if there is _little to no change_ happening in the
 network. For example, in Scenario 1, the process of computing the world map is
-extremly demanding due to the size of the network, but when the map has been
+extremely demanding due to the size of the network, but when the map has been
 created and the nodes are not changing, the Headscale instance will likely
 return to a very low resource usage until the next time there is an event
 requiring the new map.
@@ -94,14 +94,14 @@ learn about the current state of the world.
 We expect that the performance will improve over time as we improve the code
 base, but it is not a focus. In general, we will never make the tradeoff to make
 things faster on the cost of less maintainable or readable code. We are a small
-team and have to optimise for maintainabillity.
+team and have to optimise for maintainability.

 ## Which database should I use?

 We recommend the use of SQLite as database for headscale:

 - SQLite is simple to setup and easy to use
- It scales well for all of headscale's usecases
+- It scales well for all of headscale's use cases
 - Development and testing happens primarily on SQLite
 - PostgreSQL is still supported, but is considered to be in "maintenance mode"

--- a/docs/about/features.md
+++ b/docs/about/features.md
@@ -19,7 +19,7 @@ provides on overview of Headscale's feature and compatibility with the Tailscale
    - [x] [Exit nodes](../ref/routes.md#exit-node)
 - [x] Dual stack (IPv4 and IPv6)
 - [x] Ephemeral nodes
- [x] Embedded [DERP server](https://tailscale.com/kb/1232/derp-servers)
+- [x] Embedded [DERP server](../ref/derp.md)
 - [x] Access control lists ([GitHub label "policy"](https://github.com/juanfont/headscale/labels/policy%20%F0%9F%93%9D))
    - [x] ACL management via API
    - [x] Some [Autogroups](https://tailscale.com/kb/1396/targets#autogroups), currently: `autogroup:internet`,
--- a/docs/ref/acls.md
+++ b/docs/ref/acls.md
@@ -9,9 +9,38 @@ When using ACL's the User borders are no longer applied. All machines
 whichever the User have the ability to communicate with other hosts as
 long as the ACL's permits this exchange.

-## ACLs use case example
+## ACL Setup

-Let's build an example use case for a small business (It may be the place where
+To enable and configure ACLs in Headscale, you need to specify the path to your ACL policy file in the `policy.path` key in `config.yaml`.
+
+Your ACL policy file must be formatted using [huJSON](https://github.com/tailscale/hujson).
+
+Info on how these policies are written can be found
+[here](https://tailscale.com/kb/1018/acls/).
+
+Please reload or restart Headscale after updating the ACL file. Headscale may be reloaded either via its systemd service
+(`sudo systemctl reload headscale`) or by sending a SIGHUP signal (`sudo kill -HUP $(pidof headscale)`) to the main
+process. Headscale logs the result of ACL policy processing after each reload.
+
+## Simple Examples
+
+- [**Allow All**](https://tailscale.com/kb/1192/acl-samples#allow-all-default-acl): If you define an ACL file but completely omit the `"acls"` field from its content, Headscale will default to an "allow all" policy. This means all devices connected to your tailnet will be able to communicate freely with each other.
+
+    ```json
+    {}
+    ```
+
+- [**Deny All**](https://tailscale.com/kb/1192/acl-samples#deny-all): To prevent all communication within your tailnet, you can include an empty array for the `"acls"` field in your policy file.
+
+    ```json
+    {
+      "acls": []
+    }
+    ```
+
+## Complex Example
+
+Let's build a more complex example use case for a small business (It may be the place where
 ACL's are the most useful).

 We have a small company with a boss, an admin, two developers and an intern.
@@ -38,10 +67,6 @@ servers.

 ![ACL implementation example](../images/headscale-acl-network.png)

-## ACL setup
-
-ACLs have to be written in [huJSON](https://github.com/tailscale/hujson).
-
 When [registering the servers](../usage/getting-started.md#register-a-node) we
 will need to add the flag `--advertise-tags=tag:<tag1>,tag:<tag2>`, and the user
 that is registering the server should be allowed to do it. Since anyone can add
@@ -49,14 +74,6 @@ tags to a server they can register, the check of the tags is done on headscale
 server and only valid tags are applied. A tag is valid if the user that is
 registering it is allowed to do it.

-To use ACLs in headscale, you must edit your `config.yaml` file. In there you will find a `policy.path` parameter. This
-will need to point to your ACL file. More info on how these policies are written can be found
-[here](https://tailscale.com/kb/1018/acls/).
-
-Please reload or restart Headscale after updating the ACL file. Headscale may be reloaded either via its systemd service
-(`sudo systemctl reload headscale`) or by sending a SIGHUP signal (`sudo kill -HUP $(pidof headscale)`) to the main
-process. Headscale logs the result of ACL policy processing after each reload.
-
 Here are the ACL's to implement the same permissions as above:

 ```json title="acl.json"
--- a/docs/ref/debug.md
+++ b/docs/ref/debug.md
@@ -0,0 +1,115 @@
+# Debugging and troubleshooting
+
+Headscale and Tailscale provide debug and introspection capabilities that can be helpful when things don't work as
+expected. This page explains some debugging techniques to help pinpoint problems.
+
+Please also have a look at [Tailscale's Troubleshooting guide](https://tailscale.com/kb/1023/troubleshooting). It offers
+many tips and suggestions to troubleshoot common issues.
+
+## Tailscale
+
+The Tailscale client itself offers many commands to introspect its state as well as the state of the network:
+
+- [Check local network conditions](https://tailscale.com/kb/1080/cli#netcheck): `tailscale netcheck`
+- [Get the client status](https://tailscale.com/kb/1080/cli#status): `tailscale status --json`
+- [Get DNS status](https://tailscale.com/kb/1080/cli#dns): `tailscale dns status --all`
+- Client logs: `tailscale debug daemon-logs`
+- Client netmap: `tailscale debug netmap`
+- Test DERP connection: `tailscale debug derp headscale`
+- And many more, see: `tailscale debug --help`
+
+Many of the commands are helpful when trying to understand differences between Headscale and Tailscale SaaS.
+
+## Headscale
+
+### Application logging
+
+The log levels `debug` and `trace` can be useful to get more information from Headscale.
+
+```yaml hl_lines="3"
+log:
+  # Valid log levels: panic, fatal, error, warn, info, debug, trace
+  level: debug
+```
+
+### Database logging
+
+The database debug mode logs all database queries. Enable it to see how Headscale interacts with its database. This also
+requires the application log level to be set to either `debug` or `trace`.
+
+```yaml hl_lines="3 7"
+database:
+  # Enable debug mode. This setting requires the log.level to be set to "debug" or "trace".
+  debug: true
+
+log:
+  # Valid log levels: panic, fatal, error, warn, info, debug, trace
+  level: debug
+```
+
+### Metrics and debug endpoint
+
+Headscale provides a metrics and debug endpoint. It lets you introspect different aspects such as:
+
+- Information about the Go runtime, memory usage and statistics
+- Connected nodes and pending registrations
+- Active ACLs, filters and SSH policy
+- Current DERPMap
+- Prometheus metrics
+
+!!! warning "Keep the metrics and debug endpoint private"
+
+    The listen address and port can be configured with the `metrics_listen_addr` variable in the [configuration
+    file](./configuration.md). By default it listens on localhost, port 9090.
+
+    Keep the metrics and debug endpoint private to your internal network and don't expose it to the Internet.
+
+Query metrics via <http://localhost:9090/metrics> and get an overview of available debug information via
+<http://localhost:9090/debug/>. Metrics may be queried from outside localhost but the debug interface is subject to
+additional protection despite listening on all interfaces.
+
+=== "Direct access"
+
+    Access the debug interface directly on the server where Headscale is installed.
+
+    ```console
+    curl http://localhost:9090/debug/
+    ```
+
+=== "SSH port forwarding"
+
+    Use SSH port forwarding to forward Headscale's metrics and debug port to your device.
+
+    ```console
+    ssh <HEADSCALE_SERVER> -L 9090:localhost:9090
+    ```
+
+    Access the debug interface on your device by opening <http://localhost:9090/debug/> in your web browser.
+
+=== "Via debug key"
+
+    The access control of the debug interface supports the use of a debug key. Traffic is accepted if the path to a
+    debug key is set via the environment variable `TS_DEBUG_KEY_PATH` and the debug key is sent as the value of the
+    `debugkey` parameter with each request.
+
+    ```console
+    openssl rand -hex 32 | tee debugkey.txt
+    export TS_DEBUG_KEY_PATH=debugkey.txt
+    headscale serve
+    ```
+
+    Access the debug interface on your device by opening `http://<IP_OF_HEADSCALE>:9090/debug/?debugkey=<DEBUG_KEY>` in
+    your web browser. The `debugkey` parameter must be sent with every request.
+
+=== "Via debug IP address"
+
+    The debug endpoint expects traffic from localhost. A different debug IP address may be configured by setting the
+    `TS_ALLOW_DEBUG_IP` environment variable before starting Headscale. The debug IP address is ignored when the HTTP
+    header `X-Forwarded-For` is present.
+
+    ```console
+    export TS_ALLOW_DEBUG_IP=192.168.0.10       # IP address of your device
+    headscale serve
+    ```
+
+    Access the debug interface on your device by opening `http://<IP_OF_HEADSCALE>:9090/debug/` in your web browser.
--- a/docs/ref/derp.md
+++ b/docs/ref/derp.md
@@ -0,0 +1,175 @@
+# DERP
+
+A [DERP (Designated Encrypted Relay for Packets) server](https://tailscale.com/kb/1232/derp-servers) is mainly used to
+relay traffic between two nodes in case a direct connection can't be established. Headscale provides an embedded DERP
+server to ensure seamless connectivity between nodes.
+
+## Configuration
+
+DERP related settings are configured within the `derp` section of the [configuration file](./configuration.md). The
+following sections only use a few of the available settings; check the [example configuration](./configuration.md) for
+all available configuration options.
+
+### Enable embedded DERP
+
+Headscale ships with an embedded DERP server which allows you to run your own self-hosted DERP server easily. The embedded
+DERP server is disabled by default and needs to be enabled. In addition, you should configure the public IPv4 and public
+IPv6 address of your Headscale server for improved connection stability:
+
+```yaml title="config.yaml" hl_lines="3-5"
+derp:
+  server:
+    enabled: true
+    ipv4: 198.51.100.1
+    ipv6: 2001:db8::1
+```
+
+Keep in mind that [additional ports are needed to run a DERP server](../setup/requirements.md#ports-in-use). Besides
+relaying traffic, it also uses STUN (udp/3478) to help clients discover their public IP addresses and perform NAT
+traversal. [Check DERP server connectivity](#check-derp-server-connectivity) to see if everything works.
+
+### Remove Tailscale's DERP servers
+
+Once enabled, Headscale's embedded DERP is added to the list of free-to-use [DERP
+servers](https://tailscale.com/kb/1232/derp-servers) offered by Tailscale Inc. To only use Headscale's embedded DERP
+server, disable the loading of the default DERP map:
+
+```yaml title="config.yaml" hl_lines="6"
+derp:
+  server:
+    enabled: true
+    ipv4: 198.51.100.1
+    ipv6: 2001:db8::1
+  urls: []
+```
+
+!!! warning "Single point of failure"
+
+    Removing Tailscale's DERP servers means that there is now just a single DERP server available for clients. This is a
+    single point of failure and could hamper connectivity.
+
+    [Check DERP server connectivity](#check-derp-server-connectivity) with your embedded DERP server before removing
+    Tailscale's DERP servers.
+
+### Customize DERP map
+
+The DERP map offered to clients can be customized with a [dedicated YAML-configuration
+file](https://github.com/juanfont/headscale/blob/main/derp-example.yaml). This allows you to modify previously loaded DERP
+maps fetched via URL or to offer your own, custom DERP servers to nodes.
+
+=== "Remove specific DERP regions"
+
+    The free-to-use [DERP servers](https://tailscale.com/kb/1232/derp-servers) are organized into regions via a region
+    ID. You can explicitly disable a specific region by setting its region ID to `null`. The following sample
+    `derp.yaml` disables the New York DERP region (which has the region ID 1):
+
+    ```yaml title="derp.yaml"
+    regions:
+      1: null
+    ```
+
+    Use the following configuration to serve the default DERP map (excluding New York) to nodes:
+
+    ```yaml title="config.yaml" hl_lines="6 7"
+    derp:
+      server:
+        enabled: false
+      urls:
+        - https://controlplane.tailscale.com/derpmap/default
+      paths:
+        - /etc/headscale/derp.yaml
+    ```
+
+=== "Provide custom DERP servers"
+
+    The following sample `derp.yaml` references two custom regions (`custom-east` with ID 900 and `custom-west` with ID 901)
+    with one custom DERP server in each region. Each DERP server offers DERP relay via HTTPS on tcp/443, support for captive
+    portal checks via HTTP on tcp/80 and STUN on udp/3478. See the definitions of
+    [DERPMap](https://pkg.go.dev/tailscale.com/tailcfg#DERPMap),
+    [DERPRegion](https://pkg.go.dev/tailscale.com/tailcfg#DERPRegion) and
+    [DERPNode](https://pkg.go.dev/tailscale.com/tailcfg#DERPNode) for all available options.
+
+    ```yaml title="derp.yaml"
+    regions:
+      900:
+        regionid: 900
+        regioncode: custom-east
+        regionname: My Region (east)
+        nodes:
+          - name: 900a
+            regionid: 900
+            hostname: derp900a.example.com
+            ipv4: 198.51.100.1
+            ipv6: 2001:db8::1
+            canport80: true
+      901:
+        regionid: 901
+        regioncode: custom-west
+        regionname: My Region (west)
+        nodes:
+          - name: 901a
+            regionid: 901
+            hostname: derp901a.example.com
+            ipv4: 198.51.100.2
+            ipv6: 2001:db8::2
+            canport80: true
+    ```
+
+    Use the following configuration to only serve the two DERP servers from the above `derp.yaml`:
+
+    ```yaml title="config.yaml" hl_lines="5 6"
+    derp:
+      server:
+        enabled: false
+      urls: []
+      paths:
+        - /etc/headscale/derp.yaml
+    ```
+
+Independent of the custom DERP map, you may choose to [enable the embedded DERP server and have it automatically added
+to the custom DERP map](#enable-embedded-derp).
+
+### Verify clients
+
+Access to DERP servers can be restricted to nodes that are members of your Tailnet. Relay access is denied for unknown
+clients.
+
+=== "Embedded DERP"
+
+    Client verification is enabled by default.
+
+    ```yaml title="config.yaml" hl_lines="3"
+    derp:
+      server:
+        verify_clients: true
+    ```
+
+=== "3rd-party DERP"
+
+    Tailscale's `derper` provides two parameters to configure client verification:
+
+    - Use the `-verify-client-url` parameter of the `derper` and point it towards the `/verify` endpoint of your
+      Headscale server (e.g. `https://headscale.example.com/verify`). The DERP server will query your Headscale instance
+      as soon as a client connects with it to ask whether access should be allowed or denied. Access is allowed if
+      Headscale knows about the connecting client and denied otherwise.
+    - The parameter `-verify-client-url-fail-open` controls what should happen when the DERP server can't reach the
+      Headscale instance. By default, it will allow access if Headscale is unreachable.
+
+## Check DERP server connectivity
+
+Any Tailscale client may be used to introspect the DERP map and to check for connectivity issues with DERP servers.
+
+- Display DERP map: `tailscale debug derp-map`
+- Check connectivity with the embedded DERP[^1]: `tailscale debug derp headscale`
+
+Additional DERP related metrics and information are available via the [metrics and debug
+endpoint](./debug.md#metrics-and-debug-endpoint).
+
+[^1]:
+    This assumes that the default region code of the [configuration file](./configuration.md) is used.
+
+## Limitations
+
+- The embedded DERP server can't be used for Tailscale's captive portal checks as it doesn't support the `/generate_204`
+  endpoint via HTTP on port tcp/80.
+- There are no speed or throughput optimisations; the main purpose is to assist in node connectivity.
--- a/docs/ref/dns.md
+++ b/docs/ref/dns.md
@@ -23,7 +23,7 @@ hostname and port combination "http://hostname-in-magic-dns.myvpn.example.com:30

 !!! warning "Limitations"

-    Currently, [only A and AAAA records are processed by Tailscale](https://github.com/tailscale/tailscale/blob/v1.78.3/ipn/ipnlocal/local.go#L4461-L4479).
+    Currently, [only A and AAAA records are processed by Tailscale](https://github.com/tailscale/tailscale/blob/v1.86.5/ipn/ipnlocal/node_backend.go#L662).

 1.  Configure extra DNS records using one of the available configuration options:

--- a/docs/ref/integration/reverse-proxy.md
+++ b/docs/ref/integration/reverse-proxy.md
@@ -13,7 +13,7 @@ Running headscale behind a reverse proxy is useful when running multiple applica

 The reverse proxy MUST be configured to support WebSockets to communicate with Tailscale clients.

-WebSockets support is also required when using the headscale embedded DERP server. In this case, you will also need to expose the UDP port used for STUN (by default, udp/3478). Please check our [config-example.yaml](https://github.com/juanfont/headscale/blob/main/config-example.yaml).
+WebSockets support is also required when using the Headscale [embedded DERP server](../derp.md). In this case, you will also need to expose the UDP port used for STUN (by default, udp/3478). Please check our [config-example.yaml](https://github.com/juanfont/headscale/blob/main/config-example.yaml).

 ### Cloudflare

--- a/docs/ref/integration/tools.md
+++ b/docs/ref/integration/tools.md
@@ -13,3 +13,4 @@ This page collects third-party tools, client libraries, and scripts related to h
 | headscalebacktosqlite | [Github](https://github.com/bigbozza/headscalebacktosqlite)     | Migrate headscale from PostgreSQL back to SQLite                     |
 | headscale-pf          | [Github](https://github.com/YouSysAdmin/headscale-pf)           | Populates user groups based on user groups in Jumpcloud or Authentik |
 | headscale-client-go   | [Github](https://github.com/hibare/headscale-client-go)         | A Go client implementation for the Headscale HTTP API.               |
+| headscale-zabbix      | [Github](https://github.com/dblanque/headscale-zabbix)          | A Zabbix Monitoring Template for the Headscale Service.              |
--- a/docs/ref/oidc.md
+++ b/docs/ref/oidc.md
@@ -2,7 +2,7 @@

 Headscale supports authentication via external identity providers using OpenID Connect (OIDC). It features:

- Autoconfiguration via OpenID Connect Discovery Protocol
+- Auto configuration via OpenID Connect Discovery Protocol
 - [Proof Key for Code Exchange (PKCE) code verification](#enable-pkce-recommended)
 - [Authorization based on a user's domain, email address or group membership](#authorize-users-with-filters)
 - Synchronization of [standard OIDC claims](#supported-oidc-claims)
@@ -142,7 +142,7 @@ Access Token.
 === "Use expiration from Access Token"

    Please keep in mind that the Access Token is typically a short-lived token that expires within a few minutes. You
-    will have to configure token expiration in your identity provider to avoid frequent reauthentication.
+    will have to configure token expiration in your identity provider to avoid frequent re-authentication.


    ```yaml hl_lines="5"
@@ -184,7 +184,7 @@ You may refer to users in the Headscale policy via:
 ## Supported OIDC claims

 Headscale uses [the standard OIDC claims](https://openid.net/specs/openid-connect-core-1_0.html#StandardClaims) to
-populate and update its local user profile on each login. OIDC claims are read from the ID Token or from the UserInfo
+populate and update its local user profile on each login. OIDC claims are read from the ID Token and from the UserInfo
 endpoint.

 | Headscale profile   | OIDC claim           | Notes / examples                                                                                  |
@@ -230,19 +230,6 @@ are known to work:

 Authelia is fully supported by Headscale.

-#### Additional configuration to authorize users based on filters
-
-Authelia (4.39.0 or newer) no longer provides standard OIDC claims such as `email` or `groups` via the ID Token. The
-OIDC `email` and `groups` claims are used to [authorize users with filters](#authorize-users-with-filters). This extra
-configuration step is **only** needed if you need to authorize access based on one of the following user properties:
-
- domain
- email address
- group membership
-
-Please follow the instructions from Authelia's documentation on how to [Restore Functionality Prior to Claims
-Parameter](https://www.authelia.com/integration/openid-connect/openid-connect-1.0-claims/#restore-functionality-prior-to-claims-parameter).
-
 ### Authentik

 - Authentik is fully supported by Headscale.
@@ -297,13 +284,15 @@ you need to [authorize access based on group membership](#authorize-users-with-f

 - Create a new client scope `groups` for OpenID Connect:
    - Configure a `Group Membership` mapper with name `groups` and the token claim name `groups`.
-    - Enable the mapper for the ID Token, Access Token and UserInfo endpoint.
+    - Add the mapper to at least the UserInfo endpoint.
 - Configure the new client scope for your Headscale client:
    - Edit the Headscale client.
    - Search for the client scope `group`.
    - Add it with assigned type `Default`.
- [Configure the allowed groups in Headscale](#authorize-users-with-filters). Keep in mind that groups in Keycloak start
-  with a leading `/`.
+- [Configure the allowed groups in Headscale](#authorize-users-with-filters). How groups need to be specified depends on
+  Keycloak's `Full group path` option:
+    - `Full group path` is enabled: groups contain their full path, e.g. `/top/group1`
+    - `Full group path` is disabled: only the name of the group is used, e.g. `group1`

 ### Microsoft Entra ID

@@ -315,3 +304,6 @@ Entra ID is: `https://login.microsoftonline.com/<tenant-UUID>/v2.0`. The followi

 - `domain_hint: example.com` to use your own domain
 - `prompt: select_account` to force an account picker during login
+
+Groups for the [allowed groups filter](#authorize-users-with-filters) need to be specified with their group ID instead
+of the group name.
--- a/docs/ref/routes.md
+++ b/docs/ref/routes.md
@@ -49,7 +49,7 @@ ID | Hostname | Approved | Available                  | Serving (Primary)
 Approve all desired routes of a subnet router by specifying them as comma separated list:

 ```console
-$ headscale nodes approve-routes --node 1 --routes 10.0.0.0/8,192.168.0.0/24
+$ headscale nodes approve-routes --identifier 1 --routes 10.0.0.0/8,192.168.0.0/24
 Node updated
 ```

@@ -175,7 +175,7 @@ ID | Hostname | Approved | Available       | Serving (Primary)
 For exit nodes, it is sufficient to approve either the IPv4 or IPv6 route. The other will be approved automatically.

 ```console
-$ headscale nodes approve-routes --node 1 --routes 0.0.0.0/0
+$ headscale nodes approve-routes --identifier 1 --routes 0.0.0.0/0
 Node updated
 ```

--- a/docs/setup/install/container.md
+++ b/docs/setup/install/container.md
@@ -112,11 +112,11 @@ docker exec -it headscale \

 ### Register a machine using a pre authenticated key

-Generate a key using the command line:
+Generate a key using the command line for the user with ID 1:

 ```shell
 docker exec -it headscale \
-  headscale preauthkeys create --user myfirstuser --reusable --expiration 24h
+  headscale preauthkeys create --user 1 --reusable --expiration 24h
 ```

 This will return a pre-authenticated key that can be used to connect a node to headscale with the `tailscale up` command:
--- a/docs/setup/requirements.md
+++ b/docs/setup/requirements.md
@@ -4,11 +4,35 @@ Headscale should just work as long as the following requirements are met:

 - A server with a public IP address for headscale. A dual-stack setup with a public IPv4 and a public IPv6 address is
  recommended.
- Headscale is served via HTTPS on port 443[^1].
+- Headscale is served via HTTPS on port 443[^1] and [may use additional ports](#ports-in-use).
 - A reasonably modern Linux or BSD based operating system.
 - A dedicated local user account to run headscale.
 - A little bit of command line knowledge to configure and operate headscale.

+## Ports in use
+
+The ports in use vary with the intended scenario and enabled features. Some of the listed ports may be changed via the
+[configuration file](../ref/configuration.md), but we recommend sticking with the default values.
+
+- tcp/80
+    - Expose publicly: yes
+    - HTTP, used by Let's Encrypt to verify ownership via the HTTP-01 challenge.
+    - Only required if the built-in Let's Encrypt client with the HTTP-01 challenge is used. See [TLS](../ref/tls.md) for
+      details.
+- tcp/443
+    - Expose publicly: yes
+    - HTTPS, required to make Headscale available to Tailscale clients[^1]
+    - Required if the [embedded DERP server](../ref/derp.md) is enabled
+- udp/3478
+    - Expose publicly: yes
+    - STUN, required if the [embedded DERP server](../ref/derp.md) is enabled
+- tcp/50443
+    - Expose publicly: yes
+    - Only required if the gRPC interface is used to [remote-control Headscale](../ref/remote-cli.md).
+- tcp/9090
+    - Expose publicly: no
+    - [Metrics and debug endpoint](../ref/debug.md#metrics-and-debug-endpoint)
+
 ## Assumptions

 The headscale documentation and the provided examples are written with a few assumptions in mind:
--- a/docs/usage/connect/android.md
+++ b/docs/usage/connect/android.md
@@ -6,9 +6,23 @@ This documentation has the goal of showing how a user can use the official Andro

 Install the official Tailscale Android client from the [Google Play Store](https://play.google.com/store/apps/details?id=com.tailscale.ipn) or [F-Droid](https://f-droid.org/packages/com.tailscale.ipn/).

-## Configuring the headscale URL
+## Connect via normal, interactive login

 - Open the app and select the settings menu in the upper-right corner
 - Tap on `Accounts`
 - In the kebab menu icon (three dots) in the upper-right corner select `Use an alternate server`
 - Enter your server URL (e.g `https://headscale.example.com`) and follow the instructions
+- The client connects automatically as soon as the node registration is complete on headscale. Until then, nothing is
+  visible in the server logs.
+
+## Connect using a preauthkey
+
+- Open the app and select the settings menu in the upper-right corner
+- Tap on `Accounts`
+- In the kebab menu icon (three dots) in the upper-right corner select `Use an alternate server`
+- Enter your server URL (e.g `https://headscale.example.com`). If a login prompt opens, close it and continue
+- Open the settings menu in the upper-right corner
+- Tap on `Accounts`
+- In the kebab menu icon (three dots) in the upper-right corner select `Use an auth key`
+- Enter your [preauthkey generated from headscale](../getting-started.md#using-a-preauthkey)
+- If needed, tap `Log in` on the main screen. You should now be connected to your headscale.
--- a/docs/usage/getting-started.md
+++ b/docs/usage/getting-started.md
@@ -117,14 +117,14 @@ headscale instance. By default, the key is valid for one hour and can only be us
 === "Native"

    ```shell
-    headscale preauthkeys create --user <USER>
+    headscale preauthkeys create --user <USER_ID>
    ```

 === "Container"

    ```shell
    docker exec -it headscale \
-      headscale preauthkeys create --user <USER>
+      headscale preauthkeys create --user <USER_ID>
    ```

 The command returns the preauthkey on success which is used to connect a node to the headscale instance via the
--- a/flake.lock
+++ b/flake.lock
@@ -20,11 +20,11 @@
    },
    "nixpkgs": {
      "locked": {
-        "lastModified": 1752012998,
-        "narHash": "sha256-Q82Ms+FQmgOBkdoSVm+FBpuFoeUAffNerR5yVV7SgT8=",
+        "lastModified": 1755829505,
+        "narHash": "sha256-4/Jd+LkQ2ssw8luQVkqVs9spDBVE6h/u/hC/tzngsPo=",
        "owner": "NixOS",
        "repo": "nixpkgs",
-        "rev": "2a2130494ad647f953593c4e84ea4df839fbd68c",
+        "rev": "f937f8ecd1c70efd7e9f90ba13dfb400cf559de4",
        "type": "github"
      },
      "original": {
--- a/flake.nix
+++ b/flake.nix
@@ -19,7 +19,7 @@
      overlay = _: prev: let
        pkgs = nixpkgs.legacyPackages.${prev.system};
        buildGo = pkgs.buildGo124Module;
-        vendorHash = "sha256-S2GnCg2dyfjIyi5gXhVEuRs5Bop2JAhZcnhg1fu4/Gg=";
+        vendorHash = "sha256-hIY6asY3rOIqf/5P6lFmnNCDWcqNPJaj+tqJuOvGJlo=";
      in {
        headscale = buildGo {
          pname = "headscale";
--- a/gen/go/headscale/v1/apikey.pb.go
+++ b/gen/go/headscale/v1/apikey.pb.go
@@ -1,6 +1,6 @@
 // Code generated by protoc-gen-go. DO NOT EDIT.
 // versions:
-// 	protoc-gen-go v1.36.6
+// 	protoc-gen-go v1.36.8
 // 	protoc        (unknown)
 // source: headscale/v1/apikey.proto

--- a/gen/go/headscale/v1/device.pb.go
+++ b/gen/go/headscale/v1/device.pb.go
@@ -1,6 +1,6 @@
 // Code generated by protoc-gen-go. DO NOT EDIT.
 // versions:
-// 	protoc-gen-go v1.36.6
+// 	protoc-gen-go v1.36.8
 // 	protoc        (unknown)
 // source: headscale/v1/device.proto

--- a/gen/go/headscale/v1/headscale.pb.go
+++ b/gen/go/headscale/v1/headscale.pb.go
@@ -1,6 +1,6 @@
 // Code generated by protoc-gen-go. DO NOT EDIT.
 // versions:
-// 	protoc-gen-go v1.36.6
+// 	protoc-gen-go v1.36.8
 // 	protoc        (unknown)
 // source: headscale/v1/headscale.proto

--- a/gen/go/headscale/v1/node.pb.go
+++ b/gen/go/headscale/v1/node.pb.go
@@ -1,6 +1,6 @@
 // Code generated by protoc-gen-go. DO NOT EDIT.
 // versions:
-// 	protoc-gen-go v1.36.6
+// 	protoc-gen-go v1.36.8
 // 	protoc        (unknown)
 // source: headscale/v1/node.proto

@@ -913,10 +913,6 @@ func (x *RenameNodeResponse) GetNode() *Node {
 type ListNodesRequest struct {
 	state         protoimpl.MessageState `protogen:"open.v1"`
 	User          string                 `protobuf:"bytes,1,opt,name=user,proto3" json:"user,omitempty"`
-	Id            uint64                 `protobuf:"varint,2,opt,name=id,proto3" json:"id,omitempty"`
-	Name          string                 `protobuf:"bytes,3,opt,name=name,proto3" json:"name,omitempty"`
-	Hostname      string                 `protobuf:"bytes,4,opt,name=hostname,proto3" json:"hostname,omitempty"`
-	IpAddresses   []string               `protobuf:"bytes,5,rep,name=ip_addresses,json=ipAddresses,proto3" json:"ip_addresses,omitempty"`
 	unknownFields protoimpl.UnknownFields
 	sizeCache     protoimpl.SizeCache
 }
@@ -958,34 +954,6 @@ func (x *ListNodesRequest) GetUser() string {
 	return ""
 }

-func (x *ListNodesRequest) GetId() uint64 {
-	if x != nil {
-		return x.Id
-	}
-	return 0
-}
-
-func (x *ListNodesRequest) GetName() string {
-	if x != nil {
-		return x.Name
-	}
-	return ""
-}
-
-func (x *ListNodesRequest) GetHostname() string {
-	if x != nil {
-		return x.Hostname
-	}
-	return ""
-}
-
-func (x *ListNodesRequest) GetIpAddresses() []string {
-	if x != nil {
-		return x.IpAddresses
-	}
-	return nil
-}
-
 type ListNodesResponse struct {
 	state         protoimpl.MessageState `protogen:"open.v1"`
 	Nodes         []*Node                `protobuf:"bytes,1,rep,name=nodes,proto3" json:"nodes,omitempty"`
@@ -1390,13 +1358,9 @@ const file_headscale_v1_node_proto_rawDesc = "" +
 	"\anode_id\x18\x01 \x01(\x04R\x06nodeId\x12\x19\n" +
 	"\bnew_name\x18\x02 \x01(\tR\anewName\"<\n" +
 	"\x12RenameNodeResponse\x12&\n" +
-	"\x04node\x18\x01 \x01(\v2\x12.headscale.v1.NodeR\x04node\"\x89\x01\n" +
+	"\x04node\x18\x01 \x01(\v2\x12.headscale.v1.NodeR\x04node\"&\n" +
 	"\x10ListNodesRequest\x12\x12\n" +
-	"\x04user\x18\x01 \x01(\tR\x04user\x12\x0e\n" +
-	"\x02id\x18\x02 \x01(\x04R\x02id\x12\x12\n" +
-	"\x04name\x18\x03 \x01(\tR\x04name\x12\x1a\n" +
-	"\bhostname\x18\x04 \x01(\tR\bhostname\x12!\n" +
-	"\fip_addresses\x18\x05 \x03(\tR\vipAddresses\"=\n" +
+	"\x04user\x18\x01 \x01(\tR\x04user\"=\n" +
 	"\x11ListNodesResponse\x12(\n" +
 	"\x05nodes\x18\x01 \x03(\v2\x12.headscale.v1.NodeR\x05nodes\">\n" +
 	"\x0fMoveNodeRequest\x12\x17\n" +
--- a/gen/go/headscale/v1/policy.pb.go
+++ b/gen/go/headscale/v1/policy.pb.go
@@ -1,6 +1,6 @@
 // Code generated by protoc-gen-go. DO NOT EDIT.
 // versions:
-// 	protoc-gen-go v1.36.6
+// 	protoc-gen-go v1.36.8
 // 	protoc        (unknown)
 // source: headscale/v1/policy.proto

--- a/gen/go/headscale/v1/preauthkey.pb.go
+++ b/gen/go/headscale/v1/preauthkey.pb.go
@@ -1,6 +1,6 @@
 // Code generated by protoc-gen-go. DO NOT EDIT.
 // versions:
-// 	protoc-gen-go v1.36.6
+// 	protoc-gen-go v1.36.8
 // 	protoc        (unknown)
 // source: headscale/v1/preauthkey.proto

--- a/gen/go/headscale/v1/user.pb.go
+++ b/gen/go/headscale/v1/user.pb.go
@@ -1,6 +1,6 @@
 // Code generated by protoc-gen-go. DO NOT EDIT.
 // versions:
-// 	protoc-gen-go v1.36.6
+// 	protoc-gen-go v1.36.8
 // 	protoc        (unknown)
 // source: headscale/v1/user.proto

--- a/gen/openapiv2/headscale/v1/headscale.swagger.json
+++ b/gen/openapiv2/headscale/v1/headscale.swagger.json
@@ -187,35 +187,6 @@
            "in": "query",
            "required": false,
            "type": "string"
-          },
-          {
-            "name": "id",
-            "in": "query",
-            "required": false,
-            "type": "string",
-            "format": "uint64"
-          },
-          {
-            "name": "name",
-            "in": "query",
-            "required": false,
-            "type": "string"
-          },
-          {
-            "name": "hostname",
-            "in": "query",
-            "required": false,
-            "type": "string"
-          },
-          {
-            "name": "ipAddresses",
-            "in": "query",
-            "required": false,
-            "type": "array",
-            "items": {
-              "type": "string"
-            },
-            "collectionFormat": "multi"
          }
        ],
        "tags": [
--- a/go.mod
+++ b/go.mod
@@ -1,11 +1,10 @@
 module github.com/juanfont/headscale

-go 1.24.0
+go 1.24.4

-toolchain go1.24.2
+toolchain go1.24.6

 require (
-	github.com/AlecAivazis/survey/v2 v2.3.7
 	github.com/arl/statsviz v0.6.0
 	github.com/cenkalti/backoff/v5 v5.0.2
 	github.com/chasefleming/elem-go v0.30.0
@@ -18,12 +17,12 @@ require (
 	github.com/fsnotify/fsnotify v1.9.0
 	github.com/glebarez/sqlite v1.11.0
 	github.com/go-gormigrate/gormigrate/v2 v2.1.4
+	github.com/go-json-experiment/json v0.0.0-20250223041408-d3c622f1b874
 	github.com/gofrs/uuid/v5 v5.3.2
 	github.com/google/go-cmp v0.7.0
 	github.com/gorilla/mux v1.8.1
 	github.com/grpc-ecosystem/grpc-gateway/v2 v2.27.0
 	github.com/jagottsicher/termcolor v1.0.2
-	github.com/klauspost/compress v1.18.0
 	github.com/oauth2-proxy/mockoidc v0.0.0-20240214162133-caebfff84d25
 	github.com/ory/dockertest/v3 v3.12.0
 	github.com/philip-bui/grpc-zerolog v1.0.1
@@ -43,11 +42,11 @@ require (
 	github.com/tailscale/tailsql v0.0.0-20250421235516-02f85f087b97
 	github.com/tcnksm/go-latest v0.0.0-20170313132115-e3007ae9052e
 	go4.org/netipx v0.0.0-20231129151722-fdeea329fbba
-	golang.org/x/crypto v0.39.0
+	golang.org/x/crypto v0.40.0
 	golang.org/x/exp v0.0.0-20250408133849-7e4ce0ab07d0
-	golang.org/x/net v0.41.0
+	golang.org/x/net v0.42.0
 	golang.org/x/oauth2 v0.30.0
-	golang.org/x/sync v0.15.0
+	golang.org/x/sync v0.16.0
 	google.golang.org/genproto/googleapis/api v0.0.0-20250603155806-513f23925822
 	google.golang.org/grpc v1.73.0
 	google.golang.org/protobuf v1.36.6
@@ -55,7 +54,7 @@ require (
 	gopkg.in/yaml.v3 v3.0.1
 	gorm.io/driver/postgres v1.6.0
 	gorm.io/gorm v1.30.0
-	tailscale.com v1.84.2
+	tailscale.com v1.86.5
 	zgo.at/zcache/v2 v2.2.0
 	zombiezen.com/go/postgrestest v1.0.1
 )
@@ -132,11 +131,10 @@ require (
 	github.com/glebarez/go-sqlite v1.22.0 // indirect
 	github.com/go-jose/go-jose/v3 v3.0.4 // indirect
 	github.com/go-jose/go-jose/v4 v4.1.0 // indirect
-	github.com/go-json-experiment/json v0.0.0-20250223041408-d3c622f1b874 // indirect
 	github.com/go-logr/logr v1.4.2 // indirect
 	github.com/go-logr/stdr v1.2.2 // indirect
 	github.com/go-ole/go-ole v1.3.0 // indirect
-	github.com/go-viper/mapstructure/v2 v2.2.1 // indirect
+	github.com/go-viper/mapstructure/v2 v2.4.0 // indirect
 	github.com/godbus/dbus/v5 v5.1.1-0.20230522191255-76236955d466 // indirect
 	github.com/gogo/protobuf v1.3.2 // indirect
 	github.com/golang-jwt/jwt/v5 v5.2.2 // indirect
@@ -150,8 +148,6 @@ require (
 	github.com/google/shlex v0.0.0-20191202100458-e7afc7fbc510 // indirect
 	github.com/google/uuid v1.6.0 // indirect
 	github.com/gookit/color v1.5.4 // indirect
-	github.com/gorilla/csrf v1.7.3 // indirect
-	github.com/gorilla/securecookie v1.1.2 // indirect
 	github.com/gorilla/websocket v1.5.3 // indirect
 	github.com/hashicorp/go-version v1.7.0 // indirect
 	github.com/hdevalence/ed25519consensus v0.2.0 // indirect
@@ -165,7 +161,7 @@ require (
 	github.com/jinzhu/now v1.1.5 // indirect
 	github.com/jmespath/go-jmespath v0.4.0 // indirect
 	github.com/jsimonetti/rtnetlink v1.4.1 // indirect
-	github.com/kballard/go-shellquote v0.0.0-20180428030007-95032a82bc51 // indirect
+	github.com/klauspost/compress v1.18.0 // indirect
 	github.com/kr/pretty v0.3.1 // indirect
 	github.com/kr/text v0.2.0 // indirect
 	github.com/lib/pq v1.10.9 // indirect
@@ -177,7 +173,6 @@ require (
 	github.com/mdlayher/netlink v1.7.3-0.20250113171957-fbb4dce95f42 // indirect
 	github.com/mdlayher/sdnotify v1.0.0 // indirect
 	github.com/mdlayher/socket v0.5.0 // indirect
-	github.com/mgutz/ansi v0.0.0-20200706080929-d51e80ef957d // indirect
 	github.com/miekg/dns v1.1.58 // indirect
 	github.com/mitchellh/go-ps v1.0.0 // indirect
 	github.com/moby/docker-image-spec v1.3.1 // indirect
@@ -215,7 +210,7 @@ require (
 	github.com/tailscale/peercred v0.0.0-20250107143737-35a0c7bd7edc // indirect
 	github.com/tailscale/setec v0.0.0-20250305161714-445cadbbca3d // indirect
 	github.com/tailscale/web-client-prebuilt v0.0.0-20250124233751-d4cd19a26976 // indirect
-	github.com/tailscale/wireguard-go v0.0.0-20250304000100-91a0587fb251 // indirect
+	github.com/tailscale/wireguard-go v0.0.0-20250716170648-1d0488a3d7da // indirect
 	github.com/vishvananda/netns v0.0.4 // indirect
 	github.com/x448/float16 v0.8.4 // indirect
 	github.com/xeipuuv/gojsonpointer v0.0.0-20190905194746-02993c407bfb // indirect
@@ -231,14 +226,19 @@ require (
 	go.opentelemetry.io/otel/trace v1.36.0 // indirect
 	go.uber.org/multierr v1.11.0 // indirect
 	go4.org/mem v0.0.0-20240501181205-ae6ca9944745 // indirect
-	golang.org/x/mod v0.25.0 // indirect
-	golang.org/x/sys v0.33.0 // indirect
-	golang.org/x/term v0.32.0 // indirect
-	golang.org/x/text v0.26.0 // indirect
-	golang.org/x/time v0.10.0 // indirect
-	golang.org/x/tools v0.33.0 // indirect
+	golang.org/x/mod v0.26.0 // indirect
+	golang.org/x/sys v0.34.0 // indirect
+	golang.org/x/term v0.33.0 // indirect
+	golang.org/x/text v0.27.0 // indirect
+	golang.org/x/time v0.11.0 // indirect
+	golang.org/x/tools v0.35.0 // indirect
 	golang.zx2c4.com/wintun v0.0.0-20230126152724-0fa3db229ce2 // indirect
 	golang.zx2c4.com/wireguard/windows v0.5.3 // indirect
 	google.golang.org/genproto/googleapis/rpc v0.0.0-20250603155806-513f23925822 // indirect
 	gvisor.dev/gvisor v0.0.0-20250205023644-9414b50a5633 // indirect
 )
+
+tool (
+	golang.org/x/tools/cmd/stringer
+	tailscale.com/cmd/viewer
+)
--- a/go.sum
+++ b/go.sum
@@ -14,8 +14,6 @@ filippo.io/edwards25519 v1.1.0 h1:FNf4tywRC1HmFuKW5xopWpigGjJKiJSV0Cqo0cJWDaA=
 filippo.io/edwards25519 v1.1.0/go.mod h1:BxyFTGdWcka3PhytdK4V28tE5sGfRvvvRV7EaN4VDT4=
 filippo.io/mkcert v1.4.4 h1:8eVbbwfVlaqUM7OwuftKc2nuYOoTDQWqsoXmzoXZdbc=
 filippo.io/mkcert v1.4.4/go.mod h1:VyvOchVuAye3BoUsPUOOofKygVwLV2KQMVFJNRq+1dA=
-github.com/AlecAivazis/survey/v2 v2.3.7 h1:6I/u8FvytdGsgonrYsVn2t8t4QiRnh6QSTqkkhIiSjQ=
-github.com/AlecAivazis/survey/v2 v2.3.7/go.mod h1:xUTIdE4KCOIjsBAE1JYsUPoCqYdZ1reCfTwbto0Fduo=
 github.com/Azure/go-ansiterm v0.0.0-20250102033503-faa5f7b0171c h1:udKWzYgxTojEKWjV8V+WSxDXJ4NFATAsZjh8iIbsQIg=
 github.com/Azure/go-ansiterm v0.0.0-20250102033503-faa5f7b0171c/go.mod h1:xomTg63KZ2rFqZQzSB4Vz2SUXa1BpHTVz9L5PTmPC4E=
 github.com/BurntSushi/toml v1.4.1-0.20240526193622-a339e1f7089c h1:pxW6RcqyfI9/kWtOwnv/G+AzdKuy2ZrqINhenH4HyNs=
@@ -31,8 +29,6 @@ github.com/MarvinJWendt/testza v0.5.2 h1:53KDo64C1z/h/d/stCYCPY69bt/OSwjq5KpFNwi
 github.com/MarvinJWendt/testza v0.5.2/go.mod h1:xu53QFE5sCdjtMCKk8YMQ2MnymimEctc4n3EjyIYvEY=
 github.com/Microsoft/go-winio v0.6.2 h1:F2VQgta7ecxGYO8k3ZZz3RS8fVIXVxONVUPlNERoyfY=
 github.com/Microsoft/go-winio v0.6.2/go.mod h1:yd8OoFMLzJbo9gZq8j5qaps8bJ9aShtEA8Ipt1oGCvU=
-github.com/Netflix/go-expect v0.0.0-20220104043353-73e0943537d2 h1:+vx7roKuyA63nhn5WAunQHLTznkw5W8b1Xc0dNjp83s=
-github.com/Netflix/go-expect v0.0.0-20220104043353-73e0943537d2/go.mod h1:HBCaDeC1lPdgDeDbhX8XFpy1jqjK0IBG8W5K+xYqA0w=
 github.com/Nvveen/Gotty v0.0.0-20120604004816-cd527374f1e5 h1:TngWCqHvy9oXAN6lEVMRuU21PR1EtLVZJmdB18Gu3Rw=
 github.com/Nvveen/Gotty v0.0.0-20120604004816-cd527374f1e5/go.mod h1:lmUJ/7eu/Q8D7ML55dXQrVaamCz2vxCfdQBasLZfHKk=
 github.com/akutz/memconn v0.1.0 h1:NawI0TORU4hcOMsMr11g7vwlCdkYeLKXBcxWu2W/P8A=
@@ -131,7 +127,6 @@ github.com/creachadair/mds v0.24.3/go.mod h1:0oeHt9QWu8VfnmskOL4zi2CumjEvB29Scmt
 github.com/creachadair/taskgroup v0.13.2 h1:3KyqakBuFsm3KkXi/9XIb0QcA8tEzLHLgaoidf0MdVc=
 github.com/creachadair/taskgroup v0.13.2/go.mod h1:i3V1Zx7H8RjwljUEeUWYT30Lmb9poewSb2XI1yTwD0g=
 github.com/creack/pty v1.1.9/go.mod h1:oKZEueFk5CKHvIhNR5MUki03XCEU+Q6VDXinZuGJ33E=
-github.com/creack/pty v1.1.17/go.mod h1:MOBLtS5ELjhRRrroQr9kyvTxUAFNvYEK993ew/Vr4O4=
 github.com/creack/pty v1.1.23 h1:4M6+isWdcStXEf15G/RbrMPOQj1dZ7HPZCGwE4kOeP0=
 github.com/creack/pty v1.1.23/go.mod h1:08sCNb52WyoAwi2QDyzUCTgcvVFhUzewun7wtTfvcwE=
 github.com/davecgh/go-spew v1.1.0/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
@@ -154,8 +149,6 @@ github.com/docker/go-connections v0.5.0 h1:USnMq7hx7gwdVZq1L49hLXaFtUdTADjXGp+uj
 github.com/docker/go-connections v0.5.0/go.mod h1:ov60Kzw0kKElRwhNs9UlUHAE/F9Fe6GLaXnqyDdmEXc=
 github.com/docker/go-units v0.5.0 h1:69rxXcBk27SvSaaxTtLh/8llcHD8vYHT7WSdRZ/jvr4=
 github.com/docker/go-units v0.5.0/go.mod h1:fgPhTUdO+D/Jk86RDLlptpiXQzgHJF7gydDDbaIK4Dk=
-github.com/dsnet/try v0.0.3 h1:ptR59SsrcFUYbT/FhAbKTV6iLkeD6O18qfIWRml2fqI=
-github.com/dsnet/try v0.0.3/go.mod h1:WBM8tRpUmnXXhY1U6/S8dt6UWdHTQ7y8A5YSkRCkq40=
 github.com/dustin/go-humanize v1.0.1 h1:GzkhY7T5VNhEkwH0PVJgjz+fX1rhBrR7pRT3mDkpeCY=
 github.com/dustin/go-humanize v1.0.1/go.mod h1:Mu1zIs6XwVuF/gI1OepvI0qD18qycQx+mFykh5fBlto=
 github.com/felixge/fgprof v0.9.3/go.mod h1:RdbpDgzqYVh/T9fPELJyV7EYJuHB55UTEULNun8eiPw=
@@ -194,8 +187,8 @@ github.com/go-ole/go-ole v1.3.0 h1:Dt6ye7+vXGIKZ7Xtk4s6/xVdGDQynvom7xCFEdWr6uE=
 github.com/go-ole/go-ole v1.3.0/go.mod h1:5LS6F96DhAwUc7C+1HLexzMXY1xGRSryjyPPKW6zv78=
 github.com/go-sql-driver/mysql v1.8.1 h1:LedoTUt/eveggdHS9qUFC1EFSa8bU2+1pZjSRpvNJ1Y=
 github.com/go-sql-driver/mysql v1.8.1/go.mod h1:wEBSXgmK//2ZFJyE+qWnIsVGmvmEKlqwuVSjsCm7DZg=
-github.com/go-viper/mapstructure/v2 v2.2.1 h1:ZAaOCxANMuZx5RCeg0mBdEZk7DZasvvZIxtHqx8aGss=
-github.com/go-viper/mapstructure/v2 v2.2.1/go.mod h1:oJDH3BJKyqBA2TXFhDsKDGDTlndYOZ6rGS0BRZIxGhM=
+github.com/go-viper/mapstructure/v2 v2.4.0 h1:EBsztssimR/CONLSZZ04E8qAkxNYq4Qp9LvH92wZUgs=
+github.com/go-viper/mapstructure/v2 v2.4.0/go.mod h1:oJDH3BJKyqBA2TXFhDsKDGDTlndYOZ6rGS0BRZIxGhM=
 github.com/go4org/plan9netshell v0.0.0-20250324183649-788daa080737 h1:cf60tHxREO3g1nroKr2osU3JWZsJzkfi7rEg+oAB0Lo=
 github.com/go4org/plan9netshell v0.0.0-20250324183649-788daa080737/go.mod h1:MIS0jDzbU/vuM9MC4YnBITCv+RYuTRq8dJzmCrFsK9g=
 github.com/gobwas/httphead v0.1.0/go.mod h1:O/RXo79gxV8G+RqlR/otEwx4Q36zl9rqC5u12GKvMCM=
@@ -226,8 +219,6 @@ github.com/google/go-querystring v1.1.0 h1:AnCroh3fv4ZBgVIf1Iwtovgjaw/GiKJo8M8yD
 github.com/google/go-querystring v1.1.0/go.mod h1:Kcdr2DB4koayq7X8pmAG4sNG59So17icRSOU623lUBU=
 github.com/google/go-tpm v0.9.4 h1:awZRf9FwOeTunQmHoDYSHJps3ie6f1UlhS1fOdPEt1I=
 github.com/google/go-tpm v0.9.4/go.mod h1:h9jEsEECg7gtLis0upRBQU+GhYVH6jMjrFxI8u6bVUY=
-github.com/google/gofuzz v1.2.0 h1:xRy4A+RhZaiKjJ1bPfwQ8sedCA+YS2YcCHW6ec7JMi0=
-github.com/google/gofuzz v1.2.0/go.mod h1:dBl0BpW6vV/+mYPU4Po3pmUjxk6FQPldtuIdl/M65Eg=
 github.com/google/nftables v0.2.1-0.20240414091927-5e242ec57806 h1:wG8RYIyctLhdFk6Vl1yPGtSRtwGpVkWyZww1OCil2MI=
 github.com/google/nftables v0.2.1-0.20240414091927-5e242ec57806/go.mod h1:Beg6V6zZ3oEn0JuiUQ4wqwuyqqzasOltcoXPtgLbFp4=
 github.com/google/pprof v0.0.0-20211214055906-6f57359322fd/go.mod h1:KgnwoLYCZ8IQu3XUZ8Nc/bM9CCZFOyjUNOSygVozoDg=
@@ -242,12 +233,8 @@ github.com/gookit/color v1.4.2/go.mod h1:fqRyamkC1W8uxl+lxCQxOT09l/vYfZ+QeiX3rKQ
 github.com/gookit/color v1.5.0/go.mod h1:43aQb+Zerm/BWh2GnrgOQm7ffz7tvQXEKV6BFMl7wAo=
 github.com/gookit/color v1.5.4 h1:FZmqs7XOyGgCAxmWyPslpiok1k05wmY3SJTytgvYFs0=
 github.com/gookit/color v1.5.4/go.mod h1:pZJOeOS8DM43rXbp4AZo1n9zCU2qjpcRko0b6/QJi9w=
-github.com/gorilla/csrf v1.7.3 h1:BHWt6FTLZAb2HtWT5KDBf6qgpZzvtbp9QWDRKZMXJC0=
-github.com/gorilla/csrf v1.7.3/go.mod h1:F1Fj3KG23WYHE6gozCmBAezKookxbIvUJT+121wTuLk=
 github.com/gorilla/mux v1.8.1 h1:TuBL49tXwgrFYWhqrNgrUNEY92u81SPhu7sTdzQEiWY=
 github.com/gorilla/mux v1.8.1/go.mod h1:AKf9I4AEqPTmMytcMc0KkNouC66V3BtZ4qD5fmWSiMQ=
-github.com/gorilla/securecookie v1.1.2 h1:YCIWL56dvtr73r6715mJs5ZvhtnY73hBvEF8kXD8ePA=
-github.com/gorilla/securecookie v1.1.2/go.mod h1:NfCASbcHqRSY+3a8tlWJwsQap2VX5pwzwo4h3eOamfo=
 github.com/gorilla/websocket v1.5.3 h1:saDtZ6Pbx/0u+bgYQ3q96pZgCzfhKXGPqt7kZ72aNNg=
 github.com/gorilla/websocket v1.5.3/go.mod h1:YR8l580nyteQvAITg2hZ9XVh4b55+EU/adAjf1fMHhE=
 github.com/grpc-ecosystem/grpc-gateway/v2 v2.27.0 h1:+epNPbD5EqgpEMm5wrl4Hqts3jZt8+kYaqUisuuIGTk=
@@ -256,8 +243,6 @@ github.com/hashicorp/go-version v1.7.0 h1:5tqGy27NaOTB8yJKUZELlFAS/LTKJkrmONwQKe
 github.com/hashicorp/go-version v1.7.0/go.mod h1:fltr4n8CU8Ke44wwGCBoEymUuxUHl09ZGVZPK5anwXA=
 github.com/hdevalence/ed25519consensus v0.2.0 h1:37ICyZqdyj0lAZ8P4D1d1id3HqbbG1N3iBb1Tb4rdcU=
 github.com/hdevalence/ed25519consensus v0.2.0/go.mod h1:w3BHWjwJbFU29IRHL1Iqkw3sus+7FctEyM4RqDxYNzo=
-github.com/hinshun/vt10x v0.0.0-20220119200601-820417d04eec h1:qv2VnGeEQHchGaZ/u7lxST/RaJw+cv273q79D81Xbog=
-github.com/hinshun/vt10x v0.0.0-20220119200601-820417d04eec/go.mod h1:Q48J4R4DvxnHolD5P8pOtXigYlRuPLGl6moFx3ulM68=
 github.com/ianlancetaylor/demangle v0.0.0-20210905161508-09a460cdf81d/go.mod h1:aYm2/VgdVmcIU8iMfdMvDMsRAQjcfZSKFby6HOFvi/w=
 github.com/ianlancetaylor/demangle v0.0.0-20230524184225-eabc099b10ab/go.mod h1:gx7rwoVhcfuVKG5uya9Hs3Sxj7EIvldVofAWIUtGouw=
 github.com/illarion/gonotify/v3 v3.0.2 h1:O7S6vcopHexutmpObkeWsnzMJt/r1hONIEogeVNmJMk=
@@ -289,8 +274,6 @@ github.com/jmespath/go-jmespath/internal/testify v1.5.1/go.mod h1:L3OGu8Wl2/fWfC
 github.com/josharian/intern v1.0.0/go.mod h1:5DoeVV0s6jJacbCEi61lwdGj/aVlrQvzHFFd8Hwg//Y=
 github.com/jsimonetti/rtnetlink v1.4.1 h1:JfD4jthWBqZMEffc5RjgmlzpYttAVw1sdnmiNaPO3hE=
 github.com/jsimonetti/rtnetlink v1.4.1/go.mod h1:xJjT7t59UIZ62GLZbv6PLLo8VFrostJMPBAheR6OM8w=
-github.com/kballard/go-shellquote v0.0.0-20180428030007-95032a82bc51 h1:Z9n2FFNUXsshfwJMBgNA0RU6/i7WVaAegv3PtuIHPMs=
-github.com/kballard/go-shellquote v0.0.0-20180428030007-95032a82bc51/go.mod h1:CzGEWj7cYgsdH8dAjBGEr58BoE7ScuLd+fwFZ44+/x8=
 github.com/kisielk/errcheck v1.5.0/go.mod h1:pFxgyoBC7bSaBwPgfKdkLd5X25qrDl4LWUI2bnpBCr8=
 github.com/kisielk/gotool v1.0.0/go.mod h1:XhKaO+MFFWcvkIS/tQcRk01m1F5IRFswLeQ+oQHNcck=
 github.com/klauspost/compress v1.18.0 h1:c/Cqfb0r+Yi+JtIEq73FWXVkRonBlf0CRNYc8Zttxdo=
@@ -321,11 +304,9 @@ github.com/lib/pq v1.10.9/go.mod h1:AlVN5x4E4T544tWzH6hKfbfQvm3HdbOxrmggDNAPY9o=
 github.com/lithammer/fuzzysearch v1.1.8 h1:/HIuJnjHuXS8bKaiTMeeDlW2/AyIWk2brx1V8LFgLN4=
 github.com/lithammer/fuzzysearch v1.1.8/go.mod h1:IdqeyBClc3FFqSzYq/MXESsS4S0FsZ5ajtkr5xPLts4=
 github.com/mailru/easyjson v0.7.7/go.mod h1:xzfreul335JAWq5oZzymOObrkdz5UnU4kGfJJLY9Nlc=
-github.com/mattn/go-colorable v0.1.2/go.mod h1:U0ppj6V5qS13XJ6of8GYAs25YV2eR4EVcfRqFIhoBtE=
 github.com/mattn/go-colorable v0.1.13/go.mod h1:7S9/ev0klgBDR4GtXTXX8a3vIGJpMovkB8vQcUbaXHg=
 github.com/mattn/go-colorable v0.1.14 h1:9A9LHSqF/7dyVVX6g0U9cwm9pG3kP9gSzcuIPHPsaIE=
 github.com/mattn/go-colorable v0.1.14/go.mod h1:6LmQG8QLFO4G5z1gPvYEzlUgJ2wF+stgPZH1UqBm1s8=
-github.com/mattn/go-isatty v0.0.8/go.mod h1:Iq45c/XA43vh69/j3iqttzPXn0bhXyGjM0Hdxcsrc5s=
 github.com/mattn/go-isatty v0.0.16/go.mod h1:kYGgaQfpe5nmfYZH+SKPsOc2e4SrIfOl2e/yFXSvRLM=
 github.com/mattn/go-isatty v0.0.19/go.mod h1:W+V8PltTTMOvKvAeJH7IuucS94S2C6jfK/D7dTCTo3Y=
 github.com/mattn/go-isatty v0.0.20 h1:xfD0iDuEKnDkl03q4limB+vH+GxLEtL/jb4xVJSWWEY=
@@ -341,9 +322,6 @@ github.com/mdlayher/sdnotify v1.0.0 h1:Ma9XeLVN/l0qpyx1tNeMSeTjCPH6NtuD6/N9XdTlQ
 github.com/mdlayher/sdnotify v1.0.0/go.mod h1:HQUmpM4XgYkhDLtd+Uad8ZFK1T9D5+pNxnXQjCeJlGE=
 github.com/mdlayher/socket v0.5.0 h1:ilICZmJcQz70vrWVes1MFera4jGiWNocSkykwwoy3XI=
 github.com/mdlayher/socket v0.5.0/go.mod h1:WkcBFfvyG8QENs5+hfQPl1X6Jpd2yeLIYgrGFmJiJxI=
-github.com/mgutz/ansi v0.0.0-20170206155736-9520e82c474b/go.mod h1:01TrycV0kFyexm33Z7vhZRXopbI8J3TDReVlkTgMUxE=
-github.com/mgutz/ansi v0.0.0-20200706080929-d51e80ef957d h1:5PJl274Y63IEHC+7izoQE9x6ikvDFZS2mDVS3drnohI=
-github.com/mgutz/ansi v0.0.0-20200706080929-d51e80ef957d/go.mod h1:01TrycV0kFyexm33Z7vhZRXopbI8J3TDReVlkTgMUxE=
 github.com/miekg/dns v1.1.58 h1:ca2Hdkz+cDg/7eNF6V56jjzuZ4aCAE+DbVkILdQWG/4=
 github.com/miekg/dns v1.1.58/go.mod h1:Ypv+3b/KadlvW9vJfXOTf300O4UqaHFzFCuHz+rPkBY=
 github.com/mitchellh/go-ps v1.0.0 h1:i6ampVEEF4wQFF+bkYfwYgY+F/uYJDktmvLPf7qIgjc=
@@ -492,8 +470,8 @@ github.com/tailscale/web-client-prebuilt v0.0.0-20250124233751-d4cd19a26976 h1:U
 github.com/tailscale/web-client-prebuilt v0.0.0-20250124233751-d4cd19a26976/go.mod h1:agQPE6y6ldqCOui2gkIh7ZMztTkIQKH049tv8siLuNQ=
 github.com/tailscale/wf v0.0.0-20240214030419-6fbb0a674ee6 h1:l10Gi6w9jxvinoiq15g8OToDdASBni4CyJOdHY1Hr8M=
 github.com/tailscale/wf v0.0.0-20240214030419-6fbb0a674ee6/go.mod h1:ZXRML051h7o4OcI0d3AaILDIad/Xw0IkXaHM17dic1Y=
-github.com/tailscale/wireguard-go v0.0.0-20250304000100-91a0587fb251 h1:h/41LFTrwMxB9Xvvug0kRdQCU5TlV1+pAMQw0ZtDE3U=
-github.com/tailscale/wireguard-go v0.0.0-20250304000100-91a0587fb251/go.mod h1:BOm5fXUBFM+m9woLNBoxI9TaBXXhGNP50LX/TGIvGb4=
+github.com/tailscale/wireguard-go v0.0.0-20250716170648-1d0488a3d7da h1:jVRUZPRs9sqyKlYHHzHjAqKN+6e/Vog6NpHYeNPJqOw=
+github.com/tailscale/wireguard-go v0.0.0-20250716170648-1d0488a3d7da/go.mod h1:BOm5fXUBFM+m9woLNBoxI9TaBXXhGNP50LX/TGIvGb4=
 github.com/tailscale/xnet v0.0.0-20240729143630-8497ac4dab2e h1:zOGKqN5D5hHhiYUp091JqK7DPCqSARyUfduhGUY8Bek=
 github.com/tailscale/xnet v0.0.0-20240729143630-8497ac4dab2e/go.mod h1:orPd6JZXXRyuDusYilywte7k094d7dycXXU5YnWsrwg=
 github.com/tc-hib/winres v0.2.1 h1:YDE0FiP0VmtRaDn7+aaChp1KiF4owBiJa5l964l5ujA=
@@ -555,20 +533,20 @@ golang.org/x/crypto v0.0.0-20191011191535-87dc89f01550/go.mod h1:yigFU9vqHzYiE8U
 golang.org/x/crypto v0.0.0-20200622213623-75b288015ac9/go.mod h1:LzIPMQfyMNhhGPhUkYOs5KpL4U8rLKemX1yGLhDgUto=
 golang.org/x/crypto v0.0.0-20210921155107-089bfa567519/go.mod h1:GvvjBRRGRdwPK5ydBHafDWAxML/pGHZbMvKqRZ5+Abc=
 golang.org/x/crypto v0.19.0/go.mod h1:Iy9bg/ha4yyC70EfRS8jz+B6ybOBKMaSxLj6P6oBDfU=
-golang.org/x/crypto v0.39.0 h1:SHs+kF4LP+f+p14esP5jAoDpHU8Gu/v9lFRK6IT5imM=
-golang.org/x/crypto v0.39.0/go.mod h1:L+Xg3Wf6HoL4Bn4238Z6ft6KfEpN0tJGo53AAPC632U=
+golang.org/x/crypto v0.40.0 h1:r4x+VvoG5Fm+eJcxMaY8CQM7Lb0l1lsmjGBQ6s8BfKM=
+golang.org/x/crypto v0.40.0/go.mod h1:Qr1vMER5WyS2dfPHAlsOj01wgLbsyWtFn/aY+5+ZdxY=
 golang.org/x/exp v0.0.0-20250408133849-7e4ce0ab07d0 h1:R84qjqJb5nVJMxqWYb3np9L5ZsaDtB+a39EqjV0JSUM=
 golang.org/x/exp v0.0.0-20250408133849-7e4ce0ab07d0/go.mod h1:S9Xr4PYopiDyqSyp5NjCrhFrqg6A5zA2E/iPHPhqnS8=
 golang.org/x/exp/typeparams v0.0.0-20240314144324-c7f7c6466f7f h1:phY1HzDcf18Aq9A8KkmRtY9WvOFIxN8wgfvy6Zm1DV8=
 golang.org/x/exp/typeparams v0.0.0-20240314144324-c7f7c6466f7f/go.mod h1:AbB0pIl9nAr9wVwH+Z2ZpaocVmF5I4GyWCDIsVjR0bk=
-golang.org/x/image v0.24.0 h1:AN7zRgVsbvmTfNyqIbbOraYL8mSwcKncEj8ofjgzcMQ=
-golang.org/x/image v0.24.0/go.mod h1:4b/ITuLfqYq1hqZcjofwctIhi7sZh2WaCjvsBNjjya8=
+golang.org/x/image v0.27.0 h1:C8gA4oWU/tKkdCfYT6T2u4faJu3MeNS5O8UPWlPF61w=
+golang.org/x/image v0.27.0/go.mod h1:xbdrClrAUway1MUTEZDq9mz/UpRwYAkFFNUslZtcB+g=
 golang.org/x/mod v0.2.0/go.mod h1:s0Qsj1ACt9ePp/hMypM3fl4fZqREWJwdYDEqhRiZZUA=
 golang.org/x/mod v0.3.0/go.mod h1:s0Qsj1ACt9ePp/hMypM3fl4fZqREWJwdYDEqhRiZZUA=
 golang.org/x/mod v0.6.0-dev.0.20220419223038-86c51ed26bb4/go.mod h1:jJ57K6gSWd91VN4djpZkiMVwK6gcyfeH4XE8wZrZaV4=
 golang.org/x/mod v0.8.0/go.mod h1:iBbtSCu2XBx23ZKBPSOrRkjjQPZFPuis4dIYUhu/chs=
-golang.org/x/mod v0.25.0 h1:n7a+ZbQKQA/Ysbyb0/6IbB1H/X41mKgbhfv7AfG/44w=
-golang.org/x/mod v0.25.0/go.mod h1:IXM97Txy2VM4PJ3gI61r1YEk/gAj6zAHN3AdZt6S9Ww=
+golang.org/x/mod v0.26.0 h1:EGMPT//Ezu+ylkCijjPc+f4Aih7sZvaAr+O3EHBxvZg=
+golang.org/x/mod v0.26.0/go.mod h1:/j6NAhSk8iQ723BGAUyoAcn7SlD7s15Dp9Nd/SfeaFQ=
 golang.org/x/net v0.0.0-20190404232315-eb5bcb51f2a3/go.mod h1:t9HGtf8HONx5eT2rtn7q6eTqICYqUVnKs3thJo3Qplg=
 golang.org/x/net v0.0.0-20190620200207-3b0461eec859/go.mod h1:z5CRVTTTmAJ677TzLLGU+0bjPO0LkuOLi4/5GtJWs/s=
 golang.org/x/net v0.0.0-20200226121028-0de0cce0169b/go.mod h1:z5CRVTTTmAJ677TzLLGU+0bjPO0LkuOLi4/5GtJWs/s=
@@ -577,8 +555,8 @@ golang.org/x/net v0.0.0-20210226172049-e18ecbb05110/go.mod h1:m0MpNAwzfU5UDzcl9v
 golang.org/x/net v0.0.0-20220722155237-a158d28d115b/go.mod h1:XRhObCWvk6IyKnWLug+ECip1KBveYUHfp+8e9klMJ9c=
 golang.org/x/net v0.6.0/go.mod h1:2Tu9+aMcznHK/AK1HMvgo6xiTLG5rD5rZLDS+rp2Bjs=
 golang.org/x/net v0.10.0/go.mod h1:0qNGK6F8kojg2nk9dLZ2mShWaEBan6FAoqfSigmmuDg=
-golang.org/x/net v0.41.0 h1:vBTly1HeNPEn3wtREYfy4GZ/NECgw2Cnl+nK6Nz3uvw=
-golang.org/x/net v0.41.0/go.mod h1:B/K4NNqkfmg07DQYrbwvSluqCJOOXwUjeb/5lOisjbA=
+golang.org/x/net v0.42.0 h1:jzkYrhi3YQWD6MLBJcsklgQsoAcw89EcZbJw8Z614hs=
+golang.org/x/net v0.42.0/go.mod h1:FF1RA5d3u7nAYA4z2TkclSCKh68eSXtiFwcWQpPXdt8=
 golang.org/x/oauth2 v0.30.0 h1:dnDm7JmhM45NNpd8FDDeLhK6FwqbOf4MLCM9zb1BOHI=
 golang.org/x/oauth2 v0.30.0/go.mod h1:B++QgG3ZKulg6sRPGD/mqlHQs5rB3Ml9erfeDY7xKlU=
 golang.org/x/sync v0.0.0-20190423024810-112230192c58/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
@@ -587,10 +565,9 @@ golang.org/x/sync v0.0.0-20201020160332-67f06af15bc9/go.mod h1:RxMgew5VJxzue5/jJ
 golang.org/x/sync v0.0.0-20210220032951-036812b2e83c/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
 golang.org/x/sync v0.0.0-20220722155255-886fb9371eb4/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
 golang.org/x/sync v0.1.0/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
-golang.org/x/sync v0.15.0 h1:KWH3jNZsfyT6xfAfKiz6MRNmd46ByHDYaZ7KSkCtdW8=
-golang.org/x/sync v0.15.0/go.mod h1:1dzgHSNfp02xaA81J2MS99Qcpr2w7fw1gpm99rleRqA=
+golang.org/x/sync v0.16.0 h1:ycBJEhp9p4vXvUZNszeOq0kGTPghopOL8q0fq3vstxw=
+golang.org/x/sync v0.16.0/go.mod h1:1dzgHSNfp02xaA81J2MS99Qcpr2w7fw1gpm99rleRqA=
 golang.org/x/sys v0.0.0-20190215142949-d0b11bdaac8a/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY=
-golang.org/x/sys v0.0.0-20190222072716-a9d3bda3a223/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY=
 golang.org/x/sys v0.0.0-20190412213103-97732733099d/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
 golang.org/x/sys v0.0.0-20200217220822-9197077df867/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
 golang.org/x/sys v0.0.0-20200728102440-3e129f6d46b1/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
@@ -615,8 +592,8 @@ golang.org/x/sys v0.6.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
 golang.org/x/sys v0.8.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
 golang.org/x/sys v0.12.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
 golang.org/x/sys v0.17.0/go.mod h1:/VUhepiaJMQUp4+oa/7Zr1D23ma6VTLIYjOOTFZPUcA=
-golang.org/x/sys v0.33.0 h1:q3i8TbbEz+JRD9ywIRlyRAQbM0qF7hu24q3teo2hbuw=
-golang.org/x/sys v0.33.0/go.mod h1:BJP2sWEmIv4KK5OTEluFJCKSidICx8ciO85XgH3Ak8k=
+golang.org/x/sys v0.34.0 h1:H5Y5sJ2L2JRdyv7ROF1he/lPdvFsd0mJHFw2ThKHxLA=
+golang.org/x/sys v0.34.0/go.mod h1:BJP2sWEmIv4KK5OTEluFJCKSidICx8ciO85XgH3Ak8k=
 golang.org/x/term v0.0.0-20201126162022-7de9c90e9dd1/go.mod h1:bj7SfCRtBDWHUb9snDiAeCFNEtKQo2Wmx5Cou7ajbmo=
 golang.org/x/term v0.0.0-20210220032956-6a3ed077a48d/go.mod h1:bj7SfCRtBDWHUb9snDiAeCFNEtKQo2Wmx5Cou7ajbmo=
 golang.org/x/term v0.0.0-20210615171337-6886f2dfbf5b/go.mod h1:jbD1KX2456YbFQfuXm/mYQcufACuNUgVhRMnK/tPxf8=
@@ -624,27 +601,26 @@ golang.org/x/term v0.0.0-20210927222741-03fcf44c2211/go.mod h1:jbD1KX2456YbFQfuX
 golang.org/x/term v0.5.0/go.mod h1:jMB1sMXY+tzblOD4FWmEbocvup2/aLOaQEp7JmGp78k=
 golang.org/x/term v0.8.0/go.mod h1:xPskH00ivmX89bAKVGSKKtLOWNx2+17Eiy94tnKShWo=
 golang.org/x/term v0.17.0/go.mod h1:lLRBjIVuehSbZlaOtGMbcMncT+aqLLLmKrsjNrUguwk=
-golang.org/x/term v0.32.0 h1:DR4lr0TjUs3epypdhTOkMmuF5CDFJ/8pOnbzMZPQ7bg=
-golang.org/x/term v0.32.0/go.mod h1:uZG1FhGx848Sqfsq4/DlJr3xGGsYMu/L5GW4abiaEPQ=
+golang.org/x/term v0.33.0 h1:NuFncQrRcaRvVmgRkvM3j/F00gWIAlcmlB8ACEKmGIg=
+golang.org/x/term v0.33.0/go.mod h1:s18+ql9tYWp1IfpV9DmCtQDDSRBUjKaw9M1eAv5UeF0=
 golang.org/x/text v0.3.0/go.mod h1:NqM8EUOU14njkJ3fqMW+pc6Ldnwhi/IjpwHt7yyuwOQ=
 golang.org/x/text v0.3.3/go.mod h1:5Zoc/QRtKVWzQhOtBMvqHzDpF6irO9z98xDceosuGiQ=
 golang.org/x/text v0.3.7/go.mod h1:u+2+/6zg+i71rQMx5EYifcz6MCKuco9NR6JIITiCfzQ=
-golang.org/x/text v0.4.0/go.mod h1:mrYo+phRRbMaCq/xk9113O4dZlRixOauAjOtrjsXDZ8=
 golang.org/x/text v0.7.0/go.mod h1:mrYo+phRRbMaCq/xk9113O4dZlRixOauAjOtrjsXDZ8=
 golang.org/x/text v0.9.0/go.mod h1:e1OnstbJyHTd6l/uOt8jFFHp6TRDWZR/bV3emEE/zU8=
 golang.org/x/text v0.14.0/go.mod h1:18ZOQIKpY8NJVqYksKHtTdi31H5itFRjB5/qKTNYzSU=
-golang.org/x/text v0.26.0 h1:P42AVeLghgTYr4+xUnTRKDMqpar+PtX7KWuNQL21L8M=
-golang.org/x/text v0.26.0/go.mod h1:QK15LZJUUQVJxhz7wXgxSy/CJaTFjd0G+YLonydOVQA=
-golang.org/x/time v0.10.0 h1:3usCWA8tQn0L8+hFJQNgzpWbd89begxN66o1Ojdn5L4=
-golang.org/x/time v0.10.0/go.mod h1:3BpzKBy/shNhVucY/MWOyx10tF3SFh9QdLuxbVysPQM=
+golang.org/x/text v0.27.0 h1:4fGWRpyh641NLlecmyl4LOe6yDdfaYNrGb2zdfo4JV4=
+golang.org/x/text v0.27.0/go.mod h1:1D28KMCvyooCX9hBiosv5Tz/+YLxj0j7XhWjpSUF7CU=
+golang.org/x/time v0.11.0 h1:/bpjEDfN9tkoN/ryeYHnv5hcMlc8ncjMcM4XBk5NWV0=
+golang.org/x/time v0.11.0/go.mod h1:CDIdPxbZBQxdj6cxyCIdrNogrJKMJ7pr37NYpMcMDSg=
 golang.org/x/tools v0.0.0-20180917221912-90fa682c2a6e/go.mod h1:n7NCudcB/nEzxVGmLbDWY5pfWTLqBcC2KZ6jyYvM4mQ=
 golang.org/x/tools v0.0.0-20191119224855-298f0cb1881e/go.mod h1:b+2E5dAYhXwXZwtnZ6UAqBI28+e2cm9otk0dWdXHAEo=
 golang.org/x/tools v0.0.0-20200619180055-7c47624df98f/go.mod h1:EkVYQZoAsY45+roYkvgYkIh4xh/qjgUK9TdY2XT94GE=
 golang.org/x/tools v0.0.0-20210106214847-113979e3529a/go.mod h1:emZCQorbCU4vsT4fOWvOPXz4eW1wZW4PmDk9uLelYpA=
 golang.org/x/tools v0.1.12/go.mod h1:hNGJHUnrk76NpqgfD5Aqm5Crs+Hm0VOH/i9J2+nxYbc=
 golang.org/x/tools v0.6.0/go.mod h1:Xwgl3UAJ/d3gWutnCtw505GrjyAbvKui8lOU390QaIU=
-golang.org/x/tools v0.33.0 h1:4qz2S3zmRxbGIhDIAgjxvFutSvH5EfnsYrRBj0UI0bc=
-golang.org/x/tools v0.33.0/go.mod h1:CIJMaWEY88juyUfo7UbgPqbC8rU2OqfAV1h2Qp0oMYI=
+golang.org/x/tools v0.35.0 h1:mBffYraMEf7aa0sB+NuKnuCy8qI/9Bughn8dC2Gu5r0=
+golang.org/x/tools v0.35.0/go.mod h1:NKdj5HkL/73byiZSJjqJgKn3ep7KjFkBOkR/Hps3VPw=
 golang.org/x/xerrors v0.0.0-20190717185122-a985d3407aa7/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0=
 golang.org/x/xerrors v0.0.0-20191011141410-1b5146add898/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0=
 golang.org/x/xerrors v0.0.0-20191204190536-9bdfabe68543/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0=
@@ -712,8 +688,8 @@ modernc.org/token v1.1.0 h1:Xl7Ap9dKaEs5kLoOQeQmPWevfnk/DM5qcLcYlA8ys6Y=
 modernc.org/token v1.1.0/go.mod h1:UGzOrNV1mAFSEB63lOFHIpNRUVMvYTc6yu1SMY/XTDM=
 software.sslmate.com/src/go-pkcs12 v0.4.0 h1:H2g08FrTvSFKUj+D309j1DPfk5APnIdAQAB8aEykJ5k=
 software.sslmate.com/src/go-pkcs12 v0.4.0/go.mod h1:Qiz0EyvDRJjjxGyUQa2cCNZn/wMyzrRJ/qcDXOQazLI=
-tailscale.com v1.84.2 h1:v6aM4RWUgYiV52LRAx6ET+dlGnvO/5lnqPXb7/pMnR0=
-tailscale.com v1.84.2/go.mod h1:6/S63NMAhmncYT/1zIPDJkvCuZwMw+JnUuOfSPNazpo=
+tailscale.com v1.86.5 h1:yBtWFjuLYDmxVnfnvPbZNZcKADCYgNfMd0rUAOA9XCs=
+tailscale.com v1.86.5/go.mod h1:Lm8dnzU2i/Emw15r6sl3FRNp/liSQ/nYw6ZSQvIdZ1M=
 zgo.at/zcache/v2 v2.2.0 h1:K29/IPjMniZfveYE+IRXfrl11tMzHkIPuyGrfVZ2fGo=
 zgo.at/zcache/v2 v2.2.0/go.mod h1:gyCeoLVo01QjDZynjime8xUGHHMbsLiPyUTBpDGd4Gk=
 zombiezen.com/go/postgrestest v1.0.1 h1:aXoADQAJmZDU3+xilYVut0pHhgc0sF8ZspPW9gFNwP4=
--- a/hscontrol/app.go
+++ b/hscontrol/app.go
@@ -17,6 +17,7 @@ import (
 	"syscall"
 	"time"

+	"github.com/cenkalti/backoff/v5"
 	"github.com/davecgh/go-spew/spew"
 	"github.com/gorilla/mux"
 	grpcRuntime "github.com/grpc-ecosystem/grpc-gateway/v2/runtime"
@@ -28,14 +29,15 @@ import (
 	derpServer "github.com/juanfont/headscale/hscontrol/derp/server"
 	"github.com/juanfont/headscale/hscontrol/dns"
 	"github.com/juanfont/headscale/hscontrol/mapper"
-	"github.com/juanfont/headscale/hscontrol/notifier"
 	"github.com/juanfont/headscale/hscontrol/state"
 	"github.com/juanfont/headscale/hscontrol/types"
+	"github.com/juanfont/headscale/hscontrol/types/change"
 	"github.com/juanfont/headscale/hscontrol/util"
 	zerolog "github.com/philip-bui/grpc-zerolog"
 	"github.com/pkg/profile"
 	zl "github.com/rs/zerolog"
 	"github.com/rs/zerolog/log"
+	"github.com/sasha-s/go-deadlock"
 	"golang.org/x/crypto/acme"
 	"golang.org/x/crypto/acme/autocert"
 	"golang.org/x/sync/errgroup"
@@ -64,6 +66,19 @@ var (
 	)
 )

+var (
+	debugDeadlock        = envknob.Bool("HEADSCALE_DEBUG_DEADLOCK")
+	debugDeadlockTimeout = envknob.RegisterDuration("HEADSCALE_DEBUG_DEADLOCK_TIMEOUT")
+)
+
+func init() {
+	deadlock.Opts.Disable = !debugDeadlock
+	if debugDeadlock {
+		deadlock.Opts.DeadlockTimeout = debugDeadlockTimeout()
+		deadlock.Opts.PrintAllCurrentGoroutines = true
+	}
+}
+
 const (
 	AuthPrefix         = "Bearer "
 	updateInterval     = 5 * time.Second
@@ -82,11 +97,10 @@ type Headscale struct {

 	// Things that generate changes
 	extraRecordMan *dns.ExtraRecordsMan
-	mapper         *mapper.Mapper
-	nodeNotifier   *notifier.Notifier
 	authProvider   AuthProvider
+	mapBatcher     mapper.Batcher

-	pollNetMapStreamWG sync.WaitGroup
+	clientStreamsOpen sync.WaitGroup
 }

 var (
@@ -115,34 +129,29 @@ func NewHeadscale(cfg *types.Config) (*Headscale, error) {
 	}

 	app := Headscale{
-		cfg:                cfg,
-		noisePrivateKey:    noisePrivateKey,
-		pollNetMapStreamWG: sync.WaitGroup{},
-		nodeNotifier:       notifier.NewNotifier(cfg),
-		state:              s,
+		cfg:               cfg,
+		noisePrivateKey:   noisePrivateKey,
+		clientStreamsOpen: sync.WaitGroup{},
+		state:             s,
 	}

 	// Initialize ephemeral garbage collector
 	ephemeralGC := db.NewEphemeralGarbageCollector(func(ni types.NodeID) {
-		node, err := app.state.GetNodeByID(ni)
-		if err != nil {
-			log.Err(err).Uint64("node.id", ni.Uint64()).Msgf("failed to get ephemeral node for deletion")
+		node, ok := app.state.GetNodeByID(ni)
+		if !ok {
+			log.Error().Uint64("node.id", ni.Uint64()).Msg("Ephemeral node deletion failed")
+			log.Debug().Caller().Uint64("node.id", ni.Uint64()).Msg("Ephemeral node deletion failed because node not found in NodeStore")
 			return
 		}

 		policyChanged, err := app.state.DeleteNode(node)
 		if err != nil {
-			log.Err(err).Uint64("node.id", ni.Uint64()).Msgf("failed to delete ephemeral node")
+			log.Error().Err(err).Uint64("node.id", ni.Uint64()).Str("node.name", node.Hostname()).Msg("Ephemeral node deletion failed")
 			return
 		}

-		// Send policy update notifications if needed
-		if policyChanged {
-			ctx := types.NotifyCtx(context.Background(), "ephemeral-gc-policy", node.Hostname)
-			app.nodeNotifier.NotifyAll(ctx, types.UpdateFull())
-		}
-
-		log.Debug().Uint64("node.id", ni.Uint64()).Msgf("deleted ephemeral node")
+		app.Change(policyChanged)
+		log.Debug().Caller().Uint64("node.id", ni.Uint64()).Str("node.name", node.Hostname()).Msg("Ephemeral node deleted because garbage collection timeout reached")
 	})
 	app.ephemeralGC = ephemeralGC

@@ -153,10 +162,9 @@ func NewHeadscale(cfg *types.Config) (*Headscale, error) {
 		defer cancel()
 		oidcProvider, err := NewAuthProviderOIDC(
 			ctx,
+			&app,
 			cfg.ServerURL,
 			&cfg.OIDC,
-			app.state,
-			app.nodeNotifier,
 		)
 		if err != nil {
 			if cfg.OIDC.OnlyStartIfOIDCIsAvailable {
@@ -262,31 +270,41 @@ func (h *Headscale) scheduledTasks(ctx context.Context) {
 			return

 		case <-expireTicker.C:
-			var update types.StateUpdate
+			var expiredNodeChanges []change.ChangeSet
 			var changed bool

-			lastExpiryCheck, update, changed = h.state.ExpireExpiredNodes(lastExpiryCheck)
+			lastExpiryCheck, expiredNodeChanges, changed = h.state.ExpireExpiredNodes(lastExpiryCheck)

 			if changed {
-				log.Trace().Interface("nodes", update.ChangePatches).Msgf("expiring nodes")
+				log.Trace().Interface("changes", expiredNodeChanges).Msgf("expiring nodes")

-				ctx := types.NotifyCtx(context.Background(), "expire-expired", "na")
-				h.nodeNotifier.NotifyAll(ctx, update)
+				// Send the changes directly since they're already in the new format
+				for _, nodeChange := range expiredNodeChanges {
+					h.Change(nodeChange)
+				}
 			}

 		case <-derpTickerChan:
 			log.Info().Msg("Fetching DERPMap updates")
-			derpMap := derp.GetDERPMap(h.cfg.DERP)
-			if h.cfg.DERP.ServerEnabled && h.cfg.DERP.AutomaticallyAddEmbeddedDerpRegion {
-				region, _ := h.DERPServer.GenerateRegion()
-				derpMap.Regions[region.RegionID] = &region
-			}
+			derpMap, err := backoff.Retry(ctx, func() (*tailcfg.DERPMap, error) {
+				derpMap, err := derp.GetDERPMap(h.cfg.DERP)
+				if err != nil {
+					return nil, err
+				}
+				if h.cfg.DERP.ServerEnabled && h.cfg.DERP.AutomaticallyAddEmbeddedDerpRegion {
+					region, _ := h.DERPServer.GenerateRegion()
+					derpMap.Regions[region.RegionID] = &region
+				}

-			ctx := types.NotifyCtx(context.Background(), "derpmap-update", "na")
-			h.nodeNotifier.NotifyAll(ctx, types.StateUpdate{
-				Type:    types.StateDERPUpdated,
-				DERPMap: derpMap,
-			})
+				return derpMap, nil
+			}, backoff.WithBackOff(backoff.NewExponentialBackOff()))
+			if err != nil {
+				log.Error().Err(err).Msg("failed to build new DERPMap, retrying later")
+				continue
+			}
+			h.state.SetDERPMap(derpMap)
+
+			h.Change(change.DERPSet)

 		case records, ok := <-extraRecordsUpdate:
 			if !ok {
@@ -294,19 +312,16 @@ func (h *Headscale) scheduledTasks(ctx context.Context) {
 			}
 			h.cfg.TailcfgDNSConfig.ExtraRecords = records

-			ctx := types.NotifyCtx(context.Background(), "dns-extrarecord", "all")
-			// TODO(kradalby): We can probably do better than sending a full update here,
-			// but for now this will ensure that all of the nodes get the new records.
-			h.nodeNotifier.NotifyAll(ctx, types.UpdateFull())
+			h.Change(change.ExtraRecordsSet)
 		}
 	}
 }

 func (h *Headscale) grpcAuthenticationInterceptor(ctx context.Context,
-	req interface{},
+	req any,
 	info *grpc.UnaryServerInfo,
 	handler grpc.UnaryHandler,
-) (interface{}, error) {
+) (any, error) {
 	// Check if the request is coming from the on-server client.
 	// This is not secure, but it is to maintain maintainability
 	// with the "legacy" database-based client
@@ -365,64 +380,53 @@ func (h *Headscale) httpAuthenticationMiddleware(next http.Handler) http.Handler
 		writer http.ResponseWriter,
 		req *http.Request,
 	) {
-		log.Trace().
-			Caller().
-			Str("client_address", req.RemoteAddr).
-			Msg("HTTP authentication invoked")
-
-		authHeader := req.Header.Get("authorization")
-
-		if !strings.HasPrefix(authHeader, AuthPrefix) {
-			log.Error().
+		if err := func() error {
+			log.Trace().
 				Caller().
 				Str("client_address", req.RemoteAddr).
-				Msg(`missing "Bearer " prefix in "Authorization" header`)
-			writer.WriteHeader(http.StatusUnauthorized)
-			_, err := writer.Write([]byte("Unauthorized"))
+				Msg("HTTP authentication invoked")
+
+			authHeader := req.Header.Get("Authorization")
+
+			if !strings.HasPrefix(authHeader, AuthPrefix) {
+				log.Error().
+					Caller().
+					Str("client_address", req.RemoteAddr).
+					Msg(`missing "Bearer " prefix in "Authorization" header`)
+				writer.WriteHeader(http.StatusUnauthorized)
+				_, err := writer.Write([]byte("Unauthorized"))
+				return err
+			}
+
+			valid, err := h.state.ValidateAPIKey(strings.TrimPrefix(authHeader, AuthPrefix))
 			if err != nil {
 				log.Error().
 					Caller().
 					Err(err).
-					Msg("Failed to write response")
+					Str("client_address", req.RemoteAddr).
+					Msg("failed to validate token")
+
+				writer.WriteHeader(http.StatusInternalServerError)
+				_, err := writer.Write([]byte("Unauthorized"))
+				return err
 			}

-			return
-		}
+			if !valid {
+				log.Info().
+					Str("client_address", req.RemoteAddr).
+					Msg("invalid token")

-		valid, err := h.state.ValidateAPIKey(strings.TrimPrefix(authHeader, AuthPrefix))
-		if err != nil {
+				writer.WriteHeader(http.StatusUnauthorized)
+				_, err := writer.Write([]byte("Unauthorized"))
+				return err
+			}
+
+			return nil
+		}(); err != nil {
 			log.Error().
 				Caller().
 				Err(err).
-				Str("client_address", req.RemoteAddr).
-				Msg("failed to validate token")
-
-			writer.WriteHeader(http.StatusInternalServerError)
-			_, err := writer.Write([]byte("Unauthorized"))
-			if err != nil {
-				log.Error().
-					Caller().
-					Err(err).
-					Msg("Failed to write response")
-			}
-
-			return
-		}
-
-		if !valid {
-			log.Info().
-				Str("client_address", req.RemoteAddr).
-				Msg("invalid token")
-
-			writer.WriteHeader(http.StatusUnauthorized)
-			_, err := writer.Write([]byte("Unauthorized"))
-			if err != nil {
-				log.Error().
-					Caller().
-					Err(err).
-					Msg("Failed to write response")
-			}
-
+				Msg("Failed to write HTTP response")
 			return
 		}

@@ -448,6 +452,7 @@ func (h *Headscale) createRouter(grpcMux *grpcRuntime.ServeMux) *mux.Router {
 	router.HandleFunc(ts2021UpgradePath, h.NoiseUpgradeHandler).
 		Methods(http.MethodPost, http.MethodGet)

+	router.HandleFunc("/robots.txt", h.RobotsHandler).Methods(http.MethodGet)
 	router.HandleFunc("/health", h.HealthHandler).Methods(http.MethodGet)
 	router.HandleFunc("/key", h.KeyHandler).Methods(http.MethodGet)
 	router.HandleFunc("/register/{registration_id}", h.authProvider.RegisterHandler).
@@ -484,65 +489,14 @@ func (h *Headscale) createRouter(grpcMux *grpcRuntime.ServeMux) *mux.Router {
 	return router
 }

-// // TODO(kradalby): Do a variant of this, and polman which only updates the node that has changed.
-// // Maybe we should attempt a new in memory state and not go via the DB?
-// // Maybe this should be implemented as an event bus?
-// // A bool is returned indicating if a full update was sent to all nodes
-// func usersChangedHook(db *db.HSDatabase, polMan policy.PolicyManager, notif *notifier.Notifier) error {
-// 	users, err := db.ListUsers()
-// 	if err != nil {
-// 		return err
-// 	}
-
-// 	changed, err := polMan.SetUsers(users)
-// 	if err != nil {
-// 		return err
-// 	}
-
-// 	if changed {
-// 		ctx := types.NotifyCtx(context.Background(), "acl-users-change", "all")
-// 		notif.NotifyAll(ctx, types.UpdateFull())
-// 	}
-
-// 	return nil
-// }
-
-// // TODO(kradalby): Do a variant of this, and polman which only updates the node that has changed.
-// // Maybe we should attempt a new in memory state and not go via the DB?
-// // Maybe this should be implemented as an event bus?
-// // A bool is returned indicating if a full update was sent to all nodes
-// func nodesChangedHook(
-// 	db *db.HSDatabase,
-// 	polMan policy.PolicyManager,
-// 	notif *notifier.Notifier,
-// ) (bool, error) {
-// 	nodes, err := db.ListNodes()
-// 	if err != nil {
-// 		return false, err
-// 	}
-
-// 	filterChanged, err := polMan.SetNodes(nodes)
-// 	if err != nil {
-// 		return false, err
-// 	}
-
-// 	if filterChanged {
-// 		ctx := types.NotifyCtx(context.Background(), "acl-nodes-change", "all")
-// 		notif.NotifyAll(ctx, types.UpdateFull())
-
-// 		return true, nil
-// 	}
-
-// 	return false, nil
-// }
-
 // Serve launches the HTTP and gRPC server service Headscale and the API.
 func (h *Headscale) Serve() error {
+	var err error
 	capver.CanOldCodeBeCleanedUp()

 	if profilingEnabled {
 		if profilingPath != "" {
-			err := os.MkdirAll(profilingPath, os.ModePerm)
+			err = os.MkdirAll(profilingPath, os.ModePerm)
 			if err != nil {
 				log.Fatal().Err(err).Msg("failed to create profiling directory")
 			}
@@ -557,48 +511,49 @@ func (h *Headscale) Serve() error {
 		spew.Dump(h.cfg)
 	}

-	log.Info().Str("version", types.Version).Str("commit", types.GitCommitHash).Msg("Starting Headscale")
+	versionInfo := types.GetVersionInfo()
+	log.Info().Str("version", versionInfo.Version).Str("commit", versionInfo.Commit).Msg("Starting Headscale")
 	log.Info().
 		Str("minimum_version", capver.TailscaleVersion(capver.MinSupportedCapabilityVersion)).
 		Msg("Clients with a lower minimum version will be rejected")

-	// Fetch an initial DERP Map before we start serving
-	h.mapper = mapper.NewMapper(h.state, h.cfg, h.nodeNotifier)
+	h.mapBatcher = mapper.NewBatcherAndMapper(h.cfg, h.state)
+	h.mapBatcher.Start()
+	defer h.mapBatcher.Close()

-	// TODO(kradalby): fix state part.
 	if h.cfg.DERP.ServerEnabled {
 		// When embedded DERP is enabled we always need a STUN server
 		if h.cfg.DERP.STUNAddr == "" {
 			return errSTUNAddressNotSet
 		}

-		region, err := h.DERPServer.GenerateRegion()
-		if err != nil {
-			return fmt.Errorf("generating DERP region for embedded server: %w", err)
-		}
-
-		if h.cfg.DERP.AutomaticallyAddEmbeddedDerpRegion {
-			h.state.DERPMap().Regions[region.RegionID] = &region
-		}
-
 		go h.DERPServer.ServeSTUN()
 	}

-	if len(h.state.DERPMap().Regions) == 0 {
+	derpMap, err := derp.GetDERPMap(h.cfg.DERP)
+	if err != nil {
+		return fmt.Errorf("failed to get DERPMap: %w", err)
+	}
+
+	if h.cfg.DERP.ServerEnabled && h.cfg.DERP.AutomaticallyAddEmbeddedDerpRegion {
+		region, _ := h.DERPServer.GenerateRegion()
+		derpMap.Regions[region.RegionID] = &region
+	}
+
+	if len(derpMap.Regions) == 0 {
 		return errEmptyInitialDERPMap
 	}

+	h.state.SetDERPMap(derpMap)
+
 	// Start ephemeral node garbage collector and schedule all nodes
 	// that are already in the database and ephemeral. If they are still
 	// around between restarts, they will reconnect and the GC will
 	// be cancelled.
 	go h.ephemeralGC.Start()
-	ephmNodes, err := h.state.ListEphemeralNodes()
-	if err != nil {
-		return fmt.Errorf("failed to list ephemeral nodes: %w", err)
-	}
-	for _, node := range ephmNodes {
-		h.ephemeralGC.Schedule(node.ID, h.cfg.EphemeralNodeInactivityTimeout)
+	ephmNodes := h.state.ListEphemeralNodes()
+	for _, node := range ephmNodes.All() {
+		h.ephemeralGC.Schedule(node.ID(), h.cfg.EphemeralNodeInactivityTimeout)
 	}

 	if h.cfg.DNSConfig.ExtraRecordsPath != "" {
@@ -828,19 +783,14 @@ func (h *Headscale) Serve() error {
 					continue
 				}

-				changed, err := h.state.ReloadPolicy()
+				changes, err := h.state.ReloadPolicy()
 				if err != nil {
 					log.Error().Err(err).Msgf("reloading policy")
 					continue
 				}

-				if changed {
-					log.Info().
-						Msg("ACL policy successfully reloaded, notifying nodes of change")
+				h.Change(changes...)

-					ctx := types.NotifyCtx(context.Background(), "acl-sighup", "na")
-					h.nodeNotifier.NotifyAll(ctx, types.UpdateFull())
-				}
 			default:
 				info := func(msg string) { log.Info().Msg(msg) }
 				log.Info().
@@ -864,11 +814,11 @@ func (h *Headscale) Serve() error {
 					log.Error().Err(err).Msg("failed to shutdown http")
 				}

-				info("closing node notifier")
-				h.nodeNotifier.Close()
+				info("closing batcher")
+				h.mapBatcher.Close()

 				info("waiting for netmap stream to close")
-				h.pollNetMapStreamWG.Wait()
+				h.clientStreamsOpen.Wait()

 				info("shutting down grpc server (socket)")
 				grpcSocket.GracefulStop()
@@ -894,11 +844,11 @@ func (h *Headscale) Serve() error {
 				info("closing socket listener")
 				socketListener.Close()

-				// Close db connections
-				info("closing database connection")
+				// Close state connections
+				info("closing state and database")
 				err = h.state.Close()
 				if err != nil {
-					log.Error().Err(err).Msg("failed to close db")
+					log.Error().Err(err).Msg("failed to close state")
 				}

 				log.Info().
@@ -1047,3 +997,10 @@ func readOrCreatePrivateKey(path string) (*key.MachinePrivate, error) {

 	return &machineKey, nil
 }
+
+// Change is used to send changes to nodes.
+// All changes should be enqueued here; empty change sets are
+// automatically ignored.
+func (h *Headscale) Change(cs ...change.ChangeSet) {
+	h.mapBatcher.AddWork(cs...)
+}
--- a/hscontrol/auth.go
+++ b/hscontrol/auth.go
@@ -10,6 +10,8 @@ import (
 	"time"

 	"github.com/juanfont/headscale/hscontrol/types"
+	"github.com/juanfont/headscale/hscontrol/types/change"
+	"github.com/rs/zerolog/log"
 	"gorm.io/gorm"
 	"tailscale.com/tailcfg"
 	"tailscale.com/types/key"
@@ -26,13 +28,10 @@ func (h *Headscale) handleRegister(
 	regReq tailcfg.RegisterRequest,
 	machineKey key.MachinePublic,
 ) (*tailcfg.RegisterResponse, error) {
-	node, err := h.state.GetNodeByNodeKey(regReq.NodeKey)
-	if err != nil && !errors.Is(err, gorm.ErrRecordNotFound) {
-		return nil, fmt.Errorf("looking up node in database: %w", err)
-	}
+	node, ok := h.state.GetNodeByNodeKey(regReq.NodeKey)

-	if node != nil {
-		resp, err := h.handleExistingNode(node, regReq, machineKey)
+	if ok {
+		resp, err := h.handleExistingNode(node.AsStruct(), regReq, machineKey)
 		if err != nil {
 			return nil, fmt.Errorf("handling existing node: %w", err)
 		}
@@ -47,6 +46,12 @@ func (h *Headscale) handleRegister(
 	if regReq.Auth != nil && regReq.Auth.AuthKey != "" {
 		resp, err := h.handleRegisterWithAuthKey(regReq, machineKey)
 		if err != nil {
+			// Preserve HTTPError types so they can be handled properly by the HTTP layer
+			var httpErr HTTPError
+			if errors.As(err, &httpErr) {
+				return nil, httpErr
+			}
+
 			return nil, fmt.Errorf("handling register with auth key: %w", err)
 		}

@@ -71,6 +76,17 @@ func (h *Headscale) handleExistingNode(
 	}

 	expired := node.IsExpired()
+
+	// If the node is expired and this is not a re-authentication attempt,
+	// force the client to re-authenticate
+	if expired && regReq.Auth == nil {
+		return &tailcfg.RegisterResponse{
+			NodeKeyExpired:    true,
+			MachineAuthorized: false,
+			AuthURL:           "", // Client will need to re-authenticate
+		}, nil
+	}
+
 	if !expired && !regReq.Expiry.IsZero() {
 		requestExpiry := regReq.Expiry

@@ -82,39 +98,27 @@ func (h *Headscale) handleExistingNode(
 		// If the request expiry is in the past, we consider it a logout.
 		if requestExpiry.Before(time.Now()) {
 			if node.IsEphemeral() {
-				policyChanged, err := h.state.DeleteNode(node)
+				c, err := h.state.DeleteNode(node.View())
 				if err != nil {
 					return nil, fmt.Errorf("deleting ephemeral node: %w", err)
 				}

-				// Send policy update notifications if needed
-				if policyChanged {
-					ctx := types.NotifyCtx(context.Background(), "auth-logout-ephemeral-policy", "na")
-					h.nodeNotifier.NotifyAll(ctx, types.UpdateFull())
-				} else {
-					ctx := types.NotifyCtx(context.Background(), "logout-ephemeral", "na")
-					h.nodeNotifier.NotifyAll(ctx, types.UpdatePeerRemoved(node.ID))
-				}
+				h.Change(c)

 				return nil, nil
 			}
 		}

-		n, policyChanged, err := h.state.SetNodeExpiry(node.ID, requestExpiry)
+		updatedNode, c, err := h.state.SetNodeExpiry(node.ID, requestExpiry)
 		if err != nil {
 			return nil, fmt.Errorf("setting node expiry: %w", err)
 		}

-		// Send policy update notifications if needed
-		if policyChanged {
-			ctx := types.NotifyCtx(context.Background(), "auth-expiry-policy", "na")
-			h.nodeNotifier.NotifyAll(ctx, types.UpdateFull())
-		} else {
-			ctx := types.NotifyCtx(context.Background(), "logout-expiry", "na")
-			h.nodeNotifier.NotifyWithIgnore(ctx, types.UpdateExpire(node.ID, requestExpiry), node.ID)
-		}
+		h.Change(c)

-		return nodeToRegisterResponse(n), nil
+		// CRITICAL: Use the updated node view for the response
+		// The original node object has stale expiry information
+		node = updatedNode.AsStruct()
 	}

 	return nodeToRegisterResponse(node), nil
@@ -184,6 +188,12 @@ func (h *Headscale) handleRegisterWithAuthKey(
 		return nil, err
 	}

+	// If node is not valid, it means an ephemeral node was deleted during logout
+	if !node.Valid() {
+		h.Change(changed)
+		return nil, nil
+	}
+
 	// This is a bit of a back and forth, but we have a bit of a chicken and egg
 	// dependency here.
 	// Because the way the policy manager works, we need to have the node
@@ -195,30 +205,33 @@ func (h *Headscale) handleRegisterWithAuthKey(
 	// ensure we send an update.
 	// This works, but might be another good candidate for doing some sort of
 	// eventbus.
-	routesChanged := h.state.AutoApproveRoutes(node)
+	// TODO(kradalby): Maybe this needs to be run as part of the batcher,
+	// now that we don't update the node/pol here anymore?
+	routeChange := h.state.AutoApproveRoutes(node)
+
 	if _, _, err := h.state.SaveNode(node); err != nil {
 		return nil, fmt.Errorf("saving auto approved routes to node: %w", err)
 	}

-	if routesChanged {
-		ctx := types.NotifyCtx(context.Background(), "node updated", node.Hostname)
-		h.nodeNotifier.NotifyAll(ctx, types.UpdatePeerChanged(node.ID))
-	} else if changed {
-		ctx := types.NotifyCtx(context.Background(), "node created", node.Hostname)
-		h.nodeNotifier.NotifyAll(ctx, types.UpdateFull())
-	} else {
-		// Existing node re-registering without route changes
-		// Still need to notify peers about the node being active again
-		// Use UpdateFull to ensure all peers get complete peer maps
-		ctx := types.NotifyCtx(context.Background(), "node re-registered", node.Hostname)
-		h.nodeNotifier.NotifyAll(ctx, types.UpdateFull())
+	if routeChange && changed.Empty() {
+		changed = change.NodeAdded(node.ID())
 	}
+	h.Change(changed)
+
+	// TODO(kradalby): I think this is covered above, but we need to validate that.
+	// // If policy changed due to node registration, send a separate policy change
+	// if policyChanged {
+	// 	policyChange := change.PolicyChange()
+	// 	h.Change(policyChange)
+	// }
+
+	user := node.User()

 	return &tailcfg.RegisterResponse{
 		MachineAuthorized: true,
 		NodeKeyExpired:    node.IsExpired(),
-		User:              *node.User.TailscaleUser(),
-		Login:             *node.User.TailscaleLogin(),
+		User:              *user.TailscaleUser(),
+		Login:             *user.TailscaleLogin(),
 	}, nil
 }

@@ -251,6 +264,8 @@ func (h *Headscale) handleRegisterInteractive(
 		nodeToRegister,
 	)

+	log.Info().Msgf("Starting node registration using key: %s", registrationId)
+
 	return &tailcfg.RegisterResponse{
 		AuthURL: h.authProvider.AuthURL(registrationId),
 	}, nil
--- a/hscontrol/capver/capver.go
+++ b/hscontrol/capver/capver.go
@@ -1,5 +1,7 @@
 package capver

+//go:generate go run ../../tools/capver/main.go
+
 import (
 	"slices"
 	"sort"
@@ -10,7 +12,7 @@ import (
 	"tailscale.com/util/set"
 )

-const MinSupportedCapabilityVersion tailcfg.CapabilityVersion = 88
+const MinSupportedCapabilityVersion tailcfg.CapabilityVersion = 90

 // CanOldCodeBeCleanedUp is intended to be called on startup to see if
 // there are old code that can ble cleaned up, entries should contain
--- a/hscontrol/capver/capver_generated.go
+++ b/hscontrol/capver/capver_generated.go
@@ -1,14 +1,10 @@
 package capver

-// Generated DO NOT EDIT
+//Generated DO NOT EDIT

 import "tailscale.com/tailcfg"

 var tailscaleToCapVer = map[string]tailcfg.CapabilityVersion{
-	"v1.60.0": 87,
-	"v1.60.1": 87,
-	"v1.62.0": 88,
-	"v1.62.1": 88,
 	"v1.64.0": 90,
 	"v1.64.1": 90,
 	"v1.64.2": 90,
@@ -36,18 +32,21 @@ var tailscaleToCapVer = map[string]tailcfg.CapabilityVersion{
 	"v1.80.3": 113,
 	"v1.82.0": 115,
 	"v1.82.5": 115,
+	"v1.84.0": 116,
+	"v1.84.1": 116,
+	"v1.84.2": 116,
 }

+
 var capVerToTailscaleVer = map[tailcfg.CapabilityVersion]string{
-	87:  "v1.60.0",
-	88:  "v1.62.0",
-	90:  "v1.64.0",
-	95:  "v1.66.0",
-	97:  "v1.68.0",
-	102: "v1.70.0",
-	104: "v1.72.0",
-	106: "v1.74.0",
-	109: "v1.78.0",
-	113: "v1.80.0",
-	115: "v1.82.0",
+	90:		"v1.64.0",
+	95:		"v1.66.0",
+	97:		"v1.68.0",
+	102:		"v1.70.0",
+	104:		"v1.72.0",
+	106:		"v1.74.0",
+	109:		"v1.78.0",
+	113:		"v1.80.0",
+	115:		"v1.82.0",
+	116:		"v1.84.0",
 }
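The regenerated tables above keep, for each capability version, the lowest Tailscale release that reports it (for example 116 maps to v1.84.0 even though v1.84.1 and v1.84.2 share it). The generator itself (`tools/capver/main.go`) is not part of this hunk, so the following is only a minimal sketch of how such a reverse map could be derived. `reverseMinVersion`, `parseVersion` and `less` are hypothetical helpers, and plain `int` stands in for `tailcfg.CapabilityVersion` to keep the example self-contained.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// parseVersion splits "v1.64.2" into [1, 64, 2].
// Missing or malformed components are treated as 0 (an assumption of this sketch).
func parseVersion(v string) [3]int {
	var out [3]int
	parts := strings.SplitN(strings.TrimPrefix(v, "v"), ".", 3)
	for i, p := range parts {
		n, _ := strconv.Atoi(p)
		out[i] = n
	}
	return out
}

// less reports whether release a sorts before release b, comparing
// components numerically so that "v1.9.0" comes before "v1.10.0".
func less(a, b string) bool {
	pa, pb := parseVersion(a), parseVersion(b)
	for i := range pa {
		if pa[i] != pb[i] {
			return pa[i] < pb[i]
		}
	}
	return false
}

// reverseMinVersion (hypothetical helper) maps each capability version to
// the lowest release that reports it.
func reverseMinVersion(forward map[string]int) map[int]string {
	rev := make(map[int]string, len(forward))
	for ver, capVer := range forward {
		if cur, ok := rev[capVer]; !ok || less(ver, cur) {
			rev[capVer] = ver
		}
	}
	return rev
}

func main() {
	forward := map[string]int{
		"v1.82.0": 115, "v1.82.5": 115,
		"v1.84.0": 116, "v1.84.1": 116, "v1.84.2": 116,
	}
	rev := reverseMinVersion(forward)

	// Print in ascending capability-version order for stable output.
	keys := make([]int, 0, len(rev))
	for k := range rev {
		keys = append(keys, k)
	}
	sort.Ints(keys)
	for _, k := range keys {
		fmt.Printf("%d: %s\n", k, rev[k])
	}
}
```

Comparing the components numerically rather than as strings matters once minor versions cross a digit boundary.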
--- a/hscontrol/capver/capver_test.go
+++ b/hscontrol/capver/capver_test.go
@@ -13,11 +13,10 @@ func TestTailscaleLatestMajorMinor(t *testing.T) {
 		stripV   bool
 		expected []string
 	}{
-		{3, false, []string{"v1.78", "v1.80", "v1.82"}},
-		{2, true, []string{"1.80", "1.82"}},
+		{3, false, []string{"v1.80", "v1.82", "v1.84"}},
+		{2, true, []string{"1.82", "1.84"}},
 		// Lazy way to see all supported versions
 		{10, true, []string{
-			"1.64",
 			"1.66",
 			"1.68",
 			"1.70",
@@ -27,6 +26,7 @@ func TestTailscaleLatestMajorMinor(t *testing.T) {
 			"1.78",
 			"1.80",
 			"1.82",
+			"1.84",
 		}},
 		{0, false, nil},
 	}
@@ -46,7 +46,6 @@ func TestCapVerMinimumTailscaleVersion(t *testing.T) {
 		input    tailcfg.CapabilityVersion
 		expected string
 	}{
-		{88, "v1.62.0"},
 		{90, "v1.64.0"},
 		{95, "v1.66.0"},
 		{106, "v1.74.0"},
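The updated expectations in `TestTailscaleLatestMajorMinor` describe a "latest N major.minor releases" query with an optional stripped `v` prefix. The function under test is not shown in this diff, so the sketch below is an assumption about one way to get the same results from a list of release strings. `latestMajorMinor` and `parseMajorMinor` are hypothetical names, not the package's API.

```go
package main

import (
	"fmt"
	"slices"
	"strconv"
	"strings"
)

type majorMinor struct{ major, minor int }

// parseMajorMinor extracts the major and minor numbers from "v1.84.2".
// Components that fail to parse are treated as 0 (an assumption of this sketch).
func parseMajorMinor(v string) majorMinor {
	parts := strings.Split(strings.TrimPrefix(v, "v"), ".")
	var mm majorMinor
	if len(parts) > 0 {
		mm.major, _ = strconv.Atoi(parts[0])
	}
	if len(parts) > 1 {
		mm.minor, _ = strconv.Atoi(parts[1])
	}
	return mm
}

// latestMajorMinor (hypothetical helper) returns the n newest distinct
// major.minor versions in ascending order, or nil when n <= 0.
func latestMajorMinor(releases []string, n int, stripV bool) []string {
	if n <= 0 {
		return nil
	}

	// Deduplicate patch releases down to major.minor.
	seen := make(map[majorMinor]bool)
	var all []majorMinor
	for _, r := range releases {
		mm := parseMajorMinor(r)
		if !seen[mm] {
			seen[mm] = true
			all = append(all, mm)
		}
	}

	slices.SortFunc(all, func(a, b majorMinor) int {
		if a.major != b.major {
			return a.major - b.major
		}
		return a.minor - b.minor
	})

	// Keep only the n newest entries.
	if len(all) > n {
		all = all[len(all)-n:]
	}

	out := make([]string, 0, len(all))
	for _, mm := range all {
		s := fmt.Sprintf("%d.%d", mm.major, mm.minor)
		if !stripV {
			s = "v" + s
		}
		out = append(out, s)
	}
	return out
}

func main() {
	releases := []string{
		"v1.80.0", "v1.80.3", "v1.82.0", "v1.82.5",
		"v1.84.0", "v1.84.1", "v1.84.2",
	}
	fmt.Println(latestMajorMinor(releases, 2, true))
	fmt.Println(latestMajorMinor(releases, 3, false))
}
```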
--- a/hscontrol/db/db.go
+++ b/hscontrol/db/db.go
@@ -260,7 +260,7 @@ func NewHeadscaleDatabase(
 									log.Error().Err(err).Msg("Error creating route")
 								} else {
 									log.Info().
-										Uint64("node_id", route.NodeID).
+										Uint64("node.id", route.NodeID).
 										Str("prefix", prefix.String()).
 										Msg("Route migrated")
 								}
@@ -496,7 +496,7 @@ func NewHeadscaleDatabase(
 				ID: "202407191627",
 				Migrate: func(tx *gorm.DB) error {
 					// Fix an issue where the automigration in GORM expected a constraint to
-					// exists that didnt, and add the one it wanted.
+					// exists that didn't, and add the one it wanted.
 					// Fixes https://github.com/juanfont/headscale/issues/2351
 					if cfg.Type == types.DatabasePostgres {
 						err := tx.Exec(`
@@ -870,23 +870,23 @@ AND auth_key_id NOT IN (
 					// Copy data directly using SQL
 					dataCopySQL := []string{
 						`INSERT INTO users (id, name, display_name, email, provider_identifier, provider, profile_pic_url, created_at, updated_at, deleted_at)
-             SELECT id, name, display_name, email, provider_identifier, provider, profile_pic_url, created_at, updated_at, deleted_at 
+             SELECT id, name, display_name, email, provider_identifier, provider, profile_pic_url, created_at, updated_at, deleted_at
             FROM users_old`,

 						`INSERT INTO pre_auth_keys (id, key, user_id, reusable, ephemeral, used, tags, expiration, created_at)
-             SELECT id, key, user_id, reusable, ephemeral, used, tags, expiration, created_at 
+             SELECT id, key, user_id, reusable, ephemeral, used, tags, expiration, created_at
             FROM pre_auth_keys_old`,

 						`INSERT INTO api_keys (id, prefix, hash, expiration, last_seen, created_at)
-             SELECT id, prefix, hash, expiration, last_seen, created_at 
+             SELECT id, prefix, hash, expiration, last_seen, created_at
             FROM api_keys_old`,

 						`INSERT INTO nodes (id, machine_key, node_key, disco_key, endpoints, host_info, ipv4, ipv6, hostname, given_name, user_id, register_method, forced_tags, auth_key_id, last_seen, expiry, approved_routes, created_at, updated_at, deleted_at)
-             SELECT id, machine_key, node_key, disco_key, endpoints, host_info, ipv4, ipv6, hostname, given_name, user_id, register_method, forced_tags, auth_key_id, last_seen, expiry, approved_routes, created_at, updated_at, deleted_at 
+             SELECT id, machine_key, node_key, disco_key, endpoints, host_info, ipv4, ipv6, hostname, given_name, user_id, register_method, forced_tags, auth_key_id, last_seen, expiry, approved_routes, created_at, updated_at, deleted_at
             FROM nodes_old`,

 						`INSERT INTO policies (id, data, created_at, updated_at, deleted_at)
-             SELECT id, data, created_at, updated_at, deleted_at 
+             SELECT id, data, created_at, updated_at, deleted_at
             FROM policies_old`,
 					}

@@ -934,7 +934,7 @@ AND auth_key_id NOT IN (
 			},
 			// From this point, the following rules must be followed:
 			// - NEVER use gorm.AutoMigrate, write the exact migration steps needed
-			// - AutoMigrate depends on the struct staying exactly the same, which it wont over time.
+			// - AutoMigrate depends on the struct staying exactly the same, which it won't over time.
 			// - Never write migrations that requires foreign keys to be disabled.
 		},
 	)
@@ -1131,7 +1131,7 @@ func runMigrations(cfg types.DatabaseConfig, dbConn *gorm.DB, migrations *gormig
 		}

 		for _, migrationID := range migrationIDs {
-			log.Trace().Str("migration_id", migrationID).Msg("Running migration")
+			log.Trace().Caller().Str("migration_id", migrationID).Msg("Running migration")
 			needsFKDisabled := migrationsRequiringFKDisabled[migrationID]

 			if needsFKDisabled {
--- a/hscontrol/db/db_test.go
+++ b/hscontrol/db/db_test.go
@@ -7,7 +7,6 @@ import (
 	"os/exec"
 	"path/filepath"
 	"slices"
-	"sort"
 	"strings"
 	"testing"
 	"time"
@@ -362,8 +361,8 @@ func TestSQLiteMigrationAndDataValidation(t *testing.T) {
 				}

 				if diff := cmp.Diff(expectedKeys, keys, cmp.Comparer(func(a, b []string) bool {
-					sort.Sort(sort.StringSlice(a))
-					sort.Sort(sort.StringSlice(b))
+					slices.Sort(a)
+					slices.Sort(b)
 					return slices.Equal(a, b)
 				}), cmpopts.IgnoreFields(types.PreAuthKey{}, "User", "CreatedAt", "Reusable", "Ephemeral", "Used", "Expiration")); diff != "" {
 					t.Errorf("TestSQLiteMigrationAndDataValidation() pre-auth key tags migration mismatch (-want +got):\n%s", diff)
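The comparer in the hunk above sorts both slices in place before comparing them, so the test's expected and actual values are reordered as a side effect. If that ever matters, an order-insensitive comparison can work on copies instead. This is a minimal sketch using only the standard library (Go 1.21+); `equalIgnoringOrder` is a hypothetical helper, not something this change adds.

```go
package main

import (
	"fmt"
	"slices"
)

// equalIgnoringOrder (hypothetical helper) reports whether a and b contain
// the same strings regardless of order. It sorts clones, so the caller's
// slices are left untouched.
func equalIgnoringOrder(a, b []string) bool {
	ac, bc := slices.Clone(a), slices.Clone(b)
	slices.Sort(ac)
	slices.Sort(bc)
	return slices.Equal(ac, bc)
}

func main() {
	want := []string{"tag:b", "tag:a"}
	got := []string{"tag:a", "tag:b"}

	fmt.Println(equalIgnoringOrder(want, got)) // true
	fmt.Println(want)                          // [tag:b tag:a], original order preserved
}
```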
--- a/hscontrol/db/ip.go
+++ b/hscontrol/db/ip.go
@@ -275,7 +275,7 @@ func (db *HSDatabase) BackfillNodeIPs(i *IPAllocator) ([]string, error) {
 			return errors.New("backfilling IPs: ip allocator was nil")
 		}

-		log.Trace().Msgf("starting to backfill IPs")
+		log.Trace().Caller().Msgf("starting to backfill IPs")

 		nodes, err := ListNodes(tx)
 		if err != nil {
@@ -283,7 +283,7 @@ func (db *HSDatabase) BackfillNodeIPs(i *IPAllocator) ([]string, error) {
 		}

 		for _, node := range nodes {
-			log.Trace().Uint64("node.id", node.ID.Uint64()).Msg("checking if need backfill")
+			log.Trace().Caller().Uint64("node.id", node.ID.Uint64()).Str("node.name", node.Hostname).Msg("IP backfill check started because node found in database")

 			changed := false
 			// IPv4 prefix is set, but node ip is missing, alloc
--- a/hscontrol/db/node.go
+++ b/hscontrol/db/node.go
@@ -7,15 +7,19 @@ import (
 	"net/netip"
 	"slices"
 	"sort"
+	"strconv"
 	"sync"
+	"testing"
 	"time"

 	"github.com/juanfont/headscale/hscontrol/types"
+	"github.com/juanfont/headscale/hscontrol/types/change"
 	"github.com/juanfont/headscale/hscontrol/util"
 	"github.com/rs/zerolog/log"
 	"gorm.io/gorm"
 	"tailscale.com/tailcfg"
 	"tailscale.com/types/key"
+	"tailscale.com/types/ptr"
 )

 const (
@@ -30,18 +34,13 @@ var (
 		"node not found in registration cache",
 	)
 	ErrCouldNotConvertNodeInterface = errors.New("failed to convert node interface")
-	ErrDifferentRegisteredUser      = errors.New(
-		"node was previously registered with a different user",
-	)
 )

 // ListPeers returns peers of node, regardless of any Policy or if the node is expired.
 // If no peer IDs are given, all peers are returned.
 // If at least one peer ID is given, only these peer nodes will be returned.
 func (hsdb *HSDatabase) ListPeers(nodeID types.NodeID, peerIDs ...types.NodeID) (types.Nodes, error) {
-	return Read(hsdb.DB, func(rx *gorm.DB) (types.Nodes, error) {
-		return ListPeers(rx, nodeID, peerIDs...)
-	})
+	return ListPeers(hsdb.DB, nodeID, peerIDs...)
 }

 // ListPeers returns peers of node, regardless of any Policy or if the node is expired.
@@ -66,9 +65,7 @@ func ListPeers(tx *gorm.DB, nodeID types.NodeID, peerIDs ...types.NodeID) (types
 // ListNodes queries the database for either all nodes if no parameters are given
 // or for the given nodes if at least one node ID is given as parameter.
 func (hsdb *HSDatabase) ListNodes(nodeIDs ...types.NodeID) (types.Nodes, error) {
-	return Read(hsdb.DB, func(rx *gorm.DB) (types.Nodes, error) {
-		return ListNodes(rx, nodeIDs...)
-	})
+	return ListNodes(hsdb.DB, nodeIDs...)
 }

 // ListNodes queries the database for either all nodes if no parameters are given
@@ -120,9 +117,7 @@ func getNode(tx *gorm.DB, uid types.UserID, name string) (*types.Node, error) {
 }

 func (hsdb *HSDatabase) GetNodeByID(id types.NodeID) (*types.Node, error) {
-	return Read(hsdb.DB, func(rx *gorm.DB) (*types.Node, error) {
-		return GetNodeByID(rx, id)
-	})
+	return GetNodeByID(hsdb.DB, id)
 }

 // GetNodeByID finds a Node by ID and returns the Node struct.
@@ -140,9 +135,7 @@ func GetNodeByID(tx *gorm.DB, id types.NodeID) (*types.Node, error) {
 }

 func (hsdb *HSDatabase) GetNodeByMachineKey(machineKey key.MachinePublic) (*types.Node, error) {
-	return Read(hsdb.DB, func(rx *gorm.DB) (*types.Node, error) {
-		return GetNodeByMachineKey(rx, machineKey)
-	})
+	return GetNodeByMachineKey(hsdb.DB, machineKey)
 }

 // GetNodeByMachineKey finds a Node by its MachineKey and returns the Node struct.
@@ -163,9 +156,7 @@ func GetNodeByMachineKey(
 }

 func (hsdb *HSDatabase) GetNodeByNodeKey(nodeKey key.NodePublic) (*types.Node, error) {
-	return Read(hsdb.DB, func(rx *gorm.DB) (*types.Node, error) {
-		return GetNodeByNodeKey(rx, nodeKey)
-	})
+	return GetNodeByNodeKey(hsdb.DB, nodeKey)
 }

 // GetNodeByNodeKey finds a Node by its NodeKey and returns the Node struct.
@@ -266,24 +257,18 @@ func SetLastSeen(tx *gorm.DB, nodeID types.NodeID, lastSeen time.Time) error {
 }

 // RenameNode takes a Node struct and a new GivenName for the nodes
-// and renames it. If the name is not unique, it will return an error.
+// and renames it. Validation should be done in the state layer before calling this function.
 func RenameNode(tx *gorm.DB,
 	nodeID types.NodeID, newName string,
 ) error {
-	err := util.CheckForFQDNRules(
-		newName,
-	)
-	if err != nil {
-		return fmt.Errorf("renaming node: %w", err)
+	// Check if the new name is unique
+	var count int64
+	if err := tx.Model(&types.Node{}).Where("given_name = ? AND id != ?", newName, nodeID).Count(&count).Error; err != nil {
+		return fmt.Errorf("failed to check name uniqueness: %w", err)
 	}

-	uniq, err := isUniqueName(tx, newName)
-	if err != nil {
-		return fmt.Errorf("checking if name is unique: %w", err)
-	}
-
-	if !uniq {
-		return fmt.Errorf("name is not unique: %s", newName)
+	if count > 0 {
+		return errors.New("name is not unique")
 	}

 	if err := tx.Model(&types.Node{}).Where("id = ?", nodeID).Update("given_name", newName).Error; err != nil {
@@ -339,106 +324,19 @@ func (hsdb *HSDatabase) DeleteEphemeralNode(
 	})
 }

-// HandleNodeFromAuthPath is called from the OIDC or CLI auth path
-// with a registrationID to register or reauthenticate a node.
-// If the node found in the registration cache is not already registered,
-// it will be registered with the user and the node will be removed from the cache.
-// If the node is already registered, the expiry will be updated.
-// The node, and a boolean indicating if it was a new node or not, will be returned.
-func (hsdb *HSDatabase) HandleNodeFromAuthPath(
-	registrationID types.RegistrationID,
-	userID types.UserID,
-	nodeExpiry *time.Time,
-	registrationMethod string,
-	ipv4 *netip.Addr,
-	ipv6 *netip.Addr,
-) (*types.Node, bool, error) {
-	var newNode bool
-	node, err := Write(hsdb.DB, func(tx *gorm.DB) (*types.Node, error) {
-		if reg, ok := hsdb.regCache.Get(registrationID); ok {
-			if node, _ := GetNodeByNodeKey(tx, reg.Node.NodeKey); node == nil {
-				user, err := GetUserByID(tx, userID)
-				if err != nil {
-					return nil, fmt.Errorf(
-						"failed to find user in register node from auth callback, %w",
-						err,
-					)
-				}
+// RegisterNodeForTest is used only for testing purposes to register a node directly in the database.
+// Production code should use state.HandleNodeFromAuthPath or state.HandleNodeFromPreAuthKey.
+func RegisterNodeForTest(tx *gorm.DB, node types.Node, ipv4 *netip.Addr, ipv6 *netip.Addr) (*types.Node, error) {
+	if !testing.Testing() {
+		panic("RegisterNodeForTest can only be called during tests")
+	}

-				log.Debug().
-					Str("registration_id", registrationID.String()).
-					Str("username", user.Username()).
-					Str("registrationMethod", registrationMethod).
-					Str("expiresAt", fmt.Sprintf("%v", nodeExpiry)).
-					Msg("Registering node from API/CLI or auth callback")
-
-				// TODO(kradalby): This looks quite wrong? why ID 0?
-				// Why not always?
-				// Registration of expired node with different user
-				if reg.Node.ID != 0 &&
-					reg.Node.UserID != user.ID {
-					return nil, ErrDifferentRegisteredUser
-				}
-
-				reg.Node.UserID = user.ID
-				reg.Node.User = *user
-				reg.Node.RegisterMethod = registrationMethod
-
-				if nodeExpiry != nil {
-					reg.Node.Expiry = nodeExpiry
-				}
-
-				node, err := RegisterNode(
-					tx,
-					reg.Node,
-					ipv4, ipv6,
-				)
-
-				if err == nil {
-					hsdb.regCache.Delete(registrationID)
-				}
-
-				// Signal to waiting clients that the machine has been registered.
-				select {
-				case reg.Registered <- node:
-				default:
-				}
-				close(reg.Registered)
-
-				newNode = true
-
-				return node, err
-			} else {
-				// If the node is already registered, this is a refresh.
-				err := NodeSetExpiry(tx, node.ID, *nodeExpiry)
-				if err != nil {
-					return nil, err
-				}
-
-				return node, nil
-			}
-		}
-
-		return nil, ErrNodeNotFoundRegistrationCache
-	})
-
-	return node, newNode, err
-}
-
-func (hsdb *HSDatabase) RegisterNode(node types.Node, ipv4 *netip.Addr, ipv6 *netip.Addr) (*types.Node, error) {
-	return Write(hsdb.DB, func(tx *gorm.DB) (*types.Node, error) {
-		return RegisterNode(tx, node, ipv4, ipv6)
-	})
-}
-
-// RegisterNode is executed from the CLI to register a new Node using its MachineKey.
-func RegisterNode(tx *gorm.DB, node types.Node, ipv4 *netip.Addr, ipv6 *netip.Addr) (*types.Node, error) {
 	log.Debug().
 		Str("node", node.Hostname).
 		Str("machine_key", node.MachineKey.ShortString()).
 		Str("node_key", node.NodeKey.ShortString()).
 		Str("user", node.User.Username()).
-		Msg("Registering node")
+		Msg("Registering test node")

 	// If the a new node is registered with the same machine key, to the same user,
 	// update the existing node.
@@ -448,8 +346,14 @@ func RegisterNode(tx *gorm.DB, node types.Node, ipv4 *netip.Addr, ipv6 *netip.Ad
 	if oldNode != nil && oldNode.UserID == node.UserID {
 		node.ID = oldNode.ID
 		node.GivenName = oldNode.GivenName
-		ipv4 = oldNode.IPv4
-		ipv6 = oldNode.IPv6
+		node.ApprovedRoutes = oldNode.ApprovedRoutes
+		// Don't overwrite the provided IPs with old ones when they exist
+		if ipv4 == nil {
+			ipv4 = oldNode.IPv4
+		}
+		if ipv6 == nil {
+			ipv6 = oldNode.IPv6
+		}
 	}

 	// If the node exists and it already has IP(s), we just save it
@@ -466,7 +370,7 @@ func RegisterNode(tx *gorm.DB, node types.Node, ipv4 *netip.Addr, ipv6 *netip.Ad
 			Str("machine_key", node.MachineKey.ShortString()).
 			Str("node_key", node.NodeKey.ShortString()).
 			Str("user", node.User.Username()).
-			Msg("Node authorized again")
+			Msg("Test node authorized again")

 		return &node, nil
 	}
@@ -475,7 +379,7 @@ func RegisterNode(tx *gorm.DB, node types.Node, ipv4 *netip.Addr, ipv6 *netip.Ad
 	node.IPv6 = ipv6

 	if node.GivenName == "" {
-		givenName, err := ensureUniqueGivenName(tx, node.Hostname)
+		givenName, err := EnsureUniqueGivenName(tx, node.Hostname)
 		if err != nil {
 			return nil, fmt.Errorf("failed to ensure unique given name: %w", err)
 		}
@@ -490,7 +394,7 @@ func RegisterNode(tx *gorm.DB, node types.Node, ipv4 *netip.Addr, ipv6 *netip.Ad
 	log.Trace().
 		Caller().
 		Str("node", node.Hostname).
-		Msg("Node registered with the database")
+		Msg("Test node registered with the database")

 	return &node, nil
 }
@@ -563,7 +467,8 @@ func isUniqueName(tx *gorm.DB, name string) (bool, error) {
 	return len(nodes) == 0, nil
 }

-func ensureUniqueGivenName(
+// EnsureUniqueGivenName generates a unique given name for a node based on its hostname.
+func EnsureUniqueGivenName(
 	tx *gorm.DB,
 	name string,
 ) (string, error) {
@@ -594,17 +499,18 @@ func ensureUniqueGivenName(
 // containing the expired nodes, and a boolean indicating if any nodes were found.
 func ExpireExpiredNodes(tx *gorm.DB,
 	lastCheck time.Time,
-) (time.Time, types.StateUpdate, bool) {
+) (time.Time, []change.ChangeSet, bool) {
 	// use the time of the start of the function to ensure we
 	// dont miss some nodes by returning it _after_ we have
 	// checked everything.
 	started := time.Now()

 	expired := make([]*tailcfg.PeerChange, 0)
+	var updates []change.ChangeSet

 	nodes, err := ListNodes(tx)
 	if err != nil {
-		return time.Unix(0, 0), types.StateUpdate{}, false
+		return time.Unix(0, 0), nil, false
 	}
 	for _, node := range nodes {
 		if node.IsExpired() && node.Expiry.After(lastCheck) {
@@ -612,14 +518,15 @@ func ExpireExpiredNodes(tx *gorm.DB,
 				NodeID:    tailcfg.NodeID(node.ID),
 				KeyExpiry: node.Expiry,
 			})
+			updates = append(updates, change.KeyExpiry(node.ID))
 		}
 	}

 	if len(expired) > 0 {
-		return started, types.UpdatePeerPatch(expired...), true
+		return started, updates, true
 	}

-	return started, types.StateUpdate{}, false
+	return started, nil, false
 }

 // EphemeralGarbageCollector is a garbage collector that will delete nodes after
@@ -732,3 +639,138 @@ func (e *EphemeralGarbageCollector) Start() {
 		}
 	}
 }
+
+func (hsdb *HSDatabase) CreateNodeForTest(user *types.User, hostname ...string) *types.Node {
+	if !testing.Testing() {
+		panic("CreateNodeForTest can only be called during tests")
+	}
+
+	if user == nil {
+		panic("CreateNodeForTest requires a valid user")
+	}
+
+	nodeName := "testnode"
+	if len(hostname) > 0 && hostname[0] != "" {
+		nodeName = hostname[0]
+	}
+
+	// Create a preauth key for the node
+	pak, err := hsdb.CreatePreAuthKey(types.UserID(user.ID), false, false, nil, nil)
+	if err != nil {
+		panic(fmt.Sprintf("failed to create preauth key for test node: %v", err))
+	}
+
+	nodeKey := key.NewNode()
+	machineKey := key.NewMachine()
+	discoKey := key.NewDisco()
+
+	node := &types.Node{
+		MachineKey:     machineKey.Public(),
+		NodeKey:        nodeKey.Public(),
+		DiscoKey:       discoKey.Public(),
+		Hostname:       nodeName,
+		UserID:         user.ID,
+		RegisterMethod: util.RegisterMethodAuthKey,
+		AuthKeyID:      ptr.To(pak.ID),
+	}
+
+	err = hsdb.DB.Save(node).Error
+	if err != nil {
+		panic(fmt.Sprintf("failed to create test node: %v", err))
+	}
+
+	return node
+}
+
+func (hsdb *HSDatabase) CreateRegisteredNodeForTest(user *types.User, hostname ...string) *types.Node {
+	if !testing.Testing() {
+		panic("CreateRegisteredNodeForTest can only be called during tests")
+	}
+
+	node := hsdb.CreateNodeForTest(user, hostname...)
+
+	// Allocate IPs for the test node using the database's IP allocator
+	// This is a simplified allocation for testing - in production this would use State.ipAlloc
+	ipv4, ipv6, err := hsdb.allocateTestIPs(node.ID)
+	if err != nil {
+		panic(fmt.Sprintf("failed to allocate IPs for test node: %v", err))
+	}
+
+	var registeredNode *types.Node
+	err = hsdb.DB.Transaction(func(tx *gorm.DB) error {
+		var err error
+		registeredNode, err = RegisterNodeForTest(tx, *node, ipv4, ipv6)
+		return err
+	})
+	if err != nil {
+		panic(fmt.Sprintf("failed to register test node: %v", err))
+	}
+
+	return registeredNode
+}
+
+func (hsdb *HSDatabase) CreateNodesForTest(user *types.User, count int, hostnamePrefix ...string) []*types.Node {
+	if !testing.Testing() {
+		panic("CreateNodesForTest can only be called during tests")
+	}
+
+	if user == nil {
+		panic("CreateNodesForTest requires a valid user")
+	}
+
+	prefix := "testnode"
+	if len(hostnamePrefix) > 0 && hostnamePrefix[0] != "" {
+		prefix = hostnamePrefix[0]
+	}
+
+	nodes := make([]*types.Node, count)
+	for i := range count {
+		hostname := prefix + "-" + strconv.Itoa(i)
+		nodes[i] = hsdb.CreateNodeForTest(user, hostname)
+	}
+
+	return nodes
+}
+
+func (hsdb *HSDatabase) CreateRegisteredNodesForTest(user *types.User, count int, hostnamePrefix ...string) []*types.Node {
+	if !testing.Testing() {
+		panic("CreateRegisteredNodesForTest can only be called during tests")
+	}
+
+	if user == nil {
+		panic("CreateRegisteredNodesForTest requires a valid user")
+	}
+
+	prefix := "testnode"
+	if len(hostnamePrefix) > 0 && hostnamePrefix[0] != "" {
+		prefix = hostnamePrefix[0]
+	}
+
+	nodes := make([]*types.Node, count)
+	for i := range count {
+		hostname := prefix + "-" + strconv.Itoa(i)
+		nodes[i] = hsdb.CreateRegisteredNodeForTest(user, hostname)
+	}
+
+	return nodes
+}
+
+// allocateTestIPs allocates sequential test IPs for nodes during testing.
+func (hsdb *HSDatabase) allocateTestIPs(nodeID types.NodeID) (*netip.Addr, *netip.Addr, error) {
+	if !testing.Testing() {
+		panic("allocateTestIPs can only be called during tests")
+	}
+
+	// Use simple sequential allocation for tests
+	// IPv4: 100.64.0.x (where x is nodeID)
+	// IPv6: fd7a:115c:a1e0::x (where x is nodeID)
+
+	if nodeID > 254 {
+		return nil, nil, fmt.Errorf("test node ID %d too large for simple IP allocation", nodeID)
+	}
+
+	ipv4 := netip.AddrFrom4([4]byte{100, 64, 0, byte(nodeID)})
+	ipv6 := netip.AddrFrom16([16]byte{0xfd, 0x7a, 0x11, 0x5c, 0xa1, 0xe0, 0, 0, 0, 0, 0, 0, 0, 0, 0, byte(nodeID)})
+
+	return &ipv4, &ipv6, nil
+}
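`allocateTestIPs` above derives both addresses from the node ID by writing it into the last byte of a fixed prefix. The standalone sketch below reproduces only that derivation so the resulting addresses can be inspected. `testAddrs` is a hypothetical name, a plain `uint64` stands in for `types.NodeID`, and the `testing.Testing()` guard is left out so the example can run as an ordinary program.

```go
package main

import (
	"fmt"
	"net/netip"
)

// testAddrs (hypothetical name) mirrors the sequential scheme from the diff:
// IPv4 100.64.0.x and IPv6 fd7a:115c:a1e0::x, where x is the node ID.
func testAddrs(id uint64) (netip.Addr, netip.Addr, error) {
	// The ID has to fit in the final byte of each address.
	if id > 254 {
		return netip.Addr{}, netip.Addr{}, fmt.Errorf("test node ID %d too large for simple IP allocation", id)
	}

	v4 := netip.AddrFrom4([4]byte{100, 64, 0, byte(id)})
	v6 := netip.AddrFrom16([16]byte{0xfd, 0x7a, 0x11, 0x5c, 0xa1, 0xe0, 0, 0, 0, 0, 0, 0, 0, 0, 0, byte(id)})

	return v4, v6, nil
}

func main() {
	v4, v6, err := testAddrs(7)
	fmt.Println(v4, v6, err) // 100.64.0.7 fd7a:115c:a1e0::7 <nil>
}
```

Because the ID is the only varying part, two test nodes collide only if they share an ID, which keeps the helper deterministic without involving the real allocator.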
--- a/hscontrol/db/node_test.go
+++ b/hscontrol/db/node_test.go
@@ -6,7 +6,6 @@ import (
 	"math/big"
 	"net/netip"
 	"regexp"
-	"strconv"
 	"sync"
 	"testing"
 	"time"
@@ -26,82 +25,36 @@ import (
 )

 func (s *Suite) TestGetNode(c *check.C) {
-	user, err := db.CreateUser(types.User{Name: "test"})
-	c.Assert(err, check.IsNil)
+	user := db.CreateUserForTest("test")

-	pak, err := db.CreatePreAuthKey(types.UserID(user.ID), false, false, nil, nil)
-	c.Assert(err, check.IsNil)
-
-	_, err = db.getNode(types.UserID(user.ID), "testnode")
+	_, err := db.getNode(types.UserID(user.ID), "testnode")
 	c.Assert(err, check.NotNil)

-	nodeKey := key.NewNode()
-	machineKey := key.NewMachine()
-
-	node := &types.Node{
-		ID:             0,
-		MachineKey:     machineKey.Public(),
-		NodeKey:        nodeKey.Public(),
-		Hostname:       "testnode",
-		UserID:         user.ID,
-		RegisterMethod: util.RegisterMethodAuthKey,
-		AuthKeyID:      ptr.To(pak.ID),
-	}
-	trx := db.DB.Save(node)
-	c.Assert(trx.Error, check.IsNil)
+	node := db.CreateNodeForTest(user, "testnode")

 	_, err = db.getNode(types.UserID(user.ID), "testnode")
 	c.Assert(err, check.IsNil)
+	c.Assert(node.Hostname, check.Equals, "testnode")
 }

 func (s *Suite) TestGetNodeByID(c *check.C) {
-	user, err := db.CreateUser(types.User{Name: "test"})
-	c.Assert(err, check.IsNil)
+	user := db.CreateUserForTest("test")

-	pak, err := db.CreatePreAuthKey(types.UserID(user.ID), false, false, nil, nil)
-	c.Assert(err, check.IsNil)
-
-	_, err = db.GetNodeByID(0)
+	_, err := db.GetNodeByID(0)
 	c.Assert(err, check.NotNil)

-	nodeKey := key.NewNode()
-	machineKey := key.NewMachine()
+	node := db.CreateNodeForTest(user, "testnode")

-	node := types.Node{
-		ID:             0,
-		MachineKey:     machineKey.Public(),
-		NodeKey:        nodeKey.Public(),
-		Hostname:       "testnode",
-		UserID:         user.ID,
-		RegisterMethod: util.RegisterMethodAuthKey,
-		AuthKeyID:      ptr.To(pak.ID),
-	}
-	trx := db.DB.Save(&node)
-	c.Assert(trx.Error, check.IsNil)
-
-	_, err = db.GetNodeByID(0)
+	retrievedNode, err := db.GetNodeByID(node.ID)
 	c.Assert(err, check.IsNil)
+	c.Assert(retrievedNode.Hostname, check.Equals, "testnode")
 }

 func (s *Suite) TestHardDeleteNode(c *check.C) {
-	user, err := db.CreateUser(types.User{Name: "test"})
-	c.Assert(err, check.IsNil)
+	user := db.CreateUserForTest("test")
+	node := db.CreateNodeForTest(user, "testnode3")

-	nodeKey := key.NewNode()
-	machineKey := key.NewMachine()
-
-	node := types.Node{
-		ID:             0,
-		MachineKey:     machineKey.Public(),
-		NodeKey:        nodeKey.Public(),
-		Hostname:       "testnode3",
-		UserID:         user.ID,
-		RegisterMethod: util.RegisterMethodAuthKey,
-	}
-	trx := db.DB.Save(&node)
-	c.Assert(trx.Error, check.IsNil)
-
-	err = db.DeleteNode(&node)
+	err := db.DeleteNode(node)
 	c.Assert(err, check.IsNil)

 	_, err = db.getNode(types.UserID(user.ID), "testnode3")
@@ -109,42 +62,21 @@ func (s *Suite) TestHardDeleteNode(c *check.C) {
 }

 func (s *Suite) TestListPeers(c *check.C) {
-	user, err := db.CreateUser(types.User{Name: "test"})
-	c.Assert(err, check.IsNil)
+	user := db.CreateUserForTest("test")

-	pak, err := db.CreatePreAuthKey(types.UserID(user.ID), false, false, nil, nil)
-	c.Assert(err, check.IsNil)
-
-	_, err = db.GetNodeByID(0)
+	_, err := db.GetNodeByID(0)
 	c.Assert(err, check.NotNil)

-	for index := range 11 {
-		nodeKey := key.NewNode()
-		machineKey := key.NewMachine()
+	nodes := db.CreateNodesForTest(user, 11, "testnode")

-		node := types.Node{
-			ID:             types.NodeID(index),
-			MachineKey:     machineKey.Public(),
-			NodeKey:        nodeKey.Public(),
-			Hostname:       "testnode" + strconv.Itoa(index),
-			UserID:         user.ID,
-			RegisterMethod: util.RegisterMethodAuthKey,
-			AuthKeyID:      ptr.To(pak.ID),
-		}
-		trx := db.DB.Save(&node)
-		c.Assert(trx.Error, check.IsNil)
-	}
-
-	node0ByID, err := db.GetNodeByID(0)
+	firstNode := nodes[0]
+	peersOfFirstNode, err := db.ListPeers(firstNode.ID)
 	c.Assert(err, check.IsNil)

-	peersOfNode0, err := db.ListPeers(node0ByID.ID)
-	c.Assert(err, check.IsNil)
-
-	c.Assert(len(peersOfNode0), check.Equals, 9)
-	c.Assert(peersOfNode0[0].Hostname, check.Equals, "testnode2")
-	c.Assert(peersOfNode0[5].Hostname, check.Equals, "testnode7")
-	c.Assert(peersOfNode0[8].Hostname, check.Equals, "testnode10")
+	c.Assert(len(peersOfFirstNode), check.Equals, 10)
+	c.Assert(peersOfFirstNode[0].Hostname, check.Equals, "testnode-1")
+	c.Assert(peersOfFirstNode[5].Hostname, check.Equals, "testnode-6")
+	c.Assert(peersOfFirstNode[9].Hostname, check.Equals, "testnode-10")
 }

 func (s *Suite) TestExpireNode(c *check.C) {
@@ -360,12 +292,57 @@ func TestHeadscale_generateGivenName(t *testing.T) {

 func TestAutoApproveRoutes(t *testing.T) {
 	tests := []struct {
-		name   string
-		acl    string
-		routes []netip.Prefix
-		want   []netip.Prefix
-		want2  []netip.Prefix
+		name         string
+		acl          string
+		routes       []netip.Prefix
+		want         []netip.Prefix
+		want2        []netip.Prefix
+		expectChange bool // whether to expect route changes
 	}{
+		{
+			name: "no-auto-approvers-empty-policy",
+			acl: `
+{
+	"groups": {
+		"group:admins": ["test@"]
+	},
+	"acls": [
+		{
+			"action": "accept",
+			"src": ["group:admins"],
+			"dst": ["group:admins:*"]
+		}
+	]
+}`,
+			routes:       []netip.Prefix{netip.MustParsePrefix("10.33.0.0/16")},
+			want:         []netip.Prefix{}, // Should be empty - no auto-approvers
+			want2:        []netip.Prefix{}, // Should be empty - no auto-approvers
+			expectChange: false,            // No changes expected
+		},
+		{
+			name: "no-auto-approvers-explicit-empty",
+			acl: `
+{
+	"groups": {
+		"group:admins": ["test@"]
+	},
+	"acls": [
+		{
+			"action": "accept",
+			"src": ["group:admins"],
+			"dst": ["group:admins:*"]
+		}
+	],
+	"autoApprovers": {
+		"routes": {},
+		"exitNode": []
+	}
+}`,
+			routes:       []netip.Prefix{netip.MustParsePrefix("10.33.0.0/16")},
+			want:         []netip.Prefix{}, // Should be empty - explicitly empty auto-approvers
+			want2:        []netip.Prefix{}, // Should be empty - explicitly empty auto-approvers
+			expectChange: false,            // No changes expected
+		},
 		{
 			name: "2068-approve-issue-sub-kube",
 			acl: `
@@ -384,8 +361,9 @@ func TestAutoApproveRoutes(t *testing.T) {
 		}
 	}
 }`,
-			routes: []netip.Prefix{netip.MustParsePrefix("10.42.7.0/24")},
-			want:   []netip.Prefix{netip.MustParsePrefix("10.42.7.0/24")},
+			routes:       []netip.Prefix{netip.MustParsePrefix("10.42.7.0/24")},
+			want:         []netip.Prefix{netip.MustParsePrefix("10.42.7.0/24")},
+			expectChange: true, // Routes should be approved
 		},
 		{
 			name: "2068-approve-issue-sub-exit-tag",
@@ -429,6 +407,7 @@ func TestAutoApproveRoutes(t *testing.T) {
 				tsaddr.AllIPv4(),
 				tsaddr.AllIPv6(),
 			},
+			expectChange: true, // Routes should be approved
 		},
 	}

@@ -489,28 +468,40 @@ func TestAutoApproveRoutes(t *testing.T) {
 				require.NoError(t, err)
 				require.NotNil(t, pm)

-				changed1 := policy.AutoApproveRoutes(pm, &node)
-				assert.True(t, changed1)
+				newRoutes1, changed1 := policy.ApproveRoutesWithPolicy(pm, node.View(), node.ApprovedRoutes, tt.routes)
+				assert.Equal(t, tt.expectChange, changed1)

-				err = adb.DB.Save(&node).Error
-				require.NoError(t, err)
+				if changed1 {
+					err = SetApprovedRoutes(adb.DB, node.ID, newRoutes1)
+					require.NoError(t, err)
+				}

-				_ = policy.AutoApproveRoutes(pm, &nodeTagged)
-
-				err = adb.DB.Save(&nodeTagged).Error
-				require.NoError(t, err)
+				newRoutes2, changed2 := policy.ApproveRoutesWithPolicy(pm, nodeTagged.View(), node.ApprovedRoutes, tt.routes)
+				if changed2 {
+					err = SetApprovedRoutes(adb.DB, nodeTagged.ID, newRoutes2)
+					require.NoError(t, err)
+				}

 				node1ByID, err := adb.GetNodeByID(1)
 				require.NoError(t, err)

-				if diff := cmp.Diff(tt.want, node1ByID.SubnetRoutes(), util.Comparers...); diff != "" {
+				// For empty auto-approvers tests, handle nil vs empty slice comparison
+				expectedRoutes1 := tt.want
+				if len(expectedRoutes1) == 0 {
+					expectedRoutes1 = nil
+				}
+				if diff := cmp.Diff(expectedRoutes1, node1ByID.SubnetRoutes(), util.Comparers...); diff != "" {
 					t.Errorf("unexpected enabled routes (-want +got):\n%s", diff)
 				}

 				node2ByID, err := adb.GetNodeByID(2)
 				require.NoError(t, err)

-				if diff := cmp.Diff(tt.want2, node2ByID.SubnetRoutes(), util.Comparers...); diff != "" {
+				expectedRoutes2 := tt.want2
+				if len(expectedRoutes2) == 0 {
+					expectedRoutes2 = nil
+				}
+				if diff := cmp.Diff(expectedRoutes2, node2ByID.SubnetRoutes(), util.Comparers...); diff != "" {
 					t.Errorf("unexpected enabled routes (-want +got):\n%s", diff)
 				}
 			})
@@ -688,11 +679,11 @@ func TestRenameNode(t *testing.T) {
 	require.NoError(t, err)

 	err = db.DB.Transaction(func(tx *gorm.DB) error {
-		_, err := RegisterNode(tx, node, nil, nil)
+		_, err := RegisterNodeForTest(tx, node, nil, nil)
 		if err != nil {
 			return err
 		}
-		_, err = RegisterNode(tx, node2, nil, nil)
+		_, err = RegisterNodeForTest(tx, node2, nil, nil)

 		return err
 	})
@@ -789,11 +780,11 @@ func TestListPeers(t *testing.T) {
 	require.NoError(t, err)

 	err = db.DB.Transaction(func(tx *gorm.DB) error {
-		_, err := RegisterNode(tx, node1, nil, nil)
+		_, err := RegisterNodeForTest(tx, node1, nil, nil)
 		if err != nil {
 			return err
 		}
-		_, err = RegisterNode(tx, node2, nil, nil)
+		_, err = RegisterNodeForTest(tx, node2, nil, nil)

 		return err
 	})
@@ -874,11 +865,11 @@ func TestListNodes(t *testing.T) {
 	require.NoError(t, err)

 	err = db.DB.Transaction(func(tx *gorm.DB) error {
-		_, err := RegisterNode(tx, node1, nil, nil)
+		_, err := RegisterNodeForTest(tx, node1, nil, nil)
 		if err != nil {
 			return err
 		}
-		_, err = RegisterNode(tx, node2, nil, nil)
+		_, err = RegisterNodeForTest(tx, node2, nil, nil)

 		return err
 	})
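The `TestAutoApproveRoutes` hunk above rewrites empty expectations to `nil` before diffing. The reason is that Go's deep-equality helpers treat a nil slice and an empty non-nil slice as different values. A minimal stdlib sketch of that distinction follows; `normalizeEmpty` is a hypothetical helper and `reflect.DeepEqual` stands in for the `cmp.Diff` call used in the test.

```go
package main

import (
	"fmt"
	"reflect"
)

// normalizeEmpty is a hypothetical helper: it maps any zero-length slice to nil
// so that "no routes" has exactly one representation before comparison.
func normalizeEmpty[T any](s []T) []T {
	if len(s) == 0 {
		return nil
	}
	return s
}

func main() {
	var fromDB []string // nil, as when nothing was stored
	want := []string{}  // empty but non-nil, as in a test table literal

	// A nil slice and an empty slice are not deeply equal.
	fmt.Println(reflect.DeepEqual(want, fromDB)) // false

	// After normalizing, both sides are nil and compare equal.
	fmt.Println(reflect.DeepEqual(normalizeEmpty(want), fromDB)) // true
}
```

With go-cmp, passing `cmpopts.EquateEmpty()` is another way to get the same effect without rewriting the expectation.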
--- a/hscontrol/db/preauth_keys.go
+++ b/hscontrol/db/preauth_keys.go
@@ -5,6 +5,7 @@ import (
 	"encoding/hex"
 	"errors"
 	"fmt"
+	"slices"
 	"strings"
 	"time"

@@ -47,8 +48,9 @@ func CreatePreAuthKey(
 		return nil, err
 	}

-	// Remove duplicates
+	// Remove duplicates and sort for consistency
 	aclTags = set.SetOf(aclTags).Slice()
+	slices.Sort(aclTags)

 	// TODO(kradalby): factor out and create a reusable tag validation,
 	// check if there is one in Tailscale's lib.
@@ -109,9 +111,7 @@ func ListPreAuthKeysByUser(tx *gorm.DB, uid types.UserID) ([]types.PreAuthKey, e
 }

 func (hsdb *HSDatabase) GetPreAuthKey(key string) (*types.PreAuthKey, error) {
-	return Read(hsdb.DB, func(rx *gorm.DB) (*types.PreAuthKey, error) {
-		return GetPreAuthKey(rx, key)
-	})
+	return GetPreAuthKey(hsdb.DB, key)
 }

 // GetPreAuthKey returns a PreAuthKey for a given key. The caller is responsible
@@ -155,11 +155,8 @@ func UsePreAuthKey(tx *gorm.DB, k *types.PreAuthKey) error {

 // MarkExpirePreAuthKey marks a PreAuthKey as expired.
 func ExpirePreAuthKey(tx *gorm.DB, k *types.PreAuthKey) error {
-	if err := tx.Model(&k).Update("Expiration", time.Now()).Error; err != nil {
-		return err
-	}
-
-	return nil
+	now := time.Now()
+	return tx.Model(&types.PreAuthKey{}).Where("id = ?", k.ID).Update("expiration", now).Error
 }

 func generateKey() (string, error) {
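The `CreatePreAuthKey` hunk above deduplicates tags through a set and then sorts them, since a set does not give a stable order on its own. A stdlib-only sketch of the same idea, assuming Go 1.21 or newer; `normalizeTags` is a hypothetical name and not part of headscale.

```go
package main

import (
	"fmt"
	"slices"
)

// normalizeTags returns a sorted copy of tags with duplicates removed.
// It sorts first so that slices.Compact, which only drops adjacent
// duplicates, removes every repeated tag.
func normalizeTags(tags []string) []string {
	out := slices.Clone(tags) // leave the caller's slice untouched
	slices.Sort(out)
	return slices.Compact(out)
}

func main() {
	fmt.Println(normalizeTags([]string{"tag:web", "tag:db", "tag:web"}))
	// prints: [tag:db tag:web]
}
```

Sorting before storing means two keys created with the same tags in a different order end up with identical stored values.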
--- a/hscontrol/db/preauth_keys_test.go
+++ b/hscontrol/db/preauth_keys_test.go
@@ -1,7 +1,7 @@
 package db

 import (
-	"sort"
+	"slices"
 	"testing"

 	"github.com/juanfont/headscale/hscontrol/types"
@@ -57,7 +57,7 @@ func (*Suite) TestPreAuthKeyACLTags(c *check.C) {
 	listedPaks, err := db.ListPreAuthKeys(types.UserID(user.ID))
 	c.Assert(err, check.IsNil)
 	gotTags := listedPaks[0].Proto().GetAclTags()
-	sort.Sort(sort.StringSlice(gotTags))
+	slices.Sort(gotTags)
 	c.Assert(gotTags, check.DeepEquals, tags)
 }

--- a/hscontrol/db/text_serialiser.go
+++ b/hscontrol/db/text_serialiser.go
@@ -10,7 +10,7 @@ import (
 )

 // Got from https://github.com/xdg-go/strum/blob/main/types.go
-var textUnmarshalerType = reflect.TypeOf((*encoding.TextUnmarshaler)(nil)).Elem()
+var textUnmarshalerType = reflect.TypeFor[encoding.TextUnmarshaler]()

 func isTextUnmarshaler(rv reflect.Value) bool {
 	return rv.Type().Implements(textUnmarshalerType)
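The `text_serialiser.go` hunk above swaps the `reflect.TypeOf((*T)(nil)).Elem()` idiom for `reflect.TypeFor[T]()`, available from Go 1.22. A small sketch showing that both forms yield the same interface type and how the `Implements` check behaves for pointer and value receivers; the `level` type and the helper names are made up for illustration.

```go
package main

import (
	"encoding"
	"fmt"
	"reflect"
	"strconv"
)

// textUnmarshalerType is the reflect.Type of the interface itself.
var textUnmarshalerType = reflect.TypeFor[encoding.TextUnmarshaler]()

// level is a hypothetical type whose UnmarshalText has a pointer receiver.
type level int

func (l *level) UnmarshalText(b []byte) error {
	n, err := strconv.Atoi(string(b))
	if err != nil {
		return err
	}
	*l = level(n)
	return nil
}

// implementsTextUnmarshaler reports whether v's dynamic type implements
// encoding.TextUnmarshaler. v must be non-nil.
func implementsTextUnmarshaler(v any) bool {
	return reflect.TypeOf(v).Implements(textUnmarshalerType)
}

// sameAsOldIdiom compares the generic form with the pre-1.22 idiom.
func sameAsOldIdiom() bool {
	old := reflect.TypeOf((*encoding.TextUnmarshaler)(nil)).Elem()
	return old == textUnmarshalerType
}

func main() {
	var l level
	fmt.Println(sameAsOldIdiom())               // true
	fmt.Println(implementsTextUnmarshaler(&l)) // true: *level has the method
	fmt.Println(implementsTextUnmarshaler(l))  // false: level's method set lacks it
}
```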
--- a/hscontrol/db/users.go
+++ b/hscontrol/db/users.go
@@ -3,6 +3,8 @@ package db
 import (
 	"errors"
 	"fmt"
+	"strconv"
+	"testing"

 	"github.com/juanfont/headscale/hscontrol/types"
 	"github.com/juanfont/headscale/hscontrol/util"
@@ -110,9 +112,7 @@ func RenameUser(tx *gorm.DB, uid types.UserID, newName string) error {
 }

 func (hsdb *HSDatabase) GetUserByID(uid types.UserID) (*types.User, error) {
-	return Read(hsdb.DB, func(rx *gorm.DB) (*types.User, error) {
-		return GetUserByID(rx, uid)
-	})
+	return GetUserByID(hsdb.DB, uid)
 }

 func GetUserByID(tx *gorm.DB, uid types.UserID) (*types.User, error) {
@@ -146,9 +146,7 @@ func GetUserByOIDCIdentifier(tx *gorm.DB, id string) (*types.User, error) {
 }

 func (hsdb *HSDatabase) ListUsers(where ...*types.User) ([]types.User, error) {
-	return Read(hsdb.DB, func(rx *gorm.DB) ([]types.User, error) {
-		return ListUsers(rx, where...)
-	})
+	return ListUsers(hsdb.DB, where...)
 }

 // ListUsers gets all the existing users.
@@ -200,20 +198,58 @@ func ListNodesByUser(tx *gorm.DB, uid types.UserID) (types.Nodes, error) {
 }

 // AssignNodeToUser assigns a Node to a user.
+// Note: Validation should be done in the state layer before calling this function.
 func AssignNodeToUser(tx *gorm.DB, nodeID types.NodeID, uid types.UserID) error {
-	node, err := GetNodeByID(tx, nodeID)
-	if err != nil {
-		return err
+	// Check if the user exists
+	var userExists bool
+	if err := tx.Model(&types.User{}).Select("count(*) > 0").Where("id = ?", uid).Find(&userExists).Error; err != nil {
+		return fmt.Errorf("failed to check if user exists: %w", err)
 	}
-	user, err := GetUserByID(tx, uid)
-	if err != nil {
-		return err
+
+	if !userExists {
+		return ErrUserNotFound
 	}
-	node.User = *user
-	node.UserID = user.ID
-	if result := tx.Save(&node); result.Error != nil {
-		return result.Error
+
+	if err := tx.Model(&types.Node{}).Where("id = ?", nodeID).Update("user_id", uid).Error; err != nil {
+		return fmt.Errorf("failed to assign node to user: %w", err)
 	}

 	return nil
 }
+
+func (hsdb *HSDatabase) CreateUserForTest(name ...string) *types.User {
+	if !testing.Testing() {
+		panic("CreateUserForTest can only be called during tests")
+	}
+
+	userName := "testuser"
+	if len(name) > 0 && name[0] != "" {
+		userName = name[0]
+	}
+
+	user, err := hsdb.CreateUser(types.User{Name: userName})
+	if err != nil {
+		panic(fmt.Sprintf("failed to create test user: %v", err))
+	}
+
+	return user
+}
+
+func (hsdb *HSDatabase) CreateUsersForTest(count int, namePrefix ...string) []*types.User {
+	if !testing.Testing() {
+		panic("CreateUsersForTest can only be called during tests")
+	}
+
+	prefix := "testuser"
+	if len(namePrefix) > 0 && namePrefix[0] != "" {
+		prefix = namePrefix[0]
+	}
+
+	users := make([]*types.User, count)
+	for i := range count {
+		name := prefix + "-" + strconv.Itoa(i)
+		users[i] = hsdb.CreateUserForTest(name)
+	}
+
+	return users
+}
--- a/hscontrol/db/users_test.go
+++ b/hscontrol/db/users_test.go
@@ -11,8 +11,7 @@ import (
 )

 func (s *Suite) TestCreateAndDestroyUser(c *check.C) {
-	user, err := db.CreateUser(types.User{Name: "test"})
-	c.Assert(err, check.IsNil)
+	user := db.CreateUserForTest("test")
 	c.Assert(user.Name, check.Equals, "test")

 	users, err := db.ListUsers()
@@ -30,8 +29,7 @@ func (s *Suite) TestDestroyUserErrors(c *check.C) {
 	err := db.DestroyUser(9998)
 	c.Assert(err, check.Equals, ErrUserNotFound)

-	user, err := db.CreateUser(types.User{Name: "test"})
-	c.Assert(err, check.IsNil)
+	user := db.CreateUserForTest("test")

 	pak, err := db.CreatePreAuthKey(types.UserID(user.ID), false, false, nil, nil)
 	c.Assert(err, check.IsNil)
@@ -64,8 +62,7 @@ func (s *Suite) TestDestroyUserErrors(c *check.C) {
 }

 func (s *Suite) TestRenameUser(c *check.C) {
-	userTest, err := db.CreateUser(types.User{Name: "test"})
-	c.Assert(err, check.IsNil)
+	userTest := db.CreateUserForTest("test")
 	c.Assert(userTest.Name, check.Equals, "test")

 	users, err := db.ListUsers()
@@ -86,8 +83,7 @@ func (s *Suite) TestRenameUser(c *check.C) {
 	err = db.RenameUser(99988, "test")
 	c.Assert(err, check.Equals, ErrUserNotFound)

-	userTest2, err := db.CreateUser(types.User{Name: "test2"})
-	c.Assert(err, check.IsNil)
+	userTest2 := db.CreateUserForTest("test2")
 	c.Assert(userTest2.Name, check.Equals, "test2")

 	want := "UNIQUE constraint failed"
@@ -98,11 +94,8 @@ func (s *Suite) TestRenameUser(c *check.C) {
 }

 func (s *Suite) TestSetMachineUser(c *check.C) {
-	oldUser, err := db.CreateUser(types.User{Name: "old"})
-	c.Assert(err, check.IsNil)
-
-	newUser, err := db.CreateUser(types.User{Name: "new"})
-	c.Assert(err, check.IsNil)
+	oldUser := db.CreateUserForTest("old")
+	newUser := db.CreateUserForTest("new")

 	pak, err := db.CreatePreAuthKey(types.UserID(oldUser.ID), false, false, nil, nil)
 	c.Assert(err, check.IsNil)
--- a/hscontrol/debug.go
+++ b/hscontrol/debug.go
@@ -4,62 +4,83 @@ import (
 	"encoding/json"
 	"fmt"
 	"net/http"
-	"os"
+	"strings"

 	"github.com/arl/statsviz"
+	"github.com/juanfont/headscale/hscontrol/mapper"
 	"github.com/juanfont/headscale/hscontrol/types"
-	"github.com/juanfont/headscale/hscontrol/util"
 	"github.com/prometheus/client_golang/prometheus/promhttp"
-	"tailscale.com/tailcfg"
 	"tailscale.com/tsweb"
 )

 func (h *Headscale) debugHTTPServer() *http.Server {
 	debugMux := http.NewServeMux()
 	debug := tsweb.Debugger(debugMux)
-	debug.Handle("notifier", "Connected nodes in notifier", http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
-		w.WriteHeader(http.StatusOK)
-		w.Write([]byte(h.nodeNotifier.String()))
+
+	// State overview endpoint
+	debug.Handle("overview", "State overview", http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
+		// Check Accept header to determine response format
+		acceptHeader := r.Header.Get("Accept")
+		wantsJSON := strings.Contains(acceptHeader, "application/json")
+
+		if wantsJSON {
+			overview := h.state.DebugOverviewJSON()
+			overviewJSON, err := json.MarshalIndent(overview, "", "  ")
+			if err != nil {
+				httpError(w, err)
+				return
+			}
+			w.Header().Set("Content-Type", "application/json")
+			w.WriteHeader(http.StatusOK)
+			w.Write(overviewJSON)
+		} else {
+			// Default to text/plain for backward compatibility
+			overview := h.state.DebugOverview()
+			w.Header().Set("Content-Type", "text/plain")
+			w.WriteHeader(http.StatusOK)
+			w.Write([]byte(overview))
+		}
 	}))
+
+	// Configuration endpoint
 	debug.Handle("config", "Current configuration", http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
-		config, err := json.MarshalIndent(h.cfg, "", "  ")
+		config := h.state.DebugConfig()
+		configJSON, err := json.MarshalIndent(config, "", "  ")
 		if err != nil {
 			httpError(w, err)
 			return
 		}
 		w.Header().Set("Content-Type", "application/json")
 		w.WriteHeader(http.StatusOK)
-		w.Write(config)
+		w.Write(configJSON)
 	}))
-	debug.Handle("policy", "Current policy", http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
-		switch h.cfg.Policy.Mode {
-		case types.PolicyModeDB:
-			p, err := h.state.GetPolicy()
-			if err != nil {
-				httpError(w, err)
-				return
-			}
-			w.Header().Set("Content-Type", "application/json")
-			w.WriteHeader(http.StatusOK)
-			w.Write([]byte(p.Data))
-		case types.PolicyModeFile:
-			// Read the file directly for debug purposes
-			absPath := util.AbsolutePathFromConfigPath(h.cfg.Policy.Path)
-			pol, err := os.ReadFile(absPath)
-			if err != nil {
-				httpError(w, err)
-				return
-			}
-			w.Header().Set("Content-Type", "application/json")
-			w.WriteHeader(http.StatusOK)
-			w.Write(pol)
-		default:
-			httpError(w, fmt.Errorf("unsupported policy mode: %s", h.cfg.Policy.Mode))
-		}
-	}))
-	debug.Handle("filter", "Current filter", http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
-		filter, _ := h.state.Filter()

+	// Policy endpoint
+	debug.Handle("policy", "Current policy", http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
+		policy, err := h.state.DebugPolicy()
+		if err != nil {
+			httpError(w, err)
+			return
+		}
+		// Policy data is HuJSON, which is a superset of JSON
+		// Set content type based on Accept header preference
+		acceptHeader := r.Header.Get("Accept")
+		if strings.Contains(acceptHeader, "application/json") {
+			w.Header().Set("Content-Type", "application/json")
+		} else {
+			w.Header().Set("Content-Type", "text/plain")
+		}
+		w.WriteHeader(http.StatusOK)
+		w.Write([]byte(policy))
+	}))
+
+	// Filter rules endpoint
+	debug.Handle("filter", "Current filter rules", http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
+		filter, err := h.state.DebugFilter()
+		if err != nil {
+			httpError(w, err)
+			return
+		}
 		filterJSON, err := json.MarshalIndent(filter, "", "  ")
 		if err != nil {
 			httpError(w, err)
@@ -69,25 +90,11 @@ func (h *Headscale) debugHTTPServer() *http.Server {
 		w.WriteHeader(http.StatusOK)
 		w.Write(filterJSON)
 	}))
-	debug.Handle("ssh", "SSH Policy per node", http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
-		nodes, err := h.state.ListNodes()
-		if err != nil {
-			httpError(w, err)
-			return
-		}

-		sshPol := make(map[string]*tailcfg.SSHPolicy)
-		for _, node := range nodes {
-			pol, err := h.state.SSHPolicy(node.View())
-			if err != nil {
-				httpError(w, err)
-				return
-			}
-
-			sshPol[fmt.Sprintf("id:%d  hostname:%s givenname:%s", node.ID, node.Hostname, node.GivenName)] = pol
-		}
-
-		sshJSON, err := json.MarshalIndent(sshPol, "", "  ")
+	// SSH policies endpoint
+	debug.Handle("ssh", "SSH policies per node", http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
+		sshPolicies := h.state.DebugSSHPolicies()
+		sshJSON, err := json.MarshalIndent(sshPolicies, "", "  ")
 		if err != nil {
 			httpError(w, err)
 			return
@@ -96,33 +103,169 @@ func (h *Headscale) debugHTTPServer() *http.Server {
 		w.WriteHeader(http.StatusOK)
 		w.Write(sshJSON)
 	}))
-	debug.Handle("derpmap", "Current DERPMap", http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
-		dm := h.state.DERPMap()

-		dmJSON, err := json.MarshalIndent(dm, "", "  ")
+	// DERP map endpoint
+	debug.Handle("derp", "DERP map configuration", http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
+		// Check Accept header to determine response format
+		acceptHeader := r.Header.Get("Accept")
+		wantsJSON := strings.Contains(acceptHeader, "application/json")
+
+		if wantsJSON {
+			derpInfo := h.state.DebugDERPJSON()
+			derpJSON, err := json.MarshalIndent(derpInfo, "", "  ")
+			if err != nil {
+				httpError(w, err)
+				return
+			}
+			w.Header().Set("Content-Type", "application/json")
+			w.WriteHeader(http.StatusOK)
+			w.Write(derpJSON)
+		} else {
+			// Default to text/plain for backward compatibility
+			derpInfo := h.state.DebugDERPMap()
+			w.Header().Set("Content-Type", "text/plain")
+			w.WriteHeader(http.StatusOK)
+			w.Write([]byte(derpInfo))
+		}
+	}))
+
+	// NodeStore endpoint
+	debug.Handle("nodestore", "NodeStore information", http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
+		// Check Accept header to determine response format
+		acceptHeader := r.Header.Get("Accept")
+		wantsJSON := strings.Contains(acceptHeader, "application/json")
+
+		if wantsJSON {
+			nodeStoreNodes := h.state.DebugNodeStoreJSON()
+			nodeStoreJSON, err := json.MarshalIndent(nodeStoreNodes, "", "  ")
+			if err != nil {
+				httpError(w, err)
+				return
+			}
+			w.Header().Set("Content-Type", "application/json")
+			w.WriteHeader(http.StatusOK)
+			w.Write(nodeStoreJSON)
+		} else {
+			// Default to text/plain for backward compatibility
+			nodeStoreInfo := h.state.DebugNodeStore()
+			w.Header().Set("Content-Type", "text/plain")
+			w.WriteHeader(http.StatusOK)
+			w.Write([]byte(nodeStoreInfo))
+		}
+	}))
+
+	// Registration cache endpoint
+	debug.Handle("registration-cache", "Registration cache information", http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
+		cacheInfo := h.state.DebugRegistrationCache()
+		cacheJSON, err := json.MarshalIndent(cacheInfo, "", "  ")
 		if err != nil {
 			httpError(w, err)
 			return
 		}
 		w.Header().Set("Content-Type", "application/json")
 		w.WriteHeader(http.StatusOK)
-		w.Write(dmJSON)
+		w.Write(cacheJSON)
 	}))
-	debug.Handle("registration-cache", "Pending registrations", http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
-		// TODO(kradalby): This should be replaced with a proper state method that returns registration info
+
+	// Routes endpoint
+	debug.Handle("routes", "Primary routes", http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
+		// Check Accept header to determine response format
+		acceptHeader := r.Header.Get("Accept")
+		wantsJSON := strings.Contains(acceptHeader, "application/json")
+
+		if wantsJSON {
+			routes := h.state.DebugRoutes()
+			routesJSON, err := json.MarshalIndent(routes, "", "  ")
+			if err != nil {
+				httpError(w, err)
+				return
+			}
+			w.Header().Set("Content-Type", "application/json")
+			w.WriteHeader(http.StatusOK)
+			w.Write(routesJSON)
+		} else {
+			// Default to text/plain for backward compatibility
+			routes := h.state.DebugRoutesString()
+			w.Header().Set("Content-Type", "text/plain")
+			w.WriteHeader(http.StatusOK)
+			w.Write([]byte(routes))
+		}
+	}))
+
+	// Policy manager endpoint
+	debug.Handle("policy-manager", "Policy manager state", http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
+		// Check Accept header to determine response format
+		acceptHeader := r.Header.Get("Accept")
+		wantsJSON := strings.Contains(acceptHeader, "application/json")
+
+		if wantsJSON {
+			policyManagerInfo := h.state.DebugPolicyManagerJSON()
+			policyManagerJSON, err := json.MarshalIndent(policyManagerInfo, "", "  ")
+			if err != nil {
+				httpError(w, err)
+				return
+			}
+			w.Header().Set("Content-Type", "application/json")
+			w.WriteHeader(http.StatusOK)
+			w.Write(policyManagerJSON)
+		} else {
+			// Default to text/plain for backward compatibility
+			policyManagerInfo := h.state.DebugPolicyManager()
+			w.Header().Set("Content-Type", "text/plain")
+			w.WriteHeader(http.StatusOK)
+			w.Write([]byte(policyManagerInfo))
+		}
+	}))
+
+	debug.Handle("mapresponses", "Map responses for all nodes", http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
+		res, err := h.mapBatcher.DebugMapResponses()
+		if err != nil {
+			httpError(w, err)
+			return
+		}
+
+		if res == nil {
+			w.WriteHeader(http.StatusOK)
+			w.Write([]byte("HEADSCALE_DEBUG_DUMP_MAPRESPONSE_PATH not set"))
+			return
+		}
+
+		resJSON, err := json.MarshalIndent(res, "", "  ")
+		if err != nil {
+			httpError(w, err)
+			return
+		}
 		w.Header().Set("Content-Type", "application/json")
 		w.WriteHeader(http.StatusOK)
-		w.Write([]byte("{}")) // For now, return empty object
+		w.Write(resJSON)
 	}))
-	debug.Handle("routes", "Routes", http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
-		w.Header().Set("Content-Type", "text/plain")
-		w.WriteHeader(http.StatusOK)
-		w.Write([]byte(h.state.PrimaryRoutesString()))
-	}))
-	debug.Handle("policy-manager", "Policy Manager", http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
-		w.Header().Set("Content-Type", "text/plain")
-		w.WriteHeader(http.StatusOK)
-		w.Write([]byte(h.state.PolicyDebugString()))
+
+	// Batcher endpoint
+	debug.Handle("batcher", "Batcher connected nodes", http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
+		// Check Accept header to determine response format
+		acceptHeader := r.Header.Get("Accept")
+		wantsJSON := strings.Contains(acceptHeader, "application/json")
+
+		if wantsJSON {
+			batcherInfo := h.debugBatcherJSON()
+
+			batcherJSON, err := json.MarshalIndent(batcherInfo, "", "  ")
+			if err != nil {
+				httpError(w, err)
+				return
+			}
+
+			w.Header().Set("Content-Type", "application/json")
+			w.WriteHeader(http.StatusOK)
+			w.Write(batcherJSON)
+		} else {
+			// Default to text/plain for backward compatibility
+			batcherInfo := h.debugBatcher()
+
+			w.Header().Set("Content-Type", "text/plain")
+			w.WriteHeader(http.StatusOK)
+			w.Write([]byte(batcherInfo))
+		}
 	}))

 	err := statsviz.Register(debugMux)
@@ -142,3 +285,124 @@ func (h *Headscale) debugHTTPServer() *http.Server {

 	return debugHTTPServer
 }
+
+// debugBatcher returns debug information about the batcher's connected nodes.
+func (h *Headscale) debugBatcher() string {
+	var sb strings.Builder
+	sb.WriteString("=== Batcher Connected Nodes ===\n\n")
+
+	totalNodes := 0
+	connectedCount := 0
+
+	// Collect nodes and sort them by ID
+	type nodeStatus struct {
+		id                types.NodeID
+		connected         bool
+		activeConnections int
+	}
+
+	var nodes []nodeStatus
+
+	// Try to get detailed debug info if we have a LockFreeBatcher
+	if batcher, ok := h.mapBatcher.(*mapper.LockFreeBatcher); ok {
+		debugInfo := batcher.Debug()
+		for nodeID, info := range debugInfo {
+			nodes = append(nodes, nodeStatus{
+				id:                nodeID,
+				connected:         info.Connected,
+				activeConnections: info.ActiveConnections,
+			})
+			totalNodes++
+			if info.Connected {
+				connectedCount++
+			}
+		}
+	} else {
+		// Fallback to basic connection info
+		connectedMap := h.mapBatcher.ConnectedMap()
+		connectedMap.Range(func(nodeID types.NodeID, connected bool) bool {
+			nodes = append(nodes, nodeStatus{
+				id:                nodeID,
+				connected:         connected,
+				activeConnections: 0,
+			})
+			totalNodes++
+			if connected {
+				connectedCount++
+			}
+			return true
+		})
+	}
+
+	// Sort by node ID
+	for i := 0; i < len(nodes); i++ {
+		for j := i + 1; j < len(nodes); j++ {
+			if nodes[i].id > nodes[j].id {
+				nodes[i], nodes[j] = nodes[j], nodes[i]
+			}
+		}
+	}
+
+	// Output sorted nodes
+	for _, node := range nodes {
+		status := "disconnected"
+		if node.connected {
+			status = "connected"
+		}
+
+		if node.activeConnections > 0 {
+			fmt.Fprintf(&sb, "Node %d:\t%s (%d connections)\n", node.id, status, node.activeConnections)
+		} else {
+			fmt.Fprintf(&sb, "Node %d:\t%s\n", node.id, status)
+		}
+	}
+
+	fmt.Fprintf(&sb, "\nSummary: %d connected, %d total\n", connectedCount, totalNodes)
+
+	return sb.String()
+}
+
+// DebugBatcherInfo represents batcher connection information in a structured format.
+type DebugBatcherInfo struct {
+	ConnectedNodes map[string]DebugBatcherNodeInfo `json:"connected_nodes"` // NodeID -> node connection info
+	TotalNodes     int                             `json:"total_nodes"`
+}
+
+// DebugBatcherNodeInfo represents connection information for a single node.
+type DebugBatcherNodeInfo struct {
+	Connected         bool `json:"connected"`
+	ActiveConnections int  `json:"active_connections"`
+}
+
+// debugBatcherJSON returns structured debug information about the batcher's connected nodes.
+func (h *Headscale) debugBatcherJSON() DebugBatcherInfo {
+	info := DebugBatcherInfo{
+		ConnectedNodes: make(map[string]DebugBatcherNodeInfo),
+		TotalNodes:     0,
+	}
+
+	// Try to get detailed debug info if we have a LockFreeBatcher
+	if batcher, ok := h.mapBatcher.(*mapper.LockFreeBatcher); ok {
+		debugInfo := batcher.Debug()
+		for nodeID, debugData := range debugInfo {
+			info.ConnectedNodes[fmt.Sprintf("%d", nodeID)] = DebugBatcherNodeInfo{
+				Connected:         debugData.Connected,
+				ActiveConnections: debugData.ActiveConnections,
+			}
+			info.TotalNodes++
+		}
+	} else {
+		// Fallback to basic connection info
+		connectedMap := h.mapBatcher.ConnectedMap()
+		connectedMap.Range(func(nodeID types.NodeID, connected bool) bool {
+			info.ConnectedNodes[fmt.Sprintf("%d", nodeID)] = DebugBatcherNodeInfo{
+				Connected:         connected,
+				ActiveConnections: 0,
+			}
+			info.TotalNodes++
+			return true
+		})
+	}
+
+	return info
+}
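Several handlers in the `debug.go` diff above repeat the same Accept-header branch: JSON when the client asks for `application/json`, plain text otherwise. One way this could be factored is sketched below with only the standard library. `debugHandler` and `probe` are hypothetical helpers, the request path is illustrative, and the sketch uses `http.Error` where the diff calls its own `httpError`.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// debugHandler is a hypothetical helper. It serves data() as indented JSON
// when the Accept header mentions application/json and text() otherwise.
func debugHandler(text func() string, data func() any) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if strings.Contains(r.Header.Get("Accept"), "application/json") {
			b, err := json.MarshalIndent(data(), "", "  ")
			if err != nil {
				http.Error(w, err.Error(), http.StatusInternalServerError)
				return
			}
			w.Header().Set("Content-Type", "application/json")
			w.WriteHeader(http.StatusOK)
			w.Write(b)
			return
		}
		// Plain text stays the default, as in the handlers above.
		w.Header().Set("Content-Type", "text/plain")
		w.WriteHeader(http.StatusOK)
		w.Write([]byte(text()))
	})
}

// probe sends an in-memory request with the given Accept header and
// returns the response Content-Type and body.
func probe(accept string) (string, string) {
	h := debugHandler(
		func() string { return "2 connected, 3 total\n" },
		func() any { return map[string]int{"connected": 2, "total": 3} },
	)
	req := httptest.NewRequest(http.MethodGet, "/debug/batcher", nil)
	if accept != "" {
		req.Header.Set("Accept", accept)
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Header().Get("Content-Type"), rec.Body.String()
}

func main() {
	for _, accept := range []string{"", "application/json"} {
		ct, body := probe(accept)
		fmt.Printf("Accept=%q -> %s\n%s\n", accept, ct, body)
	}
}
```

Note that the substring test also matches headers such as `text/plain, application/json;q=0.1`; the diff makes the same trade-off, which is reasonable for a debug endpoint.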
--- a/hscontrol/derp/derp.go
+++ b/hscontrol/derp/derp.go
@@ -1,15 +1,22 @@
 package derp

 import (
+	"cmp"
 	"context"
 	"encoding/json"
+	"hash/crc64"
 	"io"
+	"maps"
+	"math/rand"
 	"net/http"
 	"net/url"
 	"os"
+	"reflect"
+	"sync"
+	"time"

 	"github.com/juanfont/headscale/hscontrol/types"
-	"github.com/rs/zerolog/log"
+	"github.com/spf13/viper"
 	"gopkg.in/yaml.v3"
 	"tailscale.com/tailcfg"
 )
@@ -72,61 +79,91 @@ func mergeDERPMaps(derpMaps []*tailcfg.DERPMap) *tailcfg.DERPMap {
 	}

 	for _, derpMap := range derpMaps {
-		for id, region := range derpMap.Regions {
-			result.Regions[id] = region
+		maps.Copy(result.Regions, derpMap.Regions)
+	}
+
+	for id, region := range result.Regions {
+		if region == nil {
+			delete(result.Regions, id)
 		}
 	}

 	return &result
 }

-func GetDERPMap(cfg types.DERPConfig) *tailcfg.DERPMap {
+func GetDERPMap(cfg types.DERPConfig) (*tailcfg.DERPMap, error) {
 	var derpMaps []*tailcfg.DERPMap
 	if cfg.DERPMap != nil {
 		derpMaps = append(derpMaps, cfg.DERPMap)
 	}

-	for _, path := range cfg.Paths {
-		log.Debug().
-			Str("func", "GetDERPMap").
-			Str("path", path).
-			Msg("Loading DERPMap from path")
-		derpMap, err := loadDERPMapFromPath(path)
+	for _, addr := range cfg.URLs {
+		derpMap, err := loadDERPMapFromURL(addr)
 		if err != nil {
-			log.Error().
-				Str("func", "GetDERPMap").
-				Str("path", path).
-				Err(err).
-				Msg("Could not load DERP map from path")
-
-			break
+			return nil, err
 		}

 		derpMaps = append(derpMaps, derpMap)
 	}

-	for _, addr := range cfg.URLs {
-		derpMap, err := loadDERPMapFromURL(addr)
-		log.Debug().
-			Str("func", "GetDERPMap").
-			Str("url", addr.String()).
-			Msg("Loading DERPMap from path")
+	for _, path := range cfg.Paths {
+		derpMap, err := loadDERPMapFromPath(path)
 		if err != nil {
-			log.Error().
-				Str("func", "GetDERPMap").
-				Str("url", addr.String()).
-				Err(err).
-				Msg("Could not load DERP map from path")
-
-			break
+			return nil, err
 		}

 		derpMaps = append(derpMaps, derpMap)
 	}

 	derpMap := mergeDERPMaps(derpMaps)
+	shuffleDERPMap(derpMap)

-	log.Trace().Interface("derpMap", derpMap).Msg("DERPMap loaded")
-
-	return derpMap
+	return derpMap, nil
+}
+
+func shuffleDERPMap(dm *tailcfg.DERPMap) {
+	if dm == nil || len(dm.Regions) == 0 {
+		return
+	}
+
+	for id, region := range dm.Regions {
+		if len(region.Nodes) == 0 {
+			continue
+		}
+
+		dm.Regions[id] = shuffleRegionNoClone(region)
+	}
+}
+
+var crc64Table = crc64.MakeTable(crc64.ISO)
+
+var (
+	derpRandomOnce sync.Once
+	derpRandomInst *rand.Rand
+	derpRandomMu   sync.Mutex
+)
+
+func derpRandom() *rand.Rand {
+	derpRandomMu.Lock()
+	defer derpRandomMu.Unlock()
+
+	derpRandomOnce.Do(func() {
+		seed := cmp.Or(viper.GetString("dns.base_domain"), time.Now().String())
+		rnd := rand.New(rand.NewSource(0))
+		rnd.Seed(int64(crc64.Checksum([]byte(seed), crc64Table)))
+		derpRandomInst = rnd
+	})
+	return derpRandomInst
+}
+
+func resetDerpRandomForTesting() {
+	derpRandomMu.Lock()
+	defer derpRandomMu.Unlock()
+	derpRandomOnce = sync.Once{}
+	derpRandomInst = nil
+}
+
+func shuffleRegionNoClone(r *tailcfg.DERPRegion) *tailcfg.DERPRegion {
+	derpRandom().Shuffle(len(r.Nodes), reflect.Swapper(r.Nodes))
+	return r
 }
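The `derp.go` changes above seed a `math/rand` generator from a CRC64 checksum of the base domain, so the node order is shuffled but reproducible for a given domain. A minimal sketch of that seeding idea, under the assumption that a fresh generator per call is acceptable; `seededShuffle` is a hypothetical function, not the one in the diff.

```go
package main

import (
	"fmt"
	"hash/crc64"
	"math/rand"
)

var crc64Table = crc64.MakeTable(crc64.ISO)

// seededShuffle returns a shuffled copy of items. The permutation depends
// only on seed and len(items), so equal inputs give equal outputs.
func seededShuffle(seed string, items []string) []string {
	out := append([]string(nil), items...) // copy; do not reorder the input
	sum := crc64.Checksum([]byte(seed), crc64Table)
	rnd := rand.New(rand.NewSource(int64(sum)))
	rnd.Shuffle(len(out), func(i, j int) { out[i], out[j] = out[j], out[i] })
	return out
}

func main() {
	nodes := []string{"1f", "1g", "1h", "1i"}
	a := seededShuffle("test1.example.com", nodes)
	b := seededShuffle("test1.example.com", nodes)
	fmt.Println(a)
	fmt.Println(b) // same order as a
}
```

Unlike this sketch, the diff shares one generator across all regions and walks `dm.Regions`, a Go map whose iteration order is unspecified. With more than one region, the order assigned to each region may therefore depend on which region is visited first.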
--- a/hscontrol/derp/derp_test.go
+++ b/hscontrol/derp/derp_test.go
@@ -0,0 +1,283 @@
+package derp
+
+import (
+	"testing"
+
+	"github.com/google/go-cmp/cmp"
+	"github.com/spf13/viper"
+	"tailscale.com/tailcfg"
+)
+
+func TestShuffleDERPMapDeterministic(t *testing.T) {
+	tests := []struct {
+		name       string
+		baseDomain string
+		derpMap    *tailcfg.DERPMap
+		expected   *tailcfg.DERPMap
+	}{
+		{
+			name:       "single region with 4 nodes",
+			baseDomain: "test1.example.com",
+			derpMap: &tailcfg.DERPMap{
+				Regions: map[int]*tailcfg.DERPRegion{
+					1: {
+						RegionID:   1,
+						RegionCode: "nyc",
+						RegionName: "New York City",
+						Nodes: []*tailcfg.DERPNode{
+							{Name: "1f", RegionID: 1, HostName: "derp1f.tailscale.com"},
+							{Name: "1g", RegionID: 1, HostName: "derp1g.tailscale.com"},
+							{Name: "1h", RegionID: 1, HostName: "derp1h.tailscale.com"},
+							{Name: "1i", RegionID: 1, HostName: "derp1i.tailscale.com"},
+						},
+					},
+				},
+			},
+			expected: &tailcfg.DERPMap{
+				Regions: map[int]*tailcfg.DERPRegion{
+					1: {
+						RegionID:   1,
+						RegionCode: "nyc",
+						RegionName: "New York City",
+						Nodes: []*tailcfg.DERPNode{
+							{Name: "1g", RegionID: 1, HostName: "derp1g.tailscale.com"},
+							{Name: "1f", RegionID: 1, HostName: "derp1f.tailscale.com"},
+							{Name: "1i", RegionID: 1, HostName: "derp1i.tailscale.com"},
+							{Name: "1h", RegionID: 1, HostName: "derp1h.tailscale.com"},
+						},
+					},
+				},
+			},
+		},
+		{
+			name:       "multiple regions with nodes",
+			baseDomain: "test2.example.com",
+			derpMap: &tailcfg.DERPMap{
+				Regions: map[int]*tailcfg.DERPRegion{
+					10: {
+						RegionID:   10,
+						RegionCode: "sea",
+						RegionName: "Seattle",
+						Nodes: []*tailcfg.DERPNode{
+							{Name: "10b", RegionID: 10, HostName: "derp10b.tailscale.com"},
+							{Name: "10c", RegionID: 10, HostName: "derp10c.tailscale.com"},
+							{Name: "10d", RegionID: 10, HostName: "derp10d.tailscale.com"},
+						},
+					},
+					2: {
+						RegionID:   2,
+						RegionCode: "sfo",
+						RegionName: "San Francisco",
+						Nodes: []*tailcfg.DERPNode{
+							{Name: "2d", RegionID: 2, HostName: "derp2d.tailscale.com"},
+							{Name: "2e", RegionID: 2, HostName: "derp2e.tailscale.com"},
+							{Name: "2f", RegionID: 2, HostName: "derp2f.tailscale.com"},
+						},
+					},
+				},
+			},
+			expected: &tailcfg.DERPMap{
+				Regions: map[int]*tailcfg.DERPRegion{
+					10: {
+						RegionID:   10,
+						RegionCode: "sea",
+						RegionName: "Seattle",
+						Nodes: []*tailcfg.DERPNode{
+							{Name: "10b", RegionID: 10, HostName: "derp10b.tailscale.com"},
+							{Name: "10c", RegionID: 10, HostName: "derp10c.tailscale.com"},
+							{Name: "10d", RegionID: 10, HostName: "derp10d.tailscale.com"},
+						},
+					},
+					2: {
+						RegionID:   2,
+						RegionCode: "sfo",
+						RegionName: "San Francisco",
+						Nodes: []*tailcfg.DERPNode{
+							{Name: "2f", RegionID: 2, HostName: "derp2f.tailscale.com"},
+							{Name: "2e", RegionID: 2, HostName: "derp2e.tailscale.com"},
+							{Name: "2d", RegionID: 2, HostName: "derp2d.tailscale.com"},
+						},
+					},
+				},
+			},
+		},
+		{
+			name:       "large region with many nodes",
+			baseDomain: "test3.example.com",
+			derpMap: &tailcfg.DERPMap{
+				Regions: map[int]*tailcfg.DERPRegion{
+					4: {
+						RegionID:   4,
+						RegionCode: "fra",
+						RegionName: "Frankfurt",
+						Nodes: []*tailcfg.DERPNode{
+							{Name: "4f", RegionID: 4, HostName: "derp4f.tailscale.com"},
+							{Name: "4g", RegionID: 4, HostName: "derp4g.tailscale.com"},
+							{Name: "4h", RegionID: 4, HostName: "derp4h.tailscale.com"},
+							{Name: "4i", RegionID: 4, HostName: "derp4i.tailscale.com"},
+						},
+					},
+				},
+			},
+			expected: &tailcfg.DERPMap{
+				Regions: map[int]*tailcfg.DERPRegion{
+					4: {
+						RegionID:   4,
+						RegionCode: "fra",
+						RegionName: "Frankfurt",
+						Nodes: []*tailcfg.DERPNode{
+							{Name: "4f", RegionID: 4, HostName: "derp4f.tailscale.com"},
+							{Name: "4h", RegionID: 4, HostName: "derp4h.tailscale.com"},
+							{Name: "4g", RegionID: 4, HostName: "derp4g.tailscale.com"},
+							{Name: "4i", RegionID: 4, HostName: "derp4i.tailscale.com"},
+						},
+					},
+				},
+			},
+		},
+		{
+			name:       "same region different base domain",
+			baseDomain: "different.example.com",
+			derpMap: &tailcfg.DERPMap{
+				Regions: map[int]*tailcfg.DERPRegion{
+					4: {
+						RegionID:   4,
+						RegionCode: "fra",
+						RegionName: "Frankfurt",
+						Nodes: []*tailcfg.DERPNode{
+							{Name: "4f", RegionID: 4, HostName: "derp4f.tailscale.com"},
+							{Name: "4g", RegionID: 4, HostName: "derp4g.tailscale.com"},
+							{Name: "4h", RegionID: 4, HostName: "derp4h.tailscale.com"},
+							{Name: "4i", RegionID: 4, HostName: "derp4i.tailscale.com"},
+						},
+					},
+				},
+			},
+			expected: &tailcfg.DERPMap{
+				Regions: map[int]*tailcfg.DERPRegion{
+					4: {
+						RegionID:   4,
+						RegionCode: "fra",
+						RegionName: "Frankfurt",
+						Nodes: []*tailcfg.DERPNode{
+							{Name: "4g", RegionID: 4, HostName: "derp4g.tailscale.com"},
+							{Name: "4i", RegionID: 4, HostName: "derp4i.tailscale.com"},
+							{Name: "4f", RegionID: 4, HostName: "derp4f.tailscale.com"},
+							{Name: "4h", RegionID: 4, HostName: "derp4h.tailscale.com"},
+						},
+					},
+				},
+			},
+		},
+	}
+
+	for _, tt := range tests {
+		t.Run(tt.name, func(t *testing.T) {
+			viper.Set("dns.base_domain", tt.baseDomain)
+			defer viper.Reset()
+			resetDerpRandomForTesting()
+
+			testMap := tt.derpMap.View().AsStruct()
+			shuffleDERPMap(testMap)
+
+			if diff := cmp.Diff(tt.expected, testMap); diff != "" {
+				t.Errorf("Shuffled DERP map doesn't match expected (-expected +actual):\n%s", diff)
+			}
+		})
+	}
+
+}
+
+func TestShuffleDERPMapEdgeCases(t *testing.T) {
+	tests := []struct {
+		name    string
+		derpMap *tailcfg.DERPMap
+	}{
+		{
+			name:    "nil derp map",
+			derpMap: nil,
+		},
+		{
+			name: "empty derp map",
+			derpMap: &tailcfg.DERPMap{
+				Regions: map[int]*tailcfg.DERPRegion{},
+			},
+		},
+		{
+			name: "region with no nodes",
+			derpMap: &tailcfg.DERPMap{
+				Regions: map[int]*tailcfg.DERPRegion{
+					1: {
+						RegionID:   1,
+						RegionCode: "empty",
+						RegionName: "Empty Region",
+						Nodes:      []*tailcfg.DERPNode{},
+					},
+				},
+			},
+		},
+		{
+			name: "region with single node",
+			derpMap: &tailcfg.DERPMap{
+				Regions: map[int]*tailcfg.DERPRegion{
+					1: {
+						RegionID:   1,
+						RegionCode: "single",
+						RegionName: "Single Node Region",
+						Nodes: []*tailcfg.DERPNode{
+							{Name: "1a", RegionID: 1, HostName: "derp1a.tailscale.com"},
+						},
+					},
+				},
+			},
+		},
+	}
+
+	for _, tt := range tests {
+		t.Run(tt.name, func(t *testing.T) {
+			shuffleDERPMap(tt.derpMap)
+		})
+	}
+}
+
+func TestShuffleDERPMapWithoutBaseDomain(t *testing.T) {
+	viper.Reset()
+	resetDerpRandomForTesting()
+
+	derpMap := &tailcfg.DERPMap{
+		Regions: map[int]*tailcfg.DERPRegion{
+			1: {
+				RegionID:   1,
+				RegionCode: "test",
+				RegionName: "Test Region",
+				Nodes: []*tailcfg.DERPNode{
+					{Name: "1a", RegionID: 1, HostName: "derp1a.test.com"},
+					{Name: "1b", RegionID: 1, HostName: "derp1b.test.com"},
+					{Name: "1c", RegionID: 1, HostName: "derp1c.test.com"},
+					{Name: "1d", RegionID: 1, HostName: "derp1d.test.com"},
+				},
+			},
+		},
+	}
+
+	original := derpMap.View().AsStruct()
+	shuffleDERPMap(derpMap)
+
+	if len(derpMap.Regions) != 1 || len(derpMap.Regions[1].Nodes) != 4 {
+		t.Error("Shuffle corrupted DERP map structure")
+	}
+
+	originalNodes := make(map[string]bool)
+	for _, node := range original.Regions[1].Nodes {
+		originalNodes[node.Name] = true
+	}
+
+	shuffledNodes := make(map[string]bool)
+	for _, node := range derpMap.Regions[1].Nodes {
+		shuffledNodes[node.Name] = true
+	}
+
+	if diff := cmp.Diff(originalNodes, shuffledNodes); diff != "" {
+		t.Errorf("Shuffle changed node set (-original +shuffled):\n%s", diff)
+	}
+}
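
Review note: the table-driven tests above expect a fixed node order per base domain, which suggests the shuffle is seeded from `dns.base_domain`. The following is only a minimal sketch of that idea under that assumption, not the code in this change. `region`, `seedFrom`, and `shuffleRegions` are hypothetical names, the FNV hash is an arbitrary choice, and the resulting orderings will not match the expected values in the tests.

```go
package main

import (
	"hash/fnv"
	"math/rand"
	"sort"
)

// region is a hypothetical stand-in for tailcfg.DERPRegion; only node names are kept.
type region struct {
	ID    int
	Nodes []string
}

// seedFrom derives a stable seed from a string such as the base domain.
func seedFrom(s string) int64 {
	h := fnv.New64a()
	h.Write([]byte(s)) // Write on a hash.Hash never returns an error
	return int64(h.Sum64())
}

// shuffleRegions reorders the nodes inside each region with a generator
// seeded from seed. Regions are visited in ascending ID order so that Go's
// randomized map iteration does not influence the result.
func shuffleRegions(regions map[int]*region, seed string) {
	ids := make([]int, 0, len(regions))
	for id := range regions {
		ids = append(ids, id)
	}
	sort.Ints(ids)

	rnd := rand.New(rand.NewSource(seedFrom(seed)))
	for _, id := range ids {
		r := regions[id]
		if r == nil {
			continue
		}
		nodes := r.Nodes
		rnd.Shuffle(len(nodes), func(i, j int) {
			nodes[i], nodes[j] = nodes[j], nodes[i]
		})
	}
}
```

Calling `shuffleRegions(m, "test3.example.com")` on two equal maps should yield the same node order in both, while a different seed string will usually yield a different one. A nil or empty map is a no-op, which mirrors the edge cases exercised above.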
--- a/hscontrol/derp/server/derp_server.go
+++ b/hscontrol/derp/server/derp_server.go
@@ -20,6 +20,7 @@ import (
 	"github.com/juanfont/headscale/hscontrol/util"
 	"github.com/rs/zerolog/log"
 	"tailscale.com/derp"
+	"tailscale.com/envknob"
 	"tailscale.com/net/stun"
 	"tailscale.com/net/wsconn"
 	"tailscale.com/tailcfg"
@@ -35,6 +36,11 @@ const (
 	DerpVerifyScheme = "headscale-derp-verify"
 )

+// debugUseDERPIP is a debug-only flag that causes the DERP server to resolve
+// hostnames to IP addresses when generating the DERP region configuration.
+// This is useful for integration testing where DNS resolution may be unreliable.
+var debugUseDERPIP = envknob.Bool("HEADSCALE_DEBUG_DERP_USE_IP")
+
 type DERPServer struct {
 	serverURL     string
 	key           key.NodePrivate
@@ -70,7 +76,10 @@ func (d *DERPServer) GenerateRegion() (tailcfg.DERPRegion, error) {
 	}
 	var host string
 	var port int
-	host, portStr, err := net.SplitHostPort(serverURL.Host)
+	var portStr string
+
+	// Extract hostname and port from URL
+	host, portStr, err = net.SplitHostPort(serverURL.Host)
 	if err != nil {
 		if serverURL.Scheme == "https" {
 			host = serverURL.Host
@@ -86,6 +95,19 @@ func (d *DERPServer) GenerateRegion() (tailcfg.DERPRegion, error) {
 		}
 	}

+	// If debug flag is set, resolve hostname to IP address
+	if debugUseDERPIP {
+		ips, err := net.LookupIP(host)
+		if err != nil {
+			log.Error().Caller().Err(err).Msgf("Failed to resolve DERP hostname %s to IP, using hostname", host)
+		} else if len(ips) > 0 {
+			// Use the first IP address
+			ipStr := ips[0].String()
+			log.Info().Caller().Msgf("HEADSCALE_DEBUG_DERP_USE_IP: Resolved %s to %s", host, ipStr)
+			host = ipStr
+		}
+	}
+
 	localDERPregion := tailcfg.DERPRegion{
 		RegionID:   d.cfg.ServerRegionID,
 		RegionCode: d.cfg.ServerRegionCode,
@@ -139,7 +161,7 @@ func (d *DERPServer) DERPHandler(
 			log.Error().
 				Caller().
 				Err(err).
-				Msg("Failed to write response")
+				Msg("Failed to write HTTP response")
 		}

 		return
@@ -177,7 +199,7 @@ func (d *DERPServer) serveWebsocket(writer http.ResponseWriter, req *http.Reques
 			log.Error().
 				Caller().
 				Err(err).
-				Msg("Failed to write response")
+				Msg("Failed to write HTTP response")
 		}

 		return
@@ -207,7 +229,7 @@ func (d *DERPServer) servePlain(writer http.ResponseWriter, req *http.Request) {
 			log.Error().
 				Caller().
 				Err(err).
-				Msg("Failed to write response")
+				Msg("Failed to write HTTP response")
 		}

 		return
@@ -223,7 +245,7 @@ func (d *DERPServer) servePlain(writer http.ResponseWriter, req *http.Request) {
 			log.Error().
 				Caller().
 				Err(err).
-				Msg("Failed to write response")
+				Msg("Failed to write HTTP response")
 		}

 		return
@@ -262,7 +284,7 @@ func DERPProbeHandler(
 			log.Error().
 				Caller().
 				Err(err).
-				Msg("Failed to write response")
+				Msg("Failed to write HTTP response")
 		}
 	}
 }
@@ -276,7 +298,7 @@ func DERPProbeHandler(
 // An example implementation is found here https://derp.tailscale.com/bootstrap-dns
 // Coordination server is included automatically, since local DERP is using the same DNS Name in d.serverURL.
 func DERPBootstrapDNSHandler(
-	derpMap *tailcfg.DERPMap,
+	derpMap tailcfg.DERPMapView,
 ) func(http.ResponseWriter, *http.Request) {
 	return func(
 		writer http.ResponseWriter,
@@ -287,18 +309,18 @@ func DERPBootstrapDNSHandler(
 		resolvCtx, cancel := context.WithTimeout(req.Context(), time.Minute)
 		defer cancel()
 		var resolver net.Resolver
-		for _, region := range derpMap.Regions {
-			for _, node := range region.Nodes { // we don't care if we override some nodes
-				addrs, err := resolver.LookupIP(resolvCtx, "ip", node.HostName)
+		for _, region := range derpMap.Regions().All() {
+			for _, node := range region.Nodes().All() { // we don't care if we override some nodes
+				addrs, err := resolver.LookupIP(resolvCtx, "ip", node.HostName())
 				if err != nil {
 					log.Trace().
 						Caller().
 						Err(err).
-						Msgf("bootstrap DNS lookup failed %q", node.HostName)
+						Msgf("bootstrap DNS lookup failed %q", node.HostName())

 					continue
 				}
-				dnsEntries[node.HostName] = addrs
+				dnsEntries[node.HostName()] = addrs
 			}
 		}
 		writer.Header().Set("Content-Type", "application/json")
@@ -308,7 +330,7 @@ func DERPBootstrapDNSHandler(
 			log.Error().
 				Caller().
 				Err(err).
-				Msg("Failed to write response")
+				Msg("Failed to write HTTP response")
 		}
 	}
 }
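
Review note: the `HEADSCALE_DEBUG_DERP_USE_IP` branch in `GenerateRegion` keeps the hostname when the lookup fails or returns nothing, and otherwise uses the first address. A small sketch of that fallback, with the resolver injected so it can be exercised without DNS, might look like the following. `hostOrIP`, `lookupFunc`, `staticLookup`, and `failingLookup` are hypothetical helpers for illustration and are not part of this change.

```go
package main

import (
	"errors"
	"net"
)

// lookupFunc has the same signature as net.LookupIP so a fake can be injected.
type lookupFunc func(host string) ([]net.IP, error)

// hostOrIP returns the first resolved address as a string when useIP is set
// and the lookup succeeds with at least one result. In every other case the
// hostname is returned unchanged.
func hostOrIP(host string, useIP bool, lookup lookupFunc) string {
	if !useIP {
		return host
	}
	ips, err := lookup(host)
	if err != nil || len(ips) == 0 {
		return host
	}
	return ips[0].String()
}

// staticLookup builds a fake resolver that always returns the given
// addresses. Strings that do not parse as IP addresses are skipped.
func staticLookup(addrs ...string) lookupFunc {
	return func(string) ([]net.IP, error) {
		var ips []net.IP
		for _, a := range addrs {
			if ip := net.ParseIP(a); ip != nil {
				ips = append(ips, ip)
			}
		}
		return ips, nil
	}
}

// failingLookup is a fake resolver that always fails.
func failingLookup(string) ([]net.IP, error) {
	return nil, errors.New("lookup failed")
}
```

In production code `net.LookupIP` can be passed directly as the `lookup` argument. Taking only the first address means the result depends on resolver ordering, which seems acceptable for a debug-only flag but would be worth revisiting if the behavior were ever made general.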
--- a/hscontrol/grpcv1.go
+++ b/hscontrol/grpcv1.go
@@ -1,3 +1,5 @@
+//go:generate buf generate --template ../buf.gen.yaml -o .. ../proto
+
 // nolint
 package hscontrol

@@ -13,7 +15,6 @@ import (
 	"strings"
 	"time"

-	"github.com/puzpuzpuz/xsync/v4"
 	"github.com/rs/zerolog/log"
 	"github.com/samber/lo"
 	"google.golang.org/grpc/codes"
@@ -23,10 +24,12 @@ import (
 	"tailscale.com/net/tsaddr"
 	"tailscale.com/tailcfg"
 	"tailscale.com/types/key"
+	"tailscale.com/types/views"

 	v1 "github.com/juanfont/headscale/gen/go/headscale/v1"
 	"github.com/juanfont/headscale/hscontrol/state"
 	"github.com/juanfont/headscale/hscontrol/types"
+	"github.com/juanfont/headscale/hscontrol/types/change"
 	"github.com/juanfont/headscale/hscontrol/util"
 )

@@ -56,12 +59,15 @@ func (api headscaleV1APIServer) CreateUser(
 		return nil, status.Errorf(codes.Internal, "failed to create user: %s", err)
 	}

-	// Send policy update notifications if needed
-	if policyChanged {
-		ctx := types.NotifyCtx(context.Background(), "grpc-user-created", user.Name)
-		api.h.nodeNotifier.NotifyAll(ctx, types.UpdateFull())
+	c := change.UserAdded(types.UserID(user.ID))
+
+	// TODO(kradalby): Both of these might be policy changes, find a better way to merge.
+	if !policyChanged.Empty() {
+		c.Change = change.Policy
 	}

+	api.h.Change(c)
+
 	return &v1.CreateUserResponse{User: user.Proto()}, nil
 }

@@ -74,16 +80,13 @@ func (api headscaleV1APIServer) RenameUser(
 		return nil, err
 	}

-	_, policyChanged, err := api.h.state.RenameUser(types.UserID(oldUser.ID), request.GetNewName())
+	_, c, err := api.h.state.RenameUser(types.UserID(oldUser.ID), request.GetNewName())
 	if err != nil {
 		return nil, err
 	}

 	// Send policy update notifications if needed
-	if policyChanged {
-		ctx := types.NotifyCtx(context.Background(), "grpc-user-renamed", request.GetNewName())
-		api.h.nodeNotifier.NotifyAll(ctx, types.UpdateFull())
-	}
+	api.h.Change(c)

 	newUser, err := api.h.state.GetUserByName(request.GetNewName())
 	if err != nil {
@@ -107,6 +110,8 @@ func (api headscaleV1APIServer) DeleteUser(
 		return nil, err
 	}

+	api.h.Change(change.UserRemoved(types.UserID(user.ID)))
+
 	return &v1.DeleteUserResponse{}, nil
 }

@@ -232,6 +237,7 @@ func (api headscaleV1APIServer) RegisterNode(
 	request *v1.RegisterNodeRequest,
 ) (*v1.RegisterNodeResponse, error) {
 	log.Trace().
+		Caller().
 		Str("user", request.GetUser()).
 		Str("registration_id", request.GetKey()).
 		Msg("Registering node")
@@ -246,7 +252,7 @@ func (api headscaleV1APIServer) RegisterNode(
 		return nil, fmt.Errorf("looking up user: %w", err)
 	}

-	node, _, err := api.h.state.HandleNodeFromAuthPath(
+	node, nodeChange, err := api.h.state.HandleNodeFromAuthPath(
 		registrationId,
 		types.UserID(user.ID),
 		nil,
@@ -267,22 +273,13 @@ func (api headscaleV1APIServer) RegisterNode(
 	// ensure we send an update.
 	// This works, but might be another good candidate for doing some sort of
 	// eventbus.
-	routesChanged := api.h.state.AutoApproveRoutes(node)
-	_, policyChanged, err := api.h.state.SaveNode(node)
+	_ = api.h.state.AutoApproveRoutes(node)
+	_, _, err = api.h.state.SaveNode(node)
 	if err != nil {
 		return nil, fmt.Errorf("saving auto approved routes to node: %w", err)
 	}

-	// Send policy update notifications if needed (from SaveNode or route changes)
-	if policyChanged {
-		ctx := types.NotifyCtx(context.Background(), "grpc-nodes-change", "all")
-		api.h.nodeNotifier.NotifyAll(ctx, types.UpdateFull())
-	}
-
-	if routesChanged {
-		ctx = types.NotifyCtx(context.Background(), "web-node-login", node.Hostname)
-		api.h.nodeNotifier.NotifyAll(ctx, types.UpdatePeerChanged(node.ID))
-	}
+	api.h.Change(nodeChange)

 	return &v1.RegisterNodeResponse{Node: node.Proto()}, nil
 }
@@ -291,17 +288,13 @@ func (api headscaleV1APIServer) GetNode(
 	ctx context.Context,
 	request *v1.GetNodeRequest,
 ) (*v1.GetNodeResponse, error) {
-	node, err := api.h.state.GetNodeByID(types.NodeID(request.GetNodeId()))
-	if err != nil {
-		return nil, err
+	node, ok := api.h.state.GetNodeByID(types.NodeID(request.GetNodeId()))
+	if !ok {
+		return nil, status.Errorf(codes.NotFound, "node not found")
 	}

 	resp := node.Proto()

-	// Populate the online field based on
-	// currently connected nodes.
-	resp.Online = api.h.nodeNotifier.IsConnected(node.ID)
-
 	return &v1.GetNodeResponse{Node: resp}, nil
 }

@@ -316,24 +309,18 @@ func (api headscaleV1APIServer) SetTags(
 		}
 	}

-	node, policyChanged, err := api.h.state.SetNodeTags(types.NodeID(request.GetNodeId()), request.GetTags())
+	node, nodeChange, err := api.h.state.SetNodeTags(types.NodeID(request.GetNodeId()), request.GetTags())
 	if err != nil {
 		return &v1.SetTagsResponse{
 			Node: nil,
 		}, status.Error(codes.InvalidArgument, err.Error())
 	}

-	// Send policy update notifications if needed
-	if policyChanged {
-		ctx := types.NotifyCtx(context.Background(), "grpc-node-tags", node.Hostname)
-		api.h.nodeNotifier.NotifyAll(ctx, types.UpdateFull())
-	}
-
-	ctx = types.NotifyCtx(ctx, "cli-settags", node.Hostname)
-	api.h.nodeNotifier.NotifyWithIgnore(ctx, types.UpdatePeerChanged(node.ID), node.ID)
+	api.h.Change(nodeChange)

 	log.Trace().
-		Str("node", node.Hostname).
+		Caller().
+		Str("node", node.Hostname()).
 		Strs("tags", request.GetTags()).
 		Msg("Changing tags of node")

@@ -344,7 +331,13 @@ func (api headscaleV1APIServer) SetApprovedRoutes(
 	ctx context.Context,
 	request *v1.SetApprovedRoutesRequest,
 ) (*v1.SetApprovedRoutesResponse, error) {
-	var routes []netip.Prefix
+	log.Debug().
+		Caller().
+		Uint64("node.id", request.GetNodeId()).
+		Strs("requestedRoutes", request.GetRoutes()).
+		Msg("gRPC SetApprovedRoutes called")
+
+	var newApproved []netip.Prefix
 	for _, route := range request.GetRoutes() {
 		prefix, err := netip.ParsePrefix(route)
 		if err != nil {
@@ -354,35 +347,35 @@ func (api headscaleV1APIServer) SetApprovedRoutes(
 		// If the prefix is an exit route, add both. The client expect both
 		// to annotate the node as an exit node.
 		if prefix == tsaddr.AllIPv4() || prefix == tsaddr.AllIPv6() {
-			routes = append(routes, tsaddr.AllIPv4(), tsaddr.AllIPv6())
+			newApproved = append(newApproved, tsaddr.AllIPv4(), tsaddr.AllIPv6())
 		} else {
-			routes = append(routes, prefix)
+			newApproved = append(newApproved, prefix)
 		}
 	}
-	tsaddr.SortPrefixes(routes)
-	routes = slices.Compact(routes)
+	tsaddr.SortPrefixes(newApproved)
+	newApproved = slices.Compact(newApproved)

-	node, policyChanged, err := api.h.state.SetApprovedRoutes(types.NodeID(request.GetNodeId()), routes)
+	node, nodeChange, err := api.h.state.SetApprovedRoutes(types.NodeID(request.GetNodeId()), newApproved)
 	if err != nil {
 		return nil, status.Error(codes.InvalidArgument, err.Error())
 	}

-	// Send policy update notifications if needed
-	if policyChanged {
-		ctx := types.NotifyCtx(context.Background(), "grpc-routes-approved", node.Hostname)
-		api.h.nodeNotifier.NotifyAll(ctx, types.UpdateFull())
-	}
-
-	if api.h.state.SetNodeRoutes(node.ID, node.SubnetRoutes()...) {
-		ctx := types.NotifyCtx(ctx, "poll-primary-change", node.Hostname)
-		api.h.nodeNotifier.NotifyAll(ctx, types.UpdateFull())
-	} else {
-		ctx = types.NotifyCtx(ctx, "cli-approveroutes", node.Hostname)
-		api.h.nodeNotifier.NotifyWithIgnore(ctx, types.UpdatePeerChanged(node.ID), node.ID)
-	}
+	// Always propagate node changes from SetApprovedRoutes
+	api.h.Change(nodeChange)

 	proto := node.Proto()
-	proto.SubnetRoutes = util.PrefixesToString(api.h.state.GetNodePrimaryRoutes(node.ID))
+	// Populate SubnetRoutes with PrimaryRoutes to ensure it includes only the
+	// routes that are actively served from the node (per architectural requirement in types/node.go)
+	primaryRoutes := api.h.state.GetNodePrimaryRoutes(node.ID())
+	proto.SubnetRoutes = util.PrefixesToString(primaryRoutes)
+
+	log.Debug().
+		Caller().
+		Uint64("node.id", node.ID().Uint64()).
+		Strs("approvedRoutes", util.PrefixesToString(node.ApprovedRoutes().AsSlice())).
+		Strs("primaryRoutes", util.PrefixesToString(primaryRoutes)).
+		Strs("finalSubnetRoutes", proto.SubnetRoutes).
+		Msg("gRPC SetApprovedRoutes completed")

 	return &v1.SetApprovedRoutesResponse{Node: proto}, nil
 }
@@ -404,24 +397,17 @@ func (api headscaleV1APIServer) DeleteNode(
 	ctx context.Context,
 	request *v1.DeleteNodeRequest,
 ) (*v1.DeleteNodeResponse, error) {
-	node, err := api.h.state.GetNodeByID(types.NodeID(request.GetNodeId()))
+	node, ok := api.h.state.GetNodeByID(types.NodeID(request.GetNodeId()))
+	if !ok {
+		return nil, status.Errorf(codes.NotFound, "node not found")
+	}
+
+	nodeChange, err := api.h.state.DeleteNode(node)
 	if err != nil {
 		return nil, err
 	}

-	policyChanged, err := api.h.state.DeleteNode(node)
-	if err != nil {
-		return nil, err
-	}
-
-	// Send policy update notifications if needed
-	if policyChanged {
-		ctx := types.NotifyCtx(context.Background(), "grpc-node-deleted", node.Hostname)
-		api.h.nodeNotifier.NotifyAll(ctx, types.UpdateFull())
-	}
-
-	ctx = types.NotifyCtx(ctx, "cli-deletenode", node.Hostname)
-	api.h.nodeNotifier.NotifyAll(ctx, types.UpdatePeerRemoved(node.ID))
+	api.h.Change(nodeChange)

 	return &v1.DeleteNodeResponse{}, nil
 }
@@ -432,29 +418,18 @@ func (api headscaleV1APIServer) ExpireNode(
 ) (*v1.ExpireNodeResponse, error) {
 	now := time.Now()

-	node, policyChanged, err := api.h.state.SetNodeExpiry(types.NodeID(request.GetNodeId()), now)
+	node, nodeChange, err := api.h.state.SetNodeExpiry(types.NodeID(request.GetNodeId()), now)
 	if err != nil {
 		return nil, err
 	}

-	// Send policy update notifications if needed
-	if policyChanged {
-		ctx := types.NotifyCtx(context.Background(), "grpc-node-expired", node.Hostname)
-		api.h.nodeNotifier.NotifyAll(ctx, types.UpdateFull())
-	}
-
-	ctx = types.NotifyCtx(ctx, "cli-expirenode-self", node.Hostname)
-	api.h.nodeNotifier.NotifyByNodeID(
-		ctx,
-		types.UpdateSelf(node.ID),
-		node.ID)
-
-	ctx = types.NotifyCtx(ctx, "cli-expirenode-peers", node.Hostname)
-	api.h.nodeNotifier.NotifyWithIgnore(ctx, types.UpdateExpire(node.ID, now), node.ID)
+	// TODO(kradalby): Ensure that both the selfupdate and peer updates are sent
+	api.h.Change(nodeChange)

 	log.Trace().
-		Str("node", node.Hostname).
-		Time("expiry", *node.Expiry).
+		Caller().
+		Str("node", node.Hostname()).
+		Time("expiry", *node.AsStruct().Expiry).
 		Msg("node expired")

 	return &v1.ExpireNodeResponse{Node: node.Proto()}, nil
@@ -464,25 +439,17 @@ func (api headscaleV1APIServer) RenameNode(
 	ctx context.Context,
 	request *v1.RenameNodeRequest,
 ) (*v1.RenameNodeResponse, error) {
-	node, policyChanged, err := api.h.state.RenameNode(types.NodeID(request.GetNodeId()), request.GetNewName())
+	node, nodeChange, err := api.h.state.RenameNode(types.NodeID(request.GetNodeId()), request.GetNewName())
 	if err != nil {
 		return nil, err
 	}

-	// Send policy update notifications if needed
-	if policyChanged {
-		ctx := types.NotifyCtx(context.Background(), "grpc-node-renamed", node.Hostname)
-		api.h.nodeNotifier.NotifyAll(ctx, types.UpdateFull())
-	}
-
-	ctx = types.NotifyCtx(ctx, "cli-renamenode-self", node.Hostname)
-	api.h.nodeNotifier.NotifyByNodeID(ctx, types.UpdateSelf(node.ID), node.ID)
-
-	ctx = types.NotifyCtx(ctx, "cli-renamenode-peers", node.Hostname)
-	api.h.nodeNotifier.NotifyWithIgnore(ctx, types.UpdatePeerChanged(node.ID), node.ID)
+	// TODO(kradalby): investigate if we need selfupdate
+	api.h.Change(nodeChange)

 	log.Trace().
-		Str("node", node.Hostname).
+		Caller().
+		Str("node", node.Hostname()).
 		Str("new_name", request.GetNewName()).
 		Msg("node renamed")

@@ -493,101 +460,49 @@ func (api headscaleV1APIServer) ListNodes(
 	ctx context.Context,
 	request *v1.ListNodesRequest,
 ) (*v1.ListNodesResponse, error) {
-	var nodes types.Nodes
-	var err error
+	// TODO(kradalby): it looks like this can be simplified a lot,
+	// the filtering of nodes by user, vs nodes as a whole can
+	// probably be done once.
+	// TODO(kradalby): This should be done in one tx.
+	if request.GetUser() != "" {
+		user, err := api.h.state.GetUserByName(request.GetUser())
+		if err != nil {
+			return nil, err
+		}

-	isLikelyConnected := api.h.nodeNotifier.LikelyConnectedMap()
+		nodes := api.h.state.ListNodesByUser(types.UserID(user.ID))

-	// Start with all nodes and apply filters
-	nodes, err = api.h.state.ListNodes()
-	if err != nil {
-		return nil, err
+		response := nodesToProto(api.h.state, nodes)
+		return &v1.ListNodesResponse{Nodes: response}, nil
 	}

-	// Apply filters based on request
-	nodes = api.filterNodes(nodes, request)
+	nodes := api.h.state.ListNodes()

-	sort.Slice(nodes, func(i, j int) bool {
-		return nodes[i].ID < nodes[j].ID
-	})
-
-	response := nodesToProto(api.h.state, isLikelyConnected, nodes)
+	response := nodesToProto(api.h.state, nodes)
 	return &v1.ListNodesResponse{Nodes: response}, nil
 }

-// filterNodes applies the filters from ListNodesRequest to the node list
-func (api headscaleV1APIServer) filterNodes(nodes types.Nodes, request *v1.ListNodesRequest) types.Nodes {
-	var filtered types.Nodes
-
-	for _, node := range nodes {
-		// Filter by user
-		if request.GetUser() != "" && node.User.Name != request.GetUser() {
-			continue
-		}
-
-		// Filter by ID (backward compatibility)
-		if request.GetId() != 0 && uint64(node.ID) != request.GetId() {
-			continue
-		}
-
-		// Filter by name (exact match)
-		if request.GetName() != "" && node.Hostname != request.GetName() {
-			continue
-		}
-
-		// Filter by hostname (alias for name)
-		if request.GetHostname() != "" && node.Hostname != request.GetHostname() {
-			continue
-		}
-
-		// Filter by IP addresses
-		if len(request.GetIpAddresses()) > 0 {
-			hasMatchingIP := false
-			for _, requestIP := range request.GetIpAddresses() {
-				for _, nodeIP := range node.IPs() {
-					if nodeIP.String() == requestIP {
-						hasMatchingIP = true
-						break
-					}
-				}
-				if hasMatchingIP {
-					break
-				}
-			}
-			if !hasMatchingIP {
-				continue
-			}
-		}
-
-		// If we get here, node matches all filters
-		filtered = append(filtered, node)
-	}
-
-	return filtered
-}
-
-func nodesToProto(state *state.State, isLikelyConnected *xsync.MapOf[types.NodeID, bool], nodes types.Nodes) []*v1.Node {
-	response := make([]*v1.Node, len(nodes))
-	for index, node := range nodes {
+func nodesToProto(state *state.State, nodes views.Slice[types.NodeView]) []*v1.Node {
+	response := make([]*v1.Node, nodes.Len())
+	for index, node := range nodes.All() {
 		resp := node.Proto()

-		// Populate the online field based on
-		// currently connected nodes.
-		if val, ok := isLikelyConnected.Load(node.ID); ok && val {
-			resp.Online = true
-		}
-
 		var tags []string
 		for _, tag := range node.RequestTags() {
-			if state.NodeCanHaveTag(node.View(), tag) {
+			if state.NodeCanHaveTag(node, tag) {
 				tags = append(tags, tag)
 			}
 		}
-		resp.ValidTags = lo.Uniq(append(tags, node.ForcedTags...))
-		resp.SubnetRoutes = util.PrefixesToString(append(state.GetNodePrimaryRoutes(node.ID), node.ExitRoutes()...))
+		resp.ValidTags = lo.Uniq(append(tags, node.ForcedTags().AsSlice()...))
+
+		resp.SubnetRoutes = util.PrefixesToString(append(state.GetNodePrimaryRoutes(node.ID()), node.ExitRoutes()...))
 		response[index] = resp
 	}

+	sort.Slice(response, func(i, j int) bool {
+		return response[i].Id < response[j].Id
+	})
+
 	return response
 }

@@ -595,24 +510,14 @@ func (api headscaleV1APIServer) MoveNode(
 	ctx context.Context,
 	request *v1.MoveNodeRequest,
 ) (*v1.MoveNodeResponse, error) {
-	node, policyChanged, err := api.h.state.AssignNodeToUser(types.NodeID(request.GetNodeId()), types.UserID(request.GetUser()))
+	node, nodeChange, err := api.h.state.AssignNodeToUser(types.NodeID(request.GetNodeId()), types.UserID(request.GetUser()))
 	if err != nil {
 		return nil, err
 	}

-	// Send policy update notifications if needed
-	if policyChanged {
-		ctx := types.NotifyCtx(context.Background(), "grpc-node-moved", node.Hostname)
-		api.h.nodeNotifier.NotifyAll(ctx, types.UpdateFull())
-	}
-
-	ctx = types.NotifyCtx(ctx, "cli-movenode-self", node.Hostname)
-	api.h.nodeNotifier.NotifyByNodeID(
-		ctx,
-		types.UpdateSelf(node.ID),
-		node.ID)
-	ctx = types.NotifyCtx(ctx, "cli-movenode", node.Hostname)
-	api.h.nodeNotifier.NotifyWithIgnore(ctx, types.UpdatePeerChanged(node.ID), node.ID)
+	// TODO(kradalby): Ensure the policy is also sent
+	// TODO(kradalby): ensure that both the selfupdate and peer updates are sent
+	api.h.Change(nodeChange)

 	return &v1.MoveNodeResponse{Node: node.Proto()}, nil
 }
@@ -621,7 +526,7 @@ func (api headscaleV1APIServer) BackfillNodeIPs(
 	ctx context.Context,
 	request *v1.BackfillNodeIPsRequest,
 ) (*v1.BackfillNodeIPsResponse, error) {
-	log.Trace().Msg("Backfill called")
+	log.Trace().Caller().Msg("Backfill called")

 	if !request.Confirmed {
 		return nil, errors.New("not confirmed, aborting")
@@ -765,17 +670,15 @@ func (api headscaleV1APIServer) SetPolicy(
 	// a scenario where they might be allowed if the server has no nodes
 	// yet, but it should help for the general case and for hot reloading
 	// configurations.
-	nodes, err := api.h.state.ListNodes()
-	if err != nil {
-		return nil, fmt.Errorf("loading nodes from database to validate policy: %w", err)
-	}
-	changed, err := api.h.state.SetPolicy([]byte(p))
+	nodes := api.h.state.ListNodes()
+
+	_, err := api.h.state.SetPolicy([]byte(p))
 	if err != nil {
 		return nil, fmt.Errorf("setting policy: %w", err)
 	}

-	if len(nodes) > 0 {
-		_, err = api.h.state.SSHPolicy(nodes[0].View())
+	if nodes.Len() > 0 {
+		_, err = api.h.state.SSHPolicy(nodes.At(0))
 		if err != nil {
 			return nil, fmt.Errorf("verifying SSH rules: %w", err)
 		}
@@ -786,15 +689,20 @@ func (api headscaleV1APIServer) SetPolicy(
 		return nil, err
 	}

-	// Only send update if the packet filter has changed.
-	if changed {
-		err = api.h.state.AutoApproveNodes()
-		if err != nil {
-			return nil, err
-		}
+	// Always reload policy to ensure route re-evaluation, even if policy content hasn't changed.
+	// This ensures that routes are re-evaluated for auto-approval in cases where routes
+	// were manually disabled but could now be auto-approved with the current policy.
+	cs, err := api.h.state.ReloadPolicy()
+	if err != nil {
+		return nil, fmt.Errorf("reloading policy: %w", err)
+	}

-		ctx := types.NotifyCtx(context.Background(), "acl-update", "na")
-		api.h.nodeNotifier.NotifyAll(ctx, types.UpdateFull())
+	if len(cs) > 0 {
+		api.h.Change(cs...)
+	} else {
+		log.Debug().
+			Caller().
+			Msg("No policy changes to distribute because ReloadPolicy returned empty changeset")
 	}

 	response := &v1.SetPolicyResponse{
@@ -802,6 +710,10 @@ func (api headscaleV1APIServer) SetPolicy(
 		UpdatedAt: timestamppb.New(updated.UpdatedAt),
 	}

+	log.Debug().
+		Caller().
+		Msg("gRPC SetPolicy completed successfully because response prepared")
+
 	return response, nil
 }

@@ -824,7 +736,7 @@ func (api headscaleV1APIServer) DebugCreateNode(
 		Caller().
 		Interface("route-prefix", routes).
 		Interface("route-str", request.GetRoutes()).
-		Msg("")
+		Msg("Creating routes for node")

 	hostinfo := tailcfg.Hostinfo{
 		RoutableIPs: routes,
@@ -853,6 +765,7 @@ func (api headscaleV1APIServer) DebugCreateNode(
 	}

 	log.Debug().
+		Caller().
 		Str("registration_id", registrationId.String()).
 		Msg("adding debug machine via CLI, appending to registration cache")

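
Review note: `SetApprovedRoutes` in the grpcv1.go diff above normalizes the requested routes in three steps: an exit route in either address family is expanded to both, the list is sorted, and adjacent duplicates are compacted. The sketch below reproduces those steps with the standard library only. `normalizeRoutes` and `comparePrefix` are hypothetical names, and `comparePrefix` is a stand-in for `tsaddr.SortPrefixes`, whose exact ordering may differ.

```go
package main

import (
	"cmp"
	"net/netip"
	"slices"
)

var (
	allIPv4 = netip.MustParsePrefix("0.0.0.0/0")
	allIPv6 = netip.MustParsePrefix("::/0")
)

// comparePrefix orders prefixes by address first, then by prefix length.
func comparePrefix(a, b netip.Prefix) int {
	if c := a.Addr().Compare(b.Addr()); c != 0 {
		return c
	}
	return cmp.Compare(a.Bits(), b.Bits())
}

// normalizeRoutes parses the route strings, expands an exit route in either
// family to both families, sorts the result, and drops duplicates.
func normalizeRoutes(raw []string) ([]string, error) {
	var prefixes []netip.Prefix
	for _, r := range raw {
		p, err := netip.ParsePrefix(r)
		if err != nil {
			return nil, err
		}
		if p == allIPv4 || p == allIPv6 {
			prefixes = append(prefixes, allIPv4, allIPv6)
		} else {
			prefixes = append(prefixes, p)
		}
	}

	// Compact only removes adjacent duplicates, so the sort must come first.
	slices.SortFunc(prefixes, comparePrefix)
	prefixes = slices.Compact(prefixes)

	out := make([]string, len(prefixes))
	for i, p := range prefixes {
		out[i] = p.String()
	}
	return out, nil
}
```

Sorting before `slices.Compact` matters here: requesting both `0.0.0.0/0` and `::/0` appends each exit route twice, and the duplicates are only removed once they sit next to each other.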
--- a/hscontrol/handlers.go
+++ b/hscontrol/handlers.go
@@ -91,16 +91,22 @@ func (h *Headscale) handleVerifyRequest(

 	var derpAdmitClientRequest tailcfg.DERPAdmitClientRequest
 	if err := json.Unmarshal(body, &derpAdmitClientRequest); err != nil {
-		return fmt.Errorf("cannot parse derpAdmitClientRequest: %w", err)
+		return NewHTTPError(http.StatusBadRequest, "Bad Request: invalid JSON", fmt.Errorf("cannot parse derpAdmitClientRequest: %w", err))
 	}

-	nodes, err := h.state.ListNodes()
-	if err != nil {
-		return fmt.Errorf("cannot list nodes: %w", err)
+	nodes := h.state.ListNodes()
+
+	// Check if any node has the requested NodeKey
+	var nodeKeyFound bool
+	for _, node := range nodes.All() {
+		if node.NodeKey() == derpAdmitClientRequest.NodePublic {
+			nodeKeyFound = true
+			break
+		}
 	}

 	resp := &tailcfg.DERPAdmitClientResponse{
-		Allow: nodes.ContainsNodeKey(derpAdmitClientRequest.NodePublic),
+		Allow: nodeKeyFound,
 	}

 	return json.NewEncoder(writer).Encode(resp)
@@ -180,6 +186,21 @@ func (h *Headscale) HealthHandler(
 	respond(nil)
 }

+func (h *Headscale) RobotsHandler(
+	writer http.ResponseWriter,
+	req *http.Request,
+) {
+	writer.Header().Set("Content-Type", "text/plain")
+	writer.WriteHeader(http.StatusOK)
+	_, err := writer.Write([]byte("User-agent: *\nDisallow: /"))
+	if err != nil {
+		log.Error().
+			Caller().
+			Err(err).
+			Msg("Failed to write HTTP response")
+	}
+}
+
 var codeStyleRegisterWebAPI = styles.Props{
 	styles.Display:         "block",
 	styles.Padding:         "20px",
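
Review note: the new `RobotsHandler` in the handlers.go diff above is easy to cover with an in-memory recorder. The sketch below uses a free function with the same body rather than the `Headscale` method, and the `/robots.txt` path is an assumption, since the route registration is not shown in this part of the diff. `robotsHandler` and `probeRobots` are hypothetical names.

```go
package main

import (
	"net/http"
	"net/http/httptest"
)

// robotsHandler mirrors the body of RobotsHandler: a plain-text response
// that asks all crawlers to stay away from every path.
func robotsHandler(w http.ResponseWriter, _ *http.Request) {
	w.Header().Set("Content-Type", "text/plain")
	w.WriteHeader(http.StatusOK)
	_, _ = w.Write([]byte("User-agent: *\nDisallow: /"))
}

// probeRobots runs the handler against an in-memory recorder and returns
// the status code, the Content-Type header, and the response body.
func probeRobots() (int, string, string) {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/robots.txt", nil) // path is assumed
	robotsHandler(rec, req)

	return rec.Code, rec.Header().Get("Content-Type"), rec.Body.String()
}
```

A test along these lines would pin the header and body so that a later change to the handler cannot silently start allowing crawlers.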
--- a/hscontrol/mapper/batcher.go
+++ b/hscontrol/mapper/batcher.go
@@ -0,0 +1,178 @@
+package mapper
+
+import (
+	"errors"
+	"fmt"
+	"time"
+
+	"github.com/juanfont/headscale/hscontrol/state"
+	"github.com/juanfont/headscale/hscontrol/types"
+	"github.com/juanfont/headscale/hscontrol/types/change"
+	"github.com/puzpuzpuz/xsync/v4"
+	"github.com/rs/zerolog/log"
+	"tailscale.com/tailcfg"
+	"tailscale.com/types/ptr"
+)
+
+type batcherFunc func(cfg *types.Config, state *state.State) Batcher
+
+// Batcher defines the common interface for all batcher implementations.
+type Batcher interface {
+	Start()
+	Close()
+	AddNode(id types.NodeID, c chan<- *tailcfg.MapResponse, version tailcfg.CapabilityVersion) error
+	RemoveNode(id types.NodeID, c chan<- *tailcfg.MapResponse) bool
+	IsConnected(id types.NodeID) bool
+	ConnectedMap() *xsync.Map[types.NodeID, bool]
+	AddWork(c ...change.ChangeSet)
+	MapResponseFromChange(id types.NodeID, c change.ChangeSet) (*tailcfg.MapResponse, error)
+	DebugMapResponses() (map[types.NodeID][]tailcfg.MapResponse, error)
+}
+
+func NewBatcher(batchTime time.Duration, workers int, mapper *mapper) *LockFreeBatcher {
+	return &LockFreeBatcher{
+		mapper:  mapper,
+		workers: workers,
+		tick:    time.NewTicker(batchTime),
+
+		// The size of this channel is arbitrarily chosen; the sizing should be revisited.
+		workCh:         make(chan work, workers*200),
+		nodes:          xsync.NewMap[types.NodeID, *multiChannelNodeConn](),
+		connected:      xsync.NewMap[types.NodeID, *time.Time](),
+		pendingChanges: xsync.NewMap[types.NodeID, []change.ChangeSet](),
+	}
+}
+
+// NewBatcherAndMapper creates a Batcher implementation.
+func NewBatcherAndMapper(cfg *types.Config, state *state.State) Batcher {
+	m := newMapper(cfg, state)
+	b := NewBatcher(cfg.Tuning.BatchChangeDelay, cfg.Tuning.BatcherWorkers, m)
+	m.batcher = b
+
+	return b
+}
+
+// nodeConnection interface for different connection implementations.
+type nodeConnection interface {
+	nodeID() types.NodeID
+	version() tailcfg.CapabilityVersion
+	send(data *tailcfg.MapResponse) error
+}
+
+// generateMapResponse generates a [tailcfg.MapResponse] for the given NodeID that is based on the provided [change.ChangeSet].
+func generateMapResponse(nodeID types.NodeID, version tailcfg.CapabilityVersion, mapper *mapper, c change.ChangeSet) (*tailcfg.MapResponse, error) {
+	if c.Empty() {
+		return nil, nil
+	}
+
+	// Validate inputs before processing
+	if nodeID == 0 {
+		return nil, fmt.Errorf("invalid nodeID: %d", nodeID)
+	}
+
+	if mapper == nil {
+		return nil, fmt.Errorf("mapper is nil for nodeID %d", nodeID)
+	}
+
+	var (
+		mapResp *tailcfg.MapResponse
+		err     error
+	)
+
+	switch c.Change {
+	case change.DERP:
+		mapResp, err = mapper.derpMapResponse(nodeID)
+
+	case change.NodeCameOnline, change.NodeWentOffline:
+		if c.IsSubnetRouter {
+			// TODO(kradalby): This can potentially be a peer update of the old and new subnet router.
+			mapResp, err = mapper.fullMapResponse(nodeID, version)
+		} else {
+			// CRITICAL FIX: Read actual online status from NodeStore when available,
+			// fall back to deriving from change type for unit tests or when NodeStore is empty
+			var onlineStatus bool
+			if node, found := mapper.state.GetNodeByID(c.NodeID); found && node.IsOnline().Valid() {
+				// Use actual NodeStore status when available (production case)
+				onlineStatus = node.IsOnline().Get()
+			} else {
+				// Fall back to deriving from change type (unit test case or initial setup)
+				onlineStatus = c.Change == change.NodeCameOnline
+			}
+
+			mapResp, err = mapper.peerChangedPatchResponse(nodeID, []*tailcfg.PeerChange{
+				{
+					NodeID: c.NodeID.NodeID(),
+					Online: ptr.To(onlineStatus),
+				},
+			})
+		}
+
+	case change.NodeNewOrUpdate:
+		mapResp, err = mapper.fullMapResponse(nodeID, version)
+
+	case change.NodeRemove:
+		mapResp, err = mapper.peerRemovedResponse(nodeID, c.NodeID)
+
+	default:
+		// The following will always hit this:
+		// change.Full, change.Policy
+		mapResp, err = mapper.fullMapResponse(nodeID, version)
+	}
+
+	if err != nil {
+		return nil, fmt.Errorf("generating map response for nodeID %d: %w", nodeID, err)
+	}
+
+	// TODO(kradalby): Is this necessary?
+	// Validate the generated map response - only check for nil response
+	// Note: mapResp.Node can be nil for peer updates, which is valid
+	if mapResp == nil && c.Change != change.DERP && c.Change != change.NodeRemove {
+		return nil, fmt.Errorf("generated nil map response for nodeID %d change %s", nodeID, c.Change.String())
+	}
+
+	return mapResp, nil
+}
+
+// handleNodeChange generates and sends a [tailcfg.MapResponse] for a given node and [change.ChangeSet].
+func handleNodeChange(nc nodeConnection, mapper *mapper, c change.ChangeSet) error {
+	if nc == nil {
+		return errors.New("nodeConnection is nil")
+	}
+
+	nodeID := nc.nodeID()
+
+	log.Debug().Caller().Uint64("node.id", nodeID.Uint64()).Str("change.type", c.Change.String()).Msg("Node change processing started because change notification received")
+
+	var data *tailcfg.MapResponse
+	var err error
+	data, err = generateMapResponse(nodeID, nc.version(), mapper, c)
+	if err != nil {
+		return fmt.Errorf("generating map response for node %d: %w", nodeID, err)
+	}
+
+	if data == nil {
+		// No data to send is valid for some change types
+		return nil
+	}
+
+	// Send the map response
+	err = nc.send(data)
+	if err != nil {
+		return fmt.Errorf("sending map response to node %d: %w", nodeID, err)
+	}
+
+	return nil
+}
+
+// workResult represents the result of processing a change.
+type workResult struct {
+	mapResponse *tailcfg.MapResponse
+	err         error
+}
+
+// work represents a unit of work to be processed by workers.
+type work struct {
+	c        change.ChangeSet
+	nodeID   types.NodeID
+	resultCh chan<- workResult // optional channel for synchronous operations
+}
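The `work` and `workResult` types above let a single worker pool serve both fire-and-forget updates and callers that block on a result. The following is a minimal, self-contained sketch of that pattern, not the headscale implementation: `job`, `jobResult`, `startWorkers`, `submitSync`, and `runDemo` are hypothetical names chosen for illustration, and the "work" is just doubling an integer.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// jobResult plays the role of workResult: a value or an error.
type jobResult struct {
	value int
	err   error
}

// job plays the role of work: resultCh is nil for fire-and-forget jobs.
type job struct {
	input    int
	resultCh chan<- jobResult
}

// startWorkers launches n goroutines that drain jobs until ctx is cancelled
// or the jobs channel is closed.
func startWorkers(ctx context.Context, n int, jobs <-chan job) {
	for i := 0; i < n; i++ {
		go func() {
			for {
				select {
				case j, ok := <-jobs:
					if !ok {
						return
					}
					res := jobResult{value: j.input * 2}
					if j.resultCh == nil {
						continue // asynchronous job: nobody is waiting
					}
					select {
					case j.resultCh <- res:
					case <-ctx.Done():
						return
					}
				case <-ctx.Done():
					return
				}
			}
		}()
	}
}

// submitSync queues a job and blocks until a worker answers or ctx ends.
func submitSync(ctx context.Context, jobs chan<- job, input int) (int, error) {
	resultCh := make(chan jobResult, 1) // buffered so the worker does not block on us
	select {
	case jobs <- job{input: input, resultCh: resultCh}:
	case <-ctx.Done():
		return 0, errors.New("shutting down before the job was queued")
	}
	select {
	case res := <-resultCh:
		return res.value, res.err
	case <-ctx.Done():
		return 0, errors.New("shutting down while waiting for the result")
	}
}

// runDemo starts a small pool and performs one synchronous request.
func runDemo() (int, error) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	jobs := make(chan job, 8)
	startWorkers(ctx, 2, jobs)

	return submitSync(ctx, jobs, 21)
}

func main() {
	fmt.Println(runDemo())
}
```

The buffered result channel is the design choice worth noting: it allows the worker to hand over its result and move on even if the caller has already given up.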
--- a/hscontrol/mapper/batcher_lockfree.go
+++ b/hscontrol/mapper/batcher_lockfree.go
@@ -0,0 +1,718 @@
+package mapper
+
+import (
+	"context"
+	"crypto/rand"
+	"fmt"
+	"sync"
+	"sync/atomic"
+	"time"
+
+	"github.com/juanfont/headscale/hscontrol/types"
+	"github.com/juanfont/headscale/hscontrol/types/change"
+	"github.com/puzpuzpuz/xsync/v4"
+	"github.com/rs/zerolog/log"
+	"tailscale.com/tailcfg"
+	"tailscale.com/types/ptr"
+)
+
+// LockFreeBatcher uses atomic operations and concurrent maps to eliminate mutex contention.
+type LockFreeBatcher struct {
+	tick    *time.Ticker
+	mapper  *mapper
+	workers int
+
+	nodes     *xsync.Map[types.NodeID, *multiChannelNodeConn]
+	connected *xsync.Map[types.NodeID, *time.Time]
+
+	// Work queue channel
+	workCh chan work
+	ctx    context.Context
+	cancel context.CancelFunc
+
+	// Batching state
+	pendingChanges *xsync.Map[types.NodeID, []change.ChangeSet]
+
+	// Metrics
+	totalNodes      atomic.Int64
+	totalUpdates    atomic.Int64
+	workQueuedCount atomic.Int64
+	workProcessed   atomic.Int64
+	workErrors      atomic.Int64
+}
+
+// AddNode registers a new node connection with the batcher and sends an initial map response.
+// It creates or updates the node's connection data, validates the initial map generation,
+// and notifies other nodes that this node has come online.
+func (b *LockFreeBatcher) AddNode(id types.NodeID, c chan<- *tailcfg.MapResponse, version tailcfg.CapabilityVersion) error {
+	addNodeStart := time.Now()
+
+	// Generate connection ID
+	connID := generateConnectionID()
+
+	// Create new connection entry
+	now := time.Now()
+	newEntry := &connectionEntry{
+		id:      connID,
+		c:       c,
+		version: version,
+		created: now,
+	}
+	// Initialize last used timestamp
+	newEntry.lastUsed.Store(now.Unix())
+
+	// Get or create multiChannelNodeConn - this reuses existing offline nodes for rapid reconnection
+	nodeConn, loaded := b.nodes.LoadOrStore(id, newMultiChannelNodeConn(id, b.mapper))
+
+	if !loaded {
+		b.totalNodes.Add(1)
+	}
+
+	// Add connection to the list (guarded by the node connection's mutex)
+	nodeConn.addConnection(newEntry)
+
+	// Use the worker pool for controlled concurrency instead of direct generation
+	initialMap, err := b.MapResponseFromChange(id, change.FullSelf(id))
+
+	if err != nil {
+		log.Error().Uint64("node.id", id.Uint64()).Err(err).Msg("Initial map generation failed")
+		nodeConn.removeConnectionByChannel(c)
+		return fmt.Errorf("failed to generate initial map for node %d: %w", id, err)
+	}
+
+	// Use a blocking send with timeout for initial map since the channel should be ready
+	// and we want to avoid the race condition where the receiver isn't ready yet
+	select {
+	case c <- initialMap:
+		// Success
+	case <-time.After(5 * time.Second):
+		log.Error().Uint64("node.id", id.Uint64()).Err(fmt.Errorf("timeout")).Msg("Initial map send timeout")
+		log.Debug().Caller().Uint64("node.id", id.Uint64()).Dur("timeout.duration", 5*time.Second).
+			Msg("Initial map send timed out because channel was blocked or receiver not ready")
+		nodeConn.removeConnectionByChannel(c)
+		return fmt.Errorf("failed to send initial map to node %d: timeout", id)
+	}
+
+	// Update connection status
+	b.connected.Store(id, nil) // nil = connected
+
+	// Node will automatically receive updates through the normal flow
+	// The initial full map already contains all current state
+
+	log.Debug().Caller().Uint64("node.id", id.Uint64()).Dur("total.duration", time.Since(addNodeStart)).
+		Int("active.connections", nodeConn.getActiveConnectionCount()).
+		Msg("Node connection established in batcher because AddNode completed successfully")
+
+	return nil
+}
+
+// RemoveNode disconnects a node from the batcher, marking it as offline and cleaning up its state.
+// It validates the connection channel matches one of the current connections, closes that specific connection,
+// and keeps the node entry alive for rapid reconnections instead of aggressive deletion.
+// Reports if the node still has active connections after removal.
+func (b *LockFreeBatcher) RemoveNode(id types.NodeID, c chan<- *tailcfg.MapResponse) bool {
+	nodeConn, exists := b.nodes.Load(id)
+	if !exists {
+		log.Debug().Caller().Uint64("node.id", id.Uint64()).Msg("RemoveNode called for non-existent node because node not found in batcher")
+		return false
+	}
+
+	// Remove specific connection
+	removed := nodeConn.removeConnectionByChannel(c)
+	if !removed {
+		log.Debug().Caller().Uint64("node.id", id.Uint64()).Msg("RemoveNode: channel not found because connection already removed or invalid")
+		return false
+	}
+
+	// Check if node has any remaining active connections
+	if nodeConn.hasActiveConnections() {
+		log.Debug().Caller().Uint64("node.id", id.Uint64()).
+			Int("active.connections", nodeConn.getActiveConnectionCount()).
+			Msg("Node connection removed but keeping online because other connections remain")
+		return true // Node still has active connections
+	}
+
+	// No active connections - keep the node entry alive for rapid reconnections
+	// The node will get a fresh full map when it reconnects
+	log.Debug().Caller().Uint64("node.id", id.Uint64()).Msg("Node disconnected from batcher because all connections removed, keeping entry for rapid reconnection")
+	b.connected.Store(id, ptr.To(time.Now()))
+
+	return false
+}
+
+// AddWork queues a change to be processed by the batcher.
+func (b *LockFreeBatcher) AddWork(c ...change.ChangeSet) {
+	b.addWork(c...)
+}
+
+func (b *LockFreeBatcher) Start() {
+	b.ctx, b.cancel = context.WithCancel(context.Background())
+	go b.doWork()
+}
+
+func (b *LockFreeBatcher) Close() {
+	if b.cancel != nil {
+		b.cancel()
+		b.cancel = nil
+	}
+
+	// Close workCh at most once (best-effort check, see the case comment).
+	select {
+	case <-b.workCh:
+		// Already closed, or a queued item was received and workCh stays open.
+	default:
+		close(b.workCh)
+	}
+
+	// Close the underlying channels supplying the data to the clients.
+	b.nodes.Range(func(nodeID types.NodeID, conn *multiChannelNodeConn) bool {
+		conn.close()
+		return true
+	})
+}
+
+func (b *LockFreeBatcher) doWork() {
+	for i := range b.workers {
+		go b.worker(i + 1)
+	}
+
+	// Create a cleanup ticker for removing truly disconnected nodes
+	cleanupTicker := time.NewTicker(5 * time.Minute)
+	defer cleanupTicker.Stop()
+
+	for {
+		select {
+		case <-b.tick.C:
+			// Process batched changes
+			b.processBatchedChanges()
+		case <-cleanupTicker.C:
+			// Clean up nodes that have been offline for too long
+			b.cleanupOfflineNodes()
+		case <-b.ctx.Done():
+			log.Info().Msg("batcher context done, no longer feeding workers")
+			return
+		}
+	}
+}
+
+func (b *LockFreeBatcher) worker(workerID int) {
+	for {
+		select {
+		case w, ok := <-b.workCh:
+			if !ok {
+				log.Debug().Int("worker.id", workerID).Msgf("worker channel closing, shutting down worker %d", workerID)
+				return
+			}
+
+			b.workProcessed.Add(1)
+
+			// If the resultCh is set, it means that this is a work request
+			// where there is a blocking function waiting for the map that
+			// is being generated.
+			// This is used for synchronous map generation.
+			if w.resultCh != nil {
+				var result workResult
+				if nc, exists := b.nodes.Load(w.nodeID); exists {
+					var err error
+					result.mapResponse, err = generateMapResponse(nc.nodeID(), nc.version(), b.mapper, w.c)
+					result.err = err
+					if result.err != nil {
+						b.workErrors.Add(1)
+						log.Error().Err(result.err).
+							Int("worker.id", workerID).
+							Uint64("node.id", w.nodeID.Uint64()).
+							Str("change", w.c.Change.String()).
+							Msg("failed to generate map response for synchronous work")
+					}
+				} else {
+					result.err = fmt.Errorf("node %d not found", w.nodeID)
+
+					b.workErrors.Add(1)
+					log.Error().Err(result.err).
+						Int("worker.id", workerID).
+						Uint64("node.id", w.nodeID.Uint64()).
+						Msg("node not found for synchronous work")
+				}
+
+				// Send result
+				select {
+				case w.resultCh <- result:
+				case <-b.ctx.Done():
+					return
+				}
+
+				continue
+			}
+
+			// If resultCh is nil, this is an asynchronous work request
+			// that should be processed and sent to the node instead of
+			// returned to the caller.
+			if nc, exists := b.nodes.Load(w.nodeID); exists {
+				// Apply change to node - this will handle offline nodes gracefully
+				// and queue work for when they reconnect
+				err := nc.change(w.c)
+				if err != nil {
+					b.workErrors.Add(1)
+					log.Error().Err(err).
+						Int("worker.id", workerID).
+						Uint64("node.id", w.c.NodeID.Uint64()).
+						Str("change", w.c.Change.String()).
+						Msg("failed to apply change")
+				}
+			}
+		case <-b.ctx.Done():
+			log.Debug().Int("worker.id", workerID).Msg("batcher context is done, exiting worker")
+			return
+		}
+	}
+}
+
+func (b *LockFreeBatcher) addWork(c ...change.ChangeSet) {
+	b.addToBatch(c...)
+}
+
+// queueWork safely queues work.
+func (b *LockFreeBatcher) queueWork(w work) {
+	b.workQueuedCount.Add(1)
+
+	select {
+	case b.workCh <- w:
+		// Successfully queued
+	case <-b.ctx.Done():
+		// Batcher is shutting down
+		return
+	}
+}
+
+// addToBatch adds a change to the pending batch.
+func (b *LockFreeBatcher) addToBatch(c ...change.ChangeSet) {
+	// Short circuit if any of the changes is a full update, which
+	// means we can skip sending individual changes.
+	if change.HasFull(c) {
+		b.nodes.Range(func(nodeID types.NodeID, _ *multiChannelNodeConn) bool {
+			b.pendingChanges.Store(nodeID, []change.ChangeSet{{Change: change.Full}})
+
+			return true
+		})
+		return
+	}
+
+	all, self := change.SplitAllAndSelf(c)
+
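+	// Queue every self-targeted change for its own node. The loop must not
+	// return early, or remaining self and broadcast changes would be dropped.
+	for _, changeSet := range self {
+		changes, _ := b.pendingChanges.LoadOrStore(changeSet.NodeID, []change.ChangeSet{})
+		changes = append(changes, changeSet)
+		b.pendingChanges.Store(changeSet.NodeID, changes)
+	}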
+
+	b.nodes.Range(func(nodeID types.NodeID, _ *multiChannelNodeConn) bool {
+		rel := change.RemoveUpdatesForSelf(nodeID, all)
+
+		changes, _ := b.pendingChanges.LoadOrStore(nodeID, []change.ChangeSet{})
+		changes = append(changes, rel...)
+		b.pendingChanges.Store(nodeID, changes)
+
+		return true
+	})
+}
+
+// processBatchedChanges processes all pending batched changes.
+func (b *LockFreeBatcher) processBatchedChanges() {
+	if b.pendingChanges == nil {
+		return
+	}
+
+	// Process all pending changes
+	b.pendingChanges.Range(func(nodeID types.NodeID, changes []change.ChangeSet) bool {
+		if len(changes) == 0 {
+			return true
+		}
+
+		// Send all batched changes for this node
+		for _, c := range changes {
+			b.queueWork(work{c: c, nodeID: nodeID, resultCh: nil})
+		}
+
+		// Clear the pending changes for this node
+		b.pendingChanges.Delete(nodeID)
+
+		return true
+	})
+}
+
+// cleanupOfflineNodes removes nodes that have been offline for too long to prevent memory leaks.
+// TODO(kradalby): reevaluate if we want to keep this.
+func (b *LockFreeBatcher) cleanupOfflineNodes() {
+	cleanupThreshold := 15 * time.Minute
+	now := time.Now()
+
+	var nodesToCleanup []types.NodeID
+
+	// Find nodes that have been offline for too long
+	b.connected.Range(func(nodeID types.NodeID, disconnectTime *time.Time) bool {
+		if disconnectTime != nil && now.Sub(*disconnectTime) > cleanupThreshold {
+			// Double-check the node doesn't have active connections
+			if nodeConn, exists := b.nodes.Load(nodeID); exists {
+				if !nodeConn.hasActiveConnections() {
+					nodesToCleanup = append(nodesToCleanup, nodeID)
+				}
+			}
+		}
+		return true
+	})
+
+	// Clean up the identified nodes
+	for _, nodeID := range nodesToCleanup {
+		log.Info().Uint64("node.id", nodeID.Uint64()).
+			Dur("offline_duration", cleanupThreshold).
+			Msg("Cleaning up node that has been offline for too long")
+
+		b.nodes.Delete(nodeID)
+		b.connected.Delete(nodeID)
+		b.totalNodes.Add(-1)
+	}
+
+	if len(nodesToCleanup) > 0 {
+		log.Info().Int("cleaned_nodes", len(nodesToCleanup)).
+			Msg("Completed cleanup of long-offline nodes")
+	}
+}
+
+// IsConnected reports whether a node has any active connections or is marked as connected.
+func (b *LockFreeBatcher) IsConnected(id types.NodeID) bool {
+	// First check if we have active connections for this node
+	if nodeConn, exists := b.nodes.Load(id); exists {
+		if nodeConn.hasActiveConnections() {
+			return true
+		}
+	}
+
+	// Otherwise fall back to the connected map (nil = connected, timestamp = disconnected)
+	val, ok := b.connected.Load(id)
+	if !ok {
+		return false
+	}
+
+	// nil means connected
+	if val == nil {
+		return true
+	}
+
+	return false
+}
+
+// ConnectedMap returns a concurrent map of all known nodes and whether each one is connected.
+func (b *LockFreeBatcher) ConnectedMap() *xsync.Map[types.NodeID, bool] {
+	ret := xsync.NewMap[types.NodeID, bool]()
+
+	// First, add all nodes with active connections
+	b.nodes.Range(func(id types.NodeID, nodeConn *multiChannelNodeConn) bool {
+		if nodeConn.hasActiveConnections() {
+			ret.Store(id, true)
+		}
+		return true
+	})
+
+	// Then add all entries from the connected map
+	b.connected.Range(func(id types.NodeID, val *time.Time) bool {
+		// Only add if not already added as connected above
+		if _, exists := ret.Load(id); !exists {
+			if val == nil {
+				// nil means connected
+				ret.Store(id, true)
+			} else {
+				// timestamp means disconnected
+				ret.Store(id, false)
+			}
+		}
+		return true
+	})
+
+	return ret
+}
+
+// MapResponseFromChange queues work to generate a map response and waits for the result.
+// This allows synchronous map generation using the same worker pool.
+func (b *LockFreeBatcher) MapResponseFromChange(id types.NodeID, c change.ChangeSet) (*tailcfg.MapResponse, error) {
+	resultCh := make(chan workResult, 1)
+
+	// Queue the work with a result channel using the safe queueing method
+	b.queueWork(work{c: c, nodeID: id, resultCh: resultCh})
+
+	// Wait for the result
+	select {
+	case result := <-resultCh:
+		return result.mapResponse, result.err
+	case <-b.ctx.Done():
+		return nil, fmt.Errorf("batcher shutting down while generating map response for node %d", id)
+	}
+}
+
+// connectionEntry represents a single connection to a node.
+type connectionEntry struct {
+	id       string // unique connection ID
+	c        chan<- *tailcfg.MapResponse
+	version  tailcfg.CapabilityVersion
+	created  time.Time
+	lastUsed atomic.Int64 // Unix timestamp of last successful send
+}
+
+// multiChannelNodeConn manages multiple concurrent connections for a single node.
+type multiChannelNodeConn struct {
+	id     types.NodeID
+	mapper *mapper
+
+	mutex       sync.RWMutex
+	connections []*connectionEntry
+
+	updateCount atomic.Int64
+}
+
+// generateConnectionID generates a unique connection identifier.
+func generateConnectionID() string {
+	bytes := make([]byte, 8)
+	rand.Read(bytes)
+	return fmt.Sprintf("%x", bytes)
+}
+
+// newMultiChannelNodeConn creates a new multi-channel node connection.
+func newMultiChannelNodeConn(id types.NodeID, mapper *mapper) *multiChannelNodeConn {
+	return &multiChannelNodeConn{
+		id:     id,
+		mapper: mapper,
+	}
+}
+
+func (mc *multiChannelNodeConn) close() {
+	mc.mutex.Lock()
+	defer mc.mutex.Unlock()
+
+	for _, conn := range mc.connections {
+		close(conn.c)
+	}
+}
+
+// addConnection adds a new connection.
+func (mc *multiChannelNodeConn) addConnection(entry *connectionEntry) {
+	mutexWaitStart := time.Now()
+	log.Debug().Caller().Uint64("node.id", mc.id.Uint64()).Str("chan", fmt.Sprintf("%p", entry.c)).Str("conn.id", entry.id).
+		Msg("addConnection: waiting for mutex - POTENTIAL CONTENTION POINT")
+
+	mc.mutex.Lock()
+	mutexWaitDur := time.Since(mutexWaitStart)
+	defer mc.mutex.Unlock()
+
+	mc.connections = append(mc.connections, entry)
+	log.Debug().Caller().Uint64("node.id", mc.id.Uint64()).Str("chan", fmt.Sprintf("%p", entry.c)).Str("conn.id", entry.id).
+		Int("total_connections", len(mc.connections)).
+		Dur("mutex_wait_time", mutexWaitDur).
+		Msg("Successfully added connection after mutex wait")
+}
+
+// removeConnectionByChannel removes a connection by matching channel pointer.
+func (mc *multiChannelNodeConn) removeConnectionByChannel(c chan<- *tailcfg.MapResponse) bool {
+	mc.mutex.Lock()
+	defer mc.mutex.Unlock()
+
+	for i, entry := range mc.connections {
+		if entry.c == c {
+			// Remove this connection
+			mc.connections = append(mc.connections[:i], mc.connections[i+1:]...)
+			log.Debug().Caller().Uint64("node.id", mc.id.Uint64()).Str("chan", fmt.Sprintf("%p", c)).
+				Int("remaining_connections", len(mc.connections)).
+				Msg("Successfully removed connection")
+			return true
+		}
+	}
+	return false
+}
+
+// hasActiveConnections checks if the node has any active connections.
+func (mc *multiChannelNodeConn) hasActiveConnections() bool {
+	mc.mutex.RLock()
+	defer mc.mutex.RUnlock()
+
+	return len(mc.connections) > 0
+}
+
+// getActiveConnectionCount returns the number of active connections.
+func (mc *multiChannelNodeConn) getActiveConnectionCount() int {
+	mc.mutex.RLock()
+	defer mc.mutex.RUnlock()
+
+	return len(mc.connections)
+}
+
+// send broadcasts data to all active connections for the node.
+func (mc *multiChannelNodeConn) send(data *tailcfg.MapResponse) error {
+	if data == nil {
+		return nil
+	}
+
+	mc.mutex.Lock()
+	defer mc.mutex.Unlock()
+
+	if len(mc.connections) == 0 {
+		// During rapid reconnection, nodes may temporarily have no active connections
+		// This is not an error - the node will receive a full map when it reconnects
+		log.Debug().Caller().Uint64("node.id", mc.id.Uint64()).
+			Msg("send: skipping send to node with no active connections (likely rapid reconnection)")
+		return nil // Return success instead of error
+	}
+
+	log.Debug().Caller().Uint64("node.id", mc.id.Uint64()).
+		Int("total_connections", len(mc.connections)).
+		Msg("send: broadcasting to all connections")
+
+	var lastErr error
+	successCount := 0
+	var failedConnections []int // Track failed connections for removal
+
+	// Send to all connections
+	for i, conn := range mc.connections {
+		log.Debug().Caller().Uint64("node.id", mc.id.Uint64()).Str("chan", fmt.Sprintf("%p", conn.c)).
+			Str("conn.id", conn.id).Int("connection_index", i).
+			Msg("send: attempting to send to connection")
+
+		if err := conn.send(data); err != nil {
+			lastErr = err
+			failedConnections = append(failedConnections, i)
+			log.Warn().Err(err).
+				Uint64("node.id", mc.id.Uint64()).Str("chan", fmt.Sprintf("%p", conn.c)).
+				Str("conn.id", conn.id).Int("connection_index", i).
+				Msg("send: connection send failed")
+		} else {
+			successCount++
+			log.Debug().Caller().Uint64("node.id", mc.id.Uint64()).Str("chan", fmt.Sprintf("%p", conn.c)).
+				Str("conn.id", conn.id).Int("connection_index", i).
+				Msg("send: successfully sent to connection")
+		}
+	}
+
+	// Remove failed connections (in reverse order to maintain indices)
+	for i := len(failedConnections) - 1; i >= 0; i-- {
+		idx := failedConnections[i]
+		log.Debug().Caller().Uint64("node.id", mc.id.Uint64()).
+			Str("conn.id", mc.connections[idx].id).
+			Msg("send: removing failed connection")
+		mc.connections = append(mc.connections[:idx], mc.connections[idx+1:]...)
+	}
+
+	mc.updateCount.Add(1)
+
+	log.Info().Uint64("node.id", mc.id.Uint64()).
+		Int("successful_sends", successCount).
+		Int("failed_connections", len(failedConnections)).
+		Int("remaining_connections", len(mc.connections)).
+		Msg("send: completed broadcast")
+
+	// Success if at least one send succeeded
+	if successCount > 0 {
+		return nil
+	}
+
+	return fmt.Errorf("node %d: all connections failed, last error: %w", mc.id, lastErr)
+}
+
+// send sends data to a single connection entry with timeout-based stale connection detection.
+func (entry *connectionEntry) send(data *tailcfg.MapResponse) error {
+	if data == nil {
+		return nil
+	}
+
+	// Use a short timeout to detect stale connections where the client isn't reading the channel.
+	// This is critical for detecting Docker containers that are forcefully terminated
+	// but still have channels that appear open.
+	select {
+	case entry.c <- data:
+		// Update last used timestamp on successful send
+		entry.lastUsed.Store(time.Now().Unix())
+		return nil
+	case <-time.After(50 * time.Millisecond):
+		// Connection is likely stale - client isn't reading from channel
+		// This catches the case where Docker containers are killed but channels remain open
+		return fmt.Errorf("connection %s: timeout sending to channel (likely stale connection)", entry.id)
+	}
+}
+
+// nodeID returns the node ID.
+func (mc *multiChannelNodeConn) nodeID() types.NodeID {
+	return mc.id
+}
+
+// version returns the capability version from the first active connection.
+// All connections for a node should have the same version in practice.
+func (mc *multiChannelNodeConn) version() tailcfg.CapabilityVersion {
+	mc.mutex.RLock()
+	defer mc.mutex.RUnlock()
+
+	if len(mc.connections) == 0 {
+		return 0
+	}
+
+	return mc.connections[0].version
+}
+
+// change applies a change to all active connections for the node.
+func (mc *multiChannelNodeConn) change(c change.ChangeSet) error {
+	return handleNodeChange(mc, mc.mapper, c)
+}
+
+// DebugNodeInfo contains debug information about a node's connections.
+type DebugNodeInfo struct {
+	Connected         bool `json:"connected"`
+	ActiveConnections int  `json:"active_connections"`
+}
+
+// Debug returns a pre-baked map of node debug information for the debug interface.
+func (b *LockFreeBatcher) Debug() map[types.NodeID]DebugNodeInfo {
+	result := make(map[types.NodeID]DebugNodeInfo)
+
+	// Get all nodes with their connection status using immediate connection logic
+	// (no grace period) for debug purposes
+	b.nodes.Range(func(id types.NodeID, nodeConn *multiChannelNodeConn) bool {
+		nodeConn.mutex.RLock()
+		activeConnCount := len(nodeConn.connections)
+		nodeConn.mutex.RUnlock()
+
+		// Use immediate connection status: if active connections exist, node is connected
+		// If not, check the connected map for nil (connected) vs timestamp (disconnected)
+		connected := false
+		if activeConnCount > 0 {
+			connected = true
+		} else {
+			// Check connected map for immediate status
+			if val, ok := b.connected.Load(id); ok && val == nil {
+				connected = true
+			}
+		}
+
+		result[id] = DebugNodeInfo{
+			Connected:         connected,
+			ActiveConnections: activeConnCount,
+		}
+		return true
+	})
+
+	// Add all entries from the connected map to capture both connected and disconnected nodes
+	b.connected.Range(func(id types.NodeID, val *time.Time) bool {
+		// Only add if not already processed above
+		if _, exists := result[id]; !exists {
+			// Use immediate connection status for debug (no grace period)
+			connected := (val == nil) // nil means connected, timestamp means disconnected
+			result[id] = DebugNodeInfo{
+				Connected:         connected,
+				ActiveConnections: 0,
+			}
+		}
+		return true
+	})
+
+	return result
+}
+
+func (b *LockFreeBatcher) DebugMapResponses() (map[types.NodeID][]tailcfg.MapResponse, error) {
+	return b.mapper.debugMapResponses()
+}
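The `send` methods above treat a connection as stale when a channel send does not complete within a short timeout, and then prune the connections that failed. A minimal sketch of that idea follows, assuming plain `chan int` connections. `sendWithTimeout`, `broadcast`, and `demoBroadcast` are hypothetical helpers, not headscale functions, and the sketch uses an explicit `time.NewTimer` (stopped on return) where the diff uses `time.After`.

```go
package main

import (
	"fmt"
	"time"
)

// sendWithTimeout tries to deliver v on ch and gives up after d.
func sendWithTimeout(ch chan<- int, v int, d time.Duration) error {
	timer := time.NewTimer(d)
	defer timer.Stop()

	select {
	case ch <- v:
		return nil
	case <-timer.C:
		return fmt.Errorf("timeout after %s: receiver is not reading", d)
	}
}

// broadcast sends v to every channel and returns only the channels that
// accepted it; channels that timed out are treated as stale and dropped.
func broadcast(chans []chan int, v int, d time.Duration) []chan int {
	alive := make([]chan int, 0, len(chans))
	for _, ch := range chans {
		if err := sendWithTimeout(ch, v, d); err != nil {
			continue
		}
		alive = append(alive, ch)
	}
	return alive
}

// demoBroadcast wires one healthy and one stale receiver and reports how
// many connections survive a broadcast.
func demoBroadcast() int {
	healthy := make(chan int, 1) // buffered: the send succeeds immediately
	stale := make(chan int)      // unbuffered with no reader: the send times out

	alive := broadcast([]chan int{healthy, stale}, 7, 10*time.Millisecond)
	return len(alive)
}

func main() {
	fmt.Println("connections still alive:", demoBroadcast())
}
```

Building a fresh `alive` slice avoids the index bookkeeping that in-place removal requires, at the cost of one allocation per broadcast.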
--- a/hscontrol/mapper/batcher_test.go
+++ b/hscontrol/mapper/batcher_test.go
--- a/hscontrol/mapper/builder.go
+++ b/hscontrol/mapper/builder.go
@@ -0,0 +1,291 @@
+package mapper
+
+import (
+	"errors"
+	"net/netip"
+	"sort"
+	"time"
+
+	"github.com/juanfont/headscale/hscontrol/policy"
+	"github.com/juanfont/headscale/hscontrol/types"
+	"tailscale.com/tailcfg"
+	"tailscale.com/types/views"
+	"tailscale.com/util/multierr"
+)
+
+// MapResponseBuilder provides a fluent interface for building tailcfg.MapResponse.
+type MapResponseBuilder struct {
+	resp   *tailcfg.MapResponse
+	mapper *mapper
+	nodeID types.NodeID
+	capVer tailcfg.CapabilityVersion
+	errs   []error
+
+	debugType debugType
+}
+
+type debugType string
+
+const (
+	fullResponseDebug   debugType = "full"
+	patchResponseDebug  debugType = "patch"
+	removeResponseDebug debugType = "remove"
+	changeResponseDebug debugType = "change"
+	derpResponseDebug   debugType = "derp"
+)
+
+// NewMapResponseBuilder creates a new builder with basic fields set.
+func (m *mapper) NewMapResponseBuilder(nodeID types.NodeID) *MapResponseBuilder {
+	now := time.Now()
+	return &MapResponseBuilder{
+		resp: &tailcfg.MapResponse{
+			KeepAlive:   false,
+			ControlTime: &now,
+		},
+		mapper: m,
+		nodeID: nodeID,
+		errs:   nil,
+	}
+}
+
+// addError adds an error to the builder's error list.
+func (b *MapResponseBuilder) addError(err error) {
+	if err != nil {
+		b.errs = append(b.errs, err)
+	}
+}
+
+// hasErrors returns true if the builder has accumulated any errors.
+func (b *MapResponseBuilder) hasErrors() bool {
+	return len(b.errs) > 0
+}
+
+// WithCapabilityVersion sets the capability version for the response.
+func (b *MapResponseBuilder) WithCapabilityVersion(capVer tailcfg.CapabilityVersion) *MapResponseBuilder {
+	b.capVer = capVer
+	return b
+}
+
+// WithSelfNode adds the requesting node to the response.
+func (b *MapResponseBuilder) WithSelfNode() *MapResponseBuilder {
+	nodeView, ok := b.mapper.state.GetNodeByID(b.nodeID)
+	if !ok {
+		b.addError(errors.New("node not found"))
+		return b
+	}
+
+	// NOTE: overriding the self node's online status with the batcher's view
+	// (which respects grace periods for logout scenarios) is currently disabled.
+	node := nodeView.AsStruct()
+	// if b.mapper.batcher != nil {
+	// 	node.IsOnline = ptr.To(b.mapper.batcher.IsConnected(b.nodeID))
+	// }
+
+	_, matchers := b.mapper.state.Filter()
+	tailnode, err := tailNode(
+		node.View(), b.capVer, b.mapper.state,
+		func(id types.NodeID) []netip.Prefix {
+			return policy.ReduceRoutes(node.View(), b.mapper.state.GetNodePrimaryRoutes(id), matchers)
+		},
+		b.mapper.cfg)
+	if err != nil {
+		b.addError(err)
+		return b
+	}
+
+	b.resp.Node = tailnode
+
+	return b
+}
+
+func (b *MapResponseBuilder) WithDebugType(t debugType) *MapResponseBuilder {
+	if debugDumpMapResponsePath != "" {
+		b.debugType = t
+	}
+
+	return b
+}
+
+// WithDERPMap adds the DERP map to the response.
+func (b *MapResponseBuilder) WithDERPMap() *MapResponseBuilder {
+	b.resp.DERPMap = b.mapper.state.DERPMap().AsStruct()
+	return b
+}
+
+// WithDomain adds the domain configuration.
+func (b *MapResponseBuilder) WithDomain() *MapResponseBuilder {
+	b.resp.Domain = b.mapper.cfg.Domain()
+	return b
+}
+
+// WithCollectServicesDisabled sets the collect services flag to false.
+func (b *MapResponseBuilder) WithCollectServicesDisabled() *MapResponseBuilder {
+	b.resp.CollectServices.Set(false)
+	return b
+}
+
+// WithDebugConfig adds debug configuration.
+// It disables log tailing if the mapper's LogTail is not enabled.
+func (b *MapResponseBuilder) WithDebugConfig() *MapResponseBuilder {
+	b.resp.Debug = &tailcfg.Debug{
+		DisableLogTail: !b.mapper.cfg.LogTail.Enabled,
+	}
+	return b
+}
+
+// WithSSHPolicy adds SSH policy configuration for the requesting node.
+func (b *MapResponseBuilder) WithSSHPolicy() *MapResponseBuilder {
+	node, ok := b.mapper.state.GetNodeByID(b.nodeID)
+	if !ok {
+		b.addError(errors.New("node not found"))
+		return b
+	}
+
+	sshPolicy, err := b.mapper.state.SSHPolicy(node)
+	if err != nil {
+		b.addError(err)
+		return b
+	}
+
+	b.resp.SSHPolicy = sshPolicy
+
+	return b
+}
+
+// WithDNSConfig adds DNS configuration for the requesting node.
+func (b *MapResponseBuilder) WithDNSConfig() *MapResponseBuilder {
+	node, ok := b.mapper.state.GetNodeByID(b.nodeID)
+	if !ok {
+		b.addError(errors.New("node not found"))
+		return b
+	}
+
+	b.resp.DNSConfig = generateDNSConfig(b.mapper.cfg, node)
+
+	return b
+}
+
+// WithUserProfiles adds user profiles for the requesting node and given peers.
+func (b *MapResponseBuilder) WithUserProfiles(peers views.Slice[types.NodeView]) *MapResponseBuilder {
+	node, ok := b.mapper.state.GetNodeByID(b.nodeID)
+	if !ok {
+		b.addError(errors.New("node not found"))
+		return b
+	}
+
+	b.resp.UserProfiles = generateUserProfiles(node, peers)
+
+	return b
+}
+
+// WithPacketFilters adds packet filter rules based on policy.
+func (b *MapResponseBuilder) WithPacketFilters() *MapResponseBuilder {
+	node, ok := b.mapper.state.GetNodeByID(b.nodeID)
+	if !ok {
+		b.addError(errors.New("node not found"))
+		return b
+	}
+
+	filter, _ := b.mapper.state.Filter()
+
+	// CapVer 81: 2023-11-17: MapResponse.PacketFilters (incremental packet filter updates)
+	// Currently, we do not send incremental packet filters. However, using the
+	// new PacketFilters field and "base" allows us to send a full update when we
+	// have to send an empty list, avoiding the workaround that was previously required.
+	b.resp.PacketFilters = map[string][]tailcfg.FilterRule{
+		"base": policy.ReduceFilterRules(node, filter),
+	}
+
+	return b
+}
+
+// WithPeers adds full peer list with policy filtering (for full map response).
+func (b *MapResponseBuilder) WithPeers(peers views.Slice[types.NodeView]) *MapResponseBuilder {
+	tailPeers, err := b.buildTailPeers(peers)
+	if err != nil {
+		b.addError(err)
+		return b
+	}
+
+	b.resp.Peers = tailPeers
+
+	return b
+}
+
+// WithPeerChanges adds changed peers with policy filtering (for incremental updates).
+func (b *MapResponseBuilder) WithPeerChanges(peers views.Slice[types.NodeView]) *MapResponseBuilder {
+	tailPeers, err := b.buildTailPeers(peers)
+	if err != nil {
+		b.addError(err)
+		return b
+	}
+
+	b.resp.PeersChanged = tailPeers
+
+	return b
+}
+
+// buildTailPeers converts views.Slice[types.NodeView] to []*tailcfg.Node with policy filtering and sorting.
+func (b *MapResponseBuilder) buildTailPeers(peers views.Slice[types.NodeView]) ([]*tailcfg.Node, error) {
+	node, ok := b.mapper.state.GetNodeByID(b.nodeID)
+	if !ok {
+		return nil, errors.New("node not found")
+	}
+
+	filter, matchers := b.mapper.state.Filter()
+
+	// If there are filter rules present, see if there are any nodes that cannot
+	// access each other at all and remove them from the peers.
+	var changedViews views.Slice[types.NodeView]
+	if len(filter) > 0 {
+		changedViews = policy.ReduceNodes(node, peers, matchers)
+	} else {
+		changedViews = peers
+	}
+
+	tailPeers, err := tailNodes(
+		changedViews, b.capVer, b.mapper.state,
+		func(id types.NodeID) []netip.Prefix {
+			return policy.ReduceRoutes(node, b.mapper.state.GetNodePrimaryRoutes(id), matchers)
+		},
+		b.mapper.cfg)
+	if err != nil {
+		return nil, err
+	}
+
+	// Peers is always returned sorted by Node.ID.
+	sort.SliceStable(tailPeers, func(x, y int) bool {
+		return tailPeers[x].ID < tailPeers[y].ID
+	})
+
+	return tailPeers, nil
+}
+
+// WithPeerChangedPatch adds peer change patches.
+func (b *MapResponseBuilder) WithPeerChangedPatch(changes []*tailcfg.PeerChange) *MapResponseBuilder {
+	b.resp.PeersChangedPatch = changes
+	return b
+}
+
+// WithPeersRemoved adds removed peer IDs.
+func (b *MapResponseBuilder) WithPeersRemoved(removedIDs ...types.NodeID) *MapResponseBuilder {
+	var tailscaleIDs []tailcfg.NodeID
+	for _, id := range removedIDs {
+		tailscaleIDs = append(tailscaleIDs, id.NodeID())
+	}
+	b.resp.PeersRemoved = tailscaleIDs
+
+	return b
+}
+
+// Build finalizes the response and returns it, or the accumulated errors if any step failed.
+func (b *MapResponseBuilder) Build() (*tailcfg.MapResponse, error) {
+	if len(b.errs) > 0 {
+		return nil, multierr.New(b.errs...)
+	}
+	if debugDumpMapResponsePath != "" {
+		writeDebugMapResponse(b.resp, b.debugType, b.nodeID)
+	}
+
+	return b.resp, nil
+}
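The builder above defers error handling: each `With...` step records a failure with `addError` and still returns the builder, and `Build` reports everything at the end. The following is a minimal, self-contained sketch of that pattern, not the Headscale implementation. The `response` and `builder` types and their methods are hypothetical stand-ins, and the standard library's `errors.Join` is swapped in for the `multierr.New` call used in the diff.

```go
package main

import (
	"errors"
	"fmt"
)

// response is a hypothetical stand-in for tailcfg.MapResponse.
type response struct {
	Domain string
	Peers  []int
}

// builder collects errors instead of returning them from each step,
// which keeps the fluent call chain intact.
type builder struct {
	resp *response
	errs []error
}

func newBuilder() *builder { return &builder{resp: &response{}} }

// addError records a non-nil error; nil errors are ignored.
func (b *builder) addError(err error) {
	if err != nil {
		b.errs = append(b.errs, err)
	}
}

// withDomain sets the domain, or records an error if it is empty.
func (b *builder) withDomain(d string) *builder {
	if d == "" {
		b.addError(errors.New("empty domain"))
		return b
	}
	b.resp.Domain = d
	return b
}

// withPeers overwrites the peer list on every call.
func (b *builder) withPeers(ids ...int) *builder {
	b.resp.Peers = ids
	return b
}

// build returns the response, or all recorded errors joined together.
// Assumption: errors.Join replaces multierr.New for this sketch.
func (b *builder) build() (*response, error) {
	if len(b.errs) > 0 {
		return nil, errors.Join(b.errs...)
	}
	return b.resp, nil
}

func main() {
	resp, err := newBuilder().withDomain("example.com").withPeers(1, 2).build()
	fmt.Println(resp.Domain, len(resp.Peers), err)

	_, err = newBuilder().withDomain("").withPeers(3).build()
	fmt.Println(err)
}
```

The trade-off is that a failed step does not stop later steps from running, so each step has to be safe to call on a builder that already holds errors.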
--- a/hscontrol/mapper/builder_test.go
+++ b/hscontrol/mapper/builder_test.go
@@ -0,0 +1,347 @@
+package mapper
+
+import (
+	"testing"
+	"time"
+
+	"github.com/juanfont/headscale/hscontrol/state"
+	"github.com/juanfont/headscale/hscontrol/types"
+	"github.com/stretchr/testify/assert"
+	"github.com/stretchr/testify/require"
+	"tailscale.com/tailcfg"
+)
+
+func TestMapResponseBuilder_Basic(t *testing.T) {
+	cfg := &types.Config{
+		BaseDomain: "example.com",
+		LogTail: types.LogTailConfig{
+			Enabled: true,
+		},
+	}
+
+	mockState := &state.State{}
+	m := &mapper{
+		cfg:   cfg,
+		state: mockState,
+	}
+
+	nodeID := types.NodeID(1)
+
+	builder := m.NewMapResponseBuilder(nodeID)
+
+	// Test basic builder creation
+	assert.NotNil(t, builder)
+	assert.Equal(t, nodeID, builder.nodeID)
+	assert.NotNil(t, builder.resp)
+	assert.False(t, builder.resp.KeepAlive)
+	assert.NotNil(t, builder.resp.ControlTime)
+	assert.WithinDuration(t, time.Now(), *builder.resp.ControlTime, time.Second)
+}
+
+func TestMapResponseBuilder_WithCapabilityVersion(t *testing.T) {
+	cfg := &types.Config{}
+	mockState := &state.State{}
+	m := &mapper{
+		cfg:   cfg,
+		state: mockState,
+	}
+
+	nodeID := types.NodeID(1)
+	capVer := tailcfg.CapabilityVersion(42)
+
+	builder := m.NewMapResponseBuilder(nodeID).
+		WithCapabilityVersion(capVer)
+
+	assert.Equal(t, capVer, builder.capVer)
+	assert.False(t, builder.hasErrors())
+}
+
+func TestMapResponseBuilder_WithDomain(t *testing.T) {
+	domain := "test.example.com"
+	cfg := &types.Config{
+		ServerURL:  "https://test.example.com",
+		BaseDomain: domain,
+	}
+
+	mockState := &state.State{}
+	m := &mapper{
+		cfg:   cfg,
+		state: mockState,
+	}
+
+	nodeID := types.NodeID(1)
+
+	builder := m.NewMapResponseBuilder(nodeID).
+		WithDomain()
+
+	assert.Equal(t, domain, builder.resp.Domain)
+	assert.False(t, builder.hasErrors())
+}
+
+func TestMapResponseBuilder_WithCollectServicesDisabled(t *testing.T) {
+	cfg := &types.Config{}
+	mockState := &state.State{}
+	m := &mapper{
+		cfg:   cfg,
+		state: mockState,
+	}
+
+	nodeID := types.NodeID(1)
+
+	builder := m.NewMapResponseBuilder(nodeID).
+		WithCollectServicesDisabled()
+
+	value, isSet := builder.resp.CollectServices.Get()
+	assert.True(t, isSet)
+	assert.False(t, value)
+	assert.False(t, builder.hasErrors())
+}
+
+func TestMapResponseBuilder_WithDebugConfig(t *testing.T) {
+	tests := []struct {
+		name           string
+		logTailEnabled bool
+		expected       bool
+	}{
+		{
+			name:           "LogTail enabled",
+			logTailEnabled: true,
+			expected:       false, // DisableLogTail should be false when LogTail is enabled
+		},
+		{
+			name:           "LogTail disabled",
+			logTailEnabled: false,
+			expected:       true, // DisableLogTail should be true when LogTail is disabled
+		},
+	}
+
+	for _, tt := range tests {
+		t.Run(tt.name, func(t *testing.T) {
+			cfg := &types.Config{
+				LogTail: types.LogTailConfig{
+					Enabled: tt.logTailEnabled,
+				},
+			}
+			mockState := &state.State{}
+			m := &mapper{
+				cfg:   cfg,
+				state: mockState,
+			}
+
+			nodeID := types.NodeID(1)
+
+			builder := m.NewMapResponseBuilder(nodeID).
+				WithDebugConfig()
+
+			require.NotNil(t, builder.resp.Debug)
+			assert.Equal(t, tt.expected, builder.resp.Debug.DisableLogTail)
+			assert.False(t, builder.hasErrors())
+		})
+	}
+}
+
+func TestMapResponseBuilder_WithPeerChangedPatch(t *testing.T) {
+	cfg := &types.Config{}
+	mockState := &state.State{}
+	m := &mapper{
+		cfg:   cfg,
+		state: mockState,
+	}
+
+	nodeID := types.NodeID(1)
+	changes := []*tailcfg.PeerChange{
+		{
+			NodeID:     123,
+			DERPRegion: 1,
+		},
+		{
+			NodeID:     456,
+			DERPRegion: 2,
+		},
+	}
+
+	builder := m.NewMapResponseBuilder(nodeID).
+		WithPeerChangedPatch(changes)
+
+	assert.Equal(t, changes, builder.resp.PeersChangedPatch)
+	assert.False(t, builder.hasErrors())
+}
+
+func TestMapResponseBuilder_WithPeersRemoved(t *testing.T) {
+	cfg := &types.Config{}
+	mockState := &state.State{}
+	m := &mapper{
+		cfg:   cfg,
+		state: mockState,
+	}
+
+	nodeID := types.NodeID(1)
+	removedID1 := types.NodeID(123)
+	removedID2 := types.NodeID(456)
+
+	builder := m.NewMapResponseBuilder(nodeID).
+		WithPeersRemoved(removedID1, removedID2)
+
+	expected := []tailcfg.NodeID{
+		removedID1.NodeID(),
+		removedID2.NodeID(),
+	}
+	assert.Equal(t, expected, builder.resp.PeersRemoved)
+	assert.False(t, builder.hasErrors())
+}
+
+func TestMapResponseBuilder_ErrorHandling(t *testing.T) {
+	cfg := &types.Config{}
+	mockState := &state.State{}
+	m := &mapper{
+		cfg:   cfg,
+		state: mockState,
+	}
+
+	nodeID := types.NodeID(1)
+
+	// Simulate an error in the builder
+	builder := m.NewMapResponseBuilder(nodeID)
+	builder.addError(assert.AnError)
+
+	// All subsequent calls should continue to work and accumulate errors
+	result := builder.
+		WithDomain().
+		WithCollectServicesDisabled().
+		WithDebugConfig()
+
+	assert.True(t, result.hasErrors())
+	assert.Len(t, result.errs, 1)
+	assert.Equal(t, assert.AnError, result.errs[0])
+
+	// Build should return the error
+	data, err := result.Build()
+	assert.Nil(t, data)
+	assert.Error(t, err)
+}
+
+func TestMapResponseBuilder_ChainedCalls(t *testing.T) {
+	domain := "chained.example.com"
+	cfg := &types.Config{
+		ServerURL:  "https://chained.example.com",
+		BaseDomain: domain,
+		LogTail: types.LogTailConfig{
+			Enabled: false,
+		},
+	}
+
+	mockState := &state.State{}
+	m := &mapper{
+		cfg:   cfg,
+		state: mockState,
+	}
+
+	nodeID := types.NodeID(1)
+	capVer := tailcfg.CapabilityVersion(99)
+
+	builder := m.NewMapResponseBuilder(nodeID).
+		WithCapabilityVersion(capVer).
+		WithDomain().
+		WithCollectServicesDisabled().
+		WithDebugConfig()
+
+	// Verify all fields are set correctly
+	assert.Equal(t, capVer, builder.capVer)
+	assert.Equal(t, domain, builder.resp.Domain)
+	value, isSet := builder.resp.CollectServices.Get()
+	assert.True(t, isSet)
+	assert.False(t, value)
+	assert.NotNil(t, builder.resp.Debug)
+	assert.True(t, builder.resp.Debug.DisableLogTail)
+	assert.False(t, builder.hasErrors())
+}
+
+func TestMapResponseBuilder_MultipleWithPeersRemoved(t *testing.T) {
+	cfg := &types.Config{}
+	mockState := &state.State{}
+	m := &mapper{
+		cfg:   cfg,
+		state: mockState,
+	}
+
+	nodeID := types.NodeID(1)
+	removedID1 := types.NodeID(100)
+	removedID2 := types.NodeID(200)
+
+	// Test calling WithPeersRemoved multiple times
+	builder := m.NewMapResponseBuilder(nodeID).
+		WithPeersRemoved(removedID1).
+		WithPeersRemoved(removedID2)
+
+	// Second call should overwrite the first
+	expected := []tailcfg.NodeID{removedID2.NodeID()}
+	assert.Equal(t, expected, builder.resp.PeersRemoved)
+	assert.False(t, builder.hasErrors())
+}
+
+func TestMapResponseBuilder_EmptyPeerChangedPatch(t *testing.T) {
+	cfg := &types.Config{}
+	mockState := &state.State{}
+	m := &mapper{
+		cfg:   cfg,
+		state: mockState,
+	}
+
+	nodeID := types.NodeID(1)
+
+	builder := m.NewMapResponseBuilder(nodeID).
+		WithPeerChangedPatch([]*tailcfg.PeerChange{})
+
+	assert.Empty(t, builder.resp.PeersChangedPatch)
+	assert.False(t, builder.hasErrors())
+}
+
+func TestMapResponseBuilder_NilPeerChangedPatch(t *testing.T) {
+	cfg := &types.Config{}
+	mockState := &state.State{}
+	m := &mapper{
+		cfg:   cfg,
+		state: mockState,
+	}
+
+	nodeID := types.NodeID(1)
+
+	builder := m.NewMapResponseBuilder(nodeID).
+		WithPeerChangedPatch(nil)
+
+	assert.Nil(t, builder.resp.PeersChangedPatch)
+	assert.False(t, builder.hasErrors())
+}
+
+func TestMapResponseBuilder_MultipleErrors(t *testing.T) {
+	cfg := &types.Config{}
+	mockState := &state.State{}
+	m := &mapper{
+		cfg:   cfg,
+		state: mockState,
+	}
+
+	nodeID := types.NodeID(1)
+
+	// Create a builder and add multiple errors
+	builder := m.NewMapResponseBuilder(nodeID)
+	builder.addError(assert.AnError)
+	builder.addError(assert.AnError)
+	builder.addError(nil) // This should be ignored
+
+	// All subsequent calls should continue to work
+	result := builder.
+		WithDomain().
+		WithCollectServicesDisabled()
+
+	assert.True(t, result.hasErrors())
+	assert.Len(t, result.errs, 2) // nil error should be ignored
+
+	// Build should return a multierr
+	data, err := result.Build()
+	assert.Nil(t, data)
+	assert.Error(t, err)
+
+	// The error should contain information about multiple errors
+	assert.Contains(t, err.Error(), "multiple errors")
+}
Author	SHA1	Message	Date
dependabot[bot]	3ba53497c3	build(deps): bump github.com/go-viper/mapstructure/v2 Bumps [github.com/go-viper/mapstructure/v2](https://github.com/go-viper/mapstructure) from 2.2.1 to 2.4.0. - [Release notes](https://github.com/go-viper/mapstructure/releases) - [Changelog](https://github.com/go-viper/mapstructure/blob/main/CHANGELOG.md) - [Commits](https://github.com/go-viper/mapstructure/compare/v2.2.1...v2.4.0) --- updated-dependencies: - dependency-name: github.com/go-viper/mapstructure/v2 dependency-version: 2.4.0 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com>	2025-09-12 14:57:41 +00:00
Kristoffer Dalby	2b30a15a68	cmd: add option to get and set policy directly from database (#2765)	2025-09-12 16:55:15 +02:00
Kristoffer Dalby	2938d03878	policy: reject unsupported fields (#2764)	2025-09-12 14:47:56 +02:00
Kristoffer Dalby	1b1c989268	{policy, node}: allow return paths in route reduction (#2767)	2025-09-12 11:47:51 +02:00
Kristoffer Dalby	3950f8f171	cli: use gobuild version handling (#2770)	2025-09-12 11:47:31 +02:00
Kristoffer Dalby	ee0ef396a2	policy: fix ssh usermap, fixing autogroup:nonroot (#2768)	2025-09-12 09:12:30 +02:00
Kristoffer Dalby	7056fbb63b	derp: fix flaky shuffle test (#2772)	2025-09-11 13:49:02 +00:00
Kristoffer Dalby	c91b9fc761	poll: add missing godoc (#2763)	2025-09-11 14:15:19 +02:00
Kristoffer Dalby	d41fb4d540	app: fix sigint hanging When the node notifier was replaced with batcher, we removed its closing, but forgot to add the batchers so it was never stopping node connections and waiting forever. Fixes #2751 Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>	2025-09-11 11:53:26 +02:00
Kristoffer Dalby	01c1f6f82a	policy: validate error message for asterix in ssh (#2766)	2025-09-10 18:41:43 +02:00
Oleksii Samoliuk	3f6657ae57	fix: documentation	2025-09-09 20:54:47 +02:00
Kristoffer Dalby	0512f7c57e	.github/ISSUE_TEMPLATE: add node number to environment Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>	2025-09-09 19:04:23 +02:00
Florian Preinstorfer	c6427aa296	Use group id instead of group name for Entra ID	2025-09-09 12:23:34 +02:00
Florian Preinstorfer	4e6d42d5bd	Keycloak's group format is configurable	2025-09-09 12:23:34 +02:00
Florian Preinstorfer	8ff5baadbe	Refresh OIDC docs The UserInfo endpoint is always queried since `5d8a2c2`. This allows to use all OIDC related features without any extra configuration on Authelia. For Keycloak, its sufficient to add the groups mapper to the userinfo endpoint.	2025-09-09 12:23:34 +02:00
Florian Preinstorfer	2f3c365b68	Describe how to remove a DERP region Add documentation for `d29feaef`. Fixes: #2450	2025-09-09 11:05:30 +02:00
Kristoffer Dalby	4893cdac74	integration: make timestamp const Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>	2025-09-09 09:40:00 +02:00
Kristoffer Dalby	476f30ab20	state: ensure netinfo is preserved and not removed the client will send a lot of fields as `nil` if they have not changed. NetInfo, which is inside Hostinfo, is one of those fields and we often would override the whole hostinfo meaning that we would remove netinfo if it hadnt changed. Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>	2025-09-09 09:40:00 +02:00
Kristoffer Dalby	233dffc186	lint and leftover Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>	2025-09-09 09:40:00 +02:00
Kristoffer Dalby	39443184d6	gen: new proto version Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>	2025-09-09 09:40:00 +02:00
Kristoffer Dalby	0303b76e1f	postgres uses more memory Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>	2025-09-09 09:40:00 +02:00
Kristoffer Dalby	684239e015	cmd/mapresponses: add mini tool to inspect mapresp state from integration Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>	2025-09-09 09:40:00 +02:00
Kristoffer Dalby	81b3e8f743	util: harden parsing of traceroute Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>	2025-09-09 09:40:00 +02:00
Kristoffer Dalby	50ed24847b	debug: add json and improve Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>	2025-09-09 09:40:00 +02:00
Kristoffer Dalby	9b962956b5	integration: Eventually, debug output, lint and format Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>	2025-09-09 09:40:00 +02:00
Kristoffer Dalby	3b16b75fe6	integration: rework retry for waiting for node sync Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>	2025-09-09 09:40:00 +02:00
Kristoffer Dalby	9d236571f4	state/nodestore: in memory representation of nodes Initial work on a nodestore which stores all of the nodes and their relations in memory with relationship for peers precalculated. It is a copy-on-write structure, replacing the "snapshot" when a change to the structure occurs. It is optimised for reads, and while batches are not fast, they are grouped together to do less of the expensive peer calculation if there are many changes rapidly. Writes will block until commited, while reads are never blocked. Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>	2025-09-09 09:40:00 +02:00
Kristoffer Dalby	38be30b6d4	derp: allow override to ip for debug Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>	2025-09-09 09:40:00 +02:00
Kristoffer Dalby	7f8b14f6f3	.github/workflows: remove integration retry Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>	2025-09-09 09:40:00 +02:00
Kristoffer Dalby	3326c5b7ec	cmd/hi: lint and format Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>	2025-09-09 09:40:00 +02:00
Kristoffer Dalby	b6d5788231	mapper: produce map before poll Before this patch, we would send a message to each "node stream" that there is an update that needs to be turned into a mapresponse and sent to a node. Producing the mapresponse is a "costly" afair which means that while a node was producing one, it might start blocking and creating full queues from the poller and all the way up to where updates where sent. This could cause updates to time out and being dropped as a bad node going away or spending too time processing would cause all the other nodes to not get any updates. In addition, it contributed to "uncontrolled parallel processing" by potentially doing too many expensive operations at the same time: Each node stream is essentially a channel, meaning that if you have 30 nodes, we will try to process 30 map requests at the same time. If you have 8 cpu cores, that will saturate all the cores immediately and cause a lot of wasted switching between the processing. Now, all the maps are processed by workers in the mapper, and the number of workers are controlable. These would now be recommended to be a bit less than number of CPU cores, allowing us to process them as fast as we can, and then send them to the poll. When the poll recieved the map, it is only responsible for taking it and sending it to the node. This might not directly improve the performance of Headscale, but it will likely make the performance a lot more consistent. And I would argue the design is a lot easier to reason about. Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>	2025-09-09 09:40:00 +02:00
Kristoffer Dalby	33e9e7a71f	CLAUDE: split into agents Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>	2025-09-09 09:40:00 +02:00
Kristoffer Dalby	ccd79ed8d4	mcp: add some standard mcp server Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>	2025-09-09 09:40:00 +02:00
Kristoffer Dalby	f6c4b338fd	.github/workflows: add generate check Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>	2025-09-09 09:40:00 +02:00
Kristoffer Dalby	306d8e1bd4	integration: validate expected online status in ping Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>	2025-09-09 09:40:00 +02:00
Kristoffer Dalby	4927e9d590	fix: improve mapresponses and profiles extraction in hi tool - Fix directory hierarchy flattening by using full paths instead of filepath.Base() - Remove redundant container hostname prefixes from directory names - Strip top-level directory from tar extraction to avoid nested structure - Ensure parent directories exist before creating files - Results in clean structure: control_logs/mapresponses/1-ts-client/file.json	2025-09-09 09:40:00 +02:00
Kristoffer Dalby	8e25f7f9dd	bunch of qol (#2748)	2025-08-27 17:09:13 +02:00
github-actions[bot]	1a7a2f4196	flake.lock: Update (#2699)	2025-08-24 12:07:32 +00:00
Dylan Blanqué	860a8a597f	Update tools.md Share/Contribute Headscale Zabbix Monitoring scripts and templates. Thank you for the awesome application to everyone involved in Headscale's development!	2025-08-24 06:05:21 +02:00
cuiweixie	a2a6d20218	Refactor to use reflect.TypeFor	2025-08-23 20:43:49 +02:00
Andrey Bobelev	d29feaef79	chore(derp): allow nil regions in DERPMaps Previously, nil regions were not properly handled. This change allows users to disable regions in DERPMaps. Particularly useful to disable some official regions.	2025-08-23 06:54:14 +02:00
Andrey Bobelev	630bfd265a	chore(derp): prioritize loading DERP maps from URLs This allows users to override default entries provided via URL	2025-08-23 06:54:14 +02:00
Florian Preinstorfer	e949859d33	Add DERP docs	2025-08-22 12:09:31 +02:00
Florian Preinstorfer	4d61da30d0	Use an IPv4 address range suitable for documentation	2025-08-22 12:09:31 +02:00
Kristoffer Dalby	b87567628a	derp: increase update frequency and harden on failures (#2741)	2025-08-22 10:40:38 +02:00
dotlambda	51c6367bb1	Correctly document the default for dns.override_local_dns	2025-08-19 15:02:49 +02:00
Florian Preinstorfer	be337c6a33	Enable derp.server.verify_clients by default This setting is already enabled in example-config.yaml but would default to false if no key is set.	2025-08-19 11:30:44 +02:00
Shourya Gautam	086fcad7d9	Fix Internal server error on /verify (#2735) * converted the returned error to an httpError	2025-08-18 14:39:42 +00:00
afranco	3e3c72ea6f	docs(acls): Add example for allow/deny all acl policy	2025-08-18 16:13:14 +02:00
afranco	43f90d205e	fix: allow all traffic if acls field is omited from the policy	2025-08-18 16:13:14 +02:00
Florian Preinstorfer	7b8b796a71	docs: connect Android using a preauthkey Fixes: #2616	2025-08-18 16:06:17 +02:00
nblock	fa619ea9f3	Fix CHANGELOG for autogroup:member and autogroup:tagged (#2733)	2025-08-18 08:59:03 +02:00
Florian Preinstorfer	30a1f7e68e	Log registrationID to simplify interactive node registration Some clients such as Android make it hard to transfer the registrationID to the server, its easier to get it from the server logs.	2025-08-15 17:11:38 +02:00
Florian Preinstorfer	30cec3aa2b	Document ports in use Ref: #1767	2025-08-14 09:24:09 +02:00
Fredrik Ekre	5d8a2c25ea	OIDC: Query userinfo endpoint before verifying user This patch includes some changes to the OIDC integration in particular: - Make sure that userinfo claims are queried before comparing the user with the configured allowed groups, email and email domain. - Update user with group claim from the userinfo endpoint which is required for allowed groups to work correctly. This is essentially a continuation of #2545. - Let userinfo claims take precedence over id token claims. With these changes I have verified that Headscale works as expected together with Authelia without the documented escape hatch [0], i.e. everything works even if the id token only contain the iss and sub claims. [0]: https://www.authelia.com/integration/openid-connect/headscale/#configuration-escape-hatch	2025-08-11 17:51:16 +02:00
Jeff Emershaw	b4f7782fd8	support force flag for nodes backfillips	2025-08-10 13:31:24 +02:00
eyjhb	d77874373d	feat: add robots.txt	2025-08-10 10:57:45 +02:00
Kristoffer Dalby	a058bf3cd3	mapper: produce map before poll (#2628)	2025-07-28 11:15:53 +02:00
Luke Watts	b2a18830ed	docs: fix typos	2025-07-28 10:28:49 +02:00
Kristoffer Dalby	9779adc0b7	integration: run headscale with delve and debug symbols (#2689)	2025-07-24 17:44:09 +02:00
nblock	e7fe645be5	Fix invocation of golangci-lint (#2703)	2025-07-24 08:41:20 +02:00
Florian Preinstorfer	bcd80ee773	Add debugging and troubleshooting guide	2025-07-22 14:56:45 +02:00
Florian Preinstorfer	c04e17d82e	Document valid log levels Also change the order as the level seems more important than the format.	2025-07-22 14:56:45 +02:00
Florian Preinstorfer	98fc0563ac	Bump version in docs	2025-07-22 14:56:45 +02:00
Kian-Meng Ang	3123d5286b	Fix typos Found via `codespell -L shs,hastable,userr`	2025-07-21 12:06:07 +02:00
Kristoffer Dalby	7fce5065c4	all: remove 32 bit support (#2692)	2025-07-16 13:32:59 +02:00
Florian Preinstorfer	a98d9bd05f	The preauthkeys commands expect a user id instead of a username	2025-07-16 09:53:05 +02:00
Florian Preinstorfer	46c59a3fff	Fix command in bug report template	2025-07-15 21:12:32 +02:00
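
The "mapper: produce map before poll" commit message in the list above describes moving map generation out of the per-node streams and into a bounded set of workers, sized a little below the CPU count, with the poller left only to forward finished responses. A rough sketch of that idea, assuming nothing about the actual Headscale batcher, might look like the following. The names `update`, `result`, `generate`, and `runWorkers` are hypothetical and exist only for illustration.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// update is a hypothetical work item: a node that needs a new map response.
type update struct{ nodeID int }

// result is a hypothetical finished response, ready to hand to the poller.
type result struct {
	nodeID  int
	payload string
}

// generate stands in for the expensive map-response generation.
func generate(u update) result {
	return result{nodeID: u.nodeID, payload: fmt.Sprintf("map for node %d", u.nodeID)}
}

// runWorkers processes all updates with a fixed number of goroutines, so the
// amount of parallel work is bounded by workers rather than by node count.
func runWorkers(workers int, updates []update) []result {
	in := make(chan update)
	// Buffered so workers never block on a slow consumer in this sketch.
	out := make(chan result, len(updates))

	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for u := range in {
				out <- generate(u)
			}
		}()
	}

	for _, u := range updates {
		in <- u
	}
	close(in)
	wg.Wait()
	close(out)

	results := make([]result, 0, len(updates))
	for r := range out {
		results = append(results, r)
	}
	return results
}

func main() {
	updates := make([]update, 30)
	for i := range updates {
		updates[i] = update{nodeID: i + 1}
	}

	// Assumption: "a bit less than number of CPU cores", per the commit message.
	workers := runtime.NumCPU() - 1
	if workers < 1 {
		workers = 1
	}

	results := runWorkers(workers, updates)
	fmt.Println(len(results))
}
```

With 30 pending updates and, say, 7 workers, at most 7 responses are generated at once. Result order is not guaranteed, so a real consumer would route each result by its node ID.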