mapper/batcher: replace connected map with per-node disconnectedAt

The Batcher's connected field (*xsync.Map[types.NodeID, *time.Time])
encoded three states via pointer semantics:

  - nil value:    node is connected
  - non-nil time: node disconnected at that timestamp
  - key missing:  node was never seen

This was error-prone (nil meaning 'connected' inverts Go idioms),
redundant with b.nodes + hasActiveConnections(), and required keeping
two parallel maps in sync. It also contained a bug in RemoveNode where
new(time.Now()) was used instead of &now, producing a zero time.

Replace the separate connected map with a disconnectedAt field on
multiChannelNodeConn (atomic.Pointer[time.Time]), tracked directly
on the object that already manages the node's connections.
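
As a rough illustration of the approach above, the following is a minimal
sketch, not the actual implementation. nodeConn is a hypothetical stand-in
for multiChannelNodeConn, and it assumes nil still means "connected" while a
non-nil pointer holds the disconnect time. The helper names come from the
change list; their bodies are guesses.

```go
package main

import (
	"fmt"
	"sync/atomic"
	"time"
)

// nodeConn is a hypothetical stand-in for multiChannelNodeConn.
type nodeConn struct {
	// Assumption: nil means connected, non-nil is the disconnect time.
	disconnectedAt atomic.Pointer[time.Time]
}

// markConnected clears the disconnect timestamp.
func (nc *nodeConn) markConnected() { nc.disconnectedAt.Store(nil) }

// markDisconnected records the current time. It takes the address of a
// local variable so the stored pointer refers to a real timestamp.
func (nc *nodeConn) markDisconnected() {
	now := time.Now()
	nc.disconnectedAt.Store(&now)
}

// isConnected reports whether no disconnect timestamp is set.
func (nc *nodeConn) isConnected() bool { return nc.disconnectedAt.Load() == nil }

// offlineDuration returns how long the node has been disconnected,
// or zero if it is currently considered connected.
func (nc *nodeConn) offlineDuration() time.Duration {
	t := nc.disconnectedAt.Load()
	if t == nil {
		return 0
	}
	return time.Since(*t)
}

func main() {
	nc := &nodeConn{}
	nc.markDisconnected()
	fmt.Println(nc.isConnected(), nc.offlineDuration() >= 0)
	nc.markConnected()
	fmt.Println(nc.isConnected(), nc.offlineDuration())
}
```

Keeping the timestamp on the connection object means there is no second map
whose entries could drift out of step with b.nodes.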

Changes:
  - Add disconnectedAt field and helpers (markConnected, markDisconnected,
    isConnected, offlineDuration) to multiChannelNodeConn
  - Remove the connected field from Batcher
  - Simplify IsConnected from two map lookups to one
  - Simplify ConnectedMap and Debug from two-map iteration to one
  - Rewrite cleanupOfflineNodes to scan b.nodes directly
  - Remove the markDisconnectedIfNoConns helper
  - Update all tests and benchmarks

Fixes #3141
Kristoffer Dalby
2026-03-14 14:06:52 +00:00
parent 60317064fd
commit 87b8507ac9
6 changed files with 148 additions and 191 deletions


@@ -1121,9 +1121,9 @@ func TestBatcher_QueueWorkAfterClose_DoesNotHang(t *testing.T) {
 }
 // TestIsConnected_FalseAfterAddNodeFailure is a regression guard for M3.
-// Before the fix, AddNode error paths removed the connection but left
-// b.connected with its previous value (nil = connected). IsConnected
-// would return true for a node with zero active connections.
+// Before the fix, AddNode error paths removed the connection but did not
+// mark the node as disconnected. IsConnected would return true for a
+// node with zero active connections.
 func TestIsConnected_FalseAfterAddNodeFailure(t *testing.T) {
 	b := NewBatcher(50*time.Millisecond, 2, nil)
 	b.Start()
@@ -1132,12 +1132,11 @@ func TestIsConnected_FalseAfterAddNodeFailure(t *testing.T) {
 	id := types.NodeID(42)
-	// Simulate a previous session leaving the node marked as connected.
-	b.connected.Store(id, nil) // nil = connected
 	// Pre-create the node entry so AddNode reuses it, and set up a
 	// multiChannelNodeConn with no mapper so MapResponseFromChange will fail.
+	// markConnected() simulates a previous session leaving it connected.
 	nc := newMultiChannelNodeConn(id, nil)
+	nc.markConnected()
 	b.nodes.Store(id, nc)
 	ch := make(chan *tailcfg.MapResponse, 1)