The Batcher's connected field (*xsync.Map[types.NodeID, *time.Time])
encoded three states via pointer semantics:

- nil value: node is connected
- non-nil time: node disconnected at that timestamp
- key missing: node was never seen

This was error-prone (nil meaning 'connected' inverts Go idioms),
redundant with b.nodes + hasActiveConnections(), and required keeping
two parallel maps in sync. It also contained a bug in RemoveNode where
new(time.Now()) was used instead of &now, producing a zero time.

Replace the separate connected map with a disconnectedAt field on
multiChannelNodeConn (atomic.Pointer[time.Time]), tracked directly
on the object that already manages the node's connections.

Changes:

- Add disconnectedAt field and helpers (markConnected, markDisconnected,
  isConnected, offlineDuration) to multiChannelNodeConn
- Remove the connected field from Batcher
- Simplify IsConnected from two map lookups to one
- Simplify ConnectedMap and Debug from two-map iteration to one
- Rewrite cleanupOfflineNodes to scan b.nodes directly
- Remove the markDisconnectedIfNoConns helper
- Update all tests and benchmarks
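
A minimal sketch of what the disconnectedAt field and its four helpers might
look like. The helper names come from the list above; their bodies, and the
convention that a nil pointer means "connected", are assumptions made for
this sketch. The struct is reduced to the one relevant field.

```go
package main

import (
	"fmt"
	"sync/atomic"
	"time"
)

// multiChannelNodeConn is reduced to the field relevant here; the real
// type also manages the node's connections.
type multiChannelNodeConn struct {
	// disconnectedAt is nil while the node is connected and points at
	// the disconnect time otherwise (assumed convention).
	disconnectedAt atomic.Pointer[time.Time]
}

// markConnected clears the disconnect timestamp.
func (mc *multiChannelNodeConn) markConnected() {
	mc.disconnectedAt.Store(nil)
}

// markDisconnected records the current time. Taking the address of a
// local variable guarantees the stored pointer refers to a real time.
func (mc *multiChannelNodeConn) markDisconnected() {
	now := time.Now()
	mc.disconnectedAt.Store(&now)
}

// isConnected reports whether no disconnect timestamp is set.
func (mc *multiChannelNodeConn) isConnected() bool {
	return mc.disconnectedAt.Load() == nil
}

// offlineDuration returns how long the node has been disconnected,
// or zero if it is connected.
func (mc *multiChannelNodeConn) offlineDuration() time.Duration {
	t := mc.disconnectedAt.Load()
	if t == nil {
		return 0
	}
	return time.Since(*t)
}

func main() {
	var mc multiChannelNodeConn
	mc.markConnected()
	fmt.Println(mc.isConnected()) // true
	mc.markDisconnected()
	fmt.Println(mc.isConnected()) // false
	mc.markConnected()
	fmt.Println(mc.isConnected()) // true
}
```

Because the state lives on the connection object itself, a single lookup in
b.nodes is enough to answer IsConnected, and there is no second map to keep
in sync.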

Fixes #3141

- TestBatcher_CloseBeforeStart_DoesNotHang: verifies Close() before
  Start() returns promptly now that done is initialized in NewBatcher.
- TestBatcher_QueueWorkAfterClose_DoesNotHang: verifies queueWork
  returns via the done channel after Close(), even without Start().
- TestIsConnected_FalseAfterAddNodeFailure: verifies IsConnected
  returns false after AddNode fails and removes the last connection.
- TestRemoveConnectionAtIndex_NilsTrailingSlot: verifies the backing
  array slot is nil-ed after removal to avoid retaining pointers.
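
The trailing-slot behaviour checked by the last test can be sketched as
follows. The conn type and the removeAt function are hypothetical stand-ins;
the real removal code is not shown in this message.

```go
package main

import "fmt"

// conn stands in for a connection entry (hypothetical type).
type conn struct{ id string }

// removeAt removes the element at index i while preserving order, then
// nils the vacated trailing slot so the backing array no longer holds a
// pointer that would keep the old entry reachable.
func removeAt(conns []*conn, i int) []*conn {
	copy(conns[i:], conns[i+1:])
	conns[len(conns)-1] = nil
	return conns[:len(conns)-1]
}

func main() {
	conns := []*conn{{"a"}, {"b"}, {"c"}}
	conns = removeAt(conns, 1)
	fmt.Println(len(conns), conns[0].id, conns[1].id) // 2 a c

	// Reslicing up to the old length exposes the trailing slot,
	// which is now nil rather than a stale pointer.
	fmt.Println(conns[:3][2] == nil) // true
}
```

Without the nil assignment, the slot past the new length would still point
at the last entry, and the garbage collector could not reclaim it while the
backing array is alive. Since Go 1.22, slices.Delete zeroes the vacated
elements as well.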

Updates #2545

Add four unit tests guarding fixes introduced in recent commits:
- TestConnectionEntry_SendFastPath_TimerStopped: verifies the
  time.NewTimer fix (H1) does not leak goroutines after many
  fast-path sends on a buffered channel.
- TestBatcher_CloseWaitsForWorkers: verifies Close() blocks until all
  worker goroutines exit (H3), preventing sends on torn-down channels.
- TestBatcher_CloseThenStartIsNoop: verifies the one-shot lifecycle
  contract; Start() after Close() must not spawn new goroutines.
- TestBatcher_CloseStopsTicker: verifies Close() stops the internal
  ticker to prevent resource leaks.
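
The lifecycle contract these tests guard can be sketched roughly as below.
This is a simplified model, not the batcher's code: the batcher type, its
fields, and the mutex-guarded started/closed flags are all assumptions made
for illustration.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// batcher is a hypothetical, reduced model of a one-shot worker pool.
type batcher struct {
	done chan struct{} // closed by Close; created in the constructor
	work chan int
	tick *time.Ticker
	wg   sync.WaitGroup

	mu      sync.Mutex
	started bool
	closed  bool
}

// newBatcher initializes done up front so Close works before Start.
func newBatcher() *batcher {
	return &batcher{done: make(chan struct{}), work: make(chan int)}
}

// Start spawns workers once; it is a no-op after Close or a prior Start.
func (b *batcher) Start(workers int) {
	b.mu.Lock()
	defer b.mu.Unlock()
	if b.started || b.closed {
		return
	}
	b.started = true
	b.tick = time.NewTicker(time.Second)
	for i := 0; i < workers; i++ {
		b.wg.Add(1)
		go b.worker()
	}
}

func (b *batcher) worker() {
	defer b.wg.Done()
	for {
		select {
		case <-b.done:
			return
		case <-b.work: // process one item
		case <-b.tick.C: // periodic flush
		}
	}
}

// Close signals shutdown, stops the ticker, and waits for all workers.
func (b *batcher) Close() {
	b.mu.Lock()
	if b.closed {
		b.mu.Unlock()
		return
	}
	b.closed = true
	close(b.done)
	if b.tick != nil {
		b.tick.Stop()
	}
	b.mu.Unlock()
	b.wg.Wait()
}

// queueWork returns false instead of blocking once done is closed.
func (b *batcher) queueWork(w int) bool {
	select {
	case b.work <- w:
		return true
	case <-b.done:
		return false
	}
}

func main() {
	b := newBatcher()
	b.Close()                   // before Start: returns promptly
	b.Start(4)                  // no-op after Close
	fmt.Println(b.queueWork(1)) // false
}
```

Waiting on the WaitGroup inside Close is what lets callers tear down
channels afterwards without racing a worker that is still sending.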

Updates #2545

Add comprehensive unit tests for the LockFreeBatcher covering
AddNode/RemoveNode lifecycle, addToBatch routing (broadcast, targeted,
full update), processBatchedChanges deduplication, cleanup of offline
nodes, close/shutdown behavior, IsConnected state tracking, and
connected map consistency.

Add benchmarks for connection entry send, multi-channel send and
broadcast, peer diff computation, sentPeers updates, addToBatch at
various scales (10/100/1000 nodes), processBatchedChanges, broadcast
delivery, IsConnected lookups, connected map enumeration, connection
churn, and concurrent send+churn scenarios.

Widen setupBatcherWithTestData to accept testing.TB so benchmarks can
reuse the same database-backed test setup as unit tests.
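
The testing.TB widening can be illustrated with a small sketch. The
setupTestData helper below is a hypothetical stand-in for
setupBatcherWithTestData and builds in-memory data rather than a database.
testing.Benchmark drives it here only so the sketch runs as a plain program;
in a real _test.go file the same helper would be called from both TestXxx
and BenchmarkXxx functions.

```go
package main

import (
	"fmt"
	"testing"
)

// setupTestData accepts testing.TB, the interface implemented by both
// *testing.T and *testing.B, so tests and benchmarks can share it.
// Hypothetical stand-in for setupBatcherWithTestData.
func setupTestData(tb testing.TB, nodes int) []int {
	tb.Helper()
	if nodes <= 0 {
		tb.Fatalf("nodes must be positive, got %d", nodes)
	}
	ids := make([]int, nodes)
	for i := range ids {
		ids[i] = i + 1
	}
	return ids
}

func main() {
	// Run the same workload at the scales mentioned above.
	for _, n := range []int{10, 100, 1000} {
		res := testing.Benchmark(func(b *testing.B) {
			ids := setupTestData(b, n) // *testing.B satisfies testing.TB
			b.ResetTimer()             // exclude setup from the measurement
			for i := 0; i < b.N; i++ {
				sum := 0
				for _, id := range ids {
					sum += id
				}
				_ = sum
			}
		})
		fmt.Printf("%d nodes: %s\n", n, res)
	}
}
```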