Deleted node is back online #80

Closed
opened 2025-12-29 01:21:52 +01:00 by adam · 5 comments

Originally created by @nblock on GitHub (Nov 21, 2021).

Running a test network with 4 nodes on headscale v0.11.0.

On the headscale server, I tried to remove a currently active node. I want to block/delete it from the network.

List nodes:

headscale -n net1 nodes list
ID | Name     | NodeKey | Namespace | IP address | Ephemeral | Last seen           | Online
1  | laptop   | [XX]    | net1      | 100.64.0.1 | false     | 2021-11-21 13:01:42 | true  
2  | test-2   | [YY]    | net1      | 100.64.0.2 | false     | 2021-11-21 13:01:16 | true  
3  | test-1   | [ZZ]    | net1      | 100.64.0.3 | false     | 2021-11-21 13:01:10 | true  
4  | bullseye | [AA]    | net1      | 100.64.0.4 | false     | 2021-11-21 13:01:35 | true 

Delete node:

headscale -n net1 nodes delete 4
Do you want to remove the node bullseye? Yes
Node deleted

After deletion:

$ headscale -n net1 nodes list
ID | Name   | NodeKey | Namespace | IP address | Ephemeral | Last seen           | Online
1  | laptop | [XX]    | net1      | 100.64.0.1 | false     | 2021-11-21 13:04:42 | true  
2  | test-2 | [YY]    | net1      | 100.64.0.2 | false     | 2021-11-21 13:04:16 | true  
3  | test-1 | [ZZ]    | net1      | 100.64.0.3 | false     | 2021-11-21 13:05:10 | true  

A short time later, without touching any of the nodes or connections:

$ headscale -n net1 nodes list
ID | Name     | NodeKey | Namespace | IP address | Ephemeral | Last seen           | Online
1  | laptop   | [XX]    | net1      | 100.64.0.1 | false     | 2021-11-21 13:06:42 | true  
2  | test-2   | [YY]    | net1      | 100.64.0.2 | false     | 2021-11-21 13:06:16 | true  
3  | test-1   | [ZZ]    | net1      | 100.64.0.3 | false     | 2021-11-21 13:07:10 | true  
4  | bullseye | [AA]    | net1      | 100.64.0.4 | false     | 2021-11-21 13:06:37 | true 

Logs

2021-11-21T13:05:17Z ERR Cannot update machine from database error="record not found" channel=Done handler=PollNetMapStream machine=bullseye
2021-11-21T13:05:17Z WRN Ignoring request, cannot find machine with key e64efccaa9e6f4b59f77c3345a1c38ac518b46712f1086f59e96672515a4be0f handler=PollNetMap
[GIN] 2021/11/21 - 13:05:17 | 401 |      376.24µs |    11.22.33.44 | POST     "/machine/e64efccaa9e6f4b59f77c3345a1c38ac518b46712f1086f59e96672515a4be0f/map"
[GIN] 2021/11/21 - 13:05:17 | 200 |         9m19s |    11.22.33.44 | POST     "/machine/e64efccaa9e6f4b59f77c3345a1c38ac518b46712f1086f59e96672515a4be0f/map"
2021-11-21T13:05:18Z INF Client is ready to access the tailnet handler=PollNetMap machine=bullseye
2021-11-21T13:05:18Z INF Sending initial map handler=PollNetMap machine=bullseye
2021-11-21T13:05:18Z INF Notifying peers handler=PollNetMap machine=bullseye

How do I block a node from my network without touching the node itself?

Please bear with me, it's the first time I have played with headscale, so I may have missed something.

adam closed this issue 2025-12-29 01:21:53 +01:00

@kradalby commented on GitHub (Nov 22, 2021):

Hi, thanks for reporting this. I am currently working on the registration workflow and I'll set aside some time for this.

However, in a way, this is somewhat expected if you used an authkey without expiry, as the node will have that and just try to redirect.

In a PR I am working on, there will be an option to expire nodes, which will not let them connect without a valid auth again.

That said, you must ensure that the node does not have a valid auth key if you don't want it to connect again.


@nblock commented on GitHub (Nov 22, 2021):

> However, in a way, this is somewhat expected if you used an authkey without expiry, as the node will have that and just try to redirect.

So should I switch from the "regular" auth to expiring `pre_auth_keys`?

> That said, you must ensure that the node does not have a valid auth key if you don't want it to connect again.

Can I somehow do this from the server side, or do I effectively need to stop `tailscaled` on the client and purge its state?


@enoperm commented on GitHub (Jan 16, 2022):

I have encountered this issue as well; it has nothing to do with authkeys.

The issue is that `PollNetMapStream`, which implements update notifications for nodes, retains a `Machine` instance in memory. Removing a node from the database does not terminate the goroutine that serves that machine, and that function, regardless of whether the machine *is* in the database at the moment, invokes `db.Save` to update the `last_successful_update` column. Either `PollNetMapStream` needs to check for this case, or the context needs to be terminated.

Even if this issue in particular is resolved, there may still be a race: `PollNetMapStream` reads a machine, the machine gets updated elsewhere (say, the client pushes a new address or a name change, or whatever is or will be supported by the protocol), and then `PollNetMapStream` writes back the old state with a new timestamp.

I *think* the proper fix would be to force GORM to use an `UPDATE ... SET last_successful_update=?` instead of rewriting entire rows at once, but I have no idea whether it can actually do that without dropping down to plain old SQL.
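The difference between a full-row save and a single-column update can be sketched without a database. This is only an illustration, not headscale's code or GORM's API: `Store`, `SaveRow`, and `TouchLastUpdate` are hypothetical names, and the sketch assumes that a full-row save recreates a missing row, which would match the behaviour in the report above.

```go
package main

import (
	"fmt"
	"time"
)

// Machine is a hypothetical, simplified stand-in for a node row.
type Machine struct {
	ID                   int
	Name                 string
	LastSuccessfulUpdate time.Time
}

// Store is an in-memory stand-in for the machines table.
type Store struct {
	rows map[int]Machine
}

func NewStore() *Store { return &Store{rows: make(map[int]Machine)} }

// SaveRow mimics a full-row save: it writes every field of m and
// creates the row if it no longer exists (assumed upsert behaviour).
func (s *Store) SaveRow(m Machine) { s.rows[m.ID] = m }

// TouchLastUpdate mimics UPDATE ... SET last_successful_update=? WHERE id=?.
// It changes one column only and reports false when no row matched.
func (s *Store) TouchLastUpdate(id int, t time.Time) bool {
	m, ok := s.rows[id]
	if !ok {
		return false
	}
	m.LastSuccessfulUpdate = t
	s.rows[id] = m
	return true
}

func (s *Store) Delete(id int) { delete(s.rows, id) }

func (s *Store) Exists(id int) bool {
	_, ok := s.rows[id]
	return ok
}

func main() {
	now := time.Now()

	// Case 1: the poll loop keeps a copy of the row and saves all of it.
	s := NewStore()
	s.SaveRow(Machine{ID: 4, Name: "bullseye"})
	stale := s.rows[4] // copy retained by the long-running handler
	s.Delete(4)        // operator deletes the node
	stale.LastSuccessfulUpdate = now
	s.SaveRow(stale) // the deleted row is written back
	fmt.Println("full-row save, node exists again:", s.Exists(4))

	// Case 2: the poll loop only touches the timestamp column.
	s = NewStore()
	s.SaveRow(Machine{ID: 4, Name: "bullseye"})
	s.Delete(4)
	matched := s.TouchLastUpdate(4, now)
	fmt.Println("targeted update matched a row:", matched, "node exists:", s.Exists(4))
}
```

In the first case the deleted node reappears; in the second the update matches no row, so the handler can notice that the machine is gone.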


@enoperm commented on GitHub (Jan 16, 2022):

Basically, as long as the long poll connection remains active, clients cannot be forcefully dropped from the tailnet.
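One way to picture the "terminate the context" option is a registry that maps each machine to the cancel function of its active stream, so that deleting a node can also end its long poll. This is a rough sketch under stated assumptions, not how headscale is implemented: `StreamRegistry`, `Open`, `Close`, and `serve` are hypothetical names, and a real handler would derive its context from the HTTP request rather than from `context.Background()`.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// StreamRegistry tracks the cancel function of each active long-poll stream.
type StreamRegistry struct {
	mu      sync.Mutex
	cancels map[int]context.CancelFunc
}

func NewStreamRegistry() *StreamRegistry {
	return &StreamRegistry{cancels: make(map[int]context.CancelFunc)}
}

// Open registers a stream for a machine ID and returns the context the
// serving goroutine must watch. An older stream for the same ID is cancelled.
func (r *StreamRegistry) Open(id int) context.Context {
	ctx, cancel := context.WithCancel(context.Background())
	r.mu.Lock()
	defer r.mu.Unlock()
	if old, ok := r.cancels[id]; ok {
		old()
	}
	r.cancels[id] = cancel
	return ctx
}

// Close cancels the stream for a machine ID, if any, and reports
// whether one was active. A delete operation would call this.
func (r *StreamRegistry) Close(id int) bool {
	r.mu.Lock()
	defer r.mu.Unlock()
	cancel, ok := r.cancels[id]
	if !ok {
		return false
	}
	cancel()
	delete(r.cancels, id)
	return true
}

// serve stands in for the long-poll handler: it blocks until its
// context is cancelled, then signals that it has exited.
func serve(ctx context.Context, done chan<- struct{}) {
	<-ctx.Done()
	close(done)
}

func main() {
	reg := NewStreamRegistry()
	ctx := reg.Open(4)
	done := make(chan struct{})
	go serve(ctx, done)

	// Deleting node 4 would also close its stream.
	fmt.Println("stream was active:", reg.Close(4))
	<-done // the handler goroutine has returned
	fmt.Println("handler exited with:", ctx.Err())
}
```

With something along these lines, the handler stops before it can write the machine back, and the client has to start a new poll that fails the machine lookup.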


@enoperm commented on GitHub (Jan 16, 2022):

There seems to be *some* support for it, ~~but one still needs to know the underlying column names used by GORM~~: https://gorm.io/docs/update.html
