When already-expired node is set to "Never Expire" (expiry is NULL), it does not go back to logged-in status. #681

Closed
opened 2025-12-29 02:21:59 +01:00 by adam · 5 comments

Originally created by @benmehlman on GitHub (Mar 28, 2024).

Bug description

Using Tailscale's control server: when a node expires, it remains connected to the control server, although it no longer passes tailnet traffic. The node can be restored to operation by selecting "Disable key expiry" in the Tailscale admin UI.
It will start passing traffic again, without having to re-authenticate or take any other action on the node machine itself.

This does not work on headscale.

Environment

  • OS: Debian 12.4
  • Headscale version: v0.23.0-alpha5
  • Tailscale version: 1.62.0
  • Headscale is behind a (reverse) proxy
    Yes, nginx. I have to use it in order to host the UI. I've been in the Discord and nobody there really knows much about expiration behavior; I seem to be the only person active there who is really interested in it right now.

The reverse proxy seems to be working fine as everything else related to node-server communication is working perfectly and it's been very stable.

  • Headscale runs in a container: No

To Reproduce

These steps assume OIDC is in use...

  1. In config.yaml, set the OIDC expiry to a short duration (e.g. "5m") so that expiration can be easily observed, then restart the service.
  2. Run "tailscale up" on the node with the appropriate parameters to connect to the headscale instance.
  3. Complete OIDC login.
  4. Observe that the node is connected to the tailnet as normal.

On the headscale server:

  5. Wait for the node to expire.
  6. Observe that headscale nodes list indicates that the node is connected but expired, as expected.
  7. Test the node connectivity to confirm that it has stopped passing traffic as expected.
  8. Set the node to "Disable key expiry" by using sqlite3 to execute: UPDATE node SET expiry = NULL WHERE id = the_node_id;
  9. Observe that headscale nodes list indicates that the node is "online" and expired is "no".
  10. Observe that, even after some time is allowed for polling (if necessary), the node does not resume passing traffic, and tailscale status on the node remains "Logged out".
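
The direct database write in the steps above can be tried against a scratch database to see what it does and does not change. This is a minimal Python sketch using an in-memory SQLite table. Only the table name node and the columns id and expiry come from this report; the hostname column and the timestamps are invented for illustration, and the real headscale schema has more columns. It shows the row no longer looking expired once expiry is NULL, which matches what headscale nodes list reports, while nothing in the write itself reaches the client.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

# Simplified stand-in for the node table (assumption: the real schema differs).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE node (id INTEGER PRIMARY KEY, hostname TEXT, expiry TEXT)")

now = datetime.now(timezone.utc)
expired_at = (now - timedelta(minutes=5)).isoformat()
conn.execute(
    "INSERT INTO node (id, hostname, expiry) VALUES (?, ?, ?)",
    (1, "laptop", expired_at),
)


def is_expired(conn, node_id, now):
    """Treat a NULL expiry as 'never expires', as the report assumes."""
    (expiry,) = conn.execute(
        "SELECT expiry FROM node WHERE id = ?", (node_id,)
    ).fetchone()
    return expiry is not None and datetime.fromisoformat(expiry) <= now


before = is_expired(conn, 1, now)

# The same statement as in the report, with the node id passed as a parameter.
conn.execute("UPDATE node SET expiry = NULL WHERE id = ?", (1,))
conn.commit()

after = is_expired(conn, 1, now)
print(before, after)  # True False
```

The stored state flips from expired to not expired, but the write only touches the row. Any client that is already connected learns nothing from it.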

Logs and attachments

netmap_recover_after_expiry.json (https://github.com/juanfont/headscale/files/14792940/netmap_recover_after_expiry.json)

adam added the stale and bug labels 2025-12-29 02:21:59 +01:00
adam closed this issue 2025-12-29 02:21:59 +01:00

@kradalby commented on GitHub (May 1, 2024):

So this will not really work, since changing the database does not trigger any of the mechanisms that update the clients. I would think that if you change the database and restart headscale it might work.

I think what we essentially need is a new command, set-expiry or something similar, which sets a new expiry and updates the nodes appropriately.

I'm going to remove this from 0.23.0. It is important, but it is not a regression and should be tackled after.
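
The difference between editing the database by hand and going through a command that also updates clients can be illustrated with a toy model. This Python sketch is not headscale's code: the NodeStore class, its subscriber list, and the set_expiry method are all hypothetical, and stand in for whatever mechanism actually pushes changes to connected nodes.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class NodeStore:
    """Toy in-memory stand-in for the control server's node state (hypothetical)."""

    expiry: Dict[int, Optional[str]] = field(default_factory=dict)
    subscribers: List[Callable[[int, Optional[str]], None]] = field(default_factory=list)

    def subscribe(self, callback: Callable[[int, Optional[str]], None]) -> None:
        self.subscribers.append(callback)

    def write_directly(self, node_id: int, expiry: Optional[str]) -> None:
        # Like a manual database edit: the state changes, nobody is told.
        self.expiry[node_id] = expiry

    def set_expiry(self, node_id: int, expiry: Optional[str]) -> None:
        # Hypothetical command path: persist first, then notify listeners.
        self.expiry[node_id] = expiry
        for notify in self.subscribers:
            notify(node_id, expiry)


store = NodeStore(expiry={1: "2024-03-28T00:00:00Z"})
received = []
store.subscribe(lambda node_id, expiry: received.append((node_id, expiry)))

store.write_directly(1, None)
updates_after_direct_write = len(received)

store.set_expiry(1, None)
updates_after_command = len(received)

print(updates_after_direct_write, updates_after_command)  # 0 1
```

Both calls leave the same value in the store. Only the second produces an update that a connected client could act on.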


@benmehlman commented on GitHub (May 7, 2024):

I did try restarting headscale, and it didn't cause the node to come back online, so there is some other detail that is not quite right.

May I suggest that, rather than adding a separate API for set-expiry, you implement PATCH, so that as new columns are added in the future it would be easy to expose them through the API without adding more endpoints?
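
As a rough illustration of the PATCH idea, here is a Python sketch of a partial update applied to a node record. The field names, the PATCHABLE_FIELDS allow-list, and the apply_patch helper are hypothetical and are not part of headscale's API. In this sketch, a key that is absent from the patch is left alone, while an explicit null (None) clears the expiry.

```python
from typing import Any, Dict

# Hypothetical allow-list of fields a client may change.
PATCHABLE_FIELDS = {"expiry", "given_name"}


def apply_patch(node: Dict[str, Any], patch: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of node with only the patched fields replaced."""
    unknown = set(patch) - PATCHABLE_FIELDS
    if unknown:
        raise ValueError(f"fields not patchable: {sorted(unknown)}")
    return {**node, **patch}


node = {"id": 1, "given_name": "laptop", "expiry": "2024-03-28T00:00:00Z"}

# Clear the expiry and leave every other field untouched.
patched = apply_patch(node, {"expiry": None})
print(patched)
```

Exposing a new column later would mean adding one name to the allow-list instead of adding a new endpoint.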

Also I suggest adding a separate boolean column for "never_expire". This removes the ambiguity when a node which has never authenticated has an expiry = null.
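
The ambiguity can be shown with a scratch SQLite table. In this Python sketch, the never_expire column is the proposed addition, not something that exists in headscale, and the table is reduced to the columns needed for the example. Two nodes both have a NULL expiry, and only the extra column tells them apart.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE node ("
    " id INTEGER PRIMARY KEY,"
    " expiry TEXT,"
    " never_expire INTEGER NOT NULL DEFAULT 0)"  # proposed column (hypothetical)
)
conn.executemany(
    "INSERT INTO node (id, expiry, never_expire) VALUES (?, ?, ?)",
    [
        (1, None, 0),  # has never authenticated, so no expiry was ever set
        (2, None, 1),  # expiry deliberately disabled
    ],
)

# By expiry alone, both nodes look the same.
null_expiry_ids = [
    row[0] for row in conn.execute("SELECT id FROM node WHERE expiry IS NULL ORDER BY id")
]

# The explicit flag singles out the node that should never expire.
never_expire_ids = [
    row[0] for row in conn.execute("SELECT id FROM node WHERE never_expire = 1 ORDER BY id")
]

print(null_expiry_ids, never_expire_ids)  # [1, 2] [2]
```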


@github-actions[bot] commented on GitHub (Aug 6, 2024):

This issue is stale because it has been open for 90 days with no activity.


@github-actions[bot] commented on GitHub (Aug 13, 2024):

This issue was closed because it has been inactive for 14 days since being marked as stale.


@HarukaMa commented on GitHub (Aug 13, 2024):

It's only been 7 days, and I think this is still a valid issue (I might run into this in a couple months to be exact).

Reference: starred/headscale#681