Feature Request: Config-level switch to disable node-key expiration (for long-lived IoT fleets) #1062

Closed
opened 2025-12-29 02:28:02 +01:00 by adam · 4 comments

Originally created by @NorbNorb on GitHub (Jul 11, 2025).

Use case

We operate hundreds of battery-powered IoT devices that may remain offline for months or even years (e.g. in storage, remote sites without WAN, manual shutdown).

Each unit is provisioned in-factory with a reusable pre-auth key (PAK) and then shipped; once in the field we have no physical or remote means to re-run authentication.

When the node finally comes back online we must be able to SSH immediately for service, recovery or firmware updates.

Description

Current situation
Headscale tracks node.expiry. Once the expiry passes, the node stops forwarding traffic.

The only workaround is to run headscale nodes expire --reset … (or set expiry = NULL in SQLite) per node, either after enrollment or post-expiry.

Automating that reset via cron/API is fragile: a missed run or a future schema change could strand thousands of devices.

Requested behavior

Tailscale has a similar feature: Disabling key expiry (https://tailscale.com/kb/1028/key-expiry#disabling-key-expiry)

Add a single configuration flag - global or tag-scoped - that disables key expiry entirely for matching nodes.

# config.yaml
...
node_key_expiration:
  disabled: true            # global switch
  # or more granular
  exempt_tags:
    - tag:battery           # never expire devices with this tag

If disabled: true, Headscale should skip scheduling expiry for every new node and ignore expiry checks during map generation.

If exempt_tags is supplied, only nodes carrying one of those tags are exempt; all others follow the normal 180-day rotation.

Impact

Zero behavior change for existing users: the feature defaults to disabled = false.

Simplifies large-scale IoT deployments by removing a hidden operational pitfall.

This would keep security knobs (manual nodes expire …, ACL tags) intact for cases where an operator really wants to revoke a stale device.

Thank you for considering this; it would have a huge impact for us!

Contribution

  • [x] I can write the design doc for this feature
  • [ ] I can contribute this feature

How can it be implemented?

No response

adam added the enhancement and stale labels 2025-12-29 02:28:02 +01:00
adam closed this issue 2025-12-29 02:28:02 +01:00

@Janhouse commented on GitHub (Jul 27, 2025):

I am puzzled how this has not been implemented from day 1. This is such a crucial feature for almost any tailnet. 🤔


@github-actions[bot] commented on GitHub (Oct 26, 2025):

This issue is stale because it has been open for 90 days with no activity.


@github-actions[bot] commented on GitHub (Nov 2, 2025):

This issue was closed because it has been inactive for 14 days since being marked as stale.


@marcotuna commented on GitHub (Nov 21, 2025):

I’d like to raise a concern about environments where key rotation or renewal isn’t feasible.

In many industrial or remote operational settings, devices are hard to access and maintenance windows are extremely limited. If a node’s key expires, it may become permanently unreachable, since no one on-site can intervene.

Having an option to extend or disable key expiry for specific nodes, or to allow long-lived keys in controlled scenarios, would make Headscale more usable in industrial and isolated environments. It would also help maintain compatibility with Tailscale’s behaviour, which supports extended or disabled key expiry in certain situations, as documented here: https://tailscale.com/kb/1028/key-expiry

I’m also willing to contribute to the implementation of this feature, provided the approach is accepted by the project maintainers.

Reference: starred/headscale#1062