Support for SSH check mode in ACLs #679

Open
opened 2025-12-29 02:21:59 +01:00 by adam · 10 comments

Originally created by @almereyda on GitHub (Mar 27, 2024).

Why

Tailscale upstream supports SSH check mode.

We would like to use it with Headscale, too.

Description

When defining a Tailscale SSH ACL policy with the action set to check, an additional authentication against the OIDC endpoint is required. A successful check grants access for an optional checkPeriod, which defaults to 12 hours and can also be set to always.
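
For illustration, here is a minimal Go sketch of what such a rule looks like when decoded from a policy file. The `sshRule` and `policy` types are hypothetical stand-ins written for this example, not Headscale's actual policy types, and the `src`, `dst`, and `users` values are only sample values.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// sshRule is a hypothetical, trimmed-down mirror of one entry in the "ssh"
// section of a policy file. It is NOT Headscale's real type.
type sshRule struct {
	Action      string   `json:"action"`
	Src         []string `json:"src"`
	Dst         []string `json:"dst"`
	Users       []string `json:"users"`
	CheckPeriod string   `json:"checkPeriod,omitempty"`
}

// policy holds only the part of the policy file this sketch cares about.
type policy struct {
	SSH []sshRule `json:"ssh"`
}

// examplePolicy is sample data: one rule in check mode with an explicit
// 12 hour check period.
const examplePolicy = `{
  "ssh": [
    {
      "action": "check",
      "src": ["autogroup:member"],
      "dst": ["autogroup:self"],
      "users": ["autogroup:nonroot"],
      "checkPeriod": "12h"
    }
  ]
}`

// parsePolicy decodes the JSON policy text into the sketch types.
func parsePolicy(raw string) (policy, error) {
	var p policy
	err := json.Unmarshal([]byte(raw), &p)
	return p, err
}

func main() {
	p, err := parsePolicy(examplePolicy)
	if err != nil {
		panic(err)
	}
	for _, r := range p.SSH {
		fmt.Printf("action=%s checkPeriod=%s users=%v\n", r.Action, r.CheckPeriod, r.Users)
	}
}
```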

References

  • Configure Tailscale SSH with check · Tailscale SSH · Tailscale Docs: https://tailscale.com/kb/1193/tailscale-ssh#configure-tailscale-ssh-with-check-mode

This is related to, but not identical to:

  • #1303, where a user tried to use check mode
  • #1623, where upstream reports on the availability of a web check mode in the Tailscale client, currently restricted to the upstream control plane.
adam added the enhancement, no-stale-bot, policy 📝 labels 2025-12-29 02:21:59 +01:00

@github-actions[bot] commented on GitHub (Aug 7, 2024):

This issue is stale because it has been open for 90 days with no activity.

@almereyda commented on GitHub (Aug 7, 2024):

No stale activity.

@dparv commented on GitHub (Aug 22, 2024):

+1

@lordwelch commented on GitHub (Mar 16, 2025):

I tested check mode and it generally works. However, when checkPeriod is set to "always", the node goes offline shortly afterwards, and I have to either turn off check mode or set checkPeriod to an actual duration. Tested on v0.25.1.

@lordwelch commented on GitHub (Mar 16, 2025):

Also, it appears that check mode is currently implemented differently in headscale. When testing with tailscale proper, a checkPeriod of 1 minute triggers a re-authentication before allowing me in, and for the next minute new connections do not require a new authentication. Headscale doesn't make me re-authenticate, but it kills the session after 1 minute regardless of what I do. Tested on v0.25.1.

@Codelica commented on GitHub (May 13, 2025):

I took a look at this also, as we'd gladly pay a bounty for SSH check mode support. (Sadly we have no Go devs).

Looking at the policy code for both V1 and V2, they both seem to use the checkPeriod to set SessionDuration in the response:

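V1:
https://github.com/juanfont/headscale/blob/43943aeee9134cf6a76a380ebb1bc3ac7803d830/hscontrol/policy/v1/acls.go#L373
V2:
https://github.com/juanfont/headscale/blob/43943aeee9134cf6a76a380ebb1bc3ac7803d830/hscontrol/policy/v2/filter.go#L92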

According to the Tailscale code, that field seems to be "how long the session can stay open before being forcefully terminated":

https://github.com/tailscale/tailscale/blob/fccba5a2f1a5e5dbde9e2fa57e33651b8fd047eb/tailcfg/tailcfg.go#L2692

So it seems this was implemented incorrectly.

My guess is that a conditional combination of Message and HoldAndDelegate is what would be used to block/wait on verification if the user hasn't authenticated within the check period:

https://github.com/tailscale/tailscale/blob/fccba5a2f1a5e5dbde9e2fa57e33651b8fd047eb/tailcfg/tailcfg.go#L2700

I know the focus is currently on the policy v2 re-write beta, but perhaps it could be reviewed as a bug? :) If not, maybe it's a candidate for the next version?

Thanks for reading...

@kradalby commented on GitHub (May 14, 2025):

@Codelica I noticed as well that there was something funky with the implementation, but I tried to keep the 0.26 release limited to the rewrite so that feature creep would not delay it. I'm going to release it now, and then we start a new cycle, in which I will add this as a bug that needs to be fixed.

> gladly pay a bounty for SSH check mode support

As people know, I am paid, so that is not needed, but donating to Juan/the project is always welcome to cover costs for anything from the domain to our dream of setting up our own build servers.

@Codelica commented on GitHub (May 14, 2025):

Excellent! Count us in for testing. 👍 Will get migrated to 0.26.

@kradalby commented on GitHub (May 19, 2025):

I spent some exploratory time on this last Friday. It is quite a lot more involved than anticipated, but not impossible. I think it should be possible to implement in this release. I am not sure yet how it can be tested, which is a concern for ensuring it does not break over time.

@Codelica commented on GitHub (May 19, 2025):

I can definitely appreciate that (testing difficulty) with time, OIDC and SSH involved. It looks like checkPeriod has a minimum of 1 minute and a maximum of 168 hours (1 week), along with an "always" option to always check. But even a test scenario using a short duration, delays, etc. could get complicated depending on how far it's taken (re-auth declines/failures, etc.).

Personally I think it's a key feature for Tailscale SSH access though (especially for connections left up 24x7), and would love to see it working. I can promise once it's functional we'll definitely be doing a "real world" test before deploying any new version to production. If you'd consider adding a final RC release (or somehow indicating a final beta), we could definitely promise to test those before new version releases. I realize that's not as good as an automated test that can be continually run, but hopefully better than nothing. If the logic is isolated and written defensively when possible (fail securely, etc), maybe that and hand testing is enough to start? :)

Reference: starred/headscale#679