[Bug] v0.23.0-beta1 breaks built-in DERP #746

Closed
opened 2025-12-29 02:23:10 +01:00 by adam · 17 comments

Originally created by @christian-heusel on GitHub (Jul 23, 2024).

Is this a support request?

  • This is not a support request

Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

$ tailscale status
100.64.0.1      pioneer              chris        linux   -
100.64.0.2      dj-magic-laserbrain  chris        linux   -
100.64.0.5      joeryzen             chris        linux   offline
100.64.0.4      meterpeter           chris        linux   -
100.64.0.6      scotty-the-fifth     chris        linux   idle, tx 4884 rx 0

# Health check:
#     - Tailscale could not connect to the 'Headscale Embedded DERP' relay server. Your Internet connection might be down, or the server might be temporarily unavailable.
#     - Tailscale could not connect to any relay server. Check your Internet connection.

Expected Behavior

The builtin DERP keeps working after the update. I have had this setup configured and in use for a long time now.

Steps To Reproduce

  1. update headscale to version v0.23.0-beta1
  2. observe that the builtin DERP stops working
  3. revert back to v0.23.0-alpha12
  4. observe that the DERP works again

I hope that I did not miss anything in the changelogs, but as far as I can tell, no config changes etc. were required to keep this working between the two relevant versions.
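
For context, a minimal hypothetical compose snippet for switching between the two versions under test (the headscale/headscale image name and volume paths are assumptions, adjust to your deployment):

services:
  headscale:
    # Swap this tag between v0.23.0-beta1 and v0.23.0-alpha12 to compare:
    image: headscale/headscale:v0.23.0-beta1
    volumes:
      - ./config:/etc/headscale
      - ./data:/var/lib/headscale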

Environment

- OS: Debian GNU/Linux trixie/sid 
- Headscale version: v0.23.0-beta1
- Tailscale version: 1.70.0

Runtime environment

  • Headscale is behind a (reverse) proxy
  • Headscale runs in a container

Although both of the above apply, the DERP server itself is publicly accessible:

ports:
  - 0.0.0.0:3478:3478

Anything else?

The startup log claims that I do not have any DERPs configured:

headscale  | 2024-07-23T00:58:26Z INF Opening database database=sqlite3 path=/var/lib/headscale/db.sqlite
headscale  | 2024-07-23T00:58:26Z WRN DERP map is empty, not a single DERP map datasource was loaded correctly or contained a region
headscale  | 2024-07-23T00:58:26Z INF home/runner/work/headscale/headscale/hscontrol/derp/server/derp_server.go:103 > DERP region: {RegionID:999 RegionCode:christian-derp RegionName:Headscale Embedded DERP Latitude:0 Longitude:0 Avoid:false Nodes:[0xc00034b7a0]}
headscale  | 2024-07-23T00:58:26Z INF home/runner/work/headscale/headscale/hscontrol/derp/server/derp_server.go:104 > DERP Nodes[0]: &{Name:999 RegionID:999 HostName:vpn.heusel.eu CertName: IPv4: IPv6: STUNPort:3478 STUNOnly:false DERPPort:443 InsecureForTests:false STUNTestIP: CanPort80:false}
headscale  | 2024-07-23T00:58:26Z INF STUN server started at [::]:3478
headscale  | 2024-07-23T00:58:26Z INF Setting up a DERPMap update worker frequency=86400000
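
(For reference: frequency=86400000 is presumably milliseconds, i.e. 24 hours, which matches the update_frequency: 24h setting in the config below.)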

And yet this is my DERP config (snippet), which used to work with the previous versions:

# DERP is a relay system that Tailscale uses when a direct
# connection cannot be established.
# https://tailscale.com/blog/how-tailscale-works/#encrypted-tcp-relays-derp
#
# headscale needs a list of DERP servers that can be presented
# to the clients.
derp:
  server:
    # If enabled, runs the embedded DERP server and merges it into the rest of the DERP config
    # The Headscale server_url defined above MUST be using https, DERP requires TLS to be in place
    enabled: true

    # Region ID to use for the embedded DERP server.
    # The local DERP prevails if the region ID collides with other region ID coming from
    # the regular DERP config.
    region_id: 999

    # Region code and name are displayed in the Tailscale UI to identify a DERP region
    region_code: "christian-derp"
    region_name: "Headscale Embedded DERP"

    # Listens over UDP at the configured address for STUN connections - to help with NAT traversal.
    # When the embedded DERP server is enabled stun_listen_addr MUST be defined.
    #
    # For more details on how this works, check this great article: https://tailscale.com/blog/how-tailscale-works/
    stun_listen_addr: "0.0.0.0:3478"

    private_key_path: /etc/headscale/derp_server_private.key

  # List of externally available DERP maps encoded in JSON
  # urls:
  #   - https://controlplane.tailscale.com/derpmap/default

  # Locally available DERP map files encoded in YAML
  #
  # This option is mostly interesting for people hosting
  # their own DERP servers:
  # https://tailscale.com/kb/1118/custom-derp-servers/
  #
  # paths:
  #   - /etc/headscale/derp-example.yaml
  paths: []

  # If enabled, a worker will be set up to periodically
  # refresh the given sources and update the derpmap.
  auto_update_enabled: true

  # How often should we check for DERP updates?
  update_frequency: 24h
adam added the bug and well described ❤️ labels 2025-12-29 02:23:10 +01:00
adam closed this issue 2025-12-29 02:23:10 +01:00

@JohanVlugt commented on GitHub (Jul 24, 2024):

I think you forgot to add /udp in the docker compose.

This new beta update works for me without changing the setup.

    ports:
      - 3478:3478/udp
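
Note that Docker publishes ports as TCP unless /udp is specified, so a plain 3478:3478 mapping forwards only TCP, while STUN is UDP-only. As a quick sanity check (a suggestion, assuming the tailscale CLI is available on a client), tailscale netcheck should report a latency for the embedded region 999 once the UDP mapping is in place.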

@christian-heusel commented on GitHub (Jul 25, 2024):

Adding in the /udp did indeed solve the issue, but why did this work with the pre-beta versions? 🤔

Also should this maybe be in the upgrade documentation for the final release?


@christian-heusel commented on GitHub (Jul 25, 2024):

Ah, never mind, it just took tailscale status a moment to realize that the DERP is gone; changing the network config does not help for me 😅


@kradalby commented on GitHub (Jul 25, 2024):

I'm having trouble reproducing this, and all of the tests keep passing; it has me quite puzzled.

The error about an empty DERP map only covers the DERP maps loaded via URL/file, so in this case it is displayed before the DERP regions from the embedded server; and if there are no DERP regions at all, the whole server will halt: https://github.com/juanfont/headscale/blob/main/hscontrol/app.go#L516-L518.

> Ah, never mind, it just took tailscale status a moment to realize that the DERP is gone; changing the network config does not help for me 😅

Does this mean it was there initially, but then disappeared after?


@kradalby commented on GitHub (Jul 25, 2024):

I've expanded the DERP tests a bit in #2030 to ensure that the embedded server isn't removed by the updater.

# Health check:
#     - Tailscale could not connect to the 'Headscale Embedded DERP' relay server. Your Internet connection might be down, or the server might be temporarily unavailable.
#     - Tailscale could not connect to any relay server. Check your Internet connection.

So this makes me think that this is a networking issue, because headscale sends the DERP server as part of the map update.
I can't really think of anything that would have changed this in the commits between the last alpha and the beta.
Could there be an external event/change to your docker setup? 🤔 (odd, since reverting works)

I did notice this though:

headscale  | 2024-07-23T00:58:26Z INF STUN server started at [::]:3478

This could indicate that it only listens on IPv6? However, my test logs show the same, so I would find it odd for that to be the cause, and I do not think anything related to it has changed.
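
(For what it's worth: on Linux, a socket bound to [::] normally also accepts IPv4 connections via v4-mapped addresses unless IPV6_V6ONLY / net.ipv6.bindv6only is set, so the [::]:3478 line alone does not imply IPv6-only.)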


@christian-heusel commented on GitHub (Jul 25, 2024):

> Does this mean it was there initially, but then disappeared after?

No, the way I'm testing this is that I redeploy the other version on my VPS and then run tailscale status on my client to see whether it is still working or printing the error.

> So this makes me think that this is a networking issue, because headscale sends the DERP server as part of the map update.
> I can't really think of anything that would have changed this in the commits between the last alpha and the beta.
> Could there be an external event/change to your docker setup? 🤔 (odd, since reverting works)

This was my first thought as well, but the issue now reproduces across multiple Docker versions, and really consistently with every switch of images that I do.

> This could indicate that it only listens on IPv6? However, my test logs show the same, so I would find it odd for that to be the cause, and I do not think anything related to it has changed.

After switching to the -debug version of the image I was able to check this inside the container, and the outputs were the same for both versions:

/ # netstat -lntu | grep 3478
udp        0      0 :::3478                 :::*                                
$ ss -tulpn | grep 3478
udp   UNCONN 0      0            0.0.0.0:3478       0.0.0.0:*    users:(("docker-proxy",pid=5895,fd=4))                                      
udp   UNCONN 0      0               [::]:3478          [::]:*    users:(("docker-proxy",pid=5901,fd=4))                                      

Since all of this did not help, I also had a look at the output of tailscaled on my client, and this looks interesting:

Jul 25 12:30:25 meterpeter tailscaled[131191]: derphttp.Client.Recv: connecting to derp-999 (christian-derp)
Jul 25 12:30:25 meterpeter tailscaled[131191]: magicsock: [0xc0035fd540] derp.Recv(derp-999): derphttp.Client.Recv connect to region 999 (christian-derp): dial tcp4: lookup vpn.heusel.eu: no such host
Jul 25 12:30:25 meterpeter tailscaled[131191]: netcheck: netcheck.runProbe: named node "999" has no v6 address
Jul 25 12:30:25 meterpeter tailscaled[131191]: netcheck: netcheck: DNS lookup error for "vpn.heusel.eu" (node "999" region 999): context canceled
Jul 25 12:30:25 meterpeter tailscaled[131191]: netcheck: netcheck.runProbe: named node "999" has no v4 address
Jul 25 12:30:27 meterpeter tailscaled[131191]: control: NetInfo: NetInfo{varies= hairpin= ipv6=false ipv6os=true udp=true icmpv4=false derp=#999 portmap=UC link="" firewallmode="ipt-default"}

So what actually seems to break is the internal DNS server (or something in that realm), and the DERP failure is just fallout from that earlier failure:

# alpha12

$ resolvectl status tailscale0         
Link 9 (tailscale0)
    Current Scopes: DNS
         Protocols: +DefaultRoute -LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported
Current DNS Server: 100.100.100.100
       DNS Servers: 100.100.100.100
        DNS Domain: chris.vpn.heusel.eu ~.

$ resolvectl query --cache=NO vpn.heusel.eu
vpn.heusel.eu: 49.12.6.160                     -- link: tailscale0
               (christian.heusel.eu)

# extra records
$ resolvectl query --cache=NO grafana.vpn.heusel.eu
grafana.vpn.heusel.eu: 100.64.0.6              -- link: tailscale0

# node
$ resolvectl query --cache=NO scotty-the-fifth.chris.vpn.heusel.eu
scotty-the-fifth.chris.vpn.heusel.eu: 100.64.0.6 -- link: tailscale0

# beta1

$ resolvectl status tailscale0
Link 8 (tailscale0)
    Current Scopes: DNS
         Protocols: +DefaultRoute -LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported
Current DNS Server: 100.100.100.100
       DNS Servers: 100.100.100.100
        DNS Domain: vpn.heusel.eu ~.

$ resolvectl query --cache=NO vpn.heusel.eu
vpn.heusel.eu: Name 'vpn.heusel.eu' not found

# extra records
$ resolvectl query --cache=NO grafana.vpn.heusel.eu               
grafana.vpn.heusel.eu: 100.64.0.6              -- link: tailscale0

# node 
$ resolvectl query --cache=NO scotty-the-fifth.chris.vpn.heusel.eu
scotty-the-fifth.chris.vpn.heusel.eu: Name 'scotty-the-fifth.chris.vpn.heusel.eu' not found

So this means it apparently now sets the "DNS Domain" to a different value (vpn.heusel.eu instead of chris.vpn.heusel.eu), but I'm not sure whether that causes the issue 🤔

Since it might be of interest, here is my DNS config:

dns_config:
  override_local_dns: true

  nameservers:
    - 8.8.8.8

  restricted_nameservers:
    fritz.box:
      - 192.168.71.5

  domains: []

  extra_records:
    - name: "grafana.vpn.heusel.eu"
      type: "A"
      value: "100.64.0.6"

    - name: "prometheus.vpn.heusel.eu"
      type: "A"
      value: "100.64.0.6"

    - name: "alertmanager.vpn.heusel.eu"
      type: "A"
      value: "100.64.0.6"

    - name: "repo.vpn.heusel.eu"
      type: "A"
      value: "100.64.0.6"

  magic_dns: true

  base_domain: vpn.heusel.eu

Also @kradalby thanks for looking into this, this is very much appreciated! ❤️


@christian-heusel commented on GitHub (Jul 25, 2024):

Possible duplicates/related issues given my latest findings: #2029 #2026


@kradalby commented on GitHub (Jul 25, 2024):

Ah yes, a DNS issue might be the culprit. While waiting for a reply I started writing up some clearly missing DNS tests, so I will continue with that. I'll post when I have an update, maybe on either of those two issues.


@kradalby commented on GitHub (Aug 1, 2024):

I think #2034 addresses this; would it be possible for you to help me test it? It would be great to avoid another bad release like beta1.

Binary is available here: https://github.com/juanfont/headscale/actions/runs/10195837541?pr=2034


@christian-heusel commented on GitHub (Aug 1, 2024):

@kradalby thanks for working on a fix! 🤗

Except for the fact that I had to rename dns_config to dns, the mentioned PR did not fix the issues 😅
Also, there was no error about the rename from restricted_nameservers to split, but setting it did not help either; the same goes for the addition of global in the nameservers directive 🤔


@kradalby commented on GitHub (Aug 2, 2024):

> Except for the fact that I had to rename dns_config to dns, the mentioned PR did not fix the issues 😅

Yes, sorry, that's part of the PR. Looking at your config I have one theory: can you try setting a dns.base_domain different from the DNS name you use for headscale? So magicdns.vpn.heusel.eu as base_domain, and keep vpn.heusel.eu for headscale?

> Also, there was no error about the rename from restricted_nameservers to split, but setting it did not help either; the same goes for the addition of global in the nameservers directive 🤔

Did you not get any warnings at the beginning of your logs? I've made it so that if the old keys are not replaced, it should now be a fatal error.
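
In config terms, the suggested change is roughly this (a sketch using the names from this thread, not a verified config):

dns:
  magic_dns: true
  # MagicDNS zone, now distinct from the control server's own hostname:
  base_domain: magicdns.vpn.heusel.eu

# server_url elsewhere in the config keeps pointing at https://vpn.heusel.eu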


@kradalby commented on GitHub (Aug 2, 2024):

To test, you can set dns.use_username_in_magic_dns to true. That option will be removed, but it temporarily gives you back the username in the DNS names, which should have the same effect.

This might be a good thing that we discovered: having the same base_domain and headscale DNS name will no longer be possible due to how Tailscale takes over the DNS.

For the record, in Tailscale upstream, this is the same behaviour:

  • DERPs have their own DNS
  • the control server has its own DNS (login/control.tailscale.com IIRC)
  • the "base name" is separate (e.g. bee-velociraptor.ts.net)

Because headscale injected the username into names, this did not break before, but that behaviour prevents us from achieving some other things, so it sadly has to go.


@kradalby commented on GitHub (Aug 16, 2024):

@christian-heusel did you have an opportunity to test this?


@christian-heusel commented on GitHub (Aug 16, 2024):

Sorry I forgot about this, will test and report soon!


@christian-heusel commented on GitHub (Aug 17, 2024):

> To test, you can set dns.use_username_in_magic_dns to true. That option will be removed, but it temporarily gives you back the username in the DNS names, which should have the same effect.

This makes the three types of queries from above work again 😊 👍🏻

Regarding https://github.com/juanfont/headscale/issues/2025#issuecomment-2264760872:

When unsetting the previously set dns.use_username_in_magic_dns and setting the base_domain as requested, it also works as expected 👍🏻

> Did you not get any warnings at the beginning of your logs? I've made it so that if the old keys are not replaced, it should now be a fatal error.

Maybe I'm testing this wrong, but I don't get any warnings/fatal errors with the latest version of your branch and the following DNS config snippet (which I have verified to be the active one inside the container by running docker compose exec headscale cat /etc/headscale/config.yaml):

dns:
  override_local_dns: true
  nameservers:
    # global:
      - 8.8.8.8
  restricted_nameservers:
  # split:
    fritz.box:
      - 192.168.71.5
  domains: []
  magic_dns: true
  base_domain: magicdns.vpn.heusel.eu

Instead I'm being warned about a key I don't even have set:

WARN: The "dns.use_username_in_magic_dns" configuration key is deprecated and has been removed. Please see the changelog for more details.

@christian-heusel commented on GitHub (Aug 17, 2024):

Edit: reverted a bogus comment here, I tried to connect to a node of mine that went offline for unknown reasons. 😆


@kradalby commented on GitHub (Aug 19, 2024):

> Maybe I'm testing this wrong, but I don't get any warnings/fatal errors with the latest version of your branch and the following DNS config snippet (which I have verified to be the active one inside the container by running docker compose exec headscale cat /etc/headscale/config.yaml):

Hmm, you won't really get any errors/warnings for setting wrong keys; for example dns.nameservers isn't checked, while dns_config.nameservers is checked. I suppose we could do it, but there is no good way in cobra to cover all cases, only the ones we can think of.

At the moment it will only warn if you have the old key set and not the new one; if you mix them, it won't detect it.
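
To illustrate the kind of check being described, here is a minimal sketch assuming spf13/viper (which headscale uses for configuration); the helper is hypothetical and not the actual headscale code:

package main

import (
	"log"

	"github.com/spf13/viper"
)

// warnIfRenamed flags a renamed key only when the old key is set and the
// new one is not. A config that mixes old and new keys passes silently,
// which is exactly the limitation described above.
func warnIfRenamed(oldKey, newKey string) {
	if viper.IsSet(oldKey) && !viper.IsSet(newKey) {
		log.Fatalf("config: %q has been renamed to %q, please update your config", oldKey, newKey)
	}
}

func main() {
	viper.SetConfigFile("/etc/headscale/config.yaml")
	if err := viper.ReadInConfig(); err != nil {
		log.Fatal(err)
	}
	warnIfRenamed("dns_config", "dns")
	warnIfRenamed("dns_config.nameservers", "dns.nameservers.global")
	warnIfRenamed("dns_config.restricted_nameservers", "dns.nameservers.split")
}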
