mirror of https://github.com/juanfont/headscale.git (synced 2026-01-11 20:00:28 +01:00)
[Feature] Possibility of using a larger database connection pool (while still using the pure Go SQLite implementation) #1011
Closed · opened 2025-12-29 02:27:25 +01:00 by adam · 13 comments
Originally created by @aergus-tng on GitHub (May 6, 2025).
Use case
Under certain loads, having only a single connection becomes a bottleneck. We observed that the server is more responsive under high load when more connections are allowed.
Description
In `hscontrol/db/db.go`, there is the following code, which goes back to #349, where there are reports of `SQLITE_BUSY` errors.
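A paraphrased sketch of the code in question (illustrative, not a verbatim quote; the function wrapper and its name are made up, the pool settings are the point):

```go
package db

import (
	"time"

	"gorm.io/gorm"
)

// pinSQLitePool pins database/sql's connection pool to a single
// connection, which is what hscontrol/db/db.go does after opening the
// GORM handle for SQLite.
func pinSQLitePool(db *gorm.DB) error {
	sqlDB, err := db.DB() // unwrap the underlying *sql.DB
	if err != nil {
		return err
	}
	sqlDB.SetMaxIdleConns(1)
	sqlDB.SetMaxOpenConns(1)
	sqlDB.SetConnMaxIdleTime(time.Hour)
	return nil
}
```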
While I cannot pinpoint which changes in `glebarez/sqlite` could have made a difference, in our tests using the current version of `glebarez/sqlite` (v1.11.0) the Headscale server could operate with a larger connection pool, so the number of connections could be made configurable.

Contribution
How can it be implemented?
The maximum number of idle and open connections could be made configurable via environment variables or as part of the config file (under the `tuning` key?). The default value for both could be 1 to keep the current behavior as the default.

@kradalby commented on GitHub (May 6, 2025):
Hmm, I am sceptical, mainly because the error handling has not really been tuned to handle `SQLITE_BUSY` errors, which might end up in unexpected behaviour that is hard to recover from. Even if it is not the default, users have a tendency to find options they don't know how to use, and that produces extra support load.
On the bottleneck, SQLite is famously multiple readers, one writer. I don't expect the writer is our bottleneck, but rather the reader.
I think a more meaningful pattern to explore would be to have (I think I will mess up the terminology here, but hopefully the idea gets across) two connection pools: a writer pool and a reader pool. All the methods in the `db` package that exclusively read (`Get*`, `List*`, etc.) would only use the reader pool and therefore block less. I am not sure if we would be able to implement this in practice considering GORM and dual database support, but I think it is an interesting pattern for SQLite.
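A minimal sketch of that pattern, assuming two GORM handles over the same file (names are illustrative; this is not something Headscale implements):

```go
package db

import (
	"database/sql"

	"github.com/glebarez/sqlite"
	"gorm.io/gorm"
)

// openSplitPools sketches the single-writer/many-readers idea: two pools
// over the same SQLite file, with all writes funneled through a
// one-connection pool while read-only methods (Get*, List*, ...) use a
// wider reader pool.
func openSplitPools(path string) (writer, reader *gorm.DB, err error) {
	open := func(maxConns int) (*gorm.DB, error) {
		db, err := gorm.Open(sqlite.Open(path), &gorm.Config{})
		if err != nil {
			return nil, err
		}
		var sqlDB *sql.DB
		if sqlDB, err = db.DB(); err != nil {
			return nil, err
		}
		sqlDB.SetMaxOpenConns(maxConns)
		return db, nil
	}
	if writer, err = open(1); err != nil { // single writer
		return nil, nil, err
	}
	if reader, err = open(8); err != nil { // concurrent readers
		return nil, nil, err
	}
	return writer, reader, nil
}
```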
@aergus-tng commented on GitHub (May 7, 2025):
I have the impression that implementing a single-writer/multiple-readers scheme on top of the current dual-database setup would indeed result in rather complicated code that you probably wouldn't be willing to merge. Moreover, the Write-Ahead Logging mode of SQLite seems to already provide more or less that: readers do not block writers, and a writer does not block readers.
After having a closer look at the SQLite components, my current guess is that the reason why using concurrent connections in Headscale caused issues in 2022 but seems to work now is actually that WAL mode became the default in 2024 (cf. #1985): we do see quite a few `SQLITE_BUSY` errors when we disable WAL and try to use multiple connections.

According to the documentation, `SQLITE_BUSY` occurs in WAL mode only in the following edge cases, recovering from which is arguably not necessary for Headscale:

- Another connection has the database open in exclusive locking mode. As Headscale is supposed to have full control over its database, this should not occur if Headscale is configured to use WAL.
- The last connection to the database closes and briefly holds an exclusive lock while cleaning up the WAL and shared-memory files. AFAICT Headscale (or rather the `database/sql` driver) keeps idle connections during normal operation, so this is probably only relevant for the clean-up at shutdown, during which it is arguably acceptable that some operations cannot be completed. (In fact, already now we occasionally see "sql: database is closed" errors when we shut down the server under high load.)
- The last connection to the database crashed, and the next connection to open it runs a recovery process while holding an exclusive lock. This is a one-time "hiccup" that should disappear after Headscale has recovered from the crash. (If connections crash systematically, that's probably an indication of another issue.)
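A sketch of the WAL validation proposed in the next paragraph (hypothetical helper; `PRAGMA journal_mode` is standard SQLite and reports the active journal mode):

```go
package db

import (
	"fmt"

	"gorm.io/gorm"
)

// assertWALForPool refuses a multi-connection pool unless the database
// actually runs in WAL mode. Hypothetical helper, not Headscale code.
func assertWALForPool(db *gorm.DB, maxOpenConns int) error {
	if maxOpenConns <= 1 {
		return nil // a single connection is safe in any journal mode
	}
	var mode string
	if err := db.Raw("PRAGMA journal_mode").Scan(&mode).Error; err != nil {
		return err
	}
	if mode != "wal" {
		return fmt.Errorf("max_open_conns=%d requires journal_mode=wal, got %q", maxOpenConns, mode)
	}
	return nil
}
```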
Given this, I would extend my proposal with a configuration validation checking that WAL mode is enabled when multiple database connections are used, which should then be robust enough IMHO. However, if you think that a subset of the aforementioned cases constitutes a problem, I would rather propose handling `SQLITE_BUSY` errors (e.g. with a simple retry mechanism) than implementing an additional pooling layer.

PS: The links above point to the documentation of the C implementation of SQLite, but AFAICT the pure Go implementation (`modernc.org/sqlite`) follows it closely.
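A minimal sketch of such a retry mechanism (hypothetical; a real implementation would inspect the driver's typed error rather than the error string):

```go
package db

import (
	"strings"
	"time"
)

// retryOnBusy retries fn a few times with linear backoff when SQLite
// reports SQLITE_BUSY. Hypothetical helper; string matching is crude
// but driver-agnostic.
func retryOnBusy(fn func() error) error {
	var err error
	for attempt := 1; attempt <= 5; attempt++ {
		if err = fn(); err == nil || !strings.Contains(err.Error(), "SQLITE_BUSY") {
			return err
		}
		time.Sleep(time.Duration(attempt) * 10 * time.Millisecond)
	}
	return err
}
```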
@kradalby commented on GitHub (May 7, 2025):

The arguments for now vs. then make sense; since we now have WAL, the semantics might well be a bit different.
I am still interested in a conservative approach: how do we determine how many connections are safe? I would prefer to stay away from the tuning config, as we don't really want users to use it.
In general, I think it is more meaningful over time to not go via the database all the time, like we do now. We spend so much time unmarshalling JSON and so on, so I think that would be more impactful. But I suppose this is an OK stepping stone.
@aergus-tng commented on GitHub (May 7, 2025):
I have the impression that with WAL, the number of connections can be scaled quite liberally. For instance, in our tests we used a server with 16 database connections on a 16-core machine, and it could support 500 online nodes without `SQLITE_BUSY` errors.

After my original comment, I've seen that there is already the precedent of the `max_open_conns` and `max_idle_conns` fields of the Postgres configuration; these would basically be the analogous settings for SQLite. After reading the WAL documentation, I believe this is safe enough to expose to users with appropriate documentation and reasonable defaults, but it's of course your call.

That makes sense. (Though depending on the remaining operations, database parallelism might still be useful in the end.)
What exactly do you mean by "this" here?
@kradalby commented on GitHub (May 9, 2025):
I think we can do it, and as mentioned, Postgres has this. If you want to do a PR, that would be appreciated; as you point out, appropriate docs would be important. I am not sure if we should keep the default at 1 or not. I am happy to be persuaded there, as I do not have the size of infra you have.

As a high-level acceptance criterion, maybe you could add another check to https://github.com/juanfont/headscale/blob/main/integration/scenario.go#L304 to ensure we don't see this in the integration tests? You could change the default in integration to be higher.
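Such a check could look roughly like this (hypothetical sketch; the function name is made up, and how the integration harness exposes server logs is assumed):

```go
package integration

import (
	"strings"
	"testing"
)

// assertNoSQLiteBusy fails the test if the collected headscale server
// log contains SQLITE_BUSY errors. Hypothetical sketch of the
// acceptance check discussed above.
func assertNoSQLiteBusy(t *testing.T, serverLog string) {
	t.Helper()
	if n := strings.Count(serverLog, "SQLITE_BUSY"); n > 0 {
		t.Errorf("headscale log contains %d SQLITE_BUSY occurrences", n)
	}
}
```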
@aergus-tng commented on GitHub (May 9, 2025):
Thanks! I will make a PR, including checks for `SQLITE_BUSY` in `ShutdownAssertNoPanics`.

I would expose the options `max_open_conns`, `max_idle_conns`, and `conn_max_idle_time_secs` for SQLite, as done for Postgres. As these options would then be common to both databases, they could be lifted to the `database` field of the config. Should I do that, or should there be separate fields for connection settings under `database.sqlite` and `database.postgres`?

As for the default pool size: `runtime.NumCPU()` or `runtime.GOMAXPROCS(0)` would be reasonable, because that would allow using all available CPU capacity under high load (and we do see full CPU load in our tests with multiple database connections, for instance when `NotifyAll` is called with a full update). However, I would understand if you don't want the default to be dynamic. Alternatively, a small static default (closer to the defaults of `database/sql`) would allow benefiting from some database concurrency by default while only minimally interfering with existing setups, which might run on single-CPU environments.
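Roughly what the proposed options could look like when applied to the pool (hypothetical struct and helper; only the three field names mirror the existing Postgres settings, and the dynamic default is one of the options floated above):

```go
package db

import (
	"database/sql"
	"runtime"
	"time"
)

// SqlitePoolConfig mirrors the existing Postgres fields max_open_conns,
// max_idle_conns and conn_max_idle_time_secs. Hypothetical sketch; the
// placement (database.sqlite vs. database) was still under discussion.
type SqlitePoolConfig struct {
	MaxOpenConns        int
	MaxIdleConns        int
	ConnMaxIdleTimeSecs int
}

func applyPoolConfig(sqlDB *sql.DB, cfg SqlitePoolConfig) {
	if cfg.MaxOpenConns == 0 {
		cfg.MaxOpenConns = runtime.GOMAXPROCS(0) // one of the floated defaults
	}
	sqlDB.SetMaxOpenConns(cfg.MaxOpenConns)
	sqlDB.SetMaxIdleConns(cfg.MaxIdleConns)
	sqlDB.SetConnMaxIdleTime(time.Duration(cfg.ConnMaxIdleTimeSecs) * time.Second)
}
```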
@aergus-tng commented on GitHub (May 13, 2025):

Unfortunately, it turned out that the conclusions I drew from the WAL documentation were incorrect, and I did observe `SQLITE_BUSY` errors in integration tests (see e.g. here).

An in-depth discussion of why this happens can be found in this blog post, but the gist is that there are situations where two parallel writing transactions (where at least one starts as a read-only transaction) cannot both be completed while maintaining the "serializable" isolation level guarantee of SQLite, and thus result in a `SQLITE_BUSY` error.

For Headscale, this seems to happen, for instance, when saving nodes. My guess is that a saving transaction starts as a read-only transaction because it first checks whether a node with the given ID already exists, and is then upgraded to a writing transaction when actually saving the node, so such transactions cannot be arbitrarily parallelized.
The recommendation in the aforementioned blog post is to explicitly start a transaction with `BEGIN IMMEDIATE` when it is foreseeable that the transaction will write to the database at some point. AFAICT the abstraction layers used in Headscale (`database/sql` and GORM) unfortunately do not expose such transaction options, and people suggest workarounds like running `ROLLBACK; BEGIN IMMEDIATE` after starting a transaction, which is far from ideal.

Under these circumstances, I think that your original suggestion of ensuring that there is a single writer in application code is the best compromise. I will try adding a mutex to make sure that at most one potentially writing transaction is running at any time.
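A sketch of that mutex approach (hypothetical; it serializes potentially-writing transactions in the application instead of relying on `BEGIN IMMEDIATE`, which GORM does not expose):

```go
package db

import (
	"sync"

	"gorm.io/gorm"
)

// writeLock ensures at most one potentially-writing transaction runs at
// any time; read-only queries bypass it. Hypothetical sketch, not the
// code from the linked branch.
var writeLock sync.Mutex

func writeTx(db *gorm.DB, fn func(tx *gorm.DB) error) error {
	writeLock.Lock()
	defer writeLock.Unlock()
	return db.Transaction(fn)
}
```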
@aergus-tng commented on GitHub (May 28, 2025):
A status update: we see `SQLITE_BUSY` errors when "too many" connections are allowed (in particular in integration tests with 3 connections on GitHub runners), even when we guard write operations with a mutex.

To investigate whether this is an issue with the SQLite driver/implementation, we ran some tests with the C implementation of SQLite and its default GORM driver, and could not reproduce the busy errors when using a write lock. We therefore suspect that the WAL / single-writer-many-readers implementation of the pure Go implementation might not be on par with the C implementation, and we will open an issue over there if we can pinpoint the exact problem.
As for Headscale, though, I probably won't be able to submit a PR with a correct/stable multi-connection implementation using the Go SQLite implementation anytime soon. FWIW, the current state of my attempts can be found on this branch: https://github.com/aergus-tng/headscale/tree/multiple-sqlite-connections
@kradalby commented on GitHub (May 28, 2025):
I haven't had time to look closely into this, but since the cgo-free SQLite was added, an interesting alternative has popped up that uses wasm instead of transpiling C to Go; this might be something to check out: https://github.com/ncruces/go-sqlite3
@aergus-tng commented on GitHub (May 30, 2025):
Thanks for the hint; we tried out `ncruces/go-sqlite3`. It seems to be better in terms of `SQLITE_BUSY` errors; in particular, some integration tests that never passed with the other Go implementation occasionally pass after some retries (cf. this run), but the errors are still not eliminated.

By now we have also observed `SQLITE_BUSY` errors in some integration test runs (cf. this one) when using the CGO implementation.

Judging by these observations, it's probably not an SQLite implementation problem after all (maybe the faster implementations simply require more load for the busy issues to occur?), but I find it hard to reason about what's happening because I don't really know the internals of SQLite. One speculation I have is that there might be some writing/syncing going on after (non-immediate) transactions, so that the application-level lock is released before the database is fully consistent.
In any case, this will probably need some long-term debugging, and there will probably not be much work on this issue from our side in the coming weeks (in particular, I personally will be "out of office" for a while).
@aergus-tng commented on GitHub (Aug 26, 2025):
A brief (non-)update: the potential solution I found (starting all transactions that can possibly write to the database as `IMMEDIATE`) appears to be impossible to implement on top of GORM's abstraction layer. I have not been able to spend much time on this since June and don't think that I will be able to contribute a sufficiently robust implementation in the foreseeable future, so this issue can be closed if no one else is interested in it.

@kradalby commented on GitHub (Sep 9, 2025):
I've merged #2670, which moves a lot of the heavy reads from the database to an in-memory store (for nodes), as mentioned in #2598. We might have moved the bottleneck around a bit, but probably in the right direction.

What do you think about closing this in favour of a "touch the database less" approach?
@aergus-tng commented on GitHub (Sep 15, 2025):
Thanks for implementing an in-memory store; that will surely reduce the DB load. As mentioned before, I'm fine with closing this issue.