Excessive response times for LDAP user accounts #4152

Closed
opened 2025-12-29 18:33:28 +01:00 by adam · 11 comments
Owner

Originally created by @gdanielson on GitHub (Sep 29, 2020).

Environment

  • Python version: 3.7
  • NetBox version: 2.9.4

Steps to Reproduce

  1. Enable LDAP auth, REMOTE_AUTH_BACKEND = 'netbox.authentication.LDAPBackend'

  2. As a standard user or with Staff status enabled, perform standard
    activities via the web browser interface.

  3. All transacations are observed to take an excessive amount of time.

Expected Behavior

Expect that SuperUser and non-SuperUser accounts to have approximately similar
response time experience.

Observed Behavior

A browser refresh of the NetBox homepage takes approx. 20 seconds.

Observing the logfile for the above transaction we see many hundreds of lines
logged saying Initiating TLS followed by many hundreds of lines of ldap
search_s(... activity.

Changing django console handler to level ERROR the above response time is
reduced to approx. 10 seconds

When the same user account is enabled for SuperUser a browser refresh of the
NetBox home page takes approx. 1 second.

These extended response times have been observed since upgrading to v2.9

Originally created by @gdanielson on GitHub (Sep 29, 2020). ### Environment * Python version: 3.7 * NetBox version: 2.9.4 ### Steps to Reproduce 1. Enable LDAP auth, REMOTE_AUTH_BACKEND = 'netbox.authentication.LDAPBackend' 2. As a standard user or with Staff status enabled, perform standard activities via the web browser interface. 3. All transacations are observed to take an excessive amount of time. ### Expected Behavior Expect that SuperUser and non-SuperUser accounts to have approximately similar response time experience. ### Observed Behavior A browser refresh of the NetBox homepage takes approx. 20 seconds. Observing the logfile for the above transaction we see many hundreds of lines logged saying `Initiating TLS` followed by many hundreds of lines of ldap `search_s(...` activity. Changing django console handler to level ERROR the above response time is reduced to approx. 10 seconds When the same user account is enabled for SuperUser a browser refresh of the NetBox home page takes approx. 1 second. These extended response times have been observed since upgrading to v2.9
adam added the type: bugstatus: under review labels 2025-12-29 18:33:28 +01:00
adam closed this issue 2025-12-29 18:33:28 +01:00
Author
Owner

@gdanielson commented on GitHub (Sep 29, 2020):

With the logs showing LDAP activity for every transaction it seems that LDAP caching has mysteriously stopped working.
The settings we have for LDAP caching are:

AUTH_LDAP_CACHE_GROUPS = True
AUTH_LDAP_GROUP_CACHE_TIMEOUT = 1800

The django-auth doc for LDAP caching says that these params are deprecated but do still work.
https://django-auth-ldap.readthedocs.io/en/latest/reference.html?#auth-ldap-cache-timeout

Changed in version 1.6.0: Previously caching was controlled by the settings AUTH_LDAP_CACHE_GROUPS and AUTH_LDAP_GROUP_CACHE_TIMEOUT. If AUTH_LDAP_CACHE_GROUPS is set, the AUTH_LDAP_CACHE_TIMEOUT value is derievd from these deprecated settings.

I will try changing to the new param AUTH_LDAP_CACHE_TIMEOUT then test.

@gdanielson commented on GitHub (Sep 29, 2020): With the logs showing LDAP activity for every transaction it seems that LDAP caching has mysteriously stopped working. The settings we have for LDAP caching are: `AUTH_LDAP_CACHE_GROUPS = True` `AUTH_LDAP_GROUP_CACHE_TIMEOUT = 1800` The django-auth doc for LDAP caching says that these params are deprecated but do still work. https://django-auth-ldap.readthedocs.io/en/latest/reference.html?#auth-ldap-cache-timeout > Changed in version 1.6.0: Previously caching was controlled by the settings AUTH_LDAP_CACHE_GROUPS and AUTH_LDAP_GROUP_CACHE_TIMEOUT. If AUTH_LDAP_CACHE_GROUPS is set, the AUTH_LDAP_CACHE_TIMEOUT value is derievd from these deprecated settings. I will try changing to the new param `AUTH_LDAP_CACHE_TIMEOUT` then test.
Author
Owner

@jeremystretch commented on GitHub (Sep 29, 2020):

@gdanielson Could you have a look at the report in #5194 please and see if it's related to your issue?

@jeremystretch commented on GitHub (Sep 29, 2020): @gdanielson Could you have a look at the report in #5194 please and see if it's related to your issue?
Author
Owner

@gdanielson commented on GitHub (Oct 1, 2020):

@gdanielson Could you have a look at the report in #5194 please and see if it's related to your issue?

We are not seeing obvious increase of CPU consumption over time (we are running in a container in AWS ECS so observability of this is not always so clear cut).

However, there is an unknown element to our issue where from end-user feedback it is not 100% clear that the degraded performance is consistently observed over a period of some days. So we may also be seeing #5194 but will keep watch of how things go over the next few days.

@gdanielson commented on GitHub (Oct 1, 2020): > @gdanielson Could you have a look at the report in #5194 please and see if it's related to your issue? We are not seeing obvious increase of CPU consumption over time (we are running in a container in AWS ECS so observability of this is not always so clear cut). However, there is an unknown element to our issue where from end-user feedback it is not 100% clear that the degraded performance is consistently observed over a period of some days. So we _may_ also be seeing [#5194 ](https://github.com/netbox-community/netbox/issues/5194) but will keep watch of how things go over the next few days.
Author
Owner

@gdanielson commented on GitHub (Oct 1, 2020):

Update to my earlier comment...

I will try changing to the new param AUTH_LDAP_CACHE_TIMEOUT then test.

Changed to AUTH_LDAP_CACHE_TIMEOUT
The web browser response time is now improved. The logs now show that initial LDAP log activity but activity for every single transaction is not logged.
In our case it seems that LDAP caching does not work when using the deprecated settings.
The increased volume of LDAP logging combined with the logging level together made the effect worse.

@gdanielson commented on GitHub (Oct 1, 2020): Update to my earlier comment... > I will try changing to the new param `AUTH_LDAP_CACHE_TIMEOUT` then test. Changed to `AUTH_LDAP_CACHE_TIMEOUT` The web browser response time is now improved. The logs now show that initial LDAP log activity but activity for every single transaction is not logged. In our case it seems that LDAP caching does not work when using the deprecated settings. The increased volume of LDAP logging combined with the logging level together made the effect worse.
Author
Owner

@gdanielson commented on GitHub (Oct 7, 2020):

This problem has re-appeared. Troubleshooting & digging resumes...

@gdanielson commented on GitHub (Oct 7, 2020): This problem has re-appeared. Troubleshooting & digging resumes...
Author
Owner

@jeremystretch commented on GitHub (Oct 7, 2020):

I haven't really been on top of the LDAP-related bugs of late, but there was one report that removing the logging from within LDAPBackend yielded significant performance improvements. Can you try commenting out these lines in netbox.authentication.py and restart the service to see if things improve?

39e4ab164e/netbox/netbox/authentication.py (L175-L178)

@jeremystretch commented on GitHub (Oct 7, 2020): I haven't really been on top of the LDAP-related bugs of late, but there was one report that removing the logging from within LDAPBackend yielded significant performance improvements. Can you try commenting out these lines in `netbox.authentication.py` and restart the service to see if things improve? https://github.com/netbox-community/netbox/blob/39e4ab164e5585d168c51bb5e3bc495087a9716b/netbox/netbox/authentication.py#L175-L178
Author
Owner

@gdanielson commented on GitHub (Oct 9, 2020):

We are seeing a return of the situation where there is logged LDAP activity for what looks like every transaction. For this I think the logging is actually assisting to expose the underlying issue.
Disabling this logging activity will probably improve the end user experience due to the sheer volume of the LDAP logs, however at this stage I feel it will hinder troubleshooting the underlying issue, which is that LDAP caching appears to have stopped working again.

I had to reload our NetBox instances for an unrelated reason so this has now returned response times to an acceptable level. But expecting the problem to return next week. In the meantime I will see if I can educate myself a bit on django_auth_ldap

@gdanielson commented on GitHub (Oct 9, 2020): We are seeing a return of the situation where there is logged LDAP activity for what looks like every transaction. For this I think the logging is actually assisting to expose the underlying issue. Disabling this logging activity will probably improve the end user experience due to the sheer volume of the LDAP logs, however at this stage I feel it will hinder troubleshooting the underlying issue, which is that LDAP caching appears to have stopped working again. I had to reload our NetBox instances for an unrelated reason so this has now returned response times to an acceptable level. But expecting the problem to return next week. In the meantime I will see if I can educate myself a bit on django_auth_ldap
Author
Owner

@gdanielson commented on GitHub (Nov 5, 2020):

Small fyi update. This is still an issue but haven't been able to get any decent debugging time as it re-occurs, a restart fixes it for a few days. Task is under action to upgrade from 2.9.4 to current latest (2.9.8)

@gdanielson commented on GitHub (Nov 5, 2020): Small fyi update. This is still an issue but haven't been able to get any decent debugging time as it re-occurs, a restart fixes it for a few days. Task is under action to upgrade from 2.9.4 to current latest (2.9.8)
Author
Owner

@jeremystretch commented on GitHub (Nov 6, 2020):

@gdanielson Thanks for the update. I'm hoping that the fix for #5282 (in v2.9.8) solves it for you. 🤞

@jeremystretch commented on GitHub (Nov 6, 2020): @gdanielson Thanks for the update. I'm hoping that the fix for #5282 (in v2.9.8) solves it for you. :crossed_fingers:
Author
Owner

@jeremystretch commented on GitHub (Nov 24, 2020):

@gdanielson have you been able to pursue this any further?

@jeremystretch commented on GitHub (Nov 24, 2020): @gdanielson have you been able to pursue this any further?
Author
Owner

@gdanielson commented on GitHub (Nov 25, 2020):

@jeremystretch Just looked at uptimes and we are past the duration where we would've seen problems. So OK to close this. Thanks for the follow up.
p.s. we are on 2.9.9

@gdanielson commented on GitHub (Nov 25, 2020): @jeremystretch Just looked at uptimes and we are past the duration where we would've seen problems. So OK to close this. Thanks for the follow up. p.s. we are on 2.9.9
Sign in to join this conversation.
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: starred/netbox#4152