mirror of
https://github.com/netbox-community/netbox.git
synced 2026-01-11 21:10:29 +01:00
REST response time does not scale properly with limit param while including config_context
#6487
Closed
opened 2025-12-29 19:41:21 +01:00 by adam
·
11 comments
No Branch/Tag Specified
main
update-changelog-comments-docs
feature-removal-issue-type
20911-dropdown
20239-plugin-menu-classes-mutable-state
21097-graphql-id-lookups
feature
fix_module_substitution
20923-dcim-templates
20044-elevation-stuck-lightmode
feature-ip-prefix-link
v4.5-beta1-release
20068-import-moduletype-attrs
20766-fix-german-translation-code-literals
20378-del-script
7604-filter-modifiers-v3
circuit-swap
12318-case-insensitive-uniqueness
20637-improve-device-q-filter
20660-script-load
19724-graphql
20614-update-ruff
14884-script
02496-max-page
19720-macaddress-interface-generic-relation
19408-circuit-terminations-export-templates
20203-openapi-check
fix-19669-api-image-download
7604-filter-modifiers
19275-fixes-interface-bulk-edit
fix-17794-get_field_value_return_list
11507-show-aggregate-and-rir-on-api
9583-add_column_specific_search_field_to_tables
v4.5.0
v4.4.10
v4.4.9
v4.5.0-beta1
v4.4.8
v4.4.7
v4.4.6
v4.4.5
v4.4.4
v4.4.3
v4.4.2
v4.4.1
v4.4.0
v4.3.7
v4.4.0-beta1
v4.3.6
v4.3.5
v4.3.4
v4.3.3
v4.3.2
v4.3.1
v4.3.0
v4.2.9
v4.3.0-beta2
v4.2.8
v4.3.0-beta1
v4.2.7
v4.2.6
v4.2.5
v4.2.4
v4.2.3
v4.2.2
v4.2.1
v4.2.0
v4.1.11
v4.1.10
v4.1.9
v4.1.8
v4.2-beta1
v4.1.7
v4.1.6
v4.1.5
v4.1.4
v4.1.3
v4.1.2
v4.1.1
v4.1.0
v4.0.11
v4.0.10
v4.0.9
v4.1-beta1
v4.0.8
v4.0.7
v4.0.6
v4.0.5
v4.0.3
v4.0.2
v4.0.1
v4.0.0
v3.7.8
v3.7.7
v4.0-beta2
v3.7.6
v3.7.5
v4.0-beta1
v3.7.4
v3.7.3
v3.7.2
v3.7.1
v3.7.0
v3.6.9
v3.6.8
v3.6.7
v3.7-beta1
v3.6.6
v3.6.5
v3.6.4
v3.6.3
v3.6.2
v3.6.1
v3.6.0
v3.5.9
v3.6-beta2
v3.5.8
v3.6-beta1
v3.5.7
v3.5.6
v3.5.5
v3.5.4
v3.5.3
v3.5.2
v3.5.1
v3.5.0
v3.4.10
v3.4.9
v3.5-beta2
v3.4.8
v3.5-beta1
v3.4.7
v3.4.6
v3.4.5
v3.4.4
v3.4.3
v3.4.2
v3.4.1
v3.4.0
v3.3.10
v3.3.9
v3.4-beta1
v3.3.8
v3.3.7
v3.3.6
v3.3.5
v3.3.4
v3.3.3
v3.3.2
v3.3.1
v3.3.0
v3.2.9
v3.2.8
v3.3-beta2
v3.2.7
v3.3-beta1
v3.2.6
v3.2.5
v3.2.4
v3.2.3
v3.2.2
v3.2.1
v3.2.0
v3.1.11
v3.1.10
v3.2-beta2
v3.1.9
v3.2-beta1
v3.1.8
v3.1.7
v3.1.6
v3.1.5
v3.1.4
v3.1.3
v3.1.2
v3.1.1
v3.1.0
v3.0.12
v3.0.11
v3.0.10
v3.1-beta1
v3.0.9
v3.0.8
v3.0.7
v3.0.6
v3.0.5
v3.0.4
v3.0.3
v3.0.2
v3.0.1
v3.0.0
v2.11.12
v3.0-beta2
v2.11.11
v2.11.10
v3.0-beta1
v2.11.9
v2.11.8
v2.11.7
v2.11.6
v2.11.5
v2.11.4
v2.11.3
v2.11.2
v2.11.1
v2.11.0
v2.10.10
v2.10.9
v2.11-beta1
v2.10.8
v2.10.7
v2.10.6
v2.10.5
v2.10.4
v2.10.3
v2.10.2
v2.10.1
v2.10.0
v2.9.11
v2.10-beta2
v2.9.10
v2.10-beta1
v2.9.9
v2.9.8
v2.9.7
v2.9.6
v2.9.5
v2.9.4
v2.9.3
v2.9.2
v2.9.1
v2.9.0
v2.9-beta2
v2.8.9
v2.9-beta1
v2.8.8
v2.8.7
v2.8.6
v2.8.5
v2.8.4
v2.8.3
v2.8.2
v2.8.1
v2.8.0
v2.7.12
v2.7.11
v2.7.10
v2.7.9
v2.7.8
v2.7.7
v2.7.6
v2.7.5
v2.7.4
v2.7.3
v2.7.2
v2.7.1
v2.7.0
v2.6.12
v2.6.11
v2.6.10
v2.6.9
v2.7-beta1
Solcon-2020-01-06
v2.6.8
v2.6.7
v2.6.6
v2.6.5
v2.6.4
v2.6.3
v2.6.2
v2.6.1
v2.6.0
v2.5.13
v2.5.12
v2.6-beta1
v2.5.11
v2.5.10
v2.5.9
v2.5.8
v2.5.7
v2.5.6
v2.5.5
v2.5.4
v2.5.3
v2.5.2
v2.5.1
v2.5.0
v2.4.9
v2.5-beta2
v2.4.8
v2.5-beta1
v2.4.7
v2.4.6
v2.4.5
v2.4.4
v2.4.3
v2.4.2
v2.4.1
v2.4.0
v2.3.7
v2.4-beta1
v2.3.6
v2.3.5
v2.3.4
v2.3.3
v2.3.2
v2.3.1
v2.3.0
v2.2.10
v2.3-beta2
v2.2.9
v2.3-beta1
v2.2.8
v2.2.7
v2.2.6
v2.2.5
v2.2.4
v2.2.3
v2.2.2
v2.2.1
v2.2.0
v2.1.6
v2.2-beta2
v2.1.5
v2.2-beta1
v2.1.4
v2.1.3
v2.1.2
v2.1.1
v2.1.0
v2.0.10
v2.1-beta1
v2.0.9
v2.0.8
v2.0.7
v2.0.6
v2.0.5
v2.0.4
v2.0.3
v2.0.2
v2.0.1
v2.0.0
v2.0-beta3
v1.9.6
v1.9.5
v2.0-beta2
v1.9.4-r1
v1.9.3
v2.0-beta1
v1.9.2
v1.9.1
v1.9.0-r1
v1.8.4
v1.8.3
v1.8.2
v1.8.1
v1.8.0
v1.7.3
v1.7.2-r1
v1.7.1
v1.7.0
v1.6.3
v1.6.2-r1
v1.6.1-r1
1.6.1
v1.6.0
v1.5.2
v1.5.1
v1.5.0
v1.4.2
v1.4.1
v1.4.0
v1.3.2
v1.3.1
v1.3.0
v1.2.2
v1.2.1
v1.2.0
v1.1.0
v1.0.7-r1
v1.0.7
v1.0.6
v1.0.5
v1.0.4
v1.0.3-r1
v1.0.3
1.0.0
Labels
Clear labels
beta
breaking change
complexity: high
complexity: low
complexity: medium
needs milestone
netbox
pending closure
plugin candidate
pull-request
severity: high
severity: low
severity: medium
status: accepted
status: backlog
status: blocked
status: duplicate
status: needs owner
status: needs triage
status: revisions needed
status: under review
topic: GraphQL
topic: Internationalization
topic: OpenAPI
topic: UI/UX
topic: cabling
topic: event rules
topic: htmx navigation
topic: industrialization
topic: migrations
topic: plugins
topic: scripts
topic: templating
topic: testing
type: bug
type: deprecation
type: documentation
type: feature
type: housekeeping
type: translation
Mirrored from GitHub Pull Request
Milestone
No items
No Milestone
Projects
Clear projects
No project
Notifications
Due Date
No due date set.
Dependencies
No dependencies set.
Reference: starred/netbox#6487
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.
Delete Branch "%!s()"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Originally created by @BowmanKlinkenberg on GitHub (May 16, 2022).
Originally assigned to: @kkthxbye-code on GitHub.
NetBox version
v3.2.2
Python version
3.9
Steps to Reproduce
/api/dcim/devices, including config_context, with a resultingcountfield less than 100, andlimit=100(for example, use a query parameter that would result in a low number of matches)/api/dcim/devicesincluding config_context, with a resultingcountfield greater than 100,000, andlimit=100(for example, use a query parameter that would result in a high number of matches)Expected Behavior
limitObserved Behavior
exclude=config_contextparameter is included in a query, there is a similar response time between queries that return a large dataset, and queries that return a small datasetexclude=config_contextis omitted (i.e. config context is fetched), queries that return a large dataset increase in response time dramatically vs. smaller queries, regardless oflimit@kkthxbye-code commented on GitHub (May 18, 2022):
I can't quite replicate this. While there is a performance penalty of not excluding config_context, I can't replicate a difference between offset 5000 and no offset.
I can try generating a larger dataset, but some more info with regards to what your config contexts look like (amount, data size, what you are filtering on etc.) would probably help.
@BowmanKlinkenberg commented on GitHub (May 18, 2022):
@kkthxbye-code we did further testing and found that the
offsetparam was not the root of the issue. It's rathercountor in other words the size of the result.With config context enabled, response time seems to scale directly with the
countfield of the result, regardless oflimitWith config context disabled, response time remains low, and scales depending on
limitOur dataset is around 140K device records. So, it was just coincidental that using
offsetwas the largestcountcould get, since querying on offset without other params returns all devices, paged bylimitI'll add another comment with SQL delays our team is seeing in the queries that include config context.
@BowmanKlinkenberg commented on GitHub (May 18, 2022):
Slow Queries from All w/Context
Fast Queries w/o Context
@BowmanKlinkenberg commented on GitHub (May 18, 2022):
PS this isn't unique to a
countof over 100K, it's just easier to measure the bigger the difference gets in test cases. After accounting for the base API response overhead, it seems to scale directly withcountAlso, this doesn't seem to be related to whether or not config context is actually there or what size it is. It seems to be just related to whether
exclude=config_contextis present. It seems that the resulting query performs far differently.@kkthxbye-code commented on GitHub (May 21, 2022):
Had some time to look at it a little. There's two issues.
First is that the query is just heavy.
https://explain.dalibo.com/plan/VkK#plan
There's some obvious squential scans that seem slow:
https://github.com/netbox-community/netbox/blob/develop/netbox/extras/querysets.py#L103-L111
But it's only around 12% of the total query. I'm sure the entire query can be done faster, but I'm no where near good enough at the django ORM. Maybe it's possible to do the config_context annotation after the main query has been limited. I'm hoping @jeremystretch has time at some point to look at this.
Second issue is that as you said, there's a count executed as well. This effectively doubles the query time as django rest frameworks paginator basically just does a count around the entire query. Not sure if it's possible to define a custom pagination_class that doesn't annotate the queryset with config_context. Again, Jeremy or someone with more django experience probably needs to look at it.
For reference, the sql timing showing the count first then the real query:
@maximumG commented on GitHub (May 24, 2022):
We have exactly the same issue on our netbox instance with around 20K devices with 99 config context. Out of these 99 config context, around 10 are really big JSON structure applied to all of the 20K devices.
After some deep investigation it seems that collapsing every JSON config context for every devices in pure postgreSQL takes a long time. As pinpointed by @kkthxbye-code, the same big query is executed twice: on time for the device count rendering and a second time to render every devices with their associated config context.
I guess we can find some pagination workaround to avoid executing this long-running query just to render a device count. Does anyone knows if we can customize the pagination for DRF ?
@kkthxbye-code commented on GitHub (Jun 13, 2022):
I will tentatively own this, with the goal of removing the extra count overhead. Cloning the queryset in the paginator and running queryset.query.annotations.clear() seems to work as intended and removes the vast majority of the extra overhead.
@jeremystretch commented on GitHub (Jun 13, 2022):
Thanks @kkthxbye-code, I've been meaning to dig into this but haven't gotten to it yet. Let me know if I can be of any assistance.
@kkthxbye-code commented on GitHub (Jun 13, 2022):
I implemented a partial fix which should halve the load time of the device list through the API when config context is not excluded. It's really more of a workaround as I still think there should be a way to avoid the query scaling linearly with the number of devices, I'm still just hitting my head against the ORM when trying to fix it.
I'll try to keep digging when I get the time, but we probably still need your help here @jeremystretch !
@jeremystretch commented on GitHub (Jun 17, 2022):
Although there likely remains work we can do to better optimize these queries, I'm going to leave this as closed by @kkthxbye-code's PR #9527 for a few reasons:
Anyone is welcome to submit a follow-on issue if they'd like to further pursue improvements in this area once v3.2.5 has been released.
@BowmanKlinkenberg commented on GitHub (Jun 17, 2022):
Great news! Thanks for your work and attention to this, @kkthxbye-code @jeremystretch!