mirror of
https://github.com/netbox-community/netbox.git
synced 2026-01-13 05:50:33 +01:00
Improve performance of custom field models and serializers #3984
Closed
opened 2025-12-29 18:32:26 +01:00 by adam
·
4 comments
No Branch/Tag Specified
main
21102-fix-graphiql-explorer
update-changelog-comments-docs
20911-dropdown
20239-plugin-menu-classes-mutable-state
21097-graphql-id-lookups
feature
fix_module_substitution
20923-dcim-templates
20044-elevation-stuck-lightmode
feature-ip-prefix-link
v4.5-beta1-release
20068-import-moduletype-attrs
20766-fix-german-translation-code-literals
20378-del-script
7604-filter-modifiers-v3
circuit-swap
12318-case-insensitive-uniqueness
20637-improve-device-q-filter
20660-script-load
19724-graphql
20614-update-ruff
14884-script
02496-max-page
19720-macaddress-interface-generic-relation
19408-circuit-terminations-export-templates
20203-openapi-check
fix-19669-api-image-download
7604-filter-modifiers
19275-fixes-interface-bulk-edit
fix-17794-get_field_value_return_list
11507-show-aggregate-and-rir-on-api
9583-add_column_specific_search_field_to_tables
v4.5.0
v4.4.10
v4.4.9
v4.5.0-beta1
v4.4.8
v4.4.7
v4.4.6
v4.4.5
v4.4.4
v4.4.3
v4.4.2
v4.4.1
v4.4.0
v4.3.7
v4.4.0-beta1
v4.3.6
v4.3.5
v4.3.4
v4.3.3
v4.3.2
v4.3.1
v4.3.0
v4.2.9
v4.3.0-beta2
v4.2.8
v4.3.0-beta1
v4.2.7
v4.2.6
v4.2.5
v4.2.4
v4.2.3
v4.2.2
v4.2.1
v4.2.0
v4.1.11
v4.1.10
v4.1.9
v4.1.8
v4.2-beta1
v4.1.7
v4.1.6
v4.1.5
v4.1.4
v4.1.3
v4.1.2
v4.1.1
v4.1.0
v4.0.11
v4.0.10
v4.0.9
v4.1-beta1
v4.0.8
v4.0.7
v4.0.6
v4.0.5
v4.0.3
v4.0.2
v4.0.1
v4.0.0
v3.7.8
v3.7.7
v4.0-beta2
v3.7.6
v3.7.5
v4.0-beta1
v3.7.4
v3.7.3
v3.7.2
v3.7.1
v3.7.0
v3.6.9
v3.6.8
v3.6.7
v3.7-beta1
v3.6.6
v3.6.5
v3.6.4
v3.6.3
v3.6.2
v3.6.1
v3.6.0
v3.5.9
v3.6-beta2
v3.5.8
v3.6-beta1
v3.5.7
v3.5.6
v3.5.5
v3.5.4
v3.5.3
v3.5.2
v3.5.1
v3.5.0
v3.4.10
v3.4.9
v3.5-beta2
v3.4.8
v3.5-beta1
v3.4.7
v3.4.6
v3.4.5
v3.4.4
v3.4.3
v3.4.2
v3.4.1
v3.4.0
v3.3.10
v3.3.9
v3.4-beta1
v3.3.8
v3.3.7
v3.3.6
v3.3.5
v3.3.4
v3.3.3
v3.3.2
v3.3.1
v3.3.0
v3.2.9
v3.2.8
v3.3-beta2
v3.2.7
v3.3-beta1
v3.2.6
v3.2.5
v3.2.4
v3.2.3
v3.2.2
v3.2.1
v3.2.0
v3.1.11
v3.1.10
v3.2-beta2
v3.1.9
v3.2-beta1
v3.1.8
v3.1.7
v3.1.6
v3.1.5
v3.1.4
v3.1.3
v3.1.2
v3.1.1
v3.1.0
v3.0.12
v3.0.11
v3.0.10
v3.1-beta1
v3.0.9
v3.0.8
v3.0.7
v3.0.6
v3.0.5
v3.0.4
v3.0.3
v3.0.2
v3.0.1
v3.0.0
v2.11.12
v3.0-beta2
v2.11.11
v2.11.10
v3.0-beta1
v2.11.9
v2.11.8
v2.11.7
v2.11.6
v2.11.5
v2.11.4
v2.11.3
v2.11.2
v2.11.1
v2.11.0
v2.10.10
v2.10.9
v2.11-beta1
v2.10.8
v2.10.7
v2.10.6
v2.10.5
v2.10.4
v2.10.3
v2.10.2
v2.10.1
v2.10.0
v2.9.11
v2.10-beta2
v2.9.10
v2.10-beta1
v2.9.9
v2.9.8
v2.9.7
v2.9.6
v2.9.5
v2.9.4
v2.9.3
v2.9.2
v2.9.1
v2.9.0
v2.9-beta2
v2.8.9
v2.9-beta1
v2.8.8
v2.8.7
v2.8.6
v2.8.5
v2.8.4
v2.8.3
v2.8.2
v2.8.1
v2.8.0
v2.7.12
v2.7.11
v2.7.10
v2.7.9
v2.7.8
v2.7.7
v2.7.6
v2.7.5
v2.7.4
v2.7.3
v2.7.2
v2.7.1
v2.7.0
v2.6.12
v2.6.11
v2.6.10
v2.6.9
v2.7-beta1
Solcon-2020-01-06
v2.6.8
v2.6.7
v2.6.6
v2.6.5
v2.6.4
v2.6.3
v2.6.2
v2.6.1
v2.6.0
v2.5.13
v2.5.12
v2.6-beta1
v2.5.11
v2.5.10
v2.5.9
v2.5.8
v2.5.7
v2.5.6
v2.5.5
v2.5.4
v2.5.3
v2.5.2
v2.5.1
v2.5.0
v2.4.9
v2.5-beta2
v2.4.8
v2.5-beta1
v2.4.7
v2.4.6
v2.4.5
v2.4.4
v2.4.3
v2.4.2
v2.4.1
v2.4.0
v2.3.7
v2.4-beta1
v2.3.6
v2.3.5
v2.3.4
v2.3.3
v2.3.2
v2.3.1
v2.3.0
v2.2.10
v2.3-beta2
v2.2.9
v2.3-beta1
v2.2.8
v2.2.7
v2.2.6
v2.2.5
v2.2.4
v2.2.3
v2.2.2
v2.2.1
v2.2.0
v2.1.6
v2.2-beta2
v2.1.5
v2.2-beta1
v2.1.4
v2.1.3
v2.1.2
v2.1.1
v2.1.0
v2.0.10
v2.1-beta1
v2.0.9
v2.0.8
v2.0.7
v2.0.6
v2.0.5
v2.0.4
v2.0.3
v2.0.2
v2.0.1
v2.0.0
v2.0-beta3
v1.9.6
v1.9.5
v2.0-beta2
v1.9.4-r1
v1.9.3
v2.0-beta1
v1.9.2
v1.9.1
v1.9.0-r1
v1.8.4
v1.8.3
v1.8.2
v1.8.1
v1.8.0
v1.7.3
v1.7.2-r1
v1.7.1
v1.7.0
v1.6.3
v1.6.2-r1
v1.6.1-r1
1.6.1
v1.6.0
v1.5.2
v1.5.1
v1.5.0
v1.4.2
v1.4.1
v1.4.0
v1.3.2
v1.3.1
v1.3.0
v1.2.2
v1.2.1
v1.2.0
v1.1.0
v1.0.7-r1
v1.0.7
v1.0.6
v1.0.5
v1.0.4
v1.0.3-r1
v1.0.3
1.0.0
Labels
Clear labels
beta
breaking change
complexity: high
complexity: low
complexity: medium
needs milestone
netbox
pending closure
plugin candidate
pull-request
severity: high
severity: low
severity: medium
status: accepted
status: backlog
status: blocked
status: duplicate
status: needs owner
status: needs triage
status: revisions needed
status: under review
topic: GraphQL
topic: Internationalization
topic: OpenAPI
topic: UI/UX
topic: cabling
topic: event rules
topic: htmx navigation
topic: industrialization
topic: migrations
topic: plugins
topic: scripts
topic: templating
topic: testing
type: bug
type: deprecation
type: documentation
type: feature
type: housekeeping
type: translation
Mirrored from GitHub Pull Request
No Label
type: housekeeping
Milestone
No items
No Milestone
Projects
Clear projects
No project
Notifications
Due Date
No due date set.
Dependencies
No dependencies set.
Reference: starred/netbox#3984
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.
Delete Branch "%!s()"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Originally created by @roganartu on GitHub (Aug 13, 2020).
Environment
Proposed Functionality
Improve the performance of custom field models and serializers by reducing the number of SQL queries issued per request.
GET /api/dcim/devices?limit=1000takes ~20-30s on our best bare metal hardware (~10-15s withexclude=config_context), and results in thousands of duplicate SQL queries relating to custom fields.The profiling I did seems to suggest some of this performance may be improved by #4878 too, though this issue seemed distinct enough to warrant a separate ticket. If you disagree, please feel free to close this.
Profiling
I did some Python profiling and Postgres query logging and found that the majority of queries come from a small number of sources. I will try to document the process I used thoroughly to make reproducing this easier.
I used the following command to generate the request each time:
With the following postgres query logging config:
Each query then created two lines in the query log like this:
I then used the following (on a freshly rotated log file for each request) to aggregate query counts:
I also wrote an adhoc middleware plugin that let me pass
enable_profiler=1to get acProfile.Profile()of the request dumped to/tmpthat I then extracted the call graph from usinggprof2dotand visualised usingxdot. This allowed me to narrow down some of the sources of the SQL a little easier, since I'm not very familiar with Django.The test database I did this profiling on contains:
dcim_deviceobj_type_idVanilla v2.8.8
The request typically takes ~10-15s depending on server load and executes ~3k SQL queries. This example is relatively fast because it was run with a local test DB that was idle:
Almost all of these are identical, repeatedly fetching custom field names and choices:
CustomFieldModelSerializerFound here:
ee51dae73f/netbox/extras/api/customfields.py (L132)These two lines result in one SQL query per device object:
ee51dae73f/netbox/extras/api/customfields.py (L146-L148)The specific query this executes ~1000 times is:
However, afaict these lines can be removed and
fieldscan be replaced with the existingself.context["custom_fields"]without any problems. It requires a minor change to howCustomFieldModelcaches too since that seems to be where the DRF lazy execution actually ends up triggering the query. Someone who understands Django better should comment on the viability, but this use ofself.contextreduced the query count to ~2k and response time by ~17%:Here is the commit that achieved this, let me know if this makes sense and I can submit a PR:
13238ea6c7I did some adhoc testing of the returned objects, and the custom_fields in them appeared to be correct. I admit I do not understand how DRF serialization contexts work (further than a cursory read while doing this investigation), so if this is obviously the incorrect way to fix this I'd love to learn more.
CustomFieldManagerThis one is basically the same as above, but the calls are generated from each instantiated model instead of each serializer. This seems to make it harder to deal with on a per-request basis, since the
CustomFieldManagerdoesn't appear to have any kind of request context, unlike the serializer which has the per-requestself.context["custom_fields"].I don't know how to approach this correctly, but as a proof-of-concept to demonstrate this is where these queries are coming from, I made the
CustomFieldManagera singleton with a per-model._meta.concrete_modellru_cachearound theCustomFieldManager.get_for_modelmethod from here:d5b9722533/netbox/extras/models/customfields.py (L61)Before (including the
CustomFieldModelSerializerfix):After (including the
CustomFieldModelSerializerfix):This is the part of this issue that I think is most related to #4878. I wonder if thread locals could be used (with a middleware to create/del a context dict) to give a per-request cache for this stuff to the model objects?
Missing Index
I didn't do enough digging to see what exactly is generating this query, but it's missing an index which makes it rather slow on large
extras_customfieldvaluetables.Before:
Proposed index:
After:
I can submit a PR with a migration to add this index if you agree it should be fixed.
Remaining queries
This should probably be a separate issue, but about half the remaining queries seem to be parent device lookups. It'd be nice if these were just one query (eg as
WHERE dcim_device.id IN (...)) but I'm not sure how easy Django makes that. This is obviously a much less significant improvement than the other 3,000 queries though!Use Case
We do a lot of things that involve fetching data for a large number of devices. Additionally, we have a large number of custom fields, though that doesn't seem to really matter that much here. It seems that the number of queries currently executed is
3*Nwhere N is number ofCustomFieldModelobjects instantiated. Reducing this to a constant number of queries should significantly improve general Netbox performance.Database Changes
See
Proposed indexsection above.External Dependencies
None.
@DanSheps commented on GitHub (Aug 17, 2020):
I think we would be best to hold off on this until we get #4878 completed
@jeremystretch commented on GitHub (Aug 18, 2020):
While I appreciate the thoughtful analysis, as Dan points out, #4878 was opened specifically to address the shortcomings of the current implementation. I'd hate to see people spend too much time working to improve the current implementation only for it to be ripped out in v2.10. That said, I'll leave this open for anyone who wants to volunteer to make improvements, with the understanding that any changes to the data model itself are untenable given the pending work.
@roganartu commented on GitHub (Aug 18, 2020):
That's entirely reasonable. The primary motivation for submitting this was that I'd done the profiling and I figured a ticket seemed as good a place as any to document my findings since I hadn't seen an deep analysis of the root causes anywhere yet.
Would a PR for
13238ea6c7be compatible with your plans for #4878? It represents a nearly 20% improvement by cutting out one SQL query per result object, and only touches the model to add an optional kwarg for custom field memoisation, otherwise it just uses the existing serialiser cache.@jeremystretch commented on GitHub (Sep 17, 2020):
Given that #4878 has been merged into
develop-2.10at this point, I don't think it makes sense to commit any further development effort to the current implementation (except for any bugs that happen to pop up). However I invite you to experiment with the upcoming release and share any suggestions you have for improvements.