Cache invalidation issues #2808

Closed
opened 2025-12-29 18:22:21 +01:00 by adam · 4 comments
Owner

Originally created by @jeremystretch on GitHub (Aug 15, 2019).

Environment

  • Python version: 3.5.2
  • NetBox version: 2.6.2

This issue is an aggregation of several reported bug relating to cache invalidation:

  • #3300: MAC address filtering
  • #3363: Cache not invalidated when interface name changed
  • #3379: Error occurred on virtual chassis deletion
  • #3382: Renaming a slave device is not visible on the parent device

These are all instances of cached data not being appropriately flushed in response to changes in the data.

Please note any related issues here instead of opening new bug reports.

Originally created by @jeremystretch on GitHub (Aug 15, 2019). ### Environment * Python version: 3.5.2 * NetBox version: 2.6.2 This issue is an aggregation of several reported bug relating to cache invalidation: * #3300: MAC address filtering * #3363: Cache not invalidated when interface name changed * #3379: Error occurred on virtual chassis deletion * #3382: Renaming a slave device is not visible on the parent device These are all instances of cached data not being appropriately flushed in response to changes in the data. Please note any related issues here instead of opening new bug reports.
adam added the type: bugstatus: accepted labels 2025-12-29 18:22:21 +01:00
adam closed this issue 2025-12-29 18:22:21 +01:00
Author
Owner

@lampwins commented on GitHub (Aug 19, 2019):

There are two circumstances I identified in the NetBox codebase causing these and other cache invalidation issues.

First, Cacheops does not support auto invalidation of changes to data contained in a select_related() clause, and this is with good reason. As we know, select_related preforms a join, and thus the additional data is added directly to the destination QuerySet. This is problematic because the additional data cannot be differentiated within that single QuerySet, and thus there is no way to know it was later modified. On the other hand, prefetch_related creates separate QuestSets which are linked to the main QuerySet. In doing so, Cacheops is able to watch these additional QuerySets for modifications.

The catch to prefetch_related is instead of a join, it uses an additional query to create a district QuerySet for each relation. On the outset, it may sound bad to use prefetch_related in place of select_related in all occurrences, but after analyzing the code and thinking about the context of data and relationships in NetBox, I honestly don't think it makes a huge difference. After having tested it, my vote is to make the change to use prefetch_related in all use cases. I further back this stance up by pointing out that caching is meant to overcome a great deal of performance overhead anyway. If issues arise with particular use cases, we can certainly identify and deal with them in another way.

The second issue is our continued use of update() on QuerySets. The QuerySet update method has long been a thorn in the side of Django, due to the fact it is implemented entirely at the DB layer. This means there is no signaling withing Django for updates that occur with its use. Luckily, Cacheops has overcome this by exposing an invalidated_update() method on the QuerySet class. Its usage is entirely interchangeable with update() which means we do not have to further refactor anything. The catch is, it too results in an additional query to identify all updated instances, but again, this is realistically negligible.

@lampwins commented on GitHub (Aug 19, 2019): There are two circumstances I identified in the NetBox codebase causing these and other cache invalidation issues. First, Cacheops does not support auto invalidation of changes to data contained in a `select_related()` clause, and this is with good reason. As we know, `select_related` preforms a join, and thus the additional data is added directly to the destination QuerySet. This is problematic because the additional data cannot be differentiated within that single QuerySet, and thus there is no way to know it was later modified. On the other hand, `prefetch_related` creates separate QuestSets which are linked to the main QuerySet. In doing so, Cacheops is able to watch these additional QuerySets for modifications. The catch to `prefetch_related` is instead of a join, it uses an additional query to create a district QuerySet for each relation. On the outset, it may sound bad to use `prefetch_related` in place of `select_related` in all occurrences, but after analyzing the code and thinking about the context of data and relationships in NetBox, I honestly don't think it makes a huge difference. After having tested it, my vote is to make the change to use `prefetch_related` in all use cases. I further back this stance up by pointing out that caching is meant to overcome a great deal of performance overhead anyway. If issues arise with particular use cases, we can certainly identify and deal with them in another way. The second issue is our continued use of `update()` on QuerySets. The QuerySet update method has long been a thorn in the side of Django, due to the fact it is implemented entirely at the DB layer. This means there is no signaling withing Django for updates that occur with its use. Luckily, Cacheops has overcome this by exposing an `invalidated_update()` method on the QuerySet class. Its usage is entirely interchangeable with `update()` which means we do not have to further refactor anything. The catch is, it too results in an additional query to identify all updated instances, but again, this is realistically negligible.
Author
Owner

@jeremystretch commented on GitHub (Aug 19, 2019):

As we know, select_related preforms a join, and thus the additional data is added directly to the destination QuerySet. This is problematic because the additional data cannot be differentiated within that single QuerySet, and thus there is no way to know it was later modified.

I don't follow. select_related() is pulling in data from related models, but those models aren't being modified. For example:

circuit_list = Circuit.objects.select_related('type')
circuit = circuit_list.first()
circuit.type = new_type
circuit.save()

This doesn't have any effect on the CircuitType model AFAICT. Does it related to caching of the related objects count (e.g. circuit_type.circuits.count())? If that's the case, there's still a problem because we can update the circuit's type without ever calling select_related or prefetch_related anyway.

The second issue is our continued use of update() on QuerySets.

I thought we got rid of all update() calls a while back but it looks like we either missed some or they crept back in.

Luckily, Cacheops has overcome this by exposing an invalidated_update() method on the QuerySet class.

We should probably just replace all update() calls with separate query/save calls anyway IMO.

@jeremystretch commented on GitHub (Aug 19, 2019): > As we know, select_related preforms a join, and thus the additional data is added directly to the destination QuerySet. This is problematic because the additional data cannot be differentiated within that single QuerySet, and thus there is no way to know it was later modified. I don't follow. `select_related()` is pulling in data from related models, but those models aren't being modified. For example: ``` circuit_list = Circuit.objects.select_related('type') circuit = circuit_list.first() circuit.type = new_type circuit.save() ``` This doesn't have any effect on the CircuitType model AFAICT. Does it related to caching of the related objects count (e.g. `circuit_type.circuits.count()`)? If that's the case, there's still a problem because we can update the circuit's type without ever calling `select_related` or `prefetch_related` anyway. > The second issue is our continued use of `update()` on QuerySets. I thought we got rid of all `update()` calls a while back but it looks like we either missed some or they crept back in. > Luckily, Cacheops has overcome this by exposing an `invalidated_update()` method on the QuerySet class. We should probably just replace all `update()` calls with separate query/save calls anyway IMO.
Author
Owner

@lampwins commented on GitHub (Aug 19, 2019):

I don't follow. select_related() is pulling in data from related models, but those models aren't being modified. For example:

It is not about the modification of model instances for which the query is targeting, but rather the related model instances. In your example, if a CircuitType object that was included in the select_related('type') was modified, it would not be invalidated as a part of your circuit_list query and thus references to it from your circuit_list would show stale data. This is the direct cause of #3363 and #3382. This is the offending line for the interface example. Switching to a prefetch_related here allows the connected neighbor relation fields to be independent QuerySets which is what Cacheops needs to be able to later invalidate them.

Does it related to caching of the related objects count (e.g. circuit_type.circuits.count())? If that's the case, there's still a problem because we can update the circuit's type without ever calling select_related or prefetch_related anyway.

Not sure I follow, but in general the counts are handled as normal queries and invalidated in the same way. If anything, these cases are simpler.

We should probably just replace all update() calls with separate query/save calls anyway IMO.

In general, I agree with this. There are ~20 or so instances of update() remaining. Some are rather old, and some snuck in recently. Some cases may make since so invalidated_update() is a valid option if we choose to keep them.

@lampwins commented on GitHub (Aug 19, 2019): > I don't follow. select_related() is pulling in data from related models, but those models aren't being modified. For example: It is not about the modification of model instances for which the query is targeting, but rather the related model instances. In your example, if a CircuitType object that was included in the `select_related('type')` was modified, it would not be invalidated as a part of your `circuit_list` query and thus references to it from your `circuit_list` would show stale data. This is the direct cause of #3363 and #3382. [This](https://github.com/netbox-community/netbox/blob/develop/netbox/dcim/views.py#L949) is the offending line for the interface example. Switching to a `prefetch_related` here allows the connected neighbor relation fields to be independent QuerySets which is what Cacheops needs to be able to later invalidate them. > Does it related to caching of the related objects count (e.g. circuit_type.circuits.count())? If that's the case, there's still a problem because we can update the circuit's type without ever calling select_related or prefetch_related anyway. Not sure I follow, but in general the counts are handled as normal queries and invalidated in the same way. If anything, these cases are simpler. > We should probably just replace all update() calls with separate query/save calls anyway IMO. In general, I agree with this. There are ~20 or so instances of `update()` remaining. Some are rather old, and some snuck in recently. Some cases may make since so `invalidated_update()` is a valid option if we choose to keep them.
Author
Owner

@lampwins commented on GitHub (Aug 20, 2019):

#3300 is caused by select_related() on the API view's queryset
#3363 is caused by select_related() on the device's interface context queryset
#3379 is caused by update() in virtual chassis signals and the delete views
#3382 is caused by select_related() on the device's parent/child context queryset

@lampwins commented on GitHub (Aug 20, 2019): #3300 is caused by `select_related()` on the API view's queryset #3363 is caused by `select_related()` on the device's interface context queryset #3379 is caused by `update()` in virtual chassis signals and the delete views #3382 is caused by `select_related()` on the device's parent/child context queryset
Sign in to join this conversation.
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: starred/netbox#2808