Interleaved delete requests cause corruption of in-progress pagination of GET requests #3014

Closed
opened 2025-12-29 18:24:45 +01:00 by adam · 8 comments

Originally created by @ajknv on GitHub (Nov 13, 2019).

Environment

  • Python version: 3.6.8
  • NetBox version: 2.6.6

Steps to Reproduce

  1. Make a GET request that will produce enough results to trigger pagination.
  2. Iterate the results and delete the records using the returned IDs (a sketch of this loop follows the list).
  3. Repeat the GET request from step 1.
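
To make the reproduction concrete, here is a minimal sketch of such a client loop, assuming the `requests` library, a hypothetical NetBox host and token, and `/api/dcim/devices/` standing in for any endpoint with enough records to paginate:

```python
import requests

NETBOX = "https://netbox.example.com"            # hypothetical host
HEADERS = {"Authorization": "Token 0123456789"}  # hypothetical API token
URL = f"{NETBOX}/api/dcim/devices/"              # any paginated endpoint

# Steps 1-2: walk the paginated results, deleting each record as we go.
url = URL
while url:
    page = requests.get(url, headers=HEADERS).json()
    for obj in page["results"]:
        requests.delete(f"{URL}{obj['id']}/", headers=HEADERS)
    url = page["next"]  # offset-based link; the deletes shift what it points at

# Step 3: repeat the original query.
count = requests.get(URL, headers=HEADERS).json()["count"]
print(f"{count} matching records remain after the delete pass")
```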

Expected Behavior

The second GET request at step 3 should return zero results (all of the records matching the first query should have been deleted during the iteration). Alternatively, the paginated read from the first request should fail with an indication of concurrent modification.

Observed Behavior

Iteration of the results from the first request completes quietly, but the second GET request still returns results, indicating that the deletes performed during the iteration caused the pagination to skip records that should have been included in the results of the first request.

(I did not attempt to verify whether deletes occurring in another thread or from another client provoke the same behavior. If they do, this is a particularly nasty bug for any client doing paginated reads.)

adam closed this issue 2025-12-29 18:24:45 +01:00

@DanSheps commented on GitHub (Nov 14, 2019):

Can you upgrade your environment first and try again?


@ajknv commented on GitHub (Nov 14, 2019):

Do you mean from 2.6.6 to 2.6.7?


@DanSheps commented on GitHub (Nov 14, 2019):

Yes


@jeremystretch commented on GitHub (Nov 15, 2019):

I might just be tired, but I'm not following exactly what's happening. However, I'll point out that pagination is not idempotent: The `page` and `per_page` query parameters apply only at the time the query is made.

For example, let's say we have 1000 records at 50 records per page, and are currently looking at `?page=3` (items 101-150). If we delete items 1-50 and then call `?page=3` again, we'll now see items 151-200 (of the original 1000). This is how the pagination is intended to operate.

If you need to iterate through the complete list of objects and delete some intermittently, I suggest first gathering the complete list of unique IDs and leveraging the `id__in` filter to list specific sets of objects.
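
A rough sketch of that suggestion, under the same hypothetical host/token/endpoint assumptions as the reproduction above: snapshot the matching IDs first, then operate on fixed batches via `id__in` (which in NetBox releases of this era accepted a comma-separated list):

```python
import requests

NETBOX = "https://netbox.example.com"            # hypothetical host
HEADERS = {"Authorization": "Token 0123456789"}  # hypothetical API token
URL = f"{NETBOX}/api/dcim/devices/"              # any paginated endpoint

# Pass 1: snapshot the IDs of every matching record before modifying anything.
ids, url = [], URL
while url:
    page = requests.get(url, headers=HEADERS).json()
    ids.extend(obj["id"] for obj in page["results"])
    url = page["next"]

# Pass 2: work through fixed batches of the snapshotted IDs; later deletes
# can no longer change which objects each request returns.
for i in range(0, len(ids), 50):
    batch = ",".join(str(pk) for pk in ids[i:i + 50])
    listed = requests.get(URL, headers=HEADERS, params={"id__in": batch}).json()
    for obj in listed["results"]:
        requests.delete(f"{URL}{obj['id']}/", headers=HEADERS)
```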


@ajknv commented on GitHub (Nov 18, 2019):

Okay, I think what you described matches my understanding of the behavior I was reporting. Unfortunately, IMHO, the behavior is rather... surprising. Intuitively, I think there is a tendency to expect that the result set is backed by an iterator that snapshotted a consistent view as of the time of the initial query, with pagination acting only as a mechanism for managing delivery of the data to the client.

The practical impact is that any attempt to apply an operation (delete or otherwise) to all the records matching a given query, when pagination is involved, requires writing a "convergence" algorithm: query, update all the results, then query and update again, repeating until the updates no longer produce any actual changes. If these inconsistent views can also be affected by concurrent adds/deletes from other clients that the current client isn't even aware of, there is a rather substantial risk of subtle bugs.
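
For instance, a minimal sketch of such a convergence loop for the delete case, reusing the hypothetical host, token, and endpoint from the reproduction above: re-run the query and delete whatever it returns, until a pass comes back empty.

```python
import requests

NETBOX = "https://netbox.example.com"            # hypothetical host
HEADERS = {"Authorization": "Token 0123456789"}  # hypothetical API token
URL = f"{NETBOX}/api/dcim/devices/"              # any paginated endpoint

# Re-issue the query until it produces no results; each pass only sees the
# current first page, but repetition converges despite the shifting offsets.
while True:
    results = requests.get(URL, headers=HEADERS).json()["results"]
    if not results:
        break
    for obj in results:
        requests.delete(f"{URL}{obj['id']}/", headers=HEADERS)
```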

Ideally there would be a way to run a query with at least a mechanism for detecting and reporting concurrent modification across subsequent pagination requests (an idempotence token that could be used for internal consistency checks, or something similar). But barring that, perhaps there ought at least to be some verbiage in the API documentation highlighting this characteristic of how pagination currently works.


@LBegnaud commented on GitHub (Nov 18, 2019):

I think this is a common behavior of pagination. The suggestion of getting the full dataset and/or the full list of unique IDs seems satisfactory, no?

Or are you saying that if you perform the GET request too quickly, you return items that were successfully deleted?


@ajknv commented on GitHub (Nov 18, 2019):

> The suggestion of getting the full dataset and/or the full list of unique IDs seems satisfactory, no?

I dunno, yes and no, I guess? Even the process of gathering the list of IDs requires a converging multi-pass iteration to cope with paginated results in the face of concurrent modification (at least for deletes, since the implication is that subsequent page requests can miss still-valid IDs). Given that the API here tends to feel like a thin wrapper over a DB, this still strikes me as a violation of the principle of least surprise, IMHO. At the very least, a specific callout in the documentation seems very helpful. (Though I would still love to see a functional mechanism of some kind added to let the API signal the concurrent-modification condition, even if opt-in.)

> Or are you saying that if you perform the GET request too quickly, you return items that were successfully deleted?

I don't think I saw this, though FWIW, for the scenarios where I've dealt with the pagination behavior so far, I'm not sure I would have conclusively detected it if it had happened.


@jeremystretch commented on GitHub (Nov 18, 2019):

The pagination functionality in the NetBox API is provided by the [Django REST Framework](https://www.django-rest-framework.org/api-guide/pagination/). Specifically, we use DRF's `LimitOffsetPagination` as opposed to the alternative `CursorPagination`.

I'm fine with you opening a feature request to discuss extending/replacing the pagination functionality in NetBox (please be as detailed as possible if you do), but as far as this specific issue goes, I think it's fair to say this is expected behavior per the DRF docs.
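
For context, this is roughly how a DRF project selects between the two styles in its settings (a generic DRF example, not NetBox's actual configuration):

```python
# settings.py -- generic Django REST Framework example, not NetBox's own settings
REST_FRAMEWORK = {
    # LimitOffsetPagination computes each page from limit/offset at request
    # time, so records deleted between requests shift what lands on a page.
    "DEFAULT_PAGINATION_CLASS": "rest_framework.pagination.LimitOffsetPagination",
    "PAGE_SIZE": 50,
    # The alternative, "rest_framework.pagination.CursorPagination", pages by
    # an ordered cursor field and is more robust to concurrent inserts/deletes.
}
```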


Reference: starred/netbox#3014