Deleting prefixes with large number of children (Performance) #11450

Closed
opened 2025-12-29 21:45:23 +01:00 by adam · 2 comments
Owner

Originally created by @nbl-csling on GitHub (Aug 5, 2025).

Originally assigned to: @jnovinger on GitHub.

Deployment Type

NetBox Cloud

NetBox Version

v4.3.4

Python Version

3.10

Steps to Reproduce

  1. Have a database with a large number of prefixes that have child relationships to others (a db with over 1 million prefixes was used in this testing)
  2. Delete a prefix that has a large number of child prefixes (2300 children were attached to the prefix that was deleted in this testing)

Expected Behavior

The prefix should be deleted and the user returned to the prefix list in a timely manner

Observed Behavior

The front end timed out after 10 minutes.

adam added the type: bug, status: accepted, severity: medium labels 2025-12-29 21:45:23 +01:00
adam closed this issue 2025-12-29 21:45:23 +01:00
Author
Owner

@jnovinger commented on GitHub (Aug 5, 2025):

Hey @nbl-csling, thank you for reporting this issue. I attempted to reproduce the performance problem you described by creating a test scenario with a single parent prefix containing 2,300 child prefixes (matching your report). Note, though, that I did not have ~1M prefixes total present.
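
For anyone who wants to build a comparable dataset, here is a minimal sketch (not the exact script used here) that creates one /10 parent with ~2,300 /22 children from `nbshell`; the 10.0.0.0/10 network and the child prefix length are illustrative choices, not the values used in this test:

```python
# Hypothetical data generator: one /10 parent plus ~2,300 /22 children, run
# from `python3 manage.py nbshell` so the NetBox models are loaded.
from itertools import islice
import ipaddress

from ipam.models import Prefix

parent_net = ipaddress.ip_network('10.0.0.0/10')
Prefix(prefix=str(parent_net)).save()

# Individual .save() calls are used deliberately: bulk_create() would skip the
# signals that maintain the cached depth/children bookkeeping on each prefix,
# which is also why the import itself takes a while.
for child in islice(parent_net.subnets(new_prefix=22), 2300):
    Prefix(prefix=str(child)).save()
```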

My test results:

  • Import time: ~2 minutes for 2,301 prefixes (which did take a while as expected)
  • Deletion time: The single parent prefix deletion completed in just a couple of seconds

We were unable to reproduce the 10-minute timeout you experienced when deleting the parent prefix. This suggests there may be specific environmental factors or conditions that we haven't accounted for in our reproduction attempt.

Could you help us by providing additional details:

  • What specific steps did you take when the timeout occurred? (e.g., navigating to the prefix detail page, bulk operations, etc.)
  • Were there any error messages or specific UI behaviors you observed?
  • Do you recall if there were any other operations running at the same time?
  • Was this a consistent issue or did it happen only once?

Any additional context you can provide would be very helpful in identifying what might be causing the performance issue in your environment.

Author
Owner

@jnovinger commented on GitHub (Aug 8, 2025):

I have been able to reproduce this with updated data, against v4.2.9, v4.3.5, and upcoming v4.4 (the feature branch as of today).

On my local dev machine, these delete operations are taking somewhere on the order of ~5-8 minutes.

Results:

  • 07/Aug/2025 22:07:49: confirm delete in the UI
  • 07/Aug/2025 22:12:45: NetBox logging indicates deletion of prefix 36.128.0.0/10 (which had just over 2,000 child prefixes) has completed
  • 07/Aug/2025 22:12:46: NetBox returns a redirect from the POST call (the actual delete)
  • 07/Aug/2025 22:13:11: The prefix list view finishes loading on the server (it finished one second later in the browser)

Raw access logs from runserver:

[07/Aug/2025 22:07:30] "GET /ipam/prefixes/288227/delete/ HTTP/1.1" 200 8433
[07/Aug/2025 22:07:30] "GET /__debug__/history_sidebar/?store_id=8f1782f823264c8c876c485042d85dc4 HTTP/1.1" 200 10256
netbox.views.ObjectDeleteView INFO 2025-08-07 22:12:45,784 object_views 70416 6271004672 Deleted prefix 36.128.0.0/10
[07/Aug/2025 22:12:46] "POST /ipam/prefixes/288227/delete/ HTTP/1.1" 302 0
[07/Aug/2025 22:13:11] "GET /ipam/prefixes/ HTTP/1.1" 200 4562279

Django Debug Toolbar history indicates that there were 117 queries for the delete action, which took a combined 296250.52 ms:

default 296250.52 ms (117 queries including 97 similar and 77 duplicates)
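
The same numbers can be gathered without the toolbar; a minimal sketch using Django's stock query-capturing helper (the prefix value below is a placeholder for whichever parent prefix you are deleting):

```python
# Rough equivalent of the Debug Toolbar measurement, runnable from nbshell.
# Captures every SQL statement issued on this connection while the parent
# prefix is deleted, then reports the count and combined wall time.
from django.db import connection
from django.test.utils import CaptureQueriesContext

from ipam.models import Prefix

parent = Prefix.objects.get(prefix='10.0.0.0/10')  # adjust to your test prefix

with CaptureQueriesContext(connection) as ctx:
    parent.delete()

total_ms = sum(float(q['time']) for q in ctx.captured_queries) * 1000
print(f'{len(ctx.captured_queries)} queries, {total_ms:.2f} ms total')
```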

The biggest offender (at 296048.36 ms!) is the following query, which is evaluated at line 24 of netbox/ipam/signals.py in update_children_depth() (https://github.com/netbox-community/netbox/blob/37d6c160b98cfcd29c1957d2c09acb367612c9a1/netbox/ipam/signals.py#L24, as of commit 33d891e67 in feature):

SELECT "ipam_prefix"."id",
       "ipam_prefix"."created",
       "ipam_prefix"."last_updated",
       "ipam_prefix"."custom_field_data",
       "ipam_prefix"."description",
       "ipam_prefix"."comments",
       "ipam_prefix"."scope_type_id",
       "ipam_prefix"."scope_id",
       "ipam_prefix"."_location_id",
       "ipam_prefix"."_site_id",
       "ipam_prefix"."_region_id",
       "ipam_prefix"."_site_group_id",
       "ipam_prefix"."prefix",
       "ipam_prefix"."vrf_id",
       "ipam_prefix"."tenant_id",
       "ipam_prefix"."vlan_id",
       "ipam_prefix"."status",
       "ipam_prefix"."role_id",
       "ipam_prefix"."is_pool",
       "ipam_prefix"."mark_utilized",
       "ipam_prefix"."_depth",
       "ipam_prefix"."_children",
       (
        SELECT COUNT(DISTINCT U0."prefix") AS "c"
          FROM "ipam_prefix" U0
         WHERE (U0."prefix" >> "ipam_prefix"."prefix" AND COALESCE(U0."vrf_id", 0) = COALESCE("ipam_prefix"."vrf_id", 0))
       ) AS "hierarchy_depth",
       (
        SELECT COUNT(U1."prefix") AS "c"
          FROM "ipam_prefix" U1
         WHERE (U1."prefix" << "ipam_prefix"."prefix" AND COALESCE(U1."vrf_id", 0) = COALESCE("ipam_prefix"."vrf_id", 0))
       ) AS "hierarchy_children"
  FROM "ipam_prefix"
 WHERE ("ipam_prefix"."prefix" <<= '36.128.0.0/10' AND "ipam_prefix"."vrf_id" IS NULL)
 ORDER BY "ipam_prefix"."vrf_id" ASC NULLS FIRST,
          "ipam_prefix"."prefix" ASC,
          "ipam_prefix"."id" ASC

A subsequent EXPLAIN returned:

Gather Merge  (cost=23304.58..57050273.40 rows=827 width=151)
   Workers Planned: 2
   ->  Sort  (cost=22304.55..22305.42 rows=345 width=135)
         Sort Key: ipam_prefix.vrf_id NULLS FIRST, ipam_prefix.prefix, ipam_prefix.id
         ->  Parallel Seq Scan on ipam_prefix  (cost=0.00..22290.01 rows=345 width=135)
               Filter: ((vrf_id IS NULL) AND ((prefix)::inet <<= '36.128.0.0/10'::inet))
   SubPlan 1
     ->  Aggregate  (cost=34478.15..34478.16 rows=1 width=8)
           ->  Seq Scan on ipam_prefix u0  (cost=0.00..34478.07 rows=31 width=9)
                 Filter: (((prefix)::inet >> (ipam_prefix.prefix)::inet) AND (COALESCE(vrf_id, '0'::bigint) = COALESCE(ipam_prefix.vrf_id, '0'::bigint)))
   SubPlan 2
     ->  Aggregate  (cost=34478.15..34478.16 rows=1 width=8)
           ->  Seq Scan on ipam_prefix u1  (cost=0.00..34478.07 rows=31 width=9)
                 Filter: (((prefix)::inet << (ipam_prefix.prefix)::inet) AND (COALESCE(vrf_id, '0'::bigint) = COALESCE(ipam_prefix.vrf_id, '0'::bigint)))
(14 rows)
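
Since those are planner cost estimates rather than measured times, one way to confirm where the time actually goes is to re-run the captured statement under EXPLAIN (ANALYZE, BUFFERS) on the same connection NetBox uses; a minimal sketch (note that ANALYZE actually executes the query, so expect it to run about as long as the original):

```python
# Re-run the slow SELECT captured above under EXPLAIN ANALYZE, using the same
# database connection and settings that NetBox itself uses.
from django.db import connection

captured_sql = """
-- paste the SELECT captured by the Debug Toolbar here
"""

with connection.cursor() as cursor:
    cursor.execute('EXPLAIN (ANALYZE, BUFFERS) ' + captured_sql)
    for (line,) in cursor.fetchall():
        print(line)
```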