Provide a "dry-run" functionality for bulk import/update #8622

Closed
opened 2025-12-29 20:38:59 +01:00 by adam · 8 comments
Owner

Originally created by @peteeckel on GitHub (Sep 15, 2023).

NetBox version

v3.6.1

Feature type

New functionality

Proposed functionality

Based on the brief discussion in #13773 I suggest implementing a "dry-run" functionality for bulk import/update data.

Use case

This FR should be seen in conjunction with #13775 and #13777. Bulk imports and updates can be complex and involve large amounts of data, and certain errors, such as mis-spelled or mis-cased headers, currently result in data being silently ignored.

On import, columns with invalid header names are currently silently ignored while the remaining columns are imported, which requires a subsequent update run with a new data set including IDs. It would be helpful to validate the input data and check for this kind of error before the import is actually executed, so errors can be fixed beforehand.
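For illustration, a minimal sketch of such a pre-import header check. The `KNOWN_FIELDS` set here is a hypothetical stand-in; in NetBox itself the list of importable fields would come from the relevant bulk-import form's declared fields:

```python
import csv
import io

# Hypothetical set of importable field names for an IPAddress import;
# a real implementation would derive this from the import form.
KNOWN_FIELDS = {"id", "address", "status", "dns_name", "description", "tenant"}

def check_headers(csv_text):
    """Return the header names that would currently be silently ignored."""
    reader = csv.reader(io.StringIO(csv_text))
    headers = next(reader, [])
    return [h for h in headers if h.strip() not in KNOWN_FIELDS]

# "dnsname" is misspelled and would currently be dropped without warning.
print(check_headers("address,status,dnsname\n10.0.0.1/16,active,node1.example.com"))
# → ['dnsname']
```

Running this check before any row is processed would let the user fix the header and re-submit instead of discovering the problem after the import.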

Database changes

None

External dependencies

None

adam added the type: feature label 2025-12-29 20:38:59 +01:00
adam closed this issue 2025-12-29 20:39:00 +01:00
Author
Owner

@jeremystretch commented on GitHub (Sep 15, 2023):

I don't see how this would really help. After completing a "dry run" import, you would presumably only be able to see what attributes are listed in the resulting import table; any others that happen not to be displayed in the table cannot be verified.

Additionally, the mechanism by which imported objects are displayed would not permit this behavior. After objects are imported, NetBox redirects the user to a list of objects filtered by request ID. This would not be feasible if the imported objects don't actually exist.


@peteeckel commented on GitHub (Sep 15, 2023):

In combination with the features suggested in #13775 and especially #13777 a dry-run would give the user a list of ignored columns that would not be used in the import without actually performing the import.

Without the "dry-run", the import is performed, but without the affected columns. Currently this might not even be noticed; even with a notice that these columns haven't been imported, fixing the data would at the very least require a bulk update, which needs an additional ID column - which in turn requires exporting the records and generating a new import set.

With a dry-run the user will be aware that there are some problematic columns and can fix them for the import, thus avoiding the need for the subsequent update.


@jeremystretch commented on GitHub (Sep 15, 2023):

> a dry-run would give the user a list of ignored columns that would not be used in the import without actually performing the import.

How? Maybe an example would help.


@peteeckel commented on GitHub (Sep 15, 2023):

Let's assume someone is trying to import the following data set:

```
address,status,dnsname
10.0.0.1/16,active,node1.zone1.example.com
10.0.0.2/16,active,node2.zone1.example.com
10.0.0.3/16,active,node3.zone1.example.com
10.0.0.4/16,active,node4.zone1.example.com
10.0.0.5/16,active,node5.zone1.example.com
10.0.0.6/16,active,node6.zone1.example.com
10.0.0.7/16,active,node7.zone1.example.com
[...]
10.0.0.254/16,active,node7.zone1.example.com
10.0.1.1/16,active,node1.zone2.example.com
10.0.1.2/16,active,node2.zone2.example.com
10.0.1.3/16,active,node3.zone2.example.com
10.0.1.4/16,active,node4.zone2.example.com
10.0.1.5/16,active,node5.zone2.example.com
10.0.1.6/16,active,node6.zone2.example.com
10.0.1.7/16,active,node7.zone2.example.com
[...]
10.0.1.254/16,active,node7.zone2.example.com
[...]
10.0.16.1/16,active,node1.zone16.example.com
10.0.16.2/16,active,node2.zone16.example.com
10.0.16.3/16,active,node3.zone16.example.com
10.0.16.4/16,active,node4.zone16.example.com
10.0.16.5/16,active,node5.zone16.example.com
10.0.16.6/16,active,node6.zone16.example.com
10.0.16.7/16,active,node7.zone16.example.com
[...]
10.0.16.254/16,active,node7.zone16.example.com
```

(and imagine the data being less schematic to add a bit of complexity).

What currently happens, as the `dns_name` field is optional, is that all data would be imported without the `dns_name`, because the column header is not spelled correctly.

Now since the field is missing from all records, the only way to fix it is a bulk update. For that, the user needs the IDs of the IPAddress objects in question, so the CSV data needs to be amended:

```
id,address,status,dns_name
1,10.0.0.1/16,active,node1.zone1.example.com
2,10.0.0.2/16,active,node2.zone1.example.com
3,10.0.0.3/16,active,node3.zone1.example.com
4,10.0.0.4/16,active,node4.zone1.example.com
5,10.0.0.5/16,active,node5.zone1.example.com
6,10.0.0.6/16,active,node6.zone1.example.com
7,10.0.0.7/16,active,node7.zone1.example.com
[...]
8,10.0.0.254/16,active,node7.zone1.example.com
9,10.0.1.1/16,active,node1.zone2.example.com
10,10.0.1.2/16,active,node2.zone2.example.com
11,10.0.1.3/16,active,node3.zone2.example.com
12,10.0.1.4/16,active,node4.zone2.example.com
13,10.0.1.5/16,active,node5.zone2.example.com
14,10.0.1.6/16,active,node6.zone2.example.com
15,10.0.1.7/16,active,node7.zone2.example.com
[...]
16,10.0.1.254/16,active,node7.zone2.example.com
[...]
17,10.0.16.1/16,active,node1.zone16.example.com
18,10.0.16.2/16,active,node2.zone16.example.com
19,10.0.16.3/16,active,node3.zone16.example.com
20,10.0.16.4/16,active,node4.zone16.example.com
21,10.0.16.5/16,active,node5.zone16.example.com
22,10.0.16.6/16,active,node6.zone16.example.com
23,10.0.16.7/16,active,node7.zone16.example.com
[...]
24,10.0.16.254/16,active,node7.zone16.example.com
```

A dry run would have returned a message like 'Field "dnsname" is unknown and will not be imported' (provided #13777 gets implemented) without actually importing anything, giving the user the chance to fix the issue by correcting the header field.
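One common way to implement such a dry run is to stage the full import inside a database transaction, collect all warnings, and then roll the transaction back so nothing is persisted. The sketch below is illustrative only, using an in-memory SQLite table as a stand-in; NetBox itself would run the import form's validation inside Django's `transaction.atomic()` with a forced rollback:

```python
import sqlite3

def dry_run_import(conn, rows):
    """Stage rows inside a transaction, collect warnings, then roll back.

    `rows[0]` is the CSV header row; the known-field set is a hypothetical
    stand-in for the import form's declared fields.
    """
    known = {"id", "address", "status", "dns_name"}
    warnings = [
        f'Field "{h}" is unknown and will not be imported'
        for h in rows[0] if h not in known
    ]
    try:
        for row in rows[1:]:
            conn.execute(
                "INSERT INTO ipaddress (address, status) VALUES (?, ?)",
                (row[0], row[1]),
            )
    finally:
        conn.rollback()  # dry run: nothing is persisted
    return warnings

# Minimal demonstration with an in-memory table standing in for IPAddress.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ipaddress (address TEXT, status TEXT)")
conn.commit()
rows = [
    ["address", "status", "dnsname"],
    ["10.0.0.1/16", "active", "node1.zone1.example.com"],
]
print(dry_run_import(conn, rows))
# → ['Field "dnsname" is unknown and will not be imported']
print(conn.execute("SELECT COUNT(*) FROM ipaddress").fetchone()[0])
# → 0
```

Because the rollback discards the staged rows, the user sees exactly the warnings the real import would produce while the database remains untouched.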


@jeremystretch commented on GitHub (Sep 15, 2023):

Ok, I think I understand the concern better, thanks. I believe this would be addressed by #11617, which seeks to raise a validation error on the presence of an unrecognized column header.

In general I don't like the concept of dry runs because in the best case they cost extra time, and in the worst case the user forgets to use them in the first place.


@peteeckel commented on GitHub (Sep 15, 2023):

Absolutely d'accord, but in #13773 @pv2b answered that silently ignoring this kind of error was a feature and not a bug, and he suggested the dry-run feature as a way to solve the issue. I'd prefer the error message in combination with not accepting erroneous data as well.


@jeremystretch commented on GitHub (Sep 15, 2023):

> in https://github.com/netbox-community/netbox/issues/13773 @pv2b answered that silently ignoring this kind of error was a feature and not a bug

I'll admit it's a bit subjective, but I'd prefer to treat it as a bug per the [principle of least astonishment](https://en.wikipedia.org/wiki/Principle_of_least_astonishment).


@peteeckel commented on GitHub (Sep 15, 2023):

> I'll admit it's a bit subjective, but I'd prefer to treat it as a bug per the [principle of least astonishment](https://en.wikipedia.org/wiki/Principle_of_least_astonishment).

Since I was quite astounded when I stumbled across this behaviour today, I'm totally with you on that. Especially since in many cases you won't even notice that something is missing, e.g. when the misspelled column is not among the columns displayed in the table that pops up after the import.

Reference: starred/netbox#8622