Implement natural sorting for the sites list #223

Closed
opened 2025-12-29 16:19:38 +01:00 by adam · 12 comments

Originally created by @nicko170 on GitHub (Jul 15, 2016).

The sites list is ordered 10, 11, 12, 13, 1, 2, 3.

adam added the type: feature label 2025-12-29 16:19:38 +01:00
adam closed this issue 2025-12-29 16:19:38 +01:00

@nicko170 commented on GitHub (Jul 15, 2016):

![image](https://cloud.githubusercontent.com/assets/9477420/16860985/0fa610a0-4a80-11e6-9f5e-d3ae759193d0.png)

@jeremystretch commented on GitHub (Jul 15, 2016):

Looks like another candidate for `natsort`. Should be simple to implement.

Just out of curiosity, why do you name sites like this?
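
For readers unfamiliar with the term, natural sorting can be sketched in plain Python without the `natsort` package. The helper name `natural_key` is an assumption made for illustration; it is not part of `natsort` or NetBox:

```python
import re

def natural_key(name):
    """Split a name into text and integer chunks so that digits compare numerically."""
    # re.split with a capturing group keeps the digit runs in the result;
    # odd positions always hold the captured digit runs.
    parts = re.split(r'(\d+)', name)
    return [int(p) if i % 2 else p for i, p in enumerate(parts)]

sites = ['10', '11', '12', '13', '1', '2', '3']
print(sorted(sites))                   # plain string sort: '10' lands before '2'
print(sorted(sites, key=natural_key))  # natural sort: 1, 2, 3, 10, 11, 12, 13
```

Because text and integer chunks always alternate, keys compare position by position without mixing strings and integers.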


@nicko170 commented on GitHub (Jul 15, 2016):

That would be awesome. I had a read, but Django isn't really my thing. I was hoping to just be able to fix it and send it back :-)

Our sites are each assigned a number which everything is labelled against, so switches are sw-6-1, sw-6-2, etc., and routers are bdr-1-1, bdr-1-2, bdr-6-1. This was done before I started and I just kept it going for new sites.


@Zanthras commented on GitHub (Jul 16, 2016):

My coworkers on the enterprise side name sites (aka offices) like that as well because it matches their assigned IP space. I find it odd, but to each their own.


@jeremystretch commented on GitHub (Jul 18, 2016):

I know I suggested `natsort` earlier, but thinking further on this I'm not sure it's a road we want to go down. Calling `natsorted` on a queryset forces its evaluation prior to pagination (which is handled by the table's `RequestConfig`). This means that all rows matching the given filter set are pulled from the database to be sorted, even when only a subset will be displayed.
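
The cost of sorting before pagination can be illustrated with a toy stand-in for a lazy queryset. `LazyRows` is a hypothetical class written for this sketch, not a Django API; it only counts how many rows would be fetched:

```python
class LazyRows:
    """Toy model of a lazy queryset: rows are 'fetched' only when accessed."""

    def __init__(self, rows):
        self._rows = list(rows)
        self.fetched = 0  # number of rows pulled from the pretend database

    def __iter__(self):
        # Full iteration (what sorting in Python needs) fetches every row.
        for row in self._rows:
            self.fetched += 1
            yield row

    def __getitem__(self, page):
        # Slicing models LIMIT/OFFSET: only the requested rows are fetched.
        rows = self._rows[page]
        self.fetched += len(rows)
        return rows

all_rows = LazyRows(str(n) for n in range(1, 1001))
first_page = sorted(all_rows, key=int)[:25]  # sort in Python, then paginate
print(all_rows.fetched)                      # 1000

paged = LazyRows(str(n) for n in range(1, 1001))
first_page_db = paged[0:25]                  # paginate in the "database"
print(paged.fetched)                         # 25
```

Both approaches show the same 25 rows here, but the first one reads the whole table to do so.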

A more efficient approach would be to do the sorting in the database by CASTing integers, but it's impossible to achieve a foolproof sorting scheme since the format of a site name is arbitrary. Maybe it would be enough to order sites (and potentially other objects) first by a leading integer, if one exists, and then fall back to the default ordering. Something like this:

```
queryset.extra(select={
    '_number': "CAST(SUBSTRING({} FROM '^([0-9]+)') AS integer)".format(sql_col),
}).order_by('_number')
```
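
What this ordering would produce can be approximated in plain Python. This is only a sketch: `leading_number_key` is a hypothetical helper, Python's `re` stands in for the database's regex engine, the name itself is assumed to be the fallback ordering, and names without a leading integer are placed last on the assumption that the database sorts NULLs last (PostgreSQL's default for ascending order):

```python
import re

def leading_number_key(name):
    """Mimic ordering by a leading integer first, then by the name itself."""
    match = re.match(r'([0-9]+)', name)
    number = int(match.group(1)) if match else None
    # (number is None) sorts names without a leading integer after the others.
    return (number is None, number or 0, name)

sites = ['10', '11', 'HQ', '2-annex', '1', '2']
print(sorted(sites, key=leading_number_key))
# ['1', '2', '2-annex', '10', '11', 'HQ']
```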

@Zanthras commented on GitHub (Jul 18, 2016):

I'll do some research and see if I can find a more general solution, possibly chunking the string into up to 3 or 4 parts and sorting each chunk normally. That should give a better ordering while still happening in the database, before the data is read.


@Zanthras commented on GitHub (Jul 19, 2016):

I saw this option: http://stackoverflow.com/questions/12965463/humanized-or-natural-number-sorting-of-mixed-word-and-number-strings/20667107#20667107. Actually implementing something like that seemed to require falling back to a raw queryset, which doesn't seem worth it at all. Some of the other options seemed to require custom types or tables, which could be doable but is not directly obvious from Django's ORM. Here is the best I could come up with within those constraints.

Given some data from the example at the top of the issue, here are the sorted results:
![image](https://cloud.githubusercontent.com/assets/5170786/16935085/1bfa0314-4d11-11e6-9528-0a045791d068.png)

I'll submit a pull request for the complete thing if this is a decent alternative. The bulk of the code is this function:

```
def natural_ordering(queryset, sql_col, fallback_ordering):
    # fallback_ordering must be a tuple of field names; it is appended after the chunk columns.
    ordering = (
        '_strchunk1', '_intchunk1', '_strchunk2', '_intchunk2',
        '_strchunk3', '_intchunk3', '_strchunk4', '_intchunk4',
    ) + fallback_ordering

    # Raw strings keep the regex backslashes intact.
    return queryset.extra(select={
        '_strchunk1': r"SUBSTRING({} FROM '^(\D+)')".format(sql_col),
        '_intchunk1': r"CAST(SUBSTRING({} FROM '^(\d+)') AS integer)".format(sql_col),
        '_strchunk2': r"SUBSTRING({} FROM '^\d+(\D+)')".format(sql_col),
        '_intchunk2': r"CAST(SUBSTRING({} FROM '^\D+(\d+)') AS integer)".format(sql_col),
        '_strchunk3': r"SUBSTRING({} FROM '^\D+\d+(\D+)')".format(sql_col),
        '_intchunk3': r"CAST(SUBSTRING({} FROM '^\d+\D+(\d+)') AS integer)".format(sql_col),
        '_strchunk4': r"SUBSTRING({} FROM '^\d+\D+\d+(\D+)')".format(sql_col),
        '_intchunk4': r"CAST(SUBSTRING({} FROM '^\D+\d+\D+(\d+)') AS integer)".format(sql_col),
    }).order_by(*ordering)
```
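
To see what each column would hold, the eight patterns from `natural_ordering` can be run against a sample name in Python. This is an illustration only: Python's `re` is assumed to behave like the database's regex engine for these simple patterns, and `chunks` is a hypothetical helper that returns `None` where the SQL would yield NULL:

```python
import re

# Same patterns as in natural_ordering, listed in ordering sequence.
PATTERNS = [
    ('_strchunk1', r'^(\D+)'),
    ('_intchunk1', r'^(\d+)'),
    ('_strchunk2', r'^\d+(\D+)'),
    ('_intchunk2', r'^\D+(\d+)'),
    ('_strchunk3', r'^\D+\d+(\D+)'),
    ('_intchunk3', r'^\d+\D+(\d+)'),
    ('_strchunk4', r'^\d+\D+\d+(\D+)'),
    ('_intchunk4', r'^\D+\d+\D+(\d+)'),
]

def chunks(name):
    """Return the value each chunk column would take for a given name."""
    result = {}
    for column, pattern in PATTERNS:
        match = re.search(pattern, name)
        value = match.group(1) if match else None
        # Integer chunks are cast, mirroring CAST(... AS integer).
        if value is not None and column.startswith('_int'):
            value = int(value)
        result[column] = value
    return result

for column, value in chunks('sw-6-1').items():
    print(column, repr(value))
```

For `sw-6-1` only `_strchunk1` (`'sw-'`), `_intchunk2` (`6`), `_strchunk3` (`'-'`) and `_intchunk4` (`1`) are populated; the other four columns are empty because they cover names that start with digits.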

@jeremystretch commented on GitHub (Jul 19, 2016):

I think we can simplify that a lot. I only see the need for three chunks:

  1. Leading digits, if any (treated as a single integer)
  2. Middle text
  3. Trailing digits, if any (treated as a single integer)

This should cover 99% of cases, I would think.


@jeremystretch commented on GitHub (Jul 19, 2016):

I think I have the ordering pretty much nailed down:

```
queryset = super(NaturalOrderByManager, self).get_queryset().extra(select={
    id1: r"CAST(SUBSTRING({}.{} FROM '^(\d+)') AS integer)".format(db_table, primary_field),
    id2: r"SUBSTRING({}.{} FROM '^\d*(.*?)\d*$')".format(db_table, primary_field),
    id3: r"CAST(SUBSTRING({}.{} FROM '(\d+)$') AS integer)".format(db_table, primary_field),
})
```

Just deciding now how best to integrate it via managers.
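
The effect of these three columns can be approximated in plain Python. This is a sketch under stated assumptions: `three_chunk_key` is a hypothetical helper, Python's `re` stands in for the database's regex engine, the columns are assumed to be ordered as leading integer, middle text, trailing integer, and missing integers are placed last (PostgreSQL's default NULL placement for ascending order):

```python
import re

def three_chunk_key(name):
    """Mimic ordering by (leading integer, middle text, trailing integer)."""
    lead = re.search(r'^(\d+)', name)
    middle = re.search(r'^\d*(.*?)\d*$', name).group(1)
    trail = re.search(r'(\d+)$', name)

    def nulls_last(match):
        # (True, 0) for a missing integer sorts after any (False, n).
        return (match is None, int(match.group(1)) if match else 0)

    return (nulls_last(lead), middle, nulls_last(trail))

print(sorted(['10', '11', '12', '13', '1', '2', '3'], key=three_chunk_key))
# ['1', '2', '3', '10', '11', '12', '13']
print(sorted(['sw-6-10', 'sw-6-2', 'sw-6-1'], key=three_chunk_key))
# ['sw-6-1', 'sw-6-2', 'sw-6-10']
```

Integers embedded in the middle text are still compared as text under this scheme, so `sw-10-1` would sort before `sw-6-1`.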

@jeremystretch commented on GitHub (Jul 20, 2016):

I've implemented a simple (and possibly naïve) form of natural ordering for sites, racks, and devices. Objects are ordered appropriately based on leading and trailing integers. Please try it out.

With the new ordering applied:
![sites_list](https://cloud.githubusercontent.com/assets/13487278/16995725/f1bef872-4e7b-11e6-8816-a2f5753be32a.png)

@joachimtingvold commented on GitHub (Jul 22, 2016):

Out of curiosity, why do the sorting server-side? This requires extra queries every time you re-sort a column by clicking on it. As an example, DataTables for jQuery has natural sort.


@jeremystretch commented on GitHub (Jul 22, 2016):

Sorting on the client side requires pulling down the entire table. For some objects, this is potentially thousands of rows. In these cases it's usually much faster to cap the database query and return only the subset of rows which will be displayed.
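
As a rough illustration of a capped query, the sketch below sorts in the database and returns a single page. It uses an in-memory SQLite database from Python's standard library as a stand-in for the real backend; the `site` table, the page size, and the `CAST`-based ordering are assumptions made for the example:

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE site (name TEXT)')
# Insert in reverse so the result order visibly comes from ORDER BY.
conn.executemany('INSERT INTO site (name) VALUES (?)',
                 [(str(n),) for n in range(1000, 0, -1)])

PAGE_SIZE = 25
# The database sorts all rows but hands back only the requested page.
page = [row[0] for row in conn.execute(
    'SELECT name FROM site ORDER BY CAST(name AS INTEGER), name LIMIT ? OFFSET ?',
    (PAGE_SIZE, 0),
)]
print(len(page), page[:3])  # 25 ['1', '2', '3']
```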

Reference: starred/netbox#223