mirror of
https://github.com/advplyr/audiobookshelf.git
synced 2026-05-30 23:40:40 +02:00
[PR #3952] [MERGED] Improve book library page query performance on title, titleIgnorePrefix, and addedAt sort orders. #4129
Reference in New Issue
Block a user
Delete Branch "%!s()"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
📋 Pull Request Information
Original PR: https://github.com/advplyr/audiobookshelf/pull/3952
Author: @mikiher
Created: 2/8/2025
Status: ✅ Merged
Merged: 2/11/2025
Merged by: @advplyr
Base:
master← Head:query-performance📝 Commits (6)
52bb286Add a profile utility functiona131432Improve page load queries on title, titleIgnorePrefix, and addedAt sort order3d08a35Add index on (libraryId, mediaType, createdAt)9a26119Update server/models/Book.jsef45f84Update upwards migration to be idempotent4cc300dUpdate changelog with v2.19.1 migration📊 Changes
8 files changed (+416 additions, -9 deletions)
View changed files
📝
server/migrations/changelog.md(+1 -0)➕
server/migrations/v2.19.1-copy-title-to-library-items.js(+164 -0)📝
server/models/Book.js(+9 -0)📝
server/models/LibraryItem.js(+16 -1)📝
server/scanner/BookScanner.js(+2 -0)➕
server/utils/profiler.js(+41 -0)📝
server/utils/queries/libraryItemsBookFilters.js(+35 -8)➕
test/server/migrations/v2.19.1-copy-title-to-library-items.test.js(+148 -0)📄 Description
Brief summary
Significantly improves book library Sequelize page queries for the following sort orders:
Which issue is fixed?
This partially fixes #2073 (resolving the book library load times, but not the podcast library load times)
In-depth Description
After digging more into the details of the issues people were complaining about in #2073 and doing additional performance analysis in Sequelize query to bring the page here, I made the following observations:
1 is by far the most serious problem, and also causes significant degradation in query performance as the offset becomes larger.
When the main query is sorting by title:
it's evident in the query plan that the query engine cannot use the existing book.title index, and needs to build a temporary tree for sorting.
Even when you remove the
feedstable join from the query:The query plan still doesn't make use of the book.title index:
The significant boost in performance can come only if the title column is put in the libraryItems table, and an index on
(libraryId, mediaType, title)is built. This way, filtering and sorting happens at the same time, and the index can be traveresed very quickly to reach the required offset without needing to look at the tables themselves.So with a query like this:
We get the following query plan:
Which is optimal! (or, to be more precise, optimal given the current architecture)
Resolution
The following changes were made:
titleandtitleIgnorePrefixcolumns were added tolibraryItemsbookschange.libraryItems:(libraryId, mediaType, title)(libraryId, mediaType, titleIgnorePrefix)(libraryId, mediaType, createdAt)findAllis called instead offindAndCountAllHow have you tested this?
I tested on loading 72 consecutive page of 35 books each, on each of the above sorting orders, on an ABS docker container running on a Synology 920+ NAS (I wanted to test on a hardware that was much weaker than my dev machine).
Results
All measurements are in ms.
Summary
Overall, we see a 94-95% drop (!) in mean and median query time.
Standard deviaion also reduces drastically from ~400 to ~10.
Note how the steady rise in query time (as the the requested offset grows) which is quite visible before, is not noticable after.
Sorting by title - before:
Sorting by title - after:
Sorting by titleIgnorePrefix - before:
Sorting by titleIgnorePrefix - after:
Sorting by addedAt - before:
Sorting by addedAt - after:
🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.