[PR #5206] Add OpenAI-powered series tools, scan inference, directory grouping, and book dedupe #4460

Open
opened 2026-04-25 00:51:00 +02:00 by adam · 0 comments

📋 Pull Request Information

Original PR: https://github.com/advplyr/audiobookshelf/pull/5206
Author: @korjik
Created: 4/22/2026
Status: 🔄 Open

Base: master ← Head: add-openai-helper


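📝 Commits (8)

  • 77206d9 Add OpenAI series evaluation
  • 83fc6f0 update
  • 6527b8b update
  • f4ce4a4 update
  • e5261d1 update
  • 58776ca update
  • b5e620b update
  • 2cfc175 update readme
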
📊 Changes

19 files changed (+2399 additions, -46 deletions)

View changed files

📝 client/components/app/BookShelfToolbar.vue (+45 -0)
📝 client/components/app/LazyBookshelf.vue (+7 -0)
📝 client/components/modals/libraries/EditModal.vue (+1 -0)
📝 client/components/modals/libraries/LibraryScannerSettings.vue (+17 -2)
📝 client/components/modals/libraries/LibraryTools.vue (+106 -0)
📝 client/pages/config/index.vue (+104 -1)
📝 readme.md (+37 -0)
📝 server/controllers/LibraryController.js (+419 -0)
📝 server/controllers/MiscController.js (+30 -0)
📝 server/controllers/SeriesController.js (+65 -0)
📝 server/models/Library.js (+2 -0)
📝 server/objects/settings/ServerSettings.js (+39 -1)
➕ server/providers/OpenAI.js (+887 -0)
📝 server/routers/ApiRouter.js (+3 -0)
📝 server/scanner/BookScanner.js (+40 -0)
📝 server/scanner/LibraryScanner.js (+222 -7)
📝 server/utils/scandir.js (+63 -35)
➕ test/server/providers/OpenAI.test.js (+272 -0)
📝 test/server/utils/scandir.test.js (+40 -0)

📄 Description

Brief summary

Adds OpenAI-assisted organization workflows for book libraries, including:

  • server-side OpenAI configuration and UI settings
  • library tools for AI series detection and AI duplicate-book cleanup
  • series-page AI story ordering
  • scan-time OpenAI metadata inference from messy paths and filenames
  • optional OpenAI directory-tree interpretation during scans for poorly structured libraries
  • improved non-AI scanner grouping for mixed parent-folder/direct-file book layouts

Which issue is fixed?

N/A

In-depth Description

This PR introduces an OpenAI integration for book-library organization and metadata cleanup.

Configuration:

  • Adds OpenAI settings to the server and web UI
  • Supports configuration from either the UI or environment variables:
    • OPENAI_API_KEY
    • OPENAI_MODEL
    • OPENAI_BASE_URL
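
As a rough illustration of how the two configuration sources could be combined, the sketch below resolves each OpenAI setting from a saved UI value first and falls back to the matching environment variable. The precedence order, the `resolveOpenAIConfig` name, and the `uiSettings` field names are assumptions made for this example and are not taken from the PR's code.

```javascript
// Hypothetical helper, not from the PR. Assumes UI values win over env vars.
function resolveOpenAIConfig(uiSettings = {}, env = process.env) {
  // Treat empty or whitespace-only strings as "not set"
  const pick = (uiValue, envValue) => {
    const fromUi = typeof uiValue === 'string' ? uiValue.trim() : ''
    const fromEnv = typeof envValue === 'string' ? envValue.trim() : ''
    return fromUi || fromEnv || null
  }

  const config = {
    apiKey: pick(uiSettings.openAIApiKey, env.OPENAI_API_KEY),
    model: pick(uiSettings.openAIModel, env.OPENAI_MODEL),
    baseUrl: pick(uiSettings.openAIBaseUrl, env.OPENAI_BASE_URL)
  }

  // Assumption: the AI features are treated as unavailable without an API key
  return { ...config, isConfigured: config.apiKey !== null }
}
```

With this shape, a caller could check `isConfigured` before showing or running any of the AI tools.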

Library tools:

  • Adds Detect Missing Series With AI
  • Adds Re-evaluate All Series
  • Adds Dedupe Books With AI

Series tools:

  • Adds Organize Story Order With AI on series pages to update series sequence values using AI
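
One way to picture the sequence update is sketched below. It assumes the AI returns an ordered array of book ids and that sequence values are stored as strings; both the response shape and the `applyStoryOrder` name are hypothetical and may differ from what the PR does.

```javascript
// Hypothetical sketch: the response shape (an ordered array of book ids) is an assumption.
function applyStoryOrder(books, orderedIds) {
  const known = new Set(books.map((b) => b.id))
  const sequenceById = new Map()
  let next = 1

  for (const id of orderedIds) {
    // Ignore ids the series does not contain and ids that were already placed
    if (!known.has(id) || sequenceById.has(id)) continue
    sequenceById.set(id, String(next++))
  }

  // Books the AI did not mention keep their current sequence value
  return books.map((b) => ({ ...b, sequence: sequenceById.get(b.id) ?? b.sequence }))
}
```

Returning new objects rather than mutating the input keeps the original values available if the result needs to be reviewed before saving.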

Scanner improvements:

  • Adds OpenAI path and filename inference as an optional scan metadata source
  • Adds Use OpenAI to interpret poor directory trees during library scans as an optional library scanner setting
  • This allows scans to recover more cleanly from messy folder structures and infer metadata when standard folder parsing is weak

Scanner grouping fixes:

  • Fixes a scanner edge case where a direct media file inside a parent series folder could cause nested book folders to collapse into a single detected item
  • Sidecar files like .nfo, .cue, and cover images are now attached to the correct logical item in that scenario
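
The grouping rule described above can be sketched roughly as follows. This is a simplified illustration, not the logic in `server/utils/scandir.js`: the `groupBookItems` name, the extension list, and the "same base name" rule for sidecars are all assumptions, and paths are assumed to be relative POSIX paths below a top-level folder.

```javascript
const path = require('path')

// Assumed subset of audio extensions, for illustration only
const MEDIA_EXTS = new Set(['.mp3', '.m4b', '.flac', '.opus'])

// Hypothetical sketch: groups relative file paths into logical book items.
function groupBookItems(relPaths) {
  const posix = path.posix
  const isMedia = (p) => MEDIA_EXTS.has(posix.extname(p).toLowerCase())
  const mediaDirs = new Set(relPaths.filter(isMedia).map((p) => posix.dirname(p)))

  // A directory is "mixed" when a media-bearing directory exists somewhere beneath it
  const isMixed = (dir) => [...mediaDirs].some((d) => d.startsWith(dir + '/'))

  const items = new Map()
  const add = (key, file) => {
    if (!items.has(key)) items.set(key, [])
    items.get(key).push(file)
  }

  for (const p of relPaths) {
    const dir = posix.dirname(p)
    if (mediaDirs.has(dir) && isMixed(dir)) {
      // File sits directly in a mixed parent: key by base name so that
      // sidecars such as Extra.nfo follow Extra.mp3 instead of the folder
      add(posix.join(dir, posix.basename(p, posix.extname(p))), p)
    } else {
      // Otherwise the containing folder is the item
      add(dir, p)
    }
  }

  // Drop groups that ended up with no media file (e.g. unmatched sidecars)
  return new Map([...items].filter(([, files]) => files.some(isMedia)))
}
```

In this sketch a parent folder holding `Extra.mp3` next to `Book 1/` and `Book 2/` yields three items rather than one collapsed item.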

OpenAI response handling:

  • Adds defensive validation for AI responses
  • Invalid or partial AI rows are skipped instead of aborting the whole operation
  • Duplicate, missing, or unknown ids in AI responses are tolerated where appropriate so long-running library operations can continue
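
The skip-and-continue behavior described above might look something like the sketch below. The row shape (`id`, `series`, optional `sequence`), the `sanitizeSeriesRows` name, and the returned summary fields are assumptions for illustration; the actual validation lives in `server/providers/OpenAI.js` and may differ.

```javascript
// Hypothetical row shape: { id: string, series: string, sequence?: string | number }
function sanitizeSeriesRows(rawText, knownIds) {
  let parsed
  try {
    parsed = JSON.parse(rawText)
  } catch {
    // Unparseable output: report every id as missing instead of throwing
    return { rows: [], skipped: 0, missingIds: [...knownIds] }
  }

  // Accept either a bare array or an object wrapping the array in `rows`
  const candidates = Array.isArray(parsed) ? parsed : Array.isArray(parsed?.rows) ? parsed.rows : []
  const known = new Set(knownIds)
  const seen = new Set()
  const rows = []
  let skipped = 0

  for (const row of candidates) {
    const valid =
      row &&
      typeof row === 'object' &&
      typeof row.id === 'string' &&
      typeof row.series === 'string' &&
      row.series.trim() !== ''

    // Skip malformed rows, unknown ids, and repeated ids without aborting
    if (!valid || !known.has(row.id) || seen.has(row.id)) {
      skipped++
      continue
    }
    seen.add(row.id)
    rows.push({
      id: row.id,
      series: row.series.trim(),
      sequence: row.sequence != null ? String(row.sequence) : null
    })
  }

  // Ids the model never returned are listed so the caller can decide what to do
  return { rows, skipped, missingIds: knownIds.filter((id) => !seen.has(id)) }
}
```

Returning the skipped count and missing ids alongside the accepted rows lets a long-running operation log what was dropped while still applying the usable results.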

This change is aimed at users with poorly normalized audiobook libraries, especially libraries where:

  • series membership is incomplete or inconsistent
  • directory structure does not follow the expected author/series/title layout
  • standalone files and nested book folders are mixed together
  • duplicate copies of the same book exist with different metadata quality

How have you tested this?

I tested the changes with focused server-side validation and scanner regression tests.

Reproducible checks run locally:

  • node --check server/providers/OpenAI.js
  • node --check server/scanner/BookScanner.js
  • node --check server/scanner/LibraryScanner.js
  • node --check server/controllers/LibraryController.js
  • node --check server/routers/ApiRouter.js

Targeted test suites run locally:

  • ./node_modules/.bin/mocha test/server/providers/OpenAI.test.js
  • ./node_modules/.bin/mocha test/server/utils/scandir.test.js
  • ./node_modules/.bin/mocha test/server/scanner/LibraryScanner.test.js
  • Combined runs of the focused suites also passed during development

Specific behaviors covered:

  • OpenAI response parsing and validation
  • tolerance of malformed AI outputs
  • duplicate/unknown/missing AI ids
  • scan metadata inference payload validation
  • directory-grouping payload validation
  • duplicate-book decision payload validation
  • scanner grouping regression for poor parent-folder/direct-file layouts
  • scanner regrouping helper behavior for mixed standalone and nested book items

Manual verification performed in the app/container workflow:

  1. Build and run the local Docker image instead of the upstream image
  2. Configure OpenAI through the UI
  3. Run AI series detection from library tools
  4. Run AI series re-evaluation from library tools
  5. Run AI story ordering from a series page
  6. Enable OpenAI scan metadata inference and AI directory grouping in library scanner settings

Screenshots

Settings

image

Library Tools

image

Series menu update

image

Library scanning setting update (off by default)

image

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

adam added the pull-request label 2026-04-25 00:51:00 +02:00
Reference: starred/audiobookshelf#4460