[Enhancement]: GPTs optimised API #1532

2026-04-24T23:49:08+02:00

adam commented

2026-04-24 23:49:08 +02:00

Originally created by @emmertex on GitHub (Nov 16, 2023).

Describe the feature/enhancement

I have been experimenting with OpenAI's GPTs to search my library for book suggestions.
I had some success with the existing APIs, but there seem to be undocumented limits on how much of a response a GPT will actually look at. They may not be arbitrary (they could be a percentage of token capacity), but the issue remains that the existing responses are too verbose.
To better understand how it works, I let it access the SQL database directly, about 90MB in my case. While this seemed to work, it was not finding books it should have. Unfortunately, it was unclear how it was using the database, as it provided no debug output.
I then made a simple CSV with just the fields it should care about.
SELECT books.title AS Title, authors.name AS Author, books.narrators AS Narrators, books.description AS Description, books.asin, books.isbn
FROM books
JOIN bookAuthors ON bookAuthors.bookId = books.id
JOIN authors ON bookAuthors.authorId = authors.id;
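
For anyone wanting to reproduce the CSV experiment, a minimal sketch of the export follows. It assumes a SQLite file with the table and column names used in the query above; the demo schema, the `export_books_csv` helper, and the sample rows are hypothetical and simplified, not the project's actual schema.

```python
import csv
import io
import sqlite3

# Same join as the query above, with explicit aliases for every column.
QUERY = """
SELECT books.title AS Title, authors.name AS Author,
       books.narrators AS Narrators, books.description AS Description,
       books.asin AS asin, books.isbn AS isbn
FROM books
JOIN bookAuthors ON bookAuthors.bookId = books.id
JOIN authors ON bookAuthors.authorId = authors.id
"""


def export_books_csv(conn, include_description=True):
    """Return the query result as CSV text, optionally without Description."""
    cur = conn.execute(QUERY)
    headers = [col[0] for col in cur.description]
    keep = [i for i, name in enumerate(headers)
            if include_description or name != "Description"]
    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow([headers[i] for i in keep])
    for row in cur:
        writer.writerow([row[i] for i in keep])
    return out.getvalue()


def make_demo_db():
    """Build a tiny in-memory database (hypothetical, simplified schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE books (id TEXT PRIMARY KEY, title TEXT, narrators TEXT,
                            description TEXT, asin TEXT, isbn TEXT);
        CREATE TABLE authors (id TEXT PRIMARY KEY, name TEXT);
        CREATE TABLE bookAuthors (bookId TEXT, authorId TEXT);
        INSERT INTO books VALUES ('b1', 'Example Book', 'Jane Reader',
                                  'A long blurb.', 'B000000000', '0000000000');
        INSERT INTO authors VALUES ('a1', 'Sam Writer');
        INSERT INTO bookAuthors VALUES ('b1', 'a1');
    """)
    return conn


if __name__ == "__main__":
    print(export_books_csv(make_demo_db(), include_description=False))
```

Dropping the Description column is a single flag here, which makes it easy to compare the two CSV sizes discussed in this issue.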

In testing, the description is both a positive and a negative. Including it makes a 24MB CSV, and the GPT ends up searching for keywords to find books.
If I exclude descriptions, it must rely on its own knowledge of books, either internal or from searching Bing. This does not help, because the CSV, even with all descriptions removed, is still too large for it to use directly (yes, it did work using the new GPT-4 Turbo API, but I am trying to use the consumer ChatGPT GPTs).
Depending on how it decides to approach the problem, it either reads only a small part of the library, or searches the library for books it came up with itself, potentially finding no matches.

Adding categories was a mixed bag, because it did not always use keywords that narrowed the list as it should. It really needs to be able to use categories, but more as a sort than a filter... maybe?

So, what I am proposing (and am potentially willing to implement myself and submit upstream) is an API specifically for ChatGPT.

A really dumb search, so that any keywords ChatGPT uses are matched against Description, Subtitle, Tags, and Genre.
A really minimalist response: only Title, Author, and Narrators, plus an alternative endpoint it can use that also includes Description.
The less there is in the response, the more of it the GPT can use. Any unnecessary information only increases the chance that it won't be analysed.
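
The proposal above could be sketched roughly as follows. This is only an illustration under stated assumptions: the flattened `books` table, its column names, and the `search_books` helper are hypothetical and do not reflect the project's actual schema or API. Keywords are matched with a plain substring `LIKE` against the four fields named above, and the response carries only Title, Author, and Narrators unless Description is explicitly requested.

```python
import sqlite3

# Columns that keywords are matched against (hypothetical column names).
SEARCH_COLUMNS = ("description", "subtitle", "tags", "genres")


def search_books(conn, keywords, with_description=False):
    """Return minimal records for books matching ANY keyword in ANY column."""
    fields = ["title", "author", "narrators"]
    if with_description:
        fields.append("description")
    clauses, params = [], []
    for word in keywords:
        for column in SEARCH_COLUMNS:
            clauses.append(f"{column} LIKE ?")
            params.append(f"%{word}%")  # LIKE wildcards in `word` are not escaped
    if not clauses:
        return []
    sql = (f"SELECT {', '.join(fields)} FROM books "
           f"WHERE {' OR '.join(clauses)} ORDER BY title")
    return [dict(zip(fields, row)) for row in conn.execute(sql, params)]


def make_demo_db():
    """Two made-up books in an in-memory database, for illustration only."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE books (title, author, narrators, subtitle, "
                 "description, tags, genres)")
    conn.executemany("INSERT INTO books VALUES (?, ?, ?, ?, ?, ?, ?)", [
        ("Deep Water", "A. Author", "N. Narrator", "A Sea Story",
         "A voyage goes wrong.", "ocean,survival", "Adventure"),
        ("Quiet Hills", "B. Author", "M. Narrator", None,
         "Life in a small village.", "rural", "Literary Fiction"),
    ])
    return conn


if __name__ == "__main__":
    db = make_demo_db()
    print(search_books(db, ["ocean"]))
    print(search_books(db, ["village"], with_description=True))
```

Keeping the default response to three fields is the point of the design: the Description variant is opt-in, so the GPT only pays the token cost when it asks for it.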

A potential additional filter, where it can pass a username to return only books the user has or has not listened to, would also greatly increase the usefulness. I have tested that with a CSV as well, using only title and author, and had no issues.
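
A per-user filter of this kind might look like the sketch below. The `progress` table, its `username`, `bookId`, and `finished` columns, and the `books_for_user` helper are all assumptions made for illustration; the real progress data is stored differently.

```python
import sqlite3


def books_for_user(conn, username, listened=True):
    """Return title/author for books the user has (or has not) finished."""
    op = "EXISTS" if listened else "NOT EXISTS"
    sql = f"""
        SELECT title, author FROM books
        WHERE {op} (
            SELECT 1 FROM progress
            WHERE progress.bookId = books.id
              AND progress.username = ?
              AND progress.finished = 1
        )
        ORDER BY title
    """
    return [{"title": t, "author": a} for t, a in conn.execute(sql, (username,))]


def make_demo_db():
    """Three made-up books and some progress rows (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE books (id TEXT PRIMARY KEY, title TEXT, author TEXT);
        CREATE TABLE progress (username TEXT, bookId TEXT, finished INTEGER);
        INSERT INTO books VALUES ('b1', 'Alpha', 'A. Author'),
                                 ('b2', 'Beta', 'B. Author'),
                                 ('b3', 'Gamma', 'C. Author');
        INSERT INTO progress VALUES ('alice', 'b1', 1),
                                    ('alice', 'b2', 0),
                                    ('bob', 'b3', 1);
    """)
    return conn


if __name__ == "__main__":
    db = make_demo_db()
    print(books_for_user(db, "alice", listened=True))
    print(books_for_user(db, "alice", listened=False))
```

In this sketch a book that was started but not finished counts as not listened to; whether that is the right cut-off is a design question for the discussion.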

I would also like it to return the book ID so it can provide the URL, but I think that would need to be another API call, again because responses need to stay as small as possible.

Discussion?


adam added the enhancement label 2026-04-24 23:49:08 +02:00

adam closed this issue

2026-04-24 23:49:08 +02:00

adam commented

2026-04-24 23:49:09 +02:00

@emmertex commented on GitHub (Nov 18, 2023):

To follow up on this: despite instructions, ChatGPT is not good at consuming large files. Sometimes it greps them, sometimes it searches, but more often than not it does something dumb like taking the head of the file or a random sample. Using a CSV is not reliable at all, and even telling it what code to use for the search does not reliably make it do so.

When testing with a pipeline that forces it to search, the results are far better. Consistency can be forced by limiting the ways it can interact with the data. If a result set is too large, returning no results and forcing it to use more restrictive filters would be more beneficial than pagination.
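
The "refuse instead of paginate" idea can be sketched in a few lines. The cap of 25, the response shape, and the `cap_results` name are hypothetical choices for illustration, not values taken from any real implementation.

```python
MAX_RESULTS = 25  # hypothetical cap; tune to what the GPT reliably reads


def cap_results(rows, max_results=MAX_RESULTS):
    """Return the rows, or an empty result with a hint if there are too many."""
    rows = list(rows)
    if len(rows) > max_results:
        return {
            "results": [],
            "count": len(rows),
            "error": (f"{len(rows)} matches exceed the limit of {max_results}; "
                      "add more restrictive filters and search again."),
        }
    return {"results": rows, "count": len(rows), "error": None}


if __name__ == "__main__":
    print(cap_results([{"title": "Alpha"}]))
    print(cap_results([{"title": f"Book {i}"} for i in range(100)]))
```

Reporting the match count alongside the error gives the GPT a signal of how much it needs to narrow the search, without handing it a partial list it might treat as complete.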

I'll fork and start working on this.


adam commented

2026-04-24 23:49:09 +02:00

@emmertex commented on GitHub (Dec 28, 2023):

Just as an update: I cannot make the performance reliable and repeatable unless I use the paid API.
Using GPTs through the paid ChatGPT Plus service results in unpredictable behaviour, and every time I tweak my setup to get it working, the model is changed and it breaks again.
To make matters worse, the recently added harm filters mean that anything in the Horror, True Crime, and Erotica categories is hopeless.

I will keep messing with this, but at this stage it requires both a custom API that serves data in an AI-compatible way and an AI-generated summary of each book for it to search. For a 2000-book library like mine, the summaries cost about US$4 to generate, but the quality of the output is fantastic, and worth having even if nothing else works in the short term.


adam referenced this issue

2026-04-25 00:16:10 +02:00

[PR #1532] [CLOSED] Added block tags based on Patreon private rss feed #3568


Reference: starred/audiobookshelf#1532