mirror of
https://github.com/advplyr/audiobookshelf.git
synced 2026-05-30 23:40:40 +02:00
[Enhancement]: TTS (Text to speech) #362
Open · opened 2026-04-24 23:06:03 +02:00 by adam · 15 comments
Originally created by @kobemtl on GitHub (May 17, 2022).
Describe the feature/enhancement
Hi, first of all, thanks a lot for the great app. I am just wondering if it's possible to integrate TTS? I use some apps with a TTS feature to listen to epub or aws ebooks all the time. It would be nice if audiobookshelf could have this feature. Thanks again.
@kobemtl commented on GitHub (Jun 14, 2022):
I am using Moon+ Reader for many years.
@danielzrob commented on GitHub (Mar 22, 2023):
Oh so this, I would LOVE to see some TTS in this. My dumb brain will only allow books to go in through my ears, and there are so many books that have no audio option.
I currently use SayIt on Android, but with all this cool AI TTS going on right now I am trying to find something better.
@p0n1 commented on GitHub (Nov 10, 2023):
Just noticed this feature request. I'm a heavy audiobookshelf user. Yes, many books I love to read don't have an audio version. So I made this simple tool https://github.com/p0n1/epub_to_audiobook to convert epub books into audiobooks. It works great for me. I have consumed many books using this tool together with the great audiobookshelf.
I just added support for OpenAI TTS, which is unbelievably awesome. Not sure if it's a good idea to integrate the TTS features into audiobookshelf itself, because we can convert outside and import easily.
@uniquePWD commented on GitHub (Dec 7, 2025):
There hasn't been any activity in this bug since 2023?
Is this something the project would be interested in? In #1189 someone even offered a PR and that seems to have fallen through the cracks. It was also requested in #601
My own request was in #1743 where I requested
@Phoenix-Grand commented on GitHub (Dec 7, 2025):
I would very much like this TTS feature.
+1 vote from me.
@uniquePWD commented on GitHub (Dec 11, 2025):
@advplyr would you be willing to accept a patch from @cutiepoka?
@nichwall commented on GitHub (Dec 11, 2025):
There are currently over 1000 open bug reports and feature requests between the server and app repositories, and this is a project we work on for fun in our free time (we all have day jobs).
Not sure what you are referencing, because I'm not finding a PR numbered 1189 and this is 601.
@Oxika95 commented on GitHub (Jan 23, 2026):
Generating full-length audio for a book seems a bit overkill, and while possible, it certainly isn't easy. I have tried several times to get full-length generation set up, but I don't have hardware good enough to support it on my server. I have also tried it on a decently powered home PC, and it can take hours and is prone to failure. It isn't even a better solution in my opinion, as I like being able to switch between reading and listening, which TTS enables. I feel like TTS, at least for mobile apps, would be a qualitative upgrade. Simply allowing for on-device TTS would be great. I really hope to see a turnaround on the stance regarding TTS support.
@nichwall commented on GitHub (Jan 24, 2026):
Not sure exactly what we would need to do here since most devices already have built in TTS engines as you mentioned that can read what is on the screen. We don't include automatic caption generation support either (speech to text) because this is also supported natively by many devices, and generating text ahead of time (or audio like mentioned above) is basically just combining the ebook and audiobook in the same library item as has already been discussed.
@Oxika95 commented on GitHub (Jan 24, 2026):
On-board TTS engines are great, but the screen reader functionality is absolutely awful. Screen readers don't allow continuous reading: they often require you to monitor the screen and restart TTS after it finishes, they don't track progress, and they often turn off if the screen or app switches. They are usually part of an accessibility service that reads out everything, like incoming calls, texts, and notifications, names the buttons or elements you click, and interrupts or prevents clicks until it has finished announcing out loud what you clicked last.
Look at any TTS ereader and look at how it's implemented. Look at @Voice Aloud Reader, literally the gold standard, if dated in appearance. TTS can be so much better, and now with options for neural AI voices it is frequently better than many voice actors, imho.
A play button that initiates internal TTS, or is linked to the built-in engine, and that can track progress would be huge. 75% of my library is ebooks. I don't use audiobookshelf to read them, even though I would love to, because manually downloading them and uploading them to another app with TTS is a better experience.
@uniquePWD commented on GitHub (Jan 24, 2026):
Clarification on the Request: Client-Side TTS with Progress Syncing
There appears to be a misunderstanding regarding the desired implementation. This request is not for the server to generate audio files (which is storage/CPU intensive).
The request is for a Client-Side feature for the mobile app that utilizes the device's native capabilities.
The Goal
For the Android app to open an epub and use the device's native System TTS (or a user-selected engine) to read the text aloud. This playback should behave exactly like a standard audio player, appearing in the notification shade, continuing playback with the screen off, responding to Bluetooth controls, and syncing progress to the server.
Technical Implementation
This feature would likely require integrating the standard Android TTS engine into a Foreground Service that manages a MediaSession.
Text-to-Speech: The app would utilize the TextToSpeech class. This allows the app to offload generation to the user's preferred installed engine (supporting modern Neural/AI on-device models without extra app overhead).
Media Controls: To ensure the TTS acts like a media player (lock screen controls, Bluetooth play/pause), the audio stream must be managed via a MediaSession (or MediaSessionService in Media3).
Background Playback: To keep reading while the screen is off, the app must run a Foreground Service with the mediaPlayback type.
Progress Syncing: The UtteranceProgressListener can be used to track which sentence/paragraph is currently being spoken, allowing the app to calculate progress and sync it back to the AudioBookShelf server.
@uniquePWD commented on GitHub (Feb 15, 2026):
I've been following the progress on PR #1747 (Media3 Architecture), and it looks fantastic.
Just wanted to note that the move to Media3 and MediaSessionService effectively solves the biggest technical hurdle for this request. With that architecture in place, plugging in a Local TTS engine as a playback source becomes much cleaner since the MediaSession handling and background service logic are now standardized.
Exciting to see the groundwork being laid!
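For anyone curious what the progress-syncing idea from the earlier "Technical Implementation" comment might look like, here is a rough, platform-neutral sketch of just the progress arithmetic. It is written in Python only to keep it self-contained and is not code from the project. The class name TtsProgressTracker and its methods are hypothetical. It assumes the ebook text has already been split into utterances (sentences or paragraphs) and that the app is notified when each one finishes, for example from the onDone callback of Android's UtteranceProgressListener. Sending the resulting fraction to the server is left out.

```python
from itertools import accumulate


class TtsProgressTracker:
    """Hypothetical helper: maps finished utterances to a progress fraction."""

    def __init__(self, utterances):
        # `utterances` is the book text already split into sentences or
        # paragraphs; how that split is done is outside this sketch.
        lengths = [len(u) for u in utterances]
        # cumulative[i] = number of characters in utterances 0..i inclusive.
        self.cumulative = list(accumulate(lengths))
        self.total = self.cumulative[-1] if self.cumulative else 0
        self.completed = 0  # count of utterances fully spoken so far

    def on_done(self, index):
        # Assumed hook: called when the TTS engine reports that the
        # utterance at `index` has finished. Late or repeated callbacks
        # never move progress backwards.
        self.completed = max(self.completed, index + 1)

    def progress(self):
        # Fraction of the book (by character count) spoken so far, 0.0 to 1.0.
        if self.total == 0 or self.completed == 0:
            return 0.0
        return self.cumulative[self.completed - 1] / self.total


tracker = TtsProgressTracker(["First sentence.", "Second, longer sentence here."])
tracker.on_done(0)
print(f"{tracker.progress():.2%} read")
```

Weighting by character count rather than by utterance count is one possible design choice: it keeps the reported progress roughly proportional to listening time even when paragraphs vary a lot in length.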
@iredmedia commented on GitHub (Feb 23, 2026):
@uniquePWD Link seems wrong? Links to "Add dutch translation".
@uniquePWD commented on GitHub (Feb 23, 2026):
Try this link: https://github.com/advplyr/audiobookshelf-app/pull/1747/commits/
@v4u6h4n commented on GitHub (Mar 15, 2026):
I would love to be able to have TTS options with ebooks. There is just so much of what I read that isn't available in audiobook form, and I am really not happy with the options on Linux for listening to ebooks.