[PR #2918] WIP: Adding Transcription/Subtitle Viewing Support to the Web Player (VTT) #3822

Open
opened 2026-04-25 00:17:11 +02:00 by adam · 0 comments

📋 Pull Request Information

Original PR: https://github.com/advplyr/audiobookshelf/pull/2918
Author: @mfcar
Created: 5/4/2024
Status: 🔄 Open

Base: master ← Head: mf/vttSupport


📝 Commits (9)

  • b37a863 Initial transcription support
  • 8e5fc4a Avoid duplicated code
  • ee8e7cf Add seek support to transcriptions
  • 282203a Automatically scrolls to active cue when enable/disable the transcription panel
  • 68d4dac Avoid error: "Cannot read properties of null (reading 'track')"
  • 35f51f4 Fix formatting
  • bfcf4e3 Remove semicolon
  • 1a9aaf1 Fix small bug on the AudioTrack
  • 2f515cc Add support to recognize srt and vtt as subtitles formats on the file table

📊 Changes

13 files changed (+235 additions, -39 deletions)


📝 client/components/app/MediaPlayerContainer.vue (+24 -1)
📝 client/components/player/PlayerUi.vue (+10 -2)
➕ client/components/player/TranscriptionLine.vue (+43 -0)
➕ client/components/player/TranscriptionUi.vue (+55 -0)
📝 client/players/AudioTrack.js (+12 -1)
📝 client/players/LocalAudioPlayer.js (+25 -2)
📝 client/plugins/constants.js (+3 -2)
📝 client/strings/en-us.json (+2 -1)
📝 server/controllers/LibraryItemController.js (+49 -24)
📝 server/objects/files/AudioTrack.js (+4 -1)
📝 server/objects/files/LibraryFile.js (+2 -1)
📝 server/routers/ApiRouter.js (+4 -3)
📝 server/utils/globals.js (+2 -1)

📄 Description

I have begun work on adding transcription support to the Web Player.
I've used Whisper (https://github.com/openai/whisper) to generate transcriptions for some audiobooks and podcasts. Many Whisper-based tools can export to the VTT and SRT formats.
For this pull request, I'm only supporting VTT (https://developer.mozilla.org/en-US/docs/Web/API/WebVTT_API) because browsers support it natively. Support for SRT can be added in a future pull request.

How does it work?

A new endpoint, api/items/:id/file/:fileid/transcript, has been created on the backend. This endpoint attempts to return a transcription for each audio track. For instance, if there's an audio file named adventuresherlockholmes_01_doyle_64kb.mp3, this endpoint will attempt to return the file adventuresherlockholmes_01_doyle_64kb.vtt.
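The file-name mapping described above could be sketched as follows. This is only an illustration, not the code from the PR: the helper name transcriptPathFor is hypothetical, and it assumes the transcript sits next to the audio file with the same base name and a .vtt extension.

```javascript
const path = require('node:path')

// Hypothetical helper (not from the PR): swap the audio file's extension for .vtt.
// path.posix is used so the result does not depend on the host OS separator.
function transcriptPathFor(audioPath) {
  const { dir, name } = path.posix.parse(audioPath)
  // `base` is deliberately omitted so that `name` + `ext` are used by format()
  return path.posix.format({ dir, name, ext: '.vtt' })
}
```

A real handler would presumably also have to check that the resulting file exists and respond with an error when it does not; that part is left out here.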

On the frontend, when an audio file is set as the src of the <audio> element, a <track> element is created and attached to that <audio>. The src attribute of the <track> is set to the URL of the endpoint described above.
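A rough sketch of that frontend step, under stated assumptions: buildTranscriptUrl and attachTranscriptTrack are hypothetical names, the leading slash and the URL-encoding of the ids are guesses, and the doc parameter exists only so the function can be exercised without a browser.

```javascript
// Hypothetical helper: builds the transcript URL for one audio file.
// The leading slash and the encoding of the ids are assumptions, not taken from the PR.
function buildTranscriptUrl(libraryItemId, fileId) {
  return `/api/items/${encodeURIComponent(libraryItemId)}/file/${encodeURIComponent(fileId)}/transcript`
}

// Hypothetical helper: creates a <track> pointing at the transcript and attaches it to the <audio>.
// `doc` defaults to the global document in a browser; a stand-in can be passed for testing.
function attachTranscriptTrack(audioEl, transcriptUrl, doc = document) {
  const trackEl = doc.createElement('track')
  trackEl.kind = 'subtitles'
  trackEl.src = transcriptUrl
  trackEl.default = true
  audioEl.appendChild(trackEl)
  return trackEl
}
```

Since an <audio> element has no area in which to render cues, a custom panel would typically read them from the TextTrack object exposed as trackEl.track rather than rely on native rendering.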

What does this PR support?

  • Show/Hide transcription block
  • Highlighting the current transcription line
  • Clicking on a line to seek the player to that time
  • Changing transcriptions when the audio file changes (supports audiobooks and podcasts)
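The highlighting and click-to-seek behaviour listed above can be illustrated with a small sketch. It is not the PR's implementation: the function names are hypothetical, and cues are assumed to expose startTime and endTime in seconds, as VTTCue objects do.

```javascript
// Returns the index of the cue that contains currentTime, or -1 if no cue is active.
// Works on any array-like list of cues with numeric startTime/endTime (in seconds).
function findActiveCueIndex(cues, currentTime) {
  for (let i = 0; i < cues.length; i++) {
    if (currentTime >= cues[i].startTime && currentTime < cues[i].endTime) return i
  }
  return -1
}

// Click-to-seek: move the player to the start of the clicked cue.
function seekToCue(audioEl, cue) {
  audioEl.currentTime = cue.startTime
}
```

Computing the active index directly from the player's current time, for example right after a new track's cues have loaded, is one possible way to highlight the correct line without waiting for the next cue change.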

Demo

https://github.com/advplyr/audiobookshelf/assets/814828/3bd43148-6adc-48b7-8417-bc068be14c7b

What is missing for the scope of this PR

  • Hiding the "Show transcription" button when the transcription is not available for the audio file
  • Fixes for the known issues listed below

Known issues

  • When playing an audio file with transcription, if you close the web player and reopen it, the transcription block is not displayed, even though the transcription is still available. You have to click the "Show transcription" button to display the block again. I think this is related to the MediaPlayerContainer.vue component not reloading the TranscriptionUi component.

https://github.com/advplyr/audiobookshelf/assets/814828/c1aaef74-2f70-45a3-b0ce-b04053bf3bc5

  • When playing an audio file with transcription, if you change the audio file, the first line of the new file's transcription is highlighted as the active line, regardless of the playback position. The highlight moves to the correct line only when the next line change occurs.

Related

  • #1723 - This PR can help implement Whisper support in the Web Player

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

adam added the pull-request label 2026-04-25 00:17:11 +02:00
Reference: starred/audiobookshelf#3822