Speech event transcription #18522
Closed
bojangles9
started this conversation in
Ideas
Replies: 1 comment
-
Speech audio event transcription has already been implemented for a future version of Frigate (likely 0.17). It still needs refinements, but the base functionality is working.
-
I'd like to contribute to the project with a new feature.
This would be an addition to the "speech" event type.
What I'm envisioning is that, once speech detection completes, Frigate would send that video chunk to an endpoint that supports voice transcription via an LLM.
I've used this one quite extensively and it works quite well:
https://ahmetoner.com/whisper-asr-webservice/#quick-usage
Instead of sending the 10-second recorded chunks of video, it's best to send the full event for transcription, as this helps with context and allows Whisper to better detect (and transcribe, and translate) the language.
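A minimal sketch of what that request could look like, using only the Python standard library. The `/asr` path, the `audio_file` form field, and the `task`, `output`, and `encode` query parameters are assumptions based on my reading of the webservice docs and should be checked against the running container; the function names and the `event.mp4` filename are hypothetical.

```python
import urllib.parse
import urllib.request
import uuid


def build_asr_url(base_url: str, task: str = "transcribe", output: str = "txt") -> str:
    """Build the transcription URL (path and parameter names are assumptions)."""
    query = urllib.parse.urlencode({"task": task, "output": output, "encode": "true"})
    return f"{base_url.rstrip('/')}/asr?{query}"


def encode_multipart(field: str, filename: str, payload: bytes) -> tuple[bytes, str]:
    """Encode a single file as multipart/form-data; returns (body, content type)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def transcribe_clip(base_url: str, clip: bytes, task: str = "transcribe") -> str:
    """POST the full event clip and return the response body as text."""
    body, content_type = encode_multipart("audio_file", "event.mp4", clip)
    request = urllib.request.Request(
        build_asr_url(base_url, task=task),
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    # Long events can take a while to transcribe, so allow a generous timeout.
    with urllib.request.urlopen(request, timeout=300) as response:
        return response.read().decode("utf-8").strip()
```

Passing `task="translate"` instead of the default would ask the service for an English translation rather than a transcript in the spoken language, assuming the service accepts that value.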
Looking at the code and DB schema, what I'm thinking is:
add a new section to the config for audio_transcription, initially supporting the previously mentioned Docker container
in audio.py, I would add some code in "expire_detections". If the event type is "speech", I would let the speech event finish writing to the DB first
I would then use the API to fetch the video chunk that contains the speech and POST it to the configured endpoint. The response can be a subtitle file or just raw text. Initially it would just return plain text; in the future we could generate subtitle files that could be overlaid on videos
I would either add the raw text to a JSON blob and store it in the data column of the "events" table, or we could expand the schema to include another column for the speech transcription. Any preference there?
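To make the config and storage steps above concrete, here is a rough sketch of the first storage option. Everything in it is hypothetical: `AudioTranscriptionConfig`, its field names, the default URL, the `with_transcription` helper, and the `transcription` key are placeholders for illustration, not existing Frigate code or schema.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class AudioTranscriptionConfig:
    """Hypothetical audio_transcription config section (not an existing Frigate model)."""

    enabled: bool = False
    url: str = "http://whisper:9000"  # assumed address of the transcription container
    task: str = "transcribe"  # or "translate"


def with_transcription(data: Optional[Dict[str, Any]], text: str) -> Dict[str, Any]:
    """Return a copy of an event's data blob with the transcript added.

    The input dict is not modified, so the caller decides when to persist it.
    """
    merged = dict(data or {})
    merged["transcription"] = text.strip()
    return merged
```

Keeping the text inside the existing data blob avoids a schema migration, while a dedicated column would make transcripts easier to query and index.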
Any issues or thoughts about doing it this way?
Thanks