Hi.
I am looking for a developer who can install and set up an AI module for speech-to-text transcription.
I have one in mind: NeMo by NVIDIA.
I have a dedicated CentOS server,
so almost everything is ready on my side.
I need it for profanity filtering, to detect bad words in audio files.
Audio files are in M4A format.
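As a rough illustration of the pre-processing this implies: speech models are commonly fed 16 kHz mono WAV, so the M4A files would probably need converting first. The sketch below assumes ffmpeg is installed on the server. The function names and the 16 kHz default are assumptions for illustration only, not something NeMo prescribes for every model.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(src, dst, sample_rate=16000):
    """Build the ffmpeg argument list (hypothetical helper).

    -y   overwrite the output file if it exists
    -ac  number of audio channels (1 = mono)
    -ar  output sample rate in Hz (16000 is an assumed default)
    """
    return ["ffmpeg", "-y", "-i", str(src),
            "-ac", "1", "-ar", str(sample_rate), str(dst)]


def convert_m4a_to_wav(src, dst=None):
    """Convert one M4A file to mono WAV and return the output path."""
    src = Path(src)
    dst = Path(dst) if dst else src.with_suffix(".wav")
    # check=True raises CalledProcessError if ffmpeg exits with an error
    subprocess.run(build_ffmpeg_command(src, dst), check=True)
    return dst
```

Calling `convert_m4a_to_wav("call.m4a")` would write `call.wav` next to the input file.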
I need a server-side solution.
The solution should not consume much CPU/GPU or RAM, as I have millions of minutes of audio to be transcribed every day.
1: There should be an option to add bad words on a frontend page.
2: The next page should let the user browse for and upload an audio file from the local PC.
3: Once the audio file is processed, the following page should show the result, with the minutes and the bad words detected.
4: All audio should be converted to text (transcribed), with bad words highlighted with a yellow background.
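To illustrate the highlighting in point 4, here is a minimal sketch. It assumes the transcript has already been produced as plain text and that the result page is HTML. The function name `highlight_bad_words` is hypothetical, and the matching is simple case-insensitive whole-word matching that does not handle inflections or Arabic/Urdu diacritics.

```python
import html
import re


def highlight_bad_words(transcript, bad_words):
    """Return (html_text, found) for a plain-text transcript.

    html_text wraps each bad word in a yellow-background span;
    found lists the matched words, lowercased, in order of appearance.
    """
    words = [w.strip() for w in bad_words if w.strip()]
    if not words:
        return html.escape(transcript), []
    # Try longer entries first so multi-word phrases win over single words.
    words.sort(key=len, reverse=True)
    alternatives = "|".join(re.escape(w) for w in words)
    # (?<!\w) and (?!\w) restrict matches to whole words; \w is Unicode-aware.
    pattern = re.compile(r"(?<!\w)(" + alternatives + r")(?!\w)", re.IGNORECASE)

    parts, found, last = [], [], 0
    for match in pattern.finditer(transcript):
        # Escape the text between matches so the transcript cannot inject HTML.
        parts.append(html.escape(transcript[last:match.start()]))
        parts.append('<span style="background-color: yellow">'
                     + html.escape(match.group(0)) + "</span>")
        found.append(match.group(0).lower())
        last = match.end()
    parts.append(html.escape(transcript[last:]))
    return "".join(parts), found
```

The bad-word list from point 1 would be passed in as `bad_words`, and the returned `found` list could feed the summary shown in point 3.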
This should be a fully trained model for English, Arabic, and Urdu.
I don't want paid transcription APIs such as Google or AWS.
This is a fixed-price project, and you should have worked on at least one NeMo project.