Audio to text is slow and words are getting dropped

I have code that takes videos from an input folder and converts each one to an audio file (.wav) using ffmpeg.
It then reads the audio in 30-second chunks (dura=30) and converts each chunk to text using Google's speech recognition API (recognize_google from the speech_recognition package).

The problem is that the code takes a long time to convert a video to text, and it drops the first two words and some more words after every 30 seconds.

import speech_recognition as sr
import sys
import shutil
from googletrans import Translator
from pathlib import Path
import os
import wave
def audio_to_text(self,video_lst,deploy_path,video_path,audio_path):
    try:
        txt_lst=[]
        for video_file in video_lst:
            file_part=video_file.split('.')
            audio_path_mod = audio_path +'/'+ '.'.join(file_part[:-1])
            dir_path=video_path+'.'.join(file_part[:-1])
            self.createDirectory(audio_path_mod)
            audio_file='.'.join(file_part[:-1])+'.wav'
            command_ffmpeg='set PATH=%PATH%;'+deploy_path.replace('config','script')+'audio_video/ffmpeg/bin/'
            command='ffmpeg -i '+video_path+'/'+video_file+' '+audio_path_mod+'/'+audio_file
            os.system(command_ffmpeg)
            os.system(command)
            r=sr.Recognizer()
            dura=30
            lang='en'
            wav_filename=audio_path_mod+'/'+audio_file

            f = wave.open(wav_filename, 'r')
            frames = f.getnframes()
            rate = f.getframerate()
            audio_duration = frames / float(rate)
            final_text_lst=[]
            counter=0

            with sr.AudioFile(wav_filename) as source:
                while counter<audio_duration:
                    audio=r.record(source,duration=dura)
                    counter+=dura
                    try:
                        str=r.recognize_google(audio)
                        final_text_lst.append(str)
                    except Exception as e:
                        print(e)
            print('Text data generated..')

            text_path=audio_path_mod+'/'+audio_file.replace('.wav','_audio_text.csv')
            with open(text_path, 'w') as f:
                f.write(' '.join(final_text_lst))

    except Exception as e:
        print(e)

Any help/suggestion would be valuable. Thanks in advance.
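One possible explanation for the words lost around each 30-second mark is that a hard cut can land in the middle of a word, so neither chunk contains it whole. This has not been verified against the asker's audio. A minimal sketch of one common workaround, overlapping consecutive chunks by a couple of seconds, is shown below. The helper name `chunk_windows` and the 2-second overlap are assumptions for illustration, not part of the original code.

```python
def chunk_windows(total_seconds, chunk_seconds=30.0, overlap_seconds=2.0):
    """Return (offset, duration) pairs covering the audio with a small overlap.

    Hypothetical helper: each window starts `overlap_seconds` before the
    previous one ends, so a word cut at one boundary is whole in the next window.
    """
    if chunk_seconds <= overlap_seconds:
        raise ValueError("chunk_seconds must be larger than overlap_seconds")
    step = chunk_seconds - overlap_seconds
    windows = []
    offset = 0.0
    while offset < total_seconds:
        # The last window is shortened so it does not run past the end.
        duration = min(chunk_seconds, total_seconds - offset)
        windows.append((offset, duration))
        if offset + duration >= total_seconds:
            break
        offset += step
    return windows


if __name__ == "__main__":
    for offset, duration in chunk_windows(70, 30, 2):
        print(offset, duration)
```

With the speech_recognition package, each window could then be read by reopening `sr.AudioFile(wav_filename)` and calling `r.record(source, offset=offset, duration=duration)`. As far as I know, `offset` is counted from the current position of the opened source, which is why the file would be reopened per window. The overlapping regions will produce some repeated words, so the transcripts would need to be merged rather than simply joined.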

asked Mar 8 at 13:30

Madhur Yadav


I'm mostly converting educational speeches

– Madhur Yadav
Mar 9 at 3:45
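Lecture-length recordings produce many 30-second chunks, and the loop in the question sends them to the recognizer one at a time. If most of the running time is spent waiting on those network calls (an assumption, not something measured here), a sketch like the following could overlap the waits while keeping the text in order. `transcribe_all` and the `recognize` callable are hypothetical names; `recognize` stands in for something like a wrapper around `r.recognize_google`, and whether the service tolerates concurrent requests is not confirmed.

```python
from concurrent.futures import ThreadPoolExecutor


def transcribe_all(chunks, recognize, max_workers=4):
    """Apply `recognize` to every chunk concurrently, preserving chunk order.

    A failed chunk yields None instead of being silently skipped, so gaps
    in the transcript stay visible to the caller.
    """
    def safe(chunk):
        try:
            return recognize(chunk)
        except Exception:
            # Keep going with the remaining chunks; the caller sees None here.
            return None

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map returns results in the order of the input iterable.
        return list(pool.map(safe, chunks))


if __name__ == "__main__":
    # Stand-in recognizer for demonstration only.
    print(transcribe_all(["first", "second"], lambda chunk: chunk.upper()))
```

Returning `None` for a failed chunk also makes it easier to tell whether missing text comes from a recognition error or from the chunk boundaries.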

Hey Madhur, this is an interesting application. Would you be open to sharing details on the video-to-audio conversion? You may want to use a simple gstreamer pipeline for that, and you can add subtitles in the pipeline itself, or you can feed the audio file it generates into the gRPC speech recognition sample available online. Refer to this for how I did it. It is similar to what you are trying. Let me know if you want to use this approach.

– RC0993
Mar 12 at 5:33

Is there any progress?

– RC0993
Mar 18 at 11:27
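On the video-to-audio step, note that each `os.system` call runs in its own shell, so the `set PATH=...` command in the question does not affect the later `ffmpeg` call. A minimal sketch of the conversion using `subprocess` with an argument list is shown below; the list form also avoids quoting problems when paths contain spaces. The function names and the choice of 16 kHz mono output are assumptions for illustration. The flags `-y`, `-i`, `-vn`, `-ac` and `-ar` are standard ffmpeg options.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(video_file, wav_file, sample_rate=16000):
    """Build the ffmpeg argument list for extracting a mono WAV track."""
    return [
        "ffmpeg",
        "-y",                     # overwrite the output file if it exists
        "-i", str(video_file),    # input video
        "-vn",                    # ignore the video stream
        "-ac", "1",               # downmix to one channel
        "-ar", str(sample_rate),  # resample (16 kHz is an assumed default)
        str(wav_file),
    ]


def extract_audio(video_file, wav_file, sample_rate=16000):
    """Run ffmpeg and raise CalledProcessError if the conversion fails."""
    command = build_ffmpeg_command(video_file, wav_file, sample_rate)
    subprocess.run(command, check=True)
    return Path(wav_file)


if __name__ == "__main__":
    print(build_ffmpeg_command("input videos/lecture.mp4", "audio/lecture.wav"))
```

If ffmpeg is not on the system path, the first list element can be replaced with the full path to the ffmpeg executable.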

python-3.x ffmpeg speech-recognition speech-to-text google-speech-api
