not found output as speech to text #698

Open
akshat9425 opened this issue Sep 10, 2018 · 15 comments

Comments

akshat9425 commented Sep 10, 2018

I used the transcribe function you provided for speech-to-text, and replaced the _decoder you provided with other objects provided by pocketsphinx. Here is my code:

    def transcribe(fp):
        config = pocketsphinx.Decoder.default_config()
        config.set_string('-hmm', HMDIR)
        config.set_string('-lm', LMDIR)
        config.set_string('-dict', DICTD)
        decoder = Decoder(config)

        speech_rec = pocketsphinx.Decoder(config)
        opened_file = open(fp)
        print("\n""types results", type(opened_file), "\n\n")
        #exit(0)
        opened_file.seek(44)

        # FIXME: Can't use the Decoder.decode_raw() here, because
        # pocketsphinx segfaults with tempfile.SpooledTemporaryFile()
        data = opened_file.read()
        decoder.start_utt()
        decoder.process_raw(data, False, True)
        decoder.end_utt()

        result = decoder.hyp()

        result = speech_rec.get_hyp()

        exit(0)

        print("our results", result)
        transcribed = [result]
        logging.info('PocketSphinx ?????%r', transcribed)
        return transcribed

This is the output I got:

You just said: <pocketsphinx.pocketsphinx.Hypothesis; proxy of <Swig Object of type 'Hypothesis *' at 0x7f074c6033f0> >

but I was expecting the transcribed text.

Can someone please suggest what is wrong here?

If I use

decoder.hyp().hypstr

instead, nothing is printed.

G10DRAS commented Sep 10, 2018

Where did you get this code?
What version of pocketsphinx are you using?

akshat9425 commented Sep 11, 2018

Thanks for the reply. Let me explain from scratch.
I used https://github.com/VikParuchuri/scribe, but that repo has a file named recognizer.py containing a function recognize(), which calls decode_raw(), and decode_raw() is no longer supported.

As a replacement, I used the algorithm from Example 46 at this link:
https://www.programcreek.com/python/example/10479/tempfile.SpooledTemporaryFile

Here is my code for that function:

    def transcribe(fp):
        config = pocketsphinx.Decoder.default_config()
        config.set_string('-hmm', HMDIR)
        config.set_string('-lm', LMDIR)
        config.set_string('-dict', DICTD)
        decoder = Decoder(config)

        speech_rec = pocketsphinx.Decoder(config)
        opened_file = open(fp)
        print("\n""types results", type(opened_file), "\n\n")

        opened_file.seek(44)
        data = opened_file.read()
        print("value of data", data)
        decoder.start_utt()
        decoder.process_raw(data, False, True)
        print("value of process_raw", decoder.process_raw(data, False, True))
        decoder.end_utt()

        result = decoder.hyp().hypstr

        print("our results", result)
        transcribed = [result]
        logging.info('PocketSphinx ?????%r', transcribed)
        return transcribed

pocketsphinx version: 0.1.15
sphinxbase version: 0.8

If I say nothing, process_raw() returns 0. If I say "hello", process_raw() returns a specific value such as 256, but decoder.hyp().hypstr is still an empty string.
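
A note for readers following along: the function above opens the file in text mode and skips a fixed 44 bytes, which only matches the simplest WAV header layout. Below is a minimal sketch, using only Python's standard wave module, of reading the raw PCM payload instead. The name read_pcm is a hypothetical helper, not part of pocketsphinx; the bytes it returns are what one would then pass to process_raw().

```python
import wave


def read_pcm(path):
    """Return the raw PCM frames of a WAV file as bytes (hypothetical helper)."""
    # wave.open parses the RIFF header itself, so no hard-coded
    # 44-byte skip is needed, and the file is read in binary mode.
    with wave.open(path, "rb") as wav:
        return wav.readframes(wav.getnframes())
```

This does not resample or convert anything; it only avoids feeding header bytes or text-decoded data to the decoder.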

G10DRAS commented Sep 12, 2018

Download the latest (5prealpha) Pocketsphinx and Sphinxbase code from GitHub
and try the following example:
https://github.com/cmusphinx/pocketsphinx/blob/master/swig/python/test/decoder_test.py

Or see similar code in jasper-dev:
https://github.com/jasperproject/jasper-client/blob/jasper-dev/plugins/stt/pocketsphinx-stt/sphinxplugin.py

akshat9425 commented

I used https://github.com/cmusphinx/pocketsphinx/blob/master/swig/python/test/decoder_test.py
and passed my .wav audio file in place of the goforward.raw file and the goforward.mfc file. I also ran the code as is once.

In all three cases the result was the same: hyp().hypstr is an empty string, while the model score and confidence have values.

What should I do to get the text of a WAV file from hyp().hypstr?

akshat9425 commented

I also need to know: when creating the decoder, besides the .dict and .bin files you also pass an HMM model. Which file is that? I downloaded https://sourceforge.net/projects/cmusphinx/files/Acoustic%20and%20Language%20Models/US%20English/cmusphinx-en-us-8khz-5.2.tar.gz/download and found 6-7 files inside it.
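
On the question above: the -hmm option takes the acoustic model directory itself, not one of the files inside it. Below is a rough sketch of a sanity check to run before building the decoder. The name missing_model_files is a hypothetical helper, and the file list is an assumption based on the usual layout of CMU Sphinx acoustic models; it may differ between model types, so adjust it to the model in use.

```python
import os

# Assumed core files of a CMU Sphinx acoustic model directory.
# This list is illustrative and may not match every model.
ASSUMED_MODEL_FILES = (
    "mdef",
    "feat.params",
    "means",
    "variances",
    "transition_matrices",
)


def missing_model_files(model_dir, required=ASSUMED_MODEL_FILES):
    """Return the names from `required` that are not files in model_dir."""
    return [
        name for name in required
        if not os.path.isfile(os.path.join(model_dir, name))
    ]
```

An empty list suggests the directory passed to -hmm is the right level of the extracted archive; a long list usually means the path points one directory too high or too low.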

G10DRAS commented Sep 12, 2018

Make sure the WAV file is in 16 kHz, 16-bit mono format.

Try this model:

https://sourceforge.net/projects/cmusphinx/files/Acoustic%20and%20Language%20Models/US%20English/cmusphinx-en-us-ptm-5.2.tar.gz/download
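
The format requirement above can be checked before decoding. Below is a minimal sketch using Python's standard wave module; wav_format_problems is a hypothetical helper name, and the defaults simply encode the 16 kHz, 16-bit, mono advice from this comment.

```python
import wave


def wav_format_problems(path, rate=16000, sample_width=2, channels=1):
    """Return a list of format mismatches; an empty list means the file matches."""
    problems = []
    with wave.open(path, "rb") as wav:
        if wav.getframerate() != rate:
            problems.append(
                "sample rate %d Hz, expected %d" % (wav.getframerate(), rate))
        # getsampwidth() is in bytes, so 16-bit audio reports 2.
        if wav.getsampwidth() != sample_width:
            problems.append(
                "sample width %d bytes, expected %d" % (wav.getsampwidth(), sample_width))
        if wav.getnchannels() != channels:
            problems.append(
                "%d channels, expected %d" % (wav.getnchannels(), channels))
    return problems
```

A file that fails this check would need to be converted with an audio tool first; the sketch only reports the mismatch.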

akshat9425 commented

Thanks a lot, it's working now. I had been stuck on this for many days.

akshat9425 commented

Yes, it's working well, but I need models for Indian English and Hindi. I think that one was for American English, and I have already developed a speech-to-text application for American English.
Could you please suggest or send links to download Indian English and Hindi models that work with pocketsphinx?

G10DRAS commented Sep 12, 2018

Did you search SourceForge for it?

akshat9425 commented

Yes, I found their pretrained Indian English and Hindi models, but they are not working.

akshat9425 commented

While using the Hindi model I got the error below:
AttributeError: 'NoneType' object has no attribute 'hypstr'

Here is my code:
    from os import environ, path

    from pocketsphinx.pocketsphinx import *
    from sphinxbase.sphinxbase import *

    MODELDIR = "/home/user/scribe/model/hindi/"
    DATADIR = "/home/user/scribe/pocketsphinx/test/data/"

    config = Decoder.default_config()
    config.set_string('-hmm', path.join(MODELDIR, 'hindi_hmm'))
    config.set_string('-lm', path.join(MODELDIR, 'hindi.lm'))
    config.set_string('-dict', path.join(MODELDIR, 'hindi.dic'))
    config.set_string('-logfn', '/dev/null')
    decoder = Decoder(config)

    #stream = open(path.join(DATADIR, 'goforward.raw'), 'rb')
    stream = open('/home/user/Downloads/Audio_Conversation-001.wav', 'rb')

    in_speech_bf = False
    decoder.start_utt()
    while True:
        buf = stream.read(1024)
        if buf:
            decoder.process_raw(buf, False, False)
            if decoder.get_in_speech() != in_speech_bf:
                in_speech_bf = decoder.get_in_speech()
                if not in_speech_bf:
                    decoder.end_utt()
                    print 'Result:', decoder.hyp().hypstr
                    decoder.start_utt()
        else:
            break
    decoder.end_utt()
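
The AttributeError in this comment means decoder.hyp() returned None, which happens when the decoder has no hypothesis for the utterance. Below is a minimal sketch of a guard. The name hypothesis_text is a hypothetical helper, and FakeHypothesis is only a stand-in that imitates the hypstr attribute so the snippet runs without pocketsphinx or a model installed.

```python
def hypothesis_text(hyp):
    """Return the recognized text, or an empty string when there is no hypothesis."""
    # decoder.hyp() can be None, so check before touching .hypstr.
    return hyp.hypstr if hyp is not None else ""


class FakeHypothesis:
    """Stand-in for a pocketsphinx Hypothesis, used only for illustration."""

    def __init__(self, hypstr):
        self.hypstr = hypstr


print(hypothesis_text(FakeHypothesis("go forward")))  # prints: go forward
print(repr(hypothesis_text(None)))                    # prints: ''
```

In the loop above, the print line would then take hypothesis_text(decoder.hyp()) instead of decoder.hyp().hypstr. The guard only avoids the crash; an empty result still points to a model, dictionary, or audio format mismatch.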

G10DRAS commented Sep 13, 2018

If they are not working, then train your own model.

akshat9425 commented

How?
I am a beginner. Please send me any suggestions for training my own model.

G10DRAS commented Sep 13, 2018

A good starting point:
https://cmusphinx.github.io/wiki/
