ATT Speech API

jwehr · June 15, 2014, 11:32pm

I am attempting to use the AT&T Speech API to convert text to speech for playback on a Lala board. I’ve written code to talk to the Speech API, and I am getting a WAV file back, but playing it fails with this error: “Unable to locate headers in new message buffer.” The Speech API returns a WAV file encoded as 16-bit PCM, single channel, 16 kHz sampling.
Has anyone used this API?

back_ache · June 17, 2014, 4:35am

I can see the exception is thrown by the imp’s code when “fmt ” or “data” is missing from the contents of the buffer.

I have posted a test WAV (on OneDrive) so that someone could perhaps have a look at its contents?

My initial question is about the space following “fmt”: does it need to be there?

https://github.com/electricimp/examples/blob/master/lala/lala.agent.nut
https://ccrma.stanford.edu/courses/422/projects/WaveFormat/
https://onedrive.live.com/redir?resid=A5E7D2270BBA9EC!309
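
On the question of the space: in a RIFF/WAVE file every chunk ID is exactly four bytes, so the format chunk's ID is "fmt " with a trailing space. Below is a minimal sketch, in Python for illustration rather than the agent's Squirrel, of walking the chunk list to see which chunks a buffer actually contains. The names find_chunks and make_test_wav are hypothetical helpers, not part of the Lala example code, and the test buffer is built locally with the format the Speech API is said to return (16-bit PCM, mono, 16 kHz).

```python
import struct

def find_chunks(buf: bytes) -> dict:
    """Walk a RIFF/WAVE buffer and return {chunk_id: (data_offset, size)}."""
    if len(buf) < 12 or buf[0:4] != b"RIFF" or buf[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE buffer")
    chunks = {}
    pos = 12  # first chunk header follows "RIFF", the RIFF size, and "WAVE"
    while pos + 8 <= len(buf):
        # Each chunk header is a 4-byte ID plus a little-endian 32-bit size.
        chunk_id, size = struct.unpack_from("<4sI", buf, pos)
        chunks[chunk_id] = (pos + 8, size)
        pos += 8 + size + (size & 1)  # chunk data is padded to an even length
    return chunks

def make_test_wav(samples) -> bytes:
    """Build a tiny 16-bit PCM, mono, 16 kHz WAV buffer (hypothetical test helper)."""
    # Fields: PCM format tag, channels, sample rate, byte rate, block align, bits.
    fmt = struct.pack("<HHIIHH", 1, 1, 16000, 32000, 2, 16)
    data = struct.pack("<%dh" % len(samples), *samples)
    body = (b"WAVE"
            + b"fmt " + struct.pack("<I", len(fmt)) + fmt
            + b"data" + struct.pack("<I", len(data)) + data)
    return b"RIFF" + struct.pack("<I", len(body)) + body

if __name__ == "__main__":
    for chunk_id, (offset, size) in find_chunks(make_test_wav([0, 1, -1])).items():
        print(chunk_id, offset, size)
```

Running the same walk over the downloaded file would show whether "fmt " and "data" are present and whether any other chunks sit between them.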

MakeDeck · June 17, 2014, 12:38pm

There is a typo at line 160… should be sample_width, I believe. I think there is another bug when not using A-law compression as well. Tom is looking into it.

I do have working code to pull audio files from the Speech API though. I’ll share when it’s all working.

back_ache · June 18, 2014, 4:04am

Thanks for letting us know Tom’s investigating. I was planning on spending more time on it today!

MakeDeck · June 18, 2014, 7:54am

As it turns out, the Imp currently does not support playback of signed 16-bit PCM files. Tom filed a feature request to get this added, since the Imp will encode signed 16-bit files. Until that is completed, the AT&T API won’t work. There is another API called ReadSpeaker that will output A-law compressed files, and those will work. The AT&T API is pretty sweet, so hopefully that feature is added soon.

hugo · June 18, 2014, 2:22pm

The other way to do it is to use the agent to change the PCM files from signed to unsigned…
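
A minimal sketch of that conversion, written in Python for illustration (an agent would do the equivalent in Squirrel). It assumes the target is unsigned 16-bit little-endian samples, which this thread does not confirm, and that the input is the raw sample data from the WAV "data" chunk with the headers already stripped. The function name signed16_to_unsigned16 is hypothetical.

```python
import struct

def signed16_to_unsigned16(pcm: bytes) -> bytes:
    """Convert little-endian signed 16-bit PCM samples to unsigned 16-bit.

    Adding 32768 shifts the range -32768..32767 to 0..65535, which is the
    same as flipping the top bit of every sample.
    """
    if len(pcm) % 2:
        raise ValueError("16-bit PCM data must have an even number of bytes")
    count = len(pcm) // 2
    samples = struct.unpack("<%dh" % count, pcm)
    return struct.pack("<%dH" % count, *(s + 32768 for s in samples))

if __name__ == "__main__":
    raw = struct.pack("<3h", -32768, 0, 32767)
    print(struct.unpack("<3H", signed16_to_unsigned16(raw)))
```

Silence (a signed value of 0) maps to the midpoint 32768, so the waveform keeps its shape and only the offset changes.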

back_ache · June 19, 2014, 2:01am

How about using a web service to convert the sound file for you? Google suggested https://cloudconvert.org/

jwehr · June 21, 2014, 2:53pm

I tried ReadSpeaker, which works fine, but you get a VERY limited demo account. I had a Lala board set up to say random things to me, but no more credits! I guess I’ll look into conversion.

back_ache · June 23, 2014, 8:31am

Or how about hosting an instance of eSpeak or another open-source TTS engine?

http://www.babelfish.org/tts-free.htm

MakeDeck · June 23, 2014, 10:26am

I did some work with Cloud Convert, which might actually have some interesting uses for the Imp. You can convert WAV to WAV, but I don’t see any options for changing the data from signed to unsigned.

It might be worth hosting one of those… we are still working on the Pix API though.