AT&T Speech API

I am attempting to use the AT&T Speech API to convert text to speech for playback on a Lala board. I’ve written code to talk to the Speech API, and I am getting a WAV file back, but playback fails with this error: “Unable to locate headers in new message buffer.” The Speech API returns a WAV file encoded as 16-bit PCM, single channel, 16 kHz sampling.
Has anyone used this API?

I can see the exception is thrown from the imp’s code when “fmt ” or “data” is missing from the contents of the buffer.

I have posted a test WAV (on OneDrive) so that someone could perhaps have a look at its contents.

My initial question is about the space following “fmt”: does it need to be there?

https://github.com/electricimp/examples/blob/master/lala/lala.agent.nut
https://ccrma.stanford.edu/courses/422/projects/WaveFormat/
https://onedrive.live.com/redir?resid=A5E7D2270BBA9EC!309
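On the space after “fmt”: as far as I can tell from the WaveFormat page linked above, chunk IDs in a RIFF file are always exactly four bytes, so the ID really is "fmt " with a trailing space. Here is a rough Python sketch (not the imp’s Squirrel code) that walks the chunks of a WAV buffer so you can check what a file actually contains. The helper names `make_test_wav` and `find_chunks` are made up for illustration, and the test file is generated locally rather than fetched from the Speech API.

```python
import io
import struct
import wave

def make_test_wav(samples, rate=16000):
    """Build a 16-bit mono PCM WAV in memory (a stand-in for the API response)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)          # 2 bytes per sample = 16-bit
        w.setframerate(rate)
        w.writeframes(struct.pack("<%dh" % len(samples), *samples))
    return buf.getvalue()

def find_chunks(data):
    """Return {chunk_id: (payload_offset, payload_size)} for a RIFF/WAVE buffer."""
    if data[0:4] != b"RIFF" or data[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE buffer")
    chunks = {}
    pos = 12                        # first chunk starts after the 12-byte RIFF header
    while pos + 8 <= len(data):
        chunk_id = data[pos:pos + 4]            # always 4 bytes, e.g. b"fmt "
        (size,) = struct.unpack_from("<I", data, pos + 4)
        chunks[chunk_id] = (pos + 8, size)
        pos += 8 + size + (size & 1)            # chunks are padded to an even length
    return chunks

wav_bytes = make_test_wav([0, 1000, -1000, 32767, -32768])
chunks = find_chunks(wav_bytes)

# Decode the "fmt " payload: format tag, channels, rate, byte rate, block align, bits
fmt_offset, fmt_size = chunks[b"fmt "]
audio_format, channels, sample_rate, byte_rate, block_align, bits = struct.unpack_from(
    "<HHIIHH", wav_bytes, fmt_offset
)
print(sorted(chunks), audio_format, channels, sample_rate, bits)
```

If a file has extra chunks (for example "LIST") before "data", a parser that expects "data" at a fixed offset would miss it, which may be worth checking on the test WAV.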

There is a typo at line 160… should be sample_width, I believe. I think there is another bug when not using A-law compression as well. Tom is looking into it.

I do have working code to pull audio files from the Speech API though. I’ll share when it’s all working.

Thanks for letting us know Tom’s investigating. I was planning on spending more time on it today!

As it turns out, the imp currently does not support playback of signed 16-bit PCM files. Tom filed a feature request to get this added, since the imp itself will encode signed 16-bit files. Until that is completed, the AT&T API won’t work. There is another API called ReadSpeaker that outputs A-law compressed files, and those will work. The AT&T API is pretty sweet, so hopefully that feature is added soon. :slight_smile:

The other way to do it is to use the agent to change the PCM files from signed to unsigned…
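For what it’s worth, the signed-to-unsigned step itself is small. Below is a sketch in Python just to show the arithmetic; on the agent it would have to be written in Squirrel. It assumes “unsigned” means plain offset binary (add 32768 to every little-endian 16-bit sample), which I have not confirmed is exactly what the imp expects. The function name `signed16_to_unsigned16` is made up, and it takes only the payload of the "data" chunk, not the WAV headers.

```python
import struct

def signed16_to_unsigned16(pcm):
    """Convert little-endian signed 16-bit samples to unsigned by adding 32768."""
    if len(pcm) % 2:
        raise ValueError("expected a whole number of 16-bit samples")
    count = len(pcm) // 2
    signed = struct.unpack("<%dh" % count, pcm)
    # -32768 maps to 0, 0 maps to 32768, 32767 maps to 65535
    return struct.pack("<%dH" % count, *(s + 32768 for s in signed))

example = struct.pack("<3h", -32768, 0, 32767)
converted = signed16_to_unsigned16(example)
print(struct.unpack("<3H", converted))
```

Adding 32768 is the same as flipping the top bit of each sample, so the high byte of every sample could also just be XORed with 0x80 in place.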

How about using a web service to convert the sound file for you? Google suggested https://cloudconvert.org/

I tried ReadSpeaker, which works fine, but you get a VERY limited demo account. I had a Lala board set up to say random things to me, but no more credits! I guess I’ll look into conversion.

Or how about hosting an instance of espeak or another open-source TTS?

http://www.babelfish.org/tts-free.htm

I did some work with Cloud Convert, which might actually have some interesting uses for the imp. You can convert WAV to WAV, but I don’t see any options for changing the data from signed to unsigned.

It might be worth hosting one of those… we are still working on the Pix API though. :slight_smile: