Text-To-Speech and Speech Recognition on Android
Text-To-Speech (TTS) on Android
The Android platform includes a Text-to-Speech (TTS) capability. Also known as "speech synthesis", TTS enables an Android device to "speak" text in various languages. All Android-powered devices that support TTS ship with a TTS engine (for example, Pico), but some devices have limited storage and may lack the language-specific resource files.
See the complete Android TTS developer reference for details.
Enabling TTS on Android
com.svox.langpack.installer.apk contains speech synthesis data required by the TTS-engine. The following languages are supported:
- English (US)
- English (UK)
- French
- German
- Italian
- Spanish
After successful installation, the Android TTS-engine can be configured in the following menu:
Settings > Voice input and output > Text to speech settings
For example, a sample TTS demo can be heard by using the following option:
Settings > Voice input and output > Text to speech settings > Listen to an Example
Once the speech synthesis data is installed, any application running on Android can use the Android TTS engine to read a piece of text out loud.
The attached sample application, Text_To_Speech_Reloaded_v1.0.apk, can read aloud text typed by the user or loaded from a file.
NOTE: Both APKs listed are available for free on the Android Market. They can be installed on the device via adb or from an SD card.
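An application usually confirms that the voice data is present before it tries to speak. The sketch below is plain Java, not Android code, so it can be compiled anywhere. TtsReadiness is a hypothetical helper, and the numeric constants are assumed to mirror the values documented for android.speech.tts.TextToSpeech; on a device, the SDK constants should be referenced directly and verified against the SDK in use.

```java
/**
 * Hypothetical helper that interprets TTS result codes.
 * The constant values are assumed to match android.speech.tts.TextToSpeech.
 */
final class TtsReadiness {
    // Assumed value of TextToSpeech.Engine.CHECK_VOICE_DATA_PASS
    static final int CHECK_VOICE_DATA_PASS = 1;
    // Assumed value of TextToSpeech.LANG_AVAILABLE (lowest "usable" code)
    static final int LANG_AVAILABLE = 0;
    // Assumed values of the two failure codes returned by setLanguage()
    static final int LANG_MISSING_DATA = -1;
    static final int LANG_NOT_SUPPORTED = -2;

    private TtsReadiness() {}

    /**
     * True when the result code delivered for an ACTION_CHECK_TTS_DATA
     * request reports that the speech synthesis data is installed.
     */
    static boolean voiceDataInstalled(int checkResultCode) {
        return checkResultCode == CHECK_VOICE_DATA_PASS;
    }

    /**
     * True when the value returned by TextToSpeech.setLanguage(Locale)
     * indicates the language can be spoken (any non-negative code).
     */
    static boolean languageUsable(int setLanguageResult) {
        return setLanguageResult >= LANG_AVAILABLE;
    }
}
```

If voiceDataInstalled returns false, a real application would typically start the ACTION_INSTALL_TTS_DATA activity, or ask the user to install the language pack described above.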
Speech Recognition on Android
Android is an open platform, so applications can potentially make use of any speech recognition service on the device that's registered to receive a RecognizerIntent. Google's Voice Search application, which is pre-installed on many Android devices, responds to a RecognizerIntent by displaying the "Speak now" dialog and streaming audio to Google's servers -- the same servers used when a user taps the microphone button on the search widget or the voice-enabled keyboard.
For speech input to be as accurate as possible, it's helpful to have an idea of what words are likely to be spoken. While a message like "Mom, I'm writing you this message with my voice!" might be appropriate for an email or SMS message, you're probably more likely to say something like "weather in Mountain View" if you're using Google Search. You can make sure your users have the best experience possible by requesting the appropriate language model: free_form for dictation, or web_search for shorter, search-like phrases. Google developed the "free form" model to improve dictation accuracy for the voice keyboard, while the "web search" model is used when users want to search by voice.
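The choice of language model can be sketched as follows. This is plain Java rather than Android code: RecognizerRequest is a hypothetical helper that collects the extras in a map, and the string constants are assumed to equal the values of the same-named constants in android.speech.RecognizerIntent. On a device, each map entry would be passed to Intent.putExtra on an intent created with ACTION_RECOGNIZE_SPEECH.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper that assembles the extras for a speech recognition request. */
final class RecognizerRequest {
    // Assumed string values of constants in android.speech.RecognizerIntent
    static final String ACTION_RECOGNIZE_SPEECH = "android.speech.action.RECOGNIZE_SPEECH";
    static final String EXTRA_LANGUAGE_MODEL = "android.speech.extra.LANGUAGE_MODEL";
    static final String EXTRA_PROMPT = "android.speech.extra.PROMPT";
    static final String LANGUAGE_MODEL_FREE_FORM = "free_form";
    static final String LANGUAGE_MODEL_WEB_SEARCH = "web_search";

    private RecognizerRequest() {}

    /**
     * Builds the extras for a recognition request.
     *
     * @param dictation true for long-form dictation (free_form),
     *                  false for short, search-like phrases (web_search)
     * @param prompt    optional text for the "Speak now" dialog, or null
     */
    static Map<String, String> extrasFor(boolean dictation, String prompt) {
        Map<String, String> extras = new LinkedHashMap<>();
        extras.put(EXTRA_LANGUAGE_MODEL,
                dictation ? LANGUAGE_MODEL_FREE_FORM : LANGUAGE_MODEL_WEB_SEARCH);
        if (prompt != null) {
            extras.put(EXTRA_PROMPT, prompt);
        }
        return extras;
    }
}
```

An email or SMS composer would call extrasFor(true, ...), while a search box would call extrasFor(false, ...).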
Google's servers support many languages for voice input, with more arriving regularly. You can use the ACTION_GET_LANGUAGE_DETAILS broadcast intent to query for the list of supported languages. The web search model is available for all languages, while the free-form model may not be optimized for all languages.
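Once the list of supported languages has been retrieved, an application still has to choose one. The sketch below is plain Java and assumes the list arrives as language tags such as "en-US", which is how the EXTRA_SUPPORTED_LANGUAGES result is understood to be delivered. SupportedLanguagePicker and its matching rule (exact tag first, then same language) are hypothetical and are not part of the Android API.

```java
import java.util.List;
import java.util.Locale;

/** Hypothetical helper that picks a recognizer language for a preferred locale. */
final class SupportedLanguagePicker {
    private SupportedLanguagePicker() {}

    /**
     * Returns the supported tag that best matches the preferred locale:
     * an exact tag match if there is one, otherwise the first tag with the
     * same language code, otherwise null.
     */
    static String pick(List<String> supportedTags, Locale preferred) {
        String wanted = preferred.toLanguageTag(); // e.g. "fr-FR"
        for (String tag : supportedTags) {
            if (tag.equalsIgnoreCase(wanted)) {
                return tag;
            }
        }
        for (String tag : supportedTags) {
            // Compare only the language part, e.g. "fr" for "fr-FR"
            if (Locale.forLanguageTag(tag).getLanguage().equals(preferred.getLanguage())) {
                return tag;
            }
        }
        return null;
    }
}
```

The chosen tag could then be supplied as the language extra of the recognition request. A null result means the application should fall back to the recognizer's default language.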
See the complete Android Speech Input developer reference for details.
Enabling Speech Recognition on Android
TODO
Third-party alternatives
A couple of third-party alternatives exist for Android:
- iSpeech Text to Speech (TTS) and Speech Recognition (ASR) SDK for Android: speech-enables an Android app through the iSpeech Cloud service. The SDK has a small footprint and supports 27 languages for TTS and ASR and 15 for free-form dictation. (Last updated: 10/13/2011)
- cmuSphinx: an open source toolkit for speech recognition from Carnegie Mellon University. It supports offline speech recognition on devices without any network access. A sample Android application using cmuSphinx is available.