SpeechEngine

AsapRealizer's default SpeechEngine supports different TTS systems. Lip sync is configured through lipsync providers. Loaders and default configurations are provided for MaryTTS and for TTS systems compatible with the Microsoft Speech API (SAPI).

XML setup

AsapRealizer's default SpeechEngine is set up through a SpeechEngineLoader XML description.

Typical setup:

<Loader id="facelipsync" requiredloaders="faceengine" loader="asap.faceengine.loader.TimedFaceUnitLipSynchProviderLoader">
  <MorphVisemeBinding resources="Humanoids/armandia/facebinding/" filename="ikpvisemebinding.xml"/>
</Loader>

<Loader id="jawlipsync" requiredloaders="animationengine" loader="asap.animationengine.loader.TimedAnimationUnitLipSynchProviderLoader">
  <SpeechBinding basedir="" resources="Humanoids/shared/speechbinding/" filename="ikpspeechbinding.xml"/>
</Loader>

<Loader id="ttsbinding" loader="asap.maryttsbinding.loader.MaryTTSBindingLoader">
  <PhonemeToVisemeMapping resources="Humanoids/shared/phoneme2viseme/" filename="sampade2ikp.xml"/>
</Loader>
 
<Loader id="speechengine" loader="asap.speechengine.loader.SpeechEngineLoader" requiredloaders="facelipsync,jawlipsync,ttsbinding">
  <Voice factory="WAV_TTS"/>
</Loader>

The voice selection is specified in the Voice element. It has the following attributes:

Attribute | Use | Description
voicename | optional | the name of the voice that is to be used
factory | optional, defaults to WAV_TTS | WAV_TTS or DIRECT_TTS: whether speech is produced directly by the TTS system (DIRECT_TTS) or by first having the TTS system generate a .wav file and then playing it back (WAV_TTS). DIRECT_TTS is experimental.
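
As a sketch of how the two attributes combine, the Voice element below requests a specific voice together with the default WAV-file playback. The voice name dfki-prudence is only an assumed example; replace it with a name that your TTS system actually reports.

<Loader id="speechengine" loader="asap.speechengine.loader.SpeechEngineLoader" requiredloaders="facelipsync,jawlipsync,ttsbinding">
  <!-- voicename is an assumed example; use a voice installed in your TTS system -->
  <Voice voicename="dfki-prudence" factory="WAV_TTS"/>
</Loader>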

A ttsbinding connects a specific text-to-speech system to the SpeechEngine.

Testing and listing available voices

The SpeechUI tool can be used to check which voices are available for your TTS system. It requires a GUI embodiment on which to place its drop-down list of voices.

Example configuration:

<Loader id="guiembodiment" loader="asap.realizerembodiments.JFrameEmbodiment">
  ...
</Loader>
<Loader id="speechengine" loader="asap.speechengine.loader.SpeechEngineLoader" requiredloaders="facelipsync,jawlipsync,guiembodiment">
  <Voice factory="WAV_TTS"/>
  <SpeechUI/>
</Loader>

The drop-down list shows the available voices and lets you select a voice to try out.

MaryTTS setup

A default MaryTTS configuration is available in HmiResource/MARYTTS (see GitRepositories). This configuration is kept small (<100 MB) by design; it is meant only for quick demonstration purposes. You can copy this configuration to a custom directory on your computer and install custom voices, or simply install a new version of MaryTTS. To make AsapRealizer use your own MaryTTS configuration, point the MaryTTS binding at your MaryTTS directory using either the localdir or the dir attribute of the MaryTTS element:

<Loader id="ttsbinding" loader="asap.maryttsbinding.loader.MaryTTSBindingLoader">
    <MaryTTS localdir="HmiResource/MARYTTS/resource/MARYTTS"/>
    <PhonemeToVisemeMapping resources="Humanoids/shared/phoneme2viseme/" filename="sampade2ikp.xml"/>
</Loader>

The MaryTTS element (optional) is used to specify where MaryTTS is installed.

Attribute | Description
localdir | the directory in which MARYTTS is installed, relative to shared.project.root
dir | the directory in which MARYTTS is installed (absolute path)
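
For an installation outside the project tree, the dir attribute can be used instead of localdir. A minimal sketch, in which the absolute path is purely hypothetical:

<Loader id="ttsbinding" loader="asap.maryttsbinding.loader.MaryTTSBindingLoader">
    <!-- hypothetical absolute path; point this at your own MaryTTS installation -->
    <MaryTTS dir="/opt/marytts"/>
    <PhonemeToVisemeMapping resources="Humanoids/shared/phoneme2viseme/" filename="sampade2ikp.xml"/>
</Loader>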

If neither dir nor localdir is specified, or the MaryTTS element is omitted, MaryTTS is assumed to be resolved through the ivy.xml file in your project, e.g.:

<dependency org="HMI"   name="MARYTTS"     rev="latest.${resolve.status}"   /> 

Microsoft SAPI setup

Microsoft SAPI works out of the box on Windows Vista and later.

Herwin: if someone really wants to know, I can probably dig up how to get it to work on Windows XP.

The following ttsbinding needs to be set up:

<Loader id="ttsbinding" loader="asap.sapittsbinding.loader.SapiTTSBindingLoader"/>
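
Combined with the typical setup shown earlier, a SAPI-based configuration might look as follows. This is only a sketch that swaps the ttsbinding loader in that setup; it assumes the SAPI binding needs no child elements, as in the one-line declaration above, and leaves the voice at its default.

<Loader id="facelipsync" requiredloaders="faceengine" loader="asap.faceengine.loader.TimedFaceUnitLipSynchProviderLoader">
  <MorphVisemeBinding resources="Humanoids/armandia/facebinding/" filename="ikpvisemebinding.xml"/>
</Loader>

<Loader id="jawlipsync" requiredloaders="animationengine" loader="asap.animationengine.loader.TimedAnimationUnitLipSynchProviderLoader">
  <SpeechBinding basedir="" resources="Humanoids/shared/speechbinding/" filename="ikpspeechbinding.xml"/>
</Loader>

<!-- assumption: no PhonemeToVisemeMapping or other child element is required here -->
<Loader id="ttsbinding" loader="asap.sapittsbinding.loader.SapiTTSBindingLoader"/>

<Loader id="speechengine" loader="asap.speechengine.loader.SpeechEngineLoader" requiredloaders="facelipsync,jawlipsync,ttsbinding">
  <Voice factory="WAV_TTS"/>
</Loader>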

TTS through Ipaaca

<Loader id="ttsbinding" loader="asap.ipaacattsbinding.loader.IpaacaTTSBindingLoader">
    <PhonemeToVisemeMapping resources="Humanoids/shared/phoneme2viseme/" filename="sampade2ikp.xml"/>
</Loader>

Connecting a new TTS system

Will be documented on request.

Lipsync providers

Face lip sync provider

Will be further documented on request.

Discussion

If you have any questions, documentation requests, hints, tips, or corrections to this document, please add them here by replying to this topic.