Version 27 (modified by welberge, 6 years ago)



AsapRealizer's default SpeechEngine supports multiple TTS systems. Lip sync is configured through lipsync providers. Loaders and default configurations are provided for MaryTTS and for TTS systems compatible with the Microsoft Speech API (SAPI).

XML setup

AsapRealizer's default SpeechEngine is set up through a SpeechEngineLoader XML description.

Typical setup:

<Loader id="facelipsync" requiredloaders="faceengine" loader="asap.faceengine.loader.TimedFaceUnitLipSynchProviderLoader">
  <MorphVisemeBinding resources="Humanoids/armandia/facebinding/" filename="ikpvisemebinding.xml"/>
</Loader>

<Loader id="jawlipsync" requiredloaders="animationengine" loader="asap.animationengine.loader.TimedAnimationUnitLipSynchProviderLoader">
  <SpeechBinding basedir="" resources="Humanoids/shared/speechbinding/" filename="ikpspeechbinding.xml"/>
</Loader>

<Loader id="ttsbinding" loader="asap.maryttsbinding.loader.MaryTTSBindingLoader">
  <PhonemeToVisemeMapping resources="Humanoids/shared/phoneme2viseme/" filename="sampade2ikp.xml"/>
</Loader>

<Loader id="speechengine" loader="asap.speechengine.loader.SpeechEngineLoader" requiredloaders="facelipsync,jawlipsync,ttsbinding">
  <Voice factory="WAV_TTS"/>
</Loader>

The connection to the TTS system and the voice selection are specified in the Voice element. It has the following attributes:

voicename (optional): the name of the voice to use.
factory (optional, defaults to WAV_TTS): WAV_TTS or DIRECT_TTS. Selects whether speech is produced directly by the TTS system (DIRECT_TTS) or by first having the TTS system generate a .wav file that is then played back (WAV_TTS). DIRECT_TTS is experimental.
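
As a sketch of how these attributes combine, the Voice element below selects a named voice and the experimental direct output mode. The voice name is a placeholder and is not taken from this page; substitute a name that your TTS system actually reports.

```xml
<Loader id="speechengine" loader="asap.speechengine.loader.SpeechEngineLoader" requiredloaders="facelipsync,jawlipsync,ttsbinding">
  <!-- voicename is a hypothetical placeholder; omitting factory falls back to WAV_TTS -->
  <Voice voicename="my-voice-name" factory="DIRECT_TTS"/>
</Loader>
```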

If neither marydir nor localmarydir is specified, MaryTTS is assumed to be resolved through the ivy.xml file in your project, e.g.:

<dependency org="HMI"   name="MARYTTS"     rev="latest.${resolve.status}"   /> 

Testing and listing available voices

The SpeechUI tool can be used to check which voices are available for your TTS system. It requires a GUI embodiment on which to place its drop-down list of voices.

Example configuration:

<Loader id="guiembodiment" loader="asap.realizerembodiments.JFrameEmbodiment">
</Loader>
<Loader id="speechengine" loader="asap.speechengine.loader.SpeechEngineLoader" requiredloaders="facelipsync,jawlipsync,guiembodiment">
  <Voice voicetype="MARY" factory="WAV_TTS"/>
</Loader>

The drop-down list shows the available voices and lets you select one to try out.

MaryTTS setup

A default MaryTTS configuration is available in HmiResource/MARYTTS (see GitRepositories). This configuration is kept small (<100 MB) by design; it is meant only for quick demonstration purposes. You can copy this configuration to a custom directory on your computer and install custom voices, or simply install a new version of MaryTTS. To make AsapRealizer use your own MaryTTS configuration, link the MaryTTS binding to your MaryTTS directory using either the localmarydir or the marydir attribute of the MaryTTS element:

<Loader id="ttsbinding" loader="asap.maryttsbinding.loader.MaryTTSBindingLoader">
    <MaryTTS localmarydir="HmiResource/MARYTTS/resource/MARYTTS"/>
    <PhonemeToVisemeMapping resources="Humanoids/shared/phoneme2viseme/" filename="sampade2ikp.xml"/>
</Loader>
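
For comparison, here is a sketch of the same binding using marydir instead. It assumes that marydir takes the path of a MaryTTS directory just as localmarydir does. The path shown is a hypothetical placeholder, and how the two attributes differ in path resolution is not covered on this page.

```xml
<Loader id="ttsbinding" loader="asap.maryttsbinding.loader.MaryTTSBindingLoader">
    <!-- hypothetical path: point this at your own MaryTTS directory -->
    <MaryTTS marydir="/path/to/your/MARYTTS"/>
    <PhonemeToVisemeMapping resources="Humanoids/shared/phoneme2viseme/" filename="sampade2ikp.xml"/>
</Loader>
```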

Microsoft SAPI setup

Microsoft SAPI works out of the box on Windows Vista and later.

The following ttsbinding needs to be set up:

<Loader id="ttsbinding" loader="asap.sapittsbinding.loader.SapiTTSBindingLoader"/>

Herwin: if someone really wants to know, I can probably dig up how to get it to work on Windows XP.

Connecting a new TTS system

Will be documented on request.

Lipsync providers

Face lip sync provider

Will be further documented on request.



If you have any questions, documentation requests, hints, tips, or corrections to this document, please add them here by replying to this topic.