Version 22 (modified by welberge, 7 years ago)

SpeechEngine

AsapRealizer's default SpeechEngine provides flexible support for different TTS systems. Lipsync is configured through lipsync providers. Loaders and default configurations are provided for MARYTTS and for Microsoft Speech API compatible TTS systems.

XML setup

AsapRealizer's default SpeechEngine is set up through a SpeechEngineLoader XML description.

Typical setup:

<Loader id="facelipsync" requiredloaders="faceengine" loader="asap.faceengine.loader.TimedFaceUnitLipSynchProviderLoader">
  <MorphVisemeBinding resources="Humanoids/armandia/facebinding/" filename="ikpvisemebinding.xml"/>
</Loader>

<Loader id="jawlipsync" requiredloaders="animationengine" loader="asap.animationengine.loader.TimedAnimationUnitLipSynchProviderLoader">
  <SpeechBinding basedir="" resources="Humanoids/shared/speechbinding/" filename="ikpspeechbinding.xml"/>
</Loader>
 
<Loader id="speechengine" loader="asap.speechengine.loader.SpeechEngineLoader" requiredloaders="facelipsync,jawlipsync">
  <Voice voicetype="MARY" factory="WAV_TTS"/>
</Loader>

The connection to the TTS system and the voice selection are specified in the Voice element. It has the following attributes:

Attribute | Use | Description
voicetype | required | either MARY for MARYTTS or SAPI5 for a Microsoft Speech API compatible voice (Windows only)
voicename | optional | the name of the voice that is to be used
localmarydir | MARYTTS only | the directory in which MARYTTS is installed, relative to shared.project.root
marydir | MARYTTS only | the directory in which MARYTTS is installed (absolute path)
factory | optional, defaults to WAV_TTS | WAV_TTS or DIRECT_TTS: whether speech is to be produced directly by the TTS system (DIRECT_TTS) or by first having the TTS system generate a .wav file and then playing that back (WAV_TTS). DIRECT_TTS is experimental.
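
As an illustration of how these attributes combine, a Voice element that selects a specific MARYTTS voice might look as follows. This is only a sketch: the voice name cmu-slt-hsmm is an assumed example value and may not be installed in your MARYTTS directory.

```xml
<Loader id="speechengine" loader="asap.speechengine.loader.SpeechEngineLoader" requiredloaders="facelipsync,jawlipsync">
  <!-- voicename is optional; "cmu-slt-hsmm" is an assumed example, use a voice that is actually installed -->
  <Voice voicetype="MARY" voicename="cmu-slt-hsmm" localmarydir="HmiResource/MARYTTS" factory="WAV_TTS"/>
</Loader>
```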

If neither marydir nor localmarydir is specified, MaryTTS is assumed to be resolved through the ivy.xml file in your project, e.g.:

<dependency org="HMI" name="MARYTTS" rev="latest.${resolve.status}"/>
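
For context, the dependency line sits inside the dependencies element of an ivy.xml file. The sketch below shows a minimal file of that shape; the organisation and module values are hypothetical placeholders, and your project's ivy.xml will likely contain further dependencies.

```xml
<ivy-module version="2.0">
  <!-- organisation and module are hypothetical placeholders for your own project -->
  <info organisation="MyOrganisation" module="MyAsapProject"/>
  <dependencies>
    <!-- the MARYTTS dependency exactly as given above -->
    <dependency org="HMI" name="MARYTTS" rev="latest.${resolve.status}"/>
  </dependencies>
</ivy-module>
```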

Testing and listing available voices

The SpeechUI tool can be used to check which voices are available for your TTS system. It requires a GUI embodiment on which to place its drop-down list of voices.

Example configuration:

<Loader id="guiembodiment" loader="asap.realizerembodiments.JFrameEmbodiment">
  ...
</Loader>
<Loader id="speechengine" loader="asap.speechengine.loader.SpeechEngineLoader" requiredloaders="facelipsync,jawlipsync,guiembodiment">
  <Voice voicetype="MARY" localmarydir="HmiResource/MARYTTS" factory="WAV_TTS"/>
  <SpeechUI/>
</Loader>

The drop-down list shows the available voices and allows you to select a voice to try out.

MaryTTS setup

A default MaryTTS configuration is available in HmiResource/MARYTTS (see GitRepositories). This configuration is kept small (<100 MB) by design; it is meant only for quick demonstration purposes. You can copy this configuration to a custom directory on your computer and install custom voices, or simply install a new version of MaryTTS. To make AsapRealizer use your own MaryTTS configuration, point it to your MaryTTS directory using either the localmarydir or the marydir attribute in Voice:

<Loader id="speechengine" loader="asap.speechengine.loader.SpeechEngineLoader" requiredloaders="facelipsync,jawlipsync">
  <Voice voicetype="MARY" localmarydir="MyMaryTTSDir" factory="WAV_TTS"/>
</Loader>
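
The example above uses localmarydir, which is resolved relative to shared.project.root. For a MaryTTS installation elsewhere on disk, the marydir attribute takes an absolute path instead. A sketch, assuming a hypothetical installation directory /opt/marytts:

```xml
<Loader id="speechengine" loader="asap.speechengine.loader.SpeechEngineLoader" requiredloaders="facelipsync,jawlipsync">
  <!-- marydir is an absolute path; "/opt/marytts" is a hypothetical location -->
  <Voice voicetype="MARY" marydir="/opt/marytts" factory="WAV_TTS"/>
</Loader>
```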

Microsoft SAPI setup

Microsoft SAPI works automatically on Windows >= Vista.

Herwin: if someone really wants to know, I can probably dig up how to get it to work on Windows XP.
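
A SAPI5 setup uses the same loader with voicetype set to SAPI5. The following is a sketch only: the voice name Microsoft Anna is an assumed example value, and the names actually available on your machine can be checked with the SpeechUI tool described above.

```xml
<Loader id="speechengine" loader="asap.speechengine.loader.SpeechEngineLoader" requiredloaders="facelipsync,jawlipsync">
  <!-- voicename is optional; "Microsoft Anna" is an assumed example value -->
  <Voice voicetype="SAPI5" voicename="Microsoft Anna" factory="WAV_TTS"/>
</Loader>
```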

Connecting a new TTS system

Will be documented on request.

Lipsync providers

Face lip sync provider

Will be further documented on request.

Discussion

If you have any questions, documentation requests, hints, tips, or corrections to this document, please add them here by replying to this topic.