
Development/Tutorials/Text-To-Speech

From KDE TechBase
Revision as of 18:18, 27 June 2011 by Nicolas17 (talk | contribs) (add syntaxhighlighting and fix bullet formatting)

What is Text to Speech?

Jovie (called ktts in KDE 4.4 and earlier) is a subsystem of the KDE desktop that converts text to audible speech. Jovie is currently under development and aims to become the standard speech-output subsystem for all KDE applications.

How does it work?

Applications send the text they want spoken to Jovie over D-Bus. For example, the following terminal commands start Jovie and speak "Hello World":

# Start Jovie (if not already running)
jovie
# Send "Hello World" to Jovie for speaking in English.
qdbus org.kde.KSpeech /KSpeech say "Hello World" 0
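The two commands above can be wrapped in a small shell function so that scripts do not have to repeat them. The following is only a sketch: the function name `speak`, the variables `KSPEECH_SERVICE` and `SPEAK_DRY_RUN`, and the two-second startup pause are assumptions made for illustration and are not part of Jovie. The service name defaults to the one used in the command above.

```shell
#!/bin/sh
# Hypothetical helper (not part of Jovie): speak text via the KSpeech D-Bus interface.
# KSPEECH_SERVICE defaults to the service name used in the tutorial command;
# override it if your installation registers a different name.
KSPEECH_SERVICE="${KSPEECH_SERVICE:-org.kde.KSpeech}"

speak() {
    # With SPEAK_DRY_RUN=1, print the command instead of running it.
    if [ "${SPEAK_DRY_RUN:-0}" = "1" ]; then
        printf 'qdbus %s /KSpeech say "%s" 0\n' "$KSPEECH_SERVICE" "$1"
        return 0
    fi

    # Fail early if the qdbus tool is not installed.
    if ! command -v qdbus >/dev/null 2>&1; then
        echo "speak: qdbus not found" >&2
        return 1
    fi

    # Start Jovie if its service is not yet on the session bus.
    # (qdbus without arguments lists the registered service names.)
    if ! qdbus | grep -qF "$KSPEECH_SERVICE"; then
        jovie &
        sleep 2   # assumption: a short pause is enough for Jovie to register
    fi

    qdbus "$KSPEECH_SERVICE" /KSpeech say "$1" 0
}
```

After sourcing the file, `speak "Hello World"` behaves like the commands above, and `SPEAK_DRY_RUN=1 speak "Hello World"` only prints the qdbus call, which is handy for checking a script on a machine without Jovie.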

Using speech in your application

The above example shows how to try out speech from the command line, but inside your application you will likely want to use the D-Bus interface programmatically. The interface is defined in kdelibs/interfaces/kspeech/org.kde.KSpeech.xml and can be used in your application by putting the following in your CMakeLists.txt:

qt4_add_dbus_interfaces(my_SRCS ${KDE4_DBUS_INTERFACES_DIR}/org.kde.KSpeech.xml)

Then, in the source file where you want to use speech, #include <kspeech.h> to get the enumerations you need (e.g. KSpeech::soPlainText). Finally, create an object of type org::kde::KSpeech and use it as follows:

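// Connect to the KSpeech D-Bus interface on the session bus.
org::kde::KSpeech* kspeech = new org::kde::KSpeech("org.kde.kttsd", "/KSpeech", QDBusConnection::sessionBus());
// Identify the calling application, then queue the text as plain text.
kspeech->setApplicationName("myappname");
kspeech->say("text to speak", KSpeech::soPlainText);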

The KSpeech API

For a complete description of these and other commands, see the KDE Text-to-Speech API.

Jovie is not a speech synthesis engine itself; it passes the text on to speech-dispatcher, which drives the actual synthesizer. You must therefore install at least one compatible speech engine. Speech-dispatcher has a plugin architecture that makes it easy to write new plugins for other speech engines.

Why? Who needs it?

Jovie provides a common interface for all KDE applications to use for speaking. Programmers need not concern themselves with the details of the particular speech synthesis engine(s) used.

User Features

  • Speak any text from the clipboard.
  • Speak any plain text file.
  • Speak all or any portion of a text file from Kate, including instances where Kate is embedded in another application.
  • Speak all or any portion of an HTML page from Konqueror.
  • Use as the speech backend for KMouth.
  • Speak KDE notifications (KNotify).
  • Speech is produced via speech-dispatcher, so any speech-dispatcher backend can be used (espeak, festival, etc.).
  • User-configurable filters for substituting misspoken words, choosing speech synthesizers, and transforming XHTML/XML documents.

Programmer Features

  • Priority system for screen reader outputs, warnings and messages, while still playing regular texts.
  • Permit generation of speech from the command line (or via shell scripts) using the KDE D-Bus utilities, such as qdbus.
  • Provide a lightweight and easily usable interface for applications to generate speech output.
  • Applications need not be concerned about contention over the speech device.
  • FUTURE: Provide support for speech markup languages, such as VoiceXML, Sable, Java Speech Markup Language (JSML), and Speech Markup Meta-language (SMML).
  • FUTURE: Provide limited support for embedded speech markers.
  • Asynchronous operation, so that applications are not blocked while text is spoken.

It is hoped that more programmers will begin adding speech capabilities to their KDE programs using Jovie.
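As an illustration of speech generated from a shell script, the sketch below runs a command and then announces whether it succeeded. The function names `say_text` and `announce` and the `SAY_CMD` variable are hypothetical and exist only in this example; by default the text is sent with the same qdbus call shown earlier in this tutorial.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Jovie).
# SAY_CMD is an assumption of this sketch: a command that speaks one sentence.
# When it is unset, the qdbus call from the tutorial is used.
say_text() {
    if [ -n "${SAY_CMD:-}" ]; then
        $SAY_CMD "$1"
    else
        qdbus org.kde.KSpeech /KSpeech say "$1" 0
    fi
}

announce() {
    # Run the given command and remember its exit status.
    "$@"
    rc=$?
    if [ "$rc" -eq 0 ]; then
        say_text "Command finished successfully"
    else
        say_text "Command failed with status $rc"
    fi
    # Pass the original exit status on to the caller.
    return "$rc"
}
```

For example, `announce make` would build a project and speak the outcome. Setting `SAY_CMD=echo` prints the message instead of speaking it, which makes the script easy to try without a running speech service.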

Different parts

Jovie actually consists of a few programs:

Jovie

The KDE Text-to-Speech system, a system tray application that provides TTS support to KDE applications. Applications initiate TTS by making D-Bus calls to Jovie.

kcmkttsd

A KControl module for configuring the text-to-speech system. kcmkttsd runs in the KDE Control Center; you can also start it with the command "kcmshell4 kcmkttsd" or by choosing "configure" from the menu of the Jovie system tray icon.

ktexteditor_kttsd

A plugin for the KDE Advanced Text Editor (Kate) that lets you speak an entire text file or any portion of a file.

libkhtmlkttsdplugin

A plugin for Konqueror that permits you to speak all or any portion of an HTML web page.

Requirements

  • KDE 4.4 or later.
  • speech-dispatcher version 0.6.7 or later.
  • A speech synthesizer such as espeak, festival, or flite.