Richard Maw Text To Speech

What do you do if you need to be present on a call but you've lost your voice? Why, write an 11-line shell script to replace it, of course!

Our first port of call is of course Google! So what is the first result for "linux text to speech"? Well, for me it's "RPi Text to Speech (Speech Synthesis)", and deep down in there, after Cepstral, it's Festival of course!

So how do we use that?

First we need to install it. Since I wrote this on an Ubuntu system I do:

$ sudo apt install festival

This installs the festival command, which has a convenient --tts option!

$ echo hello world | festival --tts

This however has two problems:

  1. It is fatiguing on the fingers to tweak the parameters and run the command.
  2. The output of the command is to the speakers rather than a microphone.

We can fix problem 1 with a trivial shell script to produce output after every line instead.

#!/bin/sh
while read -r REPLY; do
        printf %s "$REPLY" | festival --tts
done

The problem of outputting to a microphone is somewhat more involved.

It's possible to loop your speaker output through a recording device in Pulse Audio by setting the recording device to a "Monitor" device.

It's no doubt possible to drive this from the command-line, but since my chat software is graphical I've no problem using pavucontrol.

Once the chat software is running, open the "Recording" tab and change the input device of the application.

This works but is unsatisfying: unless you have a second output device, every other sound you play is broadcast too, and the setup is prone to causing feedback.

What we need is some kind of virtual microphone. As usual, the likes of Google and StackOverflow come to the rescue: the nearest thing to a virtual microphone is what Pulse Audio calls a "null sink", a dummy output device whose monitor can be used as a recording device.

We can create a null sink and give it a recognisable name by running:

pacmd load-module module-null-sink sink_name=Festival
pacmd update-sink-proplist Festival device.description=Festival

Then we can remove it again by running:

pacmd unload-module module-null-sink
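
Unloading by module name removes every null sink, not only the one created here. A rough sketch of one alternative follows. It swaps pacmd for pactl, since pactl load-module prints the index of the module it loaded, and it sets the description through the module's sink_properties argument instead of update-sink-proplist. The function names are my own and hypothetical.

```shell
#!/bin/sh
# Sketch: load the null sink and report the module index that pactl prints,
# so that only this one module is unloaded later.

create_festival_sink() {
        # pactl load-module writes the index of the new module to standard output.
        pactl load-module module-null-sink \
                sink_name=Festival \
                sink_properties=device.description=Festival
}

remove_festival_sink() {
        # $1 is the module index printed by create_festival_sink.
        pactl unload-module "$1"
}

# Possible use in a script (left as comments so the functions can be sourced):
#   module_id=$(create_festival_sink) || exit 1
#   trap 'remove_festival_sink "$module_id"' EXIT
```

The EXIT trap runs when the script finishes normally; whether it also runs after Ctrl-C varies between shells, so a more careful script may want to trap INT and TERM as well.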

So how do we get festival to play its output to that?

We can't start the command and then change its output device in pavucontrol, because it doesn't run for long enough to do that before it starts playing.

We can play audio to a specified device with the paplay command, but how do we get Festival to output through it?

Fortunately Festival lets you set some parameters in its scripting language.

We need to pick a common audio format that paplay can read and festival can produce. We can set this with:

(Parameter.set 'Audio_Required_Format 'aiff)

We need to tell festival to play audio through a specified Pulse Audio device. The best way I could find to do this was setting Audio_Method to Audio_Command and Audio_Command to a paplay command.

(Parameter.set 'Audio_Method 'Audio_Command)
(Parameter.set 'Audio_Command "paplay $FILE --client-name=Festival --stream-name=Speech --device=Festival")

Festival lets us run commands on its command-line so the final script we get is:

#!/bin/sh
pacmd load-module module-null-sink sink_name=Festival
pacmd update-sink-proplist Festival device.description=Festival
while read -r REPLY; do
        festival --batch \
                '(Parameter.set '\''Audio_Required_Format '\''aiff)' \
                '(Parameter.set '\''Audio_Method '\''Audio_Command)' \
                '(Parameter.set '\''Audio_Command "paplay $FILE --client-name=Festival --stream-name=Speech --device=Festival")' \
                '(SayText "'"$REPLY"'")'
done
pacmd unload-module module-null-sink

  1. Run that in a terminal window.
  2. Start your chat program.
  3. Start pavucontrol and change the input device of your program to Festival.
  4. Type lines of text into the terminal to speak.
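
Step 3 can probably be done from the command line too. The sketch below is untested against any particular chat program. It assumes the null sink is named Festival as above, relies on Pulse Audio naming a sink's monitor source "<sink_name>.monitor", and uses function names that are my own.

```shell
#!/bin/sh
# List the active recording streams (source outputs), one per line,
# with the stream index in the first column.
list_recording_streams() {
        pactl list short source-outputs
}

# Re-route recording stream $1 to the monitor of the Festival null sink.
use_festival_as_microphone() {
        pactl move-source-output "$1" Festival.monitor
}
```

Run list_recording_streams while the chat program is recording, pick out its index, and pass that index to use_festival_as_microphone.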

Since this was a project to achieve the goal of being able to participate in a group chat without being able to speak, development stopped there.

Should further development be warranted other changes could include:

  1. The module load and unload process is pretty fragile. Would need to use an API that is tied to process lifetime or at least unload by ID rather than name.
  2. No escaping mechanism for $REPLY. Would need to learn string escaping in lisp.
  3. Lots of work done per line of text. Festival has a server mode which could reduce the amount of work per line.
  4. Investigate a way to pipe audio directly between Festival and Pulse Audio. text2wave exists to write to a file, possibly standard output, and pacat exists to take audio from standard input and put it to speakers, but I couldn't get it to work at the time.
  5. Replace festival entirely. It is in need of maintainership, and has been broken in Fedora releases, so replacing the voice generation with pyttsx, espeak or flite could help.
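
For item 2, a minimal sketch of an escaping step. It assumes that backslash and double quote are the only characters that need escaping inside a Festival string literal, which I have not checked against Festival's reader. The function names are hypothetical.

```shell
#!/bin/sh
# Escape one line of text for use inside a double-quoted Festival (Scheme)
# string by putting a backslash before every backslash and double quote.
# Assumption: no other character needs escaping.
scheme_escape() {
        printf %s "$1" | sed 's/[\\"]/\\&/g'
}

# Build the SayText expression for one line of input.
say_expression() {
        printf '(SayText "%s")' "$(scheme_escape "$1")"
}
```

In the loop, the last argument to festival would then become "$(say_expression "$REPLY")" in place of the hand-quoted SayText string.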

Posted Wed Apr 4 15:23:39 2018

Unless your program is compiled into one big binary lump it will typically need to load other assets on program start.

This is usually libraries, though other assets may also be required.

Your programming environment will define some standard locations (see hier(7) for some examples), but will normally have a way to specify more.

  • Dynamically linked C programs will look for libraries in the colon-separated list of directories in the LD_LIBRARY_PATH environment variable.
  • Python programs will look in PYTHONPATH.
  • Lua programs will look in LUA_PATH, LUA_CPATH and other environment variables based on the version of the Lua interpreter.
  • Java will look in its class path, which can be set with the -classpath option.
  • Executables will be sought in the directories listed in the PATH environment variable.
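
PATH and LD_LIBRARY_PATH are colon-separated lists, so adding a location usually means prepending to whatever is already set. A small sketch of that, with a hypothetical helper name and a hypothetical APP_DIR install location:

```shell
#!/bin/sh
# Print $2 with directory $1 added at the front, using ":" as the separator.
# If $2 is empty the result is just $1, which avoids a trailing colon: for
# PATH and LD_LIBRARY_PATH an empty entry means the current directory.
prepend_path() {
        printf '%s%s\n' "$1" "${2:+:$2}"
}

# Possible use in a wrapper script (APP_DIR is hypothetical):
#   LD_LIBRARY_PATH=$(prepend_path "$APP_DIR/lib" "${LD_LIBRARY_PATH:-}")
#   export LD_LIBRARY_PATH
```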

If you only need assets in the standard locations then you wouldn't normally need to change anything.

However you're not always able to stick to only distribution provided software.

In this case you need to use software which has "bundled" its dependencies alongside its own software.

Linux ELF binaries can make use of their "RPATH" to add extra paths, but most executable formats don't have a direct equivalent.

In that case we can instead specify the new locations with a wrapper script. The standard trick is to use $0 for the name of the script, readlink(1) -f to turn that into an absolute path, and dirname(1) to get the directory the script is located in.

#!/bin/sh
D="$(dirname "$(readlink -f "$0")")"
cp="$(set -- "$D/support/jars/"*.jar; IFS=:; printf %s "$*")"
exec java -classpath "$cp" com.myapplication.Main "$@"

This works for running the script in the directory the assets are stored in, but it can be convenient to add the program to a directory in PATH.

If written as a bash script you can use $BASH_SOURCE, which is guaranteed to be the path of the script. In circumstances I can no longer reproduce, I needed to use it instead of $0.

#!/bin/bash
D="$(dirname "$(readlink -f "${BASH_SOURCE}")")"
cp="$(set -- "$D/support/jars/"*.jar; IFS=:; printf %s "$*")"
exec java -classpath "$cp" com.myapplication.Main "$@"
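
Both wrapper scripts build the class path with the same idiom: set the positional parameters to the matching jar files, then expand "$*", which joins them using the first character of IFS. Pulled out into a function (the name is my own) it looks roughly like this:

```shell
#!/bin/sh
# Join all arguments with ":" by expanding "$*", which separates the
# positional parameters with the first character of IFS.
# The function body is a subshell "( ... )", so the IFS change does not
# leak into the caller, just as "$( ... )" contains it in the wrappers.
join_with_colons() (
        IFS=:
        printf '%s\n' "$*"
)
```

One thing to watch for: if no jar files match, the shell leaves the unexpanded pattern in place, so the class path would contain a literal "*.jar" entry.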

Posted Wed Apr 11 12:00:08 2018