Using Multiple TTS Streams On The Emacspeak Audio Desktop
1 Executive Summary
Emacspeak now uses multiple text-to-speech streams — as an example,
this enables spoken notifications that do not interrupt ongoing spoken
output. To make such notifications more perceivable, Emacspeak places
notifications to the right of the user by leveraging Linux-ALSA
features that allow one to scale the amplitude of the left and right
audio channels.
2 Background
Until now, Emacspeak has used a single instance of a Text-To-Speech
(TTS) engine to produce all spoken feedback. An unfortunate
consequence is that any spoken announcement necessarily interrupts
ongoing speech; as an example, an incoming instant-message (e.g.,
Jabber notification) can interrupt what you're currently
reading.
Emacs itself produces a large number of asynchronous messages,
depending on the number of processes running within Emacs. At present,
all Emacs-generated messages are treated equally, though there are
ongoing plans to improve this situation, e.g., using package alert.
With Emacspeak now able to use multiple TTS streams, the arrival of
package alert within Emacs should facilitate smarter handling of
different categories of messages over time.
Playing multiple TTS streams simultaneously can make it hard to
understand the resulting output; Emacspeak leverages underlying ALSA
functionality to send notifications to a virtual ALSA device that
places the auditory output mostly on the right channel. See the
following paragraphs on setup/configuration. I'm presently using this
on Linux with the linux-outloud voice; you need to have a copy of
this TTS engine installed and working (see Voxin for details on
obtaining that engine). Note: the Emacspeak espeak server does not
use raw ALSA for its output; consequently, notifications produced
by espeak play on both left and right channels, making the combined
output impossible to understand. The mac server may be able to support
this functionality using something Mac-specific; patches welcome.
3 Emacspeak Setup
- Emacspeak now adds user-option emacspeak-tts-use-notify-stream. If
  this is set to t in the user's initialization file before Emacspeak
  is loaded, Emacspeak checks to see if the user's selected TTS engine
  supports multiple instances, and if so launches a second instance of
  the TTS engine for use as a Notification TTS Stream. See my
  tvr/emacs-startup.el in the Emacspeak Git Repository for an example
  setup.
- The Notification TTS Stream can be restarted via command
  dtk-notify-initialize, bound to C-e d C-n. You should ordinarily not
  need to invoke this command.
- The Notification TTS Stream can be shut down using command
  dtk-notify-shutdown, bound to C-e d C-s. When the Notification TTS
  Stream is not available, Emacspeak defaults to using a single TTS
  stream for all spoken output, i.e., no change.
- At present, Emacspeak tries to use a separate Notification TTS
  Stream when the selected TTS engine is a software TTS running
  locally.
- File servers/linux-outloud/notify-asoundrc contains the .asoundrc
  that I am using on my Thinkpad. To have Emacspeak place the
  Notification TTS Stream mostly on the right, the contents of that
  file (suitably modified for your sound card) need to be placed in
  file $HOME/.asoundrc. Warning: Handle with care, since a broken
  .asoundrc can kill all audio output.
- The .asoundrc scales left and right amplitude to place the output
  mostly on the right. To change this behavior, you can edit the
  Transformation Table for virtual device tts_mono in the .asoundrc
  file.
- This setup has not been tested with pulseaudio.
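
As a minimal sketch of the first item above, an init-file fragment
might look like the following. The load path is a placeholder and an
assumption, not something taken from my own setup; adjust it to
wherever Emacspeak lives on your machine. The only essential point is
the ordering: the option is set before Emacspeak is loaded.

```elisp
;; Sketch only: enable the Notification TTS Stream.
;; This must be evaluated BEFORE Emacspeak is loaded.
(setq emacspeak-tts-use-notify-stream t)

;; Load Emacspeak afterwards.
;; ASSUMPTION: "/path/to/emacspeak" is a placeholder for your own checkout.
(load-file "/path/to/emacspeak/lisp/emacspeak-setup.el")
```

If the selected TTS engine does not support multiple instances,
Emacspeak should simply continue with a single TTS stream, as described
in the list above.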
4 Summary
Share and enjoy.