System-wide speakup: Difference between revisions

From JookWiki
(Initial commit)
 
(Fixed your typo for ya xD)
 
(32 intermediate revisions by one other user not shown)
Line 1: Line 1:
WIP WIP WIP, idea dump for now, rename to Linux systemwide audio
Is it possible to use Speakup across multiple users on Linux? This article is the notes and ramblings of someone who tries to do that.


is it possible to play audio as two users on the same computer? today we find out!
== Background ==
The Linux kernel includes the [http://www.linux-speakup.org/ Speakup screen reader]. Unlike most desktop screen readers, Speakup will aid in reading the Linux virtual TTYs that you can switch to with Ctrl+Alt+F1 through Ctrl+Alt+F6 on most systems.


first attempt: shared group for audio device
This functionality is vitally important for troubleshooting in a Linux system and running command line applications, which is unfortunately one of the more accessible experiences on Linux.


result: fails if root plays audio first
Speakup alone is not enough to implement a screen reader: It still needs some speech synthesizer to turn text and control signals into audio. Once upon a time hardware synthesizers connected over serial were used to do this task, but as computers became more capable software synthesizers became viable.


second attempt: shared group for audio device + alsa dmix
Speakup supports software synthesizers by exposing a character device at /dev/softsynthu. A connector program connects to this device, implements the synthesizer protocol, and handles playback of speech. In practice the software synthesizer is [https://github.com/linux-speakup/espeakup espeakup], which plays speech using [https://github.com/espeak-ng/espeak-ng eSpeak NG].


result: works, but pulse bypasses this and locks the sound card anyway
== The problem ==
Linux is a multi-user system, but hardware and other resources are not. There's only one sound card and only one Speakup device. So these resources have to be shared somehow.


system and pulseaudio coordinate will snatch the audio whenever you change to a seat assigned to your user. maybe you could modify these to not cede the audio outside a specific seat (in this case seat = virtual tty you switch in linux with ctrl alt f1 through f7)?
Since around 2007 with the release of [https://www.freedesktop.org/wiki/Software/ConsoleKit/ ConsoleKit] the Linux desktop has handled sharing resources using the idea of seats. To put it simply:


so there's two solutions here:
* Devices get allocated to seats
* Logging in gives you a session
* Only one session can use a seat at a time
* Switching sessions requires handing over devices to another session


1. systemwide audio?
The Speakup device has no support for sharing, but the sound card does.


2. rewriting the program to run as multiple users?
This leads to the following situation:


here's some more nightmare stuff i realized: seats only give permission to device nodes. applications still have to gracefully hand off the hardware during a seat switch. pulse does this just by shutting up
# Linux boots
# systemd gives root the current audio device
# systemd starts espeakup as root
# You can read the login prompt or login as root
# You log in as your own user
# systemd gives your user the current audio device
# Your PulseAudio instance claims the device
# The root espeakup can no longer speak
# You can no longer use Speakup
# You might have no sound at all if PulseAudio couldn't claim the device because espeakup was still talking while PulseAudio was starting
# You switch to another TTY using Ctrl+Alt+F2
# systemd gives root the current audio device
# Your PulseAudio instance frees the device
# espeakup can talk again using the sound card


in fact if pulse tries to start and it doesn't immediately get hardware access on login it will freak the FUCK out and give you a dummy output that can't output anything
Because espeakup can only speak when root is using the current seat, it effectively becomes useless outside of logging in as root on your computer.


so the new idea is this:
It's also important to note that PulseAudio stores settings for volume and outputs. You might be using Bluetooth headphones at low volume but then switch to a root TTY and have espeakup blare loudly out your desktop speakers.


- have a proxy that sits between speakup and espeak
It really makes me wonder why PulseAudio doesn't have a per-seat instance that lets you switch between sessions and preserves audio configuration but still swaps out which user's applications can use audio.


- send messages to espeakup instances based on current active UID
As a quick note: PipeWire has the exact same behaviour, so for the most part you can substitute 'PulseAudio' with 'PipeWire' or any other sound server when reading this page.


- during a switch between instances, wait for the current instance to finish talking OR the stop talking control is sent. then start feeding the new instance data
== Existing solutions ==
Here are some solutions that kind of work but have severe trade-offs.


- have a shim that blocks pulseaudio from starting until it has permission, but also don't consume the buffer
=== Running PulseAudio system-wide ===
Running system-wide PulseAudio allows you to always have espeakup running and talking regardless of who is currently logged in.


i do not like how i'm basically reinventing flow control but poorly
However this has a major security concern, mainly that all users can now play and record audio from your sound card and other applications. For more information see [https://www.freedesktop.org/wiki/Software/PulseAudio/Documentation/User/WhatIsWrongWithSystemWide/ What is wrong with system mode?]


ok so it turns out i was WRONG: you can't share speakup protocol between multiple synths! the protocol is stateful! ie if you tell it to change voice and switch synth the voice change won't be applied. YAY
Running PipeWire system-wide is a little more complicated: You need to run both the PipeWire daemon and a session manager system-wide. This session manager needs to lock hardware and not give it up on seat switch.


so this makes the only viable solution to send PCM data from a root espeakup instance.
I'm not going to include instructions on how to do this as I can't provide support for it. Sorry.


on top of that this also means i have to modify espeakup to handle some flow control AND output to a buffer instead of the sound card
=== Running espeakup as your user ===
It's possible to run espeakup as your own user by:


pulseaudio doesn't gracefully give up audio, only alsa -> pulseaudio
* Taking control of the audio device away from logind
* Giving control of all audio devices to your user
* Giving control of the softsynth to your user
* Running espeakup as a daemon as your user
* Running espeakup at boot as your user
 
I've prepared steps to follow to get this working below.
 
Paste lines between <code>--- PASTE START ---</code> and <code>--- PASTE END ---</code> into the file specified or a terminal.
 
Step 1: Put this in <code>/etc/modules-load.d/speakup.conf</code>
 
--- PASTE START ---
speakup
speakup_soft
--- PASTE END ---
 
This will cause the speakup kernel modules to load at boot.
 
Step 2: Put this in <code>/etc/udev/rules.d/99-speakup.rules</code>
 
--- PASTE START ---
SUBSYSTEM=="sound", TAG-="seat", GROUP="audio"
KERNEL=="softsynth*", GROUP="audio"
--- PASTE END ---
 
This will do three things:
 
# Stop logind managing sound devices
# Give users in the audio group access to sound devices
# Give users in the audio group access to Speakup
 
Step 3: Put this in <code>~/.config/systemd/user/espeakup.service</code>
 
--- PASTE START ---
[Unit]
Description=Software speech output for Speakup
After=pulseaudio.service
[Service]
Environment="default_voice=" "ALSA_CARD="
ExecStart=/usr/bin/espeakup -d --default-voice=${default_voice}
Restart=always
Nice=-10
OOMScoreAdjust=-900
[Install]
WantedBy=default.target
--- PASTE END ---
 
This is a service that just runs the espeakup daemon. It is set to start after pulseaudio.
 
Change <code>pulseaudio.service</code> to <code>pipewire.service</code> if you're using PipeWire.
 
Step 4: Run these commands in a terminal as your user:
 
--- PASTE START ---
systemctl --user enable espeakup
loginctl enable-linger
sudo gpasswd -a $USER audio
--- PASTE END ---
 
This does three things:
 
# Enable the espeakup user service
# Enable running user services when logged out
# Add your user to the audio group
 
Step 5: Reboot and enjoy!
 
There are two downsides to this method:
 
# Your user can see what other users are reading, including root
# Other users can't play audio
 
== Ideas ==
There are some ideas I've considered to solve this problem.
 
=== Loaning PulseAudio to root ===
The main case here would be to allow root to use a user's PulseAudio install.
 
This would mean that multiple people can privately use a computer, with the exception of one user being able to read root's screen.
 
I'm not too sure what this buys compared to using sudo or something to act as root.
 
In practice I'm not sure how easy this would be to implement. You would need to replace the logind hardware management for sound with something else that allows finer grained management of hardware.
=== Sharing Speakup between multiple users ===
Things would be a lot easier if we could run one espeakup instance per user. This is tricky because the Speakup kernel modules don't have concepts of users or sessions.
 
There's a wide list of engineering problems to solve with this:
 
* Saving and restoring per-user Speakup settings
* Saving and restoring the softsynth state between users
* Proxying /dev/softsynth so users can't read each other's data
* Restricting access to the fakekey input device
* Handling graceful handovers from ALSA to PulseAudio
* Handling forceful handovers from PulseAudio to ALSA
* Shim PulseAudio so it waits for hardware access before running
* Flow control to indicate when the proxy is ready to send
* Flow control to indicate when espeakup is ready to talk
* Buffering data when espeakup isn't ready
* Discarding buffered data when speakup signals espeakup to shut up
* Handling messages sent back from espeakup to speakup and flow control for that
 
There's a lot to untangle here since we're touching multiple levels of abstraction.
 
=== Sharing espeakup output between multiple users ===
A more practical solution might be to run a root espeakup instance and proxy its PCM output to users.
 
This would cut down the engineering problems to:
 
* Assigning /sys/accessibility/speakup to the current seat
* Proxying PCM data to stub programs users run
* Handling graceful handovers from ALSA to PulseAudio
* Shim PulseAudio so it waits for hardware access before running
* Flow control to indicate when a proxy is ready to send
* Buffering data when a stub isn't playing audio
* Discarding buffered data when espeakup wants to shut up
 
This gives most of the benefits of sharing speakup but without saving or restoring state.
[[Category:Research]]

Latest revision as of 09:06, 2 November 2023

Is it possible to use Speakup across multiple users on Linux? This article is the notes and ramblings of someone who tries to do that.

Background

The Linux kernel includes the Speakup screen reader. Unlike most desktop screen readers, Speakup will aid in reading the Linux virtual TTYs that you can switch to with Ctrl+Alt+F1 through Ctrl+Alt+F6 on most systems.

This functionality is vitally important for troubleshooting in a Linux system and running command line applications, which is unfortunately one of the more accessible experiences on Linux.

Speakup alone is not enough to implement a screen reader: It still needs some speech synthesizer to turn text and control signals into audio. Once upon a time hardware synthesizers connected over serial were used to do this task, but as computers became more capable software synthesizers became viable.

Speakup supports software synthesizers by exposing a character device at /dev/softsynthu. A connector program connects to this device, implements the synthesizer protocol, and handles playback of speech. In practice the software synthesizer is espeakup, which plays speech using eSpeak NG.

The problem

Linux is a multi-user system, but hardware and other resources are not. There's only one sound card and only one Speakup device. So these resources have to be shared somehow.

Since around 2007 with the release of ConsoleKit the Linux desktop has handled sharing resources using the idea of seats. To put it simply:

  • Devices get allocated to seats
  • Logging in gives you a session
  • Only one session can use a seat at a time
  • Switching sessions requires handing over devices to another session

The Speakup device has no support for sharing, but the sound card does.

This leads to the following situation:

  1. Linux boots
  2. systemd gives root the current audio device
  3. systemd starts espeakup as root
  4. You can read the login prompt or login as root
  5. You log in as your own user
  6. systemd gives your user the current audio device
  7. Your PulseAudio instance claims the device
  8. The root espeakup can no longer speak
  9. You can no longer use Speakup
  10. You might have no sound at all if PulseAudio couldn't claim the device because espeakup was still talking while PulseAudio was starting
  11. You switch to another TTY using Ctrl+Alt+F2
  12. systemd gives root the current audio device
  13. Your PulseAudio instance frees the device
  14. espeakup can talk again using the sound card

Because espeakup can only speak when root is using the current seat, it effectively becomes useless outside of logging in as root on your computer.
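
The sequence above can be replayed with a toy model. This is only a sketch of the ownership rule described in the steps, assuming the sound card always follows the active session's user; the names Seat, switch_to and can_speak are made up for illustration and nothing here talks to logind or PulseAudio.

```python
from dataclasses import dataclass, field


@dataclass
class Seat:
    """Toy model of one seat whose sound card follows the active session."""

    device_owner: str = "root"  # user currently granted the sound card
    log: list = field(default_factory=list)

    def switch_to(self, user: str) -> None:
        # Handover on session switch: the active session's user gets the device.
        self.device_owner = user
        self.log.append(f"sound card -> {user}")

    def can_speak(self, espeakup_user: str = "root") -> bool:
        # espeakup can only play audio while its own user holds the device.
        return self.device_owner == espeakup_user


seat = Seat()
print(seat.can_speak())  # at boot, root holds the device
seat.switch_to("alice")  # you log in as your own user ("alice" is a placeholder)
print(seat.can_speak())  # the root espeakup is now locked out
seat.switch_to("root")   # Ctrl+Alt+F2 to a root TTY
print(seat.can_speak())
```

The model makes the core problem visible: speech only works while the user running espeakup is also the user holding the device.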

It's also important to note that PulseAudio stores settings for volume and outputs. You might be using Bluetooth headphones at low volume but then switch to a root TTY and have espeakup blare loudly out your desktop speakers.

It really makes me wonder why PulseAudio doesn't have a per-seat instance that lets you switch between sessions and preserves audio configuration but still swaps out which user's applications can use audio.

As a quick note: PipeWire has the exact same behaviour, so for the most part you can substitute 'PulseAudio' with 'PipeWire' or any other sound server when reading this page.

Existing solutions

Here are some solutions that kind of work but have severe trade-offs.

Running PulseAudio system-wide

Running system-wide PulseAudio allows you to always have espeakup running and talking regardless of who is currently logged in.

However this has a major security concern, mainly that all users can now play and record audio from your sound card and other applications. For more information see What is wrong with system mode?

Running PipeWire system-wide is a little more complicated: You need to run both the PipeWire daemon and a session manager system-wide. This session manager needs to lock hardware and not give it up on seat switch.

I'm not going to include instructions on how to do this as I can't provide support for it. Sorry.

Running espeakup as your user

It's possible to run espeakup as your own user by:

  • Taking control of the audio device away from logind
  • Giving control of all audio devices to your user
  • Giving control of the softsynth to your user
  • Running espeakup as a daemon as your user
  • Running espeakup at boot as your user

I've prepared steps to follow to get this working below.

Paste lines between --- PASTE START --- and --- PASTE END --- into the file specified or a terminal.

Step 1: Put this in /etc/modules-load.d/speakup.conf

--- PASTE START ---
speakup
speakup_soft
--- PASTE END ---

This will cause the speakup kernel modules to load at boot.

Step 2: Put this in /etc/udev/rules.d/99-speakup.rules

--- PASTE START ---
SUBSYSTEM=="sound", TAG-="seat", GROUP="audio"
KERNEL=="softsynth*", GROUP="audio"
--- PASTE END ---

This will do three things:

  1. Stop logind managing sound devices
  2. Give users in the audio group access to sound devices
  3. Give users in the audio group access to Speakup

Step 3: Put this in ~/.config/systemd/user/espeakup.service

--- PASTE START ---
[Unit]
Description=Software speech output for Speakup
After=pulseaudio.service
[Service]
Environment="default_voice=" "ALSA_CARD="
ExecStart=/usr/bin/espeakup -d --default-voice=${default_voice}
Restart=always
Nice=-10
OOMScoreAdjust=-900
[Install]
WantedBy=default.target
--- PASTE END ---

This is a service that just runs the espeakup daemon. It is set to start after pulseaudio.

Change pulseaudio.service to pipewire.service if you're using PipeWire.

Step 4: Run these commands in a terminal as your user:

--- PASTE START ---
systemctl --user enable espeakup
loginctl enable-linger
sudo gpasswd -a $USER audio
--- PASTE END ---

This does three things:

  1. Enable the espeakup user service
  2. Enable running user services when logged out
  3. Add your user to the audio group

Step 5: Reboot and enjoy!
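
After rebooting, a quick sanity check might look like the sketch below. It assumes the softsynth nodes are /dev/softsynth and /dev/softsynthu, that the group from Step 2 is audio, and that logind records lingering users as files under /var/lib/systemd/linger; the helper names group_can_rw and linger_enabled are made up for this page, so adjust anything that differs on your system.

```python
import grp
import os


def group_can_rw(path: str, group: str) -> bool:
    """True if `path` belongs to `group` and has group read+write bits set."""
    try:
        gid = grp.getgrnam(group).gr_gid
    except KeyError:
        return False  # the group does not exist on this system
    st = os.stat(path)
    return st.st_gid == gid and (st.st_mode & 0o060) == 0o060


def linger_enabled(user: str, linger_dir: str = "/var/lib/systemd/linger") -> bool:
    """Assumption: lingering users are marked by a file named after the user."""
    return os.path.exists(os.path.join(linger_dir, user))


if __name__ == "__main__":
    # Falls back to the numeric UID if $USER is not set.
    user = os.environ.get("USER") or str(os.getuid())
    for dev in ("/dev/softsynth", "/dev/softsynthu"):
        if os.path.exists(dev):
            print(dev, "audio group rw:", group_can_rw(dev, "audio"))
        else:
            print(dev, "missing (are the speakup modules loaded?)")
    print("linger enabled for", user + ":", linger_enabled(user))
```

If a check fails, the matching step above is the first place to look: Step 1 for a missing device, Step 2 for group access, Step 4 for lingering.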

There are two downsides to this method:

  1. Your user can see what other users are reading, including root
  2. Other users can't play audio

Ideas

There are some ideas I've considered to solve this problem.

Loaning PulseAudio to root

The main case here would be to allow root to use a user's PulseAudio install.

This would mean that multiple people can privately use a computer, with the exception of one user being able to read root's screen.

I'm not too sure what this buys compared to using sudo or something to act as root.

In practice I'm not sure how easy this would be to implement. You would need to replace the logind hardware management for sound with something else that allows finer grained management of hardware.

Sharing Speakup between multiple users

Things would be a lot easier if we could run one espeakup instance per user. This is tricky because the Speakup kernel modules don't have concepts of users or sessions.

There's a wide list of engineering problems to solve with this:

  • Saving and restoring per-user Speakup settings
  • Saving and restoring the softsynth state between users
  • Proxying /dev/softsynth so users can't read each other's data
  • Restricting access to the fakekey input device
  • Handling graceful handovers from ALSA to PulseAudio
  • Handling forceful handovers from PulseAudio to ALSA
  • Shim PulseAudio so it waits for hardware access before running
  • Flow control to indicate when the proxy is ready to send
  • Flow control to indicate when espeakup is ready to talk
  • Buffering data when espeakup isn't ready
  • Discarding buffered data when speakup signals espeakup to shut up
  • Handling messages sent back from espeakup to speakup and flow control for that

There's a lot to untangle here since we're touching multiple levels of abstraction.
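
To give a feel for the buffering and discarding items in that list, here is a minimal sketch. It is not how Speakup or espeakup work internally; SpeechBuffer and its methods are hypothetical, and the real softsynth protocol carries control signals that this ignores.

```python
from collections import deque


class SpeechBuffer:
    """Holds text for a synthesizer that may not be ready to talk yet."""

    def __init__(self, max_items: int = 64):
        # Bounded so a stalled synthesizer cannot grow the queue forever;
        # once full, the oldest entries are dropped first.
        self._queue = deque(maxlen=max_items)
        self.ready = False

    def push(self, text: str) -> list:
        """Queue text and return whatever can be spoken right now."""
        self._queue.append(text)
        return self.drain()

    def drain(self) -> list:
        """Hand over everything queued, but only if the synth is ready."""
        if not self.ready:
            return []
        items = list(self._queue)
        self._queue.clear()
        return items

    def shut_up(self) -> None:
        """Silence was requested: stale text must never be spoken later."""
        self._queue.clear()


buf = SpeechBuffer()
buf.push("login: ")     # synth not ready, so the text is held
buf.shut_up()           # silence requested, held text is dropped
buf.push("Password: ")
buf.ready = True        # e.g. the sound card was handed over
print(buf.drain())      # only the text queued after the silence request
```

Even this tiny version shows why the list is long: every item is another piece of state that has to stay consistent across a handover.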

Sharing espeakup output between multiple users

A more practical solution might be to run a root espeakup instance and proxy its PCM output to users.

This would cut down the engineering problems to:

  • Assigning /sys/accessibility/speakup to the current seat
  • Proxying PCM data to stub programs users run
  • Handling graceful handovers from ALSA to PulseAudio
  • Shim PulseAudio so it waits for hardware access before running
  • Flow control to indicate when a proxy is ready to send
  • Buffering data when a stub isn't playing audio
  • Discarding buffered data when espeakup wants to shut up

This gives most of the benefits of sharing speakup but without saving or restoring state.
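
A rough sketch of the routing part of this idea follows. It assumes each user runs a stub that accepts raw PCM chunks and that something tells the router which UID is active; PcmRouter, register, switch and shut_up are hypothetical names, and the actual transport (socket, pipe, and so on) is left out entirely.

```python
from collections import deque


class PcmRouter:
    """Routes PCM chunks from one root synthesizer to the active user's stub."""

    def __init__(self):
        self.stubs = {}         # uid -> callable that plays one bytes chunk
        self.active_uid = None  # uid of the session currently on the seat
        self.pending = deque()  # chunks waiting for a stub to appear

    def register(self, uid: int, play) -> None:
        self.stubs[uid] = play

    def switch(self, uid: int) -> None:
        # Drop audio queued for the previous user so the next user
        # never hears what was on someone else's screen.
        self.pending.clear()
        self.active_uid = uid

    def write(self, chunk: bytes) -> None:
        # Buffer first, then flush in order if the active user has a stub.
        self.pending.append(chunk)
        play = self.stubs.get(self.active_uid)
        while play is not None and self.pending:
            play(self.pending.popleft())

    def shut_up(self) -> None:
        self.pending.clear()


heard = {1000: [], 1001: []}
router = PcmRouter()
router.register(1000, heard[1000].append)
router.switch(1001)                        # active user has no stub yet
router.write(b"\x00\x01")                  # buffered
router.register(1001, heard[1001].append)
router.write(b"\x02\x03")                  # both chunks reach uid 1001 in order
print(heard)
```

Clearing the buffer on every switch is a deliberate choice in this sketch: losing a little speech seemed safer than leaking one user's screen contents to another.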