System-wide speakup

From JookWiki

Revision as of 10:43, 11 October 2022

Is it possible to play audio 'globally' or system-wide in Linux? This article is the notes and ramblings of someone who tries to do that.

Background

The Linux kernel includes the Speakup screen reader. Unlike most desktop screen readers, Speakup will aid in reading the Linux virtual TTYs that you can switch to with Ctrl+Alt+F1 through Ctrl+Alt+F6 on most systems.

This functionality is vitally important for troubleshooting in a Linux system and running command line applications, which is unfortunately one of the more accessible experiences on Linux.

Speakup alone is not enough to implement a screen reader: it still needs a speech synthesizer to turn text and control signals into audio. Hardware synthesizers connected over a serial port used to do this job, but as computers became more capable, software synthesizers became viable.

Speakup supports software synthesizers by exposing a character device at /dev/softsynthu. A connector program opens this device, implements the synthesizer protocol and handles playback of speech. In practice the connector is espeakup (https://github.com/linux-speakup/espeakup), which plays speech using eSpeak NG (https://github.com/espeak-ng/espeak-ng).
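A minimal sketch of the connector's side of this, to make the "text and control signals" part concrete. It only splits a byte stream into text to speak and "stop talking" events. The control byte value 0x18 and the helper name `parse_events` are assumptions made for illustration; the real protocol has more commands than this and should be checked against the Speakup and espeakup sources.

```python
# Assumed "stop talking" control byte; verify against the real protocol.
STOP = 0x18


def parse_events(data: bytes) -> list[tuple[str, str]]:
    """Split raw bytes into ("speak", text) and ("stop", "") events.

    Hypothetical helper: everything that is not the STOP byte is treated
    as text, which ignores every other control sequence.
    """
    events = []
    text = bytearray()
    for byte in data:
        if byte == STOP:
            # Flush any text collected so far, then record the stop request.
            if text:
                events.append(("speak", text.decode("utf-8", errors="replace")))
                text.clear()
            events.append(("stop", ""))
        else:
            text.append(byte)
    if text:
        events.append(("speak", text.decode("utf-8", errors="replace")))
    return events


events = parse_events(b"hello\x18world")
```

A real connector would get these bytes by reading /dev/softsynthu in binary mode and would hand the text to the synthesizer instead of collecting it in a list.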

The problem

Linux is a multi-user system, but hardware and other resources are not. There's only one sound card and only one Speakup device. So these resources have to be shared somehow.

Since around 2007, with the release of ConsoleKit (https://www.freedesktop.org/wiki/Software/ConsoleKit/), the Linux desktop has handled sharing resources using the idea of seats. To put it simply:

  • Devices get allocated to seats
  • Logging in gives you a session
  • Only one session can use a seat at a time
  • Switching sessions requires handing over devices to another session

The Speakup device has no support for sharing, but the sound card does.

This leads to the following situation:

  1. Linux boots
  2. systemd gives root the current audio device
  3. systemd starts espeakup as root
  4. You can read the login prompt or log in as root
  5. You log in as your own user
  6. systemd gives your user the current audio device
  7. Your PulseAudio instance claims the device
  8. The root espeakup can no longer speak
  9. You can no longer use Speakup
  10. You might have no sound at all: if espeakup was still talking while PulseAudio was starting, PulseAudio may have failed to claim the device
  11. You switch to another TTY using Ctrl+Alt+F2
  12. systemd gives root the current audio device
  13. Your PulseAudio instance frees the device
  14. espeakup can talk again using the sound card

Because espeakup can only speak while root's session is the active one on the seat, Speakup is effectively useless unless you are logged in as root.
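The sequence above can be sketched as a toy model. This is not how systemd or PulseAudio are implemented; the `Seat` class and the user name `alice` are made up for illustration, and the only rule modelled is "the sound card follows whichever user's session is active".

```python
from dataclasses import dataclass, field


@dataclass
class Seat:
    """Toy seat: exactly one user holds the audio device at a time."""

    active_user: str = "root"  # at boot, root is given the audio device
    log: list = field(default_factory=list)

    def activate(self, user: str) -> None:
        # Switching sessions hands the audio device to the new user.
        self.active_user = user
        self.log.append(f"audio device handed to {user}")

    def can_speak(self, process_user: str) -> bool:
        # A process can only play audio while its user holds the device.
        return process_user == self.active_user


seat = Seat()
at_boot = seat.can_speak("root")       # espeakup runs as root

seat.activate("alice")                 # you log in as your own user
after_login = seat.can_speak("root")   # root espeakup has lost the device

seat.activate("root")                  # Ctrl+Alt+F2 back to a root TTY
after_switch = seat.can_speak("root")  # espeakup can talk again
```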

Linux audio architecture

learn what's actually going on here

hardware:

- oss

- alsa

- audio group

sound servers:

- dmix

- esd

- arts

- phonon

- pulseaudio

- auth using a cookie (even dmix)

- include multiple

one per user?

- handover during login

- handover during seat swapping

- there's no system-wide audio

TODO: when did sound servers lock hardware?

TODO: when did sound servers give up hardware for seats?

pulseaudio doesn't gracefully give up audio, only alsa -> pulseaudio

- pulseaudio startup race

- pipewire has most of the same problems

- root cannot use pulseaudio prefs from your user

honestly the more i think about it the more it seems like pulseaudio should belong to a seat instead of a specific user, with sessions switching between use of it

Attempt 1: Sharing Speakup

- have a proxy that sits between speakup and espeak

- send messages to espeakup instances based on current active UID

- during a switch between instances, wait for the current instance to finish talking OR the stop talking control is sent. then start feeding the new instance data

- have a shim that blocks pulseaudio from starting until it has permission, but also don't consume the buffer

i do not like how i'm basically reinventing flow control but poorly

ok so it turns out i was WRONG: you can't share speakup protocol between multiple synths! the protocol is stateful! ie if you tell it to change voice and switch synth the voice change won't be applied. YAY
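A small model of why this fails, assuming a made-up message format of (kind, value) pairs rather than the real Speakup protocol. `FakeSynth` and `Proxy` are hypothetical names; the point is only that a setting sent to one instance never reaches the other.

```python
class FakeSynth:
    """Stand-in for one espeakup instance; remembers the last voice it was given."""

    def __init__(self):
        self.voice = "default"
        self.spoken = []

    def handle(self, message):
        kind, value = message
        if kind == "voice":
            self.voice = value  # state that lives only in this instance
        elif kind == "text":
            self.spoken.append((self.voice, value))


class Proxy:
    """Forwards each message to the synth of the currently active UID."""

    def __init__(self, synths):
        self.synths = synths  # maps UID -> FakeSynth
        self.active_uid = 0

    def switch(self, uid):
        self.active_uid = uid

    def send(self, message):
        self.synths[self.active_uid].handle(message)


root_synth, user_synth = FakeSynth(), FakeSynth()
proxy = Proxy({0: root_synth, 1000: user_synth})

proxy.send(("voice", "en-gb"))  # voice change goes to root's synth only
proxy.send(("text", "login:"))
proxy.switch(1000)              # user logs in, proxy switches instance
proxy.send(("text", "hello"))   # spoken with the wrong (default) voice
```

To fix this, the proxy would have to track and replay every piece of protocol state on each switch, which amounts to reimplementing the protocol.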

Attempt 2: Sharing synth data

so the only viable solution left is to send PCM data out of a single root espeakup instance.

on top of that this also means i have to modify espeakup to handle some flow control AND output to a buffer instead of the sound card
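A rough sketch of what "a buffer with flow control" could look like, not taken from espeakup. `PcmBuffer` and its methods are hypothetical: the producer is told to wait when the buffer is full, and a stop request throws away audio that has not been played yet.

```python
from collections import deque


class PcmBuffer:
    """Bounded queue of PCM chunks between a synth and whoever plays them."""

    def __init__(self, max_chunks: int = 4):
        self.max_chunks = max_chunks
        self.chunks = deque()

    def push(self, chunk: bytes) -> bool:
        # Flow control: refuse the chunk when full so the producer has to wait.
        if len(self.chunks) >= self.max_chunks:
            return False
        self.chunks.append(chunk)
        return True

    def pull(self):
        # The consumer takes the oldest chunk, or None if nothing is queued.
        return self.chunks.popleft() if self.chunks else None

    def stop(self) -> None:
        # "Stop talking": drop everything that has not been played yet.
        self.chunks.clear()


buf = PcmBuffer(max_chunks=2)
first = buf.push(b"\x00\x01")
second = buf.push(b"\x02\x03")
third = buf.push(b"\x04\x05")  # buffer is full, producer must retry later
oldest = buf.pull()
buf.stop()
after_stop = buf.pull()
```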

Attempt 3: One Pulse for all

- user pulse for all system