Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Distorted audio with moonlight-qt and sdl2-compat on Arch Linux #329

Open
rubin55 opened this issue Feb 8, 2025 · 17 comments
Open

Distorted audio with moonlight-qt and sdl2-compat on Arch Linux #329

rubin55 opened this issue Feb 8, 2025 · 17 comments
Milestone

Comments

@rubin55
Copy link

rubin55 commented Feb 8, 2025

I'm running Arch Linux. I use moonlight-qt to stream my desktop (which runs sunshine) to my laptop (all systems running Arch Linux).

Since the upgrade of sdl3 and introduction of sdl2-compat, I have highly distorted audio. It is specifically not due to network latency, which sounds very different. This sounds like way too loud oversteering or clipping, robotic. It sounds similar to libsdl-org/SDL#12150.

I've made a video to illustrate the issue (zipped, because I can apparently not upload an mp4 directly):
moonlight-qt-sdl2-compat-distorted-audio.zip

@slouken
Copy link
Collaborator

slouken commented Feb 8, 2025

Can you please note the versions of SDL and sdl2-compat you're using?

@slouken slouken transferred this issue from libsdl-org/SDL Feb 8, 2025
@slouken slouken added this to the 2.32.52 milestone Feb 8, 2025
@rubin55
Copy link
Author

rubin55 commented Feb 8, 2025

I'm running Arch Linux with kernel 6.13.1, Gnome on Wayland. Package versions:

  • moonlight-qt 6.1.0-4
  • sdl3 3.2.4-1
  • sdl2-compat 2.30.54-1

@slouken
Copy link
Collaborator

slouken commented Feb 8, 2025

Thanks, we'll take a look!

@cgutman
Copy link
Contributor

cgutman commented Feb 8, 2025

Interesting, I was actually just running this test with the exact same configuration (Arch Linux client too) and audio was okay outputing to an HDMI audio device (monitor).

When you start a stream Moonlight will print some message about the audio configuration which look like:

00:00:12 - SDL Warn (0): Starting audio stream...
00:00:12 - SDL Warn (0): Desired audio buffer: 1440 samples (46080 bytes)
00:00:12 - SDL Warn (0): Obtained audio buffer: 1440 samples (46080 bytes)
00:00:12 - SDL Warn (0): SDL audio driver: pulseaudio
00:00:12 - SDL Warn (0): Audio stream has 8 channels

Can you post those from your test?

@rubin55
Copy link
Author

rubin55 commented Feb 8, 2025

In my case its:

00:00:09 - SDL Warn (0): Starting audio stream...
00:00:09 - SDL Warn (0): Desired audio buffer: 720 samples (5760 bytes)
00:00:09 - SDL Warn (0): Obtained audio buffer: 720 samples (5760 bytes)
00:00:09 - SDL Warn (0): SDL audio driver: pipewire
00:00:09 - SDL Warn (0): Received first video packet after 0 ms
00:00:09 - SDL Warn (0): Audio stream has 2 channels

To be 100% sure it's sdl2-compat related, I downgraded to sdl2 2.30.9-1 and sure enough, audio works normally. When using "regular" sdl2 that log output looks like yours (i.e., pulseaudio instead of pipewire):

00:00:08 - SDL Info (0): Starting audio stream...
00:00:08 - SDL Info (0): Desired audio buffer: 720 samples (5760 bytes)
00:00:08 - SDL Info (0): Obtained audio buffer: 720 samples (5760 bytes)
00:00:08 - SDL Info (0): SDL audio driver: pulseaudio
00:00:08 - SDL Info (0): Audio stream has 2 channels

@rubin55
Copy link
Author

rubin55 commented Feb 8, 2025

Setting the audio driver to pulseaudio gets rid of the distortion when using sdl2-compat:

export SDL_AUDIODRIVER=pulseaudio
moonlight

Note that I also tested downgrade to regular sdl2 and then export SDL_AUDIODRIVER=pipewire to see if the distortion also exists with regular sdl2 combined with pipewire, but it did not (i.e., it worked normally, good sound).

@slouken
Copy link
Collaborator

slouken commented Feb 8, 2025

What happens if you set the environment variable SDL_AUDIO_DEVICE_SAMPLE_FRAMES=720?

@rubin55
Copy link
Author

rubin55 commented Feb 9, 2025

What happens if you set the environment variable SDL_AUDIO_DEVICE_SAMPLE_FRAMES=720?

That works; audiodriver is mentioned as being pipewire in the logs, audio sounds normal!

@slouken
Copy link
Collaborator

slouken commented Feb 10, 2025

@icculus, we should fix whatever bug we have when the SDL2 app sample count doesn't match the actual sample count, but then I think we should have sdl2-compat set SDL_HINT_AUDIO_DEVICE_SAMPLE_FRAMES to the application spec sample count when opening an audio device, so we get as close a match as possible.

@icculus
Copy link
Collaborator

icculus commented Feb 21, 2025

Ok, I think the correct fix is to set the hint, but I don't think there's any other bug to be fixed in our code.

From here, it looks like Moonlight is flying too close to the sun in their audio queueing and it happened to work out in SDL2:

https://github.com/moonlight-stream/moonlight-qt/blob/fabb4fdadc302493d02bb23485dbc15d7988834b/app/streaming/audio/renderers/sdlaud.cpp#L102-L140

Forcing the hint will probably give them enough buffer to not have underruns, and that's all we can do here.

@cgutman
Copy link
Contributor

cgutman commented Feb 21, 2025

Is there a way that we can see which audio buffer size that we picked in the PipeWire backend? The fact that a 15 ms buffer works for WASAPI, PulseAudio, and ALSA, but not PipeWire (and it did work for PipeWire in SDL2) makes me wonder if there is something we can do to improve SDL3's behavior here.

(As an aside, it would be wonderful to get an underrun notification from SDL so I can detect this is happening and adjust how much audio I'm buffering)

@mdmatthias
Copy link

Downgrading to 2.30.51 fixed it for me, higher version gives crackling audio, looks like it's something since 2.30.52

@icculus
Copy link
Collaborator

icculus commented Feb 21, 2025

So there are only 3 audio-related commits between .51 and .52, but one of them is a doozy.

  • This one is nothing (we had an if at the end of a while that checked the same thing as the loop condition: d2cf6b3
  • This one tells the app it got exactly what it asked for when opening the audio device in all cases: b2d2190 ... this maybe causes a problem if Moonlight asked for something specific (like, a very small playback buffer) but the device declined to give it that, so it ended up with more buffering than it expected.
  • The doozy: this one that makes the audio callback consistent, always asking for the same amount of audio: 9110cc1 ... in .52, this would affect Moonlight, but we have since removed this code for the SDL_QueueAudio() path, which Moonlight uses...so if reverting this fixes the problem, it doesn't actually fix the problem, y'know?

I'm guessing that it's worth testing Moonlight with the latest in revision control but b2d2190 reverted, which just requires commenting out one line, if @mdmatthias is able to try this.

My current guess is that Moonlight is asking for a too-small hardware buffer, and SDL2 is refusing to do it, instead giving it a larger one. On top of that, it's looping until there are almost no samples left in the audio queue, which works when the hardware is buffering more behind the scenes but is leading to tons of underruns when it isn't.

Another theory: SDL_GetQueuedAudioSize() reports bytes in the queue...in SDL2, this is converted bytes, but in SDL3 this is data that has not yet been converted...so I'm wondering if this code is blocking for 300 milliseconds every time because sdl2-compat is returning the SDL3 value at the moment.

@icculus
Copy link
Collaborator

icculus commented Feb 21, 2025

Also, @cgutman, are you the Moonlight maintainer? If so:

In that code I linked to, is there any value in waiting for the queue to be almost drained before queueing more? In that function, you already know you have X more bytes to queue, why not just queue it and let the system sip from the queue as necessary?

@cgutman
Copy link
Contributor

cgutman commented Feb 22, 2025

This one tells the app it got exactly what it asked for when opening the audio device in all cases: b2d2190 ... this maybe causes a problem if Moonlight asked for something specific (like, a very small playback buffer) but the device declined to give it that, so it ended up with more buffering than it expected.

Moonlight is already passing 0 for allowed_changes, so that commit should be a no-op as well.

My current guess is that Moonlight is asking for a too-small hardware buffer, and SDL2 is refusing to do it, instead giving it a larger one. On top of that, it's looping until there are almost no samples left in the audio queue, which works when the hardware is buffering more behind the scenes but is leading to tons of underruns when it isn't.

It's looping until there are 10 frames of audio (50 ms) or less of unconsumed audio data queued to the SDL audio stream, which is over 3x more than the buffer size we asked for in SDL_OpenAudioDevice(), so that should be plenty of buffer to avoid the audio device underrunning if the audio device has anywhere near a buffer sized like what we asked.

I'm not totally convinced that it's due to getting a pipewire buffer that's too small. I'm wondering if it's actually due to getting a buffer that's too large instead. For example, if the underlying PW audio buffer was 100 ms, then the code there would constantly flap between having >= 100 ms and having 0 ms queued and playing (at least) half a buffer that's padded with silence because the queuing logic will refuse to queue more than 50 ms at a time.

I think the fundamental assumption I made there is that the audio device will consume data from the stream in multiples of the sample value. If that turns out not to hold, then the queuing logic could certainly end up doing bad things.

Another theory: SDL_GetQueuedAudioSize() reports bytes in the queue...in SDL2, this is converted bytes, but in SDL3 this is data that has not yet been converted...

That's certainly an interesting theory (and at least a good bug to fix). Moonlight is calculating the duration of queued audio by using the sample size of the audio stream it opened, so that could definitely cause issues if it's returning an incorrect value there.

In that code I linked to, is there any value in waiting for the queue to be almost drained before queueing more? In that function, you already know you have X more bytes to queue, why not just queue it and let the system sip from the queue as necessary?

The reason for all that complicated queuing and waiting logic there is that I'm trying to constrain the amount of audio data that can be queued to the audio device in order to prevent latency from building up in the audio playback pipeline.

With an inconsistent network connection, audio packets from the host PC can be delayed (or dropped and later recovered via FEC packets) and subsequently arrive after they're supposed to have been played (this can also happen if the playback device is temporarily paused due to a default device switch or whatever). Some amount of late audio traffic needs to be accepted in order to smooth over network jitter without running dry, but we don't want to let it accumulate forever. The code there is attempting to keep some data in all queues to avoid underruns but not too much that it creates significant audio latency for the user.

@slouken
Copy link
Collaborator

slouken commented Feb 22, 2025

FWIW, Steam Link does something very similar, but is already using SDL3.

@cgutman
Copy link
Contributor

cgutman commented Feb 22, 2025

I think I understand the issue now but it's not clear what the best way to fix it is. I went back to 9110cc1 to investigate (since that commit does exhibit the issue for me, unlike the current code now).

I found that it's not that the audio buffer underruns, it's that SDL3 is taking whatever the app can provide immediately and then padding the rest of the buffer with silence. Without the app being able to influence the size of those audio buffers (like SDL2 could via SDL_AudioSpec.samples or SDL3 via SDL_HINT_AUDIO_DEVICE_SAMPLE_FRAMES), that's a recipe for disaster if the app is limited in how much audio it can provide at a given time. This also explains why real SDL2's PipeWire backend worked fine (the SDL2 PW device buffer was set based upon SDL_AudioSpec.samples, while SDL3 just always used 4096 unless the hint was set).

The problem with the code in 9110cc1 was that it would not fully populate the provided buffer up to approx_amount if the bytes_per_callbacks value was smaller. What SDL3 will do in that case is fill the remaining portion of the buffer with silence and play it. That was the cause for the "truncated" audio waveform reported in #311 - each submitted buffer to PipeWire ended with silence due to the buffer size mismatch.

This same phenomenon is reproducible with SDL3 natively too using some testcode based on the modified repro provided for #311

SDL3 test code
#include <SDL3/SDL.h>

#define FREQ 48000
#define CHANNELS 2
#define WAVE_BUFFER_LEN (FREQ * CHANNELS * sizeof(Sint16))

SDL_AudioSpec spec = { .freq = FREQ, .channels = CHANNELS, .format = SDL_AUDIO_S16LE };
Uint8 waveBuffer[WAVE_BUFFER_LEN];
Uint32 waveBufferPos = 0;

void generate_squarewave() {
    const int freq = 260;
    const int amplitude = 20000;
    int period_samples = spec.freq / freq;
    for (int i = 0; i < WAVE_BUFFER_LEN; i+=2) {
        Sint16 sample = (i/4) % period_samples < period_samples / 2 ? amplitude : -amplitude;
        waveBuffer[i] = sample & 0xFF;
        waveBuffer[i+1] = (sample >> 8) & 0xFF;
    }
  }

void SDLCALL squarewave_callback(void *userdata, SDL_AudioStream *stream, int additional_amount, int total_amount)
{
    if (additional_amount == 0) {
        return;
    }

    Uint32 data_ready = spec.freq * spec.channels * sizeof(Sint16) * 10 / 1000; // Only 10 ms

    SDL_LogInfo(SDL_LOG_CATEGORY_APPLICATION, "SDL wants %u bytes, providing %u bytes", additional_amount, data_ready);

    /* XXX: This is gross but it's just a test app... */
    Uint32 data_per_frame = CHANNELS * sizeof(Sint16);
    for (int i = 0; i < data_ready; i += data_per_frame) {
        SDL_PutAudioStreamData(stream, &waveBuffer[waveBufferPos], data_per_frame);
        waveBufferPos = (waveBufferPos + data_per_frame) % WAVE_BUFFER_LEN;
    }
}

int main()
{  
    if (!SDL_Init(SDL_INIT_AUDIO)) {
      return -1;
    }
  
    generate_squarewave();
  
    SDL_AudioStream *stream = SDL_OpenAudioDeviceStream(SDL_AUDIO_DEVICE_DEFAULT_PLAYBACK, &spec, squarewave_callback, NULL);
    if (!stream) {
        SDL_LogError(SDL_LOG_CATEGORY_APPLICATION,"Failed to open audio device: %s", SDL_GetError());
        return -1;
    }
    SDL_ResumeAudioStreamDevice(stream);
    
    while (1) {
        SDL_Event e;
        SDL_WaitEvent(&e);
        if (e.type == SDL_EVENT_QUIT) {
            SDL_Log("exit\n");
            break;
        }
    }

    SDL_Quit();
    return 0;
}

When running out of the box without SDL_HINT_AUDIO_DEVICE_SAMPLE_FRAMES, SDL3 will pick 4096 as the device buffer size. This results in log messages like SDL wants 4096 bytes, providing 1920 bytes and the square wave sounding corrupted due to all the silence being inserted. If you set SDL_HINT_AUDIO_DEVICE_SAMPLE_FRAMES to something <= 480, you'll get a nice sounding square wave (at least until you get to a value low enough that you start getting underruns).

I took a look at the audio drivers in SDL3 and some of them (PipeWire included) can theoretically handle partially filled buffers and do the right thing. However, a bunch of them also have assumptions that the buffers returned will always be completely populated. Some of them would need significant rework to support providing partially filled buffers (if they can support it at all).

Maybe for now the best thing to do is to have sdl2-compat set SDL_HINT_AUDIO_DEVICE_SAMPLE_FRAMES according to SDL_AudioSpec.samples if it would be lower than what SDL_GetDefaultSampleFramesFromFreq() in SDL3 would pick (or perhaps always do it?).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants