Convert SDL2 audio into ffmpeg AVFrame (recording from webcam + muxing)


I’m trying to record audio and video from a webcam (currently a Logitech C920, but any webcam with a microphone would do), and I’ve got several questions.

First, I’m using a sort of circular (ring) buffer, together with callbacks, to:

  • record the audio stream from the webcam
  • play it back on the default output device
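
To make the idea concrete, here is a minimal sketch of such a buffer. It is not the code from the repository: the class name `RingBuffer` and its `write`/`read` methods are hypothetical, and it assumes exactly one producer (the capture callback) and one consumer (the playback callback).

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical single-producer / single-consumer byte ring buffer.
// One slot is kept unused so that head == tail always means "empty".
class RingBuffer {
public:
    explicit RingBuffer(std::size_t capacity)
        : buf_(capacity + 1), head_(0), tail_(0) {
        assert(capacity > 0);
    }

    // Producer side (capture callback): stores up to len bytes,
    // returns how many were actually stored.
    std::size_t write(const uint8_t* src, std::size_t len) {
        const std::size_t size = buf_.size();
        const std::size_t head = head_.load(std::memory_order_relaxed);
        const std::size_t tail = tail_.load(std::memory_order_acquire);
        const std::size_t free_space = (tail + size - head - 1) % size;
        const std::size_t n = len < free_space ? len : free_space;
        for (std::size_t i = 0; i < n; ++i)
            buf_[(head + i) % size] = src[i];
        head_.store((head + n) % size, std::memory_order_release);
        return n;
    }

    // Consumer side (playback callback): copies up to len bytes,
    // returns how many were actually copied.
    std::size_t read(uint8_t* dst, std::size_t len) {
        const std::size_t size = buf_.size();
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        const std::size_t head = head_.load(std::memory_order_acquire);
        const std::size_t available = (head + size - tail) % size;
        const std::size_t n = len < available ? len : available;
        for (std::size_t i = 0; i < n; ++i)
            dst[i] = buf_[(tail + i) % size];
        tail_.store((tail + n) % size, std::memory_order_release);
        return n;
    }

private:
    std::vector<uint8_t> buf_;
    std::atomic<std::size_t> head_;  // next position to write
    std::atomic<std::size_t> tail_;  // next position to read
};
```

When `read` returns fewer bytes than requested, a playback callback would typically fill the rest of its buffer with silence.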

I have tested it a lot (on Linux only, but it should be easily portable to Windows) and it seems to work very well, but maybe something is plain wrong, and I need advice.

The code is here: Sources/step4 · master · Eric Bachard / AudioRecord · GitLab

FYI: I added some links in the code, to explain what I did and who inspired me.

Second question:

I have integrated this code inside a muxer, based on muxing.c (available in ffmpeg doc) and written by Fabrice Bellard.

The idea is to create an mp4 video including both images and sound (and to complete miniDart, recording any selected audio + video sources, but that’s the next step).

More precisely:

  • the input is the webcam (Logitech C920), audio + video;
  • the output is an mp4 container with aac for audio and h264 for video;
  • to stop recording, just hit the ESC key.

Currently, the video recording works fine, and the last problem to solve is the audio, which does not work.

If I’m not mistaken, the sound coming from the microphone is in AUDIO_S16LSB format (SDL side). That part works very well: while talking into the microphone, with playback running, I can even record the sound with Audacity. SDL does a great job here.

In fact, the muxer (chosen for portability reasons) needs a libav AVFrame to create the final video file, and my problem is that I don’t know what to do to convert the SDL audio into audio AVFrames.

The current state of my work is: the video is fine, but the audio (perfectly synchronized!) is not correct. It is mostly noise, but something can be heard underneath, enough to verify that video and audio are synchronized.

The code is available here: Sources/step6 · master · Eric Bachard / AudioRecord · GitLab (see muxer.cpp). There is a script to compile it, and the list of dependencies is provided in the script (easy to test under Linux).

If this helps: looking at the log (see below), the output codec is AAC and it expects the fltp audio format. Of course, I searched a lot on the web, but found nothing usable to solve my problem. What do I have to do with such data? Can I expect another SDL_AUDIO_FORMAT, closer to 16LSB (e.g.), or …?
How many steps are missing? Am I close or …?
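
For reference, the conversion being asked about can be sketched in plain C++, without any SDL or libav dependency. AUDIO_S16LSB delivers interleaved signed 16-bit samples (L R L R ...), while fltp (sample format 8 in the log, i.e. AV_SAMPLE_FMT_FLTP) is planar 32-bit float, with one separate buffer per channel. The function name `s16_interleaved_to_planar_float` is hypothetical, and the sketch assumes the host is little-endian so that the SDL bytes can be read directly as `int16_t`. Whether a missing conversion of this kind is what causes the noise is only an assumption; it depends on what muxer.cpp does with the data.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Converts interleaved signed 16-bit samples (L R L R ...) into one float
// plane per channel (LLLL..., RRRR...), scaled to the range [-1.0, 1.0).
// `frames` is the number of samples per channel (nb_samples = 1024 in the log).
std::vector<std::vector<float>> s16_interleaved_to_planar_float(
    const int16_t* in, std::size_t frames, int channels)
{
    assert(in != nullptr && channels > 0);
    std::vector<std::vector<float>> planes(
        static_cast<std::size_t>(channels), std::vector<float>(frames));
    for (std::size_t i = 0; i < frames; ++i) {
        for (int ch = 0; ch < channels; ++ch) {
            // 32768 is the magnitude of the most negative int16_t value.
            planes[ch][i] = in[i * channels + ch] / 32768.0f;
        }
    }
    return planes;
}
```

On the libav side, each returned plane would then be copied into `frame->data[ch]` of an audio AVFrame. libswresample’s `swr_convert()` performs the same kind of conversion (and resampling, if needed) without a hand-written loop.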

Thanks in advance for any suggestion, advice or help :slight_smile:
Eric Bachard

FYI, the log says (on Linux):

FPS : 24
Adresse de cb_out = 0x556495e4beb8
Adresse de cb_in = 0x556495e4bd13
have_out.freq = 48000
have_out.samples = 1024
Found encoder : ‘h264’
Found encoder : ‘aac’
(*codec)->sample_fmts[0] contains : = 8
(*codec)->supported_samplerates[0] = 96000
(*codec)->supported_samplerates[1] = 88200
(*codec)->supported_samplerates[2] = 64000
(*codec)->supported_samplerates[3] = 48000
(*codec)->supported_samplerates[4] = 44100
(*codec)->supported_samplerates[5] = 32000
(*codec)->supported_samplerates[6] = 24000
We got : c->sample_rate = 48000
ost->st->time_base = 1/48000
Found AV sample_fmt of type : 8, means : fltp
this sample format has 4 bytes per sample ,
and its buffer size is equal to 8192
[libx264 @ 0x556496523b40] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2 AVX512
[libx264 @ 0x556496523b40] profile High, level 3.1, 4:2:0, 8-bit
[libx264 @ 0x556496523b40] 264 - core 164 r3098 7628a56 - H.264/MPEG-4 AVC codec - Copyleft 2003-2022 - x264, the best H.264/AVC encoder - VideoLAN - options: cabac=1 ref=3 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=7 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2 threads=12 lookahead_threads=2 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1 weightb=1 open_gop=0 weightp=2 keyint=12 keyint_min=1 scenecut=40 intra_refresh=0 rc_lookahead=12 rc=abr mbtree=1 bitrate=3000 ratetol=1.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00
nb_samples = 1024
Output #0, mp4, to ‘outfile.mp4’:
Stream #0:0: Video: h264, yuv420p, 1280x720, q=2-31, 3000 kb/s, 24 tbn
Stream #0:1: Audio: aac (LC), 48000 Hz, stereo, fltp, 192 kb/s