Skip to content
GitLab
Projects Groups Snippets
  • /
  • Help
    • Help
    • Support
    • Community forum
    • Submit feedback
    • Contribute to GitLab
  • Sign in / Register
  • F ffmpeg-python
  • Project information
    • Project information
    • Activity
    • Labels
    • Members
  • Repository
    • Repository
    • Files
    • Commits
    • Branches
    • Tags
    • Contributors
    • Graph
    • Compare
  • Issues 402
    • Issues 402
    • List
    • Boards
    • Service Desk
    • Milestones
  • Merge requests 34
    • Merge requests 34
  • CI/CD
    • CI/CD
    • Pipelines
    • Jobs
    • Schedules
  • Deployments
    • Deployments
    • Environments
    • Releases
  • Packages and registries
    • Packages and registries
    • Package Registry
    • Infrastructure Registry
  • Monitor
    • Monitor
    • Incidents
  • Analytics
    • Analytics
    • Value stream
    • CI/CD
    • Repository
  • Wiki
    • Wiki
  • Snippets
    • Snippets
  • Activity
  • Graph
  • Create a new issue
  • Jobs
  • Commits
  • Issue Boards
Collapse sidebar
  • Karl Kroening
  • ffmpeg-python
  • Issues
  • #750
Closed
Open
Issue created Mar 02, 2023 by Jon Jett@jon-jett

Mixdown all audio channels from video for OpenAI Whisper (0:a:0 ?)

I'm prepping audio for transcription in whisper, but having some trouble with multichannel source audio since ffmpeg defaults to including only the first listed channel unless specified when transcoding or resampling (right?)

Part of the issue is I want to downmix ALL the audio channels from video no matter if there are one or eight, or if some are considered stereo and others mono etc etc etc.

I modified the code to include a map=0:a:0 ** argument, but I think it's not changing anything:

allchannel, _ = (
    ffmpeg.input(file, threads=0)
    .output("-", format="s16le", acodec="pcm_s16le", ac=1, ar=sr, **{"map":"0:a:0"})
    .run(cmd=["ffmpeg", "-nostdin"], capture_stdout=True, capture_stderr=True)
)

From this stackoverflow article I read this solution but I don't know enough ffmpeg/python-ffmpeg to implement, or really what it's doing:

ffmpeg -y -vn -i stereo.mp4 -filter_complex "[0:a:0]channelsplit=channel_layout=mono:channels=FC[C0]" -map "[C0]" -acodec pcm_s16le -ac 1 -ar 8k first_channel.wav

Assignee
Assign to
Time tracking