Audio API suitable for multi-channel playback in Windows (C)

I am researching a project in which I need to play back a multi-track audio source (>30 mono channels) simultaneously. The audio on all channels needs to start at the same time and be sustained over hours of playback.
What is the best audio API to use for this? WDM and ASIO have come up in my searches. I will be using a MOTU PCI audio interface to get this many channels; the channels show up as normal audio channels on the host PC.

ASIO is definitely the way to go here. It will keep everything properly in sync, with low latency, and it is the de facto industry-standard way to do this. Any pro audio interface supports ASIO, and for interfaces that don't, there is a wrapper (ASIO4ALL) that is capable of syncing multiple devices.
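The ASIO SDK itself is C++, so one practical route from C is PortAudio built with its ASIO host API enabled - an assumption on my part, not something the answer above prescribes. A minimal sketch of a sample-locked 32-channel output stream (device selection and buffer size are placeholders):

    /* 32-channel playback sketch via PortAudio (build PortAudio with ASIO
     * support on Windows). All channels are rendered from one callback on
     * one device clock, so they start together and stay sample-locked.
     * Link against portaudio. */
    #include <portaudio.h>
    #include <stdio.h>

    #define NUM_CHANNELS 32
    #define SAMPLE_RATE  48000.0

    static int playCallback(const void *input, void *output,
                            unsigned long frames,
                            const PaStreamCallbackTimeInfo *timeInfo,
                            PaStreamCallbackFlags statusFlags, void *userData)
    {
        float *out = (float *)output;
        (void)input; (void)timeInfo; (void)statusFlags; (void)userData;
        for (unsigned long i = 0; i < frames; i++)
            for (int ch = 0; ch < NUM_CHANNELS; ch++)
                *out++ = 0.0f;            /* fill with samples for track `ch` */
        return paContinue;
    }

    int main(void)
    {
        PaStream *stream;
        PaStreamParameters p = {0};

        if (Pa_Initialize() != paNoError)
            return 1;
        p.device = Pa_GetDefaultOutputDevice(); /* pick the MOTU ASIO device */
        p.channelCount = NUM_CHANNELS;
        p.sampleFormat = paFloat32;
        p.suggestedLatency = Pa_GetDeviceInfo(p.device)->defaultLowOutputLatency;

        if (Pa_OpenStream(&stream, NULL, &p, SAMPLE_RATE, 256,
                          paClipOff, playCallback, NULL) != paNoError) {
            fprintf(stderr, "open failed\n");
            return 1;
        }
        Pa_StartStream(stream);
        Pa_Sleep(10 * 1000);   /* demo: run 10 s; a real app runs for hours */
        Pa_StopStream(stream);
        Pa_CloseStream(stream);
        Pa_Terminate();
        return 0;
    }

With a single multi-channel stream like this, "start simultaneously" comes for free: every channel is just an interleaved slot in the same buffer.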

Related

How to play sound in C from scratch (Linux)

How do you play sounds at different tones in C without using any external library? I know there are dozens of sound libraries in C that allow you to play sound, but what I want to know is how that works behind the scenes. How do you tell the computer to play a certain note at a certain tone/frequency?
I know it's possible on Windows using the sound() function, but I can't find any documentation on Linux; all I found is the beep() function (or write(1, "\a", 1)), which outputs the default terminal beep, but I can't figure out how to play different sounds.
The Linux kernel's native audio API is ALSA (Advanced Linux Sound Architecture).
Example of raw audio playback with ALSA:
https://gist.github.com/ghedo/963382/815c98d1ba0eda1b486eb9d80d9a91a81d995283
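For orientation, the core of such a program is small. A minimal sketch along the same lines (the "default" device name and the parameters are assumptions; error handling is mostly omitted):

    /* Minimal raw playback with ALSA: one second of S16 mono silence to the
     * default PCM device. Compile with: cc alsa_play.c -lasound */
    #include <alsa/asoundlib.h>

    int main(void)
    {
        static short buf[48000];          /* 1 s of silence at 48 kHz */
        snd_pcm_t *pcm;

        if (snd_pcm_open(&pcm, "default", SND_PCM_STREAM_PLAYBACK, 0) < 0)
            return 1;
        /* format, access, channels, rate, soft-resample, latency (us) */
        snd_pcm_set_params(pcm, SND_PCM_FORMAT_S16_LE,
                           SND_PCM_ACCESS_RW_INTERLEAVED, 1, 48000, 1, 500000);
        snd_pcm_writei(pcm, buf, 48000);  /* size is in frames, not bytes */
        snd_pcm_drain(pcm);
        snd_pcm_close(pcm);
        return 0;
    }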
However, ALSA is a low-level API that is not meant to be used directly by higher-level applications.
A modern system audio API for GNU/Linux would be either PulseAudio (the current default on Ubuntu), or the newer and arguably better PipeWire (the default on Fedora).
Example of raw audio playback with PipeWire that generates audio "from scratch":
https://docs.pipewire.org/page_tutorial4.html
How do you tell the computer to play a certain note at a certain tone/frequency?
Sound is a mechanical vibration that propagates through the air (or another medium). It can be represented digitally as a sequence of numerical values representing air pressure at a given sampling rate. To play a given tone/frequency, generate a sine wave of that frequency (at the playback sampling rate) and use the sound API of your choice to play it.
See the PipeWire tutorial above for an example generating a 440Hz tone.
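The generation step itself is just a loop. A sketch, assuming a 48 kHz rate and 16-bit samples (the names and constants are mine):

    #include <math.h>
    #include <stddef.h>
    #include <stdint.h>

    #define RATE 48000.0        /* playback sampling rate (assumed) */
    #define FREQ 440.0          /* tone frequency in Hz */

    /* Fill `buf` with `n` samples of a FREQ-Hz sine at ~50% amplitude.
     * `phase` persists across calls so consecutive buffers join smoothly. */
    void fill_sine(int16_t *buf, size_t n, double *phase)
    {
        double step = 2.0 * M_PI * FREQ / RATE;   /* radians per sample */
        for (size_t i = 0; i < n; i++) {
            buf[i] = (int16_t)(16000.0 * sin(*phase));
            *phase += step;
        }
    }

Hand the resulting buffer to whichever API you chose (snd_pcm_writei, pa_simple_write, a pw_stream, ...).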
About PulseAudio/PipeWire:
These libraries are typically part of the OS and exposed as system APIs (so they are not "external libraries" if that means something you would ship with your program or ask users to install), and they are what applications should use to play audio.
Behind the scenes, these libraries handle audio routing, mixing, echo cancellation, recording, and playback to the kernel through ALSA (or via Bluetooth, etc.) - everything that users and developers expect from the system audio layer.
Until recently, PulseAudio was the de-facto universal desktop system audio API, and many apps still use the PulseAudio API to play audio on GNU/Linux.
PipeWire includes a PulseAudio compatibility layer, so apps using the PulseAudio API will keep working for the foreseeable future.
Example of raw audio playback with PulseAudio:
https://freedesktop.org/software/pulseaudio/doxygen/pacat-simple_8c-example.html
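The shape of that example, reduced to a sketch with the PulseAudio "simple" API (the sample spec and stream names here are arbitrary; compile with -lpulse-simple -lpulse):

    /* Minimal playback with the PulseAudio simple API: one second of
     * stereo S16 silence to the default sink. */
    #include <pulse/simple.h>
    #include <stdint.h>

    int main(void)
    {
        static int16_t buf[44100 * 2];        /* 1 s of stereo silence */
        pa_sample_spec ss = {
            .format   = PA_SAMPLE_S16LE,
            .rate     = 44100,
            .channels = 2,
        };
        int error;

        pa_simple *s = pa_simple_new(NULL, "demo", PA_STREAM_PLAYBACK, NULL,
                                     "playback", &ss, NULL, NULL, &error);
        if (!s)
            return 1;
        pa_simple_write(s, buf, sizeof(buf), &error);  /* size in bytes */
        pa_simple_drain(s, &error);
        pa_simple_free(s);
        return 0;
    }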

VAD to switch from listen mode to speak mode

I am attempting to turn my four-wire apartment buzzer into a VoIP phone using a Raspberry Pi and a custom circuit. The problem is that two-way communication is not supported: I can either listen or speak. I want to use a standard SIP setup with Asterisk, but run VAD on the sound output of the Raspberry Pi in order to send a digital signal that switches the intercom to "speak mode" whenever there is a voice on the audio output. Is there any pre-existing C function or library that listens to the ALSA mixer and outputs a 1 for speech and a 0 for the absence of speech, with sufficiently low latency to be used in this walkie-talkie-like system?
Once again, I would prefer pre-existing libraries and, because this is live, low latency.
ALSA is a low-level interface that abstracts away the hardware driver; it has no notion of speech. What you will be able to do is get the audio data from ALSA in real time, but you will need to implement your own voice activity detection.
This question on Signal Processing SE has a few good suggestions for libraries and codec implementations to get you started.
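There is no single ALSA call that emits a speech/no-speech flag, but the capture-plus-threshold loop you would build a real detector on top of fits in a page. A naive energy-gate sketch (the 8 kHz rate, 20 ms frame, and threshold are arbitrary assumptions; a proper VAD such as the one in the WebRTC code base should replace the RMS test):

    /* Crude energy gate: capture from ALSA, print 1 while the short-term
     * RMS level is above a threshold, 0 otherwise.
     * Compile with: cc vad_gate.c -lasound -lm */
    #include <alsa/asoundlib.h>
    #include <math.h>
    #include <stdio.h>

    #define RATE   8000
    #define FRAME  160            /* 20 ms at 8 kHz */
    #define THRESH 500.0          /* RMS threshold; tune for your hardware */

    int main(void)
    {
        snd_pcm_t *pcm;
        short buf[FRAME];

        if (snd_pcm_open(&pcm, "default", SND_PCM_STREAM_CAPTURE, 0) < 0)
            return 1;
        snd_pcm_set_params(pcm, SND_PCM_FORMAT_S16_LE,
                           SND_PCM_ACCESS_RW_INTERLEAVED, 1, RATE, 1, 100000);

        for (;;) {
            if (snd_pcm_readi(pcm, buf, FRAME) != FRAME) {
                snd_pcm_prepare(pcm);     /* recover from an overrun */
                continue;
            }
            double sum = 0.0;
            for (int i = 0; i < FRAME; i++)
                sum += (double)buf[i] * buf[i];
            printf("%d\n", sqrt(sum / FRAME) > THRESH);  /* 1 = voice-ish */
        }
    }

The 20 ms frame keeps decision latency low; some smoothing (e.g. a few hang-over frames before dropping back to 0) would stop the gate from chattering mid-word.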

Audio conference between WPF Applications

I have 2 WPF applications that communicate using a couple of duplex WCF services. I need to enable audio communication also between them. I've been looking for a solution for a while now, but couldn't find a good one.
What I've tried is audio streaming using Microsoft Expression Encoder on the "server" side (the one that feeds the audio) and playing it on the "client" using VLC .NET. It works, at least for streaming a song, but it is a big resource eater. Initial buffering also takes a long time, and so does stopping the stream.
What other options do I have? I want a clear, lightweight audio conversation between the apps, kinda like Skype. Is this possible? Thanks
EDIT: I found NAudio and it looks like a good audio library; I managed to stream my microphone quite easily. However, I have a big problem - I can hear the voice clearly on the client, but it echoes indefinitely. Plus, there is an annoying background noise (could this be caused by the processor?), and after a while a very loud, high-pitched sound plays on the receiving end. All I can do then is stop the whole transmission. I have no idea what is causing these problems. I use the SpeexChatCodec as in the NetworkChat example provided (sampling rate: 8000, 2 channels). Any suggestions? Thanks
Writing a library that supports all of that from scratch would be a lot of work... if you can spend $150 on this, I would suggest purchasing a library like the iConf .NET Video Conferencing SDK from AvSpeed...

Difference in CPU & memory utilization while using VLC Mozilla plugin and VLC player for playback of RTSP streams

For one of our ongoing projects we were planning to use a multimedia framework like VLC or GStreamer to capture and play back / render H.264-encoded RTSP streams. To that end we have been observing the performance (CPU and memory utilization) of VLC using two demo applications we built. The first demo application uses the Mozilla VLC plugin, with which we embedded up to four H.264-encoded RTSP streams in a single HTML web page, while the second simply invokes the VLC player and plays a single H.264-encoded RTSP stream.
I was surprised to observe the following results (tests were conducted on Ubuntu 11.04):
Demo 1 (Mozilla VLC plugin, 4 parallel streams)
CPU utilization: 16%
Memory utilization: ~61 MB
Demo 2 (VLC player, 1 stream)
CPU utilization: 16%
Memory utilization: ~17 MB
My question is: why is the CPU utilization of the Mozilla VLC plugin no higher than that of the standalone player, even though it is decoding four times as many video streams?
Reply awaited.
Regards,
Saurabh Gandhi
I'm also using the VLC Mozilla plugin for my project, and I have problems with H.264 streams. The only way to handle such a stream was to use --ffmpeg-hw (for VA-API), which, because of Xlib, works only in the standalone VLC app (there is a --no-xlib flag in vlcplugin_base.cpp). So I removed that flag and added XInitThreads(), and it works now, BUT far from the performance level you reported; besides, the no-xlib flag was there for a reason (removing it might lead to unwanted behavior).
So the main question is HOW you arrived at those results, and whether you could share your configuration flags with me and the rest of us.
The system I'm using has a 4-core CPU and NVIDIA ION graphics. The CPU cores stay at a moderate load, but a fullscreen stream doesn't play smoothly. If the same streams are run in cvlc, they play perfectly. The ffmpeg-hw flag is used in both cases without any warning messages (VA-API initializes successfully).
If you have hardware acceleration of some sort, then the CPU only takes care of routing the data.

Pulling multiple live video streams into WPF

I'd like to create an app that pulls multiple live video feeds, supplied via coax, HDMI, or some other standard, into WPF for manipulation (i.e. applying a few transforms or pixel shaders), with the result output to a monitor. What should I look at to get started with this app - is there any hardware that would make things easier?
If you are pulling in a standard broadcast via coax or over the air, a $100 ATSC HD TV tuner will do. I don't have any experience with HD capture cards (I think they run for about $1000) or, more specifically, with cards that take in a raw HD stream.
When you install a capture device (TV tuner, webcam, capture card) in Windows, a DirectShow source filter wrapper is created for it. The kind of hardware you are targeting determines how you build the DirectShow graph. I have no reason to expect HD capture cards to behave differently from any other capture card or webcam (TV tuners are slightly different).
You can use my WPF MediaKit as a base. The webcam control may work out of the box or require only slight changes for an HD capture card. A TV tuner would require a lot more than this.
