Convert a stereo wav to mono in C

I have developed the Synchronous Audio Interface (SAI) driver for a proprietary Real-Time Operating System (RTOS) in C. My driver is configured to output left and right channel data (I2S) to the amplifier. But since the attached amplifier is mono, it only outputs the left or the right channel audio data to the speaker. Now, I have a stereo PCM 16-bit audio data file, and I want to mix the left and right channel audio data in my application and send the result to either the left or the right channel in the SAI driver. That way I will be able to play the combined stereo audio data as mono on the speaker attached to the mono amplifier.
Can anyone suggest the best possible way to do this?

As said in a comment, the usual way to mix two stereo channels into a single mono channel is to divide each channel's sample by 2 and add them.
An example in C:
int left_channel_sample, right_channel_sample;
int mono_channel = (left_channel_sample / 2) + (right_channel_sample / 2);
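Applied over a whole interleaved 16-bit PCM buffer, a minimal sketch might look like the following (the buffer layout and function name are illustrative, not taken from your driver):
#include <stdint.h>
#include <stddef.h>

/* Mix interleaved 16-bit stereo PCM (L, R, L, R, ...) down to mono.
   Halving each sample before adding keeps the sum inside int16_t range. */
void stereo_to_mono(const int16_t *stereo, int16_t *mono, size_t num_frames)
{
    size_t i;
    for (i = 0; i < num_frames; ++i) {
        int16_t left  = stereo[2 * i];
        int16_t right = stereo[2 * i + 1];
        mono[i] = (int16_t)((left / 2) + (right / 2));
    }
}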
You mentioned the driver you coded; modify it or add this as a new feature there. It's hard to help much more without additional detail in your question.

Related

Playing 15 audio tracks at once with <50ms latency?

To summarise, my question is: is it possible to decode and play 15 lossily-compressed audio tracks on-the-fly at the same time with under 50ms latency and with no stuttering?
Background
I'm writing a sound library in plain C for a game I'm creating. I'm hoping to have up to 15 audio tracks playing at once with less than 50ms latency.
As of now, the library is able to play raw PCM files (48000Hz packed 16-bit samples), and can easily play 15 sounds at once at 45ms latency without stuttering and with minimal CPU usage. This is on my relatively old Intel Q9300 + SSD machine.
Since raw audio files are huge though, I augmented my library to support playing back OPUS files using opusfile (https://mf4.xiph.org/jenkins/view/opus/job/opusfile-unix/ws/doc/html/index.html). I was hoping that I'd still be able to play 15 sounds at once without the audio files taking up 200MB+. How wrong I was - I was only able to play 3 or 4 OPUS tracks at once before I could hear stuttering and other buffer underrun symptoms. CPU usage was also massively increased compared to raw PCM playback.
I also tried including VORBIS support using vorbisfile (http://www.xiph.org/vorbis/doc/vorbisfile/). I thought maybe decoding VORBIS on-the-fly wouldn't be as CPU intensive. VORBIS is a little better than OPUS - I can play 5 or 6 sounds at once before stuttering becomes audible (I guess VORBIS is indeed easier to decode) - but this is still nowhere near as good as playing back raw PCM files.
Before I delve into the low-level libvorbis/libopus APIs and investigate other audio compression formats, is it actually feasible to decode and play 15 lossily-compressed audio tracks on-the-fly at the same time with under 50ms latency and with no stuttering on a medium-to-low end desktop computer?
If it helps, my sound library currently calls a function approximately every 15ms which basically does the following (error-handling and post-processing omitted for clarity):
void onBufferUpdateNeeded(int numSounds, struct Sound *sounds,
        uint16_t *bufferToUpdate, int numSamplesNeeded, uint16_t *tmpBuffer) {
    int i, j;
    memset(bufferToUpdate, 0, numSamplesNeeded * sizeof(uint16_t));
    for (i = 0; i < numSounds; ++i) {
        /* Seek to the specified sample number in the already-opened
           file handle. The implementation of this depends on the file
           type (vorbis, opus, raw PCM). */
        seekToSample(sounds[i].fileHandle, sounds[i].currentSample);
        /* Read numSamplesNeeded samples from the file handle into
           tmpBuffer. */
        readSamples(tmpBuffer, sounds[i].fileHandle, numSamplesNeeded);
        /* Add the samples into the buffer. */
        for (j = 0; j < numSamplesNeeded; ++j) {
            bufferToUpdate[j] += tmpBuffer[j];
        }
    }
}
Thanks in advance for any help!
It sounds like you already know the answer to your own question: NO. Normally, the only advice I would have to questions like these (especially performance-related queries) is to try it and find out if it's possible. But you have already collected that data.
It's true that perceptual/lossy audio codecs tend to be computationally intensive to decode. It sounds like you want to avoid the storage overhead of raw PCM. In that case, if you can safely assume you'll have enough memory reserved for your application, you can decode the audio streams in advance, or employ some caching mechanism to deal with memory constraints. Perhaps this can be offloaded to a different thread (since the Q9300 CPU mentioned in your question is dual core).
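As a rough sketch of that pre-decoding idea (assuming opusfile; the function name and error handling are illustrative and abbreviated), the whole file could be decoded once at load time and then mixed as raw PCM:
#include <opus/opusfile.h>
#include <stdlib.h>

/* Decode an entire Opus file into a 16-bit PCM buffer up front so that
   playback only has to copy and mix raw samples. Returns NULL on failure. */
opus_int16 *preloadOpus(const char *path, ogg_int64_t *outNumFrames, int *outChannels) {
    int err;
    OggOpusFile *of = op_open_file(path, &err);
    if (of == NULL) return NULL;

    /* opusfile always decodes at 48 kHz; totals are in samples per channel. */
    ogg_int64_t total = op_pcm_total(of, -1);
    int channels = op_channel_count(of, -1);
    opus_int16 *pcm = malloc((size_t)total * channels * sizeof(opus_int16));
    if (pcm == NULL) { op_free(of); return NULL; }

    ogg_int64_t filled = 0;  /* in samples per channel */
    while (filled < total) {
        int got = op_read(of, pcm + filled * channels,
                          (int)((total - filled) * channels), NULL);
        if (got <= 0) break;  /* end of stream or error */
        filled += got;        /* op_read returns samples per channel */
    }
    op_free(of);

    *outNumFrames = filled;
    *outChannels = channels;
    return pcm;
}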
Otherwise, you will need to seek out a compressor that has lower computational requirements. You might be interested in FLAC, sponsored by the same organization as Vorbis and Opus. It's lossless, so it won't compress quite as well as the lossy algorithms, but it should be much, much faster to decode.
And if that's still not suitable, browse around on this big list of ~150 audio codecs until you find one that meets your standards. Since you control the client software, you have a lot of choices (vs, e.g., streaming to a web browser).

Capturing sound by ALSA

I am trying to capture the sound from the sound card with ALSA on a Linux system. It reads the data into a vector in PCM format. I need to find the right way to capture the audio, save it to a file, and play it back to check whether the received data is correct.
To capture audio to a file with ALSA, you can use arecord. By using this you can simply capture input audio to a file. Or you can write your own application which reads PCM data; you can use the snd_pcm_readi API for this purpose.
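A minimal capture sketch using snd_pcm_readi might look like this (the device name, format, rate, and loop length are illustrative assumptions; error handling is abbreviated):
#include <alsa/asoundlib.h>
#include <stdint.h>
#include <stdio.h>

int main(void) {
    snd_pcm_t *capture;
    int16_t buffer[2 * 1024];            /* 1024 frames of interleaved stereo */
    FILE *out = fopen("capture.raw", "wb");
    int i;

    if (snd_pcm_open(&capture, "default", SND_PCM_STREAM_CAPTURE, 0) < 0)
        return 1;
    /* 16-bit little-endian, interleaved, stereo, 44.1 kHz, ~0.5 s latency. */
    if (snd_pcm_set_params(capture, SND_PCM_FORMAT_S16_LE,
                           SND_PCM_ACCESS_RW_INTERLEAVED,
                           2, 44100, 1, 500000) < 0)
        return 1;

    for (i = 0; i < 100; ++i) {          /* roughly 2.3 seconds of audio */
        snd_pcm_sframes_t frames = snd_pcm_readi(capture, buffer, 1024);
        if (frames < 0)
            frames = snd_pcm_recover(capture, (int)frames, 0);
        if (frames > 0)
            fwrite(buffer, sizeof(int16_t) * 2, (size_t)frames, out);
    }

    snd_pcm_close(capture);
    fclose(out);
    /* The raw file can then be checked with: aplay -f S16_LE -c 2 -r 44100 capture.raw */
    return 0;
}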

How can I have code running while using sampled audio playback in GBDK c?

I am making a game for the GameBoy in GBDK, and I'm trying to add sounds to the game. GBDK has a function that plays sounds from an array of values; the only problem is that while it's playing the sound, the rest of the program freezes. Is there a way I can get them to run at the same time?
There is no way to have code running while using sampled audio playback. This is due to the fact that it actually uses the full CPU to perform this playback. If you want to use regular sound effects, you'll either need to pause the game while they play, or use a different method. I'll try to summarize using the other playback method below, but it is kind of complicated and I'm no expert.
Using "normal" sound effects
This is kind of WIP - I'm not too experienced with it but it should let you get started.
To use sound effects, you need to write to GameBoy audio registers. These are declared in GBDK's hardware.h, which is automatically included when you include gb\gb.h. But (of course) the registers don't have any documentation. This information is found on the GB Cribsheet. There's also this sound documentation file (unfortunately it behaves weirdly with Windows encodings - open it with something other than Notepad), along with some other information found in the Devrs.com sound documentation.
Working off of GBSOUND.TXT:
The addresses through which the sound channels can be accessed are:
$Addresses: (Description), (Register shorthand)
$FF10 -- $FF14: Channel 1, Referred to as NR10-NR14
$FF15 is unused, was probably going to be a sweep reg for channel 2 originally
$FF16 -- $FF19: Channel 2, Referred to as NR21-NR24
$FF1A -- $FF1E: Channel 3, Referred to as NR30-NR34
$FF1F is unused, was probably going to be a sweep reg for channel 4 originally
$FF20 -- $FF23: Channel 4, Referred to as NR41-NR44
$FF24 controls the Vin status and volume, Referred to as NR50
$FF25 selects which output each channel goes to, Referred to as NR51
$FF26 is the status register, and also controls the sound circuit's power. Referred to as NR52
$FF27 -- $FF2F are unused.
$FF30 -- $FF3F is the load register space for the 4-bit samples for channel 3
In GBDK, the registers are named NR10_REG, NR11_REG, NR12_REG, etc.
Also, try looking at the example program sound.c, which unfortunately doesn't compile for me. I might edit this to include more info.
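As a hedged starting point (the register values below are illustrative, not tuned for any particular effect), a short blip on channel 1 might look like this:
#include <gb/gb.h>   /* pulls in hardware.h and the NRxx_REG names */

void play_blip(void)
{
    NR52_REG = 0x80;  /* switch the sound circuit on */
    NR50_REG = 0x77;  /* master volume at maximum on both output terminals */
    NR51_REG = 0x11;  /* route channel 1 to both output terminals */

    NR10_REG = 0x00;  /* no frequency sweep */
    NR11_REG = 0x80;  /* 50% duty cycle */
    NR12_REG = 0xF3;  /* start loud, with a decaying volume envelope */
    NR13_REG = 0x73;  /* frequency, low byte */
    NR14_REG = 0x86;  /* trigger the channel + frequency high bits */
}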
To answer #franklin's question:
Which begs the question, how does a gameboy play both the game and sound at the same time?
They usually don't do that with sample playback. For instance, if you look at Pokémon Yellow, Pikachu's cry is done with sample playback. But while that is playing, nothing else is done. On the other hand, things like normal background music are done using the other audio hardware (sorry, not very detailed wiki link). Similarly, while move sound effects are done with the noise channel (used for the sample playback as well), they aren't actually sampled audio. As such, the game can continue running.

Left and Right audio channels are exchanging

I am trying to write an application for capturing stereo audio. My audio input has two channels (stereo). I am writing this audio data into a wav file. Sometimes these audio channels are exchanged, i.e. left becomes right and right becomes left. This happens only if I open and close the device file, or turn the device off and on again, and it happens randomly. I don't want the channels to be exchanged. Please suggest a fix.
Stereo PCM stored in a wav file is interleaved in LR format: an 'L' (left channel) sample followed by an 'R' (right channel) sample, repeated. I guess you have a bug in retrieving or storing the PCM. Maybe sometimes you start at the right (correct) position in the buffer and sometimes you start at the second sample. It's hard to tell without additional info.
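To illustrate the layout (the function and buffer names are just for illustration): with interleaved stereo, frame n is samples[2*n] (left) and samples[2*n + 1] (right), so starting a read or write one sample off swaps left and right for the whole recording.
#include <stdint.h>
#include <stddef.h>

/* Split interleaved 16-bit stereo (L, R, L, R, ...) into separate channels.
   If the source pointer were off by one sample, left and right would swap. */
void deinterleave(const int16_t *interleaved, int16_t *left, int16_t *right,
                  size_t num_frames)
{
    size_t n;
    for (n = 0; n < num_frames; ++n) {
        left[n]  = interleaved[2 * n];
        right[n] = interleaved[2 * n + 1];
    }
}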

How to multiplex Vorbis and Theora streams using libogg

I am currently writing a simple Theora video encoder, which uses libogg, libvorbis and libtheora. Currently, I can submit frames to the Theora encoder, and PCM samples to the Vorbis encoder, pass the resulting packets to Ogg streams (one for Theora and one for Vorbis) and get pages out.
When the program starts, it flushes the headers first from the Theora encoder, then from the Vorbis encoder to the output file (obviously, both streams have unique serial numbers). Then, I write interleaved pages to the file from both of the streams.
When writing just the video, or just the audio, I am able to play back the output in mplayer just fine, however when I attempt to write both, I get the following:
Ogg demuxer error : we met an unknown stream
I'm guessing I'm doing the multiplexing wrong. I have read through the documentation for multiplexing streams on Xiph.org, and I can't see where I differ. I cannot seem to find any example code for doing this, short of going through the source of an open-source encoder (which I'm having some trouble understanding). Would anyone be able to explain how to multiplex streams correctly using libogg? I'm trying to do this in C on Ubuntu 10.04, using the libraries from the Ubuntu repository.
Many thanks in advance!
Tom
Ok, for anyone who was reading this, I have to some extent solved it.
You should not flush all of the header packets from each stream - just the first (setup) packet, which for Vorbis and Theora gets its own page by default. Put the other header packets into their respective streams, but do not flush until the setup pages from all streams have been written to the file.
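A minimal sketch of that ordering with libogg (assuming the three header packets per codec have already been produced by the encoders; the function and parameter names are illustrative):
#include <ogg/ogg.h>
#include <stdio.h>

static void write_page(FILE *f, const ogg_page *og) {
    fwrite(og->header, 1, og->header_len, f);
    fwrite(og->body, 1, og->body_len, f);
}

/* hdr[0] is each codec's setup packet; hdr[1] and hdr[2] are the remaining headers. */
void write_headers(FILE *f,
                   ogg_stream_state *theora_os, ogg_packet theora_hdr[3],
                   ogg_stream_state *vorbis_os, ogg_packet vorbis_hdr[3]) {
    ogg_page og;
    int i;

    /* Setup packet of each stream goes out first, each on its own page. */
    ogg_stream_packetin(theora_os, &theora_hdr[0]);
    while (ogg_stream_flush(theora_os, &og) > 0) write_page(f, &og);
    ogg_stream_packetin(vorbis_os, &vorbis_hdr[0]);
    while (ogg_stream_flush(vorbis_os, &og) > 0) write_page(f, &og);

    /* Only after every stream's setup page is written do the remaining
       header packets get flushed. */
    for (i = 1; i < 3; ++i) ogg_stream_packetin(theora_os, &theora_hdr[i]);
    while (ogg_stream_flush(theora_os, &og) > 0) write_page(f, &og);
    for (i = 1; i < 3; ++i) ogg_stream_packetin(vorbis_os, &vorbis_hdr[i]);
    while (ogg_stream_flush(vorbis_os, &og) > 0) write_page(f, &og);
}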
Once you have done this, try to keep the streams as closely sync'd as possible (mplayer gave me errors when they drifted too far apart). At 24fps video and 44.1 kHz audio, one video frame should span 1837.5 audio samples (with 16-bit stereo PCM, that is 7,350 bytes).
If anyone else has any tips / info, it would be good to hear - I've never done anything with audio / video before!
Thanks!
Tom
