Segmentation fault in samplerate conversion function - c

playmp3() using libmpg123
if (isPaused==0 && mpg123_read(mh, buffer, buffer_size, &done) == MPG123_OK)
{
char * resBuffer=&buffer[0]; //22100=0,5s
buffer = resample(resBuffer,22100,22100);
if((ao_play(dev, (char*)buffer, done)==0)){
return 1;
}
resample(), using libavcodec from FFmpeg
#define LENGTH_MS 500 // how many milliseconds of speech to store
#define RATE 44100 // the sampling rate (input)
#define FORMAT PA_SAMPLE_S16NE // sample size: 8 or 16 bits
#define CHANNELS 2 // 1 = mono 2 = stereo
struct AVResampleContext* audio_cntx = 0;
char * resample(char in_buffer[(LENGTH_MS*RATE*16*CHANNELS)/8000],int out_rate,int nsamples)
{
char out_buffer[ sizeof( in_buffer ) * 4];
audio_cntx = av_resample_init( out_rate, //out rate
RATE, //in rate
16, //filter length
10, //phase count
0, //linear FIR filter
1.0 ); //cutoff frequency
assert( audio_cntx && "Failed to create resampling context!");
int samples_consumed;
int samples_output = av_resample( audio_cntx, //resample context
(short*)out_buffer, //buffout
(short*)in_buffer, //buffin
&samples_consumed, //&consumed
nsamples, //nb_samples
sizeof(out_buffer)/2,//lenout
0);//is_last
assert( samples_output > 0 && "Error calling av_resample()!" );
av_resample_close( audio_cntx );
//*resample = malloc(sizeof(out_buffer));
return &out_buffer[0];
}
When I run this code I get "3393 Segmentation fault (core dumped)". Why?
For example, is my use of pointers correct?
And is 22100 the number of samples contained in 0.5 seconds of the song?

You have two issues that I can see right off the bat. These are classic beginner mistakes, but everyone makes them at least once, so don't worry!
Check that sizeof( in_buffer ) is giving you the size you expect ((LENGTH_MS*RATE*16*CHANNELS)/8000) rather than the size of a pointer (which would be 2, 4, or 8 bytes depending on your system). sizeof on a local array gives you the array's total size, because there is no pointer, only a buffer. sizeof on an array in a parameter list gives you the size of a pointer, even if you write [] in the parameter list, because the parameter really is just a pointer.
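A minimal sketch of this decay rule (the size 64 here is arbitrary, not taken from the code above):

```c
#include <stddef.h>

/* In a parameter list, "char buf[64]" is really "char *buf":
   the array decays to a pointer, and sizeof sees the pointer. */
static size_t size_of_array_parameter(char buf[64])
{
    return sizeof(buf); /* sizeof(char *): typically 4 or 8 */
}

/* A true local array keeps its size: sizeof gives the whole buffer. */
static size_t size_of_local_array(void)
{
    char local[64];
    (void)local;
    return sizeof(local); /* 64 */
}
```

So in resample(), sizeof( in_buffer ) is a pointer size, not the buffer length the macro expression suggests.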
Also, returning a stack-based buffer is undefined behavior (i.e. it will crash), since the stack gets reused by the next function call:
return &out_buffer[0];
Don't do that. Pass in an output buffer already allocated by the caller.
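A sketch of that caller-owned-buffer pattern. The toy 2:1 decimation stands in for the real av_resample() call, and resample_into with its parameters are illustrative names, not part of the original code:

```c
#include <stddef.h>

/* The caller owns the output buffer, so nothing stack-local escapes.
   Returns the number of samples actually written. */
size_t resample_into(const short *in, size_t n_in, short *out, size_t out_cap)
{
    size_t n_out = 0;
    /* Toy 2:1 decimation: keep every other input sample. */
    for (size_t i = 0; i + 1 < n_in && n_out < out_cap; i += 2)
        out[n_out++] = in[i];
    return n_out;
}
```

The caller allocates `out` once (malloc or a static buffer), passes it in alongside its capacity, and reuses it across calls, so there is no dangling pointer and no per-call allocation.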

Related

SDL_OpenAudioDevice: Continuous play from real time processed source buffer

I'm porting an emulator to SDL. A method, called once per frame, passes me a buffer of new audio samples for the next frame.
I opened a device with SDL_OpenAudioDevice, and at each frame the SDL callback plays samples from that audio buffer.
It works, but the sound is not perfect: occasional ticks, some metallic noise, and so on.
The sound is 16-bit signed.
EDIT: Ok, I found a solution!
With the code of the opening post I was playing the next frame's samples during the current frame, in real time. That was wrong!
So I implemented a circular buffer into which I put the samples for the next frame, which the underlying code hands me at each (current) frame.
The buffer has two pointers, one for the read point and one for the write point. SDL invokes the callback when its audio stream has no more data to play; in the callback I play samples starting at the read point of the circular buffer, then advance the read pointer.
When the underlying code gives me the audio samples for the next frame, I write them into the circular buffer at the write point, then advance the write pointer.
The read and write pointers stay shifted apart by the number of samples played per frame.
The code below is updated; it needs some adjustment when samplesPerFrame is not an integer, but it works ;-)
Circular buffer structure:
typedef struct circularBufferStruct
{
short *buffer;
int cells;
short *readPoint;
short *writePoint;
} circularBuffer;
This method is called at initialization:
int initialize_audio(int stereo)
{
if (stereo)
channel = 2;
else
channel = 1;
// Check if sound is disabled
if (sampleRate != 0)
{
// Initialize SDL Audio
if (SDL_InitSubSystem(SDL_INIT_AUDIO) < 0)
{
SDL_Log("SDL fails to initialize audio subsystem!\n%s", SDL_GetError());
return 1;
}
// Number of samples per frame
samplesPerFrame = (double)sampleRate / (double)framesPerSecond * channel;
audioSamplesSize = samplesPerFrame * bytesPerSample; // Bytes
audioBufferSize = audioSamplesSize * 10; // Bytes
// Set and clear circular buffer
audioBuffer.buffer = malloc(audioBufferSize); // Bytes, must be a multiple of audioSamplesSize
memset(audioBuffer.buffer, 0, audioBufferSize);
audioBuffer.cells = (audioBufferSize) / sizeof(short); // Cells, not Bytes!
audioBuffer.readPoint = audioBuffer.buffer;
audioBuffer.writePoint = audioBuffer.readPoint + (short)samplesPerFrame;
}
else
samplesPerFrame = 0;
// First frame
return samplesPerFrame;
}
This is the SDL callback set in want.callback:
void audioCallback(void *userdata, uint8_t *stream, int len)
{
SDL_memset(stream, 0, len);
if (audioSamplesSize == 0)
return;
if (len > audioSamplesSize)
{
len = audioSamplesSize;
}
SDL_MixAudioFormat(stream, (const Uint8 *)audioBuffer.readPoint, AUDIO_S16SYS, len, SDL_MIX_MAXVOLUME);
audioBuffer.readPoint += (short)samplesPerFrame;
if (audioBuffer.readPoint >= audioBuffer.buffer + audioBuffer.cells)
audioBuffer.readPoint = audioBuffer.readPoint - audioBuffer.cells;
}
This method is called at each frame (after the first pass it only needs to return the number of samples):
int update_audio(short *buffer)
{
// Check if sound is disabled
if (sampleRate != 0)
{
memcpy(audioBuffer.writePoint, buffer, audioSamplesSize); // Bytes
audioBuffer.writePoint += (short)samplesPerFrame; // Cells
if (audioBuffer.writePoint >= audioBuffer.buffer + audioBuffer.cells)
audioBuffer.writePoint = audioBuffer.writePoint - audioBuffer.cells;
if (firstTime)
{
// Set required audio specs
want.freq = sampleRate;
want.format = AUDIO_S16SYS;
want.channels = channel;
want.samples = samplesPerFrame / channel; // total samples divided by channel count
want.padding = 0;
want.callback = audioCallback;
want.userdata = NULL;
device = SDL_OpenAudioDevice(SDL_GetAudioDeviceName(0, 0), 0, &want, &have, 0);
SDL_PauseAudioDevice(device, 0);
firstTime = 0;
}
}
else
samplesPerFrame = 0;
// Next frame
return samplesPerFrame;
}
I hope this question and answer will be useful to others in the future, because I found almost nothing on the net about SDL audio.
Ok, I found a solution, as described in the edit above: I put the next frame's samples into a circular buffer and read and write it at different positions, shifted by one frame's worth of samples; see the code in the opening post.
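The wrap-around logic can be sketched standalone with indices instead of raw pointers (Ring, ring_write, and ring_read are illustrative names; the posted code keeps short * pointers and a cells count, but the arithmetic is the same):

```c
#include <stddef.h>

#define CELLS 8 /* buffer size in cells; a multiple of the per-frame count */

typedef struct {
    short buffer[CELLS];
    size_t readPoint;  /* cell index, not byte offset */
    size_t writePoint;
} Ring;

/* Write one frame's worth of samples, then advance and wrap. */
void ring_write(Ring *r, const short *samples, size_t perFrame)
{
    for (size_t i = 0; i < perFrame; i++)
        r->buffer[(r->writePoint + i) % CELLS] = samples[i];
    r->writePoint = (r->writePoint + perFrame) % CELLS;
}

/* Read one frame's worth of samples, then advance and wrap. */
void ring_read(Ring *r, short *out, size_t perFrame)
{
    for (size_t i = 0; i < perFrame; i++)
        out[i] = r->buffer[(r->readPoint + i) % CELLS];
    r->readPoint = (r->readPoint + perFrame) % CELLS;
}
```

Starting the write pointer one frame ahead of the read pointer, as initialize_audio() does, guarantees the callback always reads a frame that was written on the previous tick.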

What is the correct way to access a large fits grid in C with CFITSIO

I have two FITS grids which I am accessing with CFITSIO (https://heasarc.gsfc.nasa.gov/fitsio/c). The first grid is 1071 (rows) x 3 (cols) and of the float data type. I can access this fine and print the output of each entry. When I repeat this code for my second grid, a large (1.1 GB) grid which is 1071 x 262144 and also of the float data type, I get a segmentation fault. Before I carry on, here's the code:
#FILE Cspec.c
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <string.h>
#include "fitsio.h"
#include "load_data.h"
#include "general_functions.h"
#define MAX_ROW 230746
#define N 131072
#define WAVEMIN 450.0
#define WAVEMAX 650.0
void printerror( int status)
{
/*****************************************************/
/* Print out cfitsio error messages and exit program */
/*****************************************************/
if (status)
{
fits_report_error(stderr, status); /* print error report */
exit( status );
}
return;
}
int main()
{ /*----------------------------
Now load the table
----------------------------*/
printf("\n\n--------------------\nLoading the models..\n--------------------");
fflush(stdout);
// Imports and definitions
fitsfile *fptr; /* pointer to the FITS file, defined in fitsio.h */
int status, anynull;
long naxes[2], fpixel, nbuffer, npixels, ii;
char filename[] = "models.fits"; /* name of existing FITS file */
status = 0;
// Open the file and kick off if its not there
if ( fits_open_file(&fptr, filename, READONLY, &status) )
printerror( status );
/* read the NAXIS1 and NAXIS2 keyword to get image size */
//if ( fits_read_keys_lng(fptr, "NAXIS", 1, 2, naxes, &nfound, &status) )
// printerror( status );
// Define the rows and columns of fits image and thus the buffer size
naxes[0] = 1017;
naxes[1] = 262144;
size_t buffsize = ((size_t) (naxes[1] * naxes[0]) + (naxes[1] * sizeof(float)));
float nullval, buffer[buffsize];
npixels = naxes[0] * naxes[1]; /* number of pixels in the image */
fpixel = 1;
nullval = 0; /* don't check for null values in the image */
while (npixels > 0)
{
nbuffer = npixels;
if (npixels > buffsize)
nbuffer = buffsize; /* read as many pixels as will fit in buffer */
if ( fits_read_img(fptr, TFLOAT, fpixel, nbuffer, &nullval,
buffer, &anynull, &status) )
printerror( status );
for (ii = 0; ii < nbuffer; ii++)
{
// Output the value
printf("%f\t", buffer[ii]);
}
npixels -= nbuffer; /* increment remaining number of pixels */
fpixel += nbuffer; /* next pixel to be read in image */
}
if ( fits_close_file(fptr, &status) )
printerror( status );
return 0;
}
I compile using:
nvcc Cspec2.c general_functions.c load_data.c -I/iraf/iraf/vendor/cfitsio /iraf/iraf/vendor/cfitsio/libcfitsio.a -lm -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64
The other C files (general_functions.c and load_data.c) are not relevant. The reason for the last two flags comes from https://heasarc.gsfc.nasa.gov/fitsio/c/f_user/node13.html, which describes the size limits when working with FITS files. One thing it says (para 4):
"If CFITSIO is compiled with the -D_LARGEFILE_SOURCE and -D_FILE_OFFSET_BITS=64 flags on a platform that supports large files, then it can read and write FITS files that contain up to 2**31 2880-byte FITS records, or approximately 6 terabytes in size. It is still required that the value of the NAXISn and PCOUNT keywords in each extension be within the range of a signed 4-byte integer (max value = 2,147,483,648). Thus, each dimension of an image (given by the NAXISn keywords), the total width of a table (NAXIS1 keyword), the number of rows in a table (NAXIS2 keyword), and the total size of the variable-length array heap in binary tables (PCOUNT keyword) must be less than this limit."
I assumed that adding these flags would sort this out, since 1071 * 262,144 = 280,756,224, which is below the range of a signed 4-byte integer (max value 2,147,483,648). There is also more information here (https://heasarc.gsfc.nasa.gov/fitsio/c/c_user/node32.html). Where am I going wrong? Why does this cause a segfault? The code works fine with the smaller grid (1071 x 3). I assume I have to add some extra compiler flags or change something in my code? Could the problem lie in the fact that I am using the CFITSIO from an IRAF distribution?
One thing I found from debugging is that I could not even print the buffsize variable - that on its own led to a segfault.
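(Since this thread has no posted answer, a hedged guess: `float buffer[buffsize]` is a variable-length array of roughly a gigabyte on the stack, far beyond a typical 8 MB stack limit, which would fault before any CFITSIO call runs and is consistent with buffsize-sized locals faulting on their own. A minimal sketch of moving such a buffer to the heap, reading a manageable number of pixels per pass:)

```c
#include <stdio.h>
#include <stdlib.h>

/* Heap-allocate the read buffer instead of declaring a huge stack VLA.
   n_pixels is a per-pass pixel count (e.g. one image row of naxes[0]
   pixels), not the whole 1071 x 262144 image. */
float *alloc_pixel_buffer(size_t n_pixels)
{
    float *buf = malloc(n_pixels * sizeof *buf);
    if (buf == NULL)
        fprintf(stderr, "malloc of %zu pixels failed\n", n_pixels);
    return buf;
}
```

The while loop would then cap nbuffer at n_pixels per fits_read_img call and free the buffer after the loop.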
I am running 64-bit Ubuntu 16.04 LTS, and I am using the nvcc compiler because I plan to offload some later work to the GPU. I'm still relatively new to C (coming from Python).
Thanks
Sam

"-Nan" value for the total sum of array elements with GPU code

I am working on an OpenCL code which computes the sum of the elements of an array. Everything works fine up to a size of 1.024 * 1e8 for the 1D input array, but with 1.024 * 1e9 the final value is "-nan".
Here's the source of the code on this link
The Kernel code is on this link
and the Makefile on this link
Here's the result for the last array size that works (1.024 * 1e8):
$ ./sumReductionGPU 102400000
Max WorkGroup size = 4100
Number of WorkGroups = 800000
Problem size = 102400000
Final Sum Sequential = 5.2428800512000000000e+15
Final Sum GPU = 5.2428800512000000000e+15
Initializing Arrays : Wall Clock = 0 second 673785 micro
Preparing GPU/OpenCL : Wall Clock = 1 second 925451 micro
Time for one NDRangeKernel call and WorkGroups final Sum : Wall Clock = 0 second 30511 micro
Time for Sequential Sum computing : Wall Clock = 0 second 398485 micro
I have taken local_item_size = 128, so, as indicated above, I have 800000 work-groups for NWorkItems = 1.024 * 1e8.
Now if I take 1.024 * 1e9, the partial sums are no longer computed and I get "-nan" for the total sum of the array elements.
$ ./sumReductionGPU 1024000000
Max WorkGroup size = 4100
Number of WorkGroups = 8000000
Problem size = 1024000000
Final Sum Sequential = 5.2428800006710899200e+17
Final Sum GPU = -nan
Initializing Arrays : Wall Clock = 24 second 360088 micro
Preparing GPU/OpenCL : Wall Clock = 19 second 494640 micro
Time for one NDRangeKernel call and WorkGroups final Sum : Wall Clock = 0 second 481910 micro
Time for Sequential Sum computing : Wall Clock = 166 second 214384 micro
Maybe I have reached the limit of what the GPU can compute, but I would like your advice to confirm this.
If a double is 8 bytes, this requires 1.024 * 1e9 * 8 ~ 8 GB for the input array: isn't that too much? I have only 8 GB of RAM.
From your experience, where could this issue come from?
Thanks
As you already found out, your 1D input array requires a lot of memory, so the memory allocations with malloc or clCreateBuffer are prone to fail.
For malloc, I suggest using a helper function checked_malloc which detects a failed allocation, prints a message, and exits the program.
#include <stdlib.h>
#include <stdio.h>
void * checked_malloc(size_t size, const char purpose[]) {
void *result = malloc(size);
if(result == NULL) {
fprintf(stderr, "ERROR: malloc failed for %s\n", purpose);
exit(1);
}
return result;
}
int main()
{
double *p1 = checked_malloc(1e8 * sizeof *p1, "array1");
double *p2 = checked_malloc(64 * 1e9 * sizeof *p2, "array2");
return 0;
}
On my PC, which has only 48 GB of virtual memory, the second allocation fails and the program prints:
ERROR: malloc failed for array2
You can apply the same scheme to clCreateBuffer. But you have to check the result of every OpenCL call anyway, so I recommend using a macro for this:
#define CHECK_CL_ERROR(result) do { if ((result) != CL_SUCCESS) { \
    fprintf(stderr, "OpenCL call failed at %s:%d with code %d\n", \
            __FILE__, __LINE__, (result)); } } while (0)
An example usage would be:
cl_mem inputBuffer = clCreateBuffer(context, CL_MEM_READ_ONLY,
nWorkItems * sizeof(double), NULL, &ret);
CHECK_CL_ERROR(ret);

UART write buffer with PDC

I'm having a problem with writing to a USART using a const char buffer and a char array.
Here is my UART write function:
unsigned int USART_Send( unsigned char *p_pucData,
unsigned int p_unLen)
{
AT91C_BASE_US2->US_TPR = (unsigned int)p_pucData;
AT91C_BASE_US2->US_TCR = p_unLen;
AT91C_BASE_US2->US_PTCR = AT91C_PDC_TXTEN;
while((AT91C_BASE_US2->US_CSR & ((0x1 << 11) | (0x1 << 4) ) ) == 0);
AT91C_BASE_US2->US_PTCR = AT91C_PDC_TXTDIS;
return p_unLen;
}
The function works when called with a const char * literal:
USART_Send("IsitDone?",9); //Working
If I use an array buffer like below, it prints garbage characters. I wonder why?
unsigned char arr[10];
memcpy(arr, "HelloWorld", 10);
USART_Send(arr, sizeof(arr)); //Not working properly displaying Garbage chars
Ricardo Crudo is correct. You run into the following problem:
arr is created on the stack
arr is filled
call USART_Send
fill transmit pointer, counter, enable tx requests
/* peripheral state is TXBUFE = '0' and ENDTX = '0' because: */
/* TXBUFE = '0' when PERIPH_TCR != 0 and */
/* ENDTX = '0' when PERIPH_TCR != 0 */
/* but we just wrote to PERIPH_TCR, so it's != 0 */
/* both conditions are satisfied, b/c the transfer hasn't started yet! */
wait until (TXBUFE = '0' and ENDTX = '0')
/* your code thinks PDC is done here */
/* but in reality, PDC is getting started */
disable tx requests
return from sub-function
overwrite stack (and arr) with unrelated data here
/* PDC might push out last word(s) here due to pipelining/ */
/* instruction cache/side effects/you-name-it */
/* even though its matrix requests were disabled a few cycles ago */
Solutions:
copy to a global buffer or
wait some cycles between enabling tx requests and checking if the PDC is done (possibly a whole baud tick) or
read back PERIPH_TCR and check if it's zero instead of checking the flags
Ideally, you would allocate some dynamic memory for strings and deallocate it after the PDC is done asynchronously to your actual code. You might want to check if you can get some kind of interrupt after the PDC/peripheral is done, then deallocate the memory it read from.
If you don't have dynamic memory allocation, then use a global ring buffer and abstract your string/char send function to use that buffer instead.
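A self-contained sketch of the third option above, polling the transmit counter instead of the status flags. MockPdc and pdc_step are a hypothetical stand-in for the memory-mapped AT91 registers so the example can run off-target; on real hardware the registers are AT91C_BASE_US2->US_TPR/US_TCR/US_PTCR and the wait loop body is empty:

```c
#include <stdint.h>

/* Hypothetical mock of the PDC transmit registers. */
typedef struct {
    uintptr_t US_TPR;         /* transmit pointer register */
    volatile uint32_t US_TCR; /* transmit counter register */
    int tx_enabled;
} MockPdc;

/* Mock only: real hardware decrements US_TCR itself as bytes shift out. */
static void pdc_step(MockPdc *p)
{
    if (p->tx_enabled && p->US_TCR > 0)
        p->US_TCR--;
}

/* Poll US_TCR rather than TXBUFE/ENDTX: the counter only reaches zero
   once the PDC has read every byte of the caller's buffer, so the wait
   cannot end before the transfer has actually consumed the data. */
unsigned int USART_Send_polled(MockPdc *p, const unsigned char *data,
                               unsigned int len)
{
    p->US_TPR = (uintptr_t)data;
    p->US_TCR = len;
    p->tx_enabled = 1;                 /* AT91C_PDC_TXTEN on hardware */
    while (p->US_TCR != 0)
        pdc_step(p);                   /* empty loop body on hardware */
    p->tx_enabled = 0;                 /* AT91C_PDC_TXTDIS on hardware */
    return len;
}
```

Note that even with this fix, a stack buffer may still be risky if the USART shifter is draining its last word after US_TCR hits zero, which is why the global/ring buffer options remain the safest.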

Using bzip2 low-level routines to compress chunks of data

The Overview
I am using the low-level calls in the libbzip2 library: BZ2_bzCompressInit(), BZ2_bzCompress() and BZ2_bzCompressEnd() to compress chunks of data to standard output.
I am migrating working code from higher-level calls, because I have a stream of bytes coming in and I want to compress those bytes in sets of discrete chunks (a discrete chunk is a set of bytes that contains a group of tokens of interest — my input is logically divided into groups of these chunks).
A complete group of chunks might contain, say, 500 chunks, which I want to compress to one bzip2 stream and write to standard output.
Within a set, using the pseudocode I outline below, if my example buffer is able to hold 101 chunks at a time, I would open a new stream, compress 500 chunks in runs of 101, 101, 101, 101, and one final run of 96 chunks that closes the stream.
The Problem
The issue is that my bz_stream structure instance, which keeps track of the number of compressed bytes written in a single pass of the BZ2_bzCompress() routine, seems to claim to be writing more compressed bytes than the total size of the final compressed file.
For example, the compressed output could be a file with a true size of 1234 bytes, while the number of reported compressed bytes (which I track while debugging) is somewhat higher than 1234 bytes (say 2345 bytes).
My rough pseudocode is in two parts.
The first part is a rough sketch of what I do to compress a subset of chunks (and I know that I have another subset coming after this one):
bz_stream bzStream;
unsigned char bzBuffer[BZIP2_BUFFER_MAX_LENGTH] = {0};
unsigned long bzBytesWritten = 0UL;
unsigned long long cumulativeBytesWritten = 0ULL;
unsigned char myBuffer[UNCOMPRESSED_MAX_LENGTH] = {0};
size_t myBufferLength = 0;
/* initialize bzStream */
bzStream.next_in = NULL;
bzStream.avail_in = 0U;
bzStream.avail_out = 0U;
bzStream.bzalloc = NULL;
bzStream.bzfree = NULL;
bzStream.opaque = NULL;
int bzError = BZ2_bzCompressInit(&bzStream, 9, 0, 0);
/* bzError checking... */
do
{
/* read some bytes into myBuffer... */
/* compress bytes in myBuffer */
bzStream.next_in = myBuffer;
bzStream.avail_in = myBufferLength;
bzStream.next_out = bzBuffer;
bzStream.avail_out = BZIP2_BUFFER_MAX_LENGTH;
do
{
bzStream.next_out = bzBuffer;
bzStream.avail_out = BZIP2_BUFFER_MAX_LENGTH;
bzError = BZ2_bzCompress(&bzStream, BZ_RUN);
/* error checking... */
bzBytesWritten = ((unsigned long) bzStream.total_out_hi32 << 32) + bzStream.total_out_lo32;
cumulativeBytesWritten += bzBytesWritten;
/* write compressed data in bzBuffer to standard output */
fwrite(bzBuffer, 1, bzBytesWritten, stdout);
fflush(stdout);
}
while (bzError == BZ_OK);
}
while (/* while there is a non-final myBuffer full of discrete chunks left to compress... */);
Now we wrap up the output:
/* read in the final batch of bytes into myBuffer (with a total byte size of `myBufferLength`... */
/* compress remaining myBufferLength bytes in myBuffer */
bzStream.next_in = myBuffer;
bzStream.avail_in = myBufferLength;
bzStream.next_out = bzBuffer;
bzStream.avail_out = BZIP2_BUFFER_MAX_LENGTH;
do
{
bzStream.next_out = bzBuffer;
bzStream.avail_out = BZIP2_BUFFER_MAX_LENGTH;
bzError = BZ2_bzCompress(&bzStream, (bzStream.avail_in) ? BZ_RUN : BZ_FINISH);
/* bzError error checking... */
/* increment cumulativeBytesWritten by `bz_stream` struct `total_out_*` members */
bzBytesWritten = ((unsigned long) bzStream.total_out_hi32 << 32) + bzStream.total_out_lo32;
cumulativeBytesWritten += bzBytesWritten;
/* write compressed data in bzBuffer to standard output */
fwrite(bzBuffer, 1, bzBytesWritten, stdout);
fflush(stdout);
}
while (bzError != BZ_STREAM_END);
/* close stream */
bzError = BZ2_bzCompressEnd(&bzStream);
/* bzError checking... */
The Questions
Am I calculating cumulativeBytesWritten (or, specifically, bzBytesWritten) incorrectly, and how would I fix that?
I have been tracking these values in a debug build, and I do not seem to be "double counting" the bzBytesWritten value. This value is counted and used once to increment cumulativeBytesWritten after each successful BZ2_bzCompress() pass.
Alternatively, am I not understanding the correct use of the bz_stream state flags?
For example, does the following compress and keep the bzip2 stream open, so long as I keep sending some bytes?
bzError = BZ2_bzCompress(&bzStream, BZ_RUN);
Likewise, can the following statement compress data so long as at least some bytes are available via the bzStream.next_in pointer (BZ_RUN), and then wrap up the stream when no more bytes are available (BZ_FINISH)?
bzError = BZ2_bzCompress(&bzStream, (bzStream.avail_in) ? BZ_RUN : BZ_FINISH);
Or, am I not using these low-level calls correctly at all? Should I go back to using the higher-level calls to continuously append a grouping of compressed chunks of data to one main file?
There's probably a simple solution to this, but I've been banging my head on the table for a couple days in the course of debugging what could be wrong, and I'm not making much progress. Thank you for any advice.
In answer to my own question, it appears I am miscalculating the number of bytes written. I should not use the total_out_* members. The following correction works properly:
bzBytesWritten = sizeof(bzBuffer) - bzStream.avail_out;
The rest of the calculations follow.
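To make the distinction concrete: total_out_lo32/total_out_hi32 are cumulative counters for the whole stream, while the per-pass byte count is the output capacity handed to the call minus what the call left unused. A tiny helper capturing the corrected formula:

```c
#include <stddef.h>

/* Bytes produced by one BZ2_bzCompress() pass: the avail_out capacity
   set before the call, minus the avail_out remaining after it. */
size_t bytes_this_pass(size_t out_capacity, size_t avail_out_after)
{
    return out_capacity - avail_out_after;
}
```

For example, if a 4096-byte bzBuffer comes back with avail_out == 4000, the pass produced 96 bytes. Summing these per-pass deltas matches the final file size, whereas summing the cumulative total_out values re-counts every earlier pass and so over-reports.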
