encoding from rgb24 to mpeg4 and decoding from mpeg4 to rgb24 with libavcodec

Problem:
I am receiving rgb24 images from a webcam and encoding them to mpeg4 using libavcodec, but I am not able to decode them back to rgb24. avcodec_decode_video2 reports that it consumed all of the input, yet its third parameter (got_picture_ptr) always comes back 0.
What works:
Prior to calling avcodec_encode_video2, the rgb24 image is converted to yuv420p format via sws_scale. This works properly: I can take the result from sws_scale, convert it back to rgb24, and view the result on screen, which is how I know this part works.
Description:
For each image returned from avcodec_encode_video2, I call another function that decodes it back to rgb24. This scenario is set up for testing purposes, to ensure I can encode and decode successfully. The following summarizes what happens in the code:
/* variable declarations */
...
/* main processing */
while( true )
{
rgb24 = getRgb24();
sws_scale( ..., &rgb24, ..., frame->data, ... );
/* sws_scale is successful */
ret = avcodec_encode_video2( ..., &pkt, frame, &got_output );
/* avcodec_encode_video2 returns success and got_output is 1 */
/* copy encoded image to another buffer */
memcpy( hugebuffer, pkt.data, pkt.size );
memset( hugebuffer + pkt.size, 0, FF_INPUT_BUFFER_PADDING_SIZE );
av_init_packet( &decodePacket );
decodePacket.data = hugebuffer;
decodePacket.size = pkt.size;
len = avcodec_decode_video2( ..., decodeFrame, &got_output, &decodePacket );
/* len = pkt.size but got_output is ALWAYS 0 */
}
I wonder if there is something else I need to do before calling the decode function?
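One thing I am checking (a sketch, not a confirmed fix): the encoder is opened with max_b_frames = 1, so the decoder can run a frame or more behind, and the first call(s) to avcodec_decode_video2 may legitimately set got_output to 0, with the frame arriving on a later call. Also, the decode side sets CODEC_FLAG_TRUNCATED even though complete packets are passed in; that flag tells the decoder to expect partial frames and should probably be dropped here. Any frames still buffered in the decoder can be drained with an empty packet (decoderCtx is a placeholder for the decoder's context):
AVPacket flushPacket;
av_init_packet( &flushPacket );
flushPacket.data = 0; /* an empty packet asks a CODEC_CAP_DELAY decoder to flush */
flushPacket.size = 0;
do
{
len = avcodec_decode_video2( decoderCtx, decodeFrame, &got_output, &flushPacket );
if( got_output )
{
/* convert decodeFrame to rgb24 as usual */
}
} while( len >= 0 && got_output );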
For reference, I have provided the actual code below which includes the following four functions: videoEncodeInit, videoEncode, videoDecodeInit, videoDecode. After that, I have posted how the functions are called.
/*
* Copyright (c) 2001 Fabrice Bellard
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*/
/**
* @file
* libavcodec API use example.
*
* Note that libavcodec only handles codecs (mpeg, mpeg4, etc...),
* not file formats (avi, vob, mp4, mov, mkv, mxf, flv, mpegts, mpegps, etc...). See library 'libavformat' for the
* format handling
* @example doc/examples/decoding_encoding.c
*/
#include <math.h>
extern "C"
{
#include <libavutil/opt.h>
#include <libavcodec/avcodec.h>
#include <libavutil/channel_layout.h>
#include <libavutil/common.h>
#include <libavutil/imgutils.h>
#include <libavutil/mathematics.h>
#include <libavutil/samplefmt.h>
#include <libswscale/swscale.h>
}
#include "video.h"
#include "logmsg.h"
typedef struct
{
AVCodec *codec;
AVCodecContext *c;
AVFrame *frame;
SwsContext *yuv420p_to_rgb24_ctx;
int frame_count;
} PrivateVideoDecodeParams;
static AVCodec *codec = 0;
static AVCodecContext *c = 0;
static AVFrame *frame = 0;
static SwsContext *rgb24_to_yuv420p_ctx = 0;
static int registered = 0;
extern int g_debug; /* from main.cpp */
int videoEncodeInit( VideoEncodeParams *ep )
{
int ret;
if( registered == 0 )
{
avcodec_register_all();
registered = 1;
}
if( c != 0 )
{
avcodec_close(c);
av_free(c);
}
if( frame && frame->data[0] )
{
av_freep(&frame->data[0]);
avcodec_free_frame(&frame);
}
if( rgb24_to_yuv420p_ctx )
{
sws_freeContext( rgb24_to_yuv420p_ctx );
}
/* find the video encoder */
codec = avcodec_find_encoder(ep->codec_id);
if (!codec)
{
logmsg( "error - Codec=%d not found [%s:%d]\n", (int)ep->codec_id, __FILE__, __LINE__ );
return( -1 );
}
c = avcodec_alloc_context3(codec);
if (!c)
{
logmsg("error - Could not allocate video codec context [%s:%d]\n", __FILE__, __LINE__ );
return(-1);
}
/* put sample parameters */
c->bit_rate = 400000;
/* resolution must be a multiple of two */
c->width = ep->width; /* was hard coded at 352 */
c->height = ep->height; /* was hard coded at 288 */
/* frames per second */
c->time_base.num = 1;
c->time_base.den = ep->fps; /* time_base is 1/fps; was hard coded to 25 */
c->gop_size = 10; /* emit one intra frame every ten frames */
c->max_b_frames = 1;
c->pix_fmt = AV_PIX_FMT_YUV420P;
if(ep->codec_id == AV_CODEC_ID_H264)
{
av_opt_set(c->priv_data, "preset", "slow", 0);
}
/* open it */
ret = avcodec_open2(c, codec, NULL);
if( ret < 0)
{
logmsg( "error - Could not open codec [%s:%d]\n", __FILE__, __LINE__ );
return(-1);
}
frame = avcodec_alloc_frame();
if (!frame)
{
logmsg("error - Could not allocate video frame [%s:%d]\n", __FILE__, __LINE__ );
return(-1);
}
frame->format = c->pix_fmt;
frame->width = c->width;
frame->height = c->height;
/* the image can be allocated by any means and av_image_alloc() is
* just the most convenient way if av_malloc() is to be used */
ret = av_image_alloc(frame->data, frame->linesize, c->width, c->height, c->pix_fmt, 32);
if (ret < 0)
{
logmsg("error - Could not allocate raw picture buffer [%s:%d]\n", __FILE__, __LINE__ );
return( -1 );
}
rgb24_to_yuv420p_ctx = sws_getContext( ep->width, ep->height, AV_PIX_FMT_RGB24,
ep->width, ep->height, AV_PIX_FMT_YUV420P,
SWS_FAST_BILINEAR, 0, 0, 0 );
if( rgb24_to_yuv420p_ctx == 0 )
{
logmsg( "error - failed to create rb24 to yuv420p conversion scale [%s:%d]\n",
__FILE__, __LINE__ );
return( -1 );
}
return( 0 );
}
int videoEncode( VideoEncodeParams *ep )
{
AVPacket pkt;
int ret;
int got_output;
av_init_packet(&pkt);
pkt.data = 0; // packet data will be allocated by the encoder
pkt.size = 0;
frame->pts = ep->pts;
ep->pts++;
/* convert input to that expected by the encoder */
int srcStride = c->width * 3;
ret = sws_scale( rgb24_to_yuv420p_ctx, &ep->rgb24, &srcStride, 0, c->height, frame->data, frame->linesize );
/* encode the image */
ret = avcodec_encode_video2(c, &pkt, frame, &got_output);
if (ret < 0)
{
logmsg("error - Error encoding frame [%s:%d]\n", __FILE__, __LINE__ );
return(-1);
}
if (got_output)
{
if( g_debug )
{
logmsg("debug - Write frame %3d (size=%5d)\n", ep->pts-1, pkt.size);
}
ep->callback( ep, pkt.data, pkt.size );
av_free_packet(&pkt);
}
return( 0 );
}
int videoDecodeInit( VideoDecodeParams *dp )
{
PrivateVideoDecodeParams *pdp;
int ret;
if( registered == 0 )
{
avcodec_register_all();
registered = 1;
}
if( dp->rgb24_out )
{
free( dp->rgb24_out );
}
dp->rgb24_out = (unsigned char*)calloc( 1, dp->width * 3 * dp->height );
pdp = (PrivateVideoDecodeParams*)dp->priv;
if( pdp )
{
if( pdp->c != 0 )
{
avcodec_close(pdp->c);
av_free(pdp->c);
}
if( pdp->frame && pdp->frame->data[0] )
{
av_freep(&pdp->frame->data[0]);
avcodec_free_frame(&pdp->frame);
}
if( pdp->yuv420p_to_rgb24_ctx )
{
sws_freeContext( pdp->yuv420p_to_rgb24_ctx );
}
free( pdp );
}
dp->priv = calloc( 1, sizeof(*pdp) );
pdp = (PrivateVideoDecodeParams*)dp->priv;
/* find the video decoder */
if( pdp->codec == 0 )
{
pdp->codec = avcodec_find_decoder(dp->codec_id);
if (!pdp->codec)
{
logmsg("error - Codec not found=%d [%s:%d]\n", (int)dp->codec_id, __FILE__, __LINE__ );
return( -1 );
}
}
pdp->c = avcodec_alloc_context3(pdp->codec);
if (!pdp->c)
{
logmsg("error - Could not allocate video codec context [%s:%d]\n", __FILE__, __LINE__ );
return( -1 );
}
if(pdp->codec->capabilities & CODEC_CAP_TRUNCATED)
{
pdp->c->flags |= CODEC_FLAG_TRUNCATED; /* we do not send complete frames */
}
/* For some codecs, such as msmpeg4 and mpeg4, width and height
MUST be initialized there because this information is not
available in the bitstream. */
/* put sample parameters */
pdp->c->bit_rate = 400000;
/* resolution must be a multiple of two */
pdp->c->width = dp->width;
pdp->c->height = dp->height;
pdp->c->time_base.num = 1;
pdp->c->time_base.den = 25; /* hard coded to 25 fps */
pdp->c->gop_size = 10; /* emit one intra frame every ten frames */
pdp->c->max_b_frames = 1;
pdp->c->pix_fmt = AV_PIX_FMT_YUV420P;
if(dp->codec_id == AV_CODEC_ID_H264)
{
av_opt_set(pdp->c->priv_data, "preset", "slow", 0);
}
/* open it */
if (avcodec_open2(pdp->c, pdp->codec, NULL) < 0)
{
logmsg("error - Could not open codec [%s:%d]\n", __FILE__, __LINE__ );
return( -1 );
}
pdp->frame = avcodec_alloc_frame();
if (!pdp->frame)
{
logmsg("error - Could not allocate video frame [%s:%d]\n", __FILE__, __LINE__ );
return( -1 );
}
pdp->frame->format = AV_PIX_FMT_YUV420P;
pdp->frame->width = pdp->c->width;
pdp->frame->height = pdp->c->height;
#if 0
/* the image can be allocated by any means and av_image_alloc() is
* just the most convenient way if av_malloc() is to be used */
ret = av_image_alloc(pdp->frame->data, pdp->frame->linesize, pdp->c->width, pdp->c->height, pdp->c->pix_fmt, 32);
if (ret < 0)
{
logmsg("error - Could not allocate raw picture buffer [%s:%d]\n", __FILE__, __LINE__ );
return( -1 );
}
#endif
pdp->yuv420p_to_rgb24_ctx = sws_getContext( dp->width, dp->height, AV_PIX_FMT_YUV420P,
dp->width, dp->height, AV_PIX_FMT_RGB24,
SWS_FAST_BILINEAR, 0, 0, 0 );
if( pdp->yuv420p_to_rgb24_ctx == 0 )
{
logmsg( "error - failed to create yuv420p to rb24 conversion scale [%s:%d]\n",
__FILE__, __LINE__ );
return( -1 );
}
return( 0 );
}
int videoDecode( VideoDecodeParams *dp )
{
static uint8_t *inbuf = 0;
static int inbufLen = 0;
PrivateVideoDecodeParams *pdp;
AVPacket avpkt;
int len;
int got_frame;
int dstStride;
int ret;
pdp = (PrivateVideoDecodeParams*)dp->priv;
/* Make sure we have a buffer to work with. */
if( inbuf == 0 )
{
/* get a big buffer */
inbufLen = 256000;
inbuf = (uint8_t*)calloc( 1, inbufLen );
}
if( (dp->yuv420_len + FF_INPUT_BUFFER_PADDING_SIZE) > inbufLen )
{
if( inbuf )
{
free( inbuf );
}
inbufLen = dp->yuv420_len + FF_INPUT_BUFFER_PADDING_SIZE;
inbuf = (uint8_t*)calloc( 1, inbufLen );
}
/* copy the video to our buffer */
memcpy( inbuf, dp->yuv420_in, dp->yuv420_len );
/* set end of buffer to 0 (this ensures that no overreading happens for damaged mpeg streams) */
memset( inbuf + dp->yuv420_len, 0, FF_INPUT_BUFFER_PADDING_SIZE );
av_init_packet(&avpkt);
avpkt.size = dp->yuv420_len;
avpkt.data = inbuf;
len = avcodec_decode_video2(pdp->c, pdp->frame, &got_frame, &avpkt);
if (len < 0)
{
logmsg("error - Error while decoding frame %d [%s:%d]\n", pdp->frame_count, __FILE__, __LINE__ );
return len;
}
if (got_frame)
{
logmsg( "got a frame\n" );
/* convert to RGB24 */
dstStride = pdp->c->width * 3; /* stride of the rgb24 destination buffer */
ret = sws_scale( pdp->yuv420p_to_rgb24_ctx, pdp->frame->data, pdp->frame->linesize, 0, pdp->c->height, &dp->rgb24_out, &dstStride );
pdp->frame_count++;
}
else
{
logmsg( "no frame\n" );
}
return( got_frame );
}
Here's how the init functions are called:
videoEncodeInit :
ep.codec_id = AV_CODEC_ID_MPEG4;
ep.width = 352;
ep.height = 288;
ep.fps = 10;
ep.rgb24 = 0;
ep.pts = 0;
ep.callback = debug_videoEncodeCallback;
if( videoEncodeInit( &ep ) == -1 )
{
MyMessageBox( g_mainwindow, "Can't initialize video codec!", "Error", MB_OK | MB_ICONERROR );
exit( -1 );
}
videoDecodeInit:
dp.codec_id = AV_CODEC_ID_MPEG4;
dp.width = 352;
dp.height = 288;
videoDecodeInit( &dp );
videoEncode:

Related

PortAudio mic picking up audio from speaker

I have the following program:
#include <stdio.h>
#include <stdlib.h>
#include "portaudio.h"
/* #define SAMPLE_RATE (17932) // Test failure to open with this value. */
#define SAMPLE_RATE (44100)
#define FRAMES_PER_BUFFER (512)
#define NUM_SECONDS (5)
#define NUM_CHANNELS (2)
/* #define DITHER_FLAG (paDitherOff) */
#define DITHER_FLAG (0) /**/
/** Set to 1 if you want to capture the recording to a file. */
#define WRITE_TO_FILE (0)
/* Select sample format. */
#if 1
#define PA_SAMPLE_TYPE paFloat32
typedef float SAMPLE;
#define SAMPLE_SILENCE (0.0f)
#define PRINTF_S_FORMAT "%.8f"
#elif 1
#define PA_SAMPLE_TYPE paInt16
typedef short SAMPLE;
#define SAMPLE_SILENCE (0)
#define PRINTF_S_FORMAT "%d"
#elif 0
#define PA_SAMPLE_TYPE paInt8
typedef char SAMPLE;
#define SAMPLE_SILENCE (0)
#define PRINTF_S_FORMAT "%d"
#else
#define PA_SAMPLE_TYPE paUInt8
typedef unsigned char SAMPLE;
#define SAMPLE_SILENCE (128)
#define PRINTF_S_FORMAT "%d"
#endif
typedef struct
{
int frameIndex; /* Index into sample array. */
int maxFrameIndex;
SAMPLE *recordedSamples;
}
paTestData;
/* This routine will be called by the PortAudio engine when audio is needed.
** It may be called at interrupt level on some machines so don't do anything
** that could mess up the system like calling malloc() or free().
*/
static int recordCallback( const void *inputBuffer, void *outputBuffer,
unsigned long framesPerBuffer,
const PaStreamCallbackTimeInfo* timeInfo,
PaStreamCallbackFlags statusFlags,
void *userData )
{
paTestData *data = (paTestData*)userData;
const SAMPLE *rptr = (const SAMPLE*)inputBuffer;
SAMPLE *wptr = &data->recordedSamples[data->frameIndex * NUM_CHANNELS];
long framesToCalc;
long i;
int finished;
unsigned long framesLeft = data->maxFrameIndex - data->frameIndex;
(void) outputBuffer; /* Prevent unused variable warnings. */
(void) timeInfo;
(void) statusFlags;
(void) userData;
if( framesLeft < framesPerBuffer )
{
framesToCalc = framesLeft;
finished = paComplete;
}
else
{
framesToCalc = framesPerBuffer;
finished = paContinue;
}
if( inputBuffer == NULL )
{
for( i=0; i<framesToCalc; i++ )
{
*wptr++ = SAMPLE_SILENCE; /* left */
if( NUM_CHANNELS == 2 ) *wptr++ = SAMPLE_SILENCE; /* right */
}
}
else
{
for( i=0; i<framesToCalc; i++ )
{
*wptr++ = *rptr++; /* left */
if( NUM_CHANNELS == 2 ) *wptr++ = *rptr++; /* right */
}
}
data->frameIndex += framesToCalc;
return finished;
}
/* This routine will be called by the PortAudio engine when audio is needed.
** It may be called at interrupt level on some machines so don't do anything
** that could mess up the system like calling malloc() or free().
*/
static int playCallback( const void *inputBuffer, void *outputBuffer,
unsigned long framesPerBuffer,
const PaStreamCallbackTimeInfo* timeInfo,
PaStreamCallbackFlags statusFlags,
void *userData )
{
paTestData *data = (paTestData*)userData;
SAMPLE *rptr = &data->recordedSamples[data->frameIndex * NUM_CHANNELS];
SAMPLE *wptr = (SAMPLE*)outputBuffer;
unsigned int i;
int finished;
unsigned int framesLeft = data->maxFrameIndex - data->frameIndex;
(void) inputBuffer; /* Prevent unused variable warnings. */
(void) timeInfo;
(void) statusFlags;
(void) userData;
if( framesLeft < framesPerBuffer )
{
/* final buffer... */
for( i=0; i<framesLeft; i++ )
{
*wptr++ = *rptr++; /* left */
if( NUM_CHANNELS == 2 ) *wptr++ = *rptr++; /* right */
}
for( ; i<framesPerBuffer; i++ )
{
*wptr++ = 0; /* left */
if( NUM_CHANNELS == 2 ) *wptr++ = 0; /* right */
}
data->frameIndex += framesLeft;
finished = paComplete;
}
else
{
for( i=0; i<framesPerBuffer; i++ )
{
*wptr++ = *rptr++; /* left */
if( NUM_CHANNELS == 2 ) *wptr++ = *rptr++; /* right */
}
data->frameIndex += framesPerBuffer;
finished = paContinue;
}
return finished;
}
/*******************************************************************/
int main(void);
int main(void)
{
PaStreamParameters inputParameters,
outputParameters;
PaStream* stream;
PaError err = paNoError;
paTestData data;
int i;
int totalFrames;
int numSamples;
int numBytes;
SAMPLE max, val;
double average;
printf("patest_record.c\n"); fflush(stdout);
data.maxFrameIndex = totalFrames = NUM_SECONDS * SAMPLE_RATE; /* Record for a few seconds. */
data.frameIndex = 0;
numSamples = totalFrames * NUM_CHANNELS;
numBytes = numSamples * sizeof(SAMPLE);
data.recordedSamples = (SAMPLE *) malloc( numBytes ); /* From now on, recordedSamples is initialised. */
if( data.recordedSamples == NULL )
{
printf("Could not allocate record array.\n");
goto done;
}
for( i=0; i<numSamples; i++ ) data.recordedSamples[i] = 0;
err = Pa_Initialize();
if( err != paNoError ) goto done;
inputParameters.device = Pa_GetDefaultInputDevice(); /* default input device */
if (inputParameters.device == paNoDevice) {
fprintf(stderr,"Error: No default input device.\n");
goto done;
}
inputParameters.channelCount = 2; /* stereo input */
inputParameters.sampleFormat = PA_SAMPLE_TYPE;
inputParameters.suggestedLatency = Pa_GetDeviceInfo( inputParameters.device )->defaultLowInputLatency;
inputParameters.hostApiSpecificStreamInfo = NULL;
/* Record some audio. -------------------------------------------- */
err = Pa_OpenStream(
&stream,
&inputParameters,
NULL, /* &outputParameters, */
SAMPLE_RATE,
FRAMES_PER_BUFFER,
paClipOff, /* we won't output out of range samples so don't bother clipping them */
recordCallback,
&data );
if( err != paNoError ) goto done;
err = Pa_StartStream( stream );
if( err != paNoError ) goto done;
printf("\n=== Now recording!! Please speak into the microphone. ===\n"); fflush(stdout);
while( ( err = Pa_IsStreamActive( stream ) ) == 1 )
{
Pa_Sleep(1000);
printf("index = %d\n", data.frameIndex ); fflush(stdout);
}
if( err < 0 ) goto done;
err = Pa_CloseStream( stream );
if( err != paNoError ) goto done;
/* Measure maximum peak amplitude. */
max = 0;
average = 0.0;
for( i=0; i<numSamples; i++ )
{
val = data.recordedSamples[i];
if( val < 0 ) val = -val; /* ABS */
if( val > max )
{
max = val;
}
average += val;
}
average = average / (double)numSamples;
printf("sample max amplitude = "PRINTF_S_FORMAT"\n", max );
printf("sample average = %lf\n", average );
/* Write recorded data to a file. */
#if WRITE_TO_FILE
{
FILE *fid;
fid = fopen("recorded.raw", "wb");
if( fid == NULL )
{
printf("Could not open file.");
}
else
{
fwrite( data.recordedSamples, NUM_CHANNELS * sizeof(SAMPLE), totalFrames, fid );
fclose( fid );
printf("Wrote data to 'recorded.raw'\n");
}
}
#endif
/* Playback recorded data. -------------------------------------------- */
data.frameIndex = 0;
outputParameters.device = Pa_GetDefaultOutputDevice(); /* default output device */
if (outputParameters.device == paNoDevice) {
fprintf(stderr,"Error: No default output device.\n");
goto done;
}
outputParameters.channelCount = 2; /* stereo output */
outputParameters.sampleFormat = PA_SAMPLE_TYPE;
outputParameters.suggestedLatency = Pa_GetDeviceInfo( outputParameters.device )->defaultLowOutputLatency;
outputParameters.hostApiSpecificStreamInfo = NULL;
printf("\n=== Now playing back. ===\n"); fflush(stdout);
err = Pa_OpenStream(
&stream,
NULL, /* no input */
&outputParameters,
SAMPLE_RATE,
FRAMES_PER_BUFFER,
paClipOff, /* we won't output out of range samples so don't bother clipping them */
playCallback,
&data );
if( err != paNoError ) goto done;
if( stream )
{
err = Pa_StartStream( stream );
if( err != paNoError ) goto done;
printf("Waiting for playback to finish.\n"); fflush(stdout);
while( ( err = Pa_IsStreamActive( stream ) ) == 1 ) Pa_Sleep(100);
if( err < 0 ) goto done;
err = Pa_CloseStream( stream );
if( err != paNoError ) goto done;
printf("Done.\n"); fflush(stdout);
}
done:
Pa_Terminate();
if( data.recordedSamples ) /* Sure it is NULL or valid. */
free( data.recordedSamples );
if( err != paNoError )
{
fprintf( stderr, "An error occured while using the portaudio stream\n" );
fprintf( stderr, "Error number: %d\n", err );
fprintf( stderr, "Error message: %s\n", Pa_GetErrorText( err ) );
err = 1; /* Always return 0 or 1, but no other return codes. */
}
return err;
}
This can also be found here: http://files.portaudio.com/docs/v19-doxydocs/paex__record_8c_source.html
I noticed that during the recording section, if I have something playing on my speakers, PortAudio will pick up that sound as well. Is there any way to configure PortAudio not to pick up that sound?
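As far as I can tell, PortAudio itself has no echo-cancellation or loopback-filtering option: it records whatever the selected capture device delivers, so if the system default input is a monitor/loopback source, the speaker output ends up in the recording. A minimal sketch of enumerating capture devices so a physical microphone can be chosen explicitly instead of Pa_GetDefaultInputDevice():
int numDevices = Pa_GetDeviceCount();
for( int d = 0; d < numDevices; d++ )
{
const PaDeviceInfo *info = Pa_GetDeviceInfo( d );
if( info->maxInputChannels > 0 )
{
printf( "input device %d: %s\n", d, info->name );
}
}
/* then set inputParameters.device to the chosen index */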

FFMPEG RTSP Server using muxing doc example

I am trying to develop an RTSP server using FFmpeg. For that, I slightly modified the muxing file located in the doc/examples/ folder of the FFmpeg repository.
Here is my source code for the RTSP server example:
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <math.h>
#include <libavutil/avassert.h>
#include <libavutil/channel_layout.h>
#include <libavutil/opt.h>
#include <libavutil/mathematics.h>
#include <libavutil/timestamp.h>
#include <libavformat/avformat.h>
#include <libswscale/swscale.h>
#include <libswresample/swresample.h>
#define STREAM_DURATION 10.0
#define STREAM_FRAME_RATE 25 /* 25 images/s */
#define STREAM_PIX_FMT AV_PIX_FMT_YUV420P /* default pix_fmt */
#define SCALE_FLAGS SWS_BICUBIC
// a wrapper around a single output AVStream
typedef struct OutputStream {
AVStream *st;
AVCodecContext *enc;
/* pts of the next frame that will be generated */
int64_t next_pts;
int samples_count;
AVFrame *frame;
AVFrame *tmp_frame;
float t, tincr, tincr2;
struct SwsContext *sws_ctx;
struct SwrContext *swr_ctx;
} OutputStream;
static void log_packet(const AVFormatContext *fmt_ctx, const AVPacket *pkt)
{
AVRational *time_base = &fmt_ctx->streams[pkt->stream_index]->time_base;
printf("pts:%s pts_time:%s dts:%s dts_time:%s duration:%s duration_time:%s stream_index:%d\n",
av_ts2str(pkt->pts), av_ts2timestr(pkt->pts, time_base),
av_ts2str(pkt->dts), av_ts2timestr(pkt->dts, time_base),
av_ts2str(pkt->duration), av_ts2timestr(pkt->duration, time_base),
pkt->stream_index);
}
static int write_frame(AVFormatContext *fmt_ctx, const AVRational *time_base, AVStream *st, AVPacket *pkt)
{
/* rescale output packet timestamp values from codec to stream timebase */
av_packet_rescale_ts(pkt, *time_base, st->time_base);
pkt->stream_index = st->index;
/* Write the compressed frame to the media file. */
log_packet(fmt_ctx, pkt);
return av_interleaved_write_frame(fmt_ctx, pkt);
}
/* Add an output stream. */
static void add_stream(OutputStream *ost, AVFormatContext *oc,
AVCodec **codec,
enum AVCodecID codec_id)
{
AVCodecContext *c;
int i;
/* find the encoder */
*codec = avcodec_find_encoder(codec_id);
if (!(*codec)) {
fprintf(stderr, "Could not find encoder for '%s'\n",
avcodec_get_name(codec_id));
exit(1);
}
ost->st = avformat_new_stream(oc, NULL);
if (!ost->st) {
fprintf(stderr, "Could not allocate stream\n");
exit(1);
}
ost->st->id = oc->nb_streams-1;
c = avcodec_alloc_context3(*codec);
if (!c) {
fprintf(stderr, "Could not alloc an encoding context\n");
exit(1);
}
ost->enc = c;
switch ((*codec)->type) {
case AVMEDIA_TYPE_AUDIO:
c->sample_fmt = (*codec)->sample_fmts ?
(*codec)->sample_fmts[0] : AV_SAMPLE_FMT_FLTP;
c->bit_rate = 64000;
c->sample_rate = 44100;
if ((*codec)->supported_samplerates) {
c->sample_rate = (*codec)->supported_samplerates[0];
for (i = 0; (*codec)->supported_samplerates[i]; i++) {
if ((*codec)->supported_samplerates[i] == 44100)
c->sample_rate = 44100;
}
}
c->channels = av_get_channel_layout_nb_channels(c->channel_layout);
c->channel_layout = AV_CH_LAYOUT_STEREO;
if ((*codec)->channel_layouts) {
c->channel_layout = (*codec)->channel_layouts[0];
for (i = 0; (*codec)->channel_layouts[i]; i++) {
if ((*codec)->channel_layouts[i] == AV_CH_LAYOUT_STEREO)
c->channel_layout = AV_CH_LAYOUT_STEREO;
}
}
c->channels = av_get_channel_layout_nb_channels(c->channel_layout);
ost->st->time_base = (AVRational){ 1, c->sample_rate };
break;
case AVMEDIA_TYPE_VIDEO:
c->codec_id = codec_id;
c->bit_rate = 400000;
/* Resolution must be a multiple of two. */
c->width = 352;
c->height = 288;
/* timebase: This is the fundamental unit of time (in seconds) in terms
* of which frame timestamps are represented. For fixed-fps content,
* timebase should be 1/framerate and timestamp increments should be
* identical to 1. */
ost->st->time_base = (AVRational){ 1, STREAM_FRAME_RATE };
c->time_base = ost->st->time_base;
c->gop_size = 12; /* emit one intra frame every twelve frames at most */
c->pix_fmt = STREAM_PIX_FMT;
if (c->codec_id == AV_CODEC_ID_MPEG2VIDEO) {
/* just for testing, we also add B-frames */
c->max_b_frames = 2;
}
if (c->codec_id == AV_CODEC_ID_MPEG1VIDEO) {
/* Needed to avoid using macroblocks in which some coeffs overflow.
* This does not happen with normal video, it just happens here as
* the motion of the chroma plane does not match the luma plane. */
c->mb_decision = 2;
}
break;
default:
break;
}
/* Some formats want stream headers to be separate. */
if (oc->oformat->flags & AVFMT_GLOBALHEADER)
c->flags |= AV_CODEC_FLAG_GLOBAL_HEADER;
}
/**************************************************************/
/* audio output */
static AVFrame *alloc_audio_frame(enum AVSampleFormat sample_fmt,
uint64_t channel_layout,
int sample_rate, int nb_samples)
{
AVFrame *frame = av_frame_alloc();
int ret;
if (!frame) {
fprintf(stderr, "Error allocating an audio frame\n");
exit(1);
}
frame->format = sample_fmt;
frame->channel_layout = channel_layout;
frame->sample_rate = sample_rate;
frame->nb_samples = nb_samples;
if (nb_samples) {
ret = av_frame_get_buffer(frame, 0);
if (ret < 0) {
fprintf(stderr, "Error allocating an audio buffer\n");
exit(1);
}
}
return frame;
}
static void open_audio(AVFormatContext *oc, AVCodec *codec, OutputStream *ost, AVDictionary *opt_arg)
{
AVCodecContext *c;
int nb_samples;
int ret;
AVDictionary *opt = NULL;
c = ost->enc;
/* open it */
av_dict_copy(&opt, opt_arg, 0);
ret = avcodec_open2(c, codec, &opt);
av_dict_free(&opt);
if (ret < 0) {
fprintf(stderr, "Could not open audio codec: %s\n", av_err2str(ret));
exit(1);
}
/* init signal generator */
ost->t = 0;
ost->tincr = 2 * M_PI * 110.0 / c->sample_rate;
/* increment frequency by 110 Hz per second */
ost->tincr2 = 2 * M_PI * 110.0 / c->sample_rate / c->sample_rate;
if (c->codec->capabilities & AV_CODEC_CAP_VARIABLE_FRAME_SIZE)
nb_samples = 10000;
else
nb_samples = c->frame_size;
ost->frame = alloc_audio_frame(c->sample_fmt, c->channel_layout,
c->sample_rate, nb_samples);
ost->tmp_frame = alloc_audio_frame(AV_SAMPLE_FMT_S16, c->channel_layout,
c->sample_rate, nb_samples);
/* copy the stream parameters to the muxer */
ret = avcodec_parameters_from_context(ost->st->codecpar, c);
if (ret < 0) {
fprintf(stderr, "Could not copy the stream parameters\n");
exit(1);
}
/* create resampler context */
ost->swr_ctx = swr_alloc();
if (!ost->swr_ctx) {
fprintf(stderr, "Could not allocate resampler context\n");
exit(1);
}
/* set options */
av_opt_set_int (ost->swr_ctx, "in_channel_count", c->channels, 0);
av_opt_set_int (ost->swr_ctx, "in_sample_rate", c->sample_rate, 0);
av_opt_set_sample_fmt(ost->swr_ctx, "in_sample_fmt", AV_SAMPLE_FMT_S16, 0);
av_opt_set_int (ost->swr_ctx, "out_channel_count", c->channels, 0);
av_opt_set_int (ost->swr_ctx, "out_sample_rate", c->sample_rate, 0);
av_opt_set_sample_fmt(ost->swr_ctx, "out_sample_fmt", c->sample_fmt, 0);
/* initialize the resampling context */
if ((ret = swr_init(ost->swr_ctx)) < 0) {
fprintf(stderr, "Failed to initialize the resampling context\n");
exit(1);
}
}
/* Prepare a 16 bit dummy audio frame of 'frame_size' samples and
* 'nb_channels' channels. */
static AVFrame *get_audio_frame(OutputStream *ost)
{
AVFrame *frame = ost->tmp_frame;
int j, i, v;
int16_t *q = (int16_t*)frame->data[0];
/* check if we want to generate more frames */
if (av_compare_ts(ost->next_pts, ost->enc->time_base,
STREAM_DURATION, (AVRational){ 1, 1 }) >= 0)
return NULL;
for (j = 0; j <frame->nb_samples; j++) {
v = (int)(sin(ost->t) * 10000);
for (i = 0; i < ost->enc->channels; i++)
*q++ = v;
ost->t += ost->tincr;
ost->tincr += ost->tincr2;
}
frame->pts = ost->next_pts;
ost->next_pts += frame->nb_samples;
return frame;
}
/*
* encode one audio frame and send it to the muxer
* return 1 when encoding is finished, 0 otherwise
*/
static int write_audio_frame(AVFormatContext *oc, OutputStream *ost)
{
AVCodecContext *c;
AVPacket pkt = { 0 }; // data and size must be 0;
AVFrame *frame;
int ret;
int got_packet;
int dst_nb_samples;
av_init_packet(&pkt);
c = ost->enc;
frame = get_audio_frame(ost);
if (frame) {
/* convert samples from native format to destination codec format, using the resampler */
/* compute destination number of samples */
dst_nb_samples = av_rescale_rnd(swr_get_delay(ost->swr_ctx, c->sample_rate) + frame->nb_samples,
c->sample_rate, c->sample_rate, AV_ROUND_UP);
av_assert0(dst_nb_samples == frame->nb_samples);
/* when we pass a frame to the encoder, it may keep a reference to it
* internally;
* make sure we do not overwrite it here
*/
ret = av_frame_make_writable(ost->frame);
if (ret < 0)
exit(1);
/* convert to destination format */
ret = swr_convert(ost->swr_ctx,
ost->frame->data, dst_nb_samples,
(const uint8_t **)frame->data, frame->nb_samples);
if (ret < 0) {
fprintf(stderr, "Error while converting\n");
exit(1);
}
frame = ost->frame;
frame->pts = av_rescale_q(ost->samples_count, (AVRational){1, c->sample_rate}, c->time_base);
ost->samples_count += dst_nb_samples;
}
ret = avcodec_encode_audio2(c, &pkt, frame, &got_packet);
if (ret < 0) {
fprintf(stderr, "Error encoding audio frame: %s\n", av_err2str(ret));
exit(1);
}
if (got_packet) {
ret = write_frame(oc, &c->time_base, ost->st, &pkt);
if (ret < 0) {
fprintf(stderr, "Error while writing audio frame: %s\n",
av_err2str(ret));
exit(1);
}
}
return (frame || got_packet) ? 0 : 1;
}
/**************************************************************/
/* video output */
static AVFrame *alloc_picture(enum AVPixelFormat pix_fmt, int width, int height)
{
AVFrame *picture;
int ret;
picture = av_frame_alloc();
if (!picture)
return NULL;
picture->format = pix_fmt;
picture->width = width;
picture->height = height;
/* allocate the buffers for the frame data */
ret = av_frame_get_buffer(picture, 32);
if (ret < 0) {
fprintf(stderr, "Could not allocate frame data.\n");
exit(1);
}
return picture;
}
static void open_video(AVFormatContext *oc, AVCodec *codec, OutputStream *ost, AVDictionary *opt_arg)
{
int ret;
AVCodecContext *c = ost->enc;
AVDictionary *opt = NULL;
av_dict_copy(&opt, opt_arg, 0);
/* open the codec */
ret = avcodec_open2(c, codec, &opt);
av_dict_free(&opt);
if (ret < 0) {
fprintf(stderr, "Could not open video codec: %s\n", av_err2str(ret));
exit(1);
}
/* allocate and init a re-usable frame */
ost->frame = alloc_picture(c->pix_fmt, c->width, c->height);
if (!ost->frame) {
fprintf(stderr, "Could not allocate video frame\n");
exit(1);
}
/* If the output format is not YUV420P, then a temporary YUV420P
* picture is needed too. It is then converted to the required
* output format. */
ost->tmp_frame = NULL;
if (c->pix_fmt != AV_PIX_FMT_YUV420P) {
ost->tmp_frame = alloc_picture(AV_PIX_FMT_YUV420P, c->width, c->height);
if (!ost->tmp_frame) {
fprintf(stderr, "Could not allocate temporary picture\n");
exit(1);
}
}
/* copy the stream parameters to the muxer */
ret = avcodec_parameters_from_context(ost->st->codecpar, c);
if (ret < 0) {
fprintf(stderr, "Could not copy the stream parameters\n");
exit(1);
}
}
/* Prepare a dummy image. */
static void fill_yuv_image(AVFrame *pict, int frame_index,
int width, int height)
{
int x, y, i;
i = frame_index;
/* Y */
for (y = 0; y < height; y++)
for (x = 0; x < width; x++)
pict->data[0][y * pict->linesize[0] + x] = x + y + i * 3;
/* Cb and Cr */
for (y = 0; y < height / 2; y++) {
for (x = 0; x < width / 2; x++) {
pict->data[1][y * pict->linesize[1] + x] = 128 + y + i * 2;
pict->data[2][y * pict->linesize[2] + x] = 64 + x + i * 5;
}
}
}
static AVFrame *get_video_frame(OutputStream *ost)
{
AVCodecContext *c = ost->enc;
/* check if we want to generate more frames */
if (av_compare_ts(ost->next_pts, c->time_base,
STREAM_DURATION, (AVRational){ 1, 1 }) >= 0)
return NULL;
/* when we pass a frame to the encoder, it may keep a reference to it
* internally; make sure we do not overwrite it here */
if (av_frame_make_writable(ost->frame) < 0)
exit(1);
if (c->pix_fmt != AV_PIX_FMT_YUV420P) {
/* as we only generate a YUV420P picture, we must convert it
* to the codec pixel format if needed */
if (!ost->sws_ctx) {
ost->sws_ctx = sws_getContext(c->width, c->height,
AV_PIX_FMT_YUV420P,
c->width, c->height,
c->pix_fmt,
SCALE_FLAGS, NULL, NULL, NULL);
if (!ost->sws_ctx) {
fprintf(stderr,
"Could not initialize the conversion context\n");
exit(1);
}
}
fill_yuv_image(ost->tmp_frame, ost->next_pts, c->width, c->height);
sws_scale(ost->sws_ctx,
(const uint8_t * const *)ost->tmp_frame->data, ost->tmp_frame->linesize,
0, c->height, ost->frame->data, ost->frame->linesize);
} else {
fill_yuv_image(ost->frame, ost->next_pts, c->width, c->height);
}
ost->frame->pts = ost->next_pts++;
return ost->frame;
}
/*
* encode one video frame and send it to the muxer
* return 1 when encoding is finished, 0 otherwise
*/
static int write_video_frame(AVFormatContext *oc, OutputStream *ost)
{
int ret;
AVCodecContext *c;
AVFrame *frame;
int got_packet = 0;
AVPacket pkt = { 0 };
c = ost->enc;
frame = get_video_frame(ost);
av_init_packet(&pkt);
/* encode the image */
ret = avcodec_encode_video2(c, &pkt, frame, &got_packet);
if (ret < 0) {
fprintf(stderr, "Error encoding video frame: %s\n", av_err2str(ret));
exit(1);
}
if (got_packet) {
ret = write_frame(oc, &c->time_base, ost->st, &pkt);
} else {
ret = 0;
}
if (ret < 0) {
fprintf(stderr, "Error while writing video frame: %s\n", av_err2str(ret));
exit(1);
}
return (frame || got_packet) ? 0 : 1;
}
static void close_stream(AVFormatContext *oc, OutputStream *ost)
{
avcodec_free_context(&ost->enc);
av_frame_free(&ost->frame);
av_frame_free(&ost->tmp_frame);
sws_freeContext(ost->sws_ctx);
swr_free(&ost->swr_ctx);
}
/**************************************************************/
/* media file output */
int main(int argc, char **argv)
{
OutputStream video_st = { 0 }, audio_st = { 0 };
const char *filename;
AVOutputFormat *fmt;
AVFormatContext *oc;
AVCodec *audio_codec, *video_codec;
int ret;
int have_video = 0, have_audio = 0;
int encode_video = 0, encode_audio = 0;
AVDictionary *opt = NULL;
int i;
/* Initialize libavcodec, and register all codecs and formats. */
av_register_all();
avformat_network_init();
if (argc < 2) {
printf("usage: %s output_file\n"
"API example program to output a media file with libavformat.\n"
"This program generates a synthetic audio and video stream, encodes and\n"
"muxes them into a file named output_file.\n"
"The output format is automatically guessed according to the file extension.\n"
"Raw images can also be output by using '%%d' in the filename.\n"
"\n", argv[0]);
return 1;
}
filename = argv[1];
for (i = 2; i+1 < argc; i+=2) {
if (!strcmp(argv[i], "-flags") || !strcmp(argv[i], "-fflags"))
av_dict_set(&opt, argv[i]+1, argv[i+1], 0);
}
/* allocate the output media context */
avformat_alloc_output_context2(&oc, NULL, "rtsp", filename);
if (!oc) {
printf("Could not deduce output format from file extension: using MPEG.\n");
avformat_alloc_output_context2(&oc, NULL, "mpeg", filename);
}
if (!oc)
return 1;
fmt = oc->oformat;
/* Add the audio and video streams using the default format codecs
* and initialize the codecs. */
if (fmt->video_codec != AV_CODEC_ID_NONE) {
add_stream(&video_st, oc, &video_codec, fmt->video_codec);
have_video = 1;
encode_video = 1;
}
if (fmt->audio_codec != AV_CODEC_ID_NONE) {
add_stream(&audio_st, oc, &audio_codec, fmt->audio_codec);
have_audio = 1;
encode_audio = 1;
}
/* Now that all the parameters are set, we can open the audio and
* video codecs and allocate the necessary encode buffers. */
if (have_video)
open_video(oc, video_codec, &video_st, opt);
if (have_audio)
open_audio(oc, audio_codec, &audio_st, opt);
av_dump_format(oc, 0, filename, 1);
/* open the output file, if needed */
if (!(fmt->flags & AVFMT_NOFILE)) {
ret = avio_open(&oc->pb, filename, AVIO_FLAG_WRITE);
if (ret < 0) {
fprintf(stderr, "Could not open '%s': %s\n", filename,
av_err2str(ret));
return 1;
}
}
/* Write the stream header, if any. */
ret = avformat_write_header(oc, &opt);
if (ret < 0) {
fprintf(stderr, "Error occurred when opening output file: %s\n",
av_err2str(ret));
return 1;
}
while (encode_video || encode_audio) {
/* select the stream to encode */
if (encode_video &&
(!encode_audio || av_compare_ts(video_st.next_pts, video_st.enc->time_base,
audio_st.next_pts, audio_st.enc->time_base) <= 0)) {
encode_video = !write_video_frame(oc, &video_st);
} else {
encode_audio = !write_audio_frame(oc, &audio_st);
}
}
/* Write the trailer, if any. The trailer must be written before you
* close the CodecContexts open when you wrote the header; otherwise
* av_write_trailer() may try to use memory that was freed on
* av_codec_close(). */
av_write_trailer(oc);
/* Close each codec. */
if (have_video)
close_stream(oc, &video_st);
if (have_audio)
close_stream(oc, &audio_st);
if (!(fmt->flags & AVFMT_NOFILE))
/* Close the output file. */
avio_closep(&oc->pb);
/* free the stream */
avformat_free_context(oc);
return 0;
}
After compiling it, I run the binary:
$ ./muxing rtsp://127.0.0.1/test
Output #0, rtsp, to 'rtsp://127.0.0.1/test':
Stream #0:0: Video: mpeg4, yuv420p, 352x288, q=2-31, 400 kb/s, 25 tbn
Stream #0:1: Audio: aac (LC), 44100 Hz, stereo, fltp, 64 kb/s
[tcp @ 0x2b9d220] Connection to tcp://127.0.0.1:554?timeout=0 failed: Connection refused
Error occurred when opening output file: Connection refused
But I am getting a Connection refused error.
I created this repository for an RTSP server using the ffserver code:
https://github.com/harshil1991/ffserver.git
Now I can integrate this source code into my existing repo.
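A note on the error itself: libavformat's "rtsp" muxer is an RTSP client, not a server; it tries to connect to an RTSP server at the given URL, so "Connection refused" just means nothing is listening on 127.0.0.1:554. One way to test the program without a separate media server (assuming an ffplay build with listen support) is to start a listener first and then point the binary at it:
$ ffplay -rtsp_flags listen rtsp://127.0.0.1:8554/test
$ ./muxing rtsp://127.0.0.1:8554/test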

PortAudio microphone capture, separate channel values

I'm using a microphone array (PlayStation Eye) with PortAudio. I'm attempting microphone array processing where I can know the level of each microphone and determine the direction of a sound using beamforming or interaural time delays. I'm having trouble determining which sound levels come from which channel.
Here are some code snippets, the first being the recordCallback.
static int recordCallback( const void *inputBuffer, void *outputBuffer,
unsigned long framesPerBuffer,
const PaStreamCallbackTimeInfo* timeInfo,
PaStreamCallbackFlags statusFlags,
void *userData )
{
paTestData *data = (paTestData*)userData;
const SAMPLE *rptr = (const SAMPLE*)inputBuffer;
SAMPLE *wptr = &data->recordedSamples[data->frameIndex * NUM_CHANNELS];
long framesToCalc;
long i;
int finished;
unsigned long framesLeft = data->maxFrameIndex - data->frameIndex;
(void) outputBuffer; /* Prevent unused variable warnings. */
(void) timeInfo;
(void) statusFlags;
(void) userData;
if( framesLeft < framesPerBuffer )
{
framesToCalc = framesLeft;
finished = paComplete;
}
else
{
framesToCalc = framesPerBuffer;
finished = paContinue;
}
if( inputBuffer == NULL )
{
for( i=0; i<framesToCalc; i++ )
{
*wptr++ = SAMPLE_SILENCE; /* 1 */
if( NUM_CHANNELS >= 2 ) *wptr++ = SAMPLE_SILENCE; /* 2 */
if( NUM_CHANNELS >= 3 ) *wptr++ = SAMPLE_SILENCE; /* 3 */
if( NUM_CHANNELS >= 4 ) *wptr++ = SAMPLE_SILENCE; /* 4 */
}
}
else
{
for( i=0; i<framesToCalc; i++ )
{
*wptr++ = *rptr++; /* 1 */
if( NUM_CHANNELS >= 2 ) *wptr++ = *rptr++; /* 2 */
if( NUM_CHANNELS >= 3 ) *wptr++ = *rptr++; /* 3 */
if( NUM_CHANNELS >= 4 ) *wptr++ = *rptr++; /* 4 */
}
}
data->frameIndex += framesToCalc;
return finished;
}
Here is the main method where I'm playing back the audio and trying to show channel averages.
int main(void)
{
PaStreamParameters inputParameters,
outputParameters;
PaStream* stream;
PaError err = paNoError;
paTestData data;
int i;
int totalFrames;
int numSamples;
int numBytes;
SAMPLE max, val;
double average;
printf("patest_record.c\n"); fflush(stdout);
data.maxFrameIndex = totalFrames = NUM_SECONDS * SAMPLE_RATE; /* Record for a few seconds. */
data.frameIndex = 0;
numSamples = totalFrames * NUM_CHANNELS;
numBytes = numSamples * sizeof(SAMPLE);
data.recordedSamples = (SAMPLE *) malloc( numBytes ); /* From now on, recordedSamples is initialised. */
if( data.recordedSamples == NULL )
{
printf("Could not allocate record array.\n");
goto done;
}
for( i=0; i<numSamples; i++ ) data.recordedSamples[i] = 0;
err = Pa_Initialize();
if( err != paNoError ) goto done;
inputParameters.device = 2; /* hard-coded input device index */
if (inputParameters.device == paNoDevice) {
fprintf(stderr,"Error: No default input device.\n");
goto done;
}
inputParameters.channelCount = 2; /* stereo input */
inputParameters.sampleFormat = PA_SAMPLE_TYPE;
inputParameters.suggestedLatency = Pa_GetDeviceInfo( inputParameters.device )->defaultLowInputLatency;
inputParameters.hostApiSpecificStreamInfo = NULL;
/* Record some audio. -------------------------------------------- */
err = Pa_OpenStream(
&stream,
&inputParameters,
NULL, /* &outputParameters, */
SAMPLE_RATE,
FRAMES_PER_BUFFER,
paClipOff, /* we won't output out of range samples so don't bother clipping them */
recordCallback,
&data );
if( err != paNoError ) goto done;
err = Pa_StartStream( stream );
if( err != paNoError ) goto done;
printf("\n=== Now recording!! Please speak into the microphone. ===\n"); fflush(stdout);
while( ( err = Pa_IsStreamActive( stream ) ) == 1 )
{
Pa_Sleep(1000);
printf("index = %d\n", data.frameIndex ); fflush(stdout);
printf("Channel = %d\n", data.currentChannel ); fflush(stdout);
}
if( err < 0 ) goto done;
err = Pa_CloseStream( stream );
if( err != paNoError ) goto done;
/* Measure maximum peak amplitude. */
/* average for each channel */
SAMPLE channel1val =0;
SAMPLE channel2val = 0;
SAMPLE channel3val =0;
SAMPLE channel4val = 0;
long channel1avg = 0.0;
long channel2avg =0.0;
long channel3avg =0.0;
long channel4avg =0.0;
SAMPLE channel1max = 0;
SAMPLE channel2max =0;
SAMPLE channel3max =0;
SAMPLE channel4max =0;
i = 0;
do
{
channel1val = data.recordedSamples[i];
if (channel1val < 0)
{
channel1val = -channel1val;
}
if (channel1val > channel1max)
{
channel1max = channel1val;
}
channel1avg += channel1val;
i = i + 4;
}
while (i<numSamples);
i = 1;
do
{
channel2val = data.recordedSamples[i];
if (channel2val < 0)
{
channel2val = -channel2val;
}
if (channel2val > channel2max)
{
channel2max = channel2val;
}
channel2avg += channel2val;
i = i + 4;
}
while (i<numSamples);
i = 2;
do
{
channel3val = data.recordedSamples[i];
if (channel3val < 0)
{
channel3val = -channel3val;
}
if (channel3val > channel3max)
{
channel3max = channel3val;
}
channel3avg += channel3val;
i = i + 4;
}
while (i<numSamples);
i = 3;
do
{
channel4val = data.recordedSamples[i];
if (channel4val < 0)
{
channel4val = -channel4val;
}
if (channel4val > channel4max)
{
channel4max = channel4val;
}
channel4avg += channel4val;
i = i + 4;
}
while (i<numSamples);
channel1avg = channel1avg / (double)numSamples;
channel2avg = channel2avg / (double)numSamples;
channel3avg = channel3avg / (double)numSamples;
channel4avg = channel4avg / (double)numSamples;
// printf("sample max amplitude = "PRINTF_S_FORMAT"\n", max );
// printf("sample average = %lf\n", average );
printf("channel1 max amplitude = "PRINTF_S_FORMAT"\n", channel1max);
printf("sample average = %lf\n", channel1avg);
printf("channel2 max amplitude = "PRINTF_S_FORMAT"\n", channel2max);
printf("sample average = %lf\n", channel2avg);
printf("channel3 max amplitude = "PRINTF_S_FORMAT"\n", channel3max);
printf("sample average = %lf\n", channel3avg);
printf("channel4 max amplitude = "PRINTF_S_FORMAT"\n", channel4max);
printf("sample average = %lf\n", channel4avg);
printf("/nPrinting out values/n");
for (int j=0; j<8; j++)
{
printf("Value: %lf\n", data.recordedSamples[j]);
}
/* Write recorded data to a file. */
#if WRITE_TO_FILE
{
FILE *fid;
fid = fopen("recorded.raw", "wb");
if( fid == NULL )
{
printf("Could not open file.");
}
else
{
fwrite( data.recordedSamples, NUM_CHANNELS * sizeof(SAMPLE), totalFrames, fid );
fclose( fid );
printf("Wrote data to 'recorded.raw'\n");
}
}
#endif
/* Playback recorded data. -------------------------------------------- */
data.frameIndex = 0;
outputParameters.device = Pa_GetDefaultOutputDevice(); /* default output device */
if (outputParameters.device == paNoDevice) {
fprintf(stderr,"Error: No default output device.\n");
goto done;
}
outputParameters.channelCount = 2; /* stereo output */
outputParameters.sampleFormat = PA_SAMPLE_TYPE;
outputParameters.suggestedLatency = Pa_GetDeviceInfo( outputParameters.device )->defaultLowOutputLatency;
outputParameters.hostApiSpecificStreamInfo = NULL;
printf("\n=== Now playing back. ===\n"); fflush(stdout);
err = Pa_OpenStream(
&stream,
NULL, /* no input */
&outputParameters,
SAMPLE_RATE,
FRAMES_PER_BUFFER,
paClipOff, /* we won't output out of range samples so don't bother clipping them */
playCallback,
&data );
if( err != paNoError ) goto done;
if( stream )
{
err = Pa_StartStream( stream );
if( err != paNoError ) goto done;
printf("Waiting for playback to finish.\n"); fflush(stdout);
while( ( err = Pa_IsStreamActive( stream ) ) == 1 ) Pa_Sleep(100);
if( err < 0 ) goto done;
err = Pa_CloseStream( stream );
if( err != paNoError ) goto done;
printf("Done.\n"); fflush(stdout);
}
done:
Pa_Terminate();
if( data.recordedSamples ) /* Sure it is NULL or valid. */
free( data.recordedSamples );
if( err != paNoError )
{
fprintf( stderr, "An error occured while using the portaudio stream\n" );
fprintf( stderr, "Error number: %d\n", err );
fprintf( stderr, "Error message: %s\n", Pa_GetErrorText( err ) );
err = 1; /* Always return 0 or 1, but no other return codes. */
}
return err;
}
If I run my code and blow into one microphone, I get the values below. When I play the sound back using my code it works fine and the sound is correct, but looking at the values output:
channel1 max amplitude = 1.00000000
sample average = 1.000000
channel2 max amplitude = 0.02542114
sample average = 0.025421
channel3 max amplitude = 1.00000000
sample average = 1.000000
channel4 max amplitude = 0.02627563
sample average = 0.026276
This clearly isn't correct, as it shows that two pairs of channels are almost identical. From what I understand, since it's capturing interleaved linear PCM, the samples should map to channels like:
SAMPLE
[
{Channel1}
{Channel2}
{Channel3}
{Channel4}
]
Now the problem is, when I blow into one microphone in Audacity (which uses the Core Audio drivers), one microphone clearly has a peak of 1 and the others are near silent.
So I don't understand what I'm doing wrong, any pointers?
It looks like you are only recording 2 channels:
inputParameters.channelCount = 2; /* stereo input */
This would explain why channels 1 and 3, and channels 2 and 4, measure the same level: with a stride of 4 over interleaved stereo data, indices 0 and 2 both land on the left channel and indices 1 and 3 both land on the right.
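A sketch of the corresponding fix, assuming the PS Eye really does expose all four capture channels to PortAudio (two other details in the posted code worth checking: the per-channel averages are stored in long variables, which drops the fraction, and each channel's sum is divided by numSamples even though only numSamples / NUM_CHANNELS samples contribute to it):
#define NUM_CHANNELS (4) /* all four microphones, not stereo */
...
inputParameters.channelCount = NUM_CHANNELS;
...
double channel1avg = 0.0; /* double so the average keeps its fraction */
...
channel1avg /= (double)( numSamples / NUM_CHANNELS ); /* per-channel sample count */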

C - Transcoding to UDP using FFmpeg?

I'm trying to use the FFmpeg libraries to take an existing video file and stream it over a UDP connection. Specifically, I've been looking at the muxing.c and demuxing.c example files in the doc/examples directory of the FFmpeg source. The demuxing file shows how to split an input video into its video and audio streams; the muxing file shows how to generate fake data, and it can already output to a UDP connection as I would like. I've begun combining the two: below is my code, which is basically a copy of the muxing file with some parts replaced or appended with parts of the demuxing file. Unfortunately, I'm running into plenty of complications with this approach. Is there an existing source code example which does the transcoding I'm looking for, or at least a tutorial on how one might create it? If not, a few pointers would help direct my work in combining the two files. Specifically, I'm getting the error:
[NULL @ 0x23b4040] Unable to find a suitable output format for 'udp://localhost:7777'
Could not deduce output format from file extension: using MPEG.
Output #0, mpeg, to 'udp://localhost:7777':
This happens even though the muxing file is able to output to UDP. Any suggestions? Thank you very much!
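One note on that specific error: a bare udp:// URL has no file extension for libavformat to guess a container from, which is why it falls back to MPEG. A minimal sketch of a fix is to name the muxer explicitly when allocating the output context (MPEG-TS is the usual container for raw UDP streaming):
avformat_alloc_output_context2(&oc, NULL, "mpegts", "udp://localhost:7777");
if (!oc) {
fprintf(stderr, "Could not allocate output context\n");
return 1;
}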
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <math.h>
#include <libavutil/mathematics.h>
#include <libavformat/avformat.h>
#include <libswscale/swscale.h>
/* 5 seconds stream duration */
#define STREAM_DURATION 200.0
#define STREAM_FRAME_RATE 25 /* 25 images/s */
#define STREAM_NB_FRAMES ((int)(STREAM_DURATION * STREAM_FRAME_RATE))
#define STREAM_PIX_FMT AV_PIX_FMT_YUV420P /* default pix_fmt */
//FROM DE
static AVFormatContext *fmt_ctx = NULL;
static AVCodecContext *video_dec_ctx = NULL, *audio_dec_ctx;
static AVStream *video_stream = NULL, *audio_stream = NULL;
static const char *src_filename = NULL;
static const char *video_dst_filename = NULL;
static const char *audio_dst_filename = NULL;
static FILE *video_dst_file = NULL;
static FILE *audio_dst_file = NULL;
static uint8_t *video_dst_data[4] = {NULL};
static int video_dst_linesize[4];
static int video_dst_bufsize;
static uint8_t **audio_dst_data = NULL;
static int audio_dst_linesize;
static int audio_dst_bufsize;
static int video_stream_idx = -1, audio_stream_idx = -1;
static AVFrame *frame = NULL;
static AVPacket pkt;
static int video_frame_count = 0;
static int audio_frame_count = 0;
//END DE
static int sws_flags = SWS_BICUBIC;
/* Add an output stream. */
static AVStream *add_stream(AVFormatContext *oc, AVCodec **codec,
enum AVCodecID codec_id)
{
AVCodecContext *c;
AVStream *st;
/* find the encoder */
*codec = avcodec_find_encoder(codec_id);
if (!(*codec)) {
fprintf(stderr, "Could not find encoder for '%s'\n",
avcodec_get_name(codec_id));
exit(1);
}
st = avformat_new_stream(oc, *codec);
if (!st) {
fprintf(stderr, "Could not allocate stream\n");
exit(1);
}
st->id = oc->nb_streams-1;
c = st->codec;
switch ((*codec)->type) {
case AVMEDIA_TYPE_AUDIO:
st->id = 1;
c->sample_fmt = AV_SAMPLE_FMT_S16;
c->bit_rate = 64000;
c->sample_rate = 44100;
c->channels = 2;
break;
case AVMEDIA_TYPE_VIDEO:
c->codec_id = codec_id;
c->bit_rate = 400000;
/* Resolution must be a multiple of two. */
c->width = 352;
c->height = 288;
/* timebase: This is the fundamental unit of time (in seconds) in terms
* of which frame timestamps are represented. For fixed-fps content,
* timebase should be 1/framerate and timestamp increments should be
* identical to 1. */
c->time_base.den = STREAM_FRAME_RATE;
c->time_base.num = 1;
c->gop_size = 12; /* emit one intra frame every twelve frames at most */
c->pix_fmt = STREAM_PIX_FMT;
if (c->codec_id == AV_CODEC_ID_MPEG2VIDEO) {
/* just for testing, we also add B frames */
c->max_b_frames = 2;
}
if (c->codec_id == AV_CODEC_ID_MPEG1VIDEO) {
/* Needed to avoid using macroblocks in which some coeffs overflow.
* This does not happen with normal video, it just happens here as
* the motion of the chroma plane does not match the luma plane. */
c->mb_decision = 2;
}
break;
default:
break;
}
/* Some formats want stream headers to be separate. */
if (oc->oformat->flags & AVFMT_GLOBALHEADER)
c->flags |= CODEC_FLAG_GLOBAL_HEADER;
return st;
}
/**************************************************************/
/* audio output */
static float t, tincr, tincr2;
static int16_t *samples;
static int audio_input_frame_size;
static void open_audio(AVFormatContext *oc, AVCodec *codec, AVStream *st)
{
AVCodecContext *c;
int ret;
c = st->codec;
/* open it */
ret = avcodec_open2(c, codec, NULL);
if (ret < 0) {
fprintf(stderr, "Could not open audio codec: %s\n", av_err2str(ret));
exit(1);
}
/* init signal generator */
t = 0;
tincr = 2 * M_PI * 110.0 / c->sample_rate;
/* increment frequency by 110 Hz per second */
tincr2 = 2 * M_PI * 110.0 / c->sample_rate / c->sample_rate;
if (c->codec->capabilities & CODEC_CAP_VARIABLE_FRAME_SIZE)
audio_input_frame_size = 10000;
else
audio_input_frame_size = c->frame_size;
samples = av_malloc(audio_input_frame_size *
av_get_bytes_per_sample(c->sample_fmt) *
c->channels);
if (!samples) {
fprintf(stderr, "Could not allocate audio samples buffer\n");
exit(1);
}
}
/* Prepare a 16 bit dummy audio frame of 'frame_size' samples and
* 'nb_channels' channels. */
static void get_audio_frame(int16_t *samples, int frame_size, int nb_channels)
{
int j, i, v;
int16_t *q;
q = samples;
for (j = 0; j < frame_size; j++) {
v = (int)(sin(t) * 10000);
for (i = 0; i < nb_channels; i++)
*q++ = v;
t += tincr;
tincr += tincr2;
}
}
static void write_audio_frame(AVFormatContext *oc, AVStream *st)
{
AVCodecContext *c;
AVPacket pkt = { 0 }; // data and size must be 0;
AVFrame *frame = avcodec_alloc_frame();
int got_packet, ret;
av_init_packet(&pkt);
c = st->codec;
get_audio_frame(samples, audio_input_frame_size, c->channels);
frame->nb_samples = audio_input_frame_size;
avcodec_fill_audio_frame(frame, c->channels, c->sample_fmt,
(uint8_t *)samples,
audio_input_frame_size *
av_get_bytes_per_sample(c->sample_fmt) *
c->channels, 1);
ret = avcodec_encode_audio2(c, &pkt, frame, &got_packet);
if (ret < 0) {
fprintf(stderr, "Error encoding audio frame: %s\n", av_err2str(ret));
exit(1);
}
if (!got_packet)
return;
pkt.stream_index = st->index;
/* Write the compressed frame to the media file. */
ret = av_interleaved_write_frame(oc, &pkt);
if (ret != 0) {
fprintf(stderr, "Error while writing audio frame: %s\n",
av_err2str(ret));
exit(1);
}
avcodec_free_frame(&frame);
}
static void close_audio(AVFormatContext *oc, AVStream *st)
{
avcodec_close(st->codec);
av_free(samples);
}
/**************************************************************/
/* video output */
static AVFrame *frame;
static AVPicture src_picture, dst_picture;
static int frame_count;
static void open_video(AVFormatContext *oc, AVCodec *codec, AVStream *st)
{
int ret;
AVCodecContext *c = st->codec;
/* open the codec */
ret = avcodec_open2(c, codec, NULL);
if (ret < 0) {
fprintf(stderr, "Could not open video codec: %s\n", av_err2str(ret));
exit(1);
}
/* allocate and init a re-usable frame */
frame = avcodec_alloc_frame();
if (!frame) {
fprintf(stderr, "Could not allocate video frame\n");
exit(1);
}
/* Allocate the encoded raw picture. */
ret = avpicture_alloc(&dst_picture, c->pix_fmt, c->width, c->height);
if (ret < 0) {
fprintf(stderr, "Could not allocate picture: %s\n", av_err2str(ret));
exit(1);
}
/* If the output format is not YUV420P, then a temporary YUV420P
* picture is needed too. It is then converted to the required
* output format. */
if (c->pix_fmt != AV_PIX_FMT_YUV420P) {
ret = avpicture_alloc(&src_picture, AV_PIX_FMT_YUV420P, c->width, c->height);
if (ret < 0) {
fprintf(stderr, "Could not allocate temporary picture: %s\n",
av_err2str(ret));
exit(1);
}
}
/* copy data and linesize picture pointers to frame */
*((AVPicture *)frame) = dst_picture;
}
/* Prepare a dummy image. */
static void fill_yuv_image(AVPicture *pict, int frame_index,
int width, int height)
{
int x, y, i;
i = frame_index;
/* Y */
for (y = 0; y < height; y++)
for (x = 0; x < width; x++)
pict->data[0][y * pict->linesize[0] + x] = x + y + i * 3;
/* Cb and Cr */
for (y = 0; y < height / 2; y++) {
for (x = 0; x < width / 2; x++) {
pict->data[1][y * pict->linesize[1] + x] = 128 + y + i * 2;
pict->data[2][y * pict->linesize[2] + x] = 64 + x + i * 5;
}
}
}
static void write_video_frame(AVFormatContext *oc, AVStream *st)
{
int ret;
static struct SwsContext *sws_ctx;
AVCodecContext *c = st->codec;
if (frame_count >= STREAM_NB_FRAMES) {
/* No more frames to compress. The codec has a latency of a few
* frames if using B-frames, so we get the last frames by
* passing the same picture again. */
} else {
if (c->pix_fmt != AV_PIX_FMT_YUV420P) {
/* as we only generate a YUV420P picture, we must convert it
* to the codec pixel format if needed */
if (!sws_ctx) {
sws_ctx = sws_getContext(c->width, c->height, AV_PIX_FMT_YUV420P,
c->width, c->height, c->pix_fmt,
sws_flags, NULL, NULL, NULL);
if (!sws_ctx) {
fprintf(stderr,
"Could not initialize the conversion context\n");
exit(1);
}
}
fill_yuv_image(&src_picture, frame_count, c->width, c->height);
sws_scale(sws_ctx,
(const uint8_t * const *)src_picture.data, src_picture.linesize,
0, c->height, dst_picture.data, dst_picture.linesize);
} else {
fill_yuv_image(&dst_picture, frame_count, c->width, c->height);
}
}
if (oc->oformat->flags & AVFMT_RAWPICTURE) {
/* Raw video case - directly store the picture in the packet */
AVPacket pkt;
av_init_packet(&pkt);
pkt.flags |= AV_PKT_FLAG_KEY;
pkt.stream_index = st->index;
pkt.data = dst_picture.data[0];
pkt.size = sizeof(AVPicture);
ret = av_interleaved_write_frame(oc, &pkt);
} else {
AVPacket pkt = { 0 };
int got_packet;
av_init_packet(&pkt);
/* encode the image */
ret = avcodec_encode_video2(c, &pkt, frame, &got_packet);
if (ret < 0) {
fprintf(stderr, "Error encoding video frame: %s\n", av_err2str(ret));
exit(1);
}
/* If size is zero, it means the image was buffered. */
if (!ret && got_packet && pkt.size) {
pkt.stream_index = st->index;
/* Write the compressed frame to the media file. */
ret = av_interleaved_write_frame(oc, &pkt);
} else {
ret = 0;
}
}
if (ret != 0) {
fprintf(stderr, "Error while writing video frame: %s\n", av_err2str(ret));
exit(1);
}
frame_count++;
}
static void close_video(AVFormatContext *oc, AVStream *st)
{
avcodec_close(st->codec);
av_free(src_picture.data[0]);
av_free(dst_picture.data[0]);
avcodec_free_frame(&frame); /* also NULLs the pointer, avoiding a double free later */
}
static int open_codec_context(int *stream_idx,
AVFormatContext *fmt_ctx, enum AVMediaType type)
{
int ret;
AVStream *st;
AVCodecContext *dec_ctx = NULL;
AVCodec *dec = NULL;
ret = av_find_best_stream(fmt_ctx, type, -1, -1, NULL, 0);
if (ret < 0) {
fprintf(stderr, "Could not find %s stream in input file '%s'\n",
av_get_media_type_string(type), src_filename);
return ret;
} else {
*stream_idx = ret;
st = fmt_ctx->streams[*stream_idx];
/* find decoder for the stream */
dec_ctx = st->codec;
dec = avcodec_find_decoder(dec_ctx->codec_id);
if (!dec) {
fprintf(stderr, "Failed to find %s codec\n",
av_get_media_type_string(type));
return AVERROR(EINVAL); /* ret still holds the stream index here, which is >= 0 */
}
if ((ret = avcodec_open2(dec_ctx, dec, NULL)) < 0) {
fprintf(stderr, "Failed to open %s codec\n",
av_get_media_type_string(type));
return ret;
}
}
return 0;
}
/**************************************************************/
/* media file output */
int main(int argc, char **argv)
{
const char *filename;
AVOutputFormat *fmt;
AVFormatContext *oc;
AVStream *audio_stream = NULL, *video_stream = NULL;
AVCodec *audio_codec, *video_codec;
double audio_pts, video_pts;
int ret = 0, got_frame;
/* Initialize libavcodec, and register all codecs and formats. */
av_register_all();
if (argc != 3) {
printf("usage: %s input_file output_file\n"
"\n", argv[0]);
return 1;
}
src_filename = argv[1];
filename = argv[2];
/* allocate the output media context */
avformat_alloc_output_context2(&oc, NULL, NULL, filename);
if (!oc) {
printf("Could not deduce output format from file extension: using MPEG.\n");
avformat_alloc_output_context2(&oc, NULL, "mpeg", filename);
}
if (!oc) {
return 1;
}
fmt = oc->oformat;
/* Add the audio and video streams using the default format codecs
* and initialize the codecs. */
video_stream = NULL;
audio_stream = NULL;
//FROM DE
/* open input file, and allocate format context */
if (avformat_open_input(&fmt_ctx, src_filename, NULL, NULL) < 0) {
fprintf(stderr, "Could not open source file %s\n", src_filename);
exit(1);
}
/* retrieve stream information */
if (avformat_find_stream_info(fmt_ctx, NULL) < 0) {
fprintf(stderr, "Could not find stream information\n");
exit(1);
}
if (open_codec_context(&video_stream_idx, fmt_ctx, AVMEDIA_TYPE_VIDEO) >= 0) {
video_stream = fmt_ctx->streams[video_stream_idx];
video_dec_ctx = video_stream->codec;
/* allocate image where the decoded image will be put */
ret = av_image_alloc(video_dst_data, video_dst_linesize,
video_dec_ctx->width, video_dec_ctx->height,
video_dec_ctx->pix_fmt, 1);
if (ret < 0) {
fprintf(stderr, "Could not allocate raw video buffer\n");
goto end;
}
video_dst_bufsize = ret;
}
if (open_codec_context(&audio_stream_idx, fmt_ctx, AVMEDIA_TYPE_AUDIO) >= 0) {
int nb_planes;
audio_stream = fmt_ctx->streams[audio_stream_idx];
audio_dec_ctx = audio_stream->codec;
nb_planes = av_sample_fmt_is_planar(audio_dec_ctx->sample_fmt) ?
audio_dec_ctx->channels : 1;
audio_dst_data = av_mallocz(sizeof(uint8_t *) * nb_planes);
if (!audio_dst_data) {
fprintf(stderr, "Could not allocate audio data buffers\n");
ret = AVERROR(ENOMEM);
goto end;
}
}
//END DE
/* Now that all the parameters are set, we can open the audio and
* video codecs and allocate the necessary encode buffers. */
if (video_stream)
open_video(oc, video_codec, video_stream);
if (audio_stream)
open_audio(oc, audio_codec, audio_stream);
av_dump_format(oc, 0, filename, 1);
/* open the output file, if needed */
if (!(fmt->flags & AVFMT_NOFILE)) {
ret = avio_open(&oc->pb, filename, AVIO_FLAG_WRITE);
if (ret < 0) {
fprintf(stderr, "Could not open '%s': %s\n", filename,
av_err2str(ret));
return 1;
}
}
/* Write the stream header, if any. */
ret = avformat_write_header(oc, NULL);
if (ret < 0) {
fprintf(stderr, "Error occurred when opening output file: %s\n",
av_err2str(ret));
return 1;
}
if (frame)
frame->pts = 0;
for (;;) {
/* Compute current audio and video time. */
if (audio_stream)
audio_pts = (double)audio_stream->pts.val * audio_stream->time_base.num / audio_stream->time_base.den;
else
audio_pts = 0.0;
if (video_stream)
video_pts = (double)video_stream->pts.val * video_stream->time_base.num /
video_stream->time_base.den;
else
video_pts = 0.0;
if ((!audio_stream || audio_pts >= STREAM_DURATION) &&
(!video_stream || video_pts >= STREAM_DURATION))
break;
/* write interleaved audio and video frames */
if (!video_stream || (audio_stream && audio_pts < video_pts)) {
write_audio_frame(oc, audio_stream);
} else {
write_video_frame(oc, video_stream);
frame->pts += av_rescale_q(1, video_stream->codec->time_base, video_stream->time_base);
}
}
/* Write the trailer, if any. The trailer must be written before you
* close the CodecContexts open when you wrote the header; otherwise
* av_write_trailer() may try to use memory that was freed on
* av_codec_close(). */
av_write_trailer(oc);
/* Close each codec. */
if (video_stream)
close_video(oc, video_stream);
if (audio_stream)
close_audio(oc, audio_stream);
if (!(fmt->flags & AVFMT_NOFILE))
/* Close the output file. */
avio_close(oc->pb);
/* free the stream */
avformat_free_context(oc);
end:
if (video_dec_ctx)
avcodec_close(video_dec_ctx);
if (audio_dec_ctx)
avcodec_close(audio_dec_ctx);
avformat_close_input(&fmt_ctx);
if (video_dst_file)
fclose(video_dst_file);
if (audio_dst_file)
fclose(audio_dst_file);
av_free(frame);
av_free(video_dst_data[0]);
av_free(audio_dst_data);
return 0;
}
The trouble you are facing is that UDP is not a multimedia format; it is a network protocol. FFmpeg converts to multimedia formats and can send its output to various destinations, such as disk (the most common case) or, as here, the network. So you must decide which format you want your content remuxed into before you send it out over UDP. From the command line this is done with the -f format flag. Streamable formats include MPEG transport stream and WebM among containers, plus raw elementary streams. You may also want to use RTP for streaming over UDP; otherwise the receiver will have to deal with missing packets and out-of-order arrival.
Given your current state, I suggest you first achieve what you want from the command line itself. That will give you an understanding of the various steps you must perform; then you can move them into the program.
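As a hedged illustration (the input file, address, and port are placeholders of mine, not values from the question), remuxing a file into an MPEG transport stream and pushing it out over UDP from the command line might look like:

ffmpeg -re -i input.avi -c copy -f mpegts udp://127.0.0.1:1234

Here -f mpegts selects the streamable container and -re throttles reading to the input's native frame rate, which matters when feeding a live receiver.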

Record RTSP stream with FFmpeg libavformat

I'm trying to record an RTSP stream from an Axis camera with FFmpeg libavformat.
I can grab video from files and then save it to another file; this works OK. But the camera sends strange data: the FPS is 100 and the camera sends only every 4th frame, so the resulting FPS is about 25. But libavformat sets the packets' dts/pts assuming 90000 fps (the default?) and the new file's stream is 100 fps. The result is a one-hour video with only 100 frames.
Here is my code:
#include <stdio.h>
#include <stdlib.h>
#include <libavcodec/avcodec.h>
#include <libavformat/avformat.h>
#include <libavformat/avio.h>
int main(int argc, char** argv) {
AVFormatContext* context = avformat_alloc_context();
int video_stream_index;
av_register_all();
avcodec_register_all();
avformat_network_init();
//open rtsp
if(avformat_open_input(&context, "rtsp://195.200.199.8/mpeg4/media.amp",NULL,NULL) != 0){
return EXIT_FAILURE;
}
if(avformat_find_stream_info(context,NULL) < 0){
return EXIT_FAILURE;
}
//search video stream
for(int i =0;i<context->nb_streams;i++){
if(context->streams[i]->codec->codec_type == AVMEDIA_TYPE_VIDEO)
video_stream_index = i;
}
AVPacket packet;
av_init_packet(&packet);
//open output file
AVOutputFormat* fmt = av_guess_format(NULL,"test.avi",NULL);
AVFormatContext* oc = avformat_alloc_context();
oc->oformat = fmt;
avio_open2(&oc->pb, "test.avi", AVIO_FLAG_WRITE,NULL,NULL);
AVStream* stream=NULL;
int cnt = 0;
//start reading packets from stream and write them to file
av_read_play(context);//play RTSP
while(av_read_frame(context,&packet)>=0 && cnt <100){//read 100 frames
if(packet.stream_index == video_stream_index){//packet is video
if(stream == NULL){//create stream in file
stream = avformat_new_stream(oc,context->streams[video_stream_index]->codec->codec);
avcodec_copy_context(stream->codec,context->streams[video_stream_index]->codec);
stream->sample_aspect_ratio = context->streams[video_stream_index]->codec->sample_aspect_ratio;
avformat_write_header(oc,NULL);
}
packet.stream_index = stream->id;
av_write_frame(oc,&packet);
cnt++;
}
av_free_packet(&packet);
av_init_packet(&packet);
}
av_read_pause(context);
av_write_trailer(oc);
avio_close(oc->pb);
avformat_free_context(oc);
return (EXIT_SUCCESS);
}
The resulting file is here: http://dl.dropbox.com/u/1243577/test.avi
Thanks for any advice.
Here's how I do it. What I found was that when receiving H.264 the frame rate in the stream is not correct: it sends a 1/90000 timebase. I skip initializing the new stream from the incoming stream and just copy certain parameters over. The incoming r_frame_rate should be accurate provided the stream analysis (max_analyze_duration) has worked correctly.
#include <stdio.h>
#include <stdlib.h>
#include <libavcodec/avcodec.h>
#include <libavformat/avformat.h>
#include <libavformat/avio.h>
#include <sys/time.h>
time_t get_time()
{
struct timeval tv;
gettimeofday( &tv, NULL );
return tv.tv_sec;
}
int main( int argc, char* argv[] )
{
AVFormatContext *ifcx = NULL;
AVInputFormat *ifmt;
AVCodecContext *iccx;
AVCodec *icodec;
AVStream *ist;
int i_index;
time_t timenow, timestart;
int got_key_frame = 0;
AVFormatContext *ofcx;
AVOutputFormat *ofmt;
AVCodecContext *occx;
AVCodec *ocodec;
AVStream *ost;
int o_index;
AVPacket pkt;
int ix;
const char *sProg = argv[ 0 ];
const char *sFileInput;
const char *sFileOutput;
int bRunTime;
if ( argc != 4 ) {
printf( "Usage: %s url outfile runtime\n", sProg );
return EXIT_FAILURE;
}
sFileInput = argv[ 1 ];
sFileOutput = argv[ 2 ];
bRunTime = atoi( argv[ 3 ] );
// Initialize library
av_log_set_level( AV_LOG_DEBUG );
av_register_all();
avcodec_register_all();
avformat_network_init();
//
// Input
//
//open rtsp
if ( avformat_open_input( &ifcx, sFileInput, NULL, NULL) != 0 ) {
printf( "ERROR: Cannot open input file\n" );
return EXIT_FAILURE;
}
if ( avformat_find_stream_info( ifcx, NULL ) < 0 ) {
printf( "ERROR: Cannot find stream info\n" );
avformat_close_input( &ifcx );
return EXIT_FAILURE;
}
snprintf( ifcx->filename, sizeof( ifcx->filename ), "%s", sFileInput );
//search video stream
i_index = -1;
for ( ix = 0; ix < ifcx->nb_streams; ix++ ) {
iccx = ifcx->streams[ ix ]->codec;
if ( iccx->codec_type == AVMEDIA_TYPE_VIDEO ) {
ist = ifcx->streams[ ix ];
i_index = ix;
break;
}
}
if ( i_index < 0 ) {
printf( "ERROR: Cannot find input video stream\n" );
avformat_close_input( &ifcx );
return EXIT_FAILURE;
}
//
// Output
//
//open output file
ofmt = av_guess_format( NULL, sFileOutput, NULL );
ofcx = avformat_alloc_context();
ofcx->oformat = ofmt;
avio_open2( &ofcx->pb, sFileOutput, AVIO_FLAG_WRITE, NULL, NULL );
// Create output stream
//ost = avformat_new_stream( ofcx, (AVCodec *) iccx->codec );
ost = avformat_new_stream( ofcx, NULL );
avcodec_copy_context( ost->codec, iccx );
ost->sample_aspect_ratio.num = iccx->sample_aspect_ratio.num;
ost->sample_aspect_ratio.den = iccx->sample_aspect_ratio.den;
// Assume r_frame_rate is accurate
ost->r_frame_rate = ist->r_frame_rate;
ost->avg_frame_rate = ost->r_frame_rate;
ost->time_base = av_inv_q( ost->r_frame_rate );
ost->codec->time_base = ost->time_base;
avformat_write_header( ofcx, NULL );
snprintf( ofcx->filename, sizeof( ofcx->filename ), "%s", sFileOutput );
//start reading packets from stream and write them to file
av_dump_format( ifcx, 0, ifcx->filename, 0 );
av_dump_format( ofcx, 0, ofcx->filename, 1 );
timestart = timenow = get_time();
ix = 0;
//av_read_play(context);//play RTSP (Shouldn't need this since it defaults to playing on connect)
av_init_packet( &pkt );
while ( av_read_frame( ifcx, &pkt ) >= 0 && timenow - timestart <= bRunTime ) {
if ( pkt.stream_index == i_index ) { //packet is video
// Make sure we start on a key frame
if ( timestart == timenow && ! ( pkt.flags & AV_PKT_FLAG_KEY ) ) {
timestart = timenow = get_time();
continue;
}
got_key_frame = 1;
pkt.stream_index = ost->id;
pkt.pts = ix++;
pkt.dts = pkt.pts;
av_interleaved_write_frame( ofcx, &pkt );
}
av_free_packet( &pkt );
av_init_packet( &pkt );
timenow = get_time();
}
av_read_pause( ifcx );
av_write_trailer( ofcx );
avio_close( ofcx->pb );
avformat_free_context( ofcx );
avformat_network_deinit();
return EXIT_SUCCESS;
}
I don't think you're supposed to just increment the PTS value like that. It might work in rare cases where the time base happens to be right, but it won't work in the general case.
You should change this:
pkt.pts = ix++;
pkt.dts = pkt.pts;
To this:
pkt.pts = av_rescale_q(pkt.pts, ifcx->streams[0]->codec->time_base, ofcx->streams[0]->time_base);
pkt.dts = av_rescale_q(pkt.dts, ifcx->streams[0]->codec->time_base, ofcx->streams[0]->time_base);
What that does is convert the packet's PTS/DTS from the units used in the input stream's codec to the units of the output stream.
Also, some streams have multiple ticks per frame, so if the video runs at double speed you might need to do this right below the lines above:
pkt.pts *= ifcx->streams[0]->codec->ticks_per_frame;
pkt.dts *= ifcx->streams[0]->codec->ticks_per_frame;
In my experience with a modern H.264 encoder, the duration returned by ffmpeg is only a "suggestion", and there is some "jitter" in the PTS values. The only accurate way to determine the frame rate or duration is to measure it yourself from the PTS values.
For an H.264 encoder running at 30 fps, the duration is always reported as 3000/90000, while the measured duration is usually within +/- 1 tick but periodically jumps, say to 3000+25 on one frame and 3000-25 on the next. I smooth this out for recording by noticing adjacent frames with opposite deviations and adjusting the PTS of the second frame while preserving the total duration.
This gives me a stream with an occasional (calculated) duration of 3001 or 2999, reflecting clock drift.
When recording a 29.97 fps stream, av_read_frame() always returns a duration of 3000, while the nominal calculated duration is 3003 (correct for 29.97) with the same jitter and drift described above.
In my case, I just built a state machine to clean up the timing; a sketch of the idea follows. Hoping this helps someone.
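As a minimal sketch of that pairing idea (not the poster's actual state machine; the 90 kHz timebase, the 30 fps nominal duration, and the one-frame delay are all assumptions of mine):

#include <stdint.h>

#define NOMINAL_DURATION 3000 /* ticks per frame at 30 fps in a 90 kHz timebase (assumed) */

/* Hypothetical one-frame-delayed smoother: buffer the previous PTS and,
 * when the durations on either side of it deviate in opposite directions,
 * snap it back onto the nominal grid, preserving the pair's total
 * duration. Returns the smoothed PTS of the *previous* frame, or -1 on
 * the first call when nothing can be emitted yet. */
static int64_t smooth_pts(int64_t pts)
{
    static int64_t p0 = -1, p1 = -1; /* last two PTS values seen */
    int64_t e1, e2, out;

    if (p0 < 0) { p0 = pts; return -1; }
    if (p1 < 0) { p1 = pts; return p0; }

    e1 = (p1 - p0) - NOMINAL_DURATION; /* deviation of the duration ending at p1 */
    e2 = (pts - p1) - NOMINAL_DURATION; /* deviation of the duration starting at p1 */
    if (e1 != 0 && e2 != 0 && ((e1 > 0) != (e2 > 0)))
        p1 = p0 + NOMINAL_DURATION; /* opposite deviations: pull p1 onto the grid */

    out = p1; /* emit the (possibly corrected) middle timestamp */
    p0 = p1;
    p1 = pts;
    return out;
}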
I was recently doing the same thing. My FPS was half of what the camera sent. The reason was the AVStream->codec->ticks_per_frame field, which was set to 2. My source was progressive; if yours is interlaced, that might be the reason for yet another factor of 2, giving a 4x difference in FPS. A small check for this is sketched below.
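A minimal sketch of that check, using the same deprecated AVStream->codec accessor as the rest of this thread (ist, the input stream, is a placeholder of mine):

/* With ticks_per_frame == 2 the codec-level time_base ticks twice per
 * frame, so the naive 1/time_base figure is double the real FPS. */
AVCodecContext *cc = ist->codec; /* ist: the input AVStream */
double tick_rate = 1.0 / av_q2d(cc->time_base);
double real_fps = tick_rate / (cc->ticks_per_frame > 0 ? cc->ticks_per_frame : 1);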
90000 Hz is the default timebase for video streams sent via RTSP. The timebase differs from FPS in resolution: for instance, a frame with timestamp 30000 will be shown at 1/3 s if the timebase is 90000 Hz. The timebase should be put into the AVStream structure during output, but the AVFormatContext should carry the real FPS value.
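To make the timebase/FPS distinction concrete, here is a small helper of my own (not from the thread's code) that converts a timestamp to seconds using only the stream's timebase:

#include <libavformat/avformat.h>

/* With time_base = 1/90000, pts = 30000 yields 1/3 s, independent of
 * the stream's actual frame rate. */
static double pts_to_seconds(const AVStream *st, int64_t pts)
{
    return pts * av_q2d(st->time_base);
}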

Resources