I'm trying to use libavcodec (ffmpeg) to encode raw pixel data to MP4 format. Everything goes well and I'm getting an .avi file with decent quality, but sometimes the codec gives an "encoded frame too large" warning. Whenever it does, part of some frames (usually the bottom portion) looks garbled or mixed up. Can anyone tell me when this warning is given? These are the settings I'm using for the encoder:
qmax = 6;
qmin = 2;
bit_rate = 200000; // if I increase this, I get more warnings.
width = 1360;
height = 768;
time_base.den = 15; // frames per second
time_base.num = 1;
gop_size = 48;
pix_fmt = PIX_FMT_YUV420P;
From what I can gather, ffmpeg allocates a constant 2 MB buffer to hold a compressed frame. A 1080p frame, for example, is about 3 MB uncompressed, and the codec can't always compress a large frame into less than 2 MB.
You can possibly fix this by increasing the buffer size and/or making it dynamic.
Very probably the codec's buffer is not big enough. Try changing rc_buffer_size. Alternatively, you can try these settings:
ctx->bit_rate = 500000;
ctx->bit_rate_tolerance = 0;
ctx->rc_max_rate = 0;
ctx->rc_buffer_size = 0;
ctx->gop_size = 40;
ctx->max_b_frames = 3;
ctx->b_frame_strategy = 1;
ctx->coder_type = 1;
ctx->me_cmp = 1;
ctx->me_range = 16;
ctx->qmin = 10;
ctx->qmax = 51;
ctx->scenechange_threshold = 40;
ctx->me_method = ME_HEX;
ctx->me_subpel_quality = 5;
ctx->i_quant_factor = 0.71;
ctx->qcompress = 0.6;
ctx->max_qdiff = 4;
ctx->directpred = 1;
ctx->flags2 |= CODEC_FLAG2_FASTPSKIP;
In the example code I found something like:
outbuf_size = 100000;
outbuf = malloc(outbuf_size);
out_size = avcodec_encode_video(c, outbuf, outbuf_size, picture);
Pushing outbuf_size to be larger resolved the issue.
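To put rough numbers on the buffer discussion above, here is a minimal sketch. The helper names and the 16384-byte headroom are illustrative assumptions, not part of the libavcodec API; the point is only that the output buffer can be sized from the frame dimensions instead of being hard-coded, and that the average per-frame budget implied by bit_rate is very small compared to a raw frame.

```c
#include <stddef.h>

/* Hypothetical helper: bytes in one raw YUV420P frame
 * (full-resolution luma plus two quarter-resolution chroma planes). */
size_t raw_yuv420p_size(int width, int height)
{
    return (size_t)width * (size_t)height * 3 / 2;
}

/* Hypothetical helper: a conservative output buffer size, taken as the
 * raw frame size plus some headroom. The headroom value is an assumption. */
size_t encoder_outbuf_size(int width, int height)
{
    return raw_yuv420p_size(width, height) + 16384;
}

/* Average number of bytes per frame implied by a bit rate (bits per second)
 * and a frame rate of fps_num / fps_den. */
long avg_bytes_per_frame(long bit_rate, int fps_num, int fps_den)
{
    return bit_rate * fps_den / (8L * fps_num);
}
```

With the settings from the question (1360 x 768, 200000 bit/s, 15 fps) a raw frame is 1566720 bytes and the average budget is about 1666 bytes per frame, while the example's outbuf_size of 100000 sits in between. A single large frame can therefore exceed a small fixed buffer even when the average bit rate looks modest.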
I have run into a strange new problem. I have set up two framebuffer addresses in my SDRAM: SDRAM_START_ADR and SDRAM_START_ADR2.
On the STM32F4 the two LTDC layers are independent. I previously wrote a project that showed two images: the first at 800 * 480, my LCD resolution, and the second smaller than the first. That ran fine.
In that project the LTDC read both images directly from flash.
Now I want to store the data in external SDRAM and have the LTDC read it directly from there, so I configured the two layers described above.
The problem appears when I display both buffers at the same time. As a test, I drew a filled rectangle in a different color in each layer's buffer in SDRAM. Each layer looks fine on its own (layer 1 enabled and layer 2 disabled, or vice versa), but with both enabled the rectangles show up as disturbed lines.
I want both layers at 800 * 480 resolution.
How can I solve this problem?
And the important question: how can I switch between these two layers without any glitches on the display?
MCU: STM32F429 ZGT6
The LCD is configured for 16 bits.
Here is the code:
// layer1
LTDC_Layer_InitStruct.LTDC_HorizontalStart = HSYNC_W + HBP;
LTDC_Layer_InitStruct.LTDC_HorizontalStop = 800 + HSYNC_W + HBP - 1;
LTDC_Layer_InitStruct.LTDC_VerticalStart = VSYNC_W + VBP;
LTDC_Layer_InitStruct.LTDC_VerticalStop = 480+ VSYNC_W + VBP - 1;
LTDC_Layer_InitStruct.LTDC_PixelFormat = LTDC_Pixelformat_RGB565;
LTDC_Layer_InitStruct.LTDC_ConstantAlpha = 255;
LTDC_Layer_InitStruct.LTDC_DefaultColorAlpha = 0;
LTDC_Layer_InitStruct.LTDC_DefaultColorBlue = 0;
LTDC_Layer_InitStruct.LTDC_DefaultColorGreen = 0;
LTDC_Layer_InitStruct.LTDC_DefaultColorRed = 0;
LTDC_Layer_InitStruct.LTDC_BlendingFactor_1 = LTDC_BlendingFactor1_CA;
LTDC_Layer_InitStruct.LTDC_BlendingFactor_2 = LTDC_BlendingFactor2_CA;
LTDC_Layer_InitStruct.LTDC_CFBLineLength = ((800 * 2) + 3);
LTDC_Layer_InitStruct.LTDC_CFBPitch = (800 * 2);
LTDC_Layer_InitStruct.LTDC_CFBLineNumber = 480;
LTDC_Layer_InitStruct.LTDC_CFBStartAdress =SDRAM_START_ADR;
LTDC_LayerInit(LTDC_Layer1, &LTDC_Layer_InitStruct);
// layer 2
LTDC_Layer_InitStruct.LTDC_HorizontalStart = HSYNC_W + HBP;
LTDC_Layer_InitStruct.LTDC_HorizontalStop = 800 + HSYNC_W + HBP - 1;
LTDC_Layer_InitStruct.LTDC_VerticalStart = VSYNC_W + VBP;
LTDC_Layer_InitStruct.LTDC_VerticalStop = 480+ VSYNC_W + VBP - 1;
LTDC_Layer_InitStruct.LTDC_PixelFormat = LTDC_Pixelformat_RGB565;
LTDC_Layer_InitStruct.LTDC_ConstantAlpha = 150;
LTDC_Layer_InitStruct.LTDC_DefaultColorAlpha = 0;
LTDC_Layer_InitStruct.LTDC_DefaultColorBlue = 0;
LTDC_Layer_InitStruct.LTDC_DefaultColorGreen = 0;
LTDC_Layer_InitStruct.LTDC_DefaultColorRed = 0;
LTDC_Layer_InitStruct.LTDC_BlendingFactor_1 = LTDC_BlendingFactor1_CA;
LTDC_Layer_InitStruct.LTDC_BlendingFactor_2 = LTDC_BlendingFactor2_CA;
LTDC_Layer_InitStruct.LTDC_CFBLineLength = ((200 * 2) + 3);
LTDC_Layer_InitStruct.LTDC_CFBPitch = (200 * 2);
LTDC_Layer_InitStruct.LTDC_CFBLineNumber = 100;
LTDC_Layer_InitStruct.LTDC_CFBStartAdress = SDRAM_START_ADR2;
LTDC_LayerInit(LTDC_Layer2, &LTDC_Layer_InitStruct);
In my main.c, these calls enable the layers and draw the test rectangles:
LTDC_LayerCmd(LTDC_Layer1, ENABLE);
LCD_DrawFullRect(200, 200, 200,100, LCD_COLOR_BLUE);
LTDC_LayerCmd(LTDC_Layer2, ENABLE);
LCD_DrawFullRect2((uint32_t *)SDRAM_START_ADR2,400, 240, 200,100, LCD_COLOR_MAGENTA);
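One thing that may be worth checking in the setup above: layer 2 uses the same 800 * 480 window as layer 1, but its line length, pitch, and line number describe a 200 * 100 buffer. The sketch below derives the three buffer fields from the window size, following the pattern of the layer 1 setup (line length = bytes per line + 3, pitch = bytes per line). The struct and function names are hypothetical and not part of the ST library; whether this mismatch is the cause of the disturbed lines is an assumption.

```c
#include <stdint.h>

/* Hypothetical container for the three buffer-geometry fields used above. */
typedef struct {
    uint32_t line_length; /* would go into LTDC_CFBLineLength */
    uint32_t pitch;       /* would go into LTDC_CFBPitch */
    uint32_t line_number; /* would go into LTDC_CFBLineNumber */
} layer_geometry;

/* Derive the buffer geometry from the window size in pixels. */
layer_geometry layer_geometry_for(uint32_t width_px, uint32_t height_px,
                                  uint32_t bytes_per_pixel)
{
    layer_geometry g;
    g.pitch = width_px * bytes_per_pixel;
    g.line_length = g.pitch + 3;
    g.line_number = height_px;
    return g;
}

/* Window width in pixels implied by the horizontal start/stop positions. */
uint32_t window_width(uint32_t h_start, uint32_t h_stop)
{
    return h_stop - h_start + 1;
}
```

For an 800 * 480 RGB565 layer this gives a pitch of 1600, a line length of 1603, and 480 lines. For a 200 * 100 layer the window start/stop values would need to span 200 * 100 pixels as well.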
I'm porting an emulator to SDL. There is a method, called once per frame, that hands me a buffer with the new audio samples for the next frame.
I opened a device with SDL_OpenAudioDevice, and at each frame the SDL callback plays the samples from the audio buffer.
It works, but the sound is not perfect: there are ticks, metallic noise, and so on.
The sound is 16-bit signed.
EDIT: OK, I found a solution!
With the code of the opening post I was playing the samples for the next frame during the current frame, in real time. That was wrong!
So I implemented a circular buffer where I store the samples for the next frame that the underlying code passes to me at each (current) frame.
The buffer has two pointers, one for the read point and one for the write point. SDL calls the callback when its audio stream has no more data to play; when the callback is called, I play audio samples from the read point of the circular buffer and then advance the read pointer.
When the underlying code gives me the audio samples for the next frame, I write them into the circular buffer at the write point and then advance the write pointer.
Both pointers advance by the number of samples to be played per frame.
Code updated; it needs some adjustment when samplesPerFrame is not an integer, but it works ;-)
Circular buffer structure:
typedef struct circularBufferStruct
{
    short *buffer;
    int cells;
    short *readPoint;
    short *writePoint;
} circularBuffer;
This method is called at initialization:
int initialize_audio(int stereo)
{
    if (stereo)
        channel = 2;
    else
        channel = 1;

    // Check if sound is disabled
    if (sampleRate != 0)
    {
        // Initialize SDL Audio
        if (SDL_InitSubSystem(SDL_INIT_AUDIO) < 0)
        {
            SDL_Log("SDL fails to initialize audio subsystem!\n%s", SDL_GetError());
            return 1;
        }

        // Number of samples per frame
        samplesPerFrame = (double)sampleRate / (double)framesPerSecond * channel;
        audioSamplesSize = samplesPerFrame * bytesPerSample; // Bytes
        audioBufferSize = audioSamplesSize * 10; // Bytes

        // Set and clear circular buffer
        audioBuffer.buffer = malloc(audioBufferSize); // Bytes, must be a multiple of audioSamplesSize
        memset(audioBuffer.buffer, 0, audioBufferSize);
        audioBuffer.cells = (audioBufferSize) / sizeof(short); // Cells, not Bytes!
        audioBuffer.readPoint = audioBuffer.buffer;
        audioBuffer.writePoint = audioBuffer.readPoint + (short)samplesPerFrame;
    }
    else
        samplesPerFrame = 0;

    // First frame
    return samplesPerFrame;
}
This is the SDL method callback from want.callback:
void audioCallback(void *userdata, uint8_t *stream, int len)
{
    SDL_memset(stream, 0, len);

    // Nothing to play when sound is disabled
    if (audioSamplesSize == 0)
        return;

    if (len > audioSamplesSize)
        len = audioSamplesSize;

    SDL_MixAudioFormat(stream, (const Uint8 *)audioBuffer.readPoint, AUDIO_S16SYS, len, SDL_MIX_MAXVOLUME);

    // Advance the read pointer and wrap at the end of the buffer
    audioBuffer.readPoint += (short)samplesPerFrame;
    if (audioBuffer.readPoint >= audioBuffer.buffer + audioBuffer.cells)
        audioBuffer.readPoint = audioBuffer.readPoint - audioBuffer.cells;
}
This method is called at each frame (after first pass we require only the amount of samples):
int update_audio(short *buffer)
{
    // Check if sound is disabled
    if (sampleRate != 0)
    {
        memcpy(audioBuffer.writePoint, buffer, audioSamplesSize); // Bytes

        // Advance the write pointer and wrap at the end of the buffer
        audioBuffer.writePoint += (short)samplesPerFrame; // Cells
        if (audioBuffer.writePoint >= audioBuffer.buffer + audioBuffer.cells)
            audioBuffer.writePoint = audioBuffer.writePoint - audioBuffer.cells;

        if (firstTime)
        {
            // Set required audio specs
            want.freq = sampleRate;
            want.format = AUDIO_S16SYS;
            want.channels = channel;
            want.samples = samplesPerFrame / channel; // total samples divided by channel count
            want.padding = 0;
            want.callback = audioCallback;
            want.userdata = NULL;

            device = SDL_OpenAudioDevice(SDL_GetAudioDeviceName(0, 0), 0, &want, &have, 0);
            SDL_PauseAudioDevice(device, 0);
            firstTime = 0;
        }
    }
    else
        samplesPerFrame = 0;

    // Next frame
    return samplesPerFrame;
}
I hope this question/answer will be useful to others in the future, because I found almost nothing on the net about SDL Audio.
Ok, I found a solution!
With the code of the opening post I was playing samples for next frame at the current frame in real time. It was wrong!
So I implemented a circular buffer where I store the samples for the next frame that the underlying code passes to me at each (current) frame. I read from and write to that buffer at different positions; see the opening post.
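The read/write idea can also be sketched with indices instead of raw pointers. This is a minimal, SDL-independent illustration, not the code from the post: the type and function names are hypothetical, and there is no overrun or underrun detection. Because each index wraps sample by sample, a chunk size that does not divide the buffer size evenly is handled as well.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical ring buffer of 16-bit samples; names are illustrative. */
typedef struct {
    short *data;      /* storage provided by the caller */
    size_t capacity;  /* total number of samples in data */
    size_t read_pos;  /* index of the next sample to play */
    size_t write_pos; /* index of the next slot to fill */
} sample_ring;

void ring_init(sample_ring *r, short *storage, size_t capacity)
{
    r->data = storage;
    r->capacity = capacity;
    r->read_pos = 0;
    r->write_pos = 0;
    memset(storage, 0, capacity * sizeof(short)); /* start with silence */
}

/* Copy n samples in, wrapping the write index at the end of the storage. */
void ring_write(sample_ring *r, const short *src, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        r->data[r->write_pos] = src[i];
        r->write_pos = (r->write_pos + 1) % r->capacity;
    }
}

/* Copy n samples out, wrapping the read index at the end of the storage. */
void ring_read(sample_ring *r, short *dst, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        dst[i] = r->data[r->read_pos];
        r->read_pos = (r->read_pos + 1) % r->capacity;
    }
}
```

In the setup described above, ring_write would be called from the per-frame update and ring_read from the audio callback. Since the callback runs on SDL's audio thread, a real implementation would also need to guard the shared state, for example with SDL_LockAudioDevice.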
Is it possible to directly read/write to a WriteableBitmap's pixel data? I'm currently using WriteableBitmapEx's SetPixel() but it's slow and I want to access the pixels directly without any overhead.
I haven't used HTML5's canvas in a while, but if I recall correctly you could get its image data as a single array of numbers and that's kind of what I'm looking for
Thanks in advance
To answer your question, you can more directly access a writable bitmap's data by using the Lock, write, Unlock pattern, as demonstrated below, but it is typically not necessary unless you are basing your drawing upon the contents of the image. More typically, you can just create a new buffer and make it a bitmap, rather than the other way around.
That being said, there are many extensibility points in WPF to perform innovative drawing without resorting to pixel manipulation. For most controls, the existing WPF primitives (Border, Line, Rectangle, Image, etc...) are more than sufficient - don't be concerned about using many of them, they are rather cheap to use. For complex controls, you can use the DrawingContext to draw D3D primitives. For image effects, you can implement GPU assisted shaders using the Effect class or use the built in effects (Blur and Shadow).
But, if your situation requires direct pixel access, pick a pixel format and start writing. I suggest BGRA32 because it is easy to understand and is probably the most common one to be discussed.
BGRA32 means the pixel data is stored in memory as 4 bytes representing the blue, green, red, and alpha channels of an image, in that order. It is convenient because each pixel ends up on a 4-byte boundary, so it fits in a 32-bit integer. When dealing with a 32-bit integer, keep in mind that the order will be reversed on most platforms (check BitConverter.IsLittleEndian to determine the proper byte order at runtime if you need to support multiple platforms; x86 and x86_64 are both little endian).
The image data is stored in horizontal strips, each one stride wide, and each strip holds a single row of the image. The stride is always greater than or equal to the pixel width of the image multiplied by the number of bytes per pixel in the selected format. On certain architectures the stride can be longer than width * bytesPerPixel, so you must use the stride to calculate the start of a row rather than multiplying the width. Since we are using a 4-byte-wide pixel format, our stride does happen to be width * 4, but you should not rely upon it.
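The stride and byte-order points can be illustrated with a few lines of plain C (used here only because the arithmetic is language-independent; the function names are hypothetical). The sketch computes a pixel's byte offset from the stride and writes a BGRA32 pixel byte by byte, so the result does not depend on the host byte order.

```c
#include <stddef.h>
#include <stdint.h>

/* Byte offset of pixel (x, y) in a buffer whose rows are stride bytes apart. */
size_t pixel_offset(int x, int y, int stride, int bytes_per_pixel)
{
    return (size_t)y * (size_t)stride + (size_t)x * (size_t)bytes_per_pixel;
}

/* Store one BGRA32 pixel byte by byte: blue, green, red, alpha. */
void put_bgra(uint8_t *buf, int stride, int x, int y,
              uint8_t b, uint8_t g, uint8_t r, uint8_t a)
{
    uint8_t *p = buf + pixel_offset(x, y, stride, 4);
    p[0] = b;
    p[1] = g;
    p[2] = r;
    p[3] = a;
}

/* The same pixel as the 32-bit value a little-endian host reads back. */
uint32_t bgra_as_le_u32(uint8_t b, uint8_t g, uint8_t r, uint8_t a)
{
    return ((uint32_t)a << 24) | ((uint32_t)r << 16) | ((uint32_t)g << 8) | b;
}
```

For example, with a 2-pixel-wide image padded to a stride of 12 bytes, pixel (1, 1) starts at byte 16, not at 1 * 8 + 4 = 12, which is why the row start has to come from the stride.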
As mentioned, the only case in which I would suggest using a WriteableBitmap is when you are accessing an existing image, so that is the example below:
// must be compiled with /UNSAFE
// get an image to draw on and convert it to our chosen format
BitmapSource srcImage = JpegBitmapDecoder.Create(File.Open("img13.jpg", FileMode.Open),
    BitmapCreateOptions.None, BitmapCacheOption.OnLoad).Frames[0];
if (srcImage.Format != PixelFormats.Bgra32)
    srcImage = new FormatConvertedBitmap(srcImage, PixelFormats.Bgra32, null, 0);

// get a writable bitmap of that image
var wbitmap = new WriteableBitmap(srcImage);
int width = wbitmap.PixelWidth;
int height = wbitmap.PixelHeight;
int stride = wbitmap.BackBufferStride;
int bytesPerPixel = (wbitmap.Format.BitsPerPixel + 7) / 8;

// lock the back buffer before touching it
wbitmap.Lock();
byte* pImgData = (byte*)wbitmap.BackBuffer;

// make any pixel with red < 0x44 50% transparent and invert the others
int cRowStart = 0;
int cColStart = 0;
for (int row = 0; row < height; row++)
{
    cColStart = cRowStart;
    for (int col = 0; col < width; col++)
    {
        byte* bPixel = pImgData + cColStart;
        UInt32* iPixel = (UInt32*)bPixel;
        if (bPixel[2 /* bgRa */] < 0x44)
        {
            // set to 50% transparent
            bPixel[3 /* bgrA */] = 0x7f;
        }
        else
        {
            // invert but maintain alpha
            *iPixel = *iPixel ^ 0x00ffffff;
        }
        cColStart += bytesPerPixel;
    }
    cRowStart += stride;
}

// mark the changed area and release the back buffer
wbitmap.AddDirtyRect(new Int32Rect(0, 0, width, height));
wbitmap.Unlock();
// if you are going across threads, you will need to additionally freeze the source
wbitmap.Freeze();
However, it really isn't necessary if you are not modifying an existing image. For example, you can draw a checkerboard pattern using all safe code:
// draw rectangles
int width = 640, height = 480, bytesperpixel = 4;
int stride = width * bytesperpixel;
byte[] imgdata = new byte[width * height * bytesperpixel];
int rectDim = 40;
UInt32 darkcolorPixel = 0xffaaaaaa;
UInt32 lightColorPixel = 0xffeeeeee;
UInt32[] intPixelData = new UInt32[width * height];
for (int row = 0; row < height; row++)
{
    for (int col = 0; col < width; col++)
    {
        intPixelData[row * width + col] = ((col / rectDim) % 2) != ((row / rectDim) % 2) ?
            lightColorPixel : darkcolorPixel;
    }
}
Buffer.BlockCopy(intPixelData, 0, imgdata, 0, imgdata.Length);
// compose the BitmapImage
var bsCheckerboard = BitmapSource.Create(width, height, 96, 96, PixelFormats.Bgra32, null, imgdata, stride);
And you don't really even need an Int32 intermediate, if you write to the byte array directly.
// draw using byte array
int width = 640, height = 480, bytesperpixel = 4;
int stride = width * bytesperpixel;
byte[] imgdata = new byte[width * height * bytesperpixel];
// draw a gradient from green to red from left to right (R 00 -> ff; G ff -> 00)
// draw a gradient of alpha from top to bottom
// Blue constant at 00
for (int row = 0; row < height; row++)
{
    for (int col = 0; col < width; col++)
    {
        imgdata[row * stride + col * 4 + 0] = 0;
        imgdata[row * stride + col * 4 + 1] = Convert.ToByte((1 - (col / (float)width)) * 0xff);
        imgdata[row * stride + col * 4 + 2] = Convert.ToByte((col / (float)width) * 0xff);
        imgdata[row * stride + col * 4 + 3] = Convert.ToByte((row / (float)height) * 0xff);
    }
}
var gradient = BitmapSource.Create(width, height, 96, 96, PixelFormats.Bgra32, null, imgdata, stride);
Edit: apparently, you are trying to use WPF to make some sort of image editor. I would still be using WPF primitives for shapes and source bitmaps, and then implement translations, scaling, rotation as RenderTransform's, bitmap effects as Effect's and keep everything within the WPF model. But, if that does not work for you, we have many other options.
You could use WPF primitives to render to a RenderTargetBitmap, which has a chosen PixelFormat, and use it with a WriteableBitmap as below:
Canvas cvRoot = new Canvas();
// position primitives on canvas
var rtb = new RenderTargetBitmap(width, height, dpix, dpiy, PixelFormats.Bgra32);
var wb = new WriteableBitmap(rtb);
You could use a WPF DrawingVisual to issue GDI style commands then render to a bitmap as demonstrated on the sample on the RenderTargetBitmap page.
You could use GDI using an InteropBitmap created using System.Windows.Interop.Imaging.CreateBitmapSourceFromHBitmap from an HBITMAP retrieved from a Bitmap.GetHBitmap method. Make sure you don't leak the HBITMAP, though.
After a nice long headache, I found this article that explains a way to do it without using bit arithmetic, and allows me to treat it as an array instead:
IntPtr pBackBuffer = bitmap.BackBuffer;
byte* pBuff = (byte*)pBackBuffer.ToPointer();
pBuff[4 * x + (y * bitmap.BackBufferStride)] = 255;
pBuff[4 * x + (y * bitmap.BackBufferStride) + 1] = 255;
pBuff[4 * x + (y * bitmap.BackBufferStride) + 2] = 255;
pBuff[4 * x + (y * bitmap.BackBufferStride) + 3] = 255;
You can access the raw pixel data by calling the Lock() method and using the BackBuffer property afterwards. When you're finished, don't forget to call AddDirtyRect and Unlock.
For a simple example, you can take a look at this: http://cscore.codeplex.com/SourceControl/latest#CSCore.Visualization/WPF/Utils/PixelManipulationBitmap.cs
I have a vector with many frequencies. I am trying to generate a sine wave that contains one period for each frequency, all concatenated into one vector (similar to a sweep signal).
Finally I want to plot this.
I already tried the following, but it doesn't work correctly:
%fr = Frequency-Vector with 784 Elements from 2.0118e+04 to 1.9883e+04 Hz
fs = 48000; %Sampling frequency [Hz]
tstart3 = 0;
tstep3 = 1/fs;
tend3 = (length(fr))*(1/min(fr))-tstep3;
t3 = tstart3:tstep3:tend3;
sin3 = [];
for i = 1:length(fr)/2
    sin3 = [sin3 sin(2*pi*fr(i)*t3)];
end
tstart4 = 0;
tstep4 = 1/fs;
tend4 = tstep4*length(sin3);
t4 = tstart4:tstep4:tend4-tstep4;
Could you please help me?
If I reverse-engineered your code correctly, it seems you want to generate a chirp signal. It could be more efficient to do it as follows:
fr = linspace(2.0118e4, 1.9883e4, 784); % Frequency content
%fr = linspace(2e4, 1e4, 784); % Try this for a wider chirp
fs = 48e3;
phi = cumsum(2*pi*fr/fs);
s1 = sin(phi);
spectrogram(s1, 128, 120, 128, fs); % View the signal in time vs frequency
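The key step in the answer is the running phase, cumsum(2*pi*fr/fs). A minimal C sketch of that step is shown below; the function names are hypothetical. It only accumulates the phase, and the signal itself would be the sine of each phase value (sin from math.h).

```c
#include <stddef.h>

#define TWO_PI 6.283185307179586

/* Running phase in radians: phase[n] = sum over k <= n of 2*pi*freq_hz[k]/fs.
 * This mirrors cumsum(2*pi*fr/fs) from the answer. */
void accumulate_phase(const double *freq_hz, size_t n, double fs, double *phase)
{
    double acc = 0.0;
    for (size_t i = 0; i < n; i++) {
        acc += TWO_PI * freq_hz[i] / fs;
        phase[i] = acc;
    }
}

/* Number of whole samples that fit in one period of f at sample rate fs. */
int whole_samples_per_period(double fs, double f)
{
    return (int)(fs / f);
}
```

At fs = 48000 Hz a 20118 Hz tone has only about 2.4 samples per period, so "one period per frequency" cannot end on a sample boundary. Accumulating the phase sidesteps this, because the frequency changes from sample to sample while the phase stays continuous.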
I've been working on some streaming software that takes live feeds
from various kinds of cameras and streams over the network using
H.264. To accomplish this, I'm using the x264 encoder directly (with
the "zerolatency" preset) and feeding NALs as they are available to
libavformat to pack into RTP (ultimately RTSP). Ideally, this
application should be as real-time as possible. For the most part,
this has been working well.
Unfortunately, however, there is some sort of synchronization issue:
any video playback on clients seems to show a few smooth frames,
followed by a short pause, then more frames; repeat. Additionally,
there appears to be approximately a 4-second delay. This happens with
every video player I've tried: Totem, VLC, and basic gstreamer pipes.
I've boiled it all down to a somewhat small test case:
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <x264.h>
#include <libavformat/avformat.h>
#include <libswscale/swscale.h>
#define WIDTH 640
#define HEIGHT 480
#define FPS 30
#define BITRATE 400000
#define RTP_ADDRESS ""
#define RTP_PORT 49990
struct AVFormatContext* avctx;
struct x264_t* encoder;
struct SwsContext* imgctx;
uint8_t test = 0x80;
void create_sample_picture(x264_picture_t* picture)
{
    // create a frame to store in
    x264_picture_alloc(picture, X264_CSP_I420, WIDTH, HEIGHT);

    // fake image generation
    // disregard how wrong this is; just writing a quick test
    int strides = WIDTH / 8;
    uint8_t* data = malloc(WIDTH * HEIGHT * 3);
    memset(data, test, WIDTH * HEIGHT * 3);
    test = (test << 1) | (test >> (8 - 1));

    // scale the image
    sws_scale(imgctx, (const uint8_t* const*) &data, &strides, 0, HEIGHT,
              picture->img.plane, picture->img.i_stride);
}
int encode_frame(x264_picture_t* picture, x264_nal_t** nals)
{
    // encode a frame
    x264_picture_t pic_out;
    int num_nals;
    int frame_size = x264_encoder_encode(encoder, nals, &num_nals, picture, &pic_out);

    // ignore bad frames
    if (frame_size < 0)
        return frame_size;

    return num_nals;
}
void stream_frame(uint8_t* payload, int size)
{
    // initalize a packet
    AVPacket p;
    av_init_packet(&p);
    p.data = payload;
    p.size = size;
    p.stream_index = 0;
    p.flags = AV_PKT_FLAG_KEY;

    // send it out
    av_interleaved_write_frame(avctx, &p);
}
int main(int argc, char* argv[])
{
    // initalize ffmpeg
    av_register_all();

    // set up image scaler
    // (in-width, in-height, in-format, out-width, out-height, out-format, scaling-method, 0, 0, 0)
    // note: the output format (matching the I420 picture) and the scaling method are assumed here
    imgctx = sws_getContext(WIDTH, HEIGHT, PIX_FMT_MONOWHITE,
                            WIDTH, HEIGHT, PIX_FMT_YUV420P,
                            SWS_FAST_BILINEAR, NULL, NULL, NULL);

    // set up encoder presets
    x264_param_t param;
    x264_param_default_preset(&param, "ultrafast", "zerolatency");
    param.i_threads = 3;
    param.i_width = WIDTH;
    param.i_height = HEIGHT;
    param.i_fps_num = FPS;
    param.i_fps_den = 1;
    param.i_keyint_max = FPS;
    param.b_intra_refresh = 0;
    param.rc.i_bitrate = BITRATE;
    param.b_repeat_headers = 1; // whether to repeat headers or write just once
    param.b_annexb = 1; // place start codes (1) or sizes (0)

    // initalize
    x264_param_apply_profile(&param, "high");
    encoder = x264_encoder_open(&param);
    // at this point, x264_encoder_headers can be used, but it has had no effect

    // set up streaming context. a lot of error handling has been ommitted
    // for brevity, but this should be pretty standard.
    avctx = avformat_alloc_context();
    struct AVOutputFormat* fmt = av_guess_format("rtp", NULL, NULL);
    avctx->oformat = fmt;
    snprintf(avctx->filename, sizeof(avctx->filename), "rtp://%s:%d", RTP_ADDRESS, RTP_PORT);
    if (url_fopen(&avctx->pb, avctx->filename, URL_WRONLY) < 0)
    {
        perror("url_fopen failed");
        return 1;
    }
    struct AVStream* stream = av_new_stream(avctx, 1);

    // initalize codec
    AVCodecContext* c = stream->codec;
    c->codec_id = CODEC_ID_H264;
    c->codec_type = AVMEDIA_TYPE_VIDEO;
    c->width = WIDTH;
    c->height = HEIGHT;
    c->time_base.den = FPS;
    c->time_base.num = 1;
    c->gop_size = FPS;
    c->bit_rate = BITRATE;
    avctx->flags = AVFMT_FLAG_RTP_HINT;

    // write the header
    av_write_header(avctx);

    // make some frames
    for (int frame = 0; frame < 10000; frame++)
    {
        // create a sample moving frame
        x264_picture_t* pic = (x264_picture_t*) malloc(sizeof(x264_picture_t));
        create_sample_picture(pic);

        // encode the frame
        x264_nal_t* nals;
        int num_nals = encode_frame(pic, &nals);
        if (num_nals < 0)
            printf("invalid frame size: %d\n", num_nals);

        // send out NALs
        for (int i = 0; i < num_nals; i++)
            stream_frame(nals[i].p_payload, nals[i].i_payload);

        // free up resources
        x264_picture_clean(pic);
        free(pic);

        // stream at approx 30 fps
        printf("frame %d\n", frame);
        usleep(33333);
    }

    return 0;
}
This test shows black lines on a white background that
should move smoothly to the left. It has been written for ffmpeg 0.6.5
but the problem can be reproduced on 0.8 and 0.10 (from what I've tested so far). I've taken some shortcuts in error handling to make this example as short as
possible while still showing the problem, so please excuse some of the
nasty code. I should also note that while an SDP is not used here, I
have tried using that already with similar results. The test can be
compiled with:
gcc -g -std=gnu99 streamtest.c -lswscale -lavformat -lx264 -lm -lpthread -o streamtest
It can be played with gstreamer directly:
gst-launch udpsrc port=49990 ! application/x-rtp,payload=96,clock-rate=90000 ! rtph264depay ! decodebin ! xvimagesink
You should immediately notice the stuttering. One common "fix" I've
seen all over the Internet is to add sync=false to the pipeline:
gst-launch udpsrc port=49990 ! application/x-rtp,payload=96,clock-rate=90000 ! rtph264depay ! decodebin ! xvimagesink sync=false
This causes playback to be smooth (and near-realtime), but is a
non-solution and only works with gstreamer. I'd like to fix the
problem at the source. I've been able to stream with near-identical
parameters using raw ffmpeg and haven't had any issues:
ffmpeg -re -i sample.mp4 -vcodec libx264 -vpre ultrafast -vpre baseline -b 400000 -an -f rtp rtp:// -an
So clearly I'm doing something wrong. But what is it?
1) You didn't set a PTS for the frames you send to libx264 (you should probably see "non-strictly-monotonic PTS" warnings).
2) You didn't set PTS/DTS for the packets you send to libavformat's RTP muxer (I'm not 100% sure they need to be set, but I guess it would be better; from the source code it looks like RTP uses the PTS).
3) IMHO usleep(33333) is bad. It stalls the encoder for that time as well (increasing latency), while you could be encoding the next frame during that time even if you don't yet need to send it over RTP.
P.S. By the way, you didn't set param.rc.i_rc_method to X264_RC_ABR, so libx264 will use CRF 23 instead and ignore your "param.rc.i_bitrate = BITRATE". It can also be a good idea to use VBV when encoding for network streaming.
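The timestamp and pacing points (1 to 3) come down to simple arithmetic, sketched below. The function names are hypothetical. The 90000 Hz clock is taken from the clock-rate in the gstreamer pipeline of the question; which time base the muxer actually expects depends on the stream's time_base, so treat that as an assumption.

```c
#include <stdint.h>

/* PTS of frame n in a clock of clock_hz ticks per second,
 * for a frame rate of fps_num / fps_den. */
int64_t frame_pts(int64_t frame_index, int fps_num, int fps_den, int clock_hz)
{
    return frame_index * clock_hz * fps_den / fps_num;
}

/* Deadline of frame n in microseconds since the first frame.
 * Computing it from the frame index avoids the drift that repeated
 * fixed sleeps such as usleep(33333) accumulate. */
int64_t frame_deadline_us(int64_t frame_index, int fps_num, int fps_den)
{
    return frame_index * 1000000 * fps_den / fps_num;
}
```

With the encoder time base set to 1/FPS, the frame counter itself can serve as the picture's i_pts before calling x264_encoder_encode, and a value like frame_pts(frame, FPS, 1, 90000) could be assigned to the packet's pts and dts. For pacing, the loop would sleep only until frame_deadline_us(frame, FPS, 1) measured against a start time, so the time spent encoding is not added on top of the sleep.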