convert AVPicture to array<unsigned char>

I use FFmpeg to extract frames of a video in C++. I want to get an array<unsigned char> of the frame, but I get an AVFrame from this line of code:
avcodec_decode_video2(codecContext, DecodedFrame, &gotPicture, Packet);
So I use sws_scale to convert the AVFrame to an AVPicture, but I still cannot get an array<unsigned char> from the frame:
sws_scale(convertContext, DecodedFrame->data, DecodedFrame->linesize, 0, (codecContext)->height, convertedFrame->data, convertedFrame->linesize);
So can anyone help me convert an AVFrame or AVPicture to an array<unsigned char>?

AVPicture is deprecated. Converting to it is pointless, since AVFrame is its replacement.
If I understand the question correctly, you're trying to get the raw pixel values into a contiguous array. If so, just dump the data field of the AVFrame into it.
avcodec_decode_video2(codecContext, DecodedFrame, &gotPicture, Packet);
// If you need RGB, create an SwsContext to convert from the video's pixel format
struct SwsContext *sws_ctx = sws_getContext(DecodedFrame->width, DecodedFrame->height, codecContext->pix_fmt,
                                            DecodedFrame->width, DecodedFrame->height, AV_PIX_FMT_RGB24,
                                            0, 0, 0, 0);
uint8_t *rgb_data[4];
int rgb_linesize[4];
av_image_alloc(rgb_data, rgb_linesize, DecodedFrame->width, DecodedFrame->height, AV_PIX_FMT_RGB24, 32);
sws_scale(sws_ctx, DecodedFrame->data, DecodedFrame->linesize, 0, DecodedFrame->height, rgb_data, rgb_linesize);
// RGB24 is a packed format: there is a single plane and all the data is in it.
// std::array needs a compile-time size, so use std::vector for a frame whose size is only known at run time.
size_t rgb_size = (size_t) DecodedFrame->width * DecodedFrame->height * 3;
std::vector<uint8_t> rgb_arr(rgb_size);
std::copy_n(rgb_data[0], rgb_size, rgb_arr.begin());
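Note that av_image_alloc() with an alignment of 32 may pad each row, so rgb_linesize[0] can be larger than width * 3; in that case copy row by row instead (a sketch using the variables from above), and free the temporary buffer when done:
const int row_bytes = DecodedFrame->width * 3;
std::vector<uint8_t> rgb_arr((size_t) row_bytes * DecodedFrame->height);
for (int y = 0; y < DecodedFrame->height; ++y)
    std::copy_n(rgb_data[0] + y * rgb_linesize[0], row_bytes, rgb_arr.data() + (size_t) y * row_bytes);
av_freep(&rgb_data[0]);   // release the buffer allocated by av_image_alloc()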

Related

Gstreamer zero copy dmabuf encoding

I want to encode a framebuffer I got via dmabuf into a video stream.
I have a dmabuf file descriptor which contains the framebuffer. I got the file descriptor from the Intel i915 driver via the ioctl VFIO_DEVICE_QUERY_GFX_PLANE.
Now I want to encode it with zero copy in GStreamer into a video stream (H.264, H.265, etc.). I push the individual frames into the GStreamer pipeline via appsrc. Since I use Intel hardware, I thought it made sense to use VAAPI.
The problem is that the VAAPI sink pads only support video/x-raw and video/x-raw(memory:VASurface), and I have video/x-raw(memory:DMABuf).
Is there any way to convert video/x-raw(memory:DMABuf) to video/x-raw(memory:VASurface) with zero copy, or to import the DMABuf directly as video/x-raw(memory:VASurface)?
Alternatively, is there a framework better suited to this than VAAPI?
My code to push the frames into gstreamer currently looks like this:
vfio_encode_dpy *vedpy = container_of(dcl, vfio_encode_dpy, dcl);
GstMemory* mem = gst_dmabuf_allocator_alloc(vedpy->gdata.allocator, dmabuf->fd,
                                            dmabuf->width * dmabuf->height * (dmabuf->stride / 1024));
vedpy->gdata.buffer = gst_buffer_new();
gst_buffer_append_memory(vedpy->gdata.buffer, mem);
gsize offset[GST_VIDEO_MAX_PLANES] = {0, 0, 0, 0};
gint stride[GST_VIDEO_MAX_PLANES] = {dmabuf->stride, 0, 0, 0};
gst_buffer_add_video_meta_full(vedpy->gdata.buffer, GST_VIDEO_FRAME_FLAG_NONE,
                               GST_VIDEO_FORMAT_ENCODED,
                               dmabuf->width, dmabuf->height, 1, offset, stride);
GstFlowReturn ret;
g_signal_emit_by_name(vedpy->gdata.source, "push-buffer", vedpy->gdata.buffer, &ret);
And my pipeline:
char launch_stream[] = "appsrc name=source ! "
    " video/x-raw(memory:DMABuf),width=1024,height=768,framerate=0/1,format={BGRx,BGRx:0x0100000000000001} ! "
    " vaapipostproc ! "
    " vaapih265enc ! "
...
which obviously does not work, because vaapipostproc cannot be linked with the caps filter.

Encode buffer captured by OpenGL in C

I am trying to use OpenGL to capture the back buffer of my computer's screen, and then H.264-encode the buffer using FFMPEG's libavcodec library. The issue I'm having is that I would like to encode the video as AV_PIX_FMT_YUV420P, but the back buffer capture function provided by OpenGL, glReadPixels(), only supports formats like GL_RGB. As you can see below, I try to use FFMPEG's sws_scale() function to convert from RGB to YUV, but the following code crashes at the sws_scale() line. Any ideas on how I can encode the OpenGL back buffer?
// CAPTURE BACK BUFFER USING OPENGL
int width = 1280, height = 720;
BYTE* pixels = (BYTE *) malloc(sizeof(BYTE));
glReadPixels(0, 720, width, height, GL_RGB, GL_UNSIGNED_BYTE, pixels);
//CREATE FFMPEG VARIABLES
avcodec_register_all();
AVCodec *codec;
AVCodecContext *context;
struct SwsContext *sws;
AVPacket packet;
AVFrame *frame;
codec = avcodec_find_encoder(AV_CODEC_ID_H264);
context = avcodec_alloc_context3(encoder->codec);
context->dct_algo = FF_DCT_FASTINT;
context->bit_rate = 400000;
context->width = width;
context->height = height;
context->time_base.num = 1;
context->time_base.den = 30;
context->gop_size = 1;
context->max_b_frames = 1;
context->pix_fmt = AV_PIX_FMT_YUV420P;
avcodec_open2(context, codec, NULL);
// CONVERT TO YUV AND ENCODE
int frame_size = avpicture_get_size(AV_PIX_FMT_YUV420P, out_width, out_height);
encoder->frame_buffer = malloc(frame_size);
avpicture_fill((AVPicture *) encoder->frame, (uint8_t *) encoder->frame_buffer, AV_PIX_FMT_YUV420P, out_width, out_height);
sws = sws_getContext(in_width, in_height, AV_PIX_FMT_RGB32, out_width, out_height, AV_PIX_FMT_YUV420P, SWS_FAST_BILINEAR, 0, 0, 0);
uint8_t *in_data[1] = {(uint8_t *) pixels};
int in_linesize[1] = {width * 4};
// PROGRAM CRASHES HERE
sws_scale(encoder->sws, in_data, in_linesize, 0, encoder->in_height, encoder->frame->data, encoder->frame->linesize);
av_free_packet(&packet);
av_init_packet(&packet);
int success;
avcodec_encode_video2(context, &packet, frame, &success);
Your pixels buffer is too small; you malloc only one BYTE instead of width*height*4 bytes:
BYTE* pixels = (BYTE *) malloc(width*height*4);
Your glReadPixels call is also incorrect:
Passing y=720 causes it to read outside the window. Remember that the OpenGL coordinate system has its y-axis pointing upwards.
AV_PIX_FMT_RGB32 expects four bytes per pixel, whereas GL_RGB writes three bytes per pixel, so you need GL_RGBA or GL_BGRA.
Of the two I'm pretty sure it should be GL_BGRA: AV_PIX_FMT_RGB32 treats a pixel as a 32-bit integer, so on little-endian machines blue comes first. OpenGL treats each channel as a byte, so it should be GL_BGRA to match.
To summarize try:
glReadPixels(0, 0, width, height, GL_BGRA, GL_UNSIGNED_BYTE, pixels);
Additionally, because the OpenGL y-axis points upwards while the ffmpeg y-axis points downwards, you may need to flip the image. It can be done with the following trick:
uint8_t *in_data[1] = {(uint8_t *) pixels + (height-1)*width*4}; // address of the last line
int in_linesize[1] = {- width * 4}; // negative stride
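Putting these fixes together, a corrected capture-and-convert path might look like this (a sketch reusing the question's setup; it assumes frame has already been allocated and backed by a YUV420P buffer as in the question):
int width = 1280, height = 720;
BYTE *pixels = (BYTE *) malloc((size_t) width * height * 4);   // 4 bytes per pixel for BGRA
glReadPixels(0, 0, width, height, GL_BGRA, GL_UNSIGNED_BYTE, pixels);
struct SwsContext *sws = sws_getContext(width, height, AV_PIX_FMT_RGB32,
                                        width, height, AV_PIX_FMT_YUV420P,
                                        SWS_FAST_BILINEAR, 0, 0, 0);
// Point at the last row and use a negative stride so the image is flipped vertically.
uint8_t *in_data[1] = { (uint8_t *) pixels + (height - 1) * width * 4 };
int in_linesize[1] = { -width * 4 };
sws_scale(sws, in_data, in_linesize, 0, height, frame->data, frame->linesize);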

Getting color mismatch while converting from NV12 raw data to H264 using FFMPEG

I am trying to convert NV12 raw data to H264 using the hardware encoder of FFmpeg.
To pass the raw data to the encoder I am filling an AVFrame struct using the logic below:
uint8_t * buf;
buf = (uint8_t *)dequeue();
frame->data[0] = buf;
frame->data[1] = buf + size;
frame->data[2] = buf + size;
frame->pts = frameCount;
frameCount++;
but using this logic, I am getting color-mismatched H264 data.
Can someone tell me how to pass the buffer to the AVFrame data?
Thanks in Advance,
Harshil
I solved the color mismatch issue by passing the correct linesize and data values in the AVFrame struct.
Let's say NV12 stores a 4x4 image as YYYYYYYYYYYYYYYY followed by UVUVUVUV. Then in FFmpeg we need to pass
linesize[0] = 4, the stride (bytes per row) of the Y plane,
linesize[1] = 4, the stride of the interleaved UV plane,
and we don't need to specify linesize[2] because U and V are packed into a single plane.
Likewise for data:
data[0] = the start of the Y plane
data[1] = the start of the UV plane, i.e. the Y start plus 4*4 = 16 bytes
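A minimal sketch of filling the frame this way (assuming buf holds width * height * 3 / 2 bytes of tightly packed NV12 data, width and height are even, and the encoder context uses AV_PIX_FMT_NV12):
frame->format = AV_PIX_FMT_NV12;
frame->width  = width;
frame->height = height;
frame->data[0] = buf;                   // Y plane
frame->data[1] = buf + width * height;  // interleaved UV plane
frame->linesize[0] = width;             // Y plane stride in bytes
frame->linesize[1] = width;             // UV plane stride (U and V are interleaved)
frame->pts = frameCount++;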

Correctly Allocate And Fill Frame In FFmpeg

I am filling a Frame with a BGR image for encoding, and I am getting a memory leak. I think I got to the source of the problem but it appears to be a library issue instead. Since FFmpeg is such a mature library, I think I am misusing it and I would like to be instructed on how to do it correctly.
I am allocating a Frame using:
AVFrame *bgrFrame = av_frame_alloc();
And later I allocate the image in the Frame using:
av_image_alloc(bgrFrame->data, bgrFrame->linesize, bgrFrame->width, bgrFrame->height, AV_PIX_FMT_BGR24, 32);
Then I fill the image allocated using:
av_image_fill_pointers(bgrFrame->data, AV_PIX_FMT_BGR24, bgrFrame->height, originalBGRImage.data, bgrFrame->linesize);
Where originalBGRImage is an OpenCV Mat.
And this has a memory leak: apparently av_image_alloc() allocates memory, and av_image_fill_pointers() also allocates memory, on the same pointers (I can see bgrFrame->data[0] changing between the calls).
If I call
av_freep(&bgrFrame->data[0]);
After av_image_alloc(), it's fine, but if I call it after av_image_fill_pointers(), the program crashes, even though bgrFrame->data[0] is not NULL, which I find very curious.
Looking at FFmpeg's av_image_alloc() source code, I see it calls av_image_fill_pointers() twice inside it, once allocating a buffer buf... and later, in av_image_fill_pointers()'s source code, data[0] is replaced by the image pointer, which is (I think) the source of the memory leak, since data[0] was holding buf from the previous av_image_alloc() call.
So this brings the final question: What's the correct way of filling a frame with an image?.
You should allocate your frame once.
AVFrame* alloc_picture(enum PixelFormat pix_fmt, int width, int height)
{
    AVFrame *f = avcodec_alloc_frame();
    if (!f)
        return NULL;

    int size = avpicture_get_size(pix_fmt, width, height);
    uint8_t *buffer = (uint8_t *) av_malloc(size);
    if (!buffer) {
        av_free(f);
        return NULL;
    }
    avpicture_fill((AVPicture *) f, buffer, pix_fmt, width, height);
    return f;
}
Yes, the cast (AVPicture*) is allowed https://stackoverflow.com/a/20498359/2079934 .
In subsequent frames, you can write into this frame. Since your OpenCV raw data is BGR and you need RGB or YUV, you can use sws_scale. In my application, I mirror vertically:
struct SwsContext* convertCtx = sws_getContext(width, height, PIX_FMT_RGB24, c->width, c->height, c->pix_fmt, SWS_FAST_BILINEAR, NULL, NULL, NULL);
avpicture_fill(&pic_raw, (uint8_t*)pixelBuffer, PIX_FMT_RGB24, width, height);
// flip
pic_raw.data[0] += (height - 1) * pic_raw.linesize[0];
pic_raw.linesize[0] *= -1;
sws_scale(convertCtx, pic_raw.data, pic_raw.linesize, 0, height, f->data, f->linesize);
out_size = avcodec_encode_video(c, outputBuffer, outputBufferSize, f);
(You can adapt PIX_FMT_RGB24 to your needs and read from cv::Mat without copying data.)
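For instance, a cv::Mat can be wrapped directly (a sketch, assuming mat is a continuous CV_8UC3 image in BGR order; in that case pass PIX_FMT_BGR24 as the source format to sws_getContext as well):
avpicture_fill(&pic_raw, mat.data, PIX_FMT_BGR24, mat.cols, mat.rows);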
av_image_fill_arrays() does the job. It will fill the frame's data[] and linesize[] but will not allocate any memory.
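A minimal sketch of that approach (assuming bgrFrame->width and bgrFrame->height are already set, and that originalBGRImage outlives the frame, since nothing is copied):
av_image_fill_arrays(bgrFrame->data, bgrFrame->linesize,
                     originalBGRImage.data, AV_PIX_FMT_BGR24,
                     bgrFrame->width, bgrFrame->height, 1);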
Too late for an answer, but after spending many hours on this I want to share.
From the documentation:
/**
* AVBuffer references backing the data for this frame. All the pointers in
* data and extended_data must point inside one of the buffers in buf or
* extended_buf. This array must be filled contiguously -- if buf[i] is
* non-NULL then buf[j] must also be non-NULL for all j < i.
*
* There may be at most one AVBuffer per data plane, so for video this array
* always contains all the references. For planar audio with more than
* AV_NUM_DATA_POINTERS channels, there may be more buffers than can fit in
* this array. Then the extra AVBufferRef pointers are stored in the
* extended_buf array.
*/
AVBufferRef *buf[AV_NUM_DATA_POINTERS];
So buf is a "smart pointer" for data (and extended_buf for extended_data).
For example, I am using an image with only one plane:
int size = av_image_get_buffer_size(AVPixelFormat::AV_PIX_FMT_BGRA, width, height, 1);
AVBufferRef* dataref = av_buffer_alloc(size);   // so av_frame_unref() can free the data later
memcpy(dataref->data, your_buffer, size);
AVFrame* frame = av_frame_alloc();
av_image_fill_arrays(frame->data, frame->linesize, dataref->data, AVPixelFormat::AV_PIX_FMT_BGRA, width, height, 1);
frame->buf[0] = dataref;
av_frame_unref() will unref frame->buf and free the buffer once its reference count drops to zero.

How to convert a CvMat from type CV_AA to CV_32F

The CvMat type 16 corresponds to "CV_AA". Is there an easy conversion between this and the type CV_32F?
Something in the same vein as cvCvtColor(cimg,gimg,CV_BGR2GRAY);?
CV_AA is used to tell drawing functions (e.g., line, circle, fonts, etc.) to perform anti-aliased drawing; I don't believe it is a proper Mat data type. As you can see in core_c.h, it is defined in the drawing functions section.
Could you show the code where you are receiving this data-type from?
EDIT : I think I see what's going on :)
Given that CV_8U is this:
#define CV_8U 0
And CV_MAKETYPE is:
#define CV_MAKETYPE(depth,cn) (CV_MAT_DEPTH(depth) + (((cn)-1) << CV_CN_SHIFT))
where cn is the number of channels, and CV_CN_SHIFT is 3. I'm betting the type 16 you are seeing is actually
(0 + ((3 - 1) << 3)) -> 16, a.k.a. CV_8UC3.
So you have an 8-bit-per-channel, 3-channel RGB image, not a CV_AA image :)
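You can double-check this with a one-liner (a quick sketch):
printf("%d\n", CV_MAKETYPE(CV_8U, 3)); // prints 16, i.e. CV_8UC3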
You need to convert each channel from CV_8U to CV_32F.
EDIT : Take a look at using cvSplit and cvMerge (I haven't used the C interface in a while, but it should be something like the following):
IplImage* src = cvCreateImage( size, IPL_DEPTH_8U, 3 );   // CV_8UC3
// cvSplit/cvMerge work on single-channel images, so create 1-channel planes
IplImage* b8u = cvCreateImage( size, IPL_DEPTH_8U, 1 );
IplImage* g8u = cvCreateImage( size, IPL_DEPTH_8U, 1 );
IplImage* r8u = cvCreateImage( size, IPL_DEPTH_8U, 1 );
IplImage* dst = cvCreateImage( size, IPL_DEPTH_32F, 3 );  // CV_32FC3
IplImage* b32f = cvCreateImage( size, IPL_DEPTH_32F, 1 );
IplImage* g32f = cvCreateImage( size, IPL_DEPTH_32F, 1 );
IplImage* r32f = cvCreateImage( size, IPL_DEPTH_32F, 1 );
// split the channels apart...
cvSplit(src, b8u, g8u, r8u, NULL); // assuming OpenCV's BGR order here... may be RGB...
// convert the data...
cvConvertScale(b8u, b32f, 1, 0);
cvConvertScale(g8u, g32f, 1, 0);
cvConvertScale(r8u, r32f, 1, 0);
// merge them back together again if you need to (same channel order as the split)...
cvMerge(b32f, g32f, r32f, NULL, dst);
Yeah, to convert between types use cvConvertScale() and set the scale param to 1 and shift to 0.
A nice macro for this is:
#define cvConvert(src, dst) cvConvertScale((src), (dst), 1, 0 )
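Usage is then just (a sketch, assuming img8u is an existing 8-bit, 3-channel IplImage):
IplImage* img32f = cvCreateImage(cvGetSize(img8u), IPL_DEPTH_32F, 3);
cvConvert(img8u, img32f);   // same as cvConvertScale(img8u, img32f, 1, 0)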
