RGB to YUV conversion with libav (ffmpeg) triplicates image - c

I'm building a small program to capture the screen (using X11 MIT-SHM extension) on video. It works well if I create individual PNG files of the captured frames, but now I'm trying to integrate libav (ffmpeg) to create the video and I'm getting... funny results.
The furthest I've been able to reach is this. The expected result (which is a PNG created directly from the RGB data of the XImage file) is this:
However, the result I'm getting is this:
As you can see the colors are funky and the image appears cropped three times. I have a loop where I capture the screen, and first I generate the individual PNG files (currently commented in the code below) and then I try to use libswscale to convert from RGB24 to YUV420:
while (gRunning) {
printf("Processing frame framecnt=%i \n", framecnt);
if (!XShmGetImage(display, RootWindow(display, DefaultScreen(display)), img, 0, 0, AllPlanes)) {
printf("\n Ooops.. Something is wrong.");
break;
}
// PNG generation
// snprintf(imageName, sizeof(imageName), "salida_%i.png", framecnt);
// writePngForImage(img, width, height, imageName);
unsigned long red_mask = img->red_mask;
unsigned long green_mask = img->green_mask;
unsigned long blue_mask = img->blue_mask;
// Write image data
for (int y = 0; y < height; y++) {
for (int x = 0; x < width; x++) {
unsigned long pixel = XGetPixel(img, x, y);
unsigned char blue = pixel & blue_mask;
unsigned char green = (pixel & green_mask) >> 8;
unsigned char red = (pixel & red_mask) >> 16;
pixel_rgb_data[y * width + x * 3] = red;
pixel_rgb_data[y * width + x * 3 + 1] = green;
pixel_rgb_data[y * width + x * 3 + 2] = blue;
}
}
uint8_t* inData[1] = { pixel_rgb_data };
int inLinesize[1] = { in_w };
printf("Scaling frame... \n");
int sliceHeight = sws_scale(sws_context, inData, inLinesize, 0, height, pFrame->data, pFrame->linesize);
printf("Obtained slice height: %i \n", sliceHeight);
pFrame->pts = framecnt * (pVideoStream->time_base.den) / ((pVideoStream->time_base.num) * 25);
printf("Frame pts: %li \n", pFrame->pts);
int got_picture = 0;
printf("Encoding frame... \n");
int ret = avcodec_encode_video2(pCodecCtx, &pkt, pFrame, &got_picture);
// int ret = avcodec_send_frame(pCodecCtx, pFrame);
if (ret != 0) {
printf("Failed to encode! Error: %i\n", ret);
return -1;
}
printf("Succeed to encode frame: %5d - size: %5d\n", framecnt, pkt.size);
framecnt++;
pkt.stream_index = pVideoStream->index;
ret = av_write_frame(pFormatCtx, &pkt);
if (ret != 0) {
printf("Error writing frame! Error: %i \n", ret);
return -1;
}
av_packet_unref(&pkt);
}
I've placed the entire code at this gist. This question right here looks pretty similar to mine, but not quite, and the solution did not work for me, although I think this has something to do with the way the line stride is calculated.

Don't use av_image_alloc; use av_frame_get_buffer instead.
(Unrelated to your question, but avcodec_encode_video2 is deprecated and should be replaced with avcodec_send_frame and avcodec_receive_packet.)

In the end, the error was not in the usage of libav but in the code that copies the pixel data from the XImage into the RGB buffer. Instead of using:
pixel_rgb_data[y * width + x * 3 ] = red;
pixel_rgb_data[y * width + x * 3 + 1] = green;
pixel_rgb_data[y * width + x * 3 + 2] = blue;
I should have used this:
pixel_rgb_data[3 * (y * width + x) ] = red;
pixel_rgb_data[3 * (y * width + x) + 1] = green;
pixel_rgb_data[3 * (y * width + x) + 2] = blue;
I was multiplying only the horizontal displacement by 3, not the vertical displacement, so each row started at the wrong byte offset. The moment I changed it, it worked perfectly.
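The corrected indexing can be isolated in a small helper. This is only a sketch: the names rgb24_offset and put_rgb24 are hypothetical and not part of the original program, and it assumes a tightly packed RGB24 buffer, where every row occupies 3 * width bytes. For a buffer packed like this, the source stride handed to sws_scale is also 3 * width bytes; if in_w in the question is the pixel width (its definition is only in the gist, so this is an assumption), the linesize entry would need the same factor of 3.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: byte offset of pixel (x, y) in a tightly packed
 * RGB24 buffer. Both the row and the column advance three bytes per pixel. */
static size_t rgb24_offset(int x, int y, int width)
{
    assert(x >= 0 && x < width && y >= 0);
    return 3 * ((size_t)y * (size_t)width + (size_t)x);
}

/* Store one pixel; the caller guarantees that buf holds
 * 3 * width * height bytes. */
static void put_rgb24(uint8_t *buf, int width, int x, int y,
                      uint8_t r, uint8_t g, uint8_t b)
{
    size_t i = rgb24_offset(x, y, width);
    buf[i] = r;
    buf[i + 1] = g;
    buf[i + 2] = b;
}
```

With the original expression y * width + x * 3 and a width of 4, row 1 would start at byte 4 instead of byte 12, so consecutive rows overwrite each other. That is consistent with the squeezed, repeated picture shown in the question.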

Related

how to create bitmap in C and compile with gcc

I decided to learn C, and I'm trying to follow this tutorial: http://ricardolovelace.com/creating-bitmap-images-with-c-on-windows.html
But when I try to compile my code with gcc like this: >gcc -Wall testc o app
the compiler doesn't recognize type_rgb. Can I define this type, and if so, how and where in my code?
#include <stdio.h>
struct rgb_data {
float r, g, b;
};
void save_bitmap( const char *file_name, int width, int height, int dpi, type_rgb *pixel_data);
/*
next steps of the tutorial
*/
rgb_data *pixels = new rgb_data[width * height];
for( int x = 0; x < width; x++)
{
for(int y = 0; y < height; y++)
int a = y * width +x;
{
if ((x > 50 && x < 350) && (y > y && y < 350))
{
pixels[a].r = 255;
pixels[a].g = 255;
pixels[a].b = 0;
}else{
pixels[a].r = 55;
pixels[a].g = 55;
pixels[a].b = 55;
}
}
}
save_bitmap("black_border.bmp", width, height, dpi, pixels);
Bitmap file format is rather complicated. This is not the best way to learn C. It's better to start with something much simpler.
Having said that, the bitmap format starts with a bitmap header BITMAPFILEHEADER structure which is 14 bytes long, followed by BITMAPINFOHEADER structure 40 bytes long. These structures are defined in "Windows.h"
You have to fill in various fields of these structures and write them to the file before writing the actual pixels.
You can have 1, 4, 8, 16, 24, and 32-bit bitmaps. This is an example that writes a 32-bit bitmap. This code assumes sizeof(short) is 2, sizeof(int) is 4, and a little-endian machine.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    int row, column;
    int width = 100;
    int height = 100;
    int size = width * height * 4; //for 32-bit bitmap only
    int filesize = 54 + size;
    int offset = 54;     //always 54
    int infosize = 40;   //always 40
    short planes = 1;
    short bitcount = 32; //32bit

    //memcpy copies each whole value in host byte order (BMP expects little-endian);
    //memset would only store a single byte of it
    char header[54] = { 0 };
    memcpy(header, "BM", 2);
    memcpy(&header[2], &filesize, 4);
    memcpy(&header[10], &offset, 4);
    memcpy(&header[14], &infosize, 4);
    memcpy(&header[18], &width, 4);
    memcpy(&header[22], &height, 4);
    memcpy(&header[26], &planes, 2);
    memcpy(&header[28], &bitcount, 2);
    memcpy(&header[34], &size, 4); //pixel size

    unsigned char *pixels = malloc(size);
    if (!pixels)
        return 1;
    for(row = height - 1; row >= 0; row--) {
        for(column = 0; column < width; column++) {
            int p = (row * width + column) * 4;
            pixels[p + 0] = 64; //blue
            pixels[p + 1] = 128;//green
            pixels[p + 2] = 192;//red
            pixels[p + 3] = 0;  //unused
        }
    }

    FILE *fout = fopen("32bit.bmp", "wb");
    if (!fout) {
        free(pixels);
        return 1;
    }
    fwrite(header, 1, 54, fout);
    fwrite(pixels, 1, size, fout);
    free(pixels);
    fclose(fout);
    return 0;
}
Note that the first byte of each pixel is blue, followed by green and red. The fourth byte is not used in a 32-bit bitmap. Also, the rows go from bottom to top, which is another odd feature of bitmaps. 24-bit bitmaps are more complicated because each row needs padding. 8-bit and lower need an additional palette.
struct rgb_data {
float r, g, b;
};
float is not the right type for pixels. Each color component goes from 0 to 255, which fits in an unsigned char. You need instead
struct rgb_data {
    unsigned char b, g, r, alpha;
};
The members are declared in the order the bytes are stored in the file (blue, green, red). The alpha is the extra byte for a 32-bit bitmap (which we won't use). Notice the size of this structure is 4 bytes. You can allocate this as
struct rgb_data *rgb = malloc(size);
Now you can access the pixels as follows:
int p = (row * width + column);
rgb[p].r = 255;
rgb[p].g = 0;
rgb[p].b = 0;
...
fwrite(rgb, 4, width * height, fout);

Issue displaying IDirect3DTexture8 after backporting from IDirect3DTexture9

I'm trying to backport someone's Direct3D 9 port of Quake 1 by id Software to Direct3D 8 so I can port it to the original Xbox (which only uses the D3D8 API).
After making the changes to use Direct3D 8, it displays some mashed-up pixels on the screen that appear to be in little squares :/ (see pictures).
Does anyone know what's gone wrong here? It works flawlessly with D3D9. Are there some extra arguments required for D3D8 that I'm missing, the rect pitch maybe?
The data being passed in is a Quake 1 .lmp 2D image file. "It consists of two integers (width and height) followed by a string of width x height bytes, each of which is an index into the Quake palette"
It's being passed to the D3D_ResampleTexture() function.
Any help would be much appreciated.
Image output using D3D8
Image output using D3D9
The code:
void D3D_ResampleTexture (image_t *src, image_t *dst)
{
int y, x , srcpos, srcbase, dstpos;
unsigned int *dstdata, *srcdata;
// take an unsigned pointer to the dest data that we'll actually fill
dstdata = (unsigned int *) dst->data;
// easier access to src data for 32 bit resampling
srcdata = (unsigned int *) src->data;
// nearest neighbour for now
for (y = 0, dstpos = 0; y < dst->height; y++)
{
srcbase = (y * src->height / dst->height) * src->width;
for (x = 0; x < dst->width; x++, dstpos++)
{
srcpos = srcbase + (x * src->width / dst->width);
if (src->flags & IMAGE_32BIT)
dstdata[dstpos] = srcdata[srcpos];
else if (src->palette)
dstdata[dstpos] = src->palette[src->data[srcpos]];
else Sys_Error ("D3D_ResampleTexture: !(flags & IMAGE_32BIT) without palette set");
}
}
}
void D3D_LoadTextureStage3 (LPDIRECT3DTEXTURE8/*9*/ *tex, image_t *image)
{
int i;
image_t scaled;
D3DLOCKED_RECT LockRect;
memset (&LockRect, 0, sizeof(D3DLOCKED_RECT));
// check scaling here first
for (scaled.width = 1; scaled.width < image->width; scaled.width *= 2);
for (scaled.height = 1; scaled.height < image->height; scaled.height *= 2);
// clamp to max texture size
if (scaled.width > /*d3d_DeviceCaps.MaxTextureWidth*/640) scaled.width = /*d3d_DeviceCaps.MaxTextureWidth*/640;
if (scaled.height > /*d3d_DeviceCaps.MaxTextureHeight*/480) scaled.height = /*d3d_DeviceCaps.MaxTextureHeight*/480;
IDirect3DDevice8/*9*/_CreateTexture(d3d_Device, scaled.width, scaled.height,
(image->flags & IMAGE_MIPMAP) ? 0 : 1,
/*(image->flags & IMAGE_MIPMAP) ? D3DUSAGE_AUTOGENMIPMAP :*/ 0,
(image->flags & IMAGE_ALPHA) ? D3DFMT_A8R8G8B8 : D3DFMT_X8R8G8B8,
D3DPOOL_MANAGED,
tex
);
// lock the texture rectangle
//(*tex)->LockRect (0, &LockRect, NULL, 0);
IDirect3DTexture8/*9*/_LockRect(*tex, 0, &LockRect, NULL, 0);
// fill it in - how we do it depends on the scaling
if (scaled.width == image->width && scaled.height == image->height)
{
// no scaling
for (i = 0; i < (scaled.width * scaled.height); i++)
{
unsigned int p;
// retrieve the correct texel - this will either be direct or a palette lookup
if (image->flags & IMAGE_32BIT)
p = ((unsigned *) image->data)[i];
else if (image->palette)
p = image->palette[image->data[i]];
else Sys_Error ("D3D_LoadTexture: !(flags & IMAGE_32BIT) without palette set");
// store it back
((unsigned *) LockRect.pBits)[i] = p;
}
}
else
{
// save out lockbits in scaled data pointer
scaled.data = (byte *) LockRect.pBits;
// resample data into the texture
D3D_ResampleTexture (image, &scaled);
}
// unlock it
//(*tex)->UnlockRect (0);
IDirect3DTexture8/*9*/_UnlockRect(*tex, 0);
// tell Direct 3D that we're going to be needing to use this managed resource shortly
//FIXME
//(*tex)->PreLoad ();
}
LPDIRECT3DTEXTURE8/*9*/ D3D_LoadTextureStage2 (image_t *image)
{
d3d_texture_t *tex;
// look for a match
// create a new one
tex = (d3d_texture_t *) malloc (sizeof (d3d_texture_t));
// link it in
tex->next = d3d_Textures;
d3d_Textures = tex;
// fill in the struct
tex->LastUsage = 0;
tex->d3d_Texture = NULL;
// copy the image
memcpy (&tex->TexImage, image, sizeof (image_t));
// upload through direct 3d
D3D_LoadTextureStage3 (&tex->d3d_Texture, image);
// return the texture we got
return tex->d3d_Texture;
}
LPDIRECT3DTEXTURE8/*9*/ D3D_LoadTexture (char *identifier, int width, int height, byte *data, /*bool*/qboolean mipmap, /*bool*/qboolean alpha)
{
image_t image;
image.data = data;
image.flags = 0;
image.height = height;
image.width = width;
image.palette = d_8to24table;
strcpy (image.identifier, identifier);
if (mipmap) image.flags |= IMAGE_MIPMAP;
if (alpha) image.flags |= IMAGE_ALPHA;
return D3D_LoadTextureStage2 (&image);
}
When you lock the texture, you have to observe the returned Pitch member of the D3DLOCKED_RECT structure. Your code is assuming that all the data is contiguous, but the Pitch can be larger than the width of a scanline in order to allow for locking a subregion and other layouts of the buffer that don't have contiguous pixels at the end of one scanline to the beginning of the next.
Look at Chapter 4 of my book "The Direct3D Graphics Pipeline" to see an example of accessing a surface and using the Pitch properly.
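As a rough illustration of what observing the pitch means, here is a sketch that copies a tightly packed 32-bit image into a destination whose rows are pitch bytes apart. It deliberately avoids the Direct3D headers so it compiles anywhere: the function name copy_rows_with_pitch is hypothetical, dst_bits stands in for LockRect.pBits and pitch for LockRect.Pitch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy a tightly packed 32-bit image into a buffer whose rows start
 * `pitch` bytes apart. The pitch may be larger than width * 4, so the
 * destination row pointer is advanced in bytes, never by a flat index. */
static void copy_rows_with_pitch(void *dst_bits, int pitch,
                                 const uint32_t *src, int width, int height)
{
    assert(pitch >= width * (int)sizeof(uint32_t));
    for (int y = 0; y < height; y++) {
        uint8_t *dst_row = (uint8_t *)dst_bits + (size_t)y * (size_t)pitch;
        memcpy(dst_row, src + (size_t)y * (size_t)width,
               (size_t)width * sizeof(uint32_t));
    }
}
```

In the code from the question, the same idea would apply to the unscaled loop and to D3D_ResampleTexture: compute the start of each destination row from the pitch instead of writing through a single running index.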
For anyone else who comes across this issue: it was due to the way the image was being loaded into the Xbox's memory. It needed to be swizzled.

Create Bitmap in C

I'm trying to create a Bitmap that shows the flightpath of a bullet.
int drawBitmap(int height, int width, Point* curve, char* bitmap_name)
{
int image_size = width * height * 3;
int padding = width - (width % 4);
struct _BitmapFileheader_ BMFH;
struct _BitmapInfoHeader_ BMIH;
BMFH.type_[1] = 'B';
BMFH.type_[2] = 'M';
BMFH.file_size_ = 54 + height * padding;
BMFH.reserved_1_ = 0;
BMFH.reserved_2_ = 0;
BMFH.offset_ = 54;
BMIH.header_size_ = 40;
BMIH.width_ = width;
BMIH.height_ = height;
BMIH.colour_planes_ = 1;
BMIH.bit_per_pixel_ = 24;
BMIH.compression_ = 0;
BMIH.image_size_ = image_size + height * padding;
BMIH.x_pixels_per_meter_ = 2835;
BMIH.y_pixels_per_meter_ = 2835;
BMIH.colours_used_ = 0;
BMIH.important_colours_ = 0;
writeBitmap(BMFH, BMIH, curve, bitmap_name);
}
void* writeBitmap(struct _BitmapFileheader_ file_header,
struct _BitmapInfoHeader_ file_infoheader, void* pixel_data, char* file_name)
{
FILE* image = fopen(file_name, "w");
fwrite((void*)&file_header, 1, sizeof(file_header), image);
fwrite((void*)&file_infoheader, 1, sizeof(file_infoheader), image);
fwrite((void*)pixel_data, 1, sizeof(pixel_data), image);
fclose(image);
return 0;
}
Curve is the return value from the function which calculates the path. It points at an array of Points, which is a struct of x and y coordinates.
I don't really know how to "put" the data into the Bitmap correctly.
I just started programming C recently and I'm quite lost at the moment.
You already know about taking up any slack space in each pixel row, but I see a problem in your calculation. Each pixel row must have length % 4 == 0. So with 3 bytes per pixel (24-bit)
length = ((3 * width) + 3) & -4; // -4 as I don't know the int size, say 0xFFFFFFFC
Look up the structure of a bitmap - perhaps you already have. Declare (or allocate) an image byte array size height * length and fill it with zeros. Parse the bullet trajectory and find the range of x and y coordinates. Scale these to the bitmap size width and height. Now parse the bullet trajectory again, scaling the coordinates to xx and yy, and write three 0xFF bytes (you specified 24-bit colour) into the correct place in the array for each bullet position.
if (xx >= 0 && xx < width && yy >= 0 && yy < height) {
index = yy * length + xx * 3;
bitmap [index] = 0xFF;
bitmap [index + 1] = 0xFF;
bitmap [index + 2] = 0xFF;
}
Finally save the bitmap info, header and image data to file. When that works, you can refine your use of colour.
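The steps above can be sketched in C as follows. This is a minimal sketch under stated assumptions, not the asker's code: the names bmp_row_length, put_le32, fill_bmp24_header and plot_white are hypothetical, the header is built byte by byte so it does not depend on struct padding or host byte order, and the bullet coordinates are assumed to be already scaled to the bitmap.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Bytes per pixel row of a 24-bit bitmap, rounded up to a multiple of 4. */
static int bmp_row_length(int width)
{
    return (3 * width + 3) & ~3;
}

/* Store a 32-bit value in little-endian byte order, as BMP headers require. */
static void put_le32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v & 0xFF);
    p[1] = (unsigned char)((v >> 8) & 0xFF);
    p[2] = (unsigned char)((v >> 16) & 0xFF);
    p[3] = (unsigned char)((v >> 24) & 0xFF);
}

/* Fill the 54 header bytes (14-byte file header + 40-byte info header). */
static void fill_bmp24_header(unsigned char header[54], int width, int height)
{
    assert(width > 0 && height > 0);
    uint32_t image_size = (uint32_t)bmp_row_length(width) * (uint32_t)height;
    memset(header, 0, 54);
    header[0] = 'B';
    header[1] = 'M';
    put_le32(header + 2, 54 + image_size);   /* total file size */
    put_le32(header + 10, 54);               /* offset of the pixel data */
    put_le32(header + 14, 40);               /* info header size */
    put_le32(header + 18, (uint32_t)width);
    put_le32(header + 22, (uint32_t)height); /* positive: rows are bottom-up */
    header[26] = 1;                          /* colour planes */
    header[28] = 24;                         /* bits per pixel */
    put_le32(header + 34, image_size);       /* padded pixel data size */
}

/* Set pixel (xx, yy) to white; out-of-range coordinates are ignored. */
static void plot_white(unsigned char *bitmap, int width, int height,
                       int xx, int yy)
{
    if (xx >= 0 && xx < width && yy >= 0 && yy < height) {
        int index = yy * bmp_row_length(width) + xx * 3;
        bitmap[index] = 0xFF;
        bitmap[index + 1] = 0xFF;
        bitmap[index + 2] = 0xFF;
    }
}
```

To produce the file, allocate bmp_row_length(width) * height zeroed bytes, call plot_white for each scaled point, then fwrite the 54 header bytes followed by the pixel array to a file opened with "wb" (binary mode matters on Windows). Because the rows are stored bottom-up, yy = 0 is the bottom row of the picture.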

ARToolkit using a still image

I'm trying to use ARtoolkit, but with a static image instead of a video stream. I need to be able to load an image, identify markers, and locate them. I'm using SDL for loading the image. I'm able to obtain the RGB values for each pixel from the loaded image, but I'm unsure how to format the data for ARToolkit to work with it.
ARToolkit stores its images as type ARUint8* (an unsigned char*). I'm confused as to how this format works. Right now I have this code inside the main loop that runs continuously as the program is executing. This code (should) print out the RGB values for each pixel in the frame.
ARUint8* dataPtr;
dataPtr = arVideoGetImage(); // Get a new frame from the webcam
int width, height;
if (arVideoInqSize(&width, &height) == 0) // if width and height could be obtained
{
for (int y = 0; y < height; y++)
{
for (int x = 0; x < width; x++)
{
printf("pixel %i, %i: %i, %i, %i\n", x, y, dataPtr[(y * 320) + x], dataPtr[(y * 320) + x + 1], dataPtr[(y * 320) + x + 2]);
}
}
}
Typical output:
pixel 5, 100: 0, 0, 0
pixel 6, 100: 178, 3, 0
pixel 7, 100: 0, 0, 177
etc...
It seems to be accessing the RGB values correctly, but I'm unsure how to copy over the image data (from SDL's format) into this new format.
Figured it out. Posting answer in case anyone else ever needs it.
On Windows, ARToolkit defaults to BGRA for the dataPtr array. The following function loads an image (using SDL) and returns a pointer to an ARUint8 buffer that contains the image data.
ARUint8* loadImage(char* filename, int* w, int* h)
{
SDL_Surface* img = IMG_Load(filename);
if (!img)
{
printf("Image '%s' failed to load. Error: %s\n", filename, IMG_GetError());
return NULL;
}
*w = img->w; // Assign width and height to the given pointers
*h = img->h;
ARUint8* dataPtr = (ARUint8*)calloc(img->w * img->h * 4, sizeof(ARUint8)); // Allocate space for image data
// Write image data to the dataPtr variable
for (int y = 0; y < img->h; y++)
{
for (int x = 0; x < img->w; x++)
{
Uint8 r, g, b;
SDL_GetRGB(getpixel(img, x, y), img->format, &r, &g, &b); // Get the RGB values
int i = ((y * img->w) + x) * 4; // Figure out index in array
dataPtr[i] = b;
dataPtr[i + 1] = g;
dataPtr[i + 2] = r;
dataPtr[i + 3] = 0; // Alpha
}
}
SDL_FreeSurface(img);
return dataPtr;
}
The getpixel function is borrowed from here: http://sdl.beuc.net/sdl.wiki/Pixel_Access
This function allowed me to use a photograph instead of a video feed from a webcam.

What is the simplest RGB image format?

I am working in C on a physics experiment, Young's interference experiment, and I wrote a program that writes a huge bunch of pixels to a file:
for (i=0; i < width*width; i++)
{
fwrite(hue(raster_matrix[i]), 1, 3, file);
}
Where hue, when given a value [0..255], gives back a char * with 3 bytes, R,G,B.
I would like to put a minimal header in my image file in order to make this raw file a valid image file.
More concisely, I want to switch from:
offset
0000 : height * width : data } my data, 24bit RGB pixels
to:
offset
0000 : dword : magic \
: /* ?? */ \
0012 : dword : height } Header <--> common image file
0016 : dword : width /
: /* ?? */ /
0040 : height * width : data } my data, 24bit RGB pixels
You probably want the PPM format, which is what you're looking for: a minimal header followed by raw RGB.
TARGA (file name extension .tga) may be the simplest widely supported binary image file format if you don't use compression and don't use any of its extensions. It's even simpler than Windows .bmp files and is supported by ImageMagick and many paint programs. It has been my go-to format when I just need to output some pixels from a throwaway program.
Here's a minimal C program to generate an image to standard output:
#include <stdio.h>
#include <string.h>
enum { width = 550, height = 400 };
int main(void) {
static unsigned char pixels[width * height * 3];
static unsigned char tga[18];
unsigned char *p;
size_t x, y;
p = pixels;
for (y = 0; y < height; y++) {
for (x = 0; x < width; x++) {
*p++ = 255 * ((float)y / height);
*p++ = 255 * ((float)x / width);
*p++ = 255 * ((float)y / height);
}
}
tga[2] = 2;
tga[12] = 255 & width;
tga[13] = 255 & (width >> 8);
tga[14] = 255 & height;
tga[15] = 255 & (height >> 8);
tga[16] = 24;
tga[17] = 32;
return !((1 == fwrite(tga, sizeof(tga), 1, stdout)) &&
(1 == fwrite(pixels, sizeof(pixels), 1, stdout)));
}
The recently created farbfeld format is quite minimal, though there is not much software supporting it (at least so far).
Bytes │ Description
8 │ "farbfeld" magic value
4 │ 32-Bit BE unsigned integer (width)
4 │ 32-Bit BE unsigned integer (height)
(2+2+2+2)*width*height │ 4*16-Bit BE unsigned integers [RGBA] / pixel, row-major
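The layout in the table can be sketched as an in-memory encoder in C. This is an illustration only: the function names are hypothetical, the input is assumed to be 8-bit RGB with 3 bytes per pixel (like the output of hue in the question), each 8-bit channel is widened to 16 bits by multiplying by 257, and alpha is set to fully opaque.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Store values in big-endian byte order, as the format requires. */
static void put_be32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v >> 24);
    p[1] = (unsigned char)((v >> 16) & 0xFF);
    p[2] = (unsigned char)((v >> 8) & 0xFF);
    p[3] = (unsigned char)(v & 0xFF);
}

static void put_be16(unsigned char *p, uint16_t v)
{
    p[0] = (unsigned char)(v >> 8);
    p[1] = (unsigned char)(v & 0xFF);
}

/* Total size in bytes: 16-byte header plus 8 bytes per pixel. */
static size_t farbfeld_size(uint32_t width, uint32_t height)
{
    return 16 + (size_t)8 * width * height;
}

/* Encode row-major 8-bit RGB into `out`, which must hold
 * farbfeld_size(width, height) bytes. */
static void farbfeld_encode(unsigned char *out, const unsigned char *rgb,
                            uint32_t width, uint32_t height)
{
    assert(out != NULL && rgb != NULL);
    memcpy(out, "farbfeld", 8);
    put_be32(out + 8, width);
    put_be32(out + 12, height);
    unsigned char *p = out + 16;
    for (size_t i = 0; i < (size_t)width * height; i++) {
        for (int c = 0; c < 3; c++, p += 2)
            put_be16(p, (uint16_t)(rgb[3 * i + c] * 257)); /* 0..255 -> 0..65535 */
        put_be16(p, 0xFFFF); /* opaque alpha */
        p += 2;
    }
}
```

The encoded buffer can then be written with a single fwrite to a file opened in "wb" mode.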
Here's a minimal example that writes your image file with a minimal PPM header. Happily, I was able to get it to work with the exact for loop you've provided:
#include <math.h> // compile with gcc young.c -lm
#include <stdio.h>
#include <stdlib.h>
#define width 256
int main(){
int x, y, i; unsigned char raster_matrix[width*width], h[256][3];
#define WAVE(x,y) sin(sqrt( (x)*(x)+(y)*(y) ) * 30.0 / width)
#define hue(i) h[i]
/* Setup nice hue palette */
for (i = 0; i <= 85; i++){
h[i][0] = h[i+85][1] = h[i+170][2] = (i <= 42)? 255: 40+(85-i)*5;
h[i][1] = h[i+85][2] = h[i+170][0] = (i <= 42)? 40+i*5: 255;
h[i][2] = h[i+85][0] = h[i+170][1] = 40;
}
/* Setup Young's Interference image */
for (i = y = 0; y < width; y++) for (x = 0; x < width; x++)
raster_matrix[i++] = 128 + 64*(WAVE(x,y) + WAVE(x,width-y));
/* Open PPM File */
FILE *file = fopen("young.ppm", "wb"); if (!file) return -1;
/* Write PPM Header */
fprintf(file, "P6 %d %d %d\n", width, width, 255); /* width, height, maxval */
/* Write Image Data */
for (i=0; i < width*width; i++)
fwrite(hue(raster_matrix[i]), 1, 3, file);
/* Close PPM File */
fclose(file);
/* All done */
return 0;
}
The header code is based on the specs at http://netpbm.sourceforge.net/doc/ppm.html. For this image, the header is just a string of fifteen bytes: "P6 256 256 255\n".
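The header length is easy to check for other image sizes, since snprintf returns the number of characters it formats. A small sketch, where ppm_header is a hypothetical helper name rather than anything from netpbm:

```c
#include <assert.h>
#include <stdio.h>

/* Format a binary PPM (P6) header into buf. Returns the header length in
 * bytes, or -1 if buf is too small to hold it. */
static int ppm_header(char *buf, size_t cap, int w, int h, int maxval)
{
    assert(buf != NULL);
    int n = snprintf(buf, cap, "P6 %d %d %d\n", w, h, maxval);
    return (n < 0 || (size_t)n >= cap) ? -1 : n;
}
```

For a 256 x 256 image with maxval 255 this returns 15, matching the count given above. The pixel data starts immediately after the final newline.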
