I am trying to compute the distance transform (DT) for each pixel of a binary image, using the OpenCV library for C. By the definition of the DT, every zero (black) pixel should have the value 0 after the transform, and every 255 (white) pixel should hold the shortest distance to a zero (black) pixel.
Here is the code:
IplImage *im = cvLoadImage("black_white.jpg", CV_LOAD_IMAGE_GRAYSCALE);
IplImage *tmp = cvCreateImage(cvGetSize(im), 32, 1);
cvThreshold(im, im, 128, 255, CV_THRESH_BINARY_INV);
//cvSaveImage("out.jpg", im);
cvDistTransform(im, tmp, CV_DIST_L1, 3, 0, 0 );
d = (uchar*)tmp->imageData;
da = (uchar*)im->imageData;
for(i=0;i<tmp->height;i++)
for(j=0;j<tmp->width;j++)
{
//if((int)da[i*im->widthStep + j] == 255)
fprintf(f, "pixel value = %d DT = %d\n", (int)da[i*im->widthStep + j], (int)d[i*tmp->widthStep + j]);
}
cvShowImage("H", tmp);
cvWaitKey(0);
cvDestroyWindow("H");
fclose(f);
I write the pixel values along with their DT values to a file. As it turns out, some of the 0 pixels have DT values like 65, 128, etc., i.e. they are not 0. Moreover, I also have some white pixels with a DT value of 0 (which, I guess, shouldn't happen, as it should be at least 1).
Any kind of help will be appreciated.
Thanks in advance.
I guess it is because of CV_THRESH_BINARY_INV, which inverts your image. So the areas you expect to be white are in fact black for the DT.
Of course, inverting the image may be your intention. Display the image im and compare it with tmp for verification.
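One more thing that may be worth checking, independently of the inversion: tmp is created with depth 32, and cvDistTransform writes its distances as 32-bit floating-point values, so indexing tmp->imageData through a uchar pointer reads single bytes of each float rather than the distance itself. Below is a minimal sketch in plain C (no OpenCV needed; the helper name dt_at is hypothetical) of reading such a buffer through its row stride in bytes:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: return the float stored at (row, col) of a
   single-channel 32-bit float image whose rows are width_step bytes apart. */
float dt_at(const char *image_data, int width_step, int row, int col)
{
    float value;
    assert(image_data != NULL && width_step > 0 && row >= 0 && col >= 0);
    /* memcpy avoids alignment and aliasing problems of a plain pointer cast */
    memcpy(&value,
           image_data + (size_t)row * (size_t)width_step + (size_t)col * sizeof(float),
           sizeof value);
    return value;
}
```

With the images from the question this would presumably be called as dt_at(tmp->imageData, tmp->widthStep, i, j) and printed with %f instead of %d.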
I've been working on a bitmap loader, with the main goal to do nothing more than parse the data properly and render it in OpenGL. I'm at the point where I need to draw the pixels on an x/y (i.e., pixel by pixel) basis (at least, this is what I think I need to do as far as rendering is concerned). I've already bound the texture object and called glTexImage2D(...).
Currently, what I'm having trouble with is the pixel by pixel algorithm.
As far as I understand it, bitmap (aka DIB) files store color data in what is known as the pixel array. Each row of pixels consists of a fixed number of bytes, with each pixel occupying 4 bytes (32 bits per pixel), 3 bytes (24 bits per pixel), 2 bytes (16 bits per pixel), or 1 byte (8 bits per pixel).
I think I need to loop through the pixels while calculating, for each pixel's x/y coordinate, the right offset within the pixel array. Is this true, though? If not, what should I do? I'm honestly slightly confused as to whether this approach is correct, despite doing what was suggested to me in this question I asked some time ago.
I assume that going about it on a pixel-by-pixel basis is the right approach, mainly because rendering a quad with glVertex* and glTexCoord* produced nothing more than a grayed-out rectangle (at the time I thought OpenGL would handle this by itself, which is why I attempted that in the first place).
I should also note that, while my question displays OpenGL 3.1 shaders, I moved to SDL 1.2 so I could just use immediate mode for the time being until I got the right algorithms implemented, and then switch back to modern GL.
The test image I'm parsing:
Its data output (pastebinned due to its length):
http://pastebin.com/6RVhAVRq
And the code:
void R_RenderTexture_PixByPix( texture_bmp_t* const data, const vec3 center )
{
glBindTexture( GL_TEXTURE_2D, data->texbuf_id );
glBegin( GL_POINTS );
{
const unsigned width = data->img_data->width + ( unsigned int ) center[ VEC_X ];
const unsigned height = data->img_data->height + ( unsigned int ) center[ VEC_Y ];
const unsigned bytecount = GetByteCount( data->img_data->bpp );
const unsigned char* pixels = data->img_data->pixels;
unsigned color_offset = 0;
unsigned x_pixel;
for ( x_pixel = center[ VEC_X ]; x_pixel < width; ++x_pixel )
{
unsigned y_pixel;
for ( y_pixel = center[ VEC_Y ]; y_pixel < height; ++y_pixel )
{
}
const bool do_color_update = true; //<--- replace true with a condition which checks to see if the color needs to be updated.
if ( do_color_update )
{
glColor3fv( pixels + color_offset );
}
color_offset += bytecount;
}
}
glEnd();
glBindTexture( GL_TEXTURE_2D, 0 );
}
You're completely missing the point of an OpenGL texture in your code. The texture holds the image for you, and the rasterizer does all the iterations over the pixel data for you. No need to write a slow pixel-pusher loop yourself.
As your code stands right now, that texture is completely bogus and does nothing. You could omit the calls to glBindTexture entirely and it would behave the same, which is to say it would draw nothing, because you never actually draw anything; you just set the glColor state. To draw something you'd have to call glVertex.
So why not leverage the pixel-pushing performance of modern GPUs and actually use a texture? How about this:
void R_RenderTexture_PixByPix( texture_bmp_t* const data, const vec3 center )
{
if( 0 == data->texbuf_id ) {
glGenTextures(1, &(data->texbuf_id));
glBindTexture( GL_TEXTURE_2D, data->texbuf_id );
glPixelStorei(GL_UNPACK_ALIGNMENT, 1);
// there are a few more, but the defaults are ok
// if you didn't change them no need for further unpack settings
GLenum internal_format;
GLenum format;
GLenum type;
switch(data->img_data->bpp) {
case 8:
// this could be a palette or grayscale
internal_format = GL_LUMINANCE8;
format = GL_LUMINANCE;
type = GL_UNSIGNED_BYTE;
break;
case 15:
internal_format = GL_RGB5;
format = GL_BGRA; // packed 1-5-5-5 types need a 4-component format
type = GL_UNSIGNED_SHORT_1_5_5_5_REV; // blue in the low bits, as BMP stores it
break;
case 16:
internal_format = GL_RGB8;
format = GL_RGB; // 5-6-5 is only valid with GL_RGB; red sits in the high bits
type = GL_UNSIGNED_SHORT_5_6_5;
break;
case 24:
internal_format = GL_RGB8;
format = GL_BGR; // BMP files have BGR pixel order
type = GL_UNSIGNED_BYTE;
break;
case 32:
internal_format = GL_RGB8;
format = GL_BGRA; // 4 bytes per pixel, stored as B, G, R, A
type = GL_UNSIGNED_BYTE;
break;
}
glTexImage2D( GL_TEXTURE_2D, 0, internal_format,
data->img_data->width, data->img_data->height, 0,
format, type, data->img_data->pixels );
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
} else {
glBindTexture( GL_TEXTURE_2D, data->texbuf_id );
}
static GLfloat verts[] = {
0, 0,
1, 0,
1, 1,
0, 1
};
// the following is to address texture image pixel centers
// tex coordinates 0 and 1 are not on pixel centers!
float const tex_width = (float)data->img_data->width;
float const tex_height = (float)data->img_data->height;
float const s0 = 1. / (2.*tex_width);
float const s1 = ( 2.*(tex_width-1) + 1.) / (2.*tex_width);
float const t0 = 1. / (2.*tex_height);
float const t1 = ( 2.*(tex_height-1) + 1.) / (2.*tex_height);
GLfloat texcoords[] = {
s0, t0,
s1, t0,
s1, t1,
s0, t1
};
glEnableClientState(GL_VERTEX_ARRAY);
glEnableClientState(GL_TEXTURE_COORD_ARRAY);
glEnable(GL_TEXTURE_2D);
glVertexPointer(2, GL_FLOAT, 0, verts);
glTexCoordPointer(2, GL_FLOAT, 0, texcoords);
glColor4f(1., 1., 1., 1.);
glDrawArrays(GL_QUADS, 0, 4);
glDisableClientState(GL_VERTEX_ARRAY);
glDisableClientState(GL_TEXTURE_COORD_ARRAY);
glBindTexture( GL_TEXTURE_2D, 0 );
}
Your intuition is basically correct. The pixels are stored as an array of bytes, but the bytes are arranged into consecutive groups, with each group representing a single pixel. To address a single pixel, you'll need to do a calculation like this:
unsigned char* image_data = start_of_pixel_data;
unsigned char* pixel_addr = image_data + bytes_per_pixel * (y * width_in_pixels + x);
Be careful about the row width, as there is often padding at the end of each row to bring the total row width in bytes up to a multiple of some alignment (for BMP files, a multiple of 4 bytes). When rows are padded, step from one row to the next by the padded row size in bytes rather than by bytes_per_pixel * width_in_pixels. I recommend looking at the actual bytes of the bitmap in hex first to get a sense of what is going on. It's a great learning exercise and will give you high confidence in your pixel-walking code, which is what you want. You might be able to use a debugger to do this, or else write a simple loop with printf over the image bytes.
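A minimal sketch of that address calculation with row padding taken into account, assuming the BMP convention that every stored row is padded to a multiple of 4 bytes and that the bit depth is a whole number of bytes. The function names bmp_row_stride and bmp_pixel_addr are made up for this example:

```c
#include <assert.h>
#include <stddef.h>

/* Bytes per stored row: BMP rows are padded up to a multiple of 4 bytes. */
size_t bmp_row_stride(size_t width_in_pixels, size_t bits_per_pixel)
{
    assert(bits_per_pixel > 0);
    return ((width_in_pixels * bits_per_pixel + 31) / 32) * 4;
}

/* Address of pixel (x, y), counting rows in the order they are stored. */
const unsigned char *bmp_pixel_addr(const unsigned char *image_data,
                                    size_t width_in_pixels,
                                    size_t bits_per_pixel,
                                    size_t x, size_t y)
{
    assert(x < width_in_pixels && bits_per_pixel % 8 == 0);
    return image_data
         + y * bmp_row_stride(width_in_pixels, bits_per_pixel)
         + x * (bits_per_pixel / 8);
}
```

Note that BMP files with a positive height store their rows bottom-up, so y = 0 here is the bottom row of the picture unless the loader has already flipped it.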
The CvMat type 16 corresponds to "CV_AA". Is there an easy conversion between this and the type CV_32F?
Something in the same vein as cvCvtColor(cimg,gimg,CV_BGR2GRAY);?
CV_AA is used for telling drawing functions (i.e., line, circle, fonts, etc) to perform anti-aliased drawing; I don't believe it is a proper Mat data-type. As you can see in core_c.h, it is defined in the drawing functions section.
Could you show the code where you are receiving this data-type from?
EDIT : I think I see what's going on :)
Given that CV_8U is this:
#define CV_8U 0
And CV_MAKETYPE is:
#define CV_MAKETYPE(depth,cn) (CV_MAT_DEPTH(depth) + (((cn)-1) << CV_CN_SHIFT))
where cn is the number of channels, and CV_CN_SHIFT is 3. I'm betting the type 16 you are seeing is actually
(0 + ((3 - 1) << 3)) -> 16, also known as CV_8UC3.
So, you have an 8-bit, 3-channel image (24 bits per pixel), not a CV_AA image :)
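The same arithmetic can be run in reverse to decode a type code. The sketch below uses local stand-ins for the OpenCV constants quoted above (renamed so that they cannot clash with the real headers) and assumes a plain type value such as 16, not a full flags word:

```c
#include <assert.h>

/* Local copies of the values quoted above; hypothetical names. */
#define MY_CN_SHIFT   3
#define MY_DEPTH_MASK ((1 << MY_CN_SHIFT) - 1)

/* Depth code stored in the low 3 bits (0 is CV_8U, 5 is CV_32F). */
int mat_depth(int type)
{
    assert(type >= 0);
    return type & MY_DEPTH_MASK;
}

/* Channel count stored, minus one, in the bits above the depth. */
int mat_channels(int type)
{
    assert(type >= 0);
    return (type >> MY_CN_SHIFT) + 1;
}
```

For type 16 this yields depth 0 (CV_8U) with 3 channels, and for type 21 it yields depth 5 (CV_32F) with 3 channels, which is the type the conversion is aiming for.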
You need to convert each channel from CV_8U to CV_32F.
EDIT : Take a look at using cvSplit and cvMerge (I haven't used the C interface in a while, but it should be something like the following):
IplImage* src = cvCreateImage( size, IPL_DEPTH_8U, 3 ); // CV_8UC3
// cvSplit needs single-channel planes of the same size and depth
IplImage* r8u = cvCreateImage( size, IPL_DEPTH_8U, 1 );
IplImage* g8u = cvCreateImage( size, IPL_DEPTH_8U, 1 );
IplImage* b8u = cvCreateImage( size, IPL_DEPTH_8U, 1 );
IplImage* dst = cvCreateImage( size, IPL_DEPTH_32F, 3 ); // CV_32FC3
IplImage* r32f = cvCreateImage( size, IPL_DEPTH_32F, 1 );
IplImage* g32f = cvCreateImage( size, IPL_DEPTH_32F, 1 );
IplImage* b32f = cvCreateImage( size, IPL_DEPTH_32F, 1 );
// split the channels apart...
cvSplit(src, b8u, g8u, r8u, NULL); // assuming in OpenCV BGR order here...may be RGB...
// convert the data...
cvConvertScale(b8u, b32f, 1, 0);
cvConvertScale(g8u, g32f, 1, 0);
cvConvertScale(r8u, r32f, 1, 0);
// merge them back together again if you need to, in the same order they were split...
cvMerge(b32f, g32f, r32f, NULL, dst);
Yeah, to convert between types use cvConvertScale() and set the scale param to 1 and shift to 0.
A nice macro for this is:
#define cvConvert(src, dst) cvConvertScale((src), (dst), 1, 0 )
Suppose there is a frame with some image. I want to display only those parts which have a pixel intensity above 120 or 130. How can I do that with OpenCV? Are there any commands to do so?
Then I need to set those parts to an intensity of 190.
As mentioned by astay13, you can use the threshold function like this:
Mat image = imread("someimage.jpg", 0); // flag == 0 means read as grayscale
Mat mask;
// this tells you where locations > 120 pixel intensity are (THRESH_BINARY tests strictly greater than)
threshold(image, mask, 120.0, 255.0, THRESH_BINARY);
// this sets those locations to 190 based on the mask you just created
image.setTo(Scalar(190, 0, 0), mask);
imshow("image", image);
Hope that is helpful!
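For reference, the combined effect of the threshold and setTo calls on a grayscale buffer can be sketched as one plain C loop. This illustrates the per-pixel rule only and is not how OpenCV implements it; the function name set_above_threshold is made up for the example:

```c
#include <assert.h>
#include <stddef.h>

/* Set every pixel strictly brighter than thresh to new_value, in place.
   Returns how many pixels were changed. */
size_t set_above_threshold(unsigned char *pixels, size_t count,
                           unsigned char thresh, unsigned char new_value)
{
    size_t changed = 0;
    assert(pixels != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (pixels[i] > thresh) {
            pixels[i] = new_value;
            changed++;
        }
    }
    return changed;
}
```

A pixel exactly equal to the threshold is left untouched, matching the strict comparison that THRESH_BINARY uses.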
You could try the cvThreshold function. For the second part, cvFloodFill might be what you need.
int main()
{
image_double image;
ntuple_list out;
unsigned int xsize,ysize,depth;
int x,y,i,j,width,height,step;
uchar *p;
IplImage* img = 0;
IplImage* dst = 0;
img = cvLoadImage("D:\\Ahram.jpg",CV_LOAD_IMAGE_COLOR);
width = img->width;
height = img->height;
dst=cvCreateImage(cvSize(width,height),IPL_DEPTH_8U,1);
cvCvtColor(img,dst,CV_RGB2GRAY);
width=dst->width;
height=dst->height;
step=dst->widthstep;
p=(uchar*)dst->imageData;
image=new_image_double(dst->width,dst->height);
xsize=dst->width;
for(i=0;i<height;i++)
{
for(j=0;j<width;j++)
{
image->data[i+j*xsize]=p[i*step+j];
}
}
/* call LSD */
out = lsd(dst);
/* print output */
printf("%u line segments found:\n",out->size);
for(i=0;i<out->size;i++)
{
for(j=0;j<out->dim;j++)
printf("%f ",out->values[ i * out->dim + j ]);
printf("\n");
}
/* free memory */
free_image_double(image);
free_ntuple_list(out);
return 0;
}
N.B.: it compiles without errors, but when I run it, it gives an LSD internal error: invalid image input
Start by researching how PGM is structured:
Each PGM image consists of the following:
1. A "magic number" for identifying the file type.
A pgm image's magic number is the two characters "P5".
2. Whitespace (blanks, TABs, CRs, LFs).
3. A width, formatted as ASCII characters in decimal.
4. Whitespace.
5. A height, again in ASCII decimal.
6. Whitespace.
7. The maximum gray value (Maxval), again in ASCII decimal.
Must be less than 65536, and more than zero.
8. A single whitespace character (usually a newline).
9. A raster of Height rows, in order from top to bottom.
Each row consists of Width gray values, in order from left to right.
Each gray value is a number from 0 through Maxval, with 0 being black
and Maxval being white. Each gray value is represented in pure binary
by either 1 or 2 bytes. If the Maxval is less than 256, it is 1 byte.
Otherwise, it is 2 bytes. The most significant byte is first.
For PGM type P2, pixels are readable (ASCII) on the file, but for P5 they won't be because they will be stored in binary format.
One important thing you should know is that this format stores only 1 channel per pixel. This means PGM can only store grayscale images. Remember this!
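To make the header layout above concrete, here is a small sketch of a P5 header parser in C. It assumes the header sits in a NUL-terminated buffer and, as a simplification, does not handle the '#' comment lines that PGM files may contain; parse_pgm_header is a hypothetical name, not part of LSD or OpenCV:

```c
#include <assert.h>
#include <stdio.h>

/* Parse a binary PGM ("P5") header held in a NUL-terminated buffer.
   Returns the byte offset at which the raster starts, or -1 if the
   header is invalid. Comment lines are not supported in this sketch. */
int parse_pgm_header(const char *buf, int *width, int *height, int *maxval)
{
    int consumed = 0;
    assert(buf != NULL && width != NULL && height != NULL && maxval != NULL);
    /* whitespace in the format matches any run of blanks, TABs, CRs, LFs */
    if (sscanf(buf, "P5 %d %d %d%n", width, height, maxval, &consumed) != 3)
        return -1;
    if (*width <= 0 || *height <= 0 || *maxval <= 0 || *maxval >= 65536)
        return -1;
    /* exactly one whitespace byte separates Maxval from the raster */
    return consumed + 1;
}
```

The raster then holds width * height samples of 1 byte each when maxval is below 256, and 2 bytes each otherwise.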
Now, if you're using OpenCV to load images from a file, you should load them using CV_LOAD_IMAGE_GRAYSCALE:
IplImage* cv_img = cvLoadImage("chairs.png", CV_LOAD_IMAGE_GRAYSCALE);
if(!cv_img)
{
std::cout << "ERROR: cvLoadImage failed" << std::endl;
return -1;
}
But if you use any other flag on this function or if you create an image with cvCreateImage(), or if you're capturing frames from a camera or something like that, you'll need to convert each frame to its grayscale representation using cvCvtColor().
I downloaded lsd-1.5 and noticed that there is an example there that shows how to use the library. One of the source code files, named lsd_cmd.c, manually reads a PGM file and assembles an image_double with it. The function that does this trick is read_pgm_image_double(), and it reads the pixels from a PGM file and stores them inside image->data. This is important because if the following does not work, you'll have to iterate on the pixels of IplImage and do this yourself.
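After successfully loading a grayscale image into IplImage* cv_img, you can create the structure you need and fill it pixel by pixel. A plain cast of cv_img->imageData cannot work, because it holds 8-bit pixels (with rows widthStep bytes apart) while image->data is an array of double:
image_double image = new_image_double(cv_img->width, cv_img->height);
for (int y = 0; y < cv_img->height; y++)
for (int x = 0; x < cv_img->width; x++)
image->data[x + y * cv_img->width] = (uchar)cv_img->imageData[y * cv_img->widthStep + x];
This is the same per-pixel copy, with the proper type conversion, that read_pgm_image_double() performs in the file I suggested above.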
At the end, don't forget to free this resource when you're done using it:
free_image_double(image);
This question helped me some time ago. You have probably solved it already, so sorry for the delay, but I'm sharing the answer now.
I'm using lsd 1.6 and the lsd interface is a little different from the one you are using (they changed the lsd function interface from 1.5 to 1.6).
CvCapture* capture;
capture = cvCreateCameraCapture (0);
assert( capture != NULL );
//get capture properties
int width = cvGetCaptureProperty(capture, CV_CAP_PROP_FRAME_WIDTH);
int height = cvGetCaptureProperty(capture, CV_CAP_PROP_FRAME_HEIGHT);
//create OpenCV image structs
IplImage *frame;
IplImage *frameBW = cvCreateImage( cvSize( width, height ), IPL_DEPTH_8U, 1 );
//create LSD image type
double *image;
image = (double *) malloc( width * height * sizeof(double) );
while (1) {
frame = cvQueryFrame( capture );
if( !frame ) break;
//convert to grayscale
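cvCvtColor( frame , frameBW, CV_BGR2GRAY); // camera frames arrive in BGR order
//cast into LSD image type
uchar *data = (uchar *)frameBW->imageData;
for (int y=0;y<height;y++){
for(int x=0;x<width;x++){
// rows of an IplImage are widthStep bytes apart, which may exceed width
image[ x + y * width ] = data[ x + y * frameBW->widthStep ];
}
}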
//run LSD
double *list;
int n;
list = lsd( &n, image, width, height );
//DO PROCESSING DRAWING ETC
//draw segments on frame
for (int j=0; j<n ; j++){
//define segment end-points
CvPoint pt1 = cvPoint(list[ 0 + j * 7 ],list[ 1 + j * 7 ]);
CvPoint pt2 = cvPoint(list[ 2 + j * 7 ],list[ 3 + j * 7 ]);
// draw line segment on frame
cvLine(frame,pt1,pt2,CV_RGB(255,0,0),1,8,0); // thickness is an int
}
cvShowImage("FRAME WITH LSD",frame);
//free memory
free( (void *) list );
char c = cvWaitKey(1);
if( c == 27 ) break; // ESC QUITS
}
//free memory
free( (void *) image );
cvReleaseCapture( &capture ); // frames from cvQueryFrame belong to the capture; do not release them yourself
cvReleaseImage( &frameBW );
cvDestroyWindow( "FRAME WITH LSD");
Hope this helps you or someone in the future! LSD works really great.
I have two PNG images: the first is 2247 wide and 190 high (width1, height1), and the second is 155 wide and 36 high (width2, height2). I want the second image (src) to be placed in the center of the first image (dest). I created a pixbuf for each and used gdk_pixbuf_composite as follows.
gdk_pixbuf_composite( srcpixbuf, dstpixbuf, 1000, 100, width2, height2, 0, 0, 1, 1, GDK_INTERP_BILINEAR, 255);
I get a hazy window of width2 by height2 on the first image.
If I replace width2 and height2 with 1.0, then the source image does not appear on the destination image at all. Where am I going wrong?
gdk_pixbuf_composite( srcpixbuf, dstpixbuf, 1000, 100, width2, height2, 1000, 100, 1, 1, GDK_INTERP_BILINEAR, 255);
This solved it: I had misunderstood the offset parameters. Basically, an intermediate scaled image is created, and only the part covered by the destination rectangle (destination x, y, width and height) is composited. So in my case the entire unscaled image has to be moved to the destination position, which is what the offset parameters do.
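The working call uses 1000 and 100 as the position, which is not quite the center. Below is a small sketch of computing a centered position from the sizes given in the question; for an unscaled composite the same pair of values would go into both the destination position and the offset arguments. The helper name centered_origin is hypothetical:

```c
#include <assert.h>

/* Top-left coordinate that centres a span of length inner inside outer. */
int centered_origin(int outer, int inner)
{
    assert(inner >= 0 && outer >= inner);
    return (outer - inner) / 2;
}
```

With the sizes above this gives x = centered_origin(2247, 155) = 1046 and y = centered_origin(190, 36) = 77.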