I have a binary image. Within a certain region of interest, I need to count the number of black pixels. There is always the way of looping through the pixels and counting them, but I'm looking for a more efficient method as I need to do it real-time.
I found a way to count the number of nonzero pixels (using cvCountNonZero()). Is there an equivalent function for counting zero pixels? There doesn't seem to be one as far as I've seen. If not, what is the most efficient way to count the black pixels?
I believe the number of zero pixels could be seen as:
int TotalNumberOfPixels = width * height;
int ZeroPixels = TotalNumberOfPixels - cvCountNonZero(cv_image);
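One caveat on the subtraction above: when only a region of interest is counted, the total has to be the area of that region, not of the whole image. Here is a minimal plain-C sketch of the arithmetic. It assumes an 8-bit single-channel image, and count_nonzero_roi is a hypothetical stand-in for what cvCountNonZero() would return for the ROI, not an OpenCV function.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a non-zero count restricted to a ROI.
   Assumes an 8-bit, single-channel image whose rows are `step` bytes apart. */
static int count_nonzero_roi(const uint8_t *data, size_t step,
                             int roi_x, int roi_y, int roi_w, int roi_h)
{
    int count = 0;
    for (int y = 0; y < roi_h; y++) {
        const uint8_t *row = data + (size_t)(roi_y + y) * step + roi_x;
        for (int x = 0; x < roi_w; x++)
            count += (row[x] != 0);
    }
    return count;
}

/* Zero pixels = ROI area minus the non-zero pixels inside the ROI. */
static int count_zero_roi(const uint8_t *data, size_t step,
                          int roi_x, int roi_y, int roi_w, int roi_h)
{
    return roi_w * roi_h
         - count_nonzero_roi(data, step, roi_x, roi_y, roi_w, roi_h);
}
```

With a real library call doing the non-zero count, only the last function's subtraction is needed.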
I'm working on writing a hobby operating system. Currently a large struggle that I'm having is attempting to scroll the framebuffer upwards.
It's simply a 32-bit linear framebuffer.
I have access to a few tools that could be helpful:
some of the mem* functions from libc: memset, memcpy, memmove, and memcmp
direct access to the framebuffer
the width, height, and size in bytes, of said framebuffer
a previous attempt that managed to scroll it up a few lines, albeit EXTREMELY slowly, it took roughly 25 seconds to scroll the framebuffer up by 5 pixels
speaking of which, my previous attempt:
for (uint64_t i = 0; i != atoi(numLines); i++) {
for (uint64_t j = 0; j != bootboot.fb_width; j++) {
for (uint64_t k = 1; k != bootboot.fb_size; k++) {
((uint32_t *)&fb)[k - 1] = ((uint32_t *)&fb)[k];
}
}
}
A few things to note about the above:
numLines is a variable passed into the function, it's a char * that contains the number of lines to scroll up by, in a string. I eventually want this to be the number of actual text lines to scroll up by, but for now treating this as how many pixels to scroll up by is sufficient.
the bootboot struct is provided by the bootloader I use, it contains a few variables that could be of use: fb_width (the width of the framebuffer), fb_height (the height of the framebuffer), and fb_size (the size, in bytes, of the framebuffer)
the fb variable that I'm using the address of is also provided by the bootloader I use, it is a single byte that is placed at the first byte of the framebuffer, hence the need to cast it into a uint32_t * before using it.
Any and all help would be appreciated.
If I read the code correctly, what's happening with the triple nested loops is:
For every line to scroll,
For every pixel that the framebuffer is wide,
For every pixel in the entire framebuffer,
Move that pixel backwards by one.
Essentially you're moving each pixel one pixel distance at a time, so it's no wonder it takes so long to scroll the framebuffer. The total number of pixel moves is (numLines * fb_width * fb_size), so if your framebuffer is 1024x768, that's 5*1024*1024*768 moves, which is 4,026,531,840 moves. That's basically 5000 times the amount of work required.
Instead, you'll want to loop over the framebuffer only once, calculate that pixel's start and its end pointer, and only do the move once. Or you can calculate the source, destination, and size of the move once and then use memmove. Here's my attempt at this (with excessive comments):
// Convert string to integer
uint32_t numLinesInt = atoi(numLines);
// The destination of the move is just the top of the framebuffer
uint32_t* destination = (uint32_t*)&fb;
// Start the move from the top of the framebuffer plus however
// many lines we want to scroll.
uint32_t* source = (uint32_t*)&fb +
(numLinesInt * bootboot.fb_width);
// The total number of pixels to move is the size of the
// framebuffer minus the amount of lines we want to scroll.
uint32_t pixelSize = (bootboot.fb_height - numLinesInt)
* bootboot.fb_width;
// The total number of bytes is that times the size of one pixel.
uint32_t byteSize = pixelSize * sizeof(uint32_t);
// Do the move
memmove(destination, source, byteSize);
I haven't tested this, and I'm making a number of assumptions about how your framebuffer is laid out, so please make sure it works before using it. :)
(P.S. Also, if you put atoi(numLines) inside the end condition of the for loop, atoi will be called every time through the loop, instead of once at the beginning like you intended.)
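Putting the pieces above together, a scroll routine might look like the sketch below. It is untested on real hardware and makes the same assumptions as the snippet above: 32-bit pixels and rows stored back to back with no padding. The name scroll_up and its parameters are made up for illustration; clearing the exposed rows to 0 (black) is an extra step not in the original snippet.

```c
#include <stdint.h>
#include <string.h>

/* Scroll a 32-bit linear framebuffer up by `lines` rows and clear the
   rows exposed at the bottom. Assumes each row is exactly `width_px`
   pixels long (no padding between rows). */
static void scroll_up(uint32_t *fb, uint32_t width_px, uint32_t height_px,
                      uint32_t lines)
{
    if (lines == 0)
        return;
    if (lines > height_px)
        lines = height_px;

    size_t row_bytes = (size_t)width_px * sizeof(uint32_t);
    size_t kept_rows = height_px - lines;

    /* Source and destination overlap, so use memmove rather than memcpy. */
    memmove(fb, fb + (size_t)lines * width_px, kept_rows * row_bytes);

    /* Blank the rows that scrolled into view (0 is black here). */
    memset(fb + kept_rows * width_px, 0, (size_t)lines * row_bytes);
}
```

In the question's setup the call would be something like scroll_up((uint32_t *)&fb, bootboot.fb_width, bootboot.fb_height, n), with n parsed from numLines once, before the call.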
Currently a large struggle that I'm having is attempting to scroll the framebuffer upwards.
The first problem is that the framebuffer is typically much slower to access than RAM (especially reads); so you want to do all the drawing in a buffer in RAM and then blit it efficiently (with a smaller number of much larger writes).
Once you have a buffer in RAM, you can make the buffer bigger than the screen. E.g. for a 1024 x 768 video mode you might have a 1024 x 1024 buffer. In that case small amounts of scrolling can often be done using the same "blit it efficiently" function; but sometimes you'll have to scroll the buffer in RAM.
To scroll the buffer in RAM you can cheat - treat it as a circular buffer and map a second copy into virtual memory immediately after the first. This allows you to (e.g.) copy 768 lines starting from the middle of the first copy without caring about hitting the end of the first buffer. The end result is that you can scroll the buffer in RAM without moving any data or changing the "blit it efficiently" function.
As a bonus, this also minimizes "tearing" artifacts. E.g. often you want to scroll the pixel data up and add more pixel data to the bottom, then blit it (without the user seeing an intermediate "half finished" frame).
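A rough sketch of the circular-buffer idea follows. Note that it swaps in a simpler technique for the double mapping: instead of mapping a second copy of the buffer into virtual memory, the blit is split into at most two memcpy calls at the wrap point. The struct and function names are hypothetical, and it assumes 32-bit pixels and that the screen has no more rows than the buffer.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical back buffer in RAM, treated as a ring of rows. */
struct ring_buffer {
    uint32_t *pixels;   /* buf_rows * width_px pixels */
    uint32_t width_px;
    uint32_t buf_rows;  /* total rows in the ring */
    uint32_t top_row;   /* ring row currently shown at the top of the screen */
};

/* Scrolling only moves the starting row; no pixel data is copied. */
static void ring_scroll_up(struct ring_buffer *rb, uint32_t lines)
{
    rb->top_row = (rb->top_row + lines) % rb->buf_rows;
}

/* Copy `screen_rows` rows to the framebuffer in at most two large writes.
   Assumes screen_rows <= rb->buf_rows. */
static void ring_blit(const struct ring_buffer *rb, uint32_t *fb,
                      uint32_t screen_rows)
{
    size_t row_bytes = (size_t)rb->width_px * sizeof(uint32_t);
    uint32_t first = rb->buf_rows - rb->top_row;  /* rows before the wrap */
    if (first > screen_rows)
        first = screen_rows;

    memcpy(fb, rb->pixels + (size_t)rb->top_row * rb->width_px,
           first * row_bytes);
    memcpy(fb + (size_t)first * rb->width_px, rb->pixels,
           (screen_rows - first) * row_bytes);
}
```

With the double mapping in place, the second memcpy disappears and the blit becomes a single copy starting at top_row.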
I'm trying to play around with image manipulation in C and I want to be able to read and write pixels on an SDL Surface. (I'm loading a bmp to a surface to get the pixel data) I'm having some trouble figuring out how to properly use the following functions.
SDL_CreateRGBSurfaceFrom();
SDL_GetRGB();
SDL_MapRGB();
I have only found examples of these in c++ and I'm having a hard time implementing it in C because I don't fully understand how they work.
so my questions are:
How do you properly retrieve pixel data using SDL_GetRGB, and how is a pixel addressed with x, y coordinates?
What kind of array would I use to store the pixel data?
How do you use SDL_CreateRGBSurfaceFrom() to draw the new pixel data back to a surface?
Also I want to access the pixels individually in a nested for loop for y and x like so.
for(int y = 0; y < h; y++)
{
for (int x = 0; x < w; x++)
{
// get/put the pixel data
}
}
First have a look at SDL_Surface.
The parts you're interested in:
SDL_PixelFormat *format
int w, h
int pitch
void *pixels
What else you should know:
On position x, y (which must be greater or equal to 0 and less than w, h) the surface contains a pixel.
The pixel is described by the format field, which tells us, how the pixel is organized in memory.
The Remarks section of SDL_PixelFormat gives more information on the used datatype.
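The pitch field is the length of one row in bytes. It is usually the width of the surface multiplied by the size of a pixel (BytesPerPixel), but it can be larger when rows are padded, so use pitch rather than w to step from one row to the next.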
With the function SDL_GetRGB, one can easily convert a pixel of any format to a RGB(A) triple/quadruple.
SDL_MapRGB is the reverse of SDL_GetRGB, where one can specify a pixel as RGB(A) triple/quadruple to map it to the closest color specified by the format parameter.
The SDL wiki provides many examples of the specific functions; I think you will find the proper examples there to solve your problem.
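To make the x, y addressing concrete, here is a small sketch in plain C with no SDL calls in it. The helper names are made up, and the read/write helpers only cover the 32-bit-per-pixel case; the pixels, pitch and bytes-per-pixel arguments correspond to the surface fields listed above.

```c
#include <stddef.h>
#include <stdint.h>

/* Address of pixel (x, y): rows are `pitch` bytes apart, and each
   pixel is `bytes_per_pixel` bytes wide. */
static uint8_t *pixel_ptr(void *pixels, int pitch, int bytes_per_pixel,
                          int x, int y)
{
    return (uint8_t *)pixels + (size_t)y * (size_t)pitch
                             + (size_t)x * (size_t)bytes_per_pixel;
}

/* Read and write helpers for the 32-bit case only. */
static uint32_t get_pixel32(void *pixels, int pitch, int x, int y)
{
    return *(uint32_t *)pixel_ptr(pixels, pitch, 4, x, y);
}

static void put_pixel32(void *pixels, int pitch, int x, int y, uint32_t value)
{
    *(uint32_t *)pixel_ptr(pixels, pitch, 4, x, y) = value;
}
```

Inside the nested y/x loop from the question, the value returned by get_pixel32 is what you would hand to SDL_GetRGB together with the surface's format, and the value returned by SDL_MapRGB is what you would pass to put_pixel32. If SDL_MUSTLOCK reports that the surface needs locking, lock it before touching pixels.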
I'm working on a program in which I need to find all lines that pass through a circle of some radius located at some Cartesian point.
At the moment, for every circle, I am iterating over all the lines and checking if the line enters/contacts the circle at any point.
The code essentially looks like this.
for (int i = 0; i < num_circles; i++)
{
for (int j = 0; j < num_lines; j++)
{
if(lineIntersectWithCircle(circle[i], lines[j]))
{
//Append line[j] to a list of lines intersecting with circle[i];
//some code
}
}
}
I've been thinking of many ways to optimize this, but I'm having trouble.
I have sorted the circles by minimum Cartesian distance and sorted lines by maximum distance away. This way you can somewhat optimize, but it's quite minimal because once you reach the point where line[j].max > circle[i].min, you still have to iterate through all the rest of the lines.
I am fine with my intersection checking method, I just would like to minimize the amount of times I need to call it.
Is there a good way of doing this?
Cheapest way is just check the bounding extents/rectangles of the two shapes (line and circle) prior to the more expensive intersection test. Chances are that you can even compute the extents on the fly of the line/circle, not precompute, and still get a decent performance boost unless your line/circle intersection is already dirt cheap.
A really effective approach but one that requires a bit more work is to just create a grid. You can use the bounding rectangles computed above to cheaply see which grid cells your shapes intersect.
struct GridNode
{
// Points to the index of the next node in the grid cell
// or -1 if we're at the end of the singly-linked list.
int next_node;
// Points to the index of the shape being stored.
int shape;
};
struct GridCell
{
// Points to the first node or -1 if the cell is empty.
int first_node;
};
struct Grid
{
// Stores the cells in the grid. This is just illustrative
// code. You should dynamically allocate this with adjustable
// grid widths and heights based on your needs.
struct GridCell cells[grid_width * grid_height];
// Stores the nodes in the grid (one or more nodes per shape
// inserted depending on how many it intersects). This is
// a variable-sized array you can realloc as needed (ex: double
// the size when you're out of room).
struct GridNode* nodes;
// The maximum number of nodes we can store before realloc.
int node_cap;
// The number of nodes inserted so far. realloc when this
// exceeds node_cap.
int node_num;
};
... something to this effect. This way, most of the time you can insert elements to the grid doing nothing more than just some integer (emulating pointers) operations and adding some grid node entries to this variable-sized nodes array. Heap allocations occur very infrequently.
I find in practice this outperforms quad-trees if you have many dynamic elements moving from one cell to the next like in a 2D video game where everything is moving around all the time while we need rapid collision detection, and can even rival quad-trees for searching if you are careful with the memory layout of the nodes to minimize cache misses when iterating through grid cells that intersect the shape you are testing against. You can even do a post-pass after the grid is constructed to rearrange the memory of each node for cache-friendly list iteration based on how efficient you need the intersection searches to be. If you want to get fancy, you can use Bresenham to figure out exactly what grid cells a line intersects, e.g., but given the quadratic complexity of what you're doing, you stand to improve exponentially without bothering with that and just doing it in a very simple way with bounding rectangles.
Basically to find an intersection, first grab the bounding rect of the shape. Then see which cells it intersects in the grid. Now check for intersection with the shapes contained in the grid cells the original shape intersects. This way you can work towards constant-time complexity except for gigantic shapes (worst-case with O(n)) which are hopefully a rare case.
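As a rough illustration of the insert step described above, here is a self-contained sketch. It deviates slightly from the structs shown earlier: the cells are stored as a plain array of head indices, the grid origin is assumed to be at (0, 0) with square cells, and all function names are made up.

```c
#include <stdlib.h>

struct GridNode { int next_node; int shape; };

struct Grid {
    int width, height;       /* grid size in cells */
    float cell_size;         /* side length of one square cell */
    int *first_node;         /* one list head per cell, -1 when empty */
    struct GridNode *nodes;  /* growable pool shared by all cells */
    int node_cap, node_num;
};

/* Map a coordinate to a cell index along one axis, clamped to the grid. */
static int cell_index(float v, float cell_size, int count)
{
    int c = (int)(v / cell_size);
    if (c < 0) return 0;
    if (c >= count) return count - 1;
    return c;
}

static int grid_init(struct Grid *g, int width, int height, float cell_size)
{
    g->width = width; g->height = height; g->cell_size = cell_size;
    g->node_cap = 16; g->node_num = 0;
    g->first_node = malloc(sizeof(int) * (size_t)(width * height));
    g->nodes = malloc(sizeof(struct GridNode) * (size_t)g->node_cap);
    if (!g->first_node || !g->nodes) {
        free(g->first_node); free(g->nodes);
        return -1;
    }
    for (int i = 0; i < width * height; i++)
        g->first_node[i] = -1;
    return 0;
}

/* Insert `shape` into every cell its bounding rectangle overlaps. */
static int grid_insert(struct Grid *g, int shape,
                       float min_x, float min_y, float max_x, float max_y)
{
    int x0 = cell_index(min_x, g->cell_size, g->width);
    int x1 = cell_index(max_x, g->cell_size, g->width);
    int y0 = cell_index(min_y, g->cell_size, g->height);
    int y1 = cell_index(max_y, g->cell_size, g->height);
    for (int y = y0; y <= y1; y++) {
        for (int x = x0; x <= x1; x++) {
            if (g->node_num == g->node_cap) {  /* pool full: double it */
                struct GridNode *p = realloc(
                    g->nodes, sizeof *p * (size_t)g->node_cap * 2);
                if (!p) return -1;
                g->nodes = p;
                g->node_cap *= 2;
            }
            int cell = y * g->width + x;
            /* Push the new node onto the front of the cell's list. */
            g->nodes[g->node_num].shape = shape;
            g->nodes[g->node_num].next_node = g->first_node[cell];
            g->first_node[cell] = g->node_num++;
        }
    }
    return 0;
}

/* Walk one cell's list; a real query would test each shape it visits. */
static int grid_cell_count(const struct Grid *g, int cx, int cy)
{
    int n = 0;
    for (int i = g->first_node[cy * g->width + cx]; i != -1;
         i = g->nodes[i].next_node)
        n++;
    return n;
}
```

For the circle/line problem, one option is to insert every line's bounding rectangle once, then for each circle visit only the cells its bounding rectangle overlaps and run lineIntersectWithCircle on the lines found there. A line that spans several cells shows up more than once, so duplicates need to be skipped.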
I even find use for these in 3 dimensions when things are moving around a lot. They're often cheaper than the octree, BVH, and kd-tree variants which provide extensive search acceleration but at the cost of more expensive builds and updates, and if you use this strategy of a singly-linked list for each grid cell which doesn't have to individually allocate nodes, you can store it in a very reasonable amount of memory even with the 3rd dimension. I wouldn't use a 3-dimensional version of this for raytracing, but it can be very useful for 3D collision detection, like detecting collision between particles moving every single frame.
As with anything, it depends on your use case. If the set of lines is fixed, or lines are only added infrequently, you may want to precompute some of the calculations needed to find out whether any part of the line is within radius distance of the center of the circle.
Start with the equation for the shortest distance between a line and a point, and check whether that distance is less than the radius of the circle:
//abs(Cx*(y1-y0)-Cy*(x1-x0)+x1*y0-y1*x0)/sqrt((y1-y0)*(y1-y0)+(x1-x0)*(x1-x0))<R
//pull out some constants and cache these as they are initialized
//int y10 = y1-y0, //add to the line struct
// x10 = x1 -x0,
// delta = x1*y0-y1*x0,
// sides = (y10)*(y10)+(x10)*(x10);
// R2 = R*R; //add to the circle struct
//now the equation factors down to
//abs(Cx*(y10)-Cy*(x10)+delta)/sqrt(sides)< R //replace constants
//abs(Cx*(y10)-Cy*(x10)+delta) < sqrt(sides) * R //remove division
//pow(Cx*(y10)-Cy*(x10)+delta , 2.0) < sides * R * R //remove sqrt()
//int tmp = Cx*(y10)-Cy*(x10)+delta //factor out pow data
//tmp * tmp < sides * R2 //remove pow() and use cache R squared
//now it is just a few cheap instructions
Now the check should be just 4 integer multiplies, 2 add/subtract and a compare.
bool lineIntersectWithCircle(size_t circle, size_t line){
struct circle C = circle_cache[circle]; //these may be separate arrays
struct line L = line_cache[line]; //from your point data
long tmp = C.x * L.y10 - C.y * L.x10 + L.delta;
return (tmp*tmp < L.sides * C.R2);
}
... but you may want to check my math, it's been a while. Also, I assumed the points would be integers. Change to float as needed; it should still be relatively fast.
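As a sanity check on the algebra above, here is a small sketch that builds the cached values and runs the comparison. The struct layouts and function names are hypothetical, and 64-bit integers are used so the products are less likely to overflow. Note that the formula measures the distance to the infinite line through the two points, not to the segment between them, and that the strict < means a tangent line does not count as a hit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values cached per line, computed once from its two points. */
struct line_cache_entry { int64_t y10, x10, delta, sides; };

/* Values cached per circle: centre and squared radius. */
struct circle_cache_entry { int64_t x, y, R2; };

static struct line_cache_entry make_line(int64_t x0, int64_t y0,
                                         int64_t x1, int64_t y1)
{
    struct line_cache_entry L;
    L.y10 = y1 - y0;
    L.x10 = x1 - x0;
    L.delta = x1 * y0 - y1 * x0;
    L.sides = L.y10 * L.y10 + L.x10 * L.x10;
    return L;
}

static struct circle_cache_entry make_circle(int64_t cx, int64_t cy, int64_t r)
{
    struct circle_cache_entry C = { cx, cy, r * r };
    return C;
}

/* True when the infinite line passes strictly within the radius. */
static bool line_within_radius(struct circle_cache_entry C,
                               struct line_cache_entry L)
{
    int64_t tmp = C.x * L.y10 - C.y * L.x10 + L.delta;
    return tmp * tmp < L.sides * C.R2;
}
```

For the horizontal line through (0,0) and (10,0) and a circle centred at (5,3), the distance is 3, so a radius of 4 reports a hit and a radius of 2 does not.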
If that isn't fast enough you can add additional data for the bounding boxes of the circle and line
bool lineIntersectWithCircle(size_t circle, size_t line){
struct circle C = circle_cache[circle]; //these may be separate arrays
struct line L = line_cache[line]; //from your point data
//if the bounding boxes don't intersect neither does the line
//this may not be _that_ helpful and you would need to:
// figure out the bounding boxes for each line/circle
// and cache additional data
if (C.leftx > L.rightx || L.leftx > C.rightx) //a box is to the side
return 0;
if (C.topy < L.boty || L.topy < C.boty) //a box is below/above
return 0;
//the bounding boxes intersected so check exact calculation
long tmp = C.x * L.y10 - C.y * L.x10 + L.delta;
return (tmp*tmp < L.sides * C.R2);
}
If this function does what I think it does, it seems that on my machine at least in CMYK, C=0, M=0, Y=0 and K=0 does not correspond to white! What is the problem?
float *arr is a float array with size elements. I want to save this array as a JPEG with IJG's libjpeg in two color spaces upon demand: g: Gray scale and c: CMYK. I follow their example and make the input JSAMPLE *jsr array with the number of JSAMPLE elements depending on the color space: size elements for gray scale and 4*size elements for CMYK. JSAMPLE is just another name for unsigned char on my machine at least. The full program can be seen on Github. This is how I fill jsr:
void
floatfilljsarr(JSAMPLE *jsr, float *arr, size_t size, char color)
{
size_t i;
double m;
float min, max;
/* Find the minimum and maximum of the array:*/
fminmax(arr, size, &min, &max);
m=(double)UCHAR_MAX/((double)max-(double)min);
if(color=='g')
{
for(i=0;i<size;i++)
jsr[i]=(arr[i]-min)*m;
}
else
for(i=0;i<size;i++)
{
jsr[i*4+3]=(arr[i]-min)*m;
jsr[i*4]=jsr[i*4+1]=jsr[i*4+2]=0;
}
}
I should note that color has been checked before this function to be either c or g.
I then write the JPEG image exactly as the example.c program in libjpeg's source.
Here is the output after printing both images in a TeX document. Grayscale is on the left and CMYK is on the right. Both images are made from the same ordered array, so the bottom left element (the first in the array as I have defined it and displayed it here) has JSAMPLE value 0 and the top right element has JSAMPLE value 255.
Why aren't the two images similar? Because of the different nature, I would expect the CMYK image to be reversed with its bottom being bright and its top being black. Their displaying JSAMPLE values (the only value in grayscale and the K channel in CMYK) are identical, but this output I get is not what I expected! The CMYK image is also brighter to the top, but very faintly!
It seems that C=0, M=0, Y=0 and K=0 does not correspond to white, at least with this algorithm and on my machine. How can I get white when I only want to change the K channel and keep the rest zero?
Try inverting your K channel:
jsr[i*4+3] = UCHAR_MAX - (arr[i]-min)*m;
I think I found the answer myself. First I tried setting all four colors to the same value. It did produce a reasonable result, but the output was not inverted as I expected: the pixel with the largest value in all four colors was white, not black!
It was then that it suddenly occurred to me that somewhere in the process, either in IJG's libjpeg or in general in the JPEG standard I have no idea which, CMYK colors are reversed. Such that for example a Cyan value of 0 is actually interpreted as UCHAR_MAX on the display or printing device and vice versa. If this was the solution, the fact that the image in the question was so dark and that its grey shade was the same as the greyscale image could easily be explained (since I set all three other colors to zero which was actually interpreted as the maximum intensity!).
So I set the first three CMYK colors to the full range (=UCHAR_MAX):
jsr[i*4]=jsr[i*4+1]=jsr[i*4+2]=UCHAR_MAX /* Was 0 before! */;
Then to my surprise the image worked. The shades of grey in the greyscale image (left) are darker, but at least everything can generally be explained and the two are reasonably similar. I have checked separately and the absolute black color is identical in both, but the shades of grey in greyscale are darker for the same pixel value.
After I checked them in print (below) the results seemed to differ less, although the shades of grey in greyscale are still darker. (Image taken with my smartphone.)
Also, until I made this change, a normal image viewer (I am using Scientific Linux) would show the image as completely black, which is why I thought I couldn't view a CMYK image. After this correction, I could see the CMYK image just like an ordinary image. In fact, in Eye of GNOME (the default image viewer in GNOME), the two appear nearly identical.
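For reference, the fill loop with that change folded in might look like the sketch below. It assumes the inverted convention found above (C = M = Y = UCHAR_MAX meaning no ink, with K carrying the brightness), uses unsigned char in place of JSAMPLE, and takes min and max as arguments instead of calling fminmax. The function name is made up.

```c
#include <limits.h>
#include <stddef.h>

/* Fill an interleaved CMYK buffer (4 samples per pixel) so that it shows
   the same greys as the greyscale buffer, under the inverted convention:
   C = M = Y = UCHAR_MAX, and K holds the scaled value. Assumes max > min. */
static void fill_cmyk_grey(unsigned char *out, const float *arr, size_t size,
                           float min, float max)
{
    double scale = (double)UCHAR_MAX / ((double)max - (double)min);
    for (size_t i = 0; i < size; i++) {
        out[i * 4]     = UCHAR_MAX;  /* C */
        out[i * 4 + 1] = UCHAR_MAX;  /* M */
        out[i * 4 + 2] = UCHAR_MAX;  /* Y */
        out[i * 4 + 3] = (unsigned char)(((double)arr[i] - min) * scale);
    }
}
```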
A frequent question that crops up during array manipulation exercises is how to rotate a two-dimensional array by 90 degrees. There are a few SO posts that answer how to do it in a variety of programming languages. My question is to clarify one of the answers that is out there and explore what sort of thought process is required in order to get to the answer in an organic manner.
The solution to this problem that I found goes as follows:
public static void rotate(int[][] matrix,int n)
{
for (int layer = 0; layer < n/2; ++layer) {
int first = layer;
int last = n -1 - layer;
for(int i = first;i<last;++i){
int offset = i - first;
int top = matrix[first][i];
matrix[first][i] = matrix[last-offset][first];
matrix[last-offset][first] = matrix[last][last-offset];
matrix[last][last-offset] = matrix[i][last];
matrix[i][last] = top;
}
}
}
I have somewhat of an idea what the code above is trying to do, it is swapping out the extremities/corners by doing a four-way swap and doing the same for the other cells separated by some offset.
Stepping through this code I know it works, what I do not get is the mathematical basis for the above given algorithm. What is the rationale behind the 'layer','first','last' and the offset?
How did 'last' turn out to be n-1-layer? Why is the offset i-first? What is the offset in the first place?
If somebody could explain the genesis of this algorithm and step me through the thought process to come up with the solution, that will be great.
Thanks
The idea is to break down the big task (rotating a square matrix) into smaller tasks.
First, a square matrix can be broken into concentric square rings. The rotation of a ring is independent from the rotation of other rings, so to rotate the matrix just rotate each of the rings, one by one. In this case, we start at the outermost ring and work inward. We count the rings using layer (or first, same thing), and stop when we get to the middle, which is why it goes up to n/2. (It is worth checking to make sure this will work for odd and even n.) It is useful to keep track of the "far edge" of the ring, using last = n - 1 - layer. For instance, in a 5x5 matrix, the first ring starts at first=0 and ends at last=4, the second ring starts at first=1 and ends at last=3 and so on.
How to rotate a ring? Walk right along the top edge, up along the left edge, left along the bottom edge and down along the right edge, all at the same time. At each step, cycle the four values around. The coordinate that changes is i, and the number of steps taken so far is offset. For example, when walking around the second ring of the 5x5 matrix (first=1, last=3), i goes {1,2} and offset goes {0,1}. The loop stops before i reaches last because that corner is already covered as the starting cell of the next edge.
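The same ring walk, transcribed to C as a sketch so the indices can be traced by hand. The matrix is assumed to be stored row-major in a flat array, which is a choice made here for illustration rather than something from the original Java code.

```c
#include <stddef.h>

/* Rotate an n x n matrix, stored row-major in `m`, 90 degrees clockwise
   in place, one concentric ring at a time. */
static void rotate(int *m, int n)
{
    for (int layer = 0; layer < n / 2; ++layer) {
        int first = layer;          /* near edge of this ring */
        int last = n - 1 - layer;   /* far edge of this ring */
        for (int i = first; i < last; ++i) {
            int offset = i - first; /* steps taken along each edge */
            int top = m[first * n + i];                       /* save top */
            m[first * n + i] = m[(last - offset) * n + first];   /* left -> top */
            m[(last - offset) * n + first] =
                m[last * n + (last - offset)];                /* bottom -> left */
            m[last * n + (last - offset)] = m[i * n + last];  /* right -> bottom */
            m[i * n + last] = top;                            /* top -> right */
        }
    }
}
```

For a 3x3 matrix holding 1 to 9 row by row, there is a single ring with first=0 and last=2, and the loop runs for i=0 and i=1. The centre cell is never touched, which is why n/2 rings are enough for odd n.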