I'm trying to write an algorithm that will give me the (approximate) area of a shape.
My algorithm is provided with an image, which in then performs a Sobel edge-detection on. Using the edge-detected image, I'd like to find the area of an enclosed space, given a co-ordinate within the area. The co-ordinate is likely to be close to the centre of the shape, but not exactly centred..
If I wanted to be thorough, I'd do it recursively. However, I'm running this code on an embedded platform, so I'm memory-limited. I'd also prefer speedier algorithms.
The algorithm doesn't need to produce the exact area of the shape - something close (an order of magnitude at most) would be OK.
So far, I've considered approximating the shape as both a rectangle and an ellipse, by measuring up, down, left and right from the co-ordinate until an edge is found.
Here's an example of said methods:
- example image:
- ellipse and rectangle-marked:
The green fill is a partial-run of recursively filling the area.
If anyone has some better methods, I'd appreciate it!
Related
What I am doing is a pick program. There are many triangles and I want select the front and visible ones by a rectangular region. The main method is described below.
there are a lot of triangles and each triangle has its own color.
draw all the triangles to a frame buffer.
read the color of pixel in frame buffer and based on the color, we know which triangles are selected.
The problem is that there are some tiny triangles can not be displayed in the final frame buffer. Just like the green triangle in the picture. I think the triangle is too tiny and ignored by the graphic card.
My question is how to display the tiny triangles in the final frame buffer? or how to know which triangles are ignored by the graphic card?
Triangles are not skipped based on their size, but if a pixel center does not fall inside or lie on the top or left edge (this is referred to as coverage testing) they do not generate any fragments during rasterization.
That does mean that certain really small triangles are never rasterized, but it is not entirely because of their size, just that their position is such that they do not satisfy pixel coverage.
Take a moment to examine the following diagram from the DirectX API documentation. Because of the size and position of the the triangle I have circled in red, this triangle does not satisfy coverage for any pixels (I have illustrated the left edge of the triangle in green) and thus never shows up on screen despite having a tangible surface area.
If the triangle highlighted were moved about a half-pixel in any direction it would cover at least one pixel. You still would not know it was a triangle, because it would show up as a single pixel, but it would at least be pickable.
Solving this problem will require you to ditch color picking altogether. Multisample rasterization can fix the coverage issue for small triangles, but it will compute pixel colors as the average of all samples and that will break color picking.
Your only viable solution is to do point inside triangle testing instead of relying on rasterization. In fact, the typical alternative to color picking is to cast a ray from your eye position through the far clipping plane and test for intersection against all objects in the scene.
The usability aspect of what you seem to be doing seems somewhat questionable to me. I doubt that most users would expect a triangle to be pickable if it's so small that they can't even see it. The most obvious solution is that you let the user zoom in if they really need to selectively pick such small details.
On the part that can actually be answered on a technical level: To find out if triangles produced any visible pixels/fragments/samples, you can use queries. If you want to count the pixels for n "objects" (which can be triangles), you would first generate the necessary query object names:
GLuint queryIds[n]; // probably dynamically allocated in real code
glGenQueries(n, queryIds);
Then bracket the rendering of each object with glBeginQuery()/glEndQuery():
loop over objects
glBeginQuery(GL_SAMPLES_PASSED, queryIds[i]);
// draw object
glEndQuery(GL_SAMPLES_PASSED);
Then at the end, you can get all the results:
loop over objects
GLint pixelCount = 0;
glGetQueryObjectiv(queryIds[i], GL_QUERY_RESULT, &pixelCount);
if (pixelCount > 0) {
// object produced visible pixels
}
A couple more points to be aware of:
If you only want to know if any pixels were rendered, but don't care how many, you can use GL_ANY_SAMPLES_PASSED instead of GL_SAMPLES_PASSED.
The query counts samples that pass the depth test, as the rendering happens. So there is an order dependency. A triangle could have visible samples when it is rendered, but they could later be hidden by another triangle that is drawn in front of it. If you only want to count the pixels that are actually visible at the end of the rendering, you'll need a two-pass approach.
So, to start off, I'm not very good at computer graphics. I'm trying to implement a GUI toolkit where one of the features is being able to apply 3D transformations to 2D "layers". (a layer only has one Z coordinate, as pre-transform, it's a two dimensional axis aligned rectangle)
Now, this is pretty straightforward, until you come to 3D transformations that would push the layer back, requiring splitting the layer into several polygons in order to render it correctly, as illustrated here. And because we can have transparency, layers may not get completely occluded, while still requiring getting split.
So here is an illustration depicting the issue and the desired outcome. In this scenario, the blue layer (call it B) is on top of the red layer (R), while having the same Z position (but B was added after R). In this scenario, if we rotate B, its top two points will get a Z index lower than 0 while the bottom points will get an index higher than 0 (with the anchor point being the only point/line left as 0).
Can somebody suggest a good way of doing this on the CPU? I've struggled to find a suitable algorithm implementation (in C++ or C) that would be appropriate to this scenario.
Edit: To clarify myself, at this stage in the pipeline, there is no rendering yet. We just need to produce a set of polygons for each layer that would then represent the layer's transformed and occluded geometry. Then, if required, rendering (either software or hardware) is done if required, which is not always the case (for example, when doing hit testing).
Edit 2: I looked at binary space partitioning as an option of achieving this but I have only been able to find one implementation (in GL2PS), which I'm not sure how to use. I do have a vague understanding of how BSPs work, but I'm not sure how they can be used for occlusion culling.
Edit 3: I'm not trying to do colour and transparency blending at this stage. Just pure geometry. Transparency can be handled by the renderer, and overdraw is okay. In this case, the blue polygon can just be drawn under the red one, but with more complicated cases, depth sorting or even splitting up the polygons may be required (example of a scary case like that below). Although the viewport is fixed, because all layers can be transformed in 3D, creating a shape shown below is possible.
So what I'm really looking for is an algorithm that would geometrically split layer B into two blue shapes, one of which would be drawn "above" and one of which would be drawn below R. The part "below" would get overdraw, yes, but it's not a major issue. So B just need to be split into two polygons so it would appear to cut through R when those polygons are drawn in order. No need to worry about blending.
Edit 4: For the purpose of this, we cannot render anything at all. This all has to be done purely geometrically (producing 2D polygons). This is what I was originally getting at.
Edit 5: I should note that the overall number of quads per subscene is around 30 (average). Definitely won't go above 100. Unless the layers are 3D transformed (which is where this problem arises), they are just radix sorted by Z positions before being drawn. Layers with the same Z position are drawn in order in which they were added (first in, first out).
Sorry if I didn't make it clear in the original question.
If you "aren't good with computer graphics", Doing it on CPU (software rendering) will be extremely difficult for you, if polygons can be transparent.
The easiest way to do it is to use GPU rendering (OpenGL/Direct3D) with Depth Peeling technique.
Cpu solutions:
Soltuion #1 (extremely difficult):
(I forgot the name of this algorithm).
You need to split polygon B into two, - for example, using polygon A as clip plane, then render result using painter's algorithm.
To do that you'll need to change your rendering routines so they'll no longer use quads, but textured polygons, plus you'll have to write/debug clipping routines that'll split triangles present in scene in such way that they'll no longer break paitner's algorithm.
Big Problem: If you have many polygons, this solution can potentially split scene into infinite number of triangles. Also, writing texture rendering code yourself isn't much fun, so it is advised to use OpenGL/Direct3D.
This can be extremely difficult to get right. I think this method was discussed in "Computer Graphics Using OpenGL 2nd edition" by "Francis S. Hill" - somewhere in one of their excercises.
Also check wikipedia article on Hidden Surface Removal.
Solution #2 (simpler):
You need to implement multi-layered z-buffer that stores up to N transparent pixels and their depth.
Solution #3 (computationally expensive):
Just use ray-tracing. You'll get perfect rendering result (no limitations of depth peeling and cpu solution #2), but it'll be computationally expensive, so you'll need to optimize rendering routines a lot.
Bottom line:
If you're performing software rendering, use Solution #2 or #3. If you're rendering on hardware, use technique similar to depth-peeling, or implement raytracing on hardware.
--edit-1--
required knowledge for implementing #1 and #2 is "line-plane intersection". If you understand how to split line (in 3d space) into two using a plane, you can implement raytracing or clipping easily.
Required knowledge for #2 is "textured 3d triangle rendering" (algorithm). It is a fairly complex topic.
In order to implement GPU solution, you need to be able to find few OpenGL tutorials that deal with shaders.
--edit-2--
Transparency is relevant, because in order to get transparency right, you need to draw polygons from back to front (from farthest to closest) using painter's algorithms. Sorting polygons properly is impossible in certain situation, so they must be split, or you should use one of the listed techniques, otherwise in certain situations there will be artifacts/incorrectly rendered images.
If there's no transparency, you can implement standard zbuffer or draw using hardware OpenGL, which is a very trivial task.
--edit-3--
I should note that the overall number of quads per subscene is around 30 (average). Definitely won't go above 100.
If you will split polygons, it can easily go way above 100.
It might be possible to position polygons in such way that each polygon will split all others polygon.
Now, 2^29 is 536870912, however, it is not possible to split one surface with a plane in such way that during each split number of polygons would double. If one polygon is split 29 timse, you'll get 30 polygons in the best-case scenario, and probably several thousands in the worst case if splitting planes aren't parallel.
Here's rough algorithm outline that should work:
Prepare list of all triangles in scene.
Remove back-facing triangles.
Find all triangles that intersect each other in 3d space, and split them using line of intersection.
compute screen-space coordinates for all vertices of all triangles.
Sort by depth for painter's algorithm.
Prepare extra list for new primitives.
Find triangles that overlap in 2D (post projection) screen space.
For all overlapping triangles check their rendering order. Basically a triangle that is going to be rendered "below" another triangles should have no part that is above another triangle.
8.1. To do that, use camera origin point and triangle edges to split original triangles into several sub-regions, then check if regions conform to established sort order (prepared for painter's algorithm). Regions are created by splitting existing pair of triangles using 6 clip planes created by camera origin points and triangle edges.
8.2. If all regions conform to rendering order, leave triangles be. If they don't, remove triangles from list, and add them to the "new primitives" list.
IF there are any primitives in new primitives list, merge the list with triangle list, and go to #5.
By looking at that algorithm, you can easily understand why everybody uses Z-buffer nowadays.
Come to think about it, that's a good training exercise for universities that specialize in CG. The kind of exercise that might make your students hate you.
I am going to come out say give the simpler solution, which may not fit your problem. Why not just change your artwork to prevent this problem from occuring.
In problem 1, just divide the polys in Maya or whatever beforehand. For the 3 lines problem, again, divide your polys at the intersections to prevent fighting. Pre-computed solutions will always run faster than on the fly ones - especially on limited hardware. From profesional experience, I can say that it also does scale, well it scales ok. It just requires some tweaking from the art side and technical reviews to make sure nothing is created "ilegally." You may end up getting more polys doing it this way than rendering on the fly, but at least you won't have to do a ton of math on CPUs that may not be up to the task.
If you do not have control over the artwork pipeline, this won't work as writing some sort of a converter would take longer than getting a BSP sub-division routine up and running. Still, KISS is often the best solution.
Suppose we have an image (pixel buffer) that is in black and white, so each pixel is either black or white (not gray scale).
Now somewhere in the middle of that images, place a green dot. It may have a radius of n for rendering purposed, but it is really a just point. Give the dot a randomly selected direction and speed, and start it moving. If the image is all white pixels, the dot will bounce off the edges of the image, infinitely wandering around the picture. This is quite easy... just reverse either the rise or run of the dot's vector.
Next, suppose the image has some globs of black pixels. As the dot encounters these globs of black pixels, the angle of reflection needs to be calculated. This is also quite easy of the the black pixels have a fixed slope, as in my sketch (blue X represents black pixels). You can find the slope of the blue Xs and easily calculate the new vector.
But how about the case where the black pixels form really unfriendly surfaces? What are some approaches to figuring out this angle?
This is the subject that I am interested in.
There must be some algorithms that exist for this kind of purpose, but I never ran across any in school. I am not asking how to code this, rather approaches to writing the algorithm to do this. I have a few ideas that I'll try, but if there are some standard ways to do this that exist, I'd like to learn about them.
Obviously I'd like to start with Black and White then move into RGBA.
I am looking for any reference material on just this sort of subject. Websites, books, or other references are very very welcome.
Also, if there are different StackOverflow tags that could be good, let me know.
Thanks much!
Edit********** More pics and information
Maybe I wasn't clear what I meant by "unfriendly surfaces". In the previous picture, our blue X's happened to just be a line. Picture a case where it is not a line, rather a wierd shape.
We start with our green pixel traveling at a slope of 2. Suppose it's vector is that of 12 pixels per frame. It would have a projected path like this:
But suppose instead of a nice friendly line, we have this:
In my mind I can kinda of see what is likely to happen if this were a ball and some walls.
Look for edge detection algorithms used in image processing. Some edge detectors also approximate the direction of edges.
You can think of the pixel neighborhood of the green dot, maybe somewhere between 3x3 and 7x7, as a small edge direction detection problem. One approach would take two passes at the pixels:
In the first pass, smooth the sharp black/white pixels using a Gaussian filter.
In the second pass, apply an edge detection operator, such as Sobel, Prewitt or Roberts to produce the X and Y derivatives of the pixels' intensity. You can then approximate the direction as:
angle = arctan(dx/dy)
The motivation for the smoothing pass is to give the edge detection operator information from farther-away pixels.
The Wikipedia page on the Canny edge detector has a good discussion on obtaining the direction (the "gradient") of an edge, including an example of a particular Gaussian filter you can use for smoothing.
Am doing something similar with a ball and randomly generated backgrounds.
The filter and edge detection is highly technical but all other processes using a 5*5 or 3*3 grid seem similarly difficult.
However, I think I may have a cheap way around this. Assuming a ball travelling in any direction, scan all leading edges of the ball - a semicircle. The further to the edge of the ball the collision occurs the closer to vertical is the collision. Again, I think, this should allow you to easily infer the background normal and from there the answer is fairly simple
I am trying to do my own blob detection who will receive a real time video, and try to detect a white paper sheet.
Even if is something written inside the paper. I need to detect the paper and is corner, because what i really want is to draw a opengl polygon over the paper in each corner of the paper will be a corner of the polygon. Then i need the coordinates of the paper to do other stuffs.
So i need to:
- detect a square white blob.
- get the coordinates of the cornes
- draw a polygon over the white sheet.
Any ideias how can i do that?
Much depends on context. For example, suppose that you:
know that the paper is always roughly centered (i.e. W/2, Y/2 is always inside the blob), and no more rotated than 45 degrees (30 would be better)
have a suitable border around the sheet so that the corners never touch the edges of the FOV
are able (through analysis of local variance, or if you're lucky, check of background color or luminance) to say whether a point is inside or outside the blob
the inside/outside function never fails (except possibly in the close vicinity of a border)
then you could walk a line from a point on the border (surely outside) and the center (surely inside), even through bisection, and find a point - an areal - on the edge.
Two edge points give a rect (two areals give a beam), two rects give an intersection (two beams give a larger areal) - and there's your corner. You should carry along the detection uncertainty (areal radius) in order to validate corners (another less elegant approach is to roughly calculate where the corner is, and pinpoint it with a spiral search or drunkard's walk).
This algorithm is amenable to parallelization and, as long as the hypotheses hold, should be really fast.
All that said, it remains a hack -- I agree with unwind, why reinvent the wheel? If you have memory or CPU constraints (embedded systems, etc.), I believe there ought to be OpenCV and e-Vision "lite" ports also for ARM and embedded platforms.
(Sorry for my terminology - I'm monkey-translating from Italian. "Areal" is likely to correspond to your "blob", a beam is the family of lines joining all couples of points in two different blobs, line intensity being the product of distance from a point from its areal's center)
I am trying to do my own blob detection who will receive a real time video, and try to detect a white paper sheet.
Your first shot could be a simple flood-fill. That is, select a good threshold to binarize the image and apply the algorithm. The threshold can be fixed if you know the paper is always brighter than X and the background is always darker than this. Or this can be an adaptive threshold, for example Otsu's method. OpenCV offers this for free.
If you'd need to speed it up you could use a union-find data structure.
Finally you'd need to come up with some heuristic how to identify the corners (e.g. the four extreme values in x/y direction).
Then i need [...] the coordinates of the cornes [...]
Then you don't need blob detection, but corner detection or contour detection in the first place. OpenCV has some nice functionality for exactly this.
If you can't use it, I would suggest to binarize the image as above and use a harris-detector to find the corners of the object.
OpenCV's TBB support could also come quite handy if you'd use it and you have problems to meet your real-time requirements.
I'm in the process of writing a C program using OpenCV to detect some rectangles made with tape, which are hollow on the inside. Problem is, each physical rectangle gives two digital rectangles: one for the inner perimeter, one for the outer perimeter. The outer rectangle in all cases completely encloses the inner rectangle.
I need some way to remove the inner rectangles, and in a reasonably efficient manner, as this is being run on a video feed and must not drop framerate considerably (approx. 15fps, on a BeagleBoard xM, which is not terribly powerful).
There are always four physical rectangles, and somewhere between four to eight digital rectangles depending on the cleanliness of the processing operations. The outer rectangle is detected reliably; the inner rectangle is not. The image is thresholded, eroded, and dilated such that the image is clean and detection is reliable in general.
I feel that this problem is separate from OpenCV and is really just working with rectangles and could probably be solved by me with some time, but the project is on a crunch deadline, so I'm also throwing this question in. Thanks in advance, guys.
there is a function called grouprectangle in opencv.
The function can remove multiple rectangles...
Have a happy coding.
Since you only have at most 8 digital rectangles, I think it would be fine to use the natural, brute force, algorithm to figure out which rectangles are inside other rectanges. It's OK to do O(N^2) algorithms when N is small, and 8 is small.
Here is the pseudo code:
for each rectangle i {
for each rectangle j {
if i != j and rectangle i is inside rectangle j {
disregard rectangle i
}
}
}
Solved - the speedy solution is to take the distance to one of the corners from the center point of the rectangle, and compare that distance between rectangles whose centers are very close together. The one with the shorter distance must be the inner rectangle.
Code-wise you'd want to calculate the center, then find, say, the bottom right point, which is just the point with both min x and min y. Calculate the distance between them and store it somehow. For each rectangle, iterate over the other ones and check if their centers are very close (a constant of ~30px works fine for this in my case). Compare the distances calculated earlier; the rectangle with the shorter distance should be deleted from the list of rectangles.