Similar to calibrating a single camera 2D image with a chessboard, I wish to determine the width/height of the chessboard (or of a single square) in pixels.
I have a camera aimed vertically at the ground, ensured to be perfectly level with the surface below. I am using the camera to determine the translation between consecutive frames (successfully achieved using Fourier phase correlation); at the moment my result returns the translation in pixels. However, I would like to use techniques similar to calibration, where I move the camera over a chessboard lying flat on the ground, to automatically determine the size of the chessboard in pixels relative to my image height and width.
Knowing the size of the chessboard in millimetres, I can then convert a pixel unit to a real-world unit in millimetres, i.e. one pixel will represent a distance proportional to the height of the camera above the ground. This will allow me to convert a translation in pixels to a translation in millimetres, recalibrating every time I change the height of the camera.
What would be the recommended way of achieving this? Surely it must be simpler than single camera 2D calibration.
OpenCV can give you the position of the chessboard's corners with cv::findChessboardCorners().
I'm not sure if the perspective distortion will affect your calculations, but if the chessboard is perfectly aligned beneath the camera, it should work.
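As a rough sketch of how the detected corners could be turned into a pixels-to-millimetres scale factor: the function name, the pattern size and the squareSizeMm parameter below are illustrative assumptions, not anything from your setup.

#include <opencv2/opencv.hpp>
#include <cmath>
#include <vector>

// Rough sketch: estimate millimetres per pixel from the detected corner spacing.
// patternSize is the number of inner corners (e.g. 9x6); squareSizeMm is the
// real-world size of one square. Both are assumptions for illustration.
double millimetresPerPixel(const cv::Mat& gray, cv::Size patternSize, double squareSizeMm)
{
    std::vector<cv::Point2f> corners;
    if (!cv::findChessboardCorners(gray, patternSize, corners))
        return -1.0; // board not found

    // Average spacing between horizontally adjacent corners, in pixels.
    double sumPx = 0.0;
    int count = 0;
    for (int r = 0; r < patternSize.height; ++r)
        for (int c = 0; c + 1 < patternSize.width; ++c)
        {
            const cv::Point2f d = corners[r * patternSize.width + c + 1]
                                - corners[r * patternSize.width + c];
            sumPx += std::sqrt(d.x * d.x + d.y * d.y);
            ++count;
        }
    const double squarePx = sumPx / count;
    return squareSizeMm / squarePx; // multiply pixel translations by this to get mm
}

For example, a call like millimetresPerPixel(gray, cv::Size(9, 6), 25.0) would assume a board with 9x6 inner corners and 25 mm squares.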
This is just an idea, so don't hit me, but maybe you could use the natural contrast of the chessboard?
Scanning across the image, at some point it will switch from bright to dark pixels, and that should happen a fixed number of times (however many columns the chessboard has); counting those switches should be a doable algorithm.
In my 2D map application, I have 16-bit heightmap textures containing altitudes in meters associated to a point on the map.
When I draw these textures on the screen, I would like to display an analysis such that the pixel referring to the highest altitude on the screen is white, the pixel referring to the lowest altitude on the screen is black, and the values in between are interpolated between those two.
I'm using an older OpenGL version and thus do not have access to modern pipeline functionality like GLSL or PBOs (which, as I've heard, can somehow make getting color buffer contents to the CPU side much more efficient than glReadPixels).
I have access to the ATI_fragment_shader extension, which makes it possible to use a basic fragment shader to merge the R and G channels in these textures and get a single float grayscale luminance value.
Then I would be able to re-color these pixels inside the shader (map them to the 0-1 range) based on the maximum and minimum pixel luminance values, but I don't know what they are.
My question is: among the pixels currently on the screen, how do I find the ones with the maximum and minimum luminance values? Or, as an alternative, how do I find these values inside a texture? (I could make a glCopyTexImage2D call after drawing the texture with grayscale luminance values on the screen and retrieve the data as a texture.)
Stuff I've tried or read about so far:
- If I could somehow get the current pixel RGB values in the color buffer to the CPU side, I could find what I need manually and then use it. However, reading color buffer contents with glReadPixels is unacceptably slow; it's no use even if I set it up so that one read operation is spread over multiple frames.
- Downsampling the texture to 1x1 size so that the last remaining pixel holds either the minimum or the maximum value, and then using this 1x1 texture inside the shader. I have no idea how to achieve this without GLSL and texel-fetching support, since I would have to look up the pixels to the right, above, and above-right of the current one and find a min/max value among them.
I'm trying to use FreeType to create a bitmap font for a microcontroller, but I'm stuck on the fundamental difference in the way coordinates are expressed. My microcontroller expects an X and Y offset for the glyph bitmap relative to an origin point in the upper left corner, whereas FreeType is giving me "bearings" relative to an invisible baseline. I'm pretty sure bearingX is what I want for my X offset, but how do I determine my Y offset? I tried subtracting bearingY from the ascender height, but some of the offsets come out negative. This is unacceptable, because it makes drawing text in the upper left corner of a display impossible.
I solved my problem by pre-rendering all of the glyphs, and keeping track of the maximum ascent and descent in actual rendered pixels. Then I calculated the maximum height of all glyphs from the two values, and used that to calculate the Y-offset for each glyph bitmap from its top bearing. With an extra rendering step, I can also re-scale the face to more closely match my desired pixel height.
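A rough sketch of that pre-rendering pass with FreeType might look like the following; the ASCII range, the 16 px size and the variable names are illustrative assumptions rather than the exact code.

#include <ft2build.h>
#include FT_FREETYPE_H

// Sketch of the pre-rendering pass: find the maximum ascent/descent in rendered
// pixels, then derive each glyph's Y offset from the top of the line cell.
// The ASCII range and the 16 px size are illustrative assumptions.
void buildFontTable(FT_Face face)
{
    FT_Set_Pixel_Sizes(face, 0, 16);

    int maxAscent = 0, maxDescent = 0;
    for (int c = 32; c < 127; ++c)
    {
        if (FT_Load_Char(face, c, FT_LOAD_RENDER))
            continue;
        const FT_GlyphSlot g = face->glyph;
        const int ascent  = g->bitmap_top;                       // pixels above the baseline
        const int descent = (int)g->bitmap.rows - g->bitmap_top; // pixels below the baseline
        if (ascent  > maxAscent)  maxAscent  = ascent;
        if (descent > maxDescent) maxDescent = descent;
    }
    const int cellHeight = maxAscent + maxDescent; // total glyph cell height in pixels

    for (int c = 32; c < 127; ++c)
    {
        if (FT_Load_Char(face, c, FT_LOAD_RENDER))
            continue;
        const int xOffset = face->glyph->bitmap_left;            // horizontal bearing in pixels
        const int yOffset = maxAscent - face->glyph->bitmap_top; // >= 0 by construction
        // ... store xOffset, yOffset, cellHeight and the bitmap in the font table here
    }
}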
What I am doing is a pick program. There are many triangles and I want to select the front, visible ones within a rectangular region. The main method is described below.
- There are a lot of triangles, and each triangle has its own color.
- Draw all the triangles to a frame buffer.
- Read the color of each pixel in the frame buffer; based on the color, we know which triangles are selected.
The problem is that some tiny triangles cannot be displayed in the final frame buffer, just like the green triangle in the picture. I think the triangle is too tiny and is ignored by the graphics card.
My question is: how do I display the tiny triangles in the final frame buffer? Or how do I know which triangles are ignored by the graphics card?
Triangles are not skipped based on their size, but if no pixel center falls inside the triangle or lies on its top or left edge (this is referred to as coverage testing), the triangle does not generate any fragments during rasterization.
That does mean that certain really small triangles are never rasterized, but it is not entirely because of their size, just that their position is such that they do not satisfy pixel coverage.
Take a moment to examine the following diagram from the DirectX API documentation. Because of the size and position of the triangle I have circled in red, this triangle does not satisfy coverage for any pixels (I have illustrated the left edge of the triangle in green) and thus never shows up on screen despite having a tangible surface area.
If the triangle highlighted were moved about a half-pixel in any direction it would cover at least one pixel. You still would not know it was a triangle, because it would show up as a single pixel, but it would at least be pickable.
Solving this problem will require you to ditch color picking altogether. Multisample rasterization can fix the coverage issue for small triangles, but it will compute pixel colors as the average of all samples and that will break color picking.
Your only viable solution is to do point-in-triangle testing instead of relying on rasterization. In fact, the typical alternative to color picking is to cast a ray from your eye position through the far clipping plane and test it for intersection against all objects in the scene.
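For the 2D screen-space case (testing a point of the selection rectangle against a projected triangle), a sign-based point-in-triangle test is enough. This is a minimal sketch with made-up names, not tied to any particular API:

// Sketch of a 2D point-in-triangle test using signed edge functions.
// Assumes the triangle has already been projected to screen space.
struct Vec2 { float x, y; };

static float edge(const Vec2& a, const Vec2& b, const Vec2& p)
{
    return (b.x - a.x) * (p.y - a.y) - (b.y - a.y) * (p.x - a.x);
}

bool pointInTriangle(const Vec2& p, const Vec2& a, const Vec2& b, const Vec2& c)
{
    float e0 = edge(a, b, p);
    float e1 = edge(b, c, p);
    float e2 = edge(c, a, p);
    // All non-negative or all non-positive: p is inside or on an edge,
    // regardless of the triangle's winding order.
    bool hasNeg = (e0 < 0) || (e1 < 0) || (e2 < 0);
    bool hasPos = (e0 > 0) || (e1 > 0) || (e2 > 0);
    return !(hasNeg && hasPos);
}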
The usability aspect of what you seem to be doing seems somewhat questionable to me. I doubt that most users would expect a triangle to be pickable if it's so small that they can't even see it. The most obvious solution is that you let the user zoom in if they really need to selectively pick such small details.
On the part that can actually be answered on a technical level: To find out if triangles produced any visible pixels/fragments/samples, you can use queries. If you want to count the pixels for n "objects" (which can be triangles), you would first generate the necessary query object names:
GLuint queryIds[n]; // probably dynamically allocated in real code
glGenQueries(n, queryIds);
Then bracket the rendering of each object with glBeginQuery()/glEndQuery():
for (int i = 0; i < n; ++i) {
    glBeginQuery(GL_SAMPLES_PASSED, queryIds[i]);
    // draw object i here
    glEndQuery(GL_SAMPLES_PASSED);
}
Then at the end, you can get all the results:
for (int i = 0; i < n; ++i) {
    GLint pixelCount = 0;
    glGetQueryObjectiv(queryIds[i], GL_QUERY_RESULT, &pixelCount);
    if (pixelCount > 0) {
        // object i produced visible pixels
    }
}
A couple more points to be aware of:
If you only want to know if any pixels were rendered, but don't care how many, you can use GL_ANY_SAMPLES_PASSED instead of GL_SAMPLES_PASSED.
The query counts samples that pass the depth test, as the rendering happens. So there is an order dependency. A triangle could have visible samples when it is rendered, but they could later be hidden by another triangle that is drawn in front of it. If you only want to count the pixels that are actually visible at the end of the rendering, you'll need a two-pass approach.
Looking to see if anyone can recommend a computationally efficient method for translating/shifting an image by (x,y) pixels.
Reason being, I have been partly successful in implementing the Fourier-Mellin transform to determine the rotation and translation between image frames. Once the image is unrotated, I would like to untranslate the image by the calculated pixel offset (x,y), allowing me to test the image correlation after rotation and translation.
I would think that an efficient method (sketched after the list below) would be to:
- Make a border with cv::copyMakeBorder().
- Use an ROI, i.e. make a new matrix header without copying data.
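A minimal sketch of that idea, assuming the shift (dx, dy) is given in whole pixels; shiftImage is just an illustrative name:

#include <opencv2/opencv.hpp>
#include <cstdlib>

// Sketch: shift an image by (dx, dy) pixels using copyMakeBorder plus an ROI.
// The padded matrix owns the data; the ROI is just a new header over it.
cv::Mat shiftImage(const cv::Mat& img, int dx, int dy)
{
    cv::Mat padded;
    cv::copyMakeBorder(img, padded,
                       std::abs(dy), std::abs(dy),   // top, bottom
                       std::abs(dx), std::abs(dx),   // left, right
                       cv::BORDER_CONSTANT, cv::Scalar::all(0));
    // Pick the window whose content appears shifted by (dx, dy) in the output.
    cv::Rect roi(std::abs(dx) - dx, std::abs(dy) - dy, img.cols, img.rows);
    return padded(roi);
}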
Good luck
I am trying to do my own blob detection, which will receive real-time video and try to detect a white sheet of paper.
Even if there is something written on the paper, I need to detect the paper and its corners, because what I really want is to draw an OpenGL polygon over the paper, where each corner of the paper will be a corner of the polygon. Then I need the coordinates of the paper to do other things.
So I need to:
- detect a square white blob.
- get the coordinates of the corners
- draw a polygon over the white sheet.
Any ideas how I can do that?
Much depends on context. For example, suppose that you:
- know that the paper is always roughly centered (i.e. W/2, H/2 is always inside the blob) and rotated by no more than 45 degrees (30 would be better),
- have a suitable border around the sheet so that the corners never touch the edges of the FOV,
- are able (through analysis of local variance or, if you're lucky, a check of background color or luminance) to say whether a point is inside or outside the blob,
- can rely on the inside/outside function never failing (except possibly in the close vicinity of a border),
then you could walk a line between a point on the border (surely outside) and the center (surely inside), even by bisection, and find a point - an areal - on the edge.
Two edge points give a line (two areals give a beam), two lines give an intersection (two beams give a larger areal) - and there's your corner. You should carry the detection uncertainty (areal radius) along in order to validate corners (another, less elegant, approach is to roughly calculate where the corner is and pinpoint it with a spiral search or drunkard's walk).
This algorithm is amenable to parallelization and, as long as the hypotheses hold, should be really fast.
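A minimal sketch of the bisection step, assuming an inside/outside predicate as described above; the function name and the half-pixel stopping tolerance are illustrative:

#include <cmath>
#include <functional>

struct Pt { double x, y; };

// Bisect between a point known to be outside the blob and one known to be inside,
// until the two endpoints are less than half a pixel apart; return the midpoint
// as the estimated edge point.
Pt findEdgePoint(Pt outside, Pt inside, const std::function<bool(double, double)>& isInside)
{
    while (std::hypot(inside.x - outside.x, inside.y - outside.y) > 0.5)
    {
        Pt mid{ (outside.x + inside.x) / 2.0, (outside.y + inside.y) / 2.0 };
        if (isInside(mid.x, mid.y))
            inside = mid;
        else
            outside = mid;
    }
    return Pt{ (outside.x + inside.x) / 2.0, (outside.y + inside.y) / 2.0 };
}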
All that said, it remains a hack -- I agree with unwind, why reinvent the wheel? If you have memory or CPU constraints (embedded systems, etc.), I believe there ought to be OpenCV and e-Vision "lite" ports also for ARM and embedded platforms.
(Sorry for my terminology - I'm monkey-translating from Italian. "Areal" probably corresponds to your "blob", a beam is the family of lines joining all pairs of points in two different areals, line intensity being the product of each point's distance from its areal's center.)
I am trying to do my own blob detection, which will receive real-time video and try to detect a white sheet of paper.
Your first shot could be a simple flood fill. That is, select a good threshold to binarize the image and apply the algorithm. The threshold can be fixed if you know the paper is always brighter than some value X and the background is always darker than it. Or the threshold can be determined automatically, for example with Otsu's method. OpenCV offers this for free.
If you need to speed it up, you could use a union-find data structure.
Finally, you'd need to come up with some heuristic for identifying the corners (e.g. the four extreme points in the x/y directions).
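A hedged sketch of that pipeline with OpenCV, using Otsu's threshold and findContours in place of a hand-rolled flood fill, with the extreme-point heuristic for the corners; the function name and the "largest contour is the paper" assumption are illustrative:

#include <opencv2/opencv.hpp>
#include <algorithm>
#include <vector>

// Sketch: binarize with Otsu, take the largest contour, and use extreme points
// as a rough corner heuristic.
std::vector<cv::Point> roughPaperCorners(const cv::Mat& gray)
{
    cv::Mat bin;
    cv::threshold(gray, bin, 0, 255, cv::THRESH_BINARY | cv::THRESH_OTSU);

    std::vector<std::vector<cv::Point>> contours;
    cv::findContours(bin, contours, cv::RETR_EXTERNAL, cv::CHAIN_APPROX_SIMPLE);
    if (contours.empty())
        return {};

    // Assume the paper is the largest white region in the frame.
    auto largest = std::max_element(contours.begin(), contours.end(),
        [](const std::vector<cv::Point>& a, const std::vector<cv::Point>& b)
        { return cv::contourArea(a) < cv::contourArea(b); });

    // Extreme points in the x/y directions as the simple corner heuristic.
    cv::Point left = (*largest)[0], right = left, top = left, bottom = left;
    for (const cv::Point& p : *largest)
    {
        if (p.x < left.x)   left = p;
        if (p.x > right.x)  right = p;
        if (p.y < top.y)    top = p;
        if (p.y > bottom.y) bottom = p;
    }
    return { left, top, right, bottom };
}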
Then I need [...] the coordinates of the corners [...]
Then you don't need blob detection, but corner detection or contour detection in the first place. OpenCV has some nice functionality for exactly this.
If you can't use it, I would suggest binarizing the image as above and using a Harris detector to find the corners of the object.
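A minimal sketch of that route, assuming the image has already been binarized; the block size, aperture, k and the 0.01 threshold factor below are illustrative values, not recommendations:

#include <opencv2/opencv.hpp>
#include <vector>

// Sketch: run the Harris detector on the binarized image and keep strong responses.
std::vector<cv::Point> harrisCorners(const cv::Mat& binary)
{
    cv::Mat response;
    cv::cornerHarris(binary, response, 5, 3, 0.04);

    double maxVal;
    cv::minMaxLoc(response, nullptr, &maxVal);

    std::vector<cv::Point> corners;
    for (int y = 0; y < response.rows; ++y)
        for (int x = 0; x < response.cols; ++x)
            if (response.at<float>(y, x) > 0.01 * maxVal)
                corners.push_back(cv::Point(x, y));
    return corners; // still needs clustering/non-maximum suppression to get four corners
}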
OpenCV's TBB support could also come in quite handy if you use it and have trouble meeting your real-time requirements.