I have an image. I have to find the height of a particular object in it.
If we directly take the pixel length it will not give the exact height.How to approach this problem?
After you calibrated your camera, you will have a transformation from image plane to world coordinates. Using this information you can predict the height of the object you are looking for, of course in this step you somehow need to identify the object that you are interested in.
Generally speaking, this question is too broad and covers many fundamental concepts of computer vision, so please consult your favorite textbook before attacking the problem.
An alternative approach: place the object on top of a A4 paper sheet and take a picture from above. Since you know the size of paper, you can calculate the size of the object based on that.
To detect a paper sheet, check this post or this.
Related
Is there a way to change the clutter perspective for a given container or widget?
The clutter perspective controls how all the clutter actors on the screen are displayed when rotated, translated, scaled, etc.
What I would really like to do is to change the perspective's origin from the center of the screen to another coordinate.
I have messed with a few of the stage methods. However, I haven't had much luck understanding some of the results, and often I hit some stability issues.
I know there are transformation matrices that do all the logic under the hood, and there are documented ways to change the transform matrices. Honestly, I haven't researched much further and just though I would ask for guidance before spending a lot of time on it.
Which leads me to another question regarding the matrices and transformations. Can one of these matrices be used to skew an actor? Or deform it into a trapezoid, etc? And any idea how to get started on that, ie. what a skew matrix would look like?
Finally, does anyone know why the clip path was deprecated? It seems that would have worked for what I ultimately want to do: draw irregular shaped 2d objects on the screen If I can implement an answer to question 2, then I guess a clip box with a transformation can be used here.
1, I do not know if (or how) one might change the Clutter stage's focal point.
2 A skew or shear transformation matrix is easy enough to construct, and can be implemented in the GJS Clutter functions Clutter.Actor.set_transform(T) and Clutter.Actor.set_child_transform(T) where T is a Clutter.Matrix .
This does present another problem, however, for the current codebase; and this leads to another question. (I guess I should post it somewhere else). But, when a transform is set on a clutter actor (or its children), the rest of the actor's properties are ignored. This has the added effect that the Tweener library cannot be used for animation of these properties.
3 Finally, one can use Cairo to draw irregular shaped objects and paths on a Clutter actor, however, the reactive area for the actor (ie. mouse-enter and -leave events) will still be for the entire actor, not defined by the Cairo path.
I am trying to create a bounding box around each character in an image. I have converted the image to binary and thresholded it but I don't understand how to create a bounding box despite reading the manual.
There are a few options for the bounding box technique, but I think you'll get a great result combining these two:
First, use the technique demonstrated here to detect a large portion of text and put a bounding rectangle around it so you crop the image to this area;
Second, experiment with the technique recently presented by OpenCV, also demonstrated here. It could be used to locate/extract individual characters on the resulting image of the first step.
I suppose you are trying to implement the OCR mechanism yourself instead of relying on great APIs such as Tesseract.
If you are looking for more information on how to do digit/text recognition, check this answer.
As said before "rudely", I encourage you to rewrite your question with more detail on what you already did. We didn't understand what you would like to do. If character recognition is what you want, have a look at this.
I had a generalized question to find out if it was possible or not to do matrix calculations on a rectangle. I have a CvRect that has information stored in it with coordinates and I have a cvMat that has transformational data. What I would like to know is if there was a way to get the Rect to use the matrix data to generate a rotated, skewed, and repositioned rectangle out of it. I've searched online, but I was only able to get information on image transforms.
Thanks in advance for the help.
No, this is not possible. cv::Rect is also not capable of that, as it only describes rectangles in a Manhattan world. There is cv::RotatedRect, but this also does not handle skewing.
You can, however, feed the corner points of your rectangle to cv::transform:
http://opencv.itseez.com/modules/core/doc/operations_on_arrays.html?highlight=transform#cv2.transform
You will then obtain four points that are transformed accordingly. Note that there are also more specialized versions of this function, e.g. warpPerspective() and warpAffine().
I have a sequence of images taken from a camera. The images consists of hand and surroundings. I need to remove everything except the hand.
I am new to Image processing. Would anyone help me in regard with the above Question. I am comfortable using C and Matlab.
A really simple approach if you have a stationary background and a moving hand (and quite a few images!) is simply to take the average of the set of images away from each image. If nothing else, it's a gentle introduction to Matlab.
The name of the problem you are trying to solve is "Image Segmentation". The Wikipedia page here: wiki is a good start.
If lighting consistency isn't a problem for you, I'd suggest starting with simple RGB thresholding and see how far that gets you before trying anything more complicated.
Have a look at OpenCV, a FOSS library for computer vision applications. Specifically, see the Video Surveillance module. For a walk through of background subtraction in MATLAB, see this EETimes article.
Can you specify what kind of images you have. Is the background moving or static? For a static background it is a bit straightforward. You simply need to subtract the incoming image from the background image. You can use some morphological operations to make it look better. They all depend on the quality of images that you have. If you have moving background I would suggest you go for color based segmentation. Convert the image to YCbCr then threshold appropriately. I know there are some papers available on it(However I dont have time to locate them). I suggest reading them first. Here is one link which might help you. Read the skin segmentation part.
http://www.stanford.edu/class/ee368/Project_03/Project/reports/ee368group08.pdf
background subtraction is simple to implement (estimate background as average of all frames, then subtract each frame from background and threshold resulting absolute difference) but unfortunately only works well if 1. camera has manual gain and exposure 2. lighting conditions do not change 3.background is stationary. 4. the background is visible for much longer than the foreground.
given your description i assume these are not the case - so what you can use - as already pointed out - is colour as a means of segmenting foreground from background. as it's a hand you are trying to isolate best bet is to learn the hand colour. opencv provides some means of doing this. if you want to do this yourself you just get the colour of some of the hand pixels (you would need to specify this manually for at least one frame) and convert them to HUE (which encapsulates the colour in a brightness independen way. skin colour has a very constant hue) and then make a HUE histogram. compare this to the rest of the pixels and then decided if the hue is simmilar enough.
My application presents an image that can be scaled to a certain size. I'm using the Image WPF control with the scaling method of FANT.
However, there is no documentation how this scaling algorithm works.
Can anyone reference me to the relevant link for this algorithm description?
Nir
Avery Lee of VirtualDub states that it's a box filter for downscaling and linear for upscaling. If I'm not mistaken, "box filter" here means basically that each output pixel is a "flat" average of several input pixels.
In practice, it's a lot more blurry for downscaling than GDI's cubic downscaling, so the theory about averaging sounds about right.
I know what it is, but I couldn't find much on Google either :(
http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=4056711 is the appropriate paper I think; behind a pay-wall.
You don't need to understand the algorithm to use it. You should explicitly make the choice each time you create a bitmap control that is scaled whether you want it high-quality scaled or low quality scaled.