Similarity between an image and its rotated version using SIFT

I have implemented SIFT in OpenCV for comparing images. I have not yet written the comparison program; I am thinking of using FLANN for that. My problem is that, looking at the 128 elements of a descriptor, I cannot really see the similarity between an image and its rotated version.
From reading Lowe's paper, I do understand that the descriptor coordinates are all rotated in terms of the keypoint orientation, but how exactly is the similarity obtained? Can we understand the similarity just by viewing the 128 values?
Please help; this is for my project presentation.

You can first use Lowe's metric to compute putative matches between the two images. The metric is: for any given descriptor d in image 1, find the distances to all descriptors d' in image 2; if the ratio of the closest distance to the second-closest distance is below a threshold, accept the match.
After this, you can use RANSAC (or another form of robust estimation, or a Hough transform) to check geometric consistency in terms of the position, orientation, and scale of the keypoints you accepted as putative matches.
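For illustration (not part of the original answer), here is a minimal sketch of both steps with OpenCV's C++ API. It assumes OpenCV 4.4+, where cv::SIFT lives in the main features2d module; the file names and the 0.8 ratio threshold are placeholders you would tune.

#include <opencv2/opencv.hpp>
#include <cstdio>
#include <vector>

int main()
{
    // Placeholder file names for the image and its rotated version.
    cv::Mat img1 = cv::imread("image.png", cv::IMREAD_GRAYSCALE);
    cv::Mat img2 = cv::imread("image_rotated.png", cv::IMREAD_GRAYSCALE);

    // Keypoints plus 128-dimensional SIFT descriptors for both images.
    cv::Ptr<cv::SIFT> sift = cv::SIFT::create();
    std::vector<cv::KeyPoint> kp1, kp2;
    cv::Mat desc1, desc2;
    sift->detectAndCompute(img1, cv::noArray(), kp1, desc1);
    sift->detectAndCompute(img2, cv::noArray(), kp2, desc2);

    // Step 1: Lowe's ratio test on the two nearest neighbours.
    cv::FlannBasedMatcher matcher;
    std::vector<std::vector<cv::DMatch>> knn;
    matcher.knnMatch(desc1, desc2, knn, 2);
    std::vector<cv::DMatch> good;
    for (const auto& m : knn)
        if (m.size() == 2 && m[0].distance < 0.8f * m[1].distance)
            good.push_back(m[0]);

    // Step 2: RANSAC keeps only matches consistent with one homography.
    if (good.size() >= 4) {
        std::vector<cv::Point2f> p1, p2;
        for (const auto& m : good) {
            p1.push_back(kp1[m.queryIdx].pt);
            p2.push_back(kp2[m.trainIdx].pt);
        }
        cv::Mat inliers;
        cv::findHomography(p1, p2, cv::RANSAC, 3.0, inliers);
        std::printf("%d of %zu matches are geometrically consistent\n",
                    cv::countNonZero(inliers), good.size());
    }
    return 0;
}

For an image and its rotated copy, most ratio-test matches should survive the RANSAC step; for unrelated images, very few will.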

If I recall correctly, SIFT gives you a set of 128-value descriptors, one for each interest point. You also have the location of each point in each of the images, as well as its orientation (the keypoint's "direction") and its scale in each image.
Once you've found two points with matching descriptors, you can calculate the transformation from the interest point in one image to the same point in the other image by comparing their coordinates and orientations.
If you have enough matches, you check whether all (or a majority of) the interest points have the same transformation. If they do, the images are similar; if they don't, the images are different.
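As an illustration (again, not the answerer's code), here is a small sketch of how one matched pair yields that transformation, using OpenCV's cv::KeyPoint fields (pt, size, and angle in degrees):

#include <opencv2/opencv.hpp>
#include <cmath>

// Similarity transform mapping image-1 coordinates to image-2 coordinates:
// p2 = s * R(theta) * p1 + t.
struct Similarity { float s; float theta; cv::Point2f t; };

Similarity transformFromMatch(const cv::KeyPoint& k1, const cv::KeyPoint& k2)
{
    Similarity m;
    m.s = k2.size / k1.size;                                  // scale ratio
    m.theta = (k2.angle - k1.angle) * (float)CV_PI / 180.0f;  // rotation
    float c = std::cos(m.theta), sn = std::sin(m.theta);
    // The translation is whatever is left after scaling and rotating p1.
    m.t = k2.pt - m.s * cv::Point2f(c * k1.pt.x - sn * k1.pt.y,
                                    sn * k1.pt.x + c * k1.pt.y);
    return m;
}

If most matched pairs agree on roughly the same (s, theta, t), the two images show the same content.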
Hope this helps...

What you are looking for is basically ASIFT (Affine-SIFT).
You can find the code and an overview here.

Related

Simple Multi-Blob Detection of a Binary Image?

Suppose I have a 2D array of an image that has already been thresholded, so it now holds binary information.
Is there a particular way to process this image so that I get the coordinates of multiple blobs in it?
I can't use OpenCV, because this process needs to run simultaneously for 10+ simulated robots in a custom simulator written in C.
I need the blobs' x,y coordinates, but first I need to find those multiple blobs. The simplest criterion, pixel-group size, should be enough, but I don't have a clue how to start the coding.
PS: a single blob is no problem; the problem is multiple blobs.
Just a head start?
Have a look at QuickBlob, a small, standalone C library that sounds perfectly suited to your needs.
QuickBlob comes with a small command-line tool (csv-blobs) that outputs the position and size of each blob found within the input image:
./csv-blobs white image.png
X,Y,size,color
28.37,10.90,41,white
51.64,10.36,42,white
...
Here's an example (the output image was produced with the show-blobs.py tiny Python utility that comes with QuickBlob).
You can go through the binary image, labeling the connected parts with an algorithm like the following (a sketch follows below):
1. Create a 2D array of ints, labelArray, that will hold the labels of the connected regions, and initialize it to all zeros.
2. Iterate over each binary pixel, p, row by row.
3. If p is true and the corresponding value in labelArray is 0 (unlabeled), assign it a new label and do a breadth-first search that adds all surrounding binary pixels that are also true to that same label.
The only issue now is if you have multiple blobs touching each other. Because you know the size of the blobs, you should be able to figure out how many blobs are in a given connected region; this is the tricky part. You can try k-means clustering at this point, or other methods such as binary dilation.
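Here is a minimal sketch of that labeling pass. It is written as self-contained C++ (std::queue as the BFS frontier) for brevity, but it ports directly to C with a hand-rolled queue; the image layout (a vector of rows of 0/1 ints) is an assumption.

#include <queue>
#include <utility>
#include <vector>

// Label the 4-connected regions of a binary image. labels gets 0 for
// background and a blob id starting at 1; the return value is the blob count.
int labelBlobs(const std::vector<std::vector<int>>& img,
               std::vector<std::vector<int>>& labels)
{
    int rows = (int)img.size(), cols = (int)img[0].size();
    labels.assign(rows, std::vector<int>(cols, 0));
    const int dr[4] = {-1, 1, 0, 0}, dc[4] = {0, 0, -1, 1};
    int nextLabel = 0;

    for (int r = 0; r < rows; r++)
        for (int c = 0; c < cols; c++) {
            if (!img[r][c] || labels[r][c]) continue; // background or labeled
            nextLabel++;
            labels[r][c] = nextLabel;
            std::queue<std::pair<int, int>> q;        // the BFS frontier
            q.push({r, c});
            while (!q.empty()) {
                auto [cr, cc] = q.front();
                q.pop();
                for (int k = 0; k < 4; k++) {
                    int nr = cr + dr[k], nc = cc + dc[k];
                    if (nr >= 0 && nr < rows && nc >= 0 && nc < cols &&
                        img[nr][nc] && !labels[nr][nc]) {
                        labels[nr][nc] = nextLabel;
                        q.push({nr, nc});
                    }
                }
            }
        }
    return nextLabel;
}

Averaging the pixel coordinates per label then gives each blob's x,y centroid, which is what you need.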
I know that I am very late to the party, but I am just adding this for the benefit of people who are researching this problem.
Here is a nice description that might fit your needs.
http://www.mcs.csueastbay.edu/~grewe/CS6825/Mat/BinaryImageProcessing/BlobDetection.htm

How to identify optimal parameters for cvCanny for polygon approximation

This is my source image (ignore the points, they were added manually later):
My goal is to get a rough polygon approximation of the two hands. Something like this:
I have a general idea on how to do this; I want to use cvCanny to find edges, cvFindContours to find contours, and then cvApproxPoly.
The problem I'm facing is that I have no idea how to use cvCanny properly. In particular, what should I use for the last three parameters (threshold1, threshold2, apertureSize)? I tried:
cvCanny(source, cannyProcessedImage, 20, 40, 3);
but the result is not ideal. The left hand looks relatively fine, but for the right hand it detected very little.
In general it's not as reliable as I'd like. Is there a way to guess the "best" parameters for Canny, or at least a detailed explanation (understandable by a beginner) of what they do so I can make educated guesses? Or perhaps there's a better way to do this altogether?
It seems you have to lower your thresholds.
The Canny algorithm works with a hysteresis threshold: it selects a contour if at least one pixel is as bright as the high threshold, and then takes all connected contour pixels that are above the lower threshold.
Papers recommend choosing the two thresholds at a ratio of 2:1 or 3:1 (for example, 10 and 30, or 20 and 60). For some applications a manually determined, hardcoded threshold is enough; that may be your case too. I suspect that if you lower your thresholds you will get good results, because the images are not that complicated.
A number of methods to automatically determine the best Canny thresholds have been proposed. Most of them rely on gradient magnitudes to estimate the best thresholds.
Steps (a code sketch follows after the notes):
1. Extract the gradients (Sobel is a good option). You can convert the result to uchar: gradients can theoretically have values greater than 255, but that's OK, and OpenCV's Sobel can return uchars.
2. Make a histogram of the resulting image.
3. Take the high threshold at the 95th percentile of your histogram, and the low threshold as high/3.
You should probably adjust the percentile value depending on your application, but the results will be much more robust than hardcoded high and low values.
Note: an excellent threshold detection algorithm is implemented in Matlab. It is based on the idea above, but is a bit more sophisticated.
Note 2: this method will work if the contours and illumination do not vary a lot between image areas. If the contours are crisper in one part of the image, then you need locally adaptive thresholds, and that's another story. But looking at your pics, that should not be the case.
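A minimal sketch of that percentile idea with OpenCV's C++ API ("hands.png" is a placeholder for your source image, and the 95th percentile is the suggestion from the steps above, which you may need to tune):

#include <opencv2/opencv.hpp>

int main()
{
    cv::Mat gray = cv::imread("hands.png", cv::IMREAD_GRAYSCALE);

    // Step 1: gradient magnitude via Sobel.
    cv::Mat gx, gy, mag, mag8u;
    cv::Sobel(gray, gx, CV_32F, 1, 0);
    cv::Sobel(gray, gy, CV_32F, 0, 1);
    cv::magnitude(gx, gy, mag);
    // Values above 255 saturate to 255, which is fine for picking thresholds.
    mag.convertTo(mag8u, CV_8U);

    // Step 2: histogram of the 8-bit gradient magnitudes.
    int hist[256] = {0};
    for (int r = 0; r < mag8u.rows; r++)
        for (int c = 0; c < mag8u.cols; c++)
            hist[mag8u.at<uchar>(r, c)]++;

    // Step 3: high threshold at the 95th percentile, low = high / 3.
    float total = (float)mag8u.total(), cum = 0;
    int high = 255;
    for (int i = 0; i < 256; i++) {
        cum += hist[i];
        if (cum >= 0.95f * total) { high = i; break; }
    }
    double low = high / 3.0;

    cv::Mat edges;
    cv::Canny(gray, edges, low, high);
    cv::imshow("edges", edges);
    cv::waitKey();
    return 0;
}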
Maybe one of the easiest solutions is to apply Otsu thresholding to the grayscale image, find contours in the binary image, and then approximate them. Here's the code:
#include <opencv2/opencv.hpp>
using namespace cv;
using namespace std;

int main()
{
    Mat img = imread("test.png"), gray;
    vector<Vec4i> hierarchy;
    vector<vector<Point2i> > contours;
    cvtColor(img, gray, COLOR_BGR2GRAY);
    // Otsu picks the binarization threshold automatically.
    threshold(gray, gray, 0, 255, THRESH_OTSU);
    // Outer contours only, with straight runs compressed.
    findContours(gray, contours, hierarchy, RETR_EXTERNAL, CHAIN_APPROX_SIMPLE);
    for(size_t i=0; i<contours.size(); i++)
    {
        // Approximate each contour by a polygon with 5-pixel tolerance.
        approxPolyDP(contours[i], contours[i], 5, false);
        drawContours(img, contours, (int)i, Scalar(0,0,255));
    }
    imshow("result", img);
    waitKey();
    return 0;
}
And this is the result (the approximated contours drawn in red on the source image).

Detect signs on roads

I have a video that contains turn-left, turn-right, etc. marks on the roads.
I have to detect those signs. I am going ahead with template matching, in which I match the edge-detected outputs, but I am not getting satisfactory results. Is there any other way to detect them? Please help.
If you want a solution that is not too complicated but more robust than template matching, I suggest you go for Hough voting on SIFT descriptors. This method provides some degree of robustness to various problems, including partial occlusion of the sign, illumination variations, and deformations of the sign. In particular, the method is completely invariant to rotation and uniform scaling of the template object.
The basic idea of the algorithm is as follows:
a) extract SIFT features from the template and query images.
b) set an arbitrary reference point in the template image and calculate, for each keypoint in the template image, the vector from the keypoint to the reference point.
c) match keypoints from the template image to the query image.
d) cast a vote for each matched keypoint for all object locations in the query image that this keypoint agrees with. You do that using the vectors calculated in step (b) and the location, scale and orientation of the matched keypoints in the query image.
e) If the object is indeed located in the image, the votes map should have a strong local maximum at its location.
f) Optionally, you can verify the detection by using template matching.
You can read more about that method on Wikipedia here or in the original paper (by D. Lowe) here.
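Not the original author's code, but a rough sketch of steps (a)-(e) with OpenCV's C++ API, assuming OpenCV 4.4+; the file names, 16-pixel vote grid, and 0.8 ratio threshold are placeholder choices:

#include <opencv2/opencv.hpp>
#include <cmath>
#include <cstdio>
#include <vector>

int main()
{
    // Placeholder file names: a cropped sign template and a video frame.
    cv::Mat tmpl  = cv::imread("sign_template.png", cv::IMREAD_GRAYSCALE);
    cv::Mat frame = cv::imread("frame.png", cv::IMREAD_GRAYSCALE);

    // (a) SIFT features in both images.
    cv::Ptr<cv::SIFT> sift = cv::SIFT::create();
    std::vector<cv::KeyPoint> kt, kq;
    cv::Mat dt, dq;
    sift->detectAndCompute(tmpl, cv::noArray(), kt, dt);
    sift->detectAndCompute(frame, cv::noArray(), kq, dq);

    // (b) reference point: the template centre.
    cv::Point2f ref(tmpl.cols / 2.0f, tmpl.rows / 2.0f);

    // (c) match template keypoints to the frame (ratio test).
    cv::FlannBasedMatcher matcher;
    std::vector<std::vector<cv::DMatch>> knn;
    matcher.knnMatch(dt, dq, knn, 2);

    // (d) each match votes for the reference point's location in the frame,
    // on a coarse grid; scale and orientation correct the stored vector.
    const int cell = 16;
    cv::Mat votes = cv::Mat::zeros(frame.rows / cell + 1,
                                   frame.cols / cell + 1, CV_32F);
    for (const auto& m : knn) {
        if (m.size() < 2 || m[0].distance >= 0.8f * m[1].distance) continue;
        const cv::KeyPoint& t = kt[m[0].queryIdx];
        const cv::KeyPoint& q = kq[m[0].trainIdx];
        float s = q.size / t.size;
        float a = (q.angle - t.angle) * (float)CV_PI / 180.0f;
        cv::Point2f v = ref - t.pt;
        cv::Point2f rot(std::cos(a) * v.x - std::sin(a) * v.y,
                        std::sin(a) * v.x + std::cos(a) * v.y);
        cv::Point2f pred = q.pt + s * rot;
        int gx = (int)(pred.x / cell), gy = (int)(pred.y / cell);
        if (gx >= 0 && gx < votes.cols && gy >= 0 && gy < votes.rows)
            votes.at<float>(gy, gx) += 1.0f;
    }

    // (e) a strong maximum in the vote map marks the detected sign.
    double maxVotes;
    cv::Point peak;
    cv::minMaxLoc(votes, nullptr, &maxVotes, nullptr, &peak);
    std::printf("best candidate near (%d, %d), %.0f votes\n",
                peak.x * cell, peak.y * cell, maxVotes);
    return 0;
}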
Use SIFT or SURF. You can get invariant descriptors, and with training you can determine whether the vectors that represent the road marks (turn left, right, or stop) match the new ones in the video.
You might try extracting features and training a classifier (linear discriminant, neural network, naive Bayes, etc.). There are many candidate features you might try, but I'd think you wouldn't need anything too complicated, even if the edge detection is poor, as long as the isolation of the sign is good. Some features to consider: horizontal and vertical projections (row and column totals), and simple statistics of the edge pixels (mean, standard deviation, skewness, etc.). For more feature ideas, see any of these books:
"Shape Classification and Analysis: Theory and Practice", by Costa and Cesar
"Algorithms for Image Processing and Computer Vision", by J. R. Parker
"Digital Image Processing", by Gonzalez and Woods

Chart optimization: more than a million points

I have a custom control: a chart of, for example, 300x300 pixels, with more than one million points (maybe fewer) in it. Clearly, it now works very slowly. I am searching for an algorithm that will show only a few points with minimal visual difference.
I have a link to a component that has exactly the functionality I need
(2 million points demo):
I will be grateful for any materials, links, or thoughts on how to implement such functionality.
If I understand your question correctly, you are looking to plot a graph of a dataset with ~1M points, while the chart's horizontal resolution is much smaller. If so, you can down-sample your dataset to roughly the number of available x values. If your data is sampled at equal intervals, you can extract every N-th point and plot it. Choose N such that the number of points is, say, double the resolution (in this case, N = 2000 will give you 500 points to display).
If the intervals vary a lot (the data is not regularly spaced), you can approximate your graph with a polynomial, a spline, or any other method that fits, and then interpolate 300-600 points from that approximation.
EDIT:
Depending on the nature of the data, you may end up with aliasing artifacts when you simply sample every N-th point. There are probably better methods for coping with this problem, but again, it depends on what exactly you want to plot.
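For illustration, here is a sketch of the plain every-N-th sampling described above, plus min/max per-bucket decimation, a common trick (not from the answer itself) for reducing the aliasing mentioned in the EDIT; the function names are hypothetical:

#include <algorithm>
#include <cstddef>
#include <vector>

// Keep every N-th sample: fine for smooth, regularly spaced data.
std::vector<double> sampleEveryNth(const std::vector<double>& data, std::size_t n)
{
    std::vector<double> out;
    for (std::size_t i = 0; i < data.size(); i += n)
        out.push_back(data[i]);
    return out;
}

// Min/max decimation: one bucket per output column, keeping each bucket's
// extremes so that narrow spikes survive instead of aliasing away.
std::vector<double> minMaxDecimate(const std::vector<double>& data,
                                   std::size_t columns)
{
    std::vector<double> out;
    std::size_t bucket = std::max<std::size_t>(1, data.size() / columns);
    for (std::size_t i = 0; i < data.size(); i += bucket) {
        auto last = data.begin() + (std::ptrdiff_t)std::min(i + bucket, data.size());
        auto mm = std::minmax_element(data.begin() + (std::ptrdiff_t)i, last);
        out.push_back(*mm.first);   // bucket minimum
        out.push_back(*mm.second);  // bucket maximum
    }
    return out;
}

minMaxDecimate emits two points per screen column, so for a 300-pixel-wide chart you would pass columns = 300 and plot 600 points.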
You could always buy the control - it is for sale!
John-Daniel Trask (Co-founder of Mindscape ;-)

Recognizing tetris pieces in C

I have to make an application that recognizes a tetris piece, given by the user, inside a black-and-white image. I read the image to be analyzed into an array.
How can I do something like this using C?
Assuming you have already loaded the images into arrays, what about using regular expressions? You don't need exact shape matching, only approximate matching, so why not give it a try!
Edit: I downloaded your doc file. You must identify a random pattern among random figures on a 2D array, so regex isn't suitable for this problem; let's say that's the bad news. The good news is that your homework is not exactly image processing, and it's much easier.
It's your homework, so I won't write the code for you, but I can give you directions (a sketch of the first two routines follows below).
You need a routine that can create a new piece from the original pattern/piece, rotated (note: by "piece" I mean the 4x4 square, all of its cells).
You need a routine that checks whether a piece matches an area of the 2D image at position x,y; the matching area would have corners (x-2, y-2) and (x+1, y+1).
You search by checking every image position (x,y) for a match.
Since you must use parallelism, you can create four threads and assign each thread a different rotation to search for.
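A minimal C-style sketch of those first two routines. For simplicity it anchors the 4x4 window at its top-left corner rather than the centered (x-2, y-2) window described above, and image, W, H, and the exact-equality matching criterion are assumptions:

// Rotate a 4x4 piece 90 degrees clockwise: cell (r, c) moves to (c, 3 - r).
void rotate90(const int in[4][4], int out[4][4])
{
    for (int r = 0; r < 4; r++)
        for (int c = 0; c < 4; c++)
            out[c][3 - r] = in[r][c];
}

// Does the piece match the image with the 4x4 window's top-left cell at
// (x, y)? Window cells that fall outside the image must be empty.
int matchesAt(const int* image, int W, int H, const int piece[4][4], int x, int y)
{
    for (int r = 0; r < 4; r++)
        for (int c = 0; c < 4; c++) {
            int ix = x + c, iy = y + r;
            if (ix >= W || iy >= H) {
                if (piece[r][c]) return 0; // piece cell outside the image
            } else if (image[iy * W + ix] != piece[r][c]) {
                return 0;                  // cell mismatch
            }
        }
    return 1;
}

Applying rotate90 zero to three times gives the four orientations, one per search thread.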
You might not want to implement that from scratch (unless it's required, of course); I'd recommend looking for a suitable library. I've heard that OpenCV is good, but I've never done any machine-vision work myself, so I haven't tested it.
Search for connected components (e.g. using depth-first search; you might want to avoid recursion if efficiency is an issue and use your own stack instead). The largest connected component should be your tetris piece. You can then analyze it further (using its shape, its size, or some kind of border description).
Looking at the shapes given for tetris pieces in Wikipedia, called "I, J, L, O, S, T, Z", it seems that the ratios of the sides of the bounding box (easy to find given a binary image and C) reveal whether you have I (4:1) or O (1:1); the other shapes fit a 2:3 box.
To detect which of the remaining shapes you have (J, L, S, T, or Z), it looks like you could collect the lengths and positions of the shape's edges that fall on the bounding box's edges. For example, T would show 3 and 1 along the 3-sides, and 1 and 1 along the 2-sides. Keeping track of the positions helps distinguish J from L, and S from Z.
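As a tiny illustration of that ratio test (dimensions measured in cells; classifyByBox is a hypothetical helper):

#include <algorithm>

// Classify a piece by the side ratio of its bounding box, measured in cells.
// 'I' is 4:1 and 'O' is 1:1; the 2:3 group (J, L, S, T, Z) needs the extra
// edge analysis described above.
char classifyByBox(int boxW, int boxH)
{
    int longSide  = std::max(boxW, boxH);
    int shortSide = std::min(boxW, boxH);
    if (longSide == 4 * shortSide) return 'I';
    if (longSide == shortSide)     return 'O';
    return '?';
}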
