Search image pattern - C

I need to write a program that does this: given an image (5x5 pixels), I have to find how many occurrences of it exist inside another, larger image that is composed of many other images. That is, I need to search for a given pattern in an image.
The language to use is C. I have to use parallel computing to search at the 4 angles (0°, 90°, 180° and 270°).
What is the best way to do that?

Seems straightforward.
Create 4 versions of the image rotated by 0°, 90°, 180°, and 270°.
Start four threads each with one version of the image.
For all positions from (0,0) to (width - 5, height - 5)
Compare the 25 pixels of the reference image with the 25 pixels at the current position.
If they are equal enough using some metric, report the finding.
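A minimal sketch of that scheme in C with POSIX threads, assuming an 8-bit grayscale image stored row-major and exact pixel equality as the metric (the struct layout and names are illustrative, not part of the answer above):

#include <pthread.h>
#include <stdio.h>

#define PAT 5  /* the pattern is 5x5 */

/* one job per rotation of the pattern */
struct job {
    const unsigned char *image;      /* big image, row-major */
    int width, height;
    unsigned char pattern[PAT*PAT];  /* one rotated copy of the 5x5 template */
    int matches;                     /* result: number of matches found */
};

static void *scan(void *arg)
{
    struct job *j = arg;
    for (int y = 0; y <= j->height - PAT; y++) {
        for (int x = 0; x <= j->width - PAT; x++) {
            int equal = 1;
            for (int py = 0; py < PAT && equal; py++)
                for (int px = 0; px < PAT; px++)
                    if (j->image[(y+py)*j->width + (x+px)] != j->pattern[py*PAT + px]) {
                        equal = 0;
                        break;
                    }
            if (equal)
                j->matches++;   /* or apply a similarity metric instead of exact equality */
        }
    }
    return NULL;
}

/* caller fills jobs[0..3] with the 4 rotations of the pattern, then: */
void search_all_rotations(struct job jobs[4])
{
    pthread_t tid[4];
    for (int i = 0; i < 4; i++)
        pthread_create(&tid[i], NULL, scan, &jobs[i]);
    for (int i = 0; i < 4; i++)
        pthread_join(tid[i], NULL);
}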

Use normalized correlation to determine a match of templates.
@Daniel: Daniel's solution is good for leveraging your multiple CPUs, but he doesn't mention a quality metric, so I would like to suggest one that is very common in image processing.
I suggest using normalized correlation [1] as a comparison metric because it outputs a number from -1 to +1: 0 means no correlation, 1 is output if the two templates are identical, and -1 if the two templates are exact opposites.
Once you compute the normalized correlation you can test to see if you have found the template by doing either a threshold test or a peak-to-average test[2].
[1 - footnote] How do you implement normalized correlation? It is pretty simple and only needs two for loops. Once you have an implementation that is good enough, you can verify it by checking that comparing an image against itself gives you a 1.
[2 - footnote] You take the ratio max(array) / average(array_without_peak), then threshold it to make sure you have a good peak-to-average ratio.
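For reference, a rough sketch of how such a normalized correlation could be computed in C, assuming an 8-bit grayscale image stored row-major (this particular implementation is an illustration, not the answerer's code):

#include <math.h>

#define PAT 5

/* normalized correlation between a 5x5 template and the 5x5 image window
   whose top-left corner is at (x, y); returns a value in [-1, +1] */
double ncc(const unsigned char *image, int width, int x, int y,
           const unsigned char *templ)
{
    double mean_i = 0.0, mean_t = 0.0;
    for (int py = 0; py < PAT; py++)
        for (int px = 0; px < PAT; px++) {
            mean_i += image[(y+py)*width + (x+px)];
            mean_t += templ[py*PAT + px];
        }
    mean_i /= PAT*PAT;
    mean_t /= PAT*PAT;

    double num = 0.0, var_i = 0.0, var_t = 0.0;
    for (int py = 0; py < PAT; py++)
        for (int px = 0; px < PAT; px++) {
            double di = image[(y+py)*width + (x+px)] - mean_i;
            double dt = templ[py*PAT + px] - mean_t;
            num   += di * dt;
            var_i += di * di;
            var_t += dt * dt;
        }
    if (var_i == 0.0 || var_t == 0.0)
        return 0.0;                  /* flat patch: correlation is undefined */
    return num / sqrt(var_i * var_t);
}

Comparing an image patch against itself returns exactly 1, which is the sanity check suggested in footnote [1].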

There's no need to create the additional three versions of the image; just address it differently or use something like the class I created here. Better still, just duplicate the 5x5 matrix and rotate the copies instead. You can then linearly scan the image once for all rotations (which is a good thing).
This problem will not scale well for parallel processing since the bottleneck is certainly accessing the image data. Having multiple threads accessing the same data will slow it down, especially if the threads get 'out of sync', i.e. one thread gets further through the image than the other threads so that the other threads end up reloading the data the first thread has discarded.
So, the solution I think will be most efficient is to create four threads that each scan a five-line window of the image, one thread per rotation. A fifth thread loads the image data one line at a time and passes the line to each of the four scanning threads, waiting for all four to complete: load one line of the image, append it to the five-line buffer, start the four scanning threads, wait for the threads to end, and repeat until all image lines have been read.

5 * 5 = 25
25 bits fit in an integer.
Each image can be encoded as an array of 4 integers.
Iterate over your larger image (hopefully it is not too big),
pulling out all 5 * 5 sub-images, convert each to the same integer encoding, and compare.
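A minimal sketch of the packing idea, under the assumption of a 1-bit (black/white) image stored one byte per pixel, with one packed 25-bit code per rotation of the pattern (equivalently, you could rotate each sub-image instead of the pattern):

#include <stdint.h>

/* pack the 5x5 window whose top-left corner is at (x, y) into a 25-bit code */
uint32_t pack25(const unsigned char *image, int width, int x, int y)
{
    uint32_t code = 0;
    for (int py = 0; py < 5; py++)
        for (int px = 0; px < 5; px++)
            code = (code << 1) | (image[(y+py)*width + (x+px)] & 1u);
    return code;
}

/* codes[0..3] hold the packed pattern at 0, 90, 180 and 270 degrees */
int count_matches(const unsigned char *image, int width, int height,
                  const uint32_t codes[4])
{
    int n = 0;
    for (int y = 0; y <= height - 5; y++)
        for (int x = 0; x <= width - 5; x++) {
            uint32_t c = pack25(image, width, x, y);
            for (int i = 0; i < 4; i++)
                if (c == codes[i]) { n++; break; }
        }
    return n;
}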

Related

How to normalize multiple arrays of different size in MATLAB

I use a set of images for image processing, where each image generates a unique code (a Freeman chain code). The size of the array for each image varies, but the values range from 0 to 7. For example, the first image creates an array of 3124 elements; the second image creates an array of 1800 elements.
Now, for further processing, I need a fixed size for those arrays. So, is there any way to normalize them?
There is a reason why you are getting different sized arrays when applying a chain code algorithm to different images: the contours that represent each shape are completely different. For example, the letters C and D will most likely produce chain codes of different lengths, because you are describing a shape as a chain of values from a starting position. The values ranging from 0-7 simply tell you which direction to look next, given the current position in the shape. Usually, chain codes follow this convention:
3 2 1
4 x 0
5 6 7
0 means to move to the east, 1 means to move north east, 2 means to move north and so on. Therefore, if we had the following contour:
o o x
o
o o o
With the starting position at x, the chain code would be:
4 4 6 6 0 0
Chain codes encode how we should trace the perimeter of an object given a starting position. Now, what you are asking is whether we can take two different contours with different shapes and represent them using the same number of chain code values. You can't, because of the varying length of the chain code.
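To make the convention concrete, here is a tiny sketch (in C, to stay consistent with the rest of this page, although the question itself is about MATLAB) that follows the example chain code 4 4 6 6 0 0 from the starting position and prints the coordinates it visits:

#include <stdio.h>

/* (dx, dy) for directions 0..7, following the convention above:
   0 = east, 1 = north-east, 2 = north, ..., 7 = south-east.
   y grows downwards here, so "north" means dy = -1. */
static const int dx[8] = { 1,  1,  0, -1, -1, -1,  0,  1 };
static const int dy[8] = { 0, -1, -1, -1,  0,  1,  1,  1 };

int main(void)
{
    int code[] = { 4, 4, 6, 6, 0, 0 };  /* the example chain code */
    int x = 2, y = 0;                   /* starting position of 'x' in the example contour */
    printf("(%d,%d)\n", x, y);
    for (int i = 0; i < 6; i++) {
        x += dx[code[i]];
        y += dy[code[i]];
        printf("(%d,%d)\n", x, y);      /* traces out the C-shaped contour above */
    }
    return 0;
}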
tl;dr
In general, you can't. The different sized arrays mean that the contours represented by those chain codes have different lengths. What you are actually asking is whether you can represent two different and unrelated contours / chain codes with the same number of elements, and the short answer is no.
What you need to think about is why you want to do this. Are you trying to compare the shapes of different contours? If you are, then chain codes are not the best way to do that, because of how sensitive they are to changes in the contour: adding the slightest bit of noise would result in an entirely different chain code.
Instead, you should investigate shape similarity measures. An authoritative paper by Remco Veltkamp discusses different shape similarity measures for the purposes of shape retrieval. See here: http://www.staff.science.uu.nl/~kreve101/asci/smi2001.pdf . Measures such as the Hausdorff distance, the Minkowski distance, or even simple moments are some of the most popular ones in use.

Rendering an image using multiple threads

I have a ray tracing algorithm, which works with only 1 thread and I am trying to make it work with any number of threads.
My question is: how can I divide this task among the threads?
At first my instructor told me to just divide the width of the image; for example, if I have an 8x8 image and I want 2 threads to do the task, let thread 1 render the horizontal area 0 to 3 (all the way down vertically, of course) and thread 2 render the horizontal area 4 to 7.
I found this approach to work perfectly when both my image width and the number of threads are powers of 2, but I have no idea how to deal with an odd number of threads, or with any number of threads that doesn't divide the width without a remainder.
My approach to this problem was to let the threads render the image by alternating; for example, if I have an 8x8 image and, let's say, 3 threads:
thread 1 renders pixels 0, 3, 6 in the horizontal direction
thread 2 renders pixels 1, 4, 7 in the horizontal direction
thread 3 renders pixels 2, 5 in the horizontal direction
Sorry that I can't provide all my code, since there are more than 5 files with a few hundred lines of code in each one.
Here are the for loops that loop through the horizontal area; the vertical loop is inside these, but I am not going to provide it here.
My instructor's suggestion:
for( int px=(threadNum*(width/nthreads)); px < ((threadNum+1)*(width/nthreads)); ++px )
threadNum is the current thread that I am on (meaning thread 0,1,2 and so on)
width is the width of the image
nthreads is the overall number of threads.
My solution to this problem:
for( int px= threadNum; px< width; px+=nthreads )
I know my question is not very clear, and I'm sorry that I can't provide the whole code here, but basically all I am asking is: what is the best way to divide the rendering of the image among a given number of threads (which can be any positive number)? Also, I want the threads to render the image by columns, meaning I can't touch the part of the code which handles vertical rendering.
Thank you, and sorry for the chaotic question.
First, let me tell you that, under the assumption that the rendering of each pixel is independent of the other pixels, your task is what in the HPC field is called an "embarrassingly parallel problem": a problem that can be efficiently divided between any number of threads (until each thread has a single "unit of work"), without any communication between the threads (which is very good).
That said, it doesn't mean that any parallelization scheme is as good as any other. For your specific problem, I would say that the two main factors to keep in mind are load balancing and cache efficiency.
Load balancing means that you should divide the work among threads in such a way that each thread has roughly the same amount of work: this way you prevent one or more threads from waiting for that one last thread that still has to finish its last job.
E.g.
You have 5 threads and you split your image into 5 big chunks (let's say 5 horizontal strips, but they could be vertical and it wouldn't change the point). Since the problem is embarrassingly parallel, you expect a 5x speedup, and instead you get a meager 1.2x.
The reason might be that your image has most of its computationally expensive details in its lower part (I know nothing of rendering, but I assume that a reflective object might take far more time to render than flat empty space), because it is composed of a set of polished metal marbles on the floor of an otherwise empty frame.
In this scenario, only one thread (the one with the bottom 1/5 of the image) does all the work anyway, while the other 4 remain idle after finishing their brief tasks.
As you can imagine, this isn't a good parallelization: considering load balancing alone, the best scheme would be to assign interleaved pixels to each core, under the (very reasonable) assumption that the complexity of the image averages out across the threads (true for natural images; it might yield surprises in very limited scenarios).
With this solution, the work is (statistically) evenly distributed among threads, and the worst-case scenario is N-1 threads waiting for a single thread to compute a single pixel (you wouldn't notice, performance-wise).
To do that you need to cycle over all pixels, forgetting about lines, in this way (pseudo code, not tested):
for(i = thread_num; i < width * height; i += n_threads)
The second factor, cache efficiency, deals with the way computers are designed, specifically the fact that they have many layers of cache to speed up computations and prevent the CPUs from starving (remaining idle while waiting for data); accessing data in the "right way" can speed up computations considerably.
It's a very complex topic, but in your case a rule of thumb might be: "feeding each thread the right amount of memory will improve the computation" (emphasis on "right amount" intended...).
It means that, even if passing interleaved pixels to each thread is probably the perfect balancing, it's probably the worst possible memory access pattern you could devise, and you should pass "bigger chunks" instead, because this keeps the CPU busy (note: memory alignment also comes heavily into play: if your image has padding after each line to keep lines multiples of, say, 32 bytes, as some image formats do, you should take that into consideration!).
Without expanding an already verbose answer to alarming sizes, this is what I would do (I'm assuming the memory of the image is contiguous, without padding between lines!):
1) Create a program that splits the image into chunks of N consecutive pixels (use a preprocessor constant or a command-line argument for N, so you can change it!) for each of M threads, like this (a short sketch appears at the end of this answer):
1111111122222222333333334444444411111111
2) Do some profiling for various values of N, stepping from 1 to, let's say, 2048, by powers of two (good values to test might be: 1 to get a baseline, 32, 64, 128, 256, 512, 1024, 2048).
3) Find out where the perfect balance is between perfect load balancing (N=1) and best caching (N <= the biggest cache line in your system).
4a) Try the program on more than one system, and keep the smallest value of N that gives the best results across the machines, in order to make your code run fast everywhere (as the caching details vary among systems).
4b) If you really, really want to squeeze every cycle out of every system you install your code on, forget step 4a and write code that automatically finds the best value of N by rendering a small test image before tackling the appointed task :)
5) Fool around with SIMD instructions (just kidding... sort of :) )
A bit theoretical (and overly long...), but still I hope it helps!
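As promised in step 1, here is a minimal sketch of the chunked assignment, where CHUNK plays the role of N and nthreads the role of M; render_pixel is a hypothetical per-pixel entry point of your ray tracer, not a real function from your code:

#define CHUNK 64   /* N consecutive pixels per chunk; tune this by profiling */

void render_pixel(int x, int y);   /* hypothetical: renders one pixel, provided elsewhere */

/* each thread walks over its own chunks in an interleaved fashion */
void render_thread(int thread_num, int nthreads, int width, int height)
{
    int total = width * height;
    for (int c = thread_num * CHUNK; c < total; c += nthreads * CHUNK)
        for (int i = c; i < c + CHUNK && i < total; i++)
            render_pixel(i % width, i / width);
}

With CHUNK = 1 this degenerates to the interleaved per-pixel scheme, and with a very large CHUNK it approaches the strip scheme, so profiling over CHUNK explores exactly the trade-off described above.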
An alternating division of the columns will probably lead to suboptimal cache usage; the threads should operate on a larger contiguous range of data. By the way, if your image is stored row-wise, it would also be better to distribute the rows instead of the columns.
This is one way to divide the data equally with any number of threads:
#define min(x,y) ((x) < (y) ? (x) : (y))
/*...*/
int q = width / nthreads;                     /* base number of columns per thread */
int r = width % nthreads;                     /* leftover columns */
int w = q + (threadNum < r);                  /* columns handled by this thread */
int start = threadNum*q + min(threadNum,r);   /* first column of this thread */
for( int px = start; px < start + w; px++ )
/*...*/
The remainder r is distributed over the first r threads. This is important when calculating the start index for a thread.
For the 8x8 image this would lead to:
thread 0 renders columns 0-2
thread 1 renders columns 3-5
thread 2 renders columns 6-7

How to approximate line segments in a grayscale image?

Does anyone know how to approximate lines from a grayscale image produced by a line segment detector, using OpenCV or the C language? In the attached image you can see that each finger is composed of many lines. What I need to do is make each finger consist of exactly two parallel lines (i.e. approximate the small lines so that they fit into only one line). If anyone can help me, I will appreciate it.
N.B. I'm new to Stack Overflow and therefore not allowed to post images, so for clarification here is the link to the image:
http://www.2shared.com/photo/Ff7mFtV3/Optimal.html
Grayscale image resulting from the line segment detector (LSD)
What have you done so far? You might need some heuristics. First, add all segments to a table, compute the inclination of each segment, and then sort them using this as an index. Afterwards, consider all segments whose inclinations are within, say, 5% of each other to have the exact same inclination. This induces a partitioning of the table. You might want to draw the partitions using different colors so that you can find the right parameter value.
Now you need to 'merge' all segments that have the same inclination and are close together. I'd try to measure the distance between the segments (google an algorithm for that) and sort the segments of each partition according to it. Consider merging segments that are closer than, for instance, 3% of the total image height in pixels (find the threshold empirically).
The last step, actually merging the segments, should be very easy compared to the rest.
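A rough sketch of the first grouping step, assuming segments are given as endpoint pairs (the struct layout and the tolerance value are illustrative, not from the answer above):

#include <math.h>
#include <stdlib.h>

#ifndef M_PI
#define M_PI 3.14159265358979323846
#endif

struct segment { double x1, y1, x2, y2; int group; };

/* angle of a segment in degrees, folded into [0, 180) so that opposite
   directions count as the same inclination */
static double inclination(const struct segment *s)
{
    double a = atan2(s->y2 - s->y1, s->x2 - s->x1) * 180.0 / M_PI;
    if (a < 0.0) a += 180.0;
    return a;
}

static int by_inclination(const void *pa, const void *pb)
{
    double a = inclination(pa), b = inclination(pb);
    return (a > b) - (a < b);
}

/* sort by angle and give segments within 'tol' degrees of each other
   the same group id; tune 'tol' empirically (a few degrees, say) */
void group_by_inclination(struct segment *seg, int n, double tol)
{
    qsort(seg, n, sizeof *seg, by_inclination);
    int group = 0;
    for (int i = 0; i < n; i++) {
        if (i > 0 && inclination(&seg[i]) - inclination(&seg[i-1]) > tol)
            group++;
        seg[i].group = group;
    }
}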
If you really want to find the fingers, you can stop earlier and compare the groups of equal inclination to check whether two of them are almost parallel (within 7% or so). The 5 closest pairs of inclinations should be the fingers :-)

Loading tiles for a 2D game

I'm trying to make a 2D online game (with Z positions), and currently I'm working on loading a map from a txt file. I have three different map files: one contains an int for each tile saying what kind of floor there is, one says what kind of decoration there is, and one says what might be covering the tile. The problem is that the current map (20, 20, 30) takes 200 ms to load, and I want it to be much, much bigger. I have tried to find a good solution for this and have so far come up with some ideas.
Recently I've thought about storing all tiles in separate files, one file per tile. I'm not sure if this is a good idea (it feels wrong somehow), but it would mean that I wouldn't have to store any unnecessary tiles as "-1" in a text file, and I would be able to just pick the right tile from the folder easily during run time (read the file named mapXYZ). If the tile is empty I would just catch the FileNotFoundException. Could anyone tell me a reason why this is a bad solution? Other solutions I've thought about would be to split the map into smaller parts or to read the map during startup in a BackgroundWorker.
Try making a much larger map in the same format as your current one first - it may be that the 200ms is mostly just overhead of opening and initial processing of the file.
If I'm understanding your proposed solution (opening one file per X,Y or X,Y,Z coordinate of a single map), this is a bad idea for two reasons:
There will be significant overhead to opening so many files.
Catching a FileNotFoundException and eating it will be significantly slower: there is actually a lot of overhead in catching exceptions, so you shouldn't rely on them to perform application logic.
Are you loading the file from a remote server? If so, that's why it's taking so long; instead, you should embed the file in the game. I'm saying this because you probably use 2-3 bytes per tile, so the file is about 30 kB, and 200 ms sounds like a reasonable download time for a file of that size (including overhead etc., and depending on your internet connection).
Regarding how to lower the file size, there are two easy techniques I can think of that will decrease it a bit:
1) If you have mostly empty squares and only some significant ones, your map is what is often referred to as 'sparse'. When storing a sparse array of data you can use a simple compression technique (formally known as 'run-length encoding') where each time you come across empty squares, you specify how many of them there are. So for example, instead of {0,0,0,0,0,0,0,0,0,0,1,1,2,3,0,0,0,0,0,0,0,0,0,0,0,0,1} you could store {10 0's, 1, 1, 2, 3, 12 0's, 1} (a minimal sketch of this appears after this list).
2) To save space, I recommend that you store everything as binary data. The exact layout of the file mainly depends on how many possible tile types there are, but this is a better solution than storing the ASCII characters corresponding to the base-10 representation of the numbers, separated by delimiters.
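A minimal sketch of point 1), in C for consistency with the rest of this page: it run-length encodes a flat array of tile ids, assuming 0 marks an empty tile (that choice of marker is an assumption, not from the question):

#include <stdio.h>

/* print a run-length encoded view of 'tiles': runs of 0 are collapsed
   to a count, other values are emitted as-is */
void rle_print(const int *tiles, int n)
{
    int i = 0;
    while (i < n) {
        if (tiles[i] == 0) {
            int run = 0;
            while (i < n && tiles[i] == 0) { run++; i++; }
            printf("%d 0's, ", run);
        } else {
            printf("%d, ", tiles[i]);
            i++;
        }
    }
    printf("\n");
}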
Example Binary Format
File is organized into segments which are 3 or 4 bytes long, as explained below.
First segment indicates the version of the game for which the map was created. 3 bytes long.
Segments 2, 3, and 4 indicate the dimensions of the map (x, y, z). 3 bytes long each.
The remaining segments each indicate a tile number; such a segment is 3 bytes long with an MSB of 0. The exception to this follows.
If one of the tile segments represents an empty tile, it is 4 bytes long with an MSB of 1, and indicates the number of consecutive empty tiles (including that tile) that follow.
The reason I suggest the MSB flag is so that you can distinguish between segments which are for tiles, and segments which indicate the number of empty tiles which follow that segment. For those segments I increase the length to 4 bytes (you might want to make it 5) so that you can store larger numbers of empty tiles per segment.
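A rough sketch of writing segments in that format; the exact byte order (big-endian, with the flag in the MSB of the first byte) is an assumption layered on top of the description above:

#include <stdio.h>

/* write one 3-byte tile segment: MSB of the first byte is 0,
   leaving 23 bits for the tile number */
static void write_tile(FILE *f, unsigned int tile)
{
    unsigned char seg[3];
    seg[0] = (tile >> 16) & 0x7F;   /* flag bit 0 + high bits of the tile id */
    seg[1] = (tile >> 8)  & 0xFF;
    seg[2] =  tile        & 0xFF;
    fwrite(seg, 1, 3, f);
}

/* write one 4-byte empty-run segment: MSB of the first byte is 1,
   leaving 31 bits for the run length */
static void write_empty_run(FILE *f, unsigned int count)
{
    unsigned char seg[4];
    seg[0] = 0x80 | ((count >> 24) & 0x7F);
    seg[1] = (count >> 16) & 0xFF;
    seg[2] = (count >> 8)  & 0xFF;
    seg[3] =  count        & 0xFF;
    fwrite(seg, 1, 4, f);
}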

About the curse of dimensionality

My question is about this topic I've been reading about a bit. Basically my understanding is that in higher dimensions all points end up being very close to each other.
The doubt I have is whether this means that calculating distances the usual way (Euclidean, for instance) is valid or not. If it were still valid, it would mean that when comparing vectors in high dimensions, the two most similar ones wouldn't differ much from a third one, even when this third one could be completely unrelated.
Is this correct? Then in this case, how would you be able to tell whether you have a match or not?
Basically, the distance measurement is still correct; however, it becomes meaningless when you have "real world" data, which is noisy.
The effect we talk about here is that a high distance between two points in one dimension gets quickly overshadowed by small distances in all the other dimensions. That's why, in the end, all points somewhat end up with the same distance. There is a good illustration of this:
Say we want to classify data based on their value in each dimension. We just divide each dimension (which has a range of 0..1) once: values in [0, 0.5) are positive, values in [0.5, 1] are negative. With this rule, in 3 dimensions, 12.5% of the space is covered. In 5 dimensions, it is only 3.1%. In 10 dimensions, it is less than 0.1%.
So in each dimension we still allow half of the overall value range, which is quite a lot. But all of it ends up in 0.1% of the total space: the differences between these data points are huge in each dimension, but negligible over the whole space.
You can go further and say that in each dimension you cut only 10% of the range, so you allow values in [0, 0.9). You still end up with less than 35% of the whole space covered in 10 dimensions. In 50 dimensions, it is 0.5%. So you see, wide ranges of data in each dimension are crammed into a very small portion of your search space.
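Those percentages are just the kept fraction of each dimension raised to the number of dimensions; a few lines of C reproduce them:

#include <stdio.h>
#include <math.h>

int main(void)
{
    /* fraction of the unit hypercube covered when keeping a fraction f
       of the range in each of d dimensions: f^d */
    printf("%.3f%%\n", 100.0 * pow(0.5, 3));   /* 12.500% */
    printf("%.3f%%\n", 100.0 * pow(0.5, 10));  /*  0.098% */
    printf("%.3f%%\n", 100.0 * pow(0.9, 10));  /* 34.868% */
    printf("%.3f%%\n", 100.0 * pow(0.9, 50));  /*  0.515% */
    return 0;
}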
That's why you need dimensionality reduction, where you basically disregard differences on less informative axes.
Here is a simple explanation in layman's terms.
I tried to illustrate this with the simple illustration shown below.
Suppose you have two data features, x1 and x2 (you can assume they are blood pressure and blood sugar levels), and you want to perform k-nearest-neighbor classification. If we plot the data in 2D, we can easily see that the data group together nicely; each point has some close neighbors that we can use for our calculations.
Now let's say we decide to consider a new third feature x3 (say age) for our analysis.
Case (b) shows a situation where all of our previous data come from people of the same age. You can see that they are all located at the same level along the age (x3) axis.
Now we can quickly see that if we want to consider age for our classification, there is a lot of empty space along the age (x3) axis.
The data that we currently have cover only a single level of the age axis. What happens if we want to make a prediction for someone of a different age (the red dot)?
As you can see, there are not enough data points close to this point to calculate the distance and find some neighbors. So, if we want good predictions with this new third feature, we have to go and gather more data from people of different ages to fill the empty space along the age axis.
Case (c) essentially shows the same concept. Here, assume our initial data were gathered from people of different ages (i.e. we did not care about age in our previous two-feature classification task and might have assumed that this feature had no effect on our classification).
In this case, assume our 2D data come from people of different ages (the third feature). Now, what happens to our relatively closely located 2D data if we plot them in 3D? They become more distant from each other (more sparse) in our new, higher-dimensional (3D) space. As a result, finding neighbors becomes harder, since we don't have enough data for different values along our new third feature.
You can imagine that as we add more dimensions, the data drift further and further apart. (In other words, we need more and more data if we want to avoid sparsity in our data.)
