What is the meaning of the angles output parameter in VLFeat's SIFT function?

I am wondering what the output parameter angles is and why its length is 4.

Have a look at the original paper, section 5. Orientation assignment:
An orientation histogram is formed from the gradient orientations of sample points within a region around the keypoint [...] Peaks in the orientation histogram correspond to dominant directions of local gradients. The highest peak in the histogram is detected, and then any other local peak that is within 80% of the highest peak is used to also create a keypoint with that orientation. Therefore, for locations with multiple peaks of similar magnitude, there will be multiple keypoints created at the same location and scale but different orientations.
This is also explained by the VLFeat implementation (see sift.c):
This histogram is then smoothed and the maximum is selected. In
addition to the biggest mode, up to other three modes whose amplitude
is within the 80% of the biggest mode are retained and returned as
additional orientations.
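To make the 80% rule concrete, here is a minimal Python sketch of the peak selection (not VLFeat's actual sift.c code: it omits the histogram smoothing and the quadratic interpolation of peak positions that the real implementation performs):

import numpy as np

def dominant_orientations(hist, threshold=0.8, max_peaks=4):
    # Return up to `max_peaks` dominant orientations (radians) from a smoothed
    # 36-bin gradient-orientation histogram, following the 80% rule above.
    n = len(hist)
    peaks = [i for i in range(n)
             if hist[i] > hist[i - 1] and hist[i] > hist[(i + 1) % n]]  # circular local maxima
    if not peaks:
        return []
    best = max(hist[i] for i in peaks)
    keep = sorted((i for i in peaks if hist[i] >= threshold * best),
                  key=lambda i: hist[i], reverse=True)[:max_peaks]
    return [2.0 * np.pi * i / n for i in keep]

# Example: a histogram with two strong modes yields two orientations,
# i.e. two keypoints at the same location and scale.
h = np.zeros(36)
h[5], h[20] = 1.0, 0.9
print(dominant_orientations(h))

Each returned angle becomes a separate keypoint at the same location and scale, which is why the angles output can hold up to four values per frame.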

Related

2D geometry: angle + power prediction for a shot

Before anything else, please look at this:
http://i.stack.imgur.com/pQDAC.jpg
I want to know how to predict the right angle and power for the AI so that its shot hits the player. Both of them have static positions and do not move.
Thanks for any help.
You would need to take a look at horizontal projectiles for starters. The problem is that different power will require different angles of launch, so you would need to try out various power ranges or angle ranges.
EDIT: The image you have attached describes the path of a given projectile (a bullet, bomb, or any item you throw) across a horizontal plane (parallel to the ground). These particular types of problems usually require a variation of the equations of linear motion, which is what you have there on the website.
Besides the equations of motion, the website I linked should give you some simple problems and how you can solve them to make sure that you are following.
As per your question, the targets are static, so the distance component of the equation is known and will not change. The other components you will need to find are the angle of launch and the initial velocity of the round (denoted by the power you use).
One approach would be to take a range of angles, [1, 89] degrees inclusive, and see what initial velocity you would need to make the projectile travel that distance.
If you will be dealing with situations identical to the image, that is, with no obstacles in the middle, you can also assume that the angle of launch is always 45 degrees, since that always gives you the maximum range for a constant initial velocity. If you take this approach, you simply need to find the initial velocity required to make the projectile travel that distance at an angle of 45 degrees.
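As a rough sketch of that approach (Python, flat ground, no air resistance, and a made-up target distance), you can invert the standard range equation R = v^2 * sin(2*theta) / g to get the launch speed required for any angle:

import math

G = 9.81  # gravitational acceleration, m/s^2

def launch_speed(distance, angle_deg):
    # Initial speed needed to travel `distance` on flat ground when launched
    # at `angle_deg`, ignoring drag: from R = v^2 * sin(2*theta) / g
    # we get v = sqrt(R * g / sin(2*theta)).
    theta = math.radians(angle_deg)
    return math.sqrt(distance * G / math.sin(2.0 * theta))

target_distance = 30.0  # metres; made-up example value
for angle in (15, 30, 45, 60, 75):
    # Pick whichever (angle, speed) pair your game's power limits allow;
    # 45 degrees needs the lowest speed.
    print(angle, round(launch_speed(target_distance, angle), 2))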

shadow and shading

I have read lots of ray tracer algorithms on the web, but I have no clear understanding of shading and shadows. Is the pseudocode below, written according to my understanding, correct?
for each primitive
    check for intersection
    if there is one
        do color be half of the background color
        Ishadow = true
        break
for each ambient light in environment
    calculate light contribution to the color
if ( Ishadow == false )
    for each point light
        calculate diffuse shading
        calculate reflection direction
        calculate specular light
        trace for reflection ray                // (i)
        add color returned from (i) after multiplied by some coefficient
        trace for refraction ray                // (ii)
        add color returned from (ii) after multiplied by some coefficient
return color value calculated until this point
You should integrate your shadows with the normal ray-tracing path:
For every screen pixel you send a ray through the scene and eventually determine the closest object intersection. At that intersection point you first read out the surface colour (the object's texture at that point) and calculate the reflection vector etc. using the normal vector. In addition, you now cast a ray from that intersection point to each of the light sources in your scene: if such a ray intersects another object before reaching the light source, the intersection point is in shadow with respect to that light, and you adapt the final colour of that point accordingly.
The trouble with pseudocode is that it is easy to get "pseudo" enough that it becomes the same well of ambiguity that we are trying to avoid by getting away from natural languages. "Color be half of the background color?" The fact that this line appears before you iterate through your light sources is confusing. How can you be setting Ishadow before you iterate over light sources?
Maybe a better description would be:
given a ray in space
    find nearest object with which the ray intersects
    for each point light
        if normal at surface of intersected object points toward light (use dot product for this)
            cast a ray into space from the surface toward the light
            if ray intersection is closer than the light*, the light is shadowed at this point
*If you're seeing strange artifacts in your shadows, there is a mistake that is made by every single programmer when they write their first ray tracer. Floating point (or double-precision) math is imprecise and you will frequently (about half the time) re-intersect yourself when doing a shadow trace. The explanation is a bit hard to describe without diagrams, but let me see what I can do.
If you have an intersection point on the surface of a sphere, under most circumstances, that point's representation in a floating point register is not mathematically exact. It is either slightly inside or slightly outside the sphere. If it is inside the sphere and you try to run an intersection test to a light source, the nearest intersection will be the sphere itself. The intersection distance will be very small, so you can simply reject any shadow ray intersection that is closer than, say .000001 units. If your geometry is all convex and incapable of legitimately shadowing itself, then you can simply skip testing the sphere when doing shadow tests.
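Here is a hedged Python sketch of that shadow test, including the small-distance rejection described above; the intersect(origin, direction) interface on the scene objects is hypothetical and just stands in for whatever your tracer already provides:

import math

EPSILON = 1e-4  # minimum accepted hit distance; rejects self-intersections

def in_shadow(point, light_pos, scene, skip=None):
    # True if `point` is shadowed with respect to the light at `light_pos`.
    # `scene` is any iterable of objects with intersect(origin, direction)
    # returning a hit distance or None (a made-up interface for this sketch).
    to_light = [l - p for l, p in zip(light_pos, point)]
    dist_to_light = math.sqrt(sum(c * c for c in to_light))
    direction = [c / dist_to_light for c in to_light]
    # Nudge the origin along the shadow ray so the surface we are standing on
    # does not immediately re-intersect itself ("shadow acne").
    origin = [p + EPSILON * d for p, d in zip(point, direction)]
    for obj in scene:
        if obj is skip:          # optionally skip a convex object that cannot shadow itself
            continue
        t = obj.intersect(origin, direction)
        if t is not None and EPSILON < t < dist_to_light:
            return True          # something blocks the light before we reach it
    return False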

Determine chessboard dimensions in pixels

Similar to calibrating a single camera 2D image with a chessboard, I wish to determine the width/height of the chessboard (or of a single square) in pixels.
I have a camera aimed vertically at the ground, ensured to be perfectly level with the surface below. I am using the camera to determine the translation between consecutive frames (successfully achieved using Fourier phase correlation); at the moment my result returns the translation in pixels. However, I would like to use techniques similar to calibration, where I move the camera over a chessboard that is flat on the ground, to automatically determine the size of the chessboard in pixels, relative to my image height and width.
Knowing the size of the chessboard in millimetres, I can then convert a pixel unit to a real-world unit in millimetres, i.e., 1 pixel will represent a distance proportional to the height of the camera above the ground. This will allow me to convert a translation in pixels to a translation in millimetres, recalibrating every time I change the height of the camera.
What would be the recommended way of achieving this? Surely it must be simpler than single camera 2D calibration.
OpenCV can give you the position of the chessboard's corners with cv::findChessboardCorners().
I'm not sure if the perspective distortion will affect your calculations, but if the chessboard is perfectly aligned beneath the camera, it should work.
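For example (a Python sketch using OpenCV; the file name, pattern size and square size are assumptions you would replace with your own), you can average the spacing between adjacent detected corners to get the square size in pixels and hence a millimetres-per-pixel factor:

import cv2
import numpy as np

PATTERN = (7, 7)     # number of *inner* corners per row/column; adjust to your board
SQUARE_MM = 25.0     # real-world size of one square; an assumed value

img = cv2.imread("chessboard.png", cv2.IMREAD_GRAYSCALE)  # hypothetical file name
found, corners = cv2.findChessboardCorners(img, PATTERN)
if not found:
    raise RuntimeError("chessboard not found")

# Corners come back ordered row by row, so reshape them into a grid and
# average the spacing between neighbouring corners.
grid = corners.reshape(PATTERN[1], PATTERN[0], 2)
dx = np.linalg.norm(np.diff(grid, axis=1), axis=2).mean()  # horizontal spacing
dy = np.linalg.norm(np.diff(grid, axis=0), axis=2).mean()  # vertical spacing
square_px = (dx + dy) / 2.0

print("square size: %.1f px -> %.4f mm per pixel" % (square_px, SQUARE_MM / square_px))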
This is just an idea so don't hit me.. but maybe using the natural contrast of the chessboard?
"At some point it will switch from bright to dark pixels and that should happen (can't remember number of columns on chessboard) times." should be a doable algorithm.

comparing bmps for brightness

I have two BMP files of the same scene and I would like to determine whether one is brighter than the other.
Similarly, I have a set of BMPs with different contrasts and another set of BMPs with different saturations.
How do I compare these images for brightness, contrast and saturation? These test images are saved by a tool provided by the sensor manufacturer.
I am using gcc 4.5.
To compare the brightness of two images you need to compare the grey value of the pixels (yes, one by one). In the RGB colour space the brightness (grey value) is the mean of R,G and B, so you have brightness = (R+G+B) / 3
Comparing the contrast and especially the saturation will prove to be not that easy, for a start you could have a look at HSL and HSV but in general I'd suggest to get a good book on the image processing topic.
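A minimal sketch of that comparison (Python with Pillow/NumPy for brevity, even though the question mentions gcc; the file names are hypothetical):

import numpy as np
from PIL import Image

def mean_brightness(path):
    # Average of (R + G + B) / 3 over all pixels, as suggested above.
    rgb = np.asarray(Image.open(path).convert("RGB"), dtype=np.float64)
    return rgb.mean()   # mean over width, height and the three channels

print(mean_brightness("scene_a.bmp") > mean_brightness("scene_b.bmp"))  # hypothetical files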
The answer of (R+G+B)/3 is really not even a good approximation of brightness (at least from what we know today)!
[BRIGHTNESS]
What you really SHOULD do is convert to another color scale and compare the brightness using that channel of a color scale that incorporates brightness into it. Look here!!!
Formula to determine brightness of RGB color
There are a couple of great answers there that talk about conversion of RGB into luminance, etc.
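For instance, a small sketch using the widely used Rec. 601 luma weights (0.299 R + 0.587 G + 0.114 B), one of the formulas discussed in the linked question (Python/NumPy, file handling as before):

import numpy as np
from PIL import Image

def mean_luma(path):
    # Perceptual brightness using Rec. 601 luma weights:
    # Y = 0.299 R + 0.587 G + 0.114 B.
    rgb = np.asarray(Image.open(path).convert("RGB"), dtype=np.float64)
    return (rgb @ np.array([0.299, 0.587, 0.114])).mean()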
[CONTRAST]
Contrast is a function of the spread of the pixel values throughout the full range of possible pixel values. One understands the contrast by putting together a histogram of all the pixels (where the x axis represents a pixel value and the y axis represents how many pixels have that value) and analysing the histogram to see whether there is a good distribution throughout the entire range or not. Comparing contrast can be done in many ways, but a good starting point would be to find the pixel-value centre point (the average of the histogram data) of each image, along with some histogram width parameter (where, let's say, the width is centred on that point and is large enough to incorporate 90% of all pixels), and then compare the centre and width parameters of both images. This is ONLY a starting point.
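A possible starting point in code (Python/NumPy; the 5th-95th percentile width is an arbitrary choice standing in for the "90% of all pixels" width mentioned above):

import numpy as np
from PIL import Image

def contrast_stats(path):
    # Centre (mean grey value) and a width that spans the central ~90% of
    # pixel values; a wider spread roughly means higher contrast.
    grey = np.asarray(Image.open(path).convert("L"), dtype=np.float64)
    lo, hi = np.percentile(grey, [5, 95])
    return grey.mean(), hi - lo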
[SATURATION]
To compare saturation, one might convert the image to the HSL colour space. The S in HSL stands for Saturation. Comparing saturation within this colour space becomes exactly like comparing brightness as outlined above!!!
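A small sketch of that comparison, using HSV rather than HSL because Pillow provides an HSV conversion directly; for a relative comparison of saturation the S channel serves the same purpose:

import numpy as np
from PIL import Image

def mean_saturation(path):
    # Average of the S channel after converting to HSV (0..255 in Pillow).
    hsv = np.asarray(Image.open(path).convert("HSV"), dtype=np.float64)
    return hsv[..., 1].mean()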

What is dataset Bounding Box?

I am a bit new to imaging and want to understand the following:
1. What is the bounding box of a dataset and why is it needed? Does it represent a real-world measurement, or is it just for the computer screen where it is displayed? How is it related to the image size specified in pixels?
2. What do a WMTS layer's zoom levels & matrix sets mean? I understand that WMTS works by getting tiles of the dataset. Also, I see that GetCapabilities for a specific WMTS dataset returns matrix sets in the XML, which I don't understand.
3. What do the matrix sets and zoom levels signify, and how can I understand them as a layman?
I have tried googling a bit but it looks like the articles assume some technical knowledge around this already which I am trying to gather.
1. A bounding box is the (imaginary) rectangle that you can draw around a dataset (or feature) that touches its maximums and minimums in both the X and Y directions. It is measured in the same units as the geometry. It relates to the image size in pixels through the resolution or scale, which is bbox.width/image.width (or the inverse), in units of metres/pixel or pixels/metre (or degrees or feet); see the small worked example after this list.
2. A WMTS layer is a set of pre-rendered tiles that have been produced at a fixed set of scales over a fixed area. These are recorded in the matrix sets of a WMTS layer. The zoom level is how far down that set of scales you have traversed, with 0 being the top and an arbitrary number (usually 15-20 for global data sets) being the lowest (or most detailed).
3. See 2. You should not really need to understand them in detail, as your client library will handle all of that for you.
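A tiny worked example of the resolution relationship from 1. (Python; all numbers are invented):

bbox_width_m = 5000.0     # bounding-box width in map units (metres here)
image_width_px = 1024     # rendered image width in pixels

metres_per_pixel = bbox_width_m / image_width_px   # resolution
pixels_per_metre = image_width_px / bbox_width_m   # the inverse (scale)

print(200 * metres_per_pixel, "metres")  # a 200 px feature spans ~977 m here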
