I notice that VLFeat/SIFT generates the same number of descriptors for different images. Why?
And what is the aim of orientation assignment? There seems to be a similar process in the descriptor formation step.
I am quite new to SIFT and confused about a lot of things in it. Thanks for your help.
VLFeat/SIFT generates the same number of descriptors for different images, why?
Unless you explicitly pass a file with given input keypoints (*), the number of descriptors is NOT the same; it clearly depends on the content of the input image.
For example, if you compare the Starbucks logo with Lena:
./sift --frames starbucks.pgm; wc -l starbucks.frame
601 starbucks.frame
./sift --frames lena.pgm; wc -l lena.frame
1769 lena.frame
Here I used 300x300 pixel images. The --frames option outputs the keypoints found, one position, scale, and orientation per line.
(*) This means you ask VLFeat to describe a set of pre-defined interest points. With the sift command-line tool, you can do that with the --read-frames option.
What is the aim of orientation assignment?
This is to achieve rotation invariance. If you refer to the original paper:
One or more orientations are assigned to each keypoint location based on local
image gradient directions. All future operations are performed on image data that
has been transformed relative to the assigned orientation, scale, and location
for each feature, thereby providing invariance to these transformations.
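As a rough illustration of what that assignment computes (a simplified sketch, not VLFeat's actual implementation; keypoint_orientation is a made-up name), you can build a histogram of gradient orientations around the keypoint, weighted by gradient magnitude and a Gaussian window, and take the dominant bin:

#include <math.h>

#ifndef M_PI
#define M_PI 3.14159265358979323846
#endif

#define NBINS 36  /* 10 degrees per bin, as in Lowe's paper */

/* Simplified sketch of orientation assignment: accumulate a 36-bin
 * histogram of gradient orientations in a window around the keypoint
 * (kx, ky), weighted by gradient magnitude and a Gaussian of width
 * 1.5 * sigma, and return the angle of the strongest bin. img is a
 * grayscale image of size w x h at the keypoint's scale level. */
static double keypoint_orientation(const float *img, int w, int h,
                                   int kx, int ky, double sigma)
{
    double hist[NBINS] = { 0 };
    int r = (int)(3.0 * 1.5 * sigma);  /* window radius */
    int x, y, i, best = 0;

    for (y = ky - r; y <= ky + r; y++) {
        for (x = kx - r; x <= kx + r; x++) {
            if (x < 1 || y < 1 || x >= w - 1 || y >= h - 1)
                continue;
            /* central-difference gradient */
            double dx = img[y * w + (x + 1)] - img[y * w + (x - 1)];
            double dy = img[(y + 1) * w + x] - img[(y - 1) * w + x];
            double mag = sqrt(dx * dx + dy * dy);
            double ang = atan2(dy, dx);  /* -pi .. pi */
            double d2 = (double)(x - kx) * (x - kx) + (double)(y - ky) * (y - ky);
            double wgt = exp(-d2 / (2.0 * 1.5 * sigma * 1.5 * sigma));
            int bin = (int)((ang + M_PI) / (2.0 * M_PI) * NBINS) % NBINS;
            hist[bin] += mag * wgt;
        }
    }
    for (i = 1; i < NBINS; i++)
        if (hist[i] > hist[best])
            best = i;
    return (best + 0.5) * 2.0 * M_PI / NBINS - M_PI;  /* bin-center angle */
}

The real algorithm also interpolates the peak position and spawns additional keypoints for every other bin within 80% of the maximum, which is why the paper says "one or more orientations". The descriptor's 4x4x8 histogram grid is then computed in coordinates rotated by this angle, which is the "similar process" you noticed in the descriptor formation step.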
I've found some similar questions on the internet, but not with system calls.
I'm doing an exercise in my systems programming class. It asks you to combine PPM image files in binary form (P6, not P3) using only system calls. Input is taken from the command line. The first input is the larger image file, the second input is the smaller image file, and the files are combined and written into a third file, whose name you can specify as the third input. The smaller file is written into the top-right of the larger file.
Since we're using system calls, I'm assuming we're mainly using open/read/write/lseek. I know that my PPM files store the magic number P6 at the beginning, followed by the width and height of the image. I know that I can use lseek to write from the left side of row 1 of the larger PPM file, stop when my offset reaches the difference in width between the two files, and write the remaining pixels of the line from the smaller file, then continue, line by line...
However, I have no idea how to implement this. I'm not even sure how to read the width and height of the files. I'm sort of lost. Can someone help me get on the right track? I'd post my code if I actually had anything meaningful, but so far I've just created file descriptors for the files and opened them.
EDIT:
I combine them by overwriting the top right of the larger file with the smaller file. I'm not guaranteed anything about the images, so I have to somehow pull the width and height from both files and compare them, to make sure the first file has both a larger width and height than the second.
I think they want us to loop through the files pixel by pixel, reading each RGB group of values with read, then writing pixel by pixel in a loop with write.
EDIT2:
Here's a list of Unix system calls. They're available in the C library, but this is all I'm allowed to use: http://www.di.uevora.pt/~lmr/syscalls.html
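For the header-reading part, a minimal sketch using only read(2) could look like the following (read_ppm_int and read_ppm_header are made-up helper names, and a well-formed header is assumed):

#include <unistd.h>

/* Parse one ASCII integer from the header, skipping whitespace and
 * '#' comment lines, reading one byte at a time with read(2). */
static int read_ppm_int(int fd, int *out)
{
    char c;
    int v = 0, seen = 0;
    for (;;) {
        if (read(fd, &c, 1) != 1)
            return -1;
        if (c == '#') {                  /* skip comment to end of line */
            while (read(fd, &c, 1) == 1 && c != '\n')
                ;
        } else if (c >= '0' && c <= '9') {
            v = v * 10 + (c - '0');
            seen = 1;
        } else if (seen) {               /* first whitespace after digits */
            *out = v;
            return 0;
        }
    }
}

/* Check the "P6" magic number, then read width, height and maxval. */
static int read_ppm_header(int fd, int *w, int *h, int *maxval)
{
    char magic[2];
    if (read(fd, magic, 2) != 2 || magic[0] != 'P' || magic[1] != '6')
        return -1;
    if (read_ppm_int(fd, w) || read_ppm_int(fd, h) || read_ppm_int(fd, maxval))
        return -1;
    return 0;  /* the file offset now sits at the first binary RGB byte */
}

Call it with a descriptor from open(2) on each file; once it returns, the offset points at the first pixel, so all subsequent read/write/lseek arithmetic can work in whole pixels (3 bytes each) and whole rows (3 * width bytes).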
I have to calculate the flight path of a projectile and draw the result into a bitmap file. So far I'm pretty clueless how to do that.
Would it be a good idea to save the values of the flight path in a struct and then transfer them to the bitmap file?
Do you have any other suggestions how it could be done in a better way?
The simplest way to produce an image file without much hassle, using only standard C library tools, is most likely writing a BMP file. For a start, check the Wikipedia article on this file format; it gives a quite complete description of it.
If you don't want to go too deep into that, save, for example, a 640x480 empty 24-bit ("truecolor") .bmp image and rip out its header for your own use. Depending on the program you use to save the image, you might end up with a varying header size; however, since the data is not compressed, it is fairly easy to isolate the header. For a 640x480 image the data will be exactly 921600 bytes long; anything preceding it is the header.
From there the pixel bytes are in BGR order, rows stored bottom to top, pixels left to right. Experimenting a little should give you the proper results.
If you only have the standard C library to work with, it is unlikely there is anything much simpler to implement. Of course, in this case you have to write out the pixel matrix yourself (so this doesn't help much with plotting the actual flight path you want to image), but that's true for any raster format (except perhaps if you aim for an SVG for a twist; that isn't too hard either, just XML).
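If you would rather build the 54-byte header yourself instead of ripping it from a saved file, a minimal sketch could look like this (the field layout follows the Wikipedia description; the red y = x/2 line at the end is just placeholder content where your flight path would go):

#include <stdio.h>

#define W 640
#define H 480

/* Little-endian helpers for the header fields. */
static void put_u16(FILE *f, unsigned v)
{
    fputc(v & 0xFF, f);
    fputc((v >> 8) & 0xFF, f);
}
static void put_u32(FILE *f, unsigned long v)
{
    fputc(v & 0xFF, f);
    fputc((v >> 8) & 0xFF, f);
    fputc((v >> 16) & 0xFF, f);
    fputc((v >> 24) & 0xFF, f);
}

int main(void)
{
    static unsigned char pixels[H][W][3];  /* [row][col] = {B, G, R}, zeroed */
    unsigned long data_size = (unsigned long)W * H * 3;
    int x;
    FILE *f = fopen("plot.bmp", "wb");
    if (!f)
        return 1;

    /* BITMAPFILEHEADER, 14 bytes */
    fputc('B', f); fputc('M', f);
    put_u32(f, 54 + data_size);  /* total file size */
    put_u32(f, 0);               /* reserved */
    put_u32(f, 54);              /* offset of the pixel data */

    /* BITMAPINFOHEADER, 40 bytes */
    put_u32(f, 40);              /* this header's size */
    put_u32(f, W);
    put_u32(f, H);
    put_u16(f, 1);               /* color planes */
    put_u16(f, 24);              /* bits per pixel */
    put_u32(f, 0);               /* BI_RGB: no compression */
    put_u32(f, data_size);
    put_u32(f, 2835);            /* ~72 DPI, mostly ignored */
    put_u32(f, 2835);
    put_u32(f, 0);               /* palette entries, unused at 24 bpp */
    put_u32(f, 0);

    /* Placeholder content: a red y = x/2 line; your flight path goes here.
     * Row 0 of the array is the BOTTOM row of the image. */
    for (x = 0; x < W; x++)
        pixels[x / 2][x][2] = 255;  /* index 2 = R channel */

    fwrite(pixels, 1, data_size, f);
    fclose(f);
    return 0;
}

Note that 640 * 3 = 1920 bytes per row is already a multiple of 4; for other widths each row must be padded to a 4-byte boundary.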
I know that it might sound silly, but while working on a project I felt the need to know the very basics of file formats.
I know everything is stored as binary 1s and 0s on the hard disk, and I can get an input stream of that.
But now, what if:
I don't know the format of the file; how do I figure it out from the input stream?
I do know its format; what parts of the input stream represent the different portions of the file?
For example, take a JPEG file with a red background: what part of the stream represents that information?
I need help urgently; any kind of links, blogs, or e-books would be highly appreciated.
Thank you
See List of file signatures and Magic number (programming) on Wikipedia.
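To make the signature idea concrete, here is a small sketch that sniffs the first bytes of a file against two well-known magic numbers (the table is deliberately tiny; real detectors such as the Unix file utility know hundreds of signatures):

#include <stdio.h>
#include <string.h>

/* Return a format name based on the file's leading magic bytes. */
static const char *guess_format(const unsigned char *buf, size_t n)
{
    static const unsigned char jpeg_sig[] = { 0xFF, 0xD8, 0xFF };
    static const unsigned char png_sig[]  = { 0x89, 'P', 'N', 'G',
                                              0x0D, 0x0A, 0x1A, 0x0A };
    if (n >= sizeof jpeg_sig && memcmp(buf, jpeg_sig, sizeof jpeg_sig) == 0)
        return "JPEG";
    if (n >= sizeof png_sig && memcmp(buf, png_sig, sizeof png_sig) == 0)
        return "PNG";
    return "unknown";
}

int main(int argc, char **argv)
{
    unsigned char buf[16];
    size_t n;
    FILE *f;

    if (argc < 2)
        return 1;
    f = fopen(argv[1], "rb");
    if (!f)
        return 1;
    n = fread(buf, 1, sizeof buf, f);
    fclose(f);
    printf("%s: %s\n", argv[1], guess_format(buf, n));
    return 0;
}

For the second question (which part of the stream means what), there is no generic answer: once the signature tells you the format, you need that format's specification, which defines the layout of everything after the magic number.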
Why do you need to operate on such a low level as input streams? Use a library to get the information you need from the given file. And by the way, your JPEG example is a bad one: JPEG is a pixel-based image format which has no such thing as a "background". That "background" exists only because the user interprets the red pixels as a background.
For the purposes of this example, suppose there exist two binary files A and B, each containing a variation of, say, a YouTube video, where
A contains a 5 second ad
B contains no ad
With the exception for the ad, A contains the same content as B
Total length of file A is 60 seconds
Total length of file B is 55 seconds
As a general rule, if we were to compare the bit patterns of each file, would we arrive at the same conclusion: that the files contain 55 seconds' worth of common bits?
If we extend the problem further, say to two JAR files whose only difference is comments, would it be appropriate to compare the order of bits and, based on what we find, determine the degree of likeness?
It's easy to determine whether files are identical or not. Will the approach of comparing bits help accurately determine the degree to which files are close to one another?
The question is not about video files, but rather a general binary files. I mention video file above for example purposes only.
It depends on the file format, but in your examples: no, probably not.
Video with and without initial ad: videos are usually encoded by breaking them into small time-blocks, and then encoding and compressing those blocks; if you insert an ad at the beginning, then you will most likely cause the block-transitions to happen at different time offsets within the main video.
Jar-file with and without comments (or with different comments): same story; changing the length of a comment within a file will affect the splitting of the entire file into compressible blocks, so all blocks after an altered comment will be compressed differently. (This is, of course, assuming that the jar-file actually includes the comments. Just because comments were in the source-code, that doesn't mean the jar-file will have them; that depends on compiler settings and so on.)
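To see why a positional bit comparison collapses after an insertion, consider this toy sketch: one extra byte at the front shifts everything after it, so almost no positions match even though the content is nearly identical:

#include <stdio.h>
#include <string.h>

int main(void)
{
    const char a[] = "the quick brown fox jumps over the lazy dog";
    const char b[] = "Xthe quick brown fox jumps over the lazy dog"; /* one byte inserted */
    size_t n = strlen(a);  /* compare over the shorter length */
    size_t i, same = 0;

    for (i = 0; i < n; i++)
        if (a[i] == b[i])
            same++;
    /* Prints a tiny match count, despite ~98% shared content. */
    printf("matching positions: %zu of %zu\n", same, n);
    return 0;
}

A degree-of-likeness measure therefore has to be alignment-aware (diff-style longest-common-subsequence matching, or rolling-hash chunking as used by rsync), and even that only helps when the format doesn't re-compress everything downstream of the change.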
Most video compression these days is done with lossy algorithms. The compression is done both within a frame and BETWEEN frames. If the extra video frames added in your "A" video "leak" into the original movie because of the inter-frame compression, then by definition your two video files will be different videos, even though logically they're the same movie with 5 seconds of ad tacked onto the front. The compression algorithm will have merged 1 or more frames of the two videos into a hybrid of the two, and this fundamentally changes things.
I am new to C. I have an image file translated by means of online tools into a .h and a .c file. The .c file contains an array of 1024 16-bit hexadecimal numbers, used to denote the on/off state of bits. I want to read this file and draw the image onscreen using DMA... but I'm very much at a loss as to how to do this. Can anybody out there help? Does anyone even know what I'm talking about?
To draw an image onscreen, use DMA[3]: channel 3 is the DMA channel conventionally used for copying image data.
This reference describes how the DMA transfer registers are set up (useful for writing your own definitions in a .h file):
http://nocash.emubase.de/gbatek.htm#gbadmatransfers
And then to draw an image using DMA:
#include "image.h"  /* the generated header that declares the image array */

DMA[3].src = imageData;    /* source: what you're drawing from */
DMA[3].dst = videoBuffer;  /* destination: where you're drawing pixels to */
In your scenario, the source is the image array declared in the generated header (imageData and videoBuffer are placeholder names here). Keep in mind you're using POINTERS to the image data for src and dst.
DMA[3].cnt = 1024 | DMA_ENABLE;  /* how many transfers you want, OR'd with flag1 | flag2... */
Here are some flags:
DMA_SOURCE_FIXED means you draw from the same pixel over and over again. If this is what you want, then turn this bit on in cnt.
DMA_DESTINATION_FIXED means you're drawing TO the same pixel over and over again. If this is what you want, then turn this bit on in cnt.
Otherwise, DMA_SOURCE_INCREMENT and DMA_DESTINATION_INCREMENT are the default behavior (their flag values are typically zero, so OR-ing them into cnt is harmless).
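Putting it together, a minimal self-contained sketch might look like the following. The register addresses come from the GBATEK page above, but the struct mapping, flag values, and the names imageData and VIDEO_BUFFER are assumptions modeled on common GBA homebrew headers, not a standard API:

typedef unsigned short u16;
typedef unsigned int   u32;

/* DMA channel records are memory-mapped starting at 0x040000B0:
 * source address, destination address, control/count word. */
typedef struct {
    const void *src;
    void *dst;
    u32 cnt;
} DMA_REC;
#define DMA ((volatile DMA_REC *)0x040000B0)

#define DMA_16     (0u << 26)  /* 16-bit transfer units */
#define DMA_ENABLE (1u << 31)  /* start the transfer */

#define REG_DISPCNT  (*(volatile u16 *)0x04000000)
#define VIDEO_BUFFER ((u16 *)0x06000000)  /* VRAM in bitmap modes */

/* The 1024-entry array produced by the conversion tool (from image.c). */
extern const u16 imageData[1024];

int main(void)
{
    REG_DISPCNT = 0x0003 | 0x0400;  /* bitmap mode 3, enable BG2 */

    /* Copy all 1024 halfwords into VRAM. Source and destination
     * increment by default, so no FIXED flags are needed here. */
    DMA[3].src = imageData;
    DMA[3].dst = VIDEO_BUFFER;
    DMA[3].cnt = 1024 | DMA_16 | DMA_ENABLE;

    for (;;)
        ;  /* a GBA program never returns */
    return 0;
}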
This is what I used for VBA, so I'm sorry if this does not answer your question (I'm kind of inexperienced with C as well...).
@Michael Yes, I mean the VisualBoyAdvance emulator.