I would like to compare two MP4 files. Does anybody have an idea?
Maybe by interposing the video spectrum?
Thanks.
I had an idea for this a while back. I never implemented it, but it went something like this:
Get a good video library to do the heavy lifting for you; I like AForge.NET.
Use the library to walk through the video and extract a few hundred bitmap frames.
Fix the resolution to a single aspect ratio.
Reduce the images to something low-res like 16x16 or 64x64, using a nearest neighbor approach. This will blur the image enough that two similar images reduce to the same result.
Gather a chunk of these images by relative video timestamp and hash them to further reduce the data
Compare said hashes
Again, I never implemented this, so I don't know if it works, but the thing it has going for it is that video is very complex. While comparing any given frame to another won't work because of different formats, resolutions, etc., the odds of a series of reduced hashes being the same for two different videos seem very low. Thus, few false positives. It also seems like it could tell you if one span of video was contained in another.
If I get around to making something like this I'll circle back here and post about it.
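For what it's worth, the steps above can be sketched in a few lines of Python. This is a rough illustration, not a finished tool: it assumes the frames have already been decoded into 2D lists of 0-255 grayscale values (decoding is the video library's job), and the function names, the 8x8 size, the 4 gray levels and the chunk length are all made-up choices.

```python
import hashlib

def downscale(frame, size=8):
    """Nearest-neighbour resize of a 2D grayscale frame (list of rows) to size x size."""
    h, w = len(frame), len(frame[0])
    return [[frame[y * h // size][x * w // size] for x in range(size)]
            for y in range(size)]

def quantize(frame, levels=4):
    """Coarsen 0..255 pixel values into a few levels so small differences vanish."""
    step = 256 // levels
    return [[p // step for p in row] for row in frame]

def chunk_hashes(frames, chunk=4, size=8, levels=4):
    """Reduce every frame, then hash consecutive chunks of reduced frames."""
    reduced = [quantize(downscale(f, size), levels) for f in frames]
    hashes = []
    for i in range(0, len(reduced) - chunk + 1, chunk):
        data = bytes(p for f in reduced[i:i + chunk] for row in f for p in row)
        hashes.append(hashlib.sha1(data).hexdigest())
    return hashes

def similarity(hashes_a, hashes_b):
    """Fraction of the shorter clip's chunk hashes that also occur in the other clip."""
    a, b = set(hashes_a), set(hashes_b)
    return len(a & b) / max(1, min(len(a), len(b)))
```

Two copies of the same clip at different resolutions should reduce to the same hashes, which is the property the idea relies on. Real re-encodes add noise, so exact hash equality may well be too strict in practice.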
I'm asking a question for a school project. I am trying to play a video in a loop and change its speed according to the distance from the Kinect 1414. When you are far the video plays at normal speed, which then increases as you get closer.
I tried several approaches, but sometimes the video either doesn't loop or doesn't show at all and I can only hear the audio. Other solutions don't let me change the speed of the video. Do you know any way to have the distance of a person affect the video?
Thanks.
Stack Overflow isn't designed for general "how do I do this" type questions like this. It's for specific "I tried X, expected Y, but got Z instead" type questions. But I'll try to help in a general sense.
You need to break your problem down into smaller pieces and then take on those pieces one at a time. For example, can you start with a very basic sketch that just displays the distance from the Kinect? Don't worry about the video yet, just display the distance on the screen.
Separately from that, can you create another sketch that just shows a video playing in a loop? Work your way up from there: can you make it so its speed is based on a hard-coded value? How about on a value like mouseX?
Get those two basic sketches working perfectly by themselves before you start thinking about combining them. Then if you get stuck on one of those steps, you can post an MCVE along with a specific question (in a new question post), and we'll go from there. Good luck.
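To give you an idea of how small the "speed from a value" step is: the only real logic is mapping a distance onto a playback speed. Here is a minimal sketch of that mapping in Python (your sketch would do the same arithmetic in whatever language it is written in). The distance range and the speed range are numbers I made up; adjust them to whatever your sensor actually reports.

```python
def speed_from_distance(distance_mm, near_mm=500.0, far_mm=4000.0,
                        min_speed=1.0, max_speed=4.0):
    """Map a distance to a playback speed: far away -> min_speed, close -> max_speed."""
    # Normalise the distance to 0..1 and clamp readings outside the expected range.
    t = (distance_mm - near_mm) / (far_mm - near_mm)
    t = min(1.0, max(0.0, t))
    # Linear interpolation from max_speed (near) down to min_speed (far).
    return max_speed + t * (min_speed - max_speed)
```

You can test this with a hard-coded distance or a mouse position first, and only swap in the sensor reading once the video reacts the way you want.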
I'm trying to find a way to annotate a video in C with polygon bounding boxes, but I'm stuck at a very elementary step.
Assuming I know how to break a .MPEG movie up into multiple JPEG images, how do I manipulate that file in C? The things I'll eventually need to draw on are text, points, and lines, but I am having a hard time figuring out how to get started with this.
If I declare:
FILE *img = fopen("foo.jpeg", "rb");
then what could I do with img? Is there a way to access certain pixels in the drawing?
All your code sample does is open a file. You haven't even read any data from it yet.
The simplest way to load an image file is to use a dedicated library, such as SOIL.
If you weren't able to do it by yourself, however, I really don't think you will be able to accomplish your project goals - it is really advanced stuff you want to create, and you failed, as you already noticed, on the most basic of steps.
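For reference, once a library has decoded the JPEG you are no longer dealing with a FILE* but with a plain buffer of pixel values, and "accessing a pixel" is just index arithmetic. A minimal sketch in Python (the arithmetic carries over to C unchanged), assuming a row-major buffer with interleaved RGB bytes; the helper names are mine, not from any library:

```python
def pixel_offset(x, y, width, channels=3):
    """Index of the first byte of pixel (x, y) in a row-major, interleaved buffer."""
    return (y * width + x) * channels

def set_pixel(buf, width, x, y, rgb):
    """Overwrite one pixel in place."""
    i = pixel_offset(x, y, width, len(rgb))
    buf[i:i + len(rgb)] = bytes(rgb)

def draw_box(buf, width, x0, y0, x1, y1, rgb):
    """Draw the outline of an axis-aligned bounding box (corners inclusive)."""
    for x in range(x0, x1 + 1):
        set_pixel(buf, width, x, y0, rgb)
        set_pixel(buf, width, x, y1, rgb)
    for y in range(y0, y1 + 1):
        set_pixel(buf, width, x0, y, rgb)
        set_pixel(buf, width, x1, y, rgb)

# Stand-in for a decoded 8x6 image: all black, 3 bytes per pixel.
image = bytearray(8 * 6 * 3)
draw_box(image, 8, 1, 1, 4, 3, (255, 0, 0))
```

Points, lines and text are all built on the same per-pixel write; the decoded buffer then has to be re-encoded by the library to get an image file back.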
I am thinking about creating a database system for images where they are stored with compact signatures and then matched against a "query image" that could be a resized, cropped, brightened, rotated or a flipped version of the stored one. Note that I am not talking about image similarity algorithms but rather strictly about duplicate detection. This would make things a lot simpler. The system wouldn't care if two images have an elephant on them, it would only be important to detect if the two images are in fact the same image.
Histogram comparisons simply won't work for cropped query images. The only viable way to go I see is shape/edge detection. Images would first be somehow discretized, every pixel being converted to an 8-level grayscale for example. The discretized image will contain vast regions in the same colour which would help indicate shapes. These shapes then could be described with coefficients and their relative position could be remembered. Compact signatures would be produced out of that. This process will be carried out over each image being stored and over each query image when a comparison has to be performed. Does that sound like an efficient and realisable algorithm? To illustrate this idea:
[Illustration removed: dead ImageShack link]
I know this is an immature research area, I have read Wikipedia on the subject and I would ask you to propose your ideas about such an algorithm.
SURF should do its job.
http://en.wikipedia.org/wiki/SURF
It is fast and robust. It is invariant to rotation and scaling, and also to blur and contrast/lighting changes (though not as strongly).
There is an example of automatic panorama stitching.
Check the article on SIFT first:
http://en.wikipedia.org/wiki/Scale-invariant_feature_transform
If you want to do a feature detection driven model, you could perhaps take the singular value decomposition of the images (you'd probably have to do a SVD for each color) and use the first few columns of the U and V matrices along with the corresponding singular values to judge how similar the images are.
Very similar to the SVD method is one called principal component analysis, which I think will be easier to use to compare between images. The PCA method is pretty close to just taking the SVD and getting rid of the singular values by factoring them into the U and V matrices. If you follow the PCA path, you might also want to look into correspondence analysis. By the way, PCA was a common method used in the Netflix Prize for extracting features.
How about converting this Python code back to C?
Check out tineye.com. They have a good system that's always improving. I'm sure you can find research papers from them on the subject.
The Wikipedia article you might be referring to is the one on feature detection.
If you are running on an Intel/AMD processor, you could use the Intel Integrated Performance Primitives to get access to a library of image processing functions. Beyond that, there is the OpenCV project, another library of image processing functions. The advantage of using a library is that you can try various algorithms, already implemented, to see what will work for your situation.
I'm looking for the most realistic way of playing the sound of a rolling ball. Currently I'm using a WAV sample that I play over and over as long as the ball is moving, which just doesn't feel right.
I've been thinking about completely synthesizing the sound, which I know almost nothing about. I'd be grateful for any tutorials/research materials/samples concerning synthesis of the sound of a ball made of a particular material rolling on a surface made of another material. If this idea is completely wrong, please suggest another way of doing this.
Thanks!
I would guess that you'll get the biggest bang for your buck by doing a dynamic frequency adjustment on the sound that makes the playback frequency proportional to the velocity of the ball. I don't know what type of sound library you use, but most will support some variant of this.
For example, in FMOD you could use the Channel::setFrequency method. Ideally, you would compute your desired playback frequency based on your WAV's original sample frequency (Fo), the ball's current velocity (Vc), and the ball's 'ideal' velocity at which the default WAV sounds right (Vi). Something generally like:
F = Fo * ( Vc / Vi )
This will tend to break down as the ball gets farther away from the 'ideal' velocity. You might want to have several different WAVs that are appropriate for different speed ranges that you switch to at certain threshold velocities. Within each WAV's bracket, you'd do the same kind of frequency adjustment.
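Here is roughly what I mean, as a small Python sketch (the arithmetic is the same in any language). The file names, threshold velocities and 'ideal' velocities are placeholders I invented; you would tune them by ear.

```python
def playback_frequency(base_hz, velocity, ideal_velocity):
    """F = Fo * (Vc / Vi): scale the WAV's sample rate by the relative velocity."""
    return base_hz * (velocity / ideal_velocity)

# (upper velocity bound, sample file, velocity at which that file sounds right)
BRACKETS = [
    (2.0, "roll_slow.wav", 1.0),
    (6.0, "roll_medium.wav", 4.0),
    (float("inf"), "roll_fast.wav", 9.0),
]

def choose_sample(velocity):
    """Pick the WAV whose speed bracket contains the current velocity."""
    for upper, name, ideal in BRACKETS:
        if velocity < upper:
            return name, ideal

def sound_for(velocity, base_hz=44100.0):
    """Return (file to play, playback frequency to set on its channel)."""
    name, ideal = choose_sample(velocity)
    return name, playback_frequency(base_hz, velocity, ideal)
```

The frequency returned by sound_for is what you would hand to your sound library's equivalent of Channel::setFrequency.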
Another note: this is probably not something that is worth doing every frame. I'd guess that doing this more than 20 times per second would be a waste of time.
ADDENDUM: Playback frequency scaling like this can also be used to simulate the Doppler effect. Once you have your adjusted playback frequency, you'd scale it again based on the velocity of the ball relative to the 'listener' (the camera).
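A sketch of that second scale in Python, using the textbook Doppler formula for a moving source and a stationary listener. It assumes your game velocities are in metres per second; if they aren't, the speed-of-sound constant needs converting to your units.

```python
SPEED_OF_SOUND = 343.0  # m/s in air at about 20 degrees C

def doppler_frequency(freq_hz, approach_speed, c=SPEED_OF_SOUND):
    """Frequency heard by a stationary listener.

    approach_speed is the component of the ball's velocity along the line to
    the listener: positive when moving toward the camera, negative when moving away.
    """
    if approach_speed >= c:
        raise ValueError("source must be slower than the speed of sound")
    return freq_hz * c / (c - approach_speed)
```

For a ball rolling at a few metres per second the shift is only a percent or so, so you may want to exaggerate it for effect.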
Have you tried playing the sound forward, then playing it backward, and looping that? I use this trick graphically to create repeating patterns. I don't know much about sound, but it might work.
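In code, the forward-then-backward loop is just the sample list followed by its reverse. A tiny Python sketch, assuming the audio has already been decoded into a list of sample values (the function name is made up):

```python
def ping_pong(samples):
    """Return samples followed by their reverse, without repeating the two end
    samples, so the result can be looped without a jump at either seam."""
    return samples + samples[-2:0:-1]
```

Because the loop always turns around on the same sample it arrived at, there is no discontinuity in amplitude, though the change in direction may still be audible.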
One approach might be to analyze the sound of a rolling ball, and decompose it into its component waveforms. Then you'd be able to generate your own wav file with synthesized waves.
You should be able to do this using an FFT on a sample of the sound.
One drawback is that the sound will likely sound synthesized - you'll have to add noise and such to make it sound more realistic. Getting it to sound real enough may be the hardest part.
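As a rough sketch of the analysis step in Python: the code below uses a naive DFT rather than a real FFT (same result, much slower, but no library needed), and assumes the recording is already a list of sample values. The function names are made up for illustration.

```python
import cmath
import math

def dft_magnitudes(samples):
    """Magnitude of each frequency bin up to the Nyquist frequency (naive O(n^2) DFT)."""
    n = len(samples)
    return [abs(sum(s * cmath.exp(-2j * math.pi * k * i / n)
                    for i, s in enumerate(samples)))
            for k in range(n // 2)]

def dominant_frequencies(samples, sample_rate, count=3):
    """Return (frequency in Hz, magnitude) for the strongest bins, ignoring DC."""
    mags = dft_magnitudes(samples)
    bins = sorted(range(1, len(mags)), key=mags.__getitem__, reverse=True)[:count]
    return [(k * sample_rate / len(samples), mags[k]) for k in bins]
```

The frequencies and relative magnitudes it reports are what you would feed into your synthesized waves.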
I don't think you need to go to the trouble of synthesizing that. It would be way too hard to make it sound convincing.
Depending on how your scene is set up, you could loop the sound forward/backward and simulate a Doppler effect by applying a low-pass filter and/or changing its pitch.
By the way, freesound.org is a great place for free samples. They are not professionally recorded but are a good starting point for manipulation. On the other hand, Sound Ideas has some great sample CDs (they're actually industry standard) if you can find them on the cheap. You just have to search for which one has rolling ball sounds.
I really like the approach described in this SIGGRAPH paper:
http://www.cs.ubc.ca/~kvdoel/publications/foleyautomatic.pdf
It describes synthesizing the sound of a rock rolling in a wok (no, really :). The idea is to use modal synthesis (i.e. convolved impulse responses), and the results can be very convincing.
Here's a link to the video demo that goes with the paper:
http://www.cs.ubc.ca/~kvdoel/publications/foleyautomatic.mpeg
And here's a link to the JASS library (written by one of the authors), which was used to create the sound for the video:
http://www.cs.ubc.ca/~kvdoel/jass/jass.html
I'm not sure if you could make it run on a smart phone, but with an efficient enough convolution routine/approximation you might be able to do something interesting...
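To give a flavour of modal synthesis without JASS, here is a bare-bones Python sketch: an object's response to a single tap is modelled as a sum of exponentially decaying sinusoids, one per mode. The mode frequencies, dampings and gains below are invented for illustration and are not taken from the paper.

```python
import math

# (frequency in Hz, damping in 1/s, gain) for each mode; made-up values.
MODES = [(440.0, 6.0, 1.0), (1130.0, 9.0, 0.5), (2210.0, 14.0, 0.25)]

def impact(modes=MODES, duration=0.5, sample_rate=8000):
    """Impulse response of the modal model: a sum of damped sine waves."""
    n = int(duration * sample_rate)
    out = []
    for i in range(n):
        t = i / sample_rate
        out.append(sum(gain * math.exp(-damping * t) * math.sin(2 * math.pi * freq * t)
                       for freq, damping, gain in modes))
    return out
```

This only produces one impact. As I understand the approach, a rolling sound comes from driving the same modes with a continuous, noisy contact force, which is where the convolution cost comes from.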
My question is 'why?' - do you see some benefit in this, or is it just for fun? Your question implies that you aren't happy with the wav you are using but I strongly believe that synthesising your own is going to sound far inferior.
If your WAV sample doesn't sound right, I'd suggest trying to find another sample. Synthesising a sound is not easy and is never going to sound as realistic as your sample.
Real time synthesis may require more resources for processing and computation. You may very well end up prerendering your synthesised sound into a wav file and performing a playback.
If you want to simulate the sound of different materials then you can use some DSP, or even simple tricks like slowing or speeding up the WAV playback. The simplest way is to prerender these in another application and store one copy of the file for each use.
I'm attempting to use a large number of short sound samples in a game I'm creating in Silverlight 2. The samples are less than 2 seconds long.
I would prefer to load all the audio samples onto the canvas during initialization. I have been adding each media element to the canvas and to a generic list to manage it. So far, it appears to work.
When I play the sample the first time, it plays perfectly. If it has finished playing and I want to re-use the same element, it cuts off the first part of the sound. To play the sample again, I stop and play the media element.
Is there another way I should use the samples so that the audio is not clipped and performance stays good?
Also, it's probably a good idea to make sure that all of your audio samples are brought down to the client side initially. Depending on how you set it up, it's possible that the MediaElements are using their progressive download functionality to get the media files from the server. While there's nothing wrong with this per se (browser caching should be helping you out after the initial download), it does mean that you have to deal with the browser cache, and there are some potential issues there.
Possible steps to try:
Mark your audio files as "Content". This will get them balled up in the .xap.
Load your audio files into MemoryStreams (see Application.GetResourceStream method) and call MediaElement.SetSource().
HTH,
Erik
Some comments:
From MSDN:
Try to limit the number of MediaElement objects you have in your application at once. If you have over one hundred MediaElement objects in your application tree, regardless of whether they are playing concurrently or not, MediaFailed events may be raised. The way to work around this is to add MediaElement objects to the tree as they are needed and remove them when they are not.
You could try to seek to the start of the sample to reset the point currently being played before re-using it with:
mediaelement.Position = new TimeSpan();
See also MSDN's MediaElement.Position.
One technique you can use, although I'm not sure how well it will work in Silverlight, is to create one large file with all of your samples joined together (probably with a half-second or so of silence between each). Figure out the timecode for each sample, seek the media element to that position, and play. You'll only need as many media elements as simultaneous sounds you want to play.
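Working out the timecodes is simple bookkeeping. A small Python sketch of it, assuming you know each sample's duration in seconds (the names and durations in the example are invented):

```python
def sample_offsets(durations, gap=0.5):
    """Map each sample name to its (start, end) time in seconds within the joined
    file, given (name, duration) pairs in file order and the silence between them."""
    offsets = {}
    t = 0.0
    for name, length in durations:
        offsets[name] = (t, t + length)
        t += length + gap
    return offsets

table = sample_offsets([("jump", 1.0), ("hit", 0.5), ("coin", 0.25)])
```

At play time you would seek to the start value and stop (or let the silence play) once the end value is reached.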