MediaRecorder SetNextOutputFile() dropping frames on transition? - android-mediarecorder

Is anyone else seeing dropped video frames at the transitions between recorded files when using MediaRecorder.SetNextOutputFile()?
For a continuous chunked video recording application I am calling SetNextOutputFile() upon receiving a MediaRecorder.MEDIA_RECORDER_INFO_MAX_FILESIZE_APPROACHING notification (having called SetMaxFileSize prior to Prepare()). I see a few frames dropped at the transition to each new file.
I have examined my individual files by demultiplexing audio and video (h.264 + AAC .mp4) and inspecting their track durations separately. There is no apparent loss of audio samples, but the video tracks are missing a few frames on each file and are consequently shorter in duration. This adversely affects efforts to play the chunks back seamlessly, e.g. with the ExoPlayer2 ConcatenatingMediaSource or other playback and postprocessing tools; file transitions result in visible hiccups.
I have tried a variety of maximum file sizes, corresponding to durations from 10 seconds to 10 minutes. Frames are dropped at all of these file sizes.

Related

Non Redundant Image Extraction From Video

I am collecting data for a project. The data collection is done by recording videos of the subjects and the environment. However, while training the network, I would not want to train it with all the images collected in the video sequence.
The main objective is to not train the network with redundant images. A video sequence collected at 30 frames/sec can have redundant images (images that are very similar) within short intervals: the T-th frame and the (T+1)-th frame can be similar.
Can someone suggest ways to extract only the images that would be useful for training?
Update #2: Further resources:
https://github.com/JohannesBuchner/imagehash
https://www.pyimagesearch.com/2017/11/27/image-hashing-opencv-python/
https://www.pyimagesearch.com/2020/04/20/detect-and-remove-duplicate-images-from-a-dataset-for-deep-learning/
Update #1: You can use this repo to calculate similarity between given images: https://github.com/quickgrid/image-similarity
If frames with certain objects (e.g., a vehicle or a device) are important, then use pretrained object detectors, if available, to extract the important frames.
Next, use a similarity method to remove near-duplicate images among nearby frames: keep removing the nearest N frames until a chosen similarity threshold is exceeded (a minimal sketch of one such similarity measure is given after the links below).
This link should be helpful in finding the right method for your case:
https://datascience.stackexchange.com/questions/48642/how-to-measure-the-similarity-between-two-images
The repository below should help implement the idea in a few lines of code. It uses a CNN to extract features and then calculates their cosine distance, as described there.
https://github.com/ryanfwy/image-similarity
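
As a rough illustration of the similarity step (not the CNN approach from the repository above, but the kind of perceptual hashing the imagehash links describe), here is a minimal average-hash sketch in C. It assumes each frame has already been decoded to an 8-bit grayscale buffer; ahash64 and hamming64 are made-up helper names for illustration.

    #include <stdint.h>

    /* Compute a 64-bit "average hash" of a grayscale image: downsample to an
     * 8x8 grid, then set one bit per cell that is brighter than the mean. */
    static uint64_t ahash64(const uint8_t *gray, int w, int h)
    {
        uint32_t sum[64] = {0}, cnt[64] = {0}, avg[64];

        for (int y = 0; y < h; y++)
            for (int x = 0; x < w; x++) {
                int cell = (y * 8 / h) * 8 + (x * 8 / w);
                sum[cell] += gray[y * w + x];
                cnt[cell]++;
            }

        uint64_t mean = 0;
        for (int i = 0; i < 64; i++) {
            avg[i] = cnt[i] ? sum[i] / cnt[i] : 0;
            mean += avg[i];
        }
        mean /= 64;

        uint64_t hash = 0;
        for (int i = 0; i < 64; i++)
            if (avg[i] > mean)
                hash |= 1ULL << i;
        return hash;
    }

    /* Number of differing bits between two hashes; small = visually similar. */
    static int hamming64(uint64_t a, uint64_t b)
    {
        int d = 0;
        for (uint64_t x = a ^ b; x; x >>= 1)
            d += (int)(x & 1);
        return d;
    }

With hashes in hand, a nearby frame could be dropped whenever hamming64() of its hash against the last kept frame's hash falls below a small threshold (say, 5 of 64 bits), and kept otherwise.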

Feeding multiple sample rates to the same buffer source : ffmpeg filters

I have a remote source of PCM audio samples which keeps changing the sample rate. It sometimes supplies 16 kHz and later 48 kHz, depending on the bandwidth. I would like to convert the samples to FLTP through a filter before feeding them to an audio decoder. When I do that I get the error "Changing audio frame properties on the fly is not supported. [Invalid argument]".
Can someone please suggest a way this can be done?
Is it possible to create a filter graph with multiple buffer sources but only one sink?
I used swr_convert_frame(), keeping an array of SwrContext * entries, one per input sample rate.
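
A minimal sketch of that idea, assuming the incoming frames are stereo S16 and the fixed output is 48 kHz stereo FLTP; get_swr_for_rate() and convert_to_fltp() are illustrative names, and the older swr_alloc_set_opts() signature is used here.

    #include <libavutil/channel_layout.h>
    #include <libavutil/frame.h>
    #include <libavutil/samplefmt.h>
    #include <libswresample/swresample.h>

    #define MAX_RATES 8

    /* One resampler per input sample rate, created lazily on first use. */
    static SwrContext *swr_by_rate[MAX_RATES];
    static int rates[MAX_RATES];
    static int nb_rates;

    static SwrContext *get_swr_for_rate(int in_rate)
    {
        for (int i = 0; i < nb_rates; i++)
            if (rates[i] == in_rate)
                return swr_by_rate[i];
        if (nb_rates == MAX_RATES)
            return NULL;

        SwrContext *swr = swr_alloc_set_opts(NULL,
                AV_CH_LAYOUT_STEREO, AV_SAMPLE_FMT_FLTP, 48000,   /* output */
                AV_CH_LAYOUT_STEREO, AV_SAMPLE_FMT_S16,  in_rate, /* input  */
                0, NULL);
        if (!swr || swr_init(swr) < 0)
            return NULL;

        rates[nb_rates] = in_rate;
        swr_by_rate[nb_rates++] = swr;
        return swr;
    }

    /* Convert one incoming frame, whatever its rate, to 48 kHz stereo FLTP. */
    static int convert_to_fltp(const AVFrame *in, AVFrame *out)
    {
        SwrContext *swr = get_swr_for_rate(in->sample_rate);
        if (!swr)
            return -1;

        out->channel_layout = AV_CH_LAYOUT_STEREO;
        out->format         = AV_SAMPLE_FMT_FLTP;
        out->sample_rate    = 48000;

        /* swr_convert_frame() allocates the output buffers as needed. */
        return swr_convert_frame(swr, out, in);
    }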

av_read_frame reads frames from cache

I want to detect an object with my camera. For performance reasons, I'd like to keep the connection to my camera alive and read new images on demand.
The function that reads images calls av_read_frame until the frame is complete and then does some calculation.
My problem is that the frames "chain up": if I stop asking for new frames frequently, I get old images rather than the current one, because the queued frames have not yet been read (even though I don't need them). If possible, I don't want to read the images in an additional thread, because I don't want to waste resources on my Raspberry Pi. Any ideas how to disable this "cache", or other suggestions?
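
For reference, a minimal sketch of the kind of on-demand read loop described above, assuming the format context, decoder context and video stream index were opened elsewhere. Each call drains exactly one frame, so anything the source produced since the last call stays queued until it is read.

    #include <libavcodec/avcodec.h>
    #include <libavformat/avformat.h>

    /* Pull the next complete video frame from an already-opened stream.
     * fmt, dec and stream_index are assumed to be set up elsewhere
     * (avformat_open_input, avcodec_open2, ...). Returns 0 on success. */
    static int grab_frame(AVFormatContext *fmt, AVCodecContext *dec,
                          int stream_index, AVFrame *frame)
    {
        AVPacket *pkt = av_packet_alloc();
        if (!pkt)
            return AVERROR(ENOMEM);

        int ret;
        for (;;) {
            /* Frames already buffered inside the decoder come out first. */
            ret = avcodec_receive_frame(dec, frame);
            if (ret != AVERROR(EAGAIN))
                break;                    /* got a frame, or a real error */

            /* Need more input: read the next packet from the camera stream. */
            ret = av_read_frame(fmt, pkt);
            if (ret < 0)
                break;

            if (pkt->stream_index == stream_index)
                ret = avcodec_send_packet(dec, pkt);
            av_packet_unref(pkt);
            if (ret < 0)
                break;
        }

        av_packet_free(&pkt);
        return ret;
    }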

About attempting to sync audio and video

I've got a little side project going on using SDL2/SDL_mixer and a couple of other sound libraries. I've been trying for a while now to synchronize my audio and video but haven't been able to get anywhere near success. I'm new to all of this, so forgive the poor man's logic and coding. At first I thought to call SDL_Delay(30) after every frame, and then tried a few other numbers in that range. Not quite right. Then I tried doing it with ticks: I would take the difference between current_ticks and last_ticks and, if the delta was <= 30, delay for 30 - delta. Still not quite right (by far). I'm hoping someone here with more experience might guide me in the right direction. As for the video, it's a visualizer, of course; it seems to be a popular beginner project.
The basic way you synchronize audio and video is that you choose one to use as a timer source and present the other according to that timer. The easiest is generally audio, but because it's generally buffered ahead, you need some method of measuring what time in the audio stream is actually coming out of the speakers. Once you get that, it's just a matter of waiting until the audio reaches the right time for the next video frame and displaying it.
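
A bare-bones sketch of that idea in C, assuming the audio is fed with SDL_QueueAudio() rather than SDL_mixer (audio_clock_seconds and wait_for_frame are made-up names for illustration): the audio clock is everything queued so far minus what SDL has not yet consumed, and the next video frame is held back until that clock reaches its timestamp.

    #include <SDL2/SDL.h>
    #include <stdint.h>

    /* Illustrative globals, updated wherever the audio is decoded and queued. */
    static SDL_AudioDeviceID audio_dev;
    static uint64_t total_bytes_queued;   /* sum of all bytes passed to SDL_QueueAudio() */
    static int      bytes_per_second;     /* e.g. 44100 Hz * 2 channels * 2 bytes */

    /* Estimated time (in seconds) of the sample currently leaving the speakers:
     * everything ever queued, minus what SDL has not yet consumed. */
    static double audio_clock_seconds(void)
    {
        uint64_t pending = SDL_GetQueuedAudioSize(audio_dev);
        uint64_t played  = total_bytes_queued > pending ? total_bytes_queued - pending : 0;
        return (double)played / (double)bytes_per_second;
    }

    /* Hold back a video frame until the audio clock reaches its timestamp. */
    static void wait_for_frame(double frame_pts_seconds)
    {
        while (audio_clock_seconds() < frame_pts_seconds)
            SDL_Delay(1);   /* a real player would also cap this wait */
    }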

Using Silverlight 2 for short audio caching

I'm attempting to use a large number of short sound samples in a game I'm creating in Silverlight 2. The samples are less than 2 seconds long.
I would prefer to load all the audio samples onto the canvas during initialization. I have been adding the media elements to the canvas and using a generic list to manage them. So far, it appears to work.
When I play the sample the first time, it plays perfectly. If it has finished playing and I want to re-use the same element, it cuts off the first part of the sound. To play the sample again, I stop and play the media element.
Is there another way I should be using the samples so that the audio is not clipped and good performance is maintained?
Also, it's probably a good idea to make sure that all of your audio samples are brought down to the client side initially. Depending on how you set it up, it's possible that the MediaElements are using their progressive download functionality to get the media files from the server. While there's nothing wrong with this per se (browser caching should be helping you out after the initial download), it does mean that you have to deal with the browser cache, and there are some potential issues there.
Possible steps to try:
Mark your audio files as "Content". This will get them balled up in the .xap.
Load your audio files into MemoryStreams (see Application.GetResourceStream method) and call MediaElement.SetSource().
HTH,
Erik
Some comments:
From MSDN:
Try to limit the number of MediaElement objects you have in your application at once. If you have over one hundred MediaElement objects in your application tree, regardless of whether they are playing concurrently or not, MediaFailed events may be raised. The way to work around this is to add MediaElement objects to the tree as they are needed and remove them when they are not.
You could try to seek to the start of the sample to reset the point currently being played before re-using it with:
mediaelement.Position = new TimeSpan();
See also MSDN's MediaElement.Position.
One technique you can use, although I'm not sure how well it will work in Silverlight, is to create one large file with all of your samples joined together (probably with a half-second or so of silence between each). Figure out the timecode for each sample, then seek the media element to that position and play. You'll only need as many media elements as simultaneous sounds you want to play.
