Tell libavcodec/ffmpeg to drop frames - C

I'm building an app in which I create a video.
Problem is, sometimes (well... most of the time) the frame acquisition process isn't quick enough.
What I'm currently doing is skipping the current frame acquisition when I'm late; however, FFmpeg/libavcodec considers every frame I pass to it as the next frame in line, so if I drop 1 out of 2 frames, a 20-second video will only last 10. More problems come in as soon as I add sound, since sound processing is way faster...
What I'd like is to be able to tell FFmpeg "the last frame should last twice as long as originally intended", or anything else that would let me process in real time.
I tried to stack the frames at one point, but this ends up eating all my memory (I also tried 'stacking' my frames on the hard drive, which was way too slow, as I expected).
I guess I'll have to work with the PTS manually, but all my attempts have failed, and reading the code of other apps that use ffmpeg, such as VLC, wasn't of great help... so any advice would be much appreciated!
Thanks a lot in advance!

Your output will probably be considered variable frame rate (VFR), but you can simply generate a timestamp using the wallclock time when a frame arrives and apply it to your AVFrame before encoding it. The frame will then be displayed at the correct time on playback.
For an example of how to do this (at least the specifying-your-own-timestamp part), see doc/examples/muxing.c in the ffmpeg distribution (line 491 in my current git pull):
frame->pts += av_rescale_q(1, video_st->codec->time_base, video_st->time_base);
Here the author is incrementing the frame timestamp by 1 in the video codec's timebase, rescaled to the video stream's timebase. In your case you can simply rescale the time elapsed since you started capturing frames from an arbitrary timebase to your output video stream's timebase (as in the above example). For example, if your arbitrary timebase is 1/1000 and you receive a frame 0.25 seconds after you started capturing, then do this:
AVRational my_timebase = {1, 1000};
frame->pts = av_rescale_q(250, my_timebase, avstream->time_base);
Then encode the frame as usual.
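A minimal sketch of the wallclock variant (capture_start_us and stamp_frame are made-up names for illustration; av_gettime() and av_rescale_q() are the real libavutil calls):
// Sketch: stamp each captured AVFrame with the wallclock time elapsed since
// capture started, rescaled into the output stream's timebase.
#include <libavformat/avformat.h>
#include <libavutil/mathematics.h>
#include <libavutil/time.h>

static int64_t capture_start_us;   // set to av_gettime() when capture begins

static void stamp_frame(AVFrame *frame, const AVStream *st)
{
    int64_t elapsed_us = av_gettime() - capture_start_us;
    // microseconds (AV_TIME_BASE_Q) -> stream timebase
    frame->pts = av_rescale_q(elapsed_us, AV_TIME_BASE_Q, st->time_base);
}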

Many (most?) video formats don't permit leaving out frames. Instead, try reusing old video frames when you can't get a fresh one in time.

Just an idea: when it's lagging in processing, have you tried passing it the same frame again (and dropping the current one)? Maybe it can process the duplicated frame quickly.

There's the ffmpeg command-line switch -threads ... for multicore processing, so you should be able to do something similar with the API (though I have no idea how). This might solve your problem.
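For what it's worth, with the API this is usually done by setting thread_count on the codec context before opening it. A sketch, assuming a libavcodec recent enough to have avcodec_open2 and an already-created encoder context:
// Sketch: enable multithreaded encoding from the API, roughly what the
// -threads command-line switch does.
#include <libavcodec/avcodec.h>

static int open_encoder_threaded(AVCodecContext *ctx, AVCodec *codec)
{
    ctx->thread_count = 0;                               // 0 = let libavcodec decide
    ctx->thread_type  = FF_THREAD_FRAME | FF_THREAD_SLICE;
    return avcodec_open2(ctx, codec, NULL);
}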

Related

How to programmatically find where an OBS Studio recording was paused and resumed?

I would like to programmatically find where the video was paused and resumed during the entire recording.
I have a screen recording from OBS Studio which is over 7 hours long and has many pauses and resumes, so I want to programmatically find the times / frames / locations of these pauses and resumes.
The file type is MKV.
As suggested in the comments, I don't think this will be straightforward unless you have some clock points etc.
You could use a video analysis solution to go through the video frame by frame - it will be processing intensive but may meet your needs.
For example, using OpenCV you could read in the video frame by frame and compare each frame with its predecessor, identifying groups of frames which are mostly the same (assuming your video frames are all more or less identical while the recording is paused).
If the video shows some sort of play or pause button on screen while it is paused, you could also use OpenCV to detect that.
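As a rough, framework-agnostic sketch of that frame-comparison idea (the function name and the ~1.0 threshold are guesses you would tune): compute the mean absolute pixel difference between consecutive frames and flag long runs where it stays near zero.
// Sketch: two frames are "the same" (likely a paused section) when their mean
// absolute pixel difference is below a small threshold. Buffers are 8-bit grayscale.
#include <stddef.h>
#include <stdint.h>

static double mean_abs_diff(const uint8_t *a, const uint8_t *b, size_t n)
{
    uint64_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += (a[i] > b[i]) ? (a[i] - b[i]) : (b[i] - a[i]);
    return n ? (double)sum / (double)n : 0.0;
}

// Usage idea: decode the MKV frame by frame (OpenCV, libavcodec, ...), convert each
// frame to grayscale, and record the timestamps where mean_abs_diff(prev, curr, w * h)
// stays below ~1.0 for many consecutive frames.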

VIDIOC_DQBUF too slow for requested FPS

I use V4L2 to get frames from my web camera, but I found out that even with timeperframe equal to 1/30, a call to ioctl(fd, VIDIOC_DQBUF, &buffer) takes ~60 ms, which is too slow! 30 frames * 60 ms = 1800 ms, almost 2 seconds to shoot 30 frames; that's only ~15 fps!
I also found out that if I halve the resolution (640x480 -> 320x240), the duration of the VIDIOC_DQBUF call is also halved.
Can I do something about this, or is this a limit of my hardware? Maybe there's something I should set up before opening the stream?
I use a Logitech C270 web camera, and the source code in C is here, if you want to see it.
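For reference, the 1/30 timeperframe mentioned above is normally requested with VIDIOC_S_PARM before streaming starts; the driver may silently adjust it, so it is worth reading it back. A sketch, assuming fd is the already-opened device:
// Sketch: request 30 fps and read back what the driver actually granted.
#include <string.h>
#include <sys/ioctl.h>
#include <linux/videodev2.h>

static void request_30fps(int fd)
{
    struct v4l2_streamparm parm;
    memset(&parm, 0, sizeof(parm));
    parm.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
    parm.parm.capture.timeperframe.numerator   = 1;
    parm.parm.capture.timeperframe.denominator = 30;
    ioctl(fd, VIDIOC_S_PARM, &parm);   // driver may clamp or ignore the request
    ioctl(fd, VIDIOC_G_PARM, &parm);   // parm now holds the interval actually in use
}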

Play multiple WAV audio files with C and libao at the same time

I'm using libao (ao_play) to play some buffers. I listen for keyboard keys, and for each key I have a WAV sound to play. It's simple.
With ao_play I see that the application blocks while the sound is playing. Because I want to play multiple sounds at the same time, I needed to use threads (with the pthread lib).
It works, but it feels like a workaround, and if I play too many files (maybe 10 or so), everything freezes for a few seconds and then comes back.
Well, my question is: how do I play multiple sounds at the same time, non-blocking, using libao (and without using threads)?
This is not a real design, more like a guess.
First of all, you'll need threads, because it's a good old tradition to separate computation from visualisation, or audialisation in this case. You'll need an audio thread that renders the stream and sends it to the output.
So, each time your main thread detects a keypress, it sends a note to the audio thread. The audio thread picks up the event and adds a wave to the stream it is currently playing. The stream is rendered in frames (64, 1024, or 10240 samples, or whatever latency you fancy); if each wave is just a simple mix of a few possible samples, it can easily run in real time. You should keep track of the notes currently playing, with a position for each sample. If latency is low, and thus granularity high, you can even align sample edges with buffer edges, which would notably simplify the rendering.
After the current buffer is rendered, you simply send it to the DAC and proceed with the next frame.
A quick glance at libao's help page does not reveal any mixing capabilities, so you'll need to create a simple mixer of your own, or you may want an existing solution, some simple open-source audio rendering library.
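A minimal sketch of such a mixer, assuming 16-bit signed mono samples already decoded into memory (the Voice struct and mix_and_play are made-up names; ao_play is the real libao call):
// Sketch: sum all currently playing waves into one 16-bit block, clamp, and
// hand the mixed buffer to ao_play(). One "voice" per key press.
#include <ao/ao.h>
#include <stddef.h>
#include <stdint.h>

#define FRAME_SAMPLES 1024

typedef struct { const int16_t *data; size_t len, pos; } Voice;

static void mix_and_play(ao_device *dev, Voice *voices, int nvoices)
{
    int16_t out[FRAME_SAMPLES];
    for (int i = 0; i < FRAME_SAMPLES; i++) {
        int32_t acc = 0;
        for (int v = 0; v < nvoices; v++) {
            if (voices[v].pos < voices[v].len)
                acc += voices[v].data[voices[v].pos++];
        }
        if (acc >  32767) acc =  32767;   // clamp to avoid wrap-around
        if (acc < -32768) acc = -32768;
        out[i] = (int16_t)acc;
    }
    ao_play(dev, (char *)out, sizeof(out));
}
The audio thread would call mix_and_play in a loop; since ao_play blocks only for the duration of one small block, new key presses can be mixed in on the very next block.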

Trouble implementing a real-time program in C

I have an encoder which encodes a speech file (.wav) that I give as input. Now what I want to do is write a program so that I can speak into the mic and, at the same time, the encoder can process it. Basically I want to record and process a speech signal in real time (a small delay can be tolerated). To do this I was thinking of making a loop in which I would first record the speech for, say, 1 second into a file, say speech.in, then copy this file to temp and pass temp to the encoder. In the meantime the recorder should overwrite the speech.in file and save the next 1 second of data in it. And so on, continuing this loop...
The problem I am having is that I can't write a program to control the recorder to do what I want. Is there any recorder which can be easily controlled, or any code to do it?
This is the only way I could think of to implement this. Any other (hopefully better) solution is also welcome.
*edit: I am working on Ubuntu 10.04, but I have used the same program on Windows as well, so suggestions for either platform are welcome.
Your proposed way is not the way to go. At least, this is not how it's done on Windows and Mac. (I don't know how Linux-flavoured machines would do it, but I'm guessing the methodology is the same.)
You'll have to open the audio device and allocate a set of (say 4) internal memory buffers (100 ms of sound each would suffice, but you'll have to experiment with how small you can make them; the smaller, the lower the latency, but the higher the chance of audio glitches).
You attach these to the audio device and ask for a callback when any of these buffers is filled. When you get the first callback, make sure you encode the buffer quickly enough, before the first buffer is needed again by the audio device and is overwritten with new data.
You could simultaneously output the encoded sound to the audio device again. The latency would be similar to the length of one of the buffers.
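On Linux (the poster mentions Ubuntu) the same buffer-at-a-time idea can be sketched with ALSA instead of a callback API; encode_block stands in for whatever the existing encoder's entry point is:
// Sketch: read small fixed-size blocks from the default capture device and
// feed each one to the encoder while ALSA keeps recording the next block.
#include <alsa/asoundlib.h>

#define BLOCK_FRAMES 1600   // 100 ms at 16 kHz

extern void encode_block(const short *samples, int count);  // your encoder

static void capture_loop(void)
{
    snd_pcm_t *pcm;
    short block[BLOCK_FRAMES];

    snd_pcm_open(&pcm, "default", SND_PCM_STREAM_CAPTURE, 0);
    snd_pcm_set_params(pcm, SND_PCM_FORMAT_S16_LE,
                       SND_PCM_ACCESS_RW_INTERLEAVED,
                       1 /* channels */, 16000 /* rate */,
                       1 /* allow resample */, 100000 /* 100 ms latency */);

    for (;;) {
        snd_pcm_sframes_t got = snd_pcm_readi(pcm, block, BLOCK_FRAMES);
        if (got > 0)
            encode_block(block, (int)got);   // must finish before the buffer wraps
    }
}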
Sounds like this would be best served by threading.
Here is an MSDN link

Playing only part of a sound using FMOD

I'm trying to play only part of a sound using FMOD, say frames 50000-100000 of a 200000 frame file.
I have found a couple of ways to seek forward (i.e. to start playback at frame 50000), but I have not found a way to make sure the sound stops playing at 100000. Is there any way FMOD can do this natively, without having to add libsndfile or the like into the picture?
I should also mention that I am using the streaming option. I have to assume that these sounds are arbitrarily large and cannot be comfortably/quickly loaded into memory.
You can use Channel::setDelay for sample accurate starting and stopping of sounds. Use FMOD_DELAYTYPE_DSPCLOCK_START to set the start time of the sound and FMOD_DELAYTYPE_DSPCLOCK_END to set the end time.
Check out the docs for Channel::setDelay, FMOD_DELAYTYPE, System::getDSPClock.
You should be able to use the streaming callback to stop the stream when you get to the desired point.
Option 1: When you create the stream, set lenbytes to an even divisor of the number of frames you wish to play. In your example, set 'lenbytes' to 5000, then keep a counter in the callback. When you get to 10, stop the stream.
Option 2: use FSOUND_Stream_AddSyncPoint with pcmoffset set to your desired stopping point. Register a callback with FSOUND_Stream_SetSyncCallback. Stop the stream in the callback.
To start playback at sample 50,000 and end at 100,000, you could do the following, assuming the sound file's sample rate and the system sample rate are the same. Since the DSP clock works in system output samples, you may need to do some maths to adjust your end sample in terms of the output rate. See Sound::getDefaults for the sound's sample rate and System::getSoftwareFormat for the system rate.
unsigned int sysHi, sysLo;
// ... create sound, play sound paused ...
// Seek the data to the desired start offset
channel->setPosition(50000, FMOD_TIMEUNIT_PCM);
// For accurate sample playback get the current system "tick"
system->getDSPClock(&sysHi, &sysLo);
// Set start offset to a couple of "mixes" in the future, 2048 samples is far enough in the future to avoid issues with mixer timings
FMOD_64BIT_ADD(sysHi, sysLo, 0, 2048);
channel->setDelay(FMOD_DELAYTYPE_DSPCLOCK_START, sysHi, sysLo);
// Set end offset for 50,000 samples from our start time, which means the end sample will be 100,000
FMOD_64BIT_ADD(sysHi, sysLo, 0, 50000);
channel->setDelay(FMOD_DELAYTYPE_DSPCLOCK_END, sysHi, sysLo);
// ... unpause sound ...
