I am trying to record an H.264 live stream using the following code:
AVOutputFormat* fmt = av_guess_format(NULL, "test.mpeg", NULL);
AVFormatContext* oc = avformat_alloc_context();
oc->oformat = fmt;
avio_open2(&oc->pb, "test.mpeg", AVIO_FLAG_WRITE, NULL, NULL);
AVStream* stream = NULL;
...
while (!done)
{
    // Read a frame
    if (av_read_frame(inputStreamFormatCtx, &packet) < 0)
        return false;

    if (packet.stream_index == correct_index)
    {
        ////////////////////////////////////////////////////////////////////////
        // Is this a packet from the video stream -> decode video frame
        if (stream == NULL) { // create stream in file
            stream = avformat_new_stream(oc, pFormatCtx->streams[videoStream]->codec->codec);
            avcodec_copy_context(stream->codec, pFormatCtx->streams[videoStream]->codec);
            stream->sample_aspect_ratio = pFormatCtx->streams[videoStream]->codec->sample_aspect_ratio;
            stream->sample_aspect_ratio.num = pFormatCtx->streams[videoStream]->codec->sample_aspect_ratio.num;
            stream->sample_aspect_ratio.den = pFormatCtx->streams[videoStream]->codec->sample_aspect_ratio.den;
            // Assume r_frame_rate is accurate
            stream->r_frame_rate = pFormatCtx->streams[videoStream]->r_frame_rate;
            stream->avg_frame_rate = stream->r_frame_rate;
            stream->time_base = av_inv_q(stream->r_frame_rate);
            stream->codec->time_base = stream->time_base;
            avformat_write_header(oc, NULL);
        }
        av_write_frame(oc, &packet);
        ...
    }
}
However, ffmpeg says
encoder did not produce proper pts making some up
when the code reaches av_write_frame(). What's the problem here?
First, make sure inputStreamFormatCtx is allocated and filled with the right values (this is the cause of 90% of demuxing/remuxing problems) - check some samples on the internet to learn how you should allocate it and set its values.
The message tells us what is happening, and it seems it is just a warning.
PTS (Presentation Time Stamp) is a number, expressed in stream->time_base units, that tells us when the decoded frame of this packet should be displayed. When you get a live stream over the network, it is possible the server has not put a valid PTS on the packet, so when you receive the data it has an invalid PTS (which you can detect by reading packet.pts and checking whether it equals AV_NOPTS_VALUE). In that case libav tries to generate the right PTS based on the frame rate and time_base of the stream.

This is a helpful attempt, and if the recorded file plays back at real speed (fps-wise), you should be happy. If the recorded file plays back too fast or too slow (fps-wise), you have a problem and can no longer rely on libav to correct the fps: you should then calculate the right fps by decoding the packets, compute the right PTS based on stream->time_base, and set it on packet.pts yourself.
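As an illustration, a minimal sketch of that last step might look like the following, where frame_index (a running count of video packets) and fps (the frame rate you measured by decoding) are assumptions, not names from the original code:

// Sketch: stamp packets that arrive without a valid PTS.
// `frame_index` and `fps` are assumed to be maintained in your read loop.
if (packet.pts == AV_NOPTS_VALUE) {
    AVRational spf = av_inv_q(av_d2q(fps, 100000));  // seconds per frame
    // frame_index frames * spf seconds/frame, expressed in stream->time_base units
    packet.pts = av_rescale_q(frame_index, spf, stream->time_base);
    packet.dts = packet.pts;  // assuming no B-frames, dts can mirror pts
}
frame_index++;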
I'm a newbie and I'm playing with an ESP32 and an IR receiver to capture the signal from an AC IR remote.
Currently, I am referring to the following example code for capturing the IR signal:
static void nec_rx_init()
{
    rmt_config_t rmt_rx;
    rmt_rx.channel = RMT_RX_CHANNEL;
    rmt_rx.gpio_num = RMT_RX_GPIO_NUM;
    rmt_rx.clk_div = RMT_CLK_DIV;
    rmt_rx.mem_block_num = 1;
    rmt_rx.rmt_mode = RMT_MODE_RX;
    rmt_rx.rx_config.filter_en = true;
    rmt_rx.rx_config.filter_ticks_thresh = 100;
    rmt_rx.rx_config.idle_threshold = rmt_item32_tIMEOUT_US / 10 * (RMT_TICK_10_US);
    rmt_config(&rmt_rx);
    rmt_driver_install(rmt_rx.channel, 3000, 0);
}
// Get the RMT RX ring buffer
RingbufHandle_t rb = NULL;
rmt_get_ringbuf_handle(RMT_RX_CHANNEL, &rb);

// rmt_rx_start(channel, rx_idx_rst) - set true to reset the memory index for the receiver
rmt_rx_start(RMT_RX_CHANNEL, 1);

while (rb) {
    uint32_t rx_size = 0;
    // Try to receive data from the ring buffer.
    // The RMT driver pushes all the data it receives into its ring buffer.
    // We just need to parse the values and return the space to the ring buffer.
    rmt_item32_t* item = (rmt_item32_t*) xRingbufferReceive(rb, &rx_size, 1000);
    ...
}
Although the IR signal emitted from the AC IR remote is about 100 items, I always see that rx_size is only 256 bytes (64 items). That is the problem: how can I capture the complete signal from the AC IR remote? Note that I already increased the ring buffer size from 3000 to 10000.
I would appreciate any suggestions for dealing with this problem.
I had this same issue this past week. I was receiving a serial packet that was 128 bits, but only ever got the first 64. After digging into it a bit deeper I found that the hardware buffer on the RMT interface defaults to a 64x32-bit RAM block per channel. You can set the channel up to use the memory blocks normally assigned to subsequent channels if you need to receive more data at once.
For my project I used the following function to give 4 RAM blocks to channel 0, thus increasing the maximum receive size to 256 bits, which is more than enough for my application. I also had to move the receive to channel 4 since channel 0 was now using the memory block for channel 1.
rmt_set_mem_block_num((rmt_channel_t) 0, 4);
Documentation for this function can be found here:
https://docs.espressif.com/projects/esp-idf/en/stable/api-reference/peripherals/rmt.html#_CPPv421rmt_set_mem_block_num13rmt_channel_t7uint8_t
It's also worth noting that it threw an error in the serial monitor while the issue was occurring, which helped find the cause of the issue:
E (33323) rmt: RMT[0] ERR
E (33323) rmt: status: 0x14000100
E (33373) rmt: RMT RX BUFFER FULL
With the default amount of RAM I was getting an error code of 0x14000040, and when I increased it to 2 blocks I got a status code of 0x13000040. After increasing to 4 blocks of RAM the error message stopped appearing.
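Since the question's code already fills in an rmt_config_t, an equivalent sketch is to claim the extra blocks at config time (the value 4 is illustrative; a channel that uses N blocks borrows them from the next N-1 channels, which must then stay unused):

// Sketch: reserve 4 RAM blocks (4 x 64 items) for the RX channel up front.
// Channels 1-3 can then no longer be used, since channel 0 borrows their memory.
rmt_rx.mem_block_num = 4; // instead of 1
rmt_config(&rmt_rx);
rmt_driver_install(rmt_rx.channel, 3000, 0);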
I have a USB 3.0 HDMI capture device. It uses the YUY2 format (2 bytes per pixel) at 1920x1080 resolution.
The video capture output pin connects directly to the video renderer input pin.
And all works well: it shows 1920x1080 without any freezes.
But I need to take a screenshot every second. So this is what I do:
void CaptureInterface::ScreenShoot() {
    IMemInputPin* p_MemoryInputPin = nullptr;
    hr = p_RenderInputPin->QueryInterface(IID_IMemInputPin, (void**)&p_MemoryInputPin);

    IMemAllocator* p_MemoryAllocator = nullptr;
    hr = p_MemoryInputPin->GetAllocator(&p_MemoryAllocator);

    IMediaSample* p_MediaSample = nullptr;
    hr = p_MemoryAllocator->GetBuffer(&p_MediaSample, 0, 0, 0);

    long buff_size = p_MediaSample->GetSize(); // buff_size = 4147200 bytes
    BYTE* buff = nullptr;
    hr = p_MediaSample->GetPointer(&buff);

    // BYTE CaptureInterface::ScreenBuff[1920*1080*2]; defined in header

    //--------- TOO SLOW (1.5 seconds for 4 MBytes) ----------
    std::memcpy(ScreenBuff, buff, buff_size);
    //--------------------------------------------

    p_MediaSample->Release();
    p_MemoryAllocator->Release();
    p_MemoryInputPin->Release();
    return;
}
Any other operation on this buffer is very slow too.
But if I use memcpy on other data (for example, two arrays of the same 4 MB size in my class), it is very fast: <0.01 sec.
Video memory is (or might be) slow to read back by its nature (e.g. VMR9's IBasicVideo->GetCurrentImage is very slow, and you can find other references to this). You normally want to grab the data before it actually reaches video memory.
Additionally, the way you read the data is not quite reliable. You don't know which frame you are actually copying, and it may happen that you read blackness or garbage, or, conversely, that your acquiring access to the buffer freezes the main video streaming. This is because you are grabbing an unused buffer from the pool of available buffers rather than the buffer that corresponds to a specific video frame. Getting an image from such a buffer rests on the fragile assumption that the stale data from a previously streamed frame is initialized and has not yet been overwritten by anything else.
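A common way to grab the data before it reaches video memory in DirectShow is to insert a Sample Grabber filter between the capture output pin and the renderer and copy each frame from its callback while it is still in system memory. A minimal sketch of the callback (the class name is illustrative; graph building, error handling, and proper COM reference counting are omitted, and note that qedit.h ships only with older Windows SDKs):

#include <dshow.h>
#include <qedit.h> // ISampleGrabber / ISampleGrabberCB (older Windows SDKs)

// Sketch: the Sample Grabber calls BufferCB for every frame while the data
// is still in system memory, where memcpy runs at normal speed.
class FrameGrabberCB : public ISampleGrabberCB {
public:
    STDMETHODIMP BufferCB(double sampleTime, BYTE* buffer, long bufferLen) override {
        // Copy or inspect the YUY2 frame here (1920 * 1080 * 2 bytes).
        return S_OK;
    }
    STDMETHODIMP SampleCB(double, IMediaSample*) override { return E_NOTIMPL; }
    STDMETHODIMP QueryInterface(REFIID riid, void** ppv) override {
        if (riid == IID_IUnknown || riid == IID_ISampleGrabberCB) {
            *ppv = static_cast<ISampleGrabberCB*>(this);
            return S_OK;
        }
        return E_NOINTERFACE;
    }
    // Lifetime is owned by the caller in this sketch, so ref counting is stubbed.
    STDMETHODIMP_(ULONG) AddRef() override { return 2; }
    STDMETHODIMP_(ULONG) Release() override { return 1; }
};

The grabber filter itself is created with CLSID_SampleGrabber, connected into the graph between the two pins, and ISampleGrabber::SetCallback(&cb, 1) selects the BufferCB variant (1) over SampleCB (0).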
I am decoding using FFmpeg. The videos I am decoding are H.264 or MPEG-4 videos, using C code. I am using the 32-bit libs. I have successfully decoded and extracted the metadata of the first frame. I would now like to decode the last frame. I have a defined duration of the video, and felt it was a safe assumption to say that isLastFrame = duration. Here's what I have; any suggestions?
AVFormatContext* pFormatCtx = avformat_alloc_context();
avformat_open_input(&pFormatCtx, filename, NULL, NULL);
int64_t duration = pFormatCtx->duration;

int frameFinished = 0;
i = 0;
while (av_read_frame(pFormatCtx, &packet) >= 0) {
    /* Is this a packet from the video stream? */
    if (packet.stream_index == videoStream) {
        /* Decode video frame */
        avcodec_decode_video2(pCodecCtx, pFrame, &frameFinished, &packet);
    }
}
Any help is much appreciated! :)
Thanks everyone for your help, but I found that the reason the av_seek_frame duration wasn't working was that you must multiply it by 1000 for it to be applicable in read frame. Also, please note that the reason I have decode_video instead of the standard decode function calls is that I was using 32-bit and created my own, but if you plug in avcodec_decode_video2() it works just as well. Hopefully this will help fellow decoders in the future.
AVFormatContext* Format; /* opened elsewhere with avformat_open_input() */
AVPacket* Packet;        /* allocated elsewhere */
int64_t duration = Format->duration;
duration = duration * 1000;
if (av_seek_frame(Format, Packet->stream_index, duration, AVSEEK_FLAG_ANY) >= 0)
{
    /* read the frame and decode the packet */
    if (av_read_frame(Format, Packet) >= 0)
    {
        /* decode the video frame */
        decode_video(CodecContext, Frame, &duration, Packet);
    }
}
This might be what you're looking for:
Codecs which have the CODEC_CAP_DELAY capability set have a delay
between input and output, these need to be fed with avpkt->data=NULL,
avpkt->size=0 at the end to return the remaining frames.
Link to FFmpeg documentation
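Putting the two together, here is a sketch of one way to decode the last frame, reusing the question's names (pFormatCtx, videoStream, pCodecCtx, pFrame): seek backwards from the end (duration is in AV_TIME_BASE units, so it has to be rescaled to the stream's time_base), decode forward to EOF, then drain the decoder as quoted above:

/* Sketch: seek near the end, decode to EOF, then flush delayed frames. */
AVStream* st = pFormatCtx->streams[videoStream];
AVRational av_time_base = {1, AV_TIME_BASE}; /* unit of pFormatCtx->duration */
int64_t target = av_rescale_q(pFormatCtx->duration, av_time_base, st->time_base);
av_seek_frame(pFormatCtx, videoStream, target, AVSEEK_FLAG_BACKWARD);

AVPacket packet;
int frameFinished = 0;
while (av_read_frame(pFormatCtx, &packet) >= 0) {
    if (packet.stream_index == videoStream) {
        /* Each successful decode overwrites pFrame; keep a copy whenever
           frameFinished is set if you need the frame after the loop. */
        avcodec_decode_video2(pCodecCtx, pFrame, &frameFinished, &packet);
    }
    av_free_packet(&packet);
}

/* Drain codecs that have CODEC_CAP_DELAY, as the documentation describes. */
av_init_packet(&packet);
packet.data = NULL;
packet.size = 0;
do {
    avcodec_decode_video2(pCodecCtx, pFrame, &frameFinished, &packet);
} while (frameFinished);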
I'm writing a video editor, and I need to seek to exact frame, knowing the frame number.
Other posts on stackoverflow told me that ffmpeg may give me a few broken frames after seeking, which is not a problem for playback but a big problem for video editors.
And I need to seek by frame number, not by time, which will become inaccurate when converted to frame number.
I've read dranger's tutorials (which are outdated now), and ended up with:
av_seek_frame(fmt_ctx, video_stream_id, frame, AVSEEK_FLAG_ANY);
It always seeks to frame 0, and always returns 0, which means success.
Then I tried to read Blender's source code and found it really complex (maybe I should implement an image buffer?).
So, is there any simple way to seek to a frame with just a simple call like seek(context, frame_number) (while getting a full frame, not a broken one)? Or is there any lightweight library that simplifies this?
EDIT:
Thanks to praks411, I found the solution:
void AV_seek(AV * av, size_t frame)
{
int frame_delta = frame - av->frame_id;
if (frame_delta < 0 || frame_delta > 5)
av_seek_frame(av->fmt_ctx, av->video_stream_id,
frame, AVSEEK_FLAG_BACKWARD);
while (av->frame_id != frame)
AV_read_frame(av);
}
void AV_read_frame(AV * av)
{
AVPacket packet;
int frame_done;
while (av_read_frame(av->fmt_ctx, &packet) >= 0) {
if (packet.stream_index == av->video_stream_id) {
avcodec_decode_video2(av->codec_ctx, av->frame, &frame_done, &packet);
if (frame_done) {
...
av->frame_id = packet.dts;
av_free_packet(&packet);
return;
}
}
av_free_packet(&packet);
}
}
EDIT2:
Turns out there is a library for this: FFMS2.
It is "an FFmpeg based source library [...] for easy frame accurate access", and is portable (at least across Windows and Linux).
av_seek_frame will only seek, based on the timestamp, to a keyframe. Since it seeks to the keyframe, you may not get exactly what you want. Hence it is recommended to seek to the nearest keyframe and then read frame by frame until you reach the desired frame.
However, if you are dealing with a fixed FPS value, then you can easily map a timestamp to a frame index.
Before seeking, you will need to convert your time to AVStream.time_base units if you have specified a stream. Read the ffmpeg documentation of av_seek_frame in avformat.h.
For example, if you want to seek to 1.23 seconds of clip:
double m_out_start_time = 1.23;
int flgs = AVSEEK_FLAG_ANY;
int64_t seek_ts = (m_out_start_time*(m_in_vid_strm->time_base.den))/(m_in_vid_strm->time_base.num);
if(av_seek_frame(m_informat, m_in_vid_strm_idx,seek_ts, flgs) < 0)
{
PRINT_MSG("Failed to seek Video ")
}
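Following the same pattern, here is a sketch of seeking by frame number instead of seconds under the fixed-FPS assumption, reusing the names above; frame_number is the target frame index, av_inv_q turns the frame rate into seconds per frame, and av_rescale_q keeps the arithmetic in integers:

/* Sketch: timestamp of frame `frame_number`, assuming constant FPS. */
AVRational fps = m_in_vid_strm->avg_frame_rate; /* e.g. 30000/1001; must be non-zero */
int64_t seek_ts = av_rescale_q(frame_number,
                               av_inv_q(fps), /* seconds per frame */
                               m_in_vid_strm->time_base);
if (av_seek_frame(m_informat, m_in_vid_strm_idx, seek_ts, AVSEEK_FLAG_BACKWARD) < 0)
{
    PRINT_MSG("Failed to seek Video ")
}
/* Then decode forward until the desired frame is reached, as described above. */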
I am trying to jump to a specific frame by setting the CV_CAP_PROP_POS_FRAMES property and then reading the frame like this:
cvSetCaptureProperty( input_video, CV_CAP_PROP_POS_FRAMES, current_frame );
frame = cvQueryFrame( input_video );
The problem I am facing is that OpenCV 2.1 returns the same frame for 12 consecutive values of current_frame, whereas I want to read each individual frame, not just the key frames. Can anyone please tell me what's wrong?
I did some research and found out that the problem is caused by the decompression algorithm.
The MPEG-like algorithms (including the HD variants) do not compress each frame separately; they save a keyframe from time to time and then store only the differences between that frame and subsequent frames.
The problem you reported is caused by the fact that, when you select a frame, the decoder (likely ffmpeg) automatically advances to the next keyframe.
So, is there a way around this? I don't want only key frames but each individual frame.
I don't know whether or not this would be precise enough for your purpose, but I've had success getting to a particular point in an MPEG video by grabbing the frame rate, converting the frame number to a time, and then advancing to that time. Like so:
cv::VideoCapture sourceVideo("/some/file/name.mpg");
double frameRate = sourceVideo.get(CV_CAP_PROP_FPS);
double frameTime = 1000.0 * frameNumber / frameRate;
sourceVideo.set(CV_CAP_PROP_POS_MSEC, frameTime);
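The next read then returns the frame at (approximately) that time; a minimal continuation of the snippet:

cv::Mat image;
sourceVideo.read(image); // image now holds the frame nearest to frameTime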
Due to this limitation in OpenCV, it may be wise to use FFmpeg instead. MoviePy is a nice wrapper library.
# Get the nth frame from a video
from moviepy.video.io.ffmpeg_reader import FFMPEG_VideoReader
cap = FFMPEG_VideoReader("movie.mov", True)
cap.initialize()
cap.get_frame(n / FPS)  # FPS = the video's frame rate
Performance is great too. Seeking to the nth frame with get_frame is O(1), and a speed-up is used if (nearly) consecutive frames are requested. I've gotten better-than-realtime results loading three 720p videos simultaneously.
CV_CAP_PROP_POS_FRAMES jumps to a key frame. I had the same issue and worked around it using this (Python) code. It's probably not totally efficient, but it gets the job done:
def seekTo(cap, position):
    positiontoset = position
    pos = -1
    cap.set(cv.CV_CAP_PROP_POS_FRAMES, position)
    while pos < position:
        ret, image = cap.read()
        pos = cap.get(cv.CV_CAP_PROP_POS_FRAMES)
        if pos == position:
            return image
        elif pos > position:
            positiontoset -= 1
            cap.set(cv.CV_CAP_PROP_POS_FRAMES, positiontoset)
            pos = -1
I've successfully used the following on OpenCV 3 / Python 3:
# Skip to frame 150, then read the 151st frame
cap.set(cv2.CAP_PROP_POS_FRAMES, 150)
ret, frame = cap.read()
After some years of assuming this was an unsolvable bug, I think I've figured out a way to use it with a good balance between speed and correctness.
A previous solution suggested using the CV_CAP_PROP_POS_MSEC property before reading the frame:
cv::VideoCapture sourceVideo("/some/file/name.mpg");
const auto frameRate = sourceVideo.get(CV_CAP_PROP_FPS);
void readFrame(int frameNumber, cv::Mat& image) {
    const double frameTime = 1000.0 * frameNumber / frameRate;
    sourceVideo.set(CV_CAP_PROP_POS_MSEC, frameTime);
    sourceVideo.read(image);
}
It does return the expected frame, but the problem is that using CV_CAP_PROP_POS_MSEC may be very slow, for example during a video conversion.
Note: using global variables for simplicity.
On the other hand, if you just want to read the video sequentially, it is enough to read the frames without seeking at all.
for (int frameNumber = 0; frameNumber < nFrames; ++frameNumber) {
    sourceVideo.read(image);
}
The solution comes from combining both: using a variable, lastFrameNumber, to remember the last queried frame, and seeking only when the requested frame is not the next one. This way it is possible to keep the speed of sequential reading while still allowing random seeks when necessary.
cv::VideoCapture sourceVideo("/some/file/name.mpg");
const auto frameRate = sourceVideo.get(CV_CAP_PROP_FPS);
int lastFrameNumber = -2; // guarantee seeking the first time (not const: it is updated below)
void readFrame(int frameNumber, cv::Mat& image) {
    if (lastFrameNumber + 1 != frameNumber) { // not the next frame? seek
        const double frameTime = 1000.0 * frameNumber / frameRate;
        sourceVideo.set(CV_CAP_PROP_POS_MSEC, frameTime);
    }
    sourceVideo.read(image);
    lastFrameNumber = frameNumber;
}
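As a usage sketch (the frame numbers are made up for illustration), sequential calls hit the fast path and only the jump pays for a seek:

cv::Mat image;
for (int f = 0; f < 100; ++f) // sequential: seeks only on the very first call
    readFrame(f, image);
readFrame(500, image); // random access: triggers one CV_CAP_PROP_POS_MSEC seek
readFrame(501, image); // the next frame again reads without seeking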