I am experimenting with video and would like to know how I can extract I-frames from an H.264 stream contained in an MPEG-TS container.
What I want to do is generate preview images out of a video stream.
As the I-frame is supposed to be a complete picture from which P- and B-frames derive, is it possible to just extract the picture data without having to decode it using a codec?
I have already done some work with the MPEG-TS container format, but I am not well versed in codecs.
I am rather in search of information.
Thanks a lot.
I am no expert in this domain but I believe the answer to your question is NO.
If you want to save the I-frame as a JPEG image, you still need to "transcode" the video frame, i.e. you first need to decode the I-frame using an H.264 decoder and then encode it using a JPEG encoder. An I-frame does not depend on other frames, but it is still compressed data. The JPEG encoder does not understand an H.264 frame; it only accepts uncompressed video frames as input.
As an aside, since the input to the JPEG encoder is an uncompressed frame, you can generate a JPEG image from any type of frame (I/P/B) as it would already be decoded (using reference I frame, if needed) before feeding to the encoder.
As others have noted, decoding H.264 is complicated. You could write your own decoder, but it is a major effort. Why not use an existing decoder?
Intel's IPP library has the basic building blocks for a decoder and a sample decoder:
Code Samples for the Intel® Integrated Performance Primitives
There's libavcodec:
Using libavformat and libavcodec
Revised avcodec_sample.0.4.9.cpp
I am no expert in this domain either, but I have played with decoding. Use this GStreamer pipeline to extract previews from video.mp4:
gst-launch -v filesrc location=./video.mp4 ! qtdemux name=demux demux.video_00 ! ffdec_h264 ! videorate ! 'video/x-raw-yuv,framerate=1/1' ! jpegenc ! multifilesink location=image-%05d.jpeg
If you want to write some code, replace videorate with appsink/appsrc elements and write a control program that connects the two pipelines (see example):
filesrc location=./video.mp4 ! qtdemux name=demux demux.video_00 ! ffdec_h264 ! appsink
appsrc ! 'video/x-raw-yuv,framerate=1/1' ! jpegenc ! multifilesink location=image-%05d.jpeg
Buffers without the GST_BUFFER_FLAG_DELTA_UNIT flag set are I-frames. You can safely skip many frames and start decoding the stream at any I-frame.
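Outside of GStreamer, the same idea can be applied to a raw elementary stream. The following is only a minimal sketch, assuming the H.264 data is in byte-stream (Annex B) form, where each NAL unit follows a 00 00 01 start code and the low 5 bits of its first byte hold nal_unit_type (5 means an IDR slice). The helper names next_nal_payload and count_idr_nals are hypothetical, and I-frames that are not IDR pictures are not detected by this check.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns the offset of the first byte after the next
 * 00 00 01 start code found at or after `pos`, or `len` if there is none.
 * A 4-byte start code (00 00 00 01) is matched through its last 3 bytes. */
size_t next_nal_payload(const uint8_t *buf, size_t len, size_t pos)
{
    for (size_t i = pos; i + 3 <= len; i++) {
        if (buf[i] == 0 && buf[i + 1] == 0 && buf[i + 2] == 1)
            return i + 3;
    }
    return len;
}

/* Counts NAL units whose nal_unit_type is 5 (coded slice of an IDR picture).
 * These are the points where a decoder can start without earlier frames. */
int count_idr_nals(const uint8_t *buf, size_t len)
{
    int count = 0;
    size_t pos = next_nal_payload(buf, len, 0);

    while (pos < len) {
        if ((buf[pos] & 0x1F) == 5)
            count++;
        pos = next_nal_payload(buf, len, pos);
    }
    return count;
}
```

This only locates the places where decoding can start; the picture itself still has to go through an H.264 decoder.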
I am trying to acquire image data from a Logitech USB camera (C270 HD Webcam) connected to an NVIDIA Jetson Nano at an image size of 864x480 using the GStreamer command below, but I only get the blank window shown below. The pipeline is not working, although no errors are reported.
gst-launch-1.0 -v v4l2src device="/dev/video1" ! 'video/x-raw,width=(int)864,height=(int)480' ! videoconvert ! ximagesink
Blank window created by ximagesink
When I capture the same image size (864x480) with JPEG compression, it works:
gst-launch-1.0 -v v4l2src device="/dev/video0" ! 'image/jpeg,width=(int)864,height=(int)480' ! jpegparse ! jpegdec ! videoconvert ! fpsdisplaysink video-sink=ximagesink
I also tested both pipelines from a C program, with the same result.
Please let me know if there are any issues with the first pipeline. Thanks in advance.
-RK
Maybe your camera does not support YUV at that resolution. You can check the supported formats using:
v4l2-ctl --list-formats-ext
Alternatively, you can tell GStreamer explicitly which YUV format to use. Something like this may work:
... 'video/x-raw, width=1280, height=720, format=YUY2' ! ...
or
... videoconvert ! 'video/x-raw, width=1280, height=720, format=YUY2' ! ...
GstH264NalParser *parser = NULL;
GstH264NalUnit nal_unit = { 0 };
parser = gst_h264_nal_parser_new();
GstH264ParserResult parser_result = gst_h264_parser_identify_nalu(parser,
    buffer_map.data,
    0,
    buffer_map.size,
    &nal_unit); /* This returns GST_H264_PARSER_NO_NAL */
Why is that? Unless data is not supposed to come from a GstMapInfo* but some other data structure. A GstStructure pointer from a GstSample, perhaps?
Context
I am writing a small program that parses H.264-encoded video using GStreamer's videotestsrc and appsink plug-ins. So far, so good.
I am using the (ugly) x264enc plug-in in my pipeline to encode the stream before feeding it into h264parse and then into appsink. I am pretty sure h264parse is an unnecessary step, but I get the same results with and without it.
I am convinced that I am using the wrong struct to read data into the NALU parse function.
If you believe your incoming data is good, odds are you need to do a small conversion, because H.264 streams come in a few different packaging modes: byte-stream (NAL units separated by start codes) and avc/avc3 (NAL units prefixed with their length). A parser that looks for start codes will not find a NAL unit in length-prefixed data.
Converting between these modes is what the h264parse element is for.
Pad Templates:
  SRC template: 'src'
    Availability: Always
    Capabilities:
      video/x-h264
        parsed: true
        stream-format: { avc, avc3, byte-stream }
        alignment: { au, nal }
So in your pipeline you might try permutations of the stream-format and alignment options, such as:
gst-launch videotestsrc ! ... ! h264parse ! video/x-h264,stream-format=byte-stream,alignment=nal ! appsink
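To illustrate the difference between the stream formats, here is a minimal sketch that rewrites length-prefixed ("avc") data into byte-stream form in place. It assumes every NAL unit is preceded by a 4-byte big-endian length; real streams declare the length size in their codec_data, so treat that as an assumption. The function name avc_to_byte_stream is hypothetical, and in a GStreamer pipeline h264parse does this job for you.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: replaces each 4-byte big-endian NAL length prefix
 * with the 4-byte start code 00 00 00 01, so the buffer size is unchanged.
 * Returns the number of NAL units converted, or -1 if a length is zero or
 * runs past the end of the buffer (the buffer may then be partly rewritten). */
int avc_to_byte_stream(uint8_t *buf, size_t len)
{
    size_t pos = 0;
    int nals = 0;

    while (pos + 4 <= len) {
        uint32_t nal_len = ((uint32_t)buf[pos] << 24) |
                           ((uint32_t)buf[pos + 1] << 16) |
                           ((uint32_t)buf[pos + 2] << 8) |
                           (uint32_t)buf[pos + 3];

        if (nal_len == 0 || nal_len > len - pos - 4)
            return -1;

        /* Overwrite the length field with a start code. */
        buf[pos] = 0;
        buf[pos + 1] = 0;
        buf[pos + 2] = 0;
        buf[pos + 3] = 1;

        pos += 4 + (size_t)nal_len;
        nals++;
    }
    /* Trailing bytes that do not form a full length field are an error. */
    return pos == len ? nals : -1;
}
```

After such a conversion, a start-code based parser can find the NAL unit boundaries in the buffer.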
I am transcoding a video using the FFmpeg API in C code.
I am trying to set the video bit rate using the FFmpeg API as shown below:
ovCodecCtx->bit_rate = 100 * 1000;
The encoder I am using is libx264.
However, this parameter does not take effect, and the resulting video quality is very bad.
I have also tried setting related parameters such as rc_min_rate and rc_max_rate, but they do not take effect either and the video quality is still very low.
Could anyone tell me how to set the bit rate correctly using the FFmpeg API?
Thanks
I have found the solution to my problem. In fact somebody who was facing the same problem has posted the solution in ffmpeg(libav) user forum. This seems to work in my case too. I am posting the answer to my own question so that other users facing similar issue might benefit from this post.
Problem:
Setting the video bit rate programmatically was not honoured by the libx264 codec. The same setting worked for the MPEG-1, MPEG-2 and MPEG-4 video codecs, but it was ignored for H.264, and the resulting video quality was very bad.
Solution:
We need to set the pts for the decoded/resized frames before they are fed to encoder.
The person who found the solution has gone through ffmpeg.c source and was able to figure this out. We need to first rescale the AVFrame's pts from the stream's time_base to the codec time_base to get a simple frame number (e.g. 1, 2, 3).
pic->pts = av_rescale_q(pic->pts, ost->time_base, ovCodecCtx->time_base);
avcodec_encode_video2(ovCodecCtx, &newpkt, pic, &got_packet_ptr);
And when we receive the encoded packet back from the libx264 codec, we need to rescale the pts and dts of the encoded video packet to the stream time base:
newpkt.pts = av_rescale_q(newpkt.pts, ovCodecCtx->time_base, ost->time_base);
newpkt.dts = av_rescale_q(newpkt.dts, ovCodecCtx->time_base, ost->time_base);
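To make the arithmetic concrete, here is a simplified stand-in for what the rescaling does. It is only a sketch: Rational and rescale_ts are hypothetical names, not the libav API, and unlike av_rescale_q this version does not protect against overflow and assumes non-negative timestamps. The 1/90000 stream time base and 1/25 codec time base in the usage note are assumed example values.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a rational time base (num/den seconds per tick). */
typedef struct {
    int num;
    int den;
} Rational;

/* Converts `ts` from time base `from` to time base `to`, rounding to the
 * nearest tick: result = ts * (from.num / from.den) / (to.num / to.den). */
int64_t rescale_ts(int64_t ts, Rational from, Rational to)
{
    assert(ts >= 0);
    assert(from.num > 0 && from.den > 0 && to.num > 0 && to.den > 0);

    int64_t num = (int64_t)from.num * to.den;
    int64_t den = (int64_t)from.den * to.num;

    return (ts * num + den / 2) / den;
}
```

With these assumed time bases, a pts of 3600 in 1/90000 units becomes frame number 1 in 1/25 units, and rescaling 1 back gives 3600 again.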
Thanks
I am implementing the operation of encoding video with TI DM365 mpeg4 encoder and containerizing it with ffmpeg mp4 container using a dummy FMP4 codec to produce headers and footers. While the container is proven to be working correctly using similar Intel based mpeg4 encoder, the dm365 gives a mosaic result if P frames are used at all. Using only I frames works, but I would like to minimize amount of data stored.
An example of the result can be viewed here. The settings are 1 I-frame followed by 9 P-frames.
TI developers didn't answer my question regarding this in 2 days, so I am trying to get help here.
This may help, a TI data sheet on the various settings/parameters and their effect. Apologies if it is telling you stuff you already know...
TI Data Sheet spraba9.pdf
Please guide me to achieve the following result in my program (written in C):
My stream source is an HTTP MPEG-TS stream (codecs H.264 and AAC). It has one video and one audio substream.
I need to get MPEG ES frames (of the same codecs) to send them via RTP to RTSP clients. It would be best if libavformat gave me frames with the RTP header already attached.
MPEG ES is needed because, as far as I know, media players on BlackBerry phones do not play TS (I tried it).
That said, I would appreciate it if anyone could point me to another format that is easier to get in this situation, can hold H.264 and AAC, and plays well on BlackBerry and other phones.
I have already succeeded at another task: opening the stream and remuxing it to an FLV container.
I tried opening two output format contexts with the "rtp" format and also got frames. I sent them to the client, with no success.
I have also tried writing frames to an "m4v" AVFormatContext: I got frames, split them by NAL, added an RTP header before each frame, and sent them to the client. The client displays the first frame and hangs, or plays a second of video and audio (faster than it should) every 10 seconds or more.
In the VLC player log I have this: http://pastebin.com/NQ3htvFi
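For reference, the fixed 12-byte RTP header described in RFC 3550 can be written as in the sketch below. The function name write_rtp_header and the example values are hypothetical. Note that for H.264 the payload after the header also has to follow the packetization rules of RFC 6184 (for example, fragmenting NAL units that do not fit in one packet), so prefixing a header to a whole frame is not necessarily enough.

```c
#include <stddef.h>
#include <stdint.h>

#define RTP_HEADER_SIZE 12

/* Hypothetical helper: writes the fixed RTP header (version 2, no padding,
 * no extension, no CSRC entries) into `out`, which must hold at least
 * RTP_HEADER_SIZE bytes. All multi-byte fields are big-endian.
 * Returns the number of bytes written. */
size_t write_rtp_header(uint8_t *out, uint8_t payload_type, int marker,
                        uint16_t seq, uint32_t timestamp, uint32_t ssrc)
{
    out[0] = 0x80;                                   /* V=2, P=0, X=0, CC=0 */
    out[1] = (uint8_t)((marker ? 0x80 : 0x00) | (payload_type & 0x7F));
    out[2] = (uint8_t)(seq >> 8);                    /* sequence number */
    out[3] = (uint8_t)(seq & 0xFF);
    out[4] = (uint8_t)(timestamp >> 24);             /* media timestamp */
    out[5] = (uint8_t)(timestamp >> 16);
    out[6] = (uint8_t)(timestamp >> 8);
    out[7] = (uint8_t)(timestamp & 0xFF);
    out[8] = (uint8_t)(ssrc >> 24);                  /* stream source id */
    out[9] = (uint8_t)(ssrc >> 16);
    out[10] = (uint8_t)(ssrc >> 8);
    out[11] = (uint8_t)(ssrc & 0xFF);
    return RTP_HEADER_SIZE;
}
```

The sequence number has to increase by one per packet, and the timestamp has to advance in units of the clock rate announced in the SDP; a mismatch there is one possible cause of playback that stalls or runs too fast.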
I've scaled timestamps to make them start with 0 for simplicity.
I compared it with what VLC (or Wowza, sorry, I don't remember which) sends: it incremented the audio timestamp by 1024, not 1920, so I applied additional linear scaling to match the other streamers.
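The two increments are consistent with each other if the audio is sampled at 48 kHz, which is an assumption here since the sample rate is not stated. An AAC frame normally carries 1024 samples, so one frame lasts 1024 ticks on a clock running at the sample rate and 1920 ticks on the 90 kHz MPEG-TS clock. A small sketch of that arithmetic, with the hypothetical helper aac_frame_ticks:

```c
#include <assert.h>
#include <stdint.h>

#define AAC_SAMPLES_PER_FRAME 1024

/* Hypothetical helper: duration of one AAC frame expressed in ticks of a
 * clock running at `clock_rate` Hz, for audio sampled at `sample_rate` Hz. */
uint32_t aac_frame_ticks(uint32_t sample_rate, uint32_t clock_rate)
{
    assert(sample_rate > 0);
    return (uint32_t)(((uint64_t)AAC_SAMPLES_PER_FRAME * clock_rate) / sample_rate);
}
```

For example, aac_frame_ticks(48000, 90000) gives 1920 and aac_frame_ticks(48000, 48000) gives 1024. Whichever increment is used, it has to match the clock rate declared for the audio stream in the SDP.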
Packet dump of playback of bigbuckbunny_450.mp4 is here:
ftp://rtb.org.ua/tmp/output_my_bbb_450.log
By the way, in both cases I copied the SDP verbatim from Wowza or VLC.
What is the right way to get what I need?
I am also interested in whether there is some library similar to libavformat, maybe even one in an embryonic state.