FFmpeg: what does av_parser_parse2 do? - c

When sending h264 data for frame decoding, it seems like a common method is to first call av_parser_parse2 from the libav library on the raw data.
I looked for documentation but I couldn't find anything other than some example code. Does it group up packets of data so that the resulting data starts with NAL headers and can be treated as a frame?
The following is a link to a sample code that uses av_parser_parse2:
https://github.com/DJI-Mobile-SDK-Tutorials/Android-VideoStreamDecodingSample/blob/master/android-videostreamdecodingsample/jni/dji_video_jni.c
I would appreciate if anyone could explain those library details to me or link me resources for better understanding.
Thank you.

It is like you guessed: av_parser_parse2() for H.264 consumes input data, looks for NAL start codes (0x000001), checks the NAL unit type looking for frame starts, and outputs the input data with a different framing.
That is, it consumes the input data, ignores its framing by putting all consecutive data into a big buffer, and then restores the framing from the H.264 byte stream alone, which is possible because of the start codes and the NAL unit types. It does not increase or decrease the amount of data given to it: if you get 30k out, you put 30k in, though perhaps in little pieces of around 1500 bytes, the payloads of the network packets you received.
By the way, when a function's declaration is not well documented, it is a good idea to look at the implementation.
Merely recovering the framing would not be involved enough to call it parsing, but the H.264 parser in ffmpeg also gathers more information from the H.264 stream, e.g. whether it is interlaced, so it really deserves its name.
It does not, however, decode the image data of the H.264 stream.
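A rough sketch of the boundary detection described above, in plain C. This only illustrates the idea; it is not FFmpeg's actual code, and find_start_code is a made-up helper name:

```c
#include <stddef.h>
#include <stdint.h>

/* Return the offset of the next 0x000001 start code at or after `pos`,
 * or -1 if none is found. The real parser additionally inspects the
 * NAL unit type after the start code to decide where a frame begins. */
static long find_start_code(const uint8_t *buf, size_t len, size_t pos)
{
    for (size_t i = pos; i + 2 < len; i++) {
        if (buf[i] == 0x00 && buf[i + 1] == 0x00 && buf[i + 2] == 0x01)
            return (long)i;
    }
    return -1;
}
```

Scanning between two such start codes is how the parser can rebuild frame boundaries from a byte stream that arrived in arbitrary-sized chunks.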

DJI's video transmission does not guarantee that the data in each packet belongs to a single video frame. Usually a packet contains only part of the data needed for a frame, and it is also not guaranteed that a packet contains data from one frame rather than two consecutive frames.
Android's MediaCodec needs to be queued with buffers, each holding the full data for a single frame.
This is where av_parser_parse2() comes in: it gathers packets until it has enough data for a full frame, which is then sent to MediaCodec for decoding.

Related

Is it possible to force ffmpeg/libav decoder to output an H.264 frame with only current information?

I'm currently using a slightly legacy version of ffmpeg/libav to decode H.264 frames.
It decodes them with a call to:
avcodec_decode_video2(context, &outPicture, &gotPicture, inNALPacket);
For this I provide a series of NAL packets, and once it has 'enough' it produces the image frame, as outPicture.
So far, so good.
However, sometimes (due to network issues) a packet/NAL goes missing.
I can detect this.
When this happens I would like to give up on this frame, and tell the decoder to just give me its best shot at the image, given the data so far.
Is there any way of doing this? E.g. can I construct an inNALPacket that essentially tells the decoder to give up and move on?

How to switch between data stream and control using (UART) bus

This question is about firmware for an IR transmitter with 8 outgoing channels. It is a micro-controller board with 8 IR LEDs. The goal is to have a transmitter capable of sending streams of data using one or multiple channels.
The data is delivered to the board over UART and then transmitted over one or multiple channels.
My transmitter circuit is faster than the UART, so no flow control is required.
Currently the channel is fixed in the firmware, so each byte from the UART is transmitted directly. This means there is no way to select the desired channel over UART, which is what I want.
Of course, the easiest solution is to accompany each data byte with a control byte in which each bit represents one channel. This has the advantage that each byte can be routed to one or more channels, but of course it increases overhead dramatically.
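The control-byte scheme described above could be sketched like this. send_on_channel is a hypothetical per-channel output routine, replaced here by a stub for illustration:

```c
#include <stdint.h>

/* Hypothetical per-channel output; the stub just records the byte. */
static uint8_t last_sent[8];
static void send_on_channel(int channel, uint8_t data)
{
    last_sent[channel] = data;
}

/* Route one data byte to every channel whose bit is set in `mask`. */
static void route_byte(uint8_t mask, uint8_t data)
{
    for (int ch = 0; ch < 8; ch++)
        if (mask & (1u << ch))
            send_on_channel(ch, data);
}
```

With this layout every data byte costs one extra control byte, which is exactly the 100% overhead the question mentions.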
Because of the stream type of transmission, I am trying to avoid a length field in my transmitter.
My research work is in the network stack on top of this.
My question is whether there are schemes or good practices to solve this. I expect that similar problems exist in robotics, where sensor data streams cross control signals all the time, but I could not find a simple and elegant solution.
I generally use the SLIP transmission protocol in my projects. It is very fast, easy to implement, and works very well for framing any packet you want.
http://www.tcpipguide.com/free/t_SerialLineInternetProtocolSLIP.htm
Basically, you feed each byte to be transmitted or received into a function that uses 0xC0 as both a header and a footer. Since 0xC0 may also be a valid byte in the data you are sending, a few transformations are applied to data bytes of 0xC0 in order to guarantee that 0xC0 only ever appears as a header or footer.
Then, using the reverse algorithm on the other side, you can frame the incoming data by looking for 0xC0 twice in the right order. This signifies a full packet, which can be buffered up and flagged for main CPU processing.
SLIP guarantees the correct framing of the packet.
Then it is up to you to define your own packet format for the data field inside the SLIP packet, once SLIP has framed it correctly.
I often do the following...
<0xC0> ...<0xC0>
Use different opcodes for your different channels. You can easily add another layer with acknowledgements if you want.
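The byte stuffing described above can be sketched as follows, using the standard SLIP escape values (0xDB as ESC, with 0xDC/0xDD as the escaped forms of END/ESC). This is an illustrative encoder, not a drop-in driver:

```c
#include <stddef.h>
#include <stdint.h>

#define SLIP_END     0xC0
#define SLIP_ESC     0xDB
#define SLIP_ESC_END 0xDC
#define SLIP_ESC_ESC 0xDD

/* Encode `len` bytes into `out` with SLIP byte stuffing, framing the
 * packet with END on both sides as the answer describes. `out` must
 * hold the worst case of 2*len + 2 bytes. Returns bytes written. */
static size_t slip_encode(const uint8_t *in, size_t len, uint8_t *out)
{
    size_t n = 0;
    out[n++] = SLIP_END;
    for (size_t i = 0; i < len; i++) {
        if (in[i] == SLIP_END) {
            out[n++] = SLIP_ESC;
            out[n++] = SLIP_ESC_END;
        } else if (in[i] == SLIP_ESC) {
            out[n++] = SLIP_ESC;
            out[n++] = SLIP_ESC_ESC;
        } else {
            out[n++] = in[i];
        }
    }
    out[n++] = SLIP_END;
    return n;
}
```

The receiver applies the reverse substitutions, so a bare 0xC0 in the stream can only ever mean "frame boundary".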
Seems like the only sensible solution is to create a carrier protocol for the UART data. You might want this anyway, since UART has poor immunity to EMI; you can make it more reliable by including a CRC check in the protocol. (Please note that the built-in error handling of UART through start/stop/parity bits is very naive and has been outdated since the mid 70s or so.)
Typically these protocols go like <sync token> <header> <data> <checksum>, where the header may contain a data length and the data can then be of variable length.
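A minimal sketch of such a frame layout, assuming an illustrative one-byte sync token, a one-byte length header, and a simple XOR checksum standing in for a real CRC (all constants here are assumptions, not part of any standard):

```c
#include <stddef.h>
#include <stdint.h>

#define FRAME_SYNC 0x7E  /* illustrative sync token */

/* Build <sync> <len> <data...> <checksum>, where the checksum is the
 * XOR of the length byte and all payload bytes. `out` must hold
 * len + 3 bytes. Returns the total frame size. */
static size_t build_frame(const uint8_t *data, uint8_t len, uint8_t *out)
{
    uint8_t sum = len;
    out[0] = FRAME_SYNC;
    out[1] = len;
    for (uint8_t i = 0; i < len; i++) {
        out[2 + i] = data[i];
        sum ^= data[i];
    }
    out[2 + len] = sum;
    return (size_t)len + 3;
}
```

The receiver hunts for the sync token, reads the length, and verifies the checksum before acting on the payload; a CRC would catch far more error patterns than this XOR.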
Probably not an option at this point, but SPI would have been a much more pleasant interface for this. You could then have one shift register per 8 IR diodes and select the channel through the SPI slave select via some MUX/DEMUX circuit. Everything would work synchronously and no carrier protocol is needed. It would also completely remove the need for an MCU between the data sender and the diodes.

Do Audio Queue Services buffers need to be even multiples of packet size?

I'm trying to use Audio Queue Services to play mp3 audio that is being delivered from an external process. I am using NSTask and NSOutputHandle to get the output from the command - that part works fine. I'm using Audio File Stream Services to parse the data from the command - that seems to work as well. In my Audio File Stream Services listener function, I'm not sure what to do with the packets that come in. It would be great if I could just throw them at the audio queue, but apparently it doesn't work that way. You're supposed to define a series of buffers and enqueue them on the audio queue. Can the buffers correspond to the packets, or do I have to somehow convert them? I'm not very good at C or pointer math, so the idea of converting arbitrary-sized packets to non-matching-sized buffers is kind of scary to me. I've read the Apple docs many times, but they only cover reading from a file and seem to skip this whole packet/buffer conversion step.
You should be able to configure the AudioQueue such that the buffer sizes match your packet sizes. Additionally, the AudioQueue will do the job of decoding the mp3 - you shouldn't need to do any of your own conversions.
Use the inBufferByteSize parameter to configure the buffer size:
OSStatus AudioQueueAllocateBuffer (
AudioQueueRef inAQ,
UInt32 inBufferByteSize,
AudioQueueBufferRef *outBuffer
);
If your packets are all different sizes, you can use AudioQueueAllocateBuffer to allocate each buffer with that custom size before filling it, and free it instead of re-enqueueing it after the audio queue callback is done with it.
For less memory management (which impacts performance), if you know the maximum packet size, you can allocate every buffer that big and then only partially fill it (after checking the packet size to make sure it fits). The buffer's mAudioDataByteSize field records how much of it is actually filled.

Why does reordering gzip packets damage the output?

I'm using the gzip approach from the example code posted with zlib.
For initialization I use deflateInit2(p_strm, Z_DEFAULT_COMPRESSION, Z_DEFLATED, (15+16), 8, Z_DEFAULT_STRATEGY).
I'm zipping a stream, deflating each packet with Z_FULL_FLUSH except for the last, for which I use Z_FINISH.
After zipping each packet, I'm reordering the packets.
data in packets ---> [zip] ---> [reordering] ---> ...
If I inflate the data right after deflating, I get back exactly the file I compressed.
If I inflate the data after reordering the packets (again: each packet deflated with Z_FULL_FLUSH, except for the last with Z_FINISH), I get a file that is very similar to the original but is missing bytes at the end. That's because I get an error (Z_DATA_ERROR) for the last packet while inflating. If I inflate with chunks of, say, 50KB, the inflated file after reordering matches the input minus up to 50KB (the whole last chunk is lost because of the error). If I decrease the inflate chunk size to 8B, I still get Z_DATA_ERROR, but I lose less data while inflating (in my example only one byte of the original file is missing).
I am not reordering the last packet (the one with Z_FINISH).
I also tried sending all of the packets with Z_FULL_FLUSH and then sending another "empty" packet (only Z_FINISH, which is 10 bytes).
Why is this happening?
If I use Z_FULL_FLUSH, why can't the inflater inflate it correctly?
Does it remember the order of the deflated packets?
Any information will help,
Thanks.
Since you are using Z_FULL_FLUSH, which erases the history at each flush, you can reorder the packets, except for the last one. The one you did Z_FINISH on must remain the last packet. It doesn't need to contain any data, though: you can feed all of your data, including the last packet, using Z_FULL_FLUSH, and then emit one final packet with no input data and Z_FINISH. That permits you to reorder the packets before that empty one however you like. Just always keep that last one at the end.
The reason is that the deflate format is self-terminating, so that last piece marks the end of the stream. If you reorder it into the middle somewhere, inflation will stop when it hits that packet.
The gzip header and trailer need to be maintained at the beginning and the end, and the CRC in the trailer updated accordingly. The CRC check at the end depends on the order of the data.
Why are you trying to do what you're trying to do? What are you optimizing?
GZip is a streaming protocol. The compression depends on the prior history of the stream. You can't reorder it.

Receive message of undefined size in UART in C

I'm writing my own drivers for LPC2148 and a question came to mind.
How do I receive a message of unspecified size in UART?
The only two things that come to mind are: 1 - configure a watchdog timer and end the reception when the time runs out; 2 - require that every message sent to it ends with an end-of-message character.
The first choice seems better in my opinion, but I'd like to know if anybody has a better answer, and I know there must be one.
Thank you very much
Just give the caller whatever bytes you have received so far. The UART driver shouldn't try to implement the application protocol, the application should do that.
That looks like a wrong use for a watchdog. I ended up with three solutions for this problem:
Use fixed-size packets and DMA, so you receive one packet per transaction. Apparently that is not possible in your case.
Receive the message char by char until the end-of-message character is received. Kind of error-prone, since the EOM char may also appear in the data.
Use a fixed-size header before every packet. In the header, store the payload size and/or a message type ID.
The third approach is probably the best one. You may combine it with the first, i.e. use DMA to receive the header and then the data (in a second transaction, once the data size is known from the header). It is also one of the most flexible approaches.
One more thing to worry about is keeping the byte stream in sync. There may be rubbish lying in the UART input buffers that gets read as data, or you may receive only part of a packet after your MCU is powered up (i.e. the beginning of the packet had already been sent by then). To avoid that, you can add magic bytes to your packet header, and probably a CRC.
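A sketch of the header approach combined with a magic sync byte, written as a byte-at-a-time receive state machine (all constants and names here are illustrative, not from any real driver):

```c
#include <stdint.h>

#define PKT_MAGIC   0xA5  /* illustrative sync byte */
#define PKT_MAX_LEN 64

enum rx_state { RX_MAGIC, RX_LEN, RX_DATA };

struct rx {
    enum rx_state state;
    uint8_t len;
    uint8_t pos;
    uint8_t buf[PKT_MAX_LEN];
};

/* Feed one received byte; returns 1 when a complete packet sits in
 * rx->buf. An out-of-range length resets the hunt for the magic byte,
 * which is how the receiver regains sync after rubbish or a partial
 * packet. A CRC field would go after the payload in a real protocol. */
static int rx_feed(struct rx *rx, uint8_t b)
{
    switch (rx->state) {
    case RX_MAGIC:
        if (b == PKT_MAGIC)
            rx->state = RX_LEN;
        break;
    case RX_LEN:
        if (b == 0 || b > PKT_MAX_LEN) {
            rx->state = RX_MAGIC;  /* bad header: resync */
            break;
        }
        rx->len = b;
        rx->pos = 0;
        rx->state = RX_DATA;
        break;
    case RX_DATA:
        rx->buf[rx->pos++] = b;
        if (rx->pos == rx->len) {
            rx->state = RX_MAGIC;
            return 1;
        }
        break;
    }
    return 0;
}
```

Called from the UART receive interrupt (or on bytes drained from a ring buffer), this hands the application whole packets without any watchdog timing.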
EDIT
OK, one more option :) Just store everything you receive in a growing buffer for later use. That is basically what PC drivers do.
Real embedded UART drivers usually use a ring buffer. Bytes are stored in order, and the clients promise to read from the buffer before it fills up.
A state machine can then process the message in multiple passes, with no need for a watchdog to tell it that reception is over.
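A minimal single-producer/single-consumer ring buffer of the kind described, sized as a power of two so the indices can wrap with a mask (illustrative, not tied to any particular MCU):

```c
#include <stdint.h>

#define RB_SIZE 128  /* power of two so indices wrap with a mask */

struct ring {
    uint8_t buf[RB_SIZE];
    volatile uint16_t head;  /* written only by the UART ISR */
    volatile uint16_t tail;  /* written only by the reader */
};

/* Called from the receive interrupt; drops the byte when full,
 * i.e. when the client broke its promise to keep up. Returns 1 on
 * success, 0 if the byte was dropped. */
static int ring_put(struct ring *r, uint8_t b)
{
    uint16_t next = (uint16_t)((r->head + 1) & (RB_SIZE - 1));
    if (next == r->tail)
        return 0;
    r->buf[r->head] = b;
    r->head = next;
    return 1;
}

/* Called from the main loop; returns the next byte, or -1 when empty. */
static int ring_get(struct ring *r)
{
    if (r->tail == r->head)
        return -1;
    uint8_t b = r->buf[r->tail];
    r->tail = (uint16_t)((r->tail + 1) & (RB_SIZE - 1));
    return b;
}
```

Because the ISR only ever writes head and the reader only ever writes tail, this works without disabling interrupts on most single-core MCUs.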
Better to go for option 2: append an end-of-transmission character to the transmitted string.
I also suggest adding a start-of-transmission character, so you can validate that you are receiving an actual transmission.
A watchdog timer is for resetting the system when the device behaves unexpectedly. I think it is better to use a buffer that can store the amount of data your application requires.

Resources