AVFrame deprecated attributes to regain?

I want to print out some attributes of video frames. I've looked into the AVFrame struct, but only found the following disappointments:
attribute_deprecated short * dct_coeff
attribute_deprecated uint32_t * mb_type
It seems to me everything I am interested in is already obsolete. Btw, I didn't find the
int16_t (*motion_val[2])[2]
attribute in the actual frame I captured. My question is: how can I get access to attributes such as dct_coeff, motion_val, or mb_type of a frame at all?

See av_frame_get_side_data(frame, AV_FRAME_DATA_MOTION_VECTORS) for the motion vectors. The other two have no replacement. The documentation states that they're MPEG-specific and use internal implementation details, which is why no replacement was provided.
(Don't forget to set avctx->flags2 |= AV_CODEC_FLAG2_EXPORT_MVS before opening the decoder, otherwise the side data is not exported.)
For the two with no replacement, I understand you might want that type of information if you're e.g. writing a stream analyzer, but FFmpeg really doesn't provide a stream-analyzer-level API right now. They could - if there were a more generic API - obviously be added as a separate side-data type. If you want that, you should probably become an FFmpeg developer and work on a broader API that is not MPEG-specific (e.g. one that does not use internal macros for mb_type), possibly even implement it for other codecs. In any other case, I don't really see why you would want that information. Can you elaborate?
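For illustration, a minimal sketch of reading the exported motion vectors (assuming avctx is your decoder context and frame a decoded AVFrame; error handling omitted):

#include <stdio.h>
#include <libavutil/motion_vector.h>

avctx->flags2 |= AV_CODEC_FLAG2_EXPORT_MVS;   /* before avcodec_open2() */
/* ... decode a frame ... */
AVFrameSideData *sd = av_frame_get_side_data(frame, AV_FRAME_DATA_MOTION_VECTORS);
if (sd) {
    const AVMotionVector *mvs = (const AVMotionVector *)sd->data;
    size_t n = sd->size / sizeof(*mvs);
    for (size_t i = 0; i < n; i++)
        printf("mv %zu: source=%d (%d,%d) -> (%d,%d)\n", i,
               mvs[i].source, mvs[i].src_x, mvs[i].src_y,
               mvs[i].dst_x, mvs[i].dst_y);
}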

How to design generic backward compatible API for embedded software application library interface in C?

I am tasked to assist with the design of a dynamic library (exposed with a C interface) aimed to be used in embedded software applications on various embedded platforms (Android, Windows, Linux).
The main requirements are speed and decoupling.
For the decoupling part: one of our requirements is to facilitate integration, and so to permit backward compatibility and resilience.
My library has some entry points that should be called by the integrating software (like an initialize constructor providing options such as where to log, how to behave, etc.) and could also call some callbacks in the application (e.g. an event informing that a task is finished).
I have come up with several propositions, but as none of them seems great, I am looking for advice on better or more standard ways to achieve decoupling and backward compatibility than the three approaches below:
The first option I could think of is a generic interface call for my exposed entry points, for example with a hashmap of key/value pairs for the parameters of my functions, so in pseudo code it gives something like:
myLib.Initialize(Key_Value_Option_Array_Here);
Another option is to provide generic functions to pass all the options to the library:
myLib.SetOption(Key_Of_Option, Value_OfOption);
myLib.SetCallBack(Key_Of_Callbak, FunctionPointer);
When I presented my options, my colleague asked why not use a Google protobuf message as the interface between the library and the embedded software. That seems weird to me, as there will be a performance hit on each call for serialization and deserialization.
Are there any more efficient or standard ways that you could think of?
You could have a struct for optional arguments:
typedef struct {
    uint8_t optArg1;
    float   optArg2;
} MyLib_InitOptArgs_T;

void MyLib_Init(int16_t arg1, uint32_t arg2, MyLib_InitOptArgs_T const * optionalArgs);
Then you could use compound literals on function call:
MyLib_Init(1, 2, &(MyLib_InitOptArgs_T){ .optArg2=1.2f });
All non-specified members would be zero-initialized (0, NULL, 0.0f) and would be considered unused. Similarly, when passing NULL for the struct pointer, all optional arguments would be considered unused.
Downside with this method is that if you expect to have many new arguments in the future, structure could grow too big. But whether that is an issue, depends on what your limits are.
Another option is simply to have multiple smaller initialization functions for initializing different subsystems. This could be combined with the optional-arguments system above, as in the sketch below.
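A hypothetical sketch of that combination (all names made up for the example):

#include <stdint.h>

typedef struct {
    const char *logPath;   /* NULL: no file logging */
    int verbosity;         /* 0: default level */
} MyLib_LogOptArgs_T;

typedef struct {
    void (*onTaskDone)(int taskId, void *userData);
    void *userData;
} MyLib_CallbackOptArgs_T;

/* Each subsystem gets its own small init; new options are appended to
   the structs, so existing callers keep compiling and running unchanged. */
void MyLib_InitCore(uint32_t flags);
void MyLib_InitLogging(MyLib_LogOptArgs_T const *optionalArgs);
void MyLib_SetCallbacks(MyLib_CallbackOptArgs_T const *optionalArgs);

Callers that don't care about a subsystem simply never call its init function, and callers that do can pass NULL or a compound literal as before.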

GradCam Implementation in TFJS

I'm trying to implement Grad-CAM (https://arxiv.org/pdf/1610.02391.pdf) in tfjs, based on the following Keras tutorial (http://www.hackevolve.com/where-cnn-is-looking-grad-cam/) and a simple image-classification demo from tfjs, similar to (https://github.com/tensorflow/tfjs-examples/blob/master/webcam-transfer-learning/index.js), with a simple dense, fully-connected layer at the end.
However, I'm not able to retrieve the gradients needed for the Grad-CAM computation. I tried different ways to retrieve gradients for the last sequential layer, but did not succeed, as the tf.LayerVariable objects of the respective layer are not convertible to the type expected by tf.grads or tf.layerGrads.
Has anybody already succeeded in getting the gradients of a sequential layer into a tf.function-like object?
I'm not aware of the ins and outs of the implementation, but I think something along the lines of this is what you're looking for: http://jlin.xyz/advis/
Source code is available here: https://github.com/jaxball/advis.js (not mine!)
This official example in the tfjs-examples repo should be close to, if not exactly, what you want:
https://github.com/tensorflow/tfjs-examples/blob/master/visualize-convnet/cam.js#L49

Difference between Data_Wrap_Struct and TypedData_Wrap_Struct?

I'm wrapping a C struct in a Ruby C extension, but I can't find the difference between Data_Wrap_Struct and TypedData_Wrap_Struct in the docs. What's the difference between the two functions?
It's described pretty well in the official documentation.
The tl;dr is that Data_Wrap_Struct is deprecated and just lets you set the class and the mark/free functions for the wrapped data. TypedData_Wrap_Struct instead lets you set the class and then takes a pointer to a rb_data_type_struct structure that allows for more advanced options to be set for the wrapping:
the mark/free functions as before, but also
an internal label to identify the wrapped type
a function for calculating memory consumption
arbitrary data (basically letting you wrap data at a class level)
additional flags for garbage collection optimization
Check my unofficial documentation for a couple examples of how this is used.
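For illustration, a minimal sketch of the typed API (the wrapped type and its functions are made up for the example):

#include <ruby.h>

typedef struct { int counter; } my_thing_t;

static void my_thing_free(void *p) { xfree(p); }

static size_t my_thing_memsize(const void *p)
{
    return sizeof(my_thing_t);   /* reported for memory accounting */
}

static const rb_data_type_t my_thing_type = {
    .wrap_struct_name = "my_thing",       /* the internal label */
    .function = {
        .dmark = NULL,                    /* no VALUEs inside, nothing to mark */
        .dfree = my_thing_free,
        .dsize = my_thing_memsize,
    },
    .flags = RUBY_TYPED_FREE_IMMEDIATELY, /* GC optimization flag */
};

static VALUE my_thing_alloc(VALUE klass)
{
    my_thing_t *t;
    VALUE obj = TypedData_Make_Struct(klass, my_thing_t, &my_thing_type, t);
    t->counter = 0;
    return obj;
}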

What is a MsgPack 'zone'

I have seen references to 'zone' in the MsgPack C headers, but can find no documentation on what it is or what it's for. What is it? Furthermore, where's the function-by-function documentation for the C API?
msgpack_zone is an internal structure used for memory management & lifecycle at unpacking time. I would say you will never have to interact with it if you use the standard, high-level interface for unpacking or the alternative streaming version.
To my knowledge, there is no detailed documentation: instead you should refer to the test suite that provides convenient code samples to achieve the common tasks, e.g. see pack_unpack_c.cc and streaming_c.cc.
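For illustration, a minimal sketch of a pack/unpack round trip with the classic C API, which is where the zone becomes visible: the zone owns all memory referenced by the resulting msgpack_object.

#include <stdio.h>
#include <msgpack.h>

int main(void)
{
    /* pack an integer into a growable buffer */
    msgpack_sbuffer sbuf;
    msgpack_sbuffer_init(&sbuf);
    msgpack_packer pk;
    msgpack_packer_init(&pk, &sbuf, msgpack_sbuffer_write);
    msgpack_pack_int(&pk, 42);

    /* unpack: the zone provides the memory backing the object */
    msgpack_zone mempool;
    msgpack_zone_init(&mempool, 2048);
    msgpack_object obj;
    msgpack_unpack(sbuf.data, sbuf.size, NULL, &mempool, &obj);
    msgpack_object_print(stdout, obj);   /* prints: 42 */

    msgpack_zone_destroy(&mempool);      /* obj's storage is gone after this */
    msgpack_sbuffer_destroy(&sbuf);
    return 0;
}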
From what I could gather, it is a move-only type that stores the actual data of a msgpack::object. It may very well be intended as an implementation detail, but it sometimes leaks into user code. For example, any time you want to capture a msgpack::object in a lambda, you have to capture the corresponding msgpack::zone object as well. Sometimes you can't use move capture (e.g. asio handlers in some cases only accept copyable handlers, or your compiler doesn't support the feature). To work around this, you can do the following:
msgpack::unpacked r;
while (pac_.next(&r)) {
    auto msg = r.get();
    // share ownership of the zone so the object's storage outlives this scope
    io_->post([this, msg, z = std::shared_ptr<msgpack::zone>(r.zone().release())]() {
        // msg is valid here because z keeps the zone (and its memory) alive
    });
}

GstBuffer Pixel Type (Determining BPP)

I am trying to write a GStreamer (0.10.34) plugin. I need to manipulate an incoming image. I have my sink caps set to "video/x-raw-yuv", so I know I'll be getting video.
I am having trouble in understanding how to use the GstBuffer, more specifically:
How do I get the bits per pixel?
Given the bpp, how do I determine the dimensions of the buffer?
I am currently elbows deep in the 0.10.34 core documentation, reading about GstStructure and GstQuarks... I think I'm in the wrong area.
As always, thanks for any advice.
After some source-code hunting (jpegenc), I found the base plugins libraries, most importantly GstVideo. This gives you the function gst_video_format_parse_caps.
GstVideoFormat seems to be what you use to parse incoming video information.
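For illustration, a minimal sketch (0.10 API) of pulling the format and dimensions out of the negotiated caps, from which the expected buffer size follows:

#include <gst/video/video.h>

static gboolean inspect_caps(GstCaps *caps)
{
    GstVideoFormat format;
    gint width, height;

    if (!gst_video_format_parse_caps(caps, &format, &width, &height))
        return FALSE;   /* not parseable raw video caps */

    /* bytes in one frame of this format; GST_BUFFER_SIZE of
       incoming buffers should match this */
    gint size = gst_video_format_get_size(format, width, height);
    g_print("format=%d, %dx%d, frame size=%d bytes\n",
            format, width, height, size);
    return TRUE;
}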
