FFmpeg library: add padding to the plane/frame being processed - C

I am trying to create a filter to be part of FFmpeg. In the process, I need to add padding around the frame so that the image is not resampled and simply ends up with the required width and height. I know this is possible with libswscale/swscale.h, but I have not been able to find any example of how to do the padding for the plane that is being processed. Example code below:
if (av_frame_is_writable(in)) {
    out = in;
} else {
    out = ff_get_video_buffer(outlink, outlink->w, outlink->h);
    if (!out) {
        av_frame_free(&in);
        return AVERROR(ENOMEM);
    }
    av_frame_copy_props(out, in);
}
for (p = 0; p < filter->nb_planes; p++) {
    // did not find any documentation as to
    // how set those attributes to add padding to the plane
    filter->sws_ctx = sws_getContext(src_w, src_h, src_pix_fmt,
                                     dst_w, dst_h, dst_pix_fmt,
                                     SWS_BILINEAR, NULL, NULL, NULL);
}

There is no way to do this other than inside the filter itself. The functionality has to be reimplemented based on the vf_pad filter.
Credit: @durandal_1707 from the #ffmpeg IRC channel
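As a rough illustration of what padding a single plane involves, here is a minimal sketch in plain C++ that deliberately uses no FFmpeg API. It allocates a larger plane, fills it with a constant value, and copies the source rows into it at an offset, so nothing is resampled. The helper name pad_plane, its parameters, and the assumption of 8-bit samples (one byte per sample) are hypothetical and are not taken from vf_pad.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper (not an FFmpeg function).
// Returns a tightly packed dst_w x dst_h plane filled with `fill`, with the
// src_w x src_h source plane copied so that its top-left corner is at (x, y).
// Assumes 8-bit samples; src_stride is the number of bytes per source row.
std::vector<std::uint8_t> pad_plane(const std::uint8_t* src,
                                    int src_w, int src_h, int src_stride,
                                    int dst_w, int dst_h,
                                    int x, int y, std::uint8_t fill)
{
    // The source must fit entirely inside the padded plane.
    assert(x >= 0 && y >= 0);
    assert(x + src_w <= dst_w && y + src_h <= dst_h);
    assert(src_stride >= src_w);

    std::vector<std::uint8_t> dst(static_cast<std::size_t>(dst_w) * dst_h, fill);
    for (int row = 0; row < src_h; ++row) {
        // Copy one row of samples; the pixels themselves are left untouched.
        std::memcpy(dst.data() + static_cast<std::size_t>(y + row) * dst_w + x,
                    src + static_cast<std::size_t>(row) * src_stride,
                    static_cast<std::size_t>(src_w));
    }
    return dst;
}
```

In a real filter the same row-by-row copy would be applied to each plane, with the sizes and offsets of subsampled chroma planes scaled accordingly.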

Related

Migrating custom dynamic partitioner from Flink 1.7 to Flink 1.9

I am trying to migrate a custom dynamic partitioner from Flink 1.7 to Flink 1.9. The original partitioner implemented the selectChannels method within the StreamPartitioner interface like this:
// Original: working for Flink 1.7
//@Override
public int[] selectChannels(SerializationDelegate<StreamRecord<T>> streamRecordSerializationDelegate,
                            int numberOfOutputChannels) {
    T value = streamRecordSerializationDelegate.getInstance().getValue();
    if (value.f0.isBroadCastPartitioning()) {
        // send to all channels
        int[] channels = new int[numberOfOutputChannels];
        for (int i = 0; i < numberOfOutputChannels; ++i) {
            channels[i] = i;
        }
        return channels;
    } else if (value.f0.getPartitionKey() == -1) {
        // random partition
        returnChannels[0] = random.nextInt(numberOfOutputChannels);
    } else {
        returnChannels[0] = partitioner.partition(value.f0.getPartitionKey(), numberOfOutputChannels);
    }
    return returnChannels;
}
I am not sure how to migrate this to Flink 1.9, since the StreamPartitioner interface has changed as illustrated below:
// New: required by Flink 1.9
@Override
public int selectChannel(SerializationDelegate<StreamRecord<T>> streamRecordSerializationDelegate) {
    T value = streamRecordSerializationDelegate.getInstance().getValue();
    if (value.f0.isBroadCastPartitioning()) {
        /*
        It is illegal to call this method for broadcast channel selectors and this method can remain not
        implemented in that case (for example by throwing UnsupportedOperationException).
        */
    } else if (value.f0.getPartitionKey() == -1) {
        // random partition
        returnChannels[0] = random.nextInt(numberOfChannels);
    } else {
        returnChannels[0] = partitioner.partition(value.f0.getPartitionKey(), numberOfChannels);
    }
    //return returnChannels;
    return returnChannels[0];
}
Note that selectChannels has been replaced with selectChannel. So, it is no longer possible to return multiple output channels as originally done above for the case of broadcasted elements. As a matter of fact, selectChannel should not be invoked for this particular case. Any thoughts on how to tackle this?
With Flink 1.9, you can no longer broadcast to all channels dynamically. Your StreamPartitioner has to declare statically, via isBroadcast, whether it is a broadcast partitioner. If it is, selectChannel is never invoked.
Do you have a specific use case where you need to switch dynamically?

Issue reading / writing a dynamic BYTE array to the registry [duplicate]

I have this method that initializes a buffer:
void CCreateReportDlg::InitAutoAssignStates()
{
    int iNumColumns = m_Grid.GetColumnCount();
    ASSERT(m_pbyAutoAssignStates == NULL);
    if (m_pbyAutoAssignStates == NULL)
    {
        m_pbyAutoAssignStates = new BYTE[iNumColumns];
        if (m_pbyAutoAssignStates != NULL)
        {
            // This sets them all to AUTO_ASSIGN_INCLUDE
            ZeroMemory(m_pbyAutoAssignStates, iNumColumns * sizeof(BYTE));
            // DATE is never used for auto assign
            m_pbyAutoAssignStates[COLUMN_DATE] = AUTO_ASSIGN_NOT_USED;
        }
    }
}
So far, so good. This buffer gets passed into a dialog class.
// Receives pointer to a BYTE* array.
// This is owned by the parent.
void CAutoAssignSettingsDlg::SetAutoAssignStates(BYTE *pbyAutoAssignStates)
{
    m_pbyAutoAssignStates = pbyAutoAssignStates;
}
No problems there. I then have a checked list on the dialog that is mapped to each of the states in the above buffer.
When the popup dialog is about to close it revises the buffer:
void CAutoAssignSettingsDlg::UpdateAutoAssignStates()
{
    LVITEM sItem;
    int iAssign, iNumAssign;
    if (m_pbyAutoAssignStates != NULL)
    {
        sItem.mask = LVIF_IMAGE|LVIF_PARAM;
        sItem.iSubItem = 0;
        iNumAssign = m_listAssign.GetItemCount();
        for (iAssign = 0; iAssign < iNumAssign; iAssign++)
        {
            sItem.iItem = iAssign;
            m_listAssign.GetItem(&sItem);
            if (sItem.iImage == IMG_CHECKED)
                m_pbyAutoAssignStates[sItem.lParam] = AUTO_ASSIGN_EXCLUDE;
            else
                m_pbyAutoAssignStates[sItem.lParam] = AUTO_ASSIGN_INCLUDE;
        }
    }
}
This all works. But then I want to save it to the registry. At the moment I do it like this:
theApp.WriteProfileBinary(strSection, _T("AssignStates"), m_pbyAutoAssignStates, sizeof(m_pbyAutoAssignStates));
Finally, in the parent dialog, I adjusted the code that reads the settings in from the registry. So now, before the InitAutoAssignStates call I do this:
theApp.GetProfileBinary(strSection, _T("AssignStates"), &ppData, &uSize);
if (uSize > 0)
{
    m_pbyAutoAssignStates = new BYTE[uSize];
    memcpy(m_pbyAutoAssignStates, ppData, uSize);
}
// Tidy memory
if (uSize != 0)
{
    delete[] ppData;
    ppData = NULL;
}
The subsequent InitAutoAssignStates method is now only called if the buffer is NULL. So in theory I should read back in the buffer that I saved. But it is not working. The set of items ticked in my check boxes does not match.
What am I doing wrong?
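I found a related question that said you could not do what I was trying to achieve without knowing the number of elements: sizeof(m_pbyAutoAssignStates) only gives the size of the pointer itself, not of the array it points to. This did surprise me, but I am not going to argue.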
I adjusted my code to pass in the number of elements to the popup dialog and then I was able to save like this:
theApp.WriteProfileBinary(strSection, _T("AssignStates"),
                          m_pbyAutoAssignStates,
                          sizeof(m_pbyAutoAssignStates[0]) * m_iNumAutoAssignStateValues);
This works correctly. When I read this buffer back I get matching check boxes in my list.
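The difference between the two size expressions can be seen in a small standalone sketch. It does not use MFC or the registry; the BYTE alias stands in for the Windows typedef, and the helper names pointer_bytes and buffer_bytes are hypothetical, introduced only for illustration.

```cpp
#include <cassert>
#include <cstddef>

using BYTE = unsigned char; // stand-in for the Windows BYTE typedef

// What the original call measured: the size of the pointer variable itself
// (typically 4 or 8 bytes), regardless of how many elements were allocated.
std::size_t pointer_bytes(const BYTE* buffer)
{
    return sizeof(buffer);
}

// What the corrected call measures: element size times a separately tracked
// element count. A raw pointer carries no length, so the count must be
// passed alongside it.
std::size_t buffer_bytes(const BYTE* buffer, std::size_t count)
{
    return sizeof(buffer[0]) * count;
}
```

For a buffer created with new BYTE[12], buffer_bytes(p, 12) gives 12, while pointer_bytes(p) gives the pointer size whatever the allocation was. Storing the states in a container that tracks its own length, such as std::vector<BYTE>, would avoid having to pass the count around.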

How can I get screenshot from all displays with X11?

I was working on writing a screenshot thing, and found this excellent topic for Mac: How can I get screenshot from all displays on MAC?
I was wondering if anyone has the equivalent for x11 library? To get all the monitors and then screenshot them all?
I had found this topic: https://stackoverflow.com/a/5293559/1828637
But the code linked from there is not as easy to follow for a novice like me.
Will RootWindow(3) cover the combined area of all the monitors? If so, could I get each monitor's dimensions and then call XGetImage on those sections of the window returned by RootWindow?
I had come across this topic: How do take a screenshot correctly with xlib? But I'm not sure whether it has multi-monitor support. I do this in ctypes, so I can't test that code easily without going through the grueling task of writing it first. So I was wondering whether this is correct, or how I would modify it to handle multiple monitors?
Edit
The poster there shared his code here: https://github.com/Lalaland/ScreenCap/blob/master/src/screenCapturerImpl.cpp#L96 but it's complicated and I don't understand it. It uses functions like XFixesGetCursorImage, which I can't find in the documentation, and I don't see how multiple monitors are handled there. The author of that topic warned that he doesn't remember the code and that it may not work with modern Linux.
This is not a perfect answer to the question, but the following code could be modified to get a very fast version of your desired end result:
https://github.com/Clodo76/vr-desktop-mirror/blob/master/DesktopCapture/main.cpp
The DesktopCapturePlugin_Initialize method converts all the displays into objects:
UNITY_INTERFACE_EXPORT void UNITY_INTERFACE_API DesktopCapturePlugin_Initialize()
{
    DesksClean();
    g_needReinit = 0;
    IDXGIFactory1* factory;
    CreateDXGIFactory1(__uuidof(IDXGIFactory1), reinterpret_cast<void**>(&factory));
    IDXGIAdapter1* adapter;
    for (int i = 0; (factory->EnumAdapters1(i, &adapter) != DXGI_ERROR_NOT_FOUND); ++i)
    {
        IDXGIOutput* output;
        for (int j = 0; (adapter->EnumOutputs(j, &output) != DXGI_ERROR_NOT_FOUND); j++)
        {
            DXGI_OUTPUT_DESC outputDesc;
            output->GetDesc(&outputDesc);
            MONITORINFOEX monitorInfo;
            monitorInfo.cbSize = sizeof(MONITORINFOEX);
            GetMonitorInfo(outputDesc.Monitor, &monitorInfo);
            // Maybe in future add a function to identify the primary monitor.
            //if (monitorInfo.dwFlags == MONITORINFOF_PRIMARY)
            {
                int iDesk = DeskAdd();
                g_desks[iDesk].g_width = monitorInfo.rcMonitor.right - monitorInfo.rcMonitor.left;
                g_desks[iDesk].g_height = monitorInfo.rcMonitor.bottom - monitorInfo.rcMonitor.top;
                auto device = g_unity->Get<IUnityGraphicsD3D11>()->GetDevice();
                IDXGIOutput1* output1;
                output1 = reinterpret_cast<IDXGIOutput1*>(output);
                output1->DuplicateOutput(device, &g_desks[iDesk].g_deskDupl);
            }
            output->Release();
        }
        adapter->Release();
    }
    factory->Release();
}
Then the OnRenderEvent method copies a frame from the display into a texture (provided by Unity in this case):
void UNITY_INTERFACE_API OnRenderEvent(int eventId)
{
    for (int iDesk = 0; iDesk < g_nDesks; iDesk++)
    {
        if (g_desks[iDesk].g_deskDupl == nullptr || g_desks[iDesk].g_texture == nullptr)
        {
            g_needReinit++;
            return;
        }
        IDXGIResource* resource = nullptr;
        const UINT timeout = 0; // ms
        HRESULT resultAcquire = g_desks[iDesk].g_deskDupl->AcquireNextFrame(timeout, &g_desks[iDesk].g_frameInfo, &resource);
        if (resultAcquire != S_OK)
        {
            g_needReinit++;
            return;
        }
        g_desks[iDesk].g_isPointerVisible = (g_desks[iDesk].g_frameInfo.PointerPosition.Visible == TRUE);
        g_desks[iDesk].g_pointerX = g_desks[iDesk].g_frameInfo.PointerPosition.Position.x;
        g_desks[iDesk].g_pointerY = g_desks[iDesk].g_frameInfo.PointerPosition.Position.y;
        ID3D11Texture2D* texture;
        HRESULT resultQuery = resource->QueryInterface(__uuidof(ID3D11Texture2D), reinterpret_cast<void**>(&texture));
        resource->Release();
        if (resultQuery != S_OK)
        {
            g_needReinit++;
            return;
        }
        ID3D11DeviceContext* context;
        auto device = g_unity->Get<IUnityGraphicsD3D11>()->GetDevice();
        device->GetImmediateContext(&context);
        context->CopyResource(g_desks[iDesk].g_texture, texture);
        g_desks[iDesk].g_deskDupl->ReleaseFrame();
    }
    g_needReinit = 0;
}

FFmpeg C Api - Reduce fps but maintain video duration

Using the FFmpeg C API I'm trying to convert an input video into a video that looks like an animated gif - meaning no audio stream and a video stream of 4/fps.
I have the decode/encode part working. I can drop the audio stream from the output file, but I'm having trouble reducing the fps. I can change the output video stream's time_base to 4/fps, but it increases the video's duration - basically playing it in slow mo.
I think I need to drop the extra frames before I write them to the output container.
Below is the loop where I read the input frames, and then write them to output container.
Is this where I'd drop the extra frames? How do I determine which frames to drop (I,P,B frames)?
while(av_read_frame(input_container, &decoded_packet)>=0) {
    if (decoded_packet.stream_index == video_stream_index) {
        len = avcodec_decode_video2(input_stream->codec, decoded_frame, &got_frame, &decoded_packet);
        if(len < 0) {
            exit(1);
        }
        if(got_frame) {
            av_init_packet(&encoded_packet);
            encoded_packet.data = NULL;
            encoded_packet.size = 0;
            if(avcodec_encode_video2(output_stream->codec, &encoded_packet, decoded_frame, &got_frame) < 0) {
                exit(1);
            }
            if(got_frame) {
                if (output_stream->codec->coded_frame->key_frame) {
                    encoded_packet.flags |= AV_PKT_FLAG_KEY;
                }
                encoded_packet.stream_index = output_stream->index;
                encoded_packet.pts = av_rescale_q(current_frame_num, output_stream->codec->time_base, output_stream->time_base);
                encoded_packet.dts = av_rescale_q(current_frame_num, output_stream->codec->time_base, output_stream->time_base);
                if(av_interleaved_write_frame(output_container, &encoded_packet) < 0) {
                    exit(1);
                }
                else {
                    current_frame_num +=1;
                }
            }
            frame_count+=1;
            av_free_packet(&encoded_packet);
        }
    }
}
It looks like you are decoding and then re-encoding the video. In the decoded state there is no such thing as I/P/B frames; every decoded frame is a complete picture, effectively an I-frame. This is also the point where you should drop frames. You must decode every frame, but once decoded, drop the frames you no longer want by simply not sending them to the encoder. Finally, don't touch the timebase at all.
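One possible way to decide which decoded frames to skip is sketched below in plain C++, with no FFmpeg calls. It assumes a constant-frame-rate input and integer frame rates. The helper names keep_frame and count_kept and their parameters are hypothetical, introduced only for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, assuming a constant-frame-rate input.
// Returns true if the decoded frame with this zero-based index should be
// sent to the encoder when reducing src_fps to dst_fps (dst_fps <= src_fps).
bool keep_frame(std::int64_t frame_index, int src_fps, int dst_fps)
{
    assert(frame_index >= 0 && src_fps > 0 && dst_fps > 0 && dst_fps <= src_fps);
    if (frame_index == 0)
        return true; // always keep the first frame
    // Output "slot" that this frame and the previous frame fall into.
    const std::int64_t slot = frame_index * dst_fps / src_fps;
    const std::int64_t prev_slot = (frame_index - 1) * dst_fps / src_fps;
    // Keep only the first frame that lands in each new slot.
    return slot != prev_slot;
}

// Counts how many of the first `total` frames would be kept.
std::int64_t count_kept(std::int64_t total, int src_fps, int dst_fps)
{
    std::int64_t kept = 0;
    for (std::int64_t i = 0; i < total; ++i)
        if (keep_frame(i, src_fps, dst_fps))
            ++kept;
    return kept;
}
```

With a 24 fps source and a 4 fps target, this keeps every sixth frame (indices 0, 6, 12, ...). In the loop from the question, such a check would sit between decoding and encoding. As long as each kept frame retains its original presentation timestamp, the overall duration stays the same.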

DirectShow Custom Source Pin

I am new to DirectShow. I, like many others, am trying to create a socket-based P2P streaming solution for a WPF-based card game. I want each player to be able to see each other via small video windows.
My questions are two-fold. The first is: how do I lower the frame sample rate and resolution? I believe 320x200 at 15 to 20 fps should be fine. I am using the SampleGrabber callback to grab frame data and send it over the socket, which is actually working with no compression at 640x480 resolution.
My second question concerns the receiving end. Since each frame contains 921,600 bytes, this really bogs down, and I get very slow rendering even across my local WiFi-connected LAN. I added a simple MJPEG compression (wanting to switch to H.264 later) and noticed the bytes drop to around 330-360k. Not a bad improvement.
On the receiving end, do I need to create a custom DirectShow source pin in order to serve up the bytes received from the socket, so I can attach a decoder and render the bytes in a window?
I just wanted to ask this first since it seems like a lot of work to create a new COM object (haven't done that in about 15 years!), register it, and use/debug it.
Is there perhaps another way?
Also if that is the way to go, should I use a SampleGrabber on the receiving end and create a BitmapSource from the decompressed bytes, or should I allow DirectShow to create a child window? Thing is, I want to have more than one other player and I set an extra byte in the socket to tell what table position they are in. How do I render each position in turn?
For those that are interested, here is how you set the resolution and add an encoder/compressor:
// Create a graph builder
int hr = captureGraphBuilder.SetFiltergraph(graphBuilder);
// Find a capture device (WebCam) and attach it to the graph
sourceFilter = FindCaptureDevice();
hr = graphBuilder.AddFilter(sourceFilter, "Video Capture");
// Get the source output Pin
IPin sourcePin = DsFindPin.ByDirection((IBaseFilter)sourceFilter, PinDirection.Output, 0);
IAMStreamConfig sc = (IAMStreamConfig)sourcePin;
int count;
int size;
sc.GetNumberOfCapabilities(out count, out size);
VideoInfoHeader v;
AMMediaType media2 = null;
IntPtr memPtr = Marshal.AllocCoTaskMem(size);
for (int i = 0; i < count; ++i)
{
    sc.GetStreamCaps(i, out media2, memPtr);
    v = (VideoInfoHeader)Marshal.PtrToStructure(media2.formatPtr, typeof(VideoInfoHeader));
    // Break when width is 160
    if (v.BmiHeader.Width == 160)
        break;
}
// Set the new media format to 160 x 120
hr = sc.SetFormat(media2);
Marshal.FreeCoTaskMem(memPtr);
DsUtils.FreeAMMediaType(media2);
// Create a FrameGrabber
IBaseFilter grabberF = (IBaseFilter)new SampleGrabber();
ISampleGrabber grabber = (ISampleGrabber)grabberF;
// Set the media type
var media = new AMMediaType
{
    majorType = MediaType.Video,
    subType = MediaSubType.MJPG
    //subType = MediaSubType.RGB24
};
// The media sub type will be MJPG
hr = grabber.SetMediaType(media);
DsUtils.FreeAMMediaType(media);
hr = grabber.SetCallback(this, 1);
hr = graphBuilder.AddFilter(grabberF, "Sample Grabber");
IPin grabberPin = DsFindPin.ByDirection(grabberF, PinDirection.Input, 0);
// Get the MPEG compressor
Guid iid = typeof(IBaseFilter).GUID;
object compressor = null;
foreach (DsDevice device in DsDevice.GetDevicesOfCat(FilterCategory.VideoCompressorCategory))//.MediaEncoderCategory))
{
    if (device.Name == "MJPEG Compressor")
    {
        device.Mon.BindToObject(null, null, ref iid, out compressor);
        hr = graphBuilder.AddFilter((IBaseFilter)compressor, "Compressor");
        break;
    }
    string name = device.Name;
}
// This also works!
//IBaseFilter enc = (IBaseFilter)new MJPGEnc();
//graphBuilder.AddFilter(enc, "MJPEG Encoder");
// Get the input and out pins of the compressor
IBaseFilter enc = (IBaseFilter)compressor;
IPin encPinIn = DsFindPin.ByDirection(enc, PinDirection.Input, 0);
IPin encPinOut = DsFindPin.ByDirection(enc, PinDirection.Output, 0);
// Attach the pins: source to input, output to grabber
hr = graphBuilder.Connect(sourcePin, encPinIn);
hr = graphBuilder.Connect(encPinOut, grabberPin);
// Free the pin resources
Marshal.ReleaseComObject(sourcePin);
Marshal.ReleaseComObject(enc);
Marshal.ReleaseComObject(encPinIn);
Marshal.ReleaseComObject(encPinOut);
Marshal.ReleaseComObject(grabberPin);
// Create a render stream
hr = captureGraphBuilder.RenderStream(PinCategory.Preview, MediaType.Video, sourceFilter, null, grabberF);
Marshal.ReleaseComObject(sourceFilter);
Configure(grabber);
