FFmpeg C API - Reduce fps but maintain video duration

Using the FFmpeg C API I'm trying to convert an input video into a video that looks like an animated gif - meaning no audio stream and a video stream of 4/fps.
I have the decode/encode part working. I can drop the audio stream from the output file, but I'm having trouble reducing the fps. I can change the output video stream's time_base to 4/fps, but that increases the video's duration, basically playing it in slow motion.
I think I need to drop the extra frames before I write them to the output container.
Below is the loop where I read the input frames, and then write them to output container.
Is this where I'd drop the extra frames? How do I determine which frames to drop (I,P,B frames)?
    while (av_read_frame(input_container, &decoded_packet) >= 0) {
        if (decoded_packet.stream_index == video_stream_index) {
            len = avcodec_decode_video2(input_stream->codec, decoded_frame, &got_frame, &decoded_packet);
            if (len < 0) {
                exit(1);
            }
            if (got_frame) {
                av_init_packet(&encoded_packet);
                encoded_packet.data = NULL;
                encoded_packet.size = 0;
                if (avcodec_encode_video2(output_stream->codec, &encoded_packet, decoded_frame, &got_frame) < 0) {
                    exit(1);
                }
                if (got_frame) {
                    if (output_stream->codec->coded_frame->key_frame) {
                        encoded_packet.flags |= AV_PKT_FLAG_KEY;
                    }
                    encoded_packet.stream_index = output_stream->index;
                    encoded_packet.pts = av_rescale_q(current_frame_num, output_stream->codec->time_base, output_stream->time_base);
                    encoded_packet.dts = av_rescale_q(current_frame_num, output_stream->codec->time_base, output_stream->time_base);
                    if (av_interleaved_write_frame(output_container, &encoded_packet) < 0) {
                        exit(1);
                    }
                    else {
                        current_frame_num += 1;
                    }
                }
                frame_count += 1;
                av_free_packet(&encoded_packet);
            }
        }
    }

It looks like you are decoding and then re-encoding the video. Once a frame is decoded there is no such thing as I/P/B; every decoded frame is a complete picture, so in effect they are all I frames. This is also the point where you should drop frames. You must decode every frame, but once decoded, you drop the frames you no longer want by simply not sending them to the encoder. Finally, don't touch the time base at all; changing it is what stretches the duration.
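As a rough illustration of the dropping step described in the answer above, the sketch below decides, from a decoded frame's timestamp alone, whether that frame should be passed to the encoder. It is plain C with no FFmpeg calls: the `FrameDropper` struct and `should_keep_frame` function are hypothetical names, and it assumes non-negative timestamps expressed in ticks of `tb_num / tb_den` seconds. Each output slot of `1 / out_fps` seconds receives the first frame that falls into it; every later frame in the same slot is dropped.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper state, not part of the FFmpeg API. */
typedef struct {
    int64_t tb_num;     /* input time base numerator (seconds per tick = tb_num / tb_den) */
    int64_t tb_den;     /* input time base denominator */
    int64_t out_fps;    /* target output frame rate, e.g. 4 */
    int64_t next_slot;  /* index of the next output slot that still needs a frame */
} FrameDropper;

/* Returns 1 if the decoded frame with this pts should be sent to the encoder,
   0 if it should be dropped. Assumes pts >= 0 and increasing. */
static int should_keep_frame(FrameDropper *d, int64_t pts)
{
    assert(d->tb_num > 0 && d->tb_den > 0 && d->out_fps > 0);

    /* slot = floor(pts_in_seconds * out_fps) */
    int64_t slot = pts * d->tb_num * d->out_fps / d->tb_den;

    if (slot < d->next_slot)
        return 0;            /* this slot already has a frame: drop */

    d->next_slot = slot + 1; /* first frame of a new slot: keep */
    return 1;
}
```

With a 1/24 time base and a 4 fps target, this keeps the frames with pts 0, 6, 12 and so on, which is 8 frames out of 48 for two seconds of input. In the loop from the question, the check would sit between the decode call and the encode call.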

Related

FFMPEG library add padding the plane/frame processed

I am trying to create a filter to be part of FFmpeg. In the process I need to add padding around the frame so that the image is not resampled and simply ends up with the needed width and height. I know this is possible with libswscale/swscale.h, but I have not been able to find any example of how to do the padding for the plane that is being processed. Example code below:
    if (av_frame_is_writable(in)) {
        out = in;
    } else {
        out = ff_get_video_buffer(outlink, outlink->w, outlink->h);
        if (!out) {
            av_frame_free(&in);
            return AVERROR(ENOMEM);
        }
        av_frame_copy_props(out, in);
    }
    for (p = 0; p < filter->nb_planes; p++) {
        // did not find any documentation on how to set
        // these attributes to add padding to the plane
        filter->sws_ctx = sws_getContext(src_w, src_h, src_pix_fmt,
                                         dst_w, dst_h, dst_pix_fmt,
                                         SWS_BILINEAR, NULL, NULL, NULL);
    }
There is no other way to do it inside the filter: the functionality has to be implemented along the lines of the vf_pad filter.
Credit: #durandal_1707 from #ffmpeg IRC
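To make the idea of padding a plane without resampling concrete, here is a minimal sketch in plain C. It is not the vf_pad implementation and uses no FFmpeg calls; `pad_plane` and all of its parameters are hypothetical. It assumes an 8-bit plane, strides given in bytes, and a source that fits entirely inside the destination at offset `(x, y)`.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not FFmpeg API: copies a src_w x src_h 8-bit plane into
   a dst_w x dst_h plane at offset (x, y) and fills the border with `fill`.
   No scaling is performed, so the source pixels are copied unchanged. */
static void pad_plane(uint8_t *dst, int dst_stride, int dst_w, int dst_h,
                      const uint8_t *src, int src_stride, int src_w, int src_h,
                      int x, int y, uint8_t fill)
{
    assert(x >= 0 && y >= 0 && x + src_w <= dst_w && y + src_h <= dst_h);

    for (int row = 0; row < dst_h; row++) {
        uint8_t *d = dst + (size_t)row * dst_stride;

        /* Start with a full row of border colour. */
        memset(d, fill, (size_t)dst_w);

        /* Rows that intersect the source get the source pixels copied in. */
        if (row >= y && row < y + src_h)
            memcpy(d + x, src + (size_t)(row - y) * src_stride, (size_t)src_w);
    }
}
```

For subsampled YUV formats the chroma planes would need their own scaled width, height and offset, and a neutral fill value such as 128 rather than 0.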

How to concatenate multiple MP4 videos into one MP4 video in C#

I'm writing a program which will concatenate multiple MP4 files into one file. The problem is that the merged file (m.mp4) only shows the last video. The code I am using is given below.
    FileStream fs = new FileStream("m.mp4", FileMode.Append);
    for (i = 1; i <= 3; i++)
    {
        // m1, m2, m3 are MP4 video files on my disk
        var bytes = File.ReadAllBytes("m" + i + ".mp4");
        fs.Write(bytes, 0, bytes.Length);
    }
    Console.WriteLine("Done!!");

How can I get screenshot from all displays with X11?

I was working on writing a screenshot thing, and found this excellent topic for Mac: How can I get screenshot from all displays on MAC?
I was wondering if anyone has the equivalent for x11 library? To get all the monitors and then screenshot them all?
I had found this topic: https://stackoverflow.com/a/5293559/1828637
But the code linked from there is not as easy to follow for a novice like me.
Will RootWindow(3) get the area of all the monitors combined? Then I could go through and get each monitor's dimensions, and XGetImage those sections from what RootWindow returns?
I had come across this topic: How do take a screenshot correctly with xlib? But I'm not sure if it has multi-monitor support. I do this in ctypes, so I can't test that code easily without going through the grueling task of writing it first. So I was wondering if this is correct, or how I would modify it to handle multiple monitors?
Edit
The poster there shared his code, which can be seen here: https://github.com/Lalaland/ScreenCap/blob/master/src/screenCapturerImpl.cpp#L96 but it's complicated and I don't understand it. It uses functions like XFixesGetCursorImage which I can't find in the documentation, and I don't see how multiple monitors are handled there. The author of that topic warned that he doesn't remember the code and that it may not work with modern Linux.
This is not a perfect answer to the question (the code uses the Windows DXGI desktop duplication API rather than X11), but the following code could be modified to get a very fast version of your desired end result:
https://github.com/Clodo76/vr-desktop-mirror/blob/master/DesktopCapture/main.cpp
The DesktopCapturePlugin_Initialize method converts all the displays into objects:
    UNITY_INTERFACE_EXPORT void UNITY_INTERFACE_API DesktopCapturePlugin_Initialize()
    {
        DesksClean();
        g_needReinit = 0;
        IDXGIFactory1* factory;
        CreateDXGIFactory1(__uuidof(IDXGIFactory1), reinterpret_cast<void**>(&factory));
        IDXGIAdapter1* adapter;
        for (int i = 0; (factory->EnumAdapters1(i, &adapter) != DXGI_ERROR_NOT_FOUND); ++i)
        {
            IDXGIOutput* output;
            for (int j = 0; (adapter->EnumOutputs(j, &output) != DXGI_ERROR_NOT_FOUND); j++)
            {
                DXGI_OUTPUT_DESC outputDesc;
                output->GetDesc(&outputDesc);
                MONITORINFOEX monitorInfo;
                monitorInfo.cbSize = sizeof(MONITORINFOEX);
                GetMonitorInfo(outputDesc.Monitor, &monitorInfo);
                // Maybe in future add a function to identify the primary monitor.
                //if (monitorInfo.dwFlags == MONITORINFOF_PRIMARY)
                {
                    int iDesk = DeskAdd();
                    g_desks[iDesk].g_width = monitorInfo.rcMonitor.right - monitorInfo.rcMonitor.left;
                    g_desks[iDesk].g_height = monitorInfo.rcMonitor.bottom - monitorInfo.rcMonitor.top;
                    auto device = g_unity->Get<IUnityGraphicsD3D11>()->GetDevice();
                    IDXGIOutput1* output1;
                    output1 = reinterpret_cast<IDXGIOutput1*>(output);
                    output1->DuplicateOutput(device, &g_desks[iDesk].g_deskDupl);
                }
                output->Release();
            }
            adapter->Release();
        }
        factory->Release();
    }
Then the OnRenderEvent method copies a frame from each display into a texture (provided by Unity in this case):
    void UNITY_INTERFACE_API OnRenderEvent(int eventId)
    {
        for (int iDesk = 0; iDesk < g_nDesks; iDesk++)
        {
            if (g_desks[iDesk].g_deskDupl == nullptr || g_desks[iDesk].g_texture == nullptr)
            {
                g_needReinit++;
                return;
            }
            IDXGIResource* resource = nullptr;
            const UINT timeout = 0; // ms
            HRESULT resultAcquire = g_desks[iDesk].g_deskDupl->AcquireNextFrame(timeout, &g_desks[iDesk].g_frameInfo, &resource);
            if (resultAcquire != S_OK)
            {
                g_needReinit++;
                return;
            }
            g_desks[iDesk].g_isPointerVisible = (g_desks[iDesk].g_frameInfo.PointerPosition.Visible == TRUE);
            g_desks[iDesk].g_pointerX = g_desks[iDesk].g_frameInfo.PointerPosition.Position.x;
            g_desks[iDesk].g_pointerY = g_desks[iDesk].g_frameInfo.PointerPosition.Position.y;
            ID3D11Texture2D* texture;
            HRESULT resultQuery = resource->QueryInterface(__uuidof(ID3D11Texture2D), reinterpret_cast<void**>(&texture));
            resource->Release();
            if (resultQuery != S_OK)
            {
                g_needReinit++;
                return;
            }
            ID3D11DeviceContext* context;
            auto device = g_unity->Get<IUnityGraphicsD3D11>()->GetDevice();
            device->GetImmediateContext(&context);
            context->CopyResource(g_desks[iDesk].g_texture, texture);
            g_desks[iDesk].g_deskDupl->ReleaseFrame();
        }
        g_needReinit = 0;
    }

DirectShow Custom Source Pin

I am new to DirectShow. I, like many others, am trying to create a socket-based P2P streaming solution for a WPF-based card game. I want each player to be able to see each other via small video windows.
My questions are two-fold. The first is: how do I lower the frame sample rate and resolution? I believe 320x200 at 15 to 20 fps should be fine. I am using the SampleGrabber callback to grab frame data and send it over the socket, which is actually working with no compression at 640x480 resolution.
My second question is, since each frame contains 921,600 bytes, this really bogs down and I get very slow rendering just across my local WiFi connected LAN. I added a simple MJPEG compression (wanting to switch to h.264 later) and I noticed the bytes drop to around 330-360k. Not a bad improvement.
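The 921,600-byte figure is consistent with 640 x 480 pixels at 3 bytes per pixel (24-bit RGB, which is an assumption here). The arithmetic behind the bandwidth problem can be sketched in a few lines of C; the function names below are hypothetical and the result ignores any protocol overhead.

```c
#include <assert.h>
#include <stdint.h>

/* Size in bytes of one uncompressed frame. */
static uint64_t raw_frame_bytes(uint32_t width, uint32_t height, uint32_t bytes_per_pixel)
{
    assert(bytes_per_pixel > 0);
    return (uint64_t)width * height * bytes_per_pixel;
}

/* Payload bit rate needed to send frames of the given size at the given rate. */
static uint64_t stream_bits_per_second(uint64_t bytes_per_frame, uint32_t fps)
{
    return bytes_per_frame * 8u * fps;
}
```

For example, `raw_frame_bytes(640, 480, 3)` gives 921,600 bytes, and at 15 fps that is 110,592,000 bits per second. At 320 x 200 with the same depth and rate, the uncompressed stream would need 23,040,000 bits per second.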
On the receiving end, do I need to create a custom DirectShow source pin in order to serve up the bytes received from the socket, so that I can attach a decoder and render the bytes in a window?
I just wanted to ask this first since it seems like a lot of work to create a new COM object (haven't done that in about 15 years!), register it, and use/debug it.
Is there perhaps another way?
Also if that is the way to go, should I use a SampleGrabber on the receiving end and create a BitmapSource from the decompressed bytes, or should I allow DirectShow to create a child window? Thing is, I want to have more than one other player and I set an extra byte in the socket to tell what table position they are in. How do I render each position in turn?
For those that are interested, here is how you set the resolution and add an encoder/compressor:
    // Create a graph builder
    int hr = captureGraphBuilder.SetFiltergraph(graphBuilder);
    // Find a capture device (WebCam) and attach it to the graph
    sourceFilter = FindCaptureDevice();
    hr = graphBuilder.AddFilter(sourceFilter, "Video Capture");
    // Get the source output pin
    IPin sourcePin = DsFindPin.ByDirection((IBaseFilter)sourceFilter, PinDirection.Output, 0);
    IAMStreamConfig sc = (IAMStreamConfig)sourcePin;
    int count;
    int size;
    sc.GetNumberOfCapabilities(out count, out size);
    VideoInfoHeader v;
    AMMediaType media2 = null;
    IntPtr memPtr = Marshal.AllocCoTaskMem(size);
    for (int i = 0; i < count; ++i)
    {
        sc.GetStreamCaps(i, out media2, memPtr);
        v = (VideoInfoHeader)Marshal.PtrToStructure(media2.formatPtr, typeof(VideoInfoHeader));
        // Break when width is 160
        if (v.BmiHeader.Width == 160)
            break;
    }
    // Set the new media format to 160 x 120
    hr = sc.SetFormat(media2);
    Marshal.FreeCoTaskMem(memPtr);
    DsUtils.FreeAMMediaType(media2);
    // Create a frame grabber
    IBaseFilter grabberF = (IBaseFilter)new SampleGrabber();
    ISampleGrabber grabber = (ISampleGrabber)grabberF;
    // Set the media type
    var media = new AMMediaType
    {
        majorType = MediaType.Video,
        subType = MediaSubType.MJPG
        //subType = MediaSubType.RGB24
    };
    // The media sub type will be MJPG
    hr = grabber.SetMediaType(media);
    DsUtils.FreeAMMediaType(media);
    hr = grabber.SetCallback(this, 1);
    hr = graphBuilder.AddFilter(grabberF, "Sample Grabber");
    IPin grabberPin = DsFindPin.ByDirection(grabberF, PinDirection.Input, 0);
    // Get the MJPEG compressor
    Guid iid = typeof(IBaseFilter).GUID;
    object compressor = null;
    foreach (DsDevice device in DsDevice.GetDevicesOfCat(FilterCategory.VideoCompressorCategory))//.MediaEncoderCategory))
    {
        if (device.Name == "MJPEG Compressor")
        {
            device.Mon.BindToObject(null, null, ref iid, out compressor);
            hr = graphBuilder.AddFilter((IBaseFilter)compressor, "Compressor");
            break;
        }
        string name = device.Name;
    }
    // This also works!
    //IBaseFilter enc = (IBaseFilter)new MJPGEnc();
    //graphBuilder.AddFilter(enc, "MJPEG Encoder");
    // Get the input and output pins of the compressor
    IBaseFilter enc = (IBaseFilter)compressor;
    IPin encPinIn = DsFindPin.ByDirection(enc, PinDirection.Input, 0);
    IPin encPinOut = DsFindPin.ByDirection(enc, PinDirection.Output, 0);
    // Attach the pins: source to input, output to grabber
    hr = graphBuilder.Connect(sourcePin, encPinIn);
    hr = graphBuilder.Connect(encPinOut, grabberPin);
    // Free the pin resources
    Marshal.ReleaseComObject(sourcePin);
    Marshal.ReleaseComObject(enc);
    Marshal.ReleaseComObject(encPinIn);
    Marshal.ReleaseComObject(encPinOut);
    Marshal.ReleaseComObject(grabberPin);
    // Create a render stream
    hr = captureGraphBuilder.RenderStream(PinCategory.Preview, MediaType.Video, sourceFilter, null, grabberF);
    Marshal.ReleaseComObject(sourceFilter);
    Configure(grabber);

Image Analysis Program Based on Hashcode Method Resulting in Errors

I am trying to write a program that will recognize an image on the screen, compare it against a resource library, and then calculate based on the result of the image source.
The first thing that I did was to create the capture screen function which looks like this:
    private Bitmap Screenshot()
    {
        System.Drawing.Bitmap Table = new System.Drawing.Bitmap(88, 40, PixelFormat.Format32bppArgb);
        // Draw into the bitmap created above (the original referred to an undeclared RouletteTable)
        System.Drawing.Graphics g = System.Drawing.Graphics.FromImage(Table);
        g.CopyFromScreen(1047, 44, 0, 0, Screen.PrimaryScreen.Bounds.Size);
        return Table;
    }
Then, I analyze this picture. The first method I used was to create two for loops and analyze both the bitmaps pixel by pixel. The problem with this method was time, it took a long time to complete 37 times. I looked around and found the convert to bytes and the convert to hash methods. This is the result:
    public enum CompareResult
    {
        ciCompareOk,
        ciPixelMismatch,
        ciSizeMismatch
    };
    public CompareResult Compare(Bitmap bmp1, Bitmap bmp2)
    {
        CompareResult cr = CompareResult.ciCompareOk;
        //Test to see if we have the same size of image
        if (bmp1.Size != bmp2.Size)
        {
            cr = CompareResult.ciSizeMismatch;
        }
        else
        {
            //Convert each image to a byte array
            System.Drawing.ImageConverter ic = new System.Drawing.ImageConverter();
            byte[] btImage1 = new byte[1];
            btImage1 = (byte[])ic.ConvertTo(bmp1, btImage1.GetType());
            byte[] btImage2 = new byte[1];
            btImage2 = (byte[])ic.ConvertTo(bmp2, btImage2.GetType());
            //Compute a hash for each image
            SHA256Managed shaM = new SHA256Managed();
            byte[] hash1 = shaM.ComputeHash(btImage1);
            byte[] hash2 = shaM.ComputeHash(btImage2);
            for (int i = 0; i < hash1.Length && i < hash2.Length && cr == CompareResult.ciCompareOk; i++)
            {
                if (hash1[i] != hash2[i])
                    cr = CompareResult.ciPixelMismatch;
            }
        }
        return cr;
    }
After I analyze the two bitmaps in this function, I call it in my main form with the following:
    Bitmap Table = Screenshot();
    CompareResult success0 = Compare(Properties.Resources.Result0, Table);
    if (success0 == CompareResult.ciCompareOk)
    { double result = 0; Num.Text = result.ToString(); goto end; }
The problem I am getting is that once this has all been accomplished, I am always getting a cr value of ciPixelMismatch. I cannot get the images to match, even though the images are identical.
To give you a bit more background on the two bitmaps, they are approximately 88 by 40 pixels, and located at 1047, 44 on the screen. I wrote a part of the program to automatically take a picture of that area so I did not have to worry about the wrong location or size being captured:
Table.Save("table.bmp");
After I took the picture and saved it, I moved it from the bin folder in the project directly to the resource folder and ran the program again. Despite all of this, the result is still ciPixelMismatch. I believe the problem lies in the format the pictures are saved in: even though they are the same image, they may be analyzed in different formats, and perhaps one of the pictures contains a bit more information than the other, which causes the mismatch. Can somebody please help me solve this problem? I am just beginning with C# programming, I am 5 days into the learning process, and I am really at a loss here.
Yours sincerely,
Samuel
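One hedged way to test the suspicion above about file formats is to compare the decoded pixel data instead of the encoded image bytes. The sketch below shows the idea in plain C on raw 32-bit pixel buffers; `pixels_equal` is a hypothetical helper, and it assumes both images already have the same width, height and pixel layout. In C# the raw pixel memory and its per-row stride can be reached through `Bitmap.LockBits`. Whether this resolves the mismatch in the question is an assumption, not something confirmed here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 if two width x height images with 4 bytes per pixel hold identical
   pixel data, 0 otherwise. Strides are in bytes and may include row padding,
   which is skipped so that padding bytes never cause a false mismatch. */
static int pixels_equal(const uint8_t *a, int stride_a,
                        const uint8_t *b, int stride_b,
                        int width, int height)
{
    assert(width > 0 && height > 0);

    const size_t row_bytes = (size_t)width * 4;
    for (int y = 0; y < height; y++) {
        const uint8_t *row_a = a + (size_t)y * stride_a;
        const uint8_t *row_b = b + (size_t)y * stride_b;
        if (memcmp(row_a, row_b, row_bytes) != 0)
            return 0;
    }
    return 1;
}
```

Comparing whole rows with `memcmp` is typically much faster than two nested per-pixel loops, and for an 88 x 40 image it should make hashing unnecessary.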
