Is there any point in calling Invalidate(Region region) when using the double-buffering technique? In the Paint event I still have to draw everything, so might it be more efficient to just call Invalidate()?
You'd better implement the double buffering manually by using BufferedGraphicsContext, as explained in this MSDN article.
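One possible arrangement, as a rough sketch rather than the MSDN article's code: keep a persistent BufferedGraphics, redraw only the changed area into it, and invalidate just that area. The class name BufferedCanvas and the helpers DrawScene and UpdateArea are hypothetical; the BufferedGraphicsManager, BufferedGraphicsContext and BufferedGraphics calls are the standard System.Drawing ones.

```csharp
using System;
using System.Drawing;
using System.Windows.Forms;

// Hypothetical control that keeps its own off-screen buffer.
public class BufferedCanvas : Control
{
    private BufferedGraphics buffer;

    public BufferedCanvas()
    {
        // We do all painting ourselves from the off-screen buffer.
        SetStyle(ControlStyles.AllPaintingInWmPaint | ControlStyles.UserPaint, true);
    }

    // Hypothetical scene-drawing routine; replace with the real drawing code.
    private void DrawScene(Graphics g, Rectangle area)
    {
        g.FillRectangle(Brushes.White, area);
    }

    // Redraw only the changed area inside the persistent buffer,
    // then ask for a repaint of just that area.
    public void UpdateArea(Rectangle dirty)
    {
        if (buffer == null) return; // first paint will draw everything
        buffer.Graphics.SetClip(dirty);
        DrawScene(buffer.Graphics, dirty);
        buffer.Graphics.ResetClip();
        Invalidate(dirty);
    }

    protected override void OnPaint(PaintEventArgs e)
    {
        if (ClientSize.Width == 0 || ClientSize.Height == 0) return;
        if (buffer == null)
        {
            BufferedGraphicsContext context = BufferedGraphicsManager.Current;
            context.MaximumBuffer = new Size(Width + 1, Height + 1);
            buffer = context.Allocate(e.Graphics, ClientRectangle);
            DrawScene(buffer.Graphics, ClientRectangle);
        }
        // Copy the buffer to the screen; nothing is re-drawn here.
        buffer.Render(e.Graphics);
    }

    protected override void OnResize(EventArgs e)
    {
        base.OnResize(e);
        // Drop the buffer so it is re-allocated at the new size.
        if (buffer != null) { buffer.Dispose(); buffer = null; }
        Invalidate();
    }

    protected override void Dispose(bool disposing)
    {
        if (disposing && buffer != null) buffer.Dispose();
        base.Dispose(disposing);
    }
}
```

With a layout like this the Paint handler only copies pixels, so passing a rectangle or region to Invalidate can still save work, because only the dirty part is re-drawn into the buffer.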
The typical way to handle a button press in GTK is:
g_signal_connect(GTK_BUTTON(myButton), "pressed", G_CALLBACK(myButtonHandler), NULL);
However, I find it bad, slow, and unnecessary to use strings (such as "pressed") for internal identification. If I can find the numerical signal ID that corresponds to this, I can skip the parsing step. But how do I connect an event by ID rather than by string name? I did a lot of digging and found this, and I also learned that g_signal_connect is a macro that expands to g_signal_connect_data, but neither of these quite solves my problem.
Is this possible? If so, how do I do it?
You can use g_signal_connect_closure_by_id() but then you have to create a GClosure structure to hold your callback and callback data.
I would really recommend against this, as it will add boilerplate to your code for little benefit. You generally only connect signals once. If you are connecting a signal in a tight loop, then you are probably doing something wrong, or you have a very unusual use case. Anyway, signal names are actually interned, which means that you are not even incurring the cost of string comparisons; only the cost of splitting the string at :: if the signal has a detail annotation. Don't bother optimizing this unless it is actually showing up as a bottleneck on your profiler graphs.
I'm looking for a method to wait for the GPU to finish its work in DirectX9. Something equivalent to the glFinish command in OpenGL...
I already know that it's not something I should do, but I have to! I'm writing a threaded graphics engine integrated in WPF, and I need to do a kind of off-screen rendering in order to give a valid surface to a D3DImage. The frames take a long time to compute (more than 100 ms), and the rendering of the WPF Image sometimes occurs while the frame is not fully computed by my engine, even if I lock everything the right way. I'm almost sure it's just a Finish issue, but I couldn't find out how to do that.
So far, I tried to launch a DX9 query like this:
using SlimDX.Direct3D9;
public class GraphicsDevice: Device
{
    ...
    public void Finish()
    {
        var query = new Query(this, QueryType.Event);
        EndScene();
        while (!query.CheckStatus(true)) ;
    }
}
But it does not seem to work...
So, first question without talking about WPF, do you know how to wait for the GPU to finish what has been sent to the driver?
Thanks!
This was the solution.
I was not aware that it actually works!
I used an EventQuery to 'mark' my last call to the GPU.
Then I put some kind of infinite loop flushing the GPU instructions and waiting for the EventQuery to be finally fired by the GPU, using the GetData/CheckStatus methods.
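A minimal sketch of that description, assuming SlimDX's Direct3D9 wrapper and assuming that what the snippet in the question lacked was issuing the query before polling it. The class name GpuSync is hypothetical, and this has not been tested against the asker's engine.

```csharp
using System.Threading;
using SlimDX.Direct3D9;

public static class GpuSync
{
    // Blocks the calling thread until the GPU has reached a marker placed
    // after all commands submitted to `device` so far.
    public static void Finish(Device device)
    {
        using (var query = new Query(device, QueryType.Event))
        {
            // Place the marker at the end of the current command stream.
            query.Issue(Issue.End);

            // CheckStatus(true) asks the runtime to flush queued commands
            // on each poll; it reports true once the marker has been reached.
            while (!query.CheckStatus(true))
            {
                Thread.Sleep(0); // give up the rest of the time slice
            }
        }
    }
}
```

Creating the query once and reusing it every frame would avoid the per-call allocation; it is created inline here only to keep the sketch short.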
I am using the SharpDX.WPF project for the WPF abilities; it seems like an easy-to-understand, low-overhead library compared to the Toolkit that comes with SharpDX (which has the same issue!).
First: I fixed the SharpDX.WPF project for the latest SharpDX using the following: https://stackoverflow.com/a/19791534/442833
Then I made the following hacky adjustment to DXElement.cs, a solution that was also done here:
private Query queryForCompletion;
public void Render()
{
    if (Renderer == null || IsInDesignMode)
        return;
    var test = Renderer as D3D11;
    if (queryForCompletion == null)
    {
        queryForCompletion = new Query(test.Device,
            new QueryDescription {Type = QueryType.Event, Flags = QueryFlags.None});
    }
    Renderer.Render(GetDrawEventArgs());
    Surface.Lock();
    test.Device.ImmediateContext.End(queryForCompletion);
    // wait until drawing completes
    Bool completed;
    var counter = 0;
    while (!(test.Device.ImmediateContext.GetData(queryForCompletion, out completed)
             && completed))
    {
        Console.WriteLine("Yielding..." + ++counter);
        Thread.Yield();
    }
    //Surface.Invalidate();
    Surface.AddDirtyRect(new Int32Rect(0, 0, Surface.PixelWidth, Surface.PixelHeight));
    Surface.Unlock();
}
Then I render 8000 cubes in a cube pattern...
Yielding...
gets printed to the console quite often, but the flickering is still there.
I am assuming that WPF is nice enough to show the image using a different thread before the rendering is done, not sure though...
This same issue also happens when I use the Toolkit variant of WPF support with SharpDX.
Images to demonstrate the issue:
Bad
Better
Almost
Intended
Note: It switches randomly between these images. I am also using really old hardware, which makes the flickering much more apparent (GeForce Quadro FX 1700).
I made a repo which contains the exact same source code that I am using to get this issue:
https://github.com/ManIkWeet/FlickeringIssue/
Related to D3DImage locking, note that the D3DImage.TryLock API has rather unconventional semantics which most developers would not expect:
Beware!
You must call Unlock even in the case where TryLock indicates failure (i.e., returns false)
Although perhaps more of an alarming design choice than a bug per se, misunderstanding this behavior will trivially result in D3DImage deadlocks and hangs, and thus might be responsible for much of the frustration people experience in attempting to get D3DImage working properly.
The following code is a correct WPF D3D render with no flicker in my app:
void WPF_D3D_render(IntPtr pSurface)
{
    if (TryLock(new Duration(default(TimeSpan))))
    {
        SetBackBuffer(D3DResourceType.IDirect3DSurface9, pSurface);
        AddDirtyRect(new Int32Rect(0, 0, PixelWidth, PixelHeight));
    }
    Unlock(); // <--- !
}
Yes, this unintuitive code is actually correct; D3DImage.TryLock(0) leaks one internal D3D buffer lock every time it returns failure. You don't have to take my word for it; here's the CLR code from PresentationCore.dll v4.0.30319:
private bool LockImpl(Duration timeout)
{
    bool flag = false;
    if (_lockCount == uint.MaxValue)
        throw new InvalidOperationException();
    if (_lockCount == 0)
    {
        if (timeout == Duration.Forever)
            flag = _canWriteEvent.WaitOne();
        else
            flag = _canWriteEvent.WaitOne(timeout.TimeSpan, false);
        UnsubscribeFromCommittingBatch();
    }
    _lockCount++;
    return flag;
}
Notice that the internal _lockCount field is incremented regardless of whether the function returns success or failure. You have to call Unlock() yourself, as shown in the first code example above, if you want to avoid certain deadlock. Failing to do so is nasty to debug, too, because the component won't (potentially) deadlock until the next render pass, by which time the relevant evidence is long gone.
The unusual behavior does not seem to be mentioned at MSDN, but to be fair, that documentation doesn't note that you have to call Unlock() if the call is successful, either.
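The counter behaviour in the decompiled LockImpl can be reproduced with a toy model. This is not WPF code: LockCounterModel and its CanWrite flag are hypothetical stand-ins for D3DImage and its internal wait handle, and the Unlock body is an assumption (a plain decrement).

```csharp
using System;

// Toy model of the lock counter seen in LockImpl above; illustration only.
public class LockCounterModel
{
    public uint LockCount { get; private set; }

    // Stand-in for "_canWriteEvent is signalled".
    public bool CanWrite { get; set; }

    public bool TryLock()
    {
        bool acquired = false;
        if (LockCount == 0)
            acquired = CanWrite;   // only the outermost lock checks the event
        LockCount++;               // incremented on success AND on failure
        return acquired;
    }

    // Assumed behaviour: Unlock simply decrements the counter.
    public void Unlock()
    {
        if (LockCount == 0) throw new InvalidOperationException();
        LockCount--;
    }
}
```

In this model, a failed TryLock that is not followed by Unlock leaves LockCount at 1, so every later TryLock skips the event check and returns false even once CanWrite is true. Calling Unlock after the failure brings the counter back to 0 and the next attempt can succeed.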
The problem is not the locking mechanism. Normally you use Present to present the image, and Present waits until all drawing is finished. With D3DImage you are not using the Present() method. Instead of presenting, you lock, add a dirty rect, and unlock the D3DImage.
The rendering is done asynchronously, so when you unlock, the draw actions might not be finished. This causes the flicker effect; sometimes you see items half drawn. A poor solution (which I tested) is adding a small delay before unlocking. It helped a little, but it wasn't a neat solution. It was terrible!
Solution:
I continued with something else: I was experimenting with MSAA (antialiasing), and the first problem I faced was that MSAA cannot be done on the dx11/dx9 shared texture, so I decided to render to a new texture (dx11) and create a copy to the dx9 shared texture. I slammed my head on the table, because now it was anti-aliased AND flicker-free!! Don't forget to call Flush() before adding a dirty rect.
So, creating a copy of the texture: DXDevice11.Device.ImmediateContext.ResolveSubresource(_dx11RenderTexture, 0, _dx11BackpageTexture, 0, ColorFormat); (_dx11BackpageTexture is the shared texture) will wait until the rendering is finished and will create a copy.
This is how I got rid of the flickering....
I think you are not locking properly. As far as I understand the MSDN documentation, you are supposed to lock during the entire rendering, not just at the end of it:
While the D3DImage is locked, your application can also render to the Direct3D surface assigned to the back buffer.
The information you find on the net about D3DImage/SharpDX is somewhat confusing because the SharpDX guys don't really like the way D3DImage is implemented (can't blame them), so there are statements about this being a "bug" on Microsoft's side when it's actually just improper usage of the API.
Yes, locking during rendering has performance issues, but it is probably not possible to fix them without porting WPF to DirectX11 and implementing something like a SwapChainPanel, which is available in UWP apps. (WPF itself still runs on DirectX9.)
If the locking is a performance issue for you, one idea I had (but never tested) is that you could render to an offscreen surface and reduce the lock duration to copying that surface over to the D3DImage. No idea if that would help performance-wise, but it's something to try.
I'm currently working on a Worms game which involves terrain deformation. I used to do it with .GetData, modifying the color array, then using .SetData, but I looked into changing it to make the work done on the GPU instead (using RenderTargets).
All is going well with that, but I have come into another problem. My whole collision detection against the terrain was based on a Color array representing the terrain, but I do not have that color array anymore. I could use .GetData every time I modify the terrain to update my Color array, but that would defeat the purpose of my initial changes.
What I would be okay with is using GetData once at the beginning, and then modifying that array based on the changes I make to the terrain later on by some other means. I do not know how I would do this though, can anyone help?
I've done a bit of research, and I have yet to find a solution to getting rid of any GetData calls every time my terrain is modified, but I have found ways to "optimize" it, or at least reduce the GetData calls as much as possible.
Crater drawing is batched, meaning that rather than draw each one as it’s created, I add them to a list and draw all of them every few frames. This reduces the number of GetData calls – one per batch of craters rather than one per crater.
After drawing craters to the render target, I wait a few frames before calling GetData to make sure the GPU has processed all of the drawing commands. This minimizes pipeline stalls.
If I have a pending GetData call to make and more craters come in, the craters will stay batched until the GetData call is complete. In other words, the drawing and getting are synchronized so that a GetData call always happens several frames after drawing a batch of craters, and any new crater draw requests wait until after a pending GetData.
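The scheduling described in the three points above can be sketched without any XNA types. Everything here is hypothetical naming: Crater, CraterBatcher, and the two callbacks stand in for the real render-target drawing and the real GetData call.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical crater description; a real game would likely use Vector2.
public struct Crater
{
    public readonly int X, Y, Radius;
    public Crater(int x, int y, int radius) { X = x; Y = y; Radius = radius; }
}

public class CraterBatcher
{
    private readonly List<Crater> queued = new List<Crater>();
    private readonly int drawInterval;   // frames between batch draws
    private readonly int readbackDelay;  // frames between a draw and its readback (>= 1)
    private readonly Action<IList<Crater>> drawBatch; // e.g. draw to the render target
    private readonly Action readBack;    // e.g. call GetData and refresh the Color array
    private int framesSinceDraw;
    private int framesUntilReadback;     // 0 means no readback is pending

    public CraterBatcher(int drawInterval, int readbackDelay,
                         Action<IList<Crater>> drawBatch, Action readBack)
    {
        if (readbackDelay < 1) throw new ArgumentOutOfRangeException("readbackDelay");
        this.drawInterval = drawInterval;
        this.readbackDelay = readbackDelay;
        this.drawBatch = drawBatch;
        this.readBack = readBack;
    }

    public void Add(Crater crater)
    {
        queued.Add(crater);
    }

    // Call once per frame.
    public void Update()
    {
        if (framesUntilReadback > 0)
        {
            // A readback is pending: new craters stay queued until it is done.
            if (--framesUntilReadback == 0) readBack();
            return;
        }

        framesSinceDraw++;
        if (queued.Count > 0 && framesSinceDraw >= drawInterval)
        {
            drawBatch(queued.ToArray());
            queued.Clear();
            framesSinceDraw = 0;
            framesUntilReadback = readbackDelay;
        }
    }
}
```

Between a draw and its readback the collision array is stale by a few frames, which is the trade-off this scheme accepts in exchange for fewer pipeline stalls.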
If anyone else has any other suggestions I would still be glad to hear them.
Is there some reason that identical math operations would take significantly longer in one Silverlight app than in another?
For example, I have some code that takes a list of points and transforms them (scales and translates them) and populates another list of points. It's important that I keep the original points intact, hence the second list.
Here's the relevant code (scale is a double and origin is a point):
public Point transformPoint(Point point) {
    // scale, then translate the x
    point.X = (point.X - origin.X) * scale;
    // scale, then translate the y
    point.Y = (point.Y - origin.Y) * scale;
    // return the point
    return point;
}
Here's how I'm doing the loop and timing, in case it's important:
DateTime startTime = DateTime.Now;
foreach (Point point in rawPoints) transformedPoints.Add(transformPoint(point));
Debug.Print("ASPX milliseconds: {0}", (DateTime.Now - startTime).Milliseconds);
On a run of 14356 points (don't ask, it's modeled off a real world number in the desktop app), the breakdown is as follows:
Silverlight app #1: 46 ms
Silverlight app #2: 859 ms
The first app is an otherwise empty app that is doing the loop in the MainPage constructor. The second is doing the loop in a method in another class, and the method is called during an event handler in the GUI thread, I think. But should any of that matter, considering that identical operations are happening within the loop itself?
There may be something huge I'm missing in how threading works or something, but this discrepancy doesn't make sense to me at all.
In addition to the other comments and answers I'm going to read between the lines a little.
In the first app you have pretty much this code in isolation running in the MainPage constructor. IOW you've created a fresh Silverlight app and slapped this code in it, and that's it.
In the second app you have more actual real-world stuff. At the very least you have this code running as the result of a button click on a rudimentary UI. Therein lies the clue.
Take a blank app and drop a button on it. Run it and click the button; what does the button do? There are animations attached to the visual states of the button. These animations (or other animations or loops) are likely running in parallel with your code when you click the button. Timers (whether you do it properly with Stopwatch or not) record elapsed time, not just the time your thread takes. Hence when other threads are doing other things (like animations), your timing will be off.
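For reference, a minimal sketch of the Stopwatch-based timing mentioned above, assuming a runtime where System.Diagnostics.Stopwatch is available. Point2 and TransformTiming are hypothetical stand-ins for the question's Point and loop. Note that it still measures elapsed wall-clock time, so the caveat about other activity applies unchanged.

```csharp
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical stand-in for System.Windows.Point, to keep the sketch self-contained.
public struct Point2
{
    public double X, Y;
    public Point2(double x, double y) { X = x; Y = y; }
}

public static class TransformTiming
{
    // Same arithmetic as the question's transformPoint; structs are copied,
    // so the caller's original point is left untouched.
    public static Point2 Transform(Point2 p, Point2 origin, double scale)
    {
        return new Point2((p.X - origin.X) * scale, (p.Y - origin.Y) * scale);
    }

    // Returns the elapsed wall-clock time of the whole loop in milliseconds.
    public static double TimeTransform(IList<Point2> raw, List<Point2> output,
                                       Point2 origin, double scale)
    {
        Stopwatch sw = Stopwatch.StartNew();
        foreach (Point2 p in raw)
            output.Add(Transform(p, origin, scale));
        sw.Stop();
        // TotalMilliseconds is the full duration; TimeSpan.Milliseconds is only
        // the 0-999 component and would wrap for runs longer than one second.
        return sw.Elapsed.TotalMilliseconds;
    }
}
```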
My first suspicion would be that Silverlight App #2 triggers a garbage collection. Scaling ~15,000 points should be taking a millisecond, not nearly a second.
Try to reduce memory allocations in your code. Can transformedPoints be an array, rather than a dynamically grown data structure?
You can also look at the GC performance counters, but simply reducing the memory allocation may turn out to be simpler.
Could it be possible your code is not being inlined in the CLR by the app that is running slower?
I'm not sure how the CLR in SL handles inlining, but here is a link to some of the prerequisites for inlining in 3.5 SP1.
http://udooz.net/blog/2009/04/clr-improvements-in-net-35-sp1/