How to wait for the GPU to finish its work in DirectX9? - directx-9

I'm looking for a method to wait for the GPU to finish its work in DirectX9. Something equivalent to the glFinish command in OpenGL...
I already know that it's not something I should do, but I have to! I'm writing a threaded Graphics Engine integrated in WPF and I need to make sort of an off-screen rendering in order to give a valid surface to a D3DImage. The frames are very long to compute (more than 100ms) and the rendering of the WPF Image sometimes occurs while the frame is not fully computed by my Engine even if I lock everything the right way. I'm almost sure it's just a Finish issue but I didn't find out how to do that.
So far, I tried to launch a DX9 query like this :
using namespace SlimDX.Direct3D9;
public class GraphicsDevice: Device
{
...
public void Finish()
{
var query = new Query(this, QueryType.Event);
EndScene();
while (!query.CheckStatus(true)) ;
}
}
But it does not seem to work...
So, first question without talking about WPF, do you know how to wait for the GPU to finish what has been sent to the driver?
Thanks!

This was the solution.
I was not aware that it actually work!
I used an EventQuery to 'mark' my last call to the GPU.
Then I put some kind of infinite loop flushing the GPU instructions and waiting for the EventQuery to be finally fired by the GPU, using the GetData/CheckStatus methods.

Related

Having to use DelayedTask otherwise LoadMask doesn't show

I am doing a few processes that take time so I want to be able to show a mask, I use LoadMask. Problem is that the code that I want to run seems to run too quick and I presume is blocking the UI to show the mask. The process takes around 4 seconds so I know the mask isn't being enabled and disabled at the same time.
The way I got around this was to use a delayed task, but it kind of feels like a code smell.
Can I do this a different way?
Here is what I have
var myMask = new Ext.LoadMask(win, {});
myMask.show();
var task = new Ext.util.DelayedTask(function () {
....... // DO MY stuff and eventually do a myMask.hide()
task.delay(400);
It works great but I was wondering if this is the right way to go ?
If I remove the delayedtaks then the mask never seems to display, but the process does work.
Assuming your "process" is some kind of local sequential procedure, what you said above is pretty much correct. If your code is running a busy loop it won't relinquish control to the UI "thread" for the browser to redraw.
So you won't see the mask here;
mask();
busyLoop();
unmask();
But if you do:
mask();
setTimeout(function() {
busyLoop();
unmask();
}, 1);
It gives the browser time to draw before you get into your stuff.

SetWindowPos/ShowWindow with a timeout

I'm using the SetWindowPos function for an automation task to show a window. I know that there are two ways that Windows provides to do this:
Synchronously: SetWindowPos or ShowWindow.
Asynchronously: SetWindowPos with SWP_ASYNCWINDOWPOS or ShowWindowAsync.
Now, I'd like to get the best of both worlds: I want to be able to show the window synchronously, because I'd like it to be done when the function returns. But I don't want the call to hang my process - if it takes too long, I want to be able to abort the call.
Now, while looking for an answer, the only thing I could come up with is using a separate thread and using SendMessageTimeout, but even then, if the thread hangs, there's not much I can do to end it except of TerminateProcess, which is not a clean solution.
I also have seen this answer, but as far as I understand, it has no alternative for native WinAPI.
The answer in the question you linked to simply loops until either the desired condition occurs or the timeout expires. It uses Sleep() every iteration to avoid hogging the processor. So a version for WinAPI can be written quite simply, as follows:
bool ShowWindowAndWait(HWND hWnd, DWORD dwTimeout) {
if (IsWindowVisible(hWnd)) return true;
if (!ShowWindowAsync(hWnd, SW_SHOW)) return false;
DWORD dwTick = GetTickCount();
do {
if (IsWindowVisible(hWnd)) return true;
Sleep(15);
} while (dwTimeout != 0 && GetTickCount() - dwTick < dwTimeout);
return false;
}
Unfortunately I think this is the best you're going to get. SendMessageTimeout can't actually be used for this purpose because (as far as I know anyway) there's no actual message you could send with it that would cause the target window to be shown. ShowWindowAsync and SWP_ASYNCWINDOWPOS both work by scheduling internal window events, and this API isn't publicly exposed.

D3DImage and SharpDX flickering on slow hardware

I am using the SharpDX.WPF project for the WPF abilities, it seems like an easy to understand low-overhead library, compared to the Toolkit that comes with SharpDX (which has the same issue!)
First: I fixed the SharpDX.WPF project for the latest SharpDX using the following: https://stackoverflow.com/a/19791534/442833
Then I made the following hacky adjustment to DXElement.cs, a solution that was also done here:
private Query queryForCompletion;
public void Render()
{
if (Renderer == null || IsInDesignMode)
return;
var test = Renderer as D3D11;
if (queryForCompletion == null)
{
queryForCompletion = new Query(test.Device,
new QueryDescription {Type = QueryType.Event, Flags = QueryFlags.None});
}
Renderer.Render(GetDrawEventArgs());
Surface.Lock();
test.Device.ImmediateContext.End(queryForCompletion);
// wait until drawing completes
Bool completed;
var counter = 0;
while (!(test.Device.ImmediateContext.GetData(queryForCompletion, out completed)
&& completed))
{
Console.WriteLine("Yielding..." + ++counter);
Thread.Yield();
}
//Surface.Invalidate();
Surface.AddDirtyRect(new Int32Rect(0, 0, Surface.PixelWidth, Surface.PixelHeight));
Surface.Unlock();
}
Then I render 8000 cubes in a cube pattern...
Yielding...
gets printed to the console quite often, but the flickering is still there.
I am assuming that WPF is nice enough to show the image using a different thread before the rendering is done, not sure though...
This same issue also happens when I use the Toolkit variant of WPF support with SharpDX.
Images to demonstate the issue:
Bad
Better
Almost
Intended
Note: It randomly switches between these old images, randomly. I am also using really old hardware which makes the flickering much more appearant (GeForce Quadro FX 1700)
A made a repo which contains the exact same source-code as I am using to get this issue:
https://github.com/ManIkWeet/FlickeringIssue/
Related to D3DImage locking, note that the D3DImage.TryLock API has rather unconventional semantics which most developers would not expect:
Beware!
You must call Unlock even in the case where TryLock indicates failure (i.e., returns false)
Although perhaps more of an alarming design choice than a bug per se, misunderstanding this behavior will trivially result in D3DImage deadlocks and hangs, and thus might be responsible for much of the frustration people experience in attempting to get D3DImage working properly.
The following code is a correct WPF D3D render with no flicker in my app:
void WPF_D3D_render(IntPtr pSurface)
{
if (TryLock(new Duration(default(TimeSpan))))
{
SetBackBuffer(D3DResourceType.IDirect3DSurface9, pSurface);
AddDirtyRect(new Int32Rect(0, 0, PixelWidth, PixelHeight));
}
Unlock(); // <--- !
}
Yes, this unintuitive code is actually correct; it is the case that that D3DImage.TryLock(0) leaks one internal D3D buffer lock every time it returns failure. You don't have to take my word for it, here's the CLR code from PresentationCore.dll v4.0.30319:
private bool LockImpl(Duration timeout)
{
bool flag = false;
if (_lockCount == uint.MaxValue)
throw new InvalidOperationException();
if (_lockCount == 0)
{
if (timeout == Duration.Forever)
flag = _canWriteEvent.WaitOne();
else
flag = _canWriteEvent.WaitOne(timeout.TimeSpan, false);
UnsubscribeFromCommittingBatch();
}
_lockCount++;
return flag;
}
Notice that the internal _lockCount field is incremented regardless of whether the function returns success or failure. You have to call Unlock() yourself, as shown in the first code example above, if you want to avoid certain deadlock. Failing to do so creates is nasty to debug, too, because the component won't (potentially) deadlock until the next render pass, by which time the relevant evidence is long gone.
The unusual behavior does not seem to be mentioned at MSDN, but to be fair, that documentation doesn't note that you have to call Unlock() if the call is successful, either.
The problem is not the Locking mechanism. Normally you use Present to draw to present the image. Present will wait until all drawing is ready. With D3DImage you are not using the Present() method. Instead of Presenting, you lock, adding a DirtyRect and unlock the D3DImage.
The rendering is done asynchrone so when you are unlocking, the draw actions might not be ready. This is causing the flicker effect. Sometimes you see items half drawn. A poor solution (i've tested with) is adding a small delay before unlocking. It helped a little, but it wasn't a neat solution. It was terrible!
Solution:
I continued with something else; I was expirimenting with MSAA (antialiasing) and the first problem I faced was; MSAA cannot be done on the dx11/dx9 shared texture, so i decided to render to a new texture (dx11) and create a copy to the dx9 shared texture. I slammed my head on the tabel, because now it was anti-aliased AND flicking-free!! Don't forget to call Flush() before adding a dirty rect.
So, creating a copy of the texture: DXDevice11.Device.ImmediateContext.ResolveSubresource(_dx11RenderTexture, 0, _dx11BackpageTexture, 0, ColorFormat); (_dx11BackpageTexture is shared texture) will wait until the rendering is ready and will create a copy.
This is how I got rid of the flickering....
I think you are not locking properly. As far as I understand the MSDN documentation you are supposed to lock during the entire rendering not just at the end of it:
While the D3DImage is locked, your application can also render to the Direct3D surface assigned to the back buffer.
The information you find on the net about D3DImage/SharpDX is somewhat confusing because the SharpDX guys don't really like the way D3DImage is implemented (can't blame them), so there are statements about this being a "bug" on Microsofts side when its actually just improper usage of the API.
Yes, locking during rendering has performance issues, but it is probably not possible to fix them without porting WPF to DirectX11 and implementing something like a SwapChainPanel which is available in UWP apps. (WPF itself still runs on DirectX9)
If the locking is a performance issue for you, one idea I had (but never tested) is that you could render to an offscreen surface and reduce the lock duration to copying that surface over to the D3DImage. No idea if that would help performance wise but its something to try.

On creating expensive WPF objects and multithreading

The classic advice in multithreading programing is to do processor heavy work on a background thread and return the result to the UI thread for minor processing (update a label, etc). What if generating the WPF element itself is the operation which is expensive?
I'm working with a third party library which generates some intense elements, which can take around to 0.75s - 1.5s to render. Generating one isn't too bad, but when I need to create 5 of them to show at once it noticeably locks the UI (including progress spinners). Unfortunately, there isn't any other place to create them because WPF is thread affine.
I've already tried DispatcherPriority.Background but its not enough. What is the recommended way to deal with this problem?
If the objects being created derived from Freezable, then you can actually create them on a different thread than the UI thread - you just have to call Freeze on them while you're on the worker thread, and then you can transfer them over. However, that doesn't help you for items that don't derive from Freezable.
Have you tried creating them one at a time? The following example doesn't do any useful work but it does show how the basic structure for doing a lot of work in little bits:
int count = 100;
Action slow = null;
slow = delegate
{
Thread.Sleep(100);
count -= 1;
if (count > 0)
{
Dispatcher.BeginInvoke(slow, DispatcherPriority.Background);
}
};
Dispatcher.BeginInvoke(slow, DispatcherPriority.Background);
The 'work' here is to sleep for a tenth of a second. (So if you replace that with real work that takes about as long, you'll get the same behaviour.) This does that 100 times, so that's a total of 10 seconds of 'work'. The UI remains reasonably responsive for the whole time - things like dragging the window around become a bit less smooth, but it's perfectly usable. Change both those Background priorities to Normal, and the application locks up.
The key here is that we end up returning after doing each small bit of work having queued up the next bit - we end up calling Dispatcher.BeginInvoke 100 times in all instead of once. That gives the UI a chance to respond to input on a regular basis.

silverlight math performance question

Is there some reason that identical math operations would take significantly longer in one Silverlight app than in another?
For example, I have some code that takes a list of points and transforms them (scales and translates them) and populates another list of points. It's important that I keep the original points intact, hence the second list.
Here's the relevant code (scale is a double and origin is a point):
public Point transformPoint(Point point) {
// scale, then translate the x
point.X = (point.X - origin.X) * scale;
// scale, then translate the y
point.Y = (point.Y - origin.Y) * scale;
// return the point
return point;
}
Here's how I'm doing the loop and timing, in case it's important:
DateTime startTime = DateTime.Now;
foreach (Point point in rawPoints) transformedPoints.Add(transformPoint(point));
Debug.Print("ASPX milliseconds: {0}", (DateTime.Now - startTime).Milliseconds);
On a run of 14356 points (don't ask, it's modeled off a real world number in the desktop app), the breakdown is as follows:
Silverlight app #1: 46 ms
Silverlight app #2: 859 ms
The first app is an otherwise empty app that is doing the loop in the MainPage constructor. The second is doing the loop in a method in another class, and the method is called during an event handler in the GUI thread, I think. But should any of that matter, considering that identical operations are happening within the loop itself?
There maybe something huge I'm missing in how threading works or something, but this discrepancy doesn't make sense to me at all.
In addition to the other comments and answers I'm going to read between the lines a little.
In the first app you have pretty much this code in isolation running in the MainPage constructor. IWO you've create a fresh Silverlight app and slapped this code in it and thats it.
In the second app you have more actual real world stuff. At the very least you have this code running as the result of a button click on a rudimentory UI. Therein lies the clue.
Take a blank app and drop a button on it. Run it and click the button, what does the button do? There are animations attached to visual states of the button. This animation (or other animations or loops) are likely running in parrallel with your code when you click the button. Timers (whether you do it properly with StopWatch or not) record elapsed time, not just the time your thread takes. Hence when other threads are doing other things (like animations) your timing will be off.
My first suspicion would be that Silverlight App #2 triggers a garbage collection. Scaling ~15,000 points should be taking a millisecond, not nearly a second.
Try to reduce memory allocations in your code. Can transformedPoints be an array, rather than a dynamically grown data structure?
You can also look at the GC performance counters, but simply reducing the memory allocation may turn out to be simpler.
Could it be possible your code is not being inlined in the CLR by the app that is running slower?
I'm not sure how the CLR in SL handles inlining, but here is a link to some of the prerequisites for inlining in 3.5 SP1.
http://udooz.net/blog/2009/04/clr-improvements-in-net-35-sp1/

Resources