The classic advice in multithreaded programming is to do processor-heavy work on a background thread and return the result to the UI thread for minor processing (update a label, etc.). What if generating the WPF element itself is the expensive operation?
I'm working with a third-party library which generates some intense elements, which can take around 0.75 s to 1.5 s each to render. Generating one isn't too bad, but when I need to create 5 of them to show at once it noticeably locks the UI (including progress spinners). Unfortunately, there isn't any other place to create them because WPF is thread affine.
I've already tried DispatcherPriority.Background but it's not enough. What is the recommended way to deal with this problem?
If the objects being created derive from Freezable, then you can actually create them on a different thread than the UI thread - you just have to call Freeze on them while you're on the worker thread, and then you can transfer them over. However, that doesn't help you for items that don't derive from Freezable.
Have you tried creating them one at a time? The following example doesn't do any useful work, but it does show the basic structure for doing a lot of work in little bits:
int count = 100;
Action slow = null;
slow = delegate
{
Thread.Sleep(100);
count -= 1;
if (count > 0)
{
Dispatcher.BeginInvoke(slow, DispatcherPriority.Background);
}
};
Dispatcher.BeginInvoke(slow, DispatcherPriority.Background);
The 'work' here is to sleep for a tenth of a second. (So if you replace that with real work that takes about as long, you'll get the same behaviour.) This does that 100 times, so that's a total of 10 seconds of 'work'. The UI remains reasonably responsive for the whole time - things like dragging the window around become a bit less smooth, but it's perfectly usable. Change both those Background priorities to Normal, and the application locks up.
The key here is that we end up returning after doing each small bit of work having queued up the next bit - we end up calling Dispatcher.BeginInvoke 100 times in all instead of once. That gives the UI a chance to respond to input on a regular basis.
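Applied to the original problem, the same pattern can build the expensive elements one per dispatcher pass. This is only a minimal sketch: StepScheduler is a hypothetical helper name, and the `post` delegate is an assumption that you supply yourself. In WPF it would typically wrap Dispatcher.BeginInvoke at Background priority.

```csharp
using System;
using System.Collections.Generic;

public static class StepScheduler
{
    // Runs each step as its own queued callback. After every step the
    // method returns to the caller's message loop, so input and rendering
    // get a turn before the next step starts.
    public static void RunOneAtATime(
        IEnumerable<Action> steps,
        Action<Action> post,          // assumption: queues a callback on the UI thread
        Action onCompleted = null)
    {
        IEnumerator<Action> remaining = steps.GetEnumerator();
        Action pump = null;
        pump = () =>
        {
            if (!remaining.MoveNext())
            {
                remaining.Dispose();
                if (onCompleted != null) onCompleted();
                return;
            }
            remaining.Current();      // build one expensive element here
            post(pump);               // queue the next step, then return
        };
        post(pump);
    }
}
```

In a window's code-behind, a call might look like `StepScheduler.RunOneAtATime(steps, a => Dispatcher.BeginInvoke(a, DispatcherPriority.Background));`, where each step creates one element and adds it to a panel. Each step still blocks the UI for as long as one element takes to build; the gain is that input and spinners get a turn between elements.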
In my project, I have a data provider that delivers data every 2 milliseconds. The following is the delegate method that receives the data.
func measurementUpdated(_ measurement: Double) {
measurements.append(measurement)
guard measurements.count >= 300 else { return }
ecgView.measurements = Array(measurements.suffix(300))
DispatchQueue.main.async {
self.ecgView.setNeedsDisplay()
}
guard measurements.count >= 50000 else { return }
let olderMeasurementsPrefix = measurements.count - 50000
measurements = Array(measurements.dropFirst(olderMeasurementsPrefix))
print("Measurement Count : \(measurements.count)")
}
What I am trying to do is, once the array has more than 50000 elements, delete the oldest measurements from the front of the array, for which I am using the dropFirst method of Array.
But, I am getting a crash with the following message:
Fatal error: Can't form Range with upperBound < lowerBound
I think the issue is due to threading: appending and deletion might happen at the same time, since the delegate fires at an interval of 2 milliseconds. Can you suggest an optimized way to resolve this issue?
So to really fix this, we need to first address two of your claims:
1) You said, in effect, that measurementUpdated() would be called on the main thread (you said both append and dropFirst would be called on the main thread). 2) You also said several times that measurementUpdated() would be called every 2 ms. You do not want to be calling a method every 2 ms on the main thread. The calls will pile up very quickly and their updates will be delayed, because the main thread also has UI work to do, and that always eats up time.
So first rule: measurementUpdated() should always be called on another thread. Keep it the same thread, though.
Second rule: The entire code path from whatever collects the data to when measurementUpdated() is called must also be on a non-main thread. It can be the same thread that measurementUpdated() runs on, but doesn't have to be.
Third rule: You do not need your UI graph to update every 2 ms. A 60 Hz display shows a new frame only about every 16.7 ms, so most of those redraws could never be seen, and for a chart like this an update every 150 ms or so is usually plenty. Also, the device's main thread will get totally bogged down trying to re-render as frequently as every 2 ms. I bet your graph UI can't even render a single pass in 2 ms! So let's give your main thread a break by only updating the graph every, say, 150 ms. Measure the current time in milliseconds and compare it against the last time you updated the graph from this routine.
Fourth rule: don't change any array (or any object) from two different threads without a mutex lock, as they'll sometimes collide (one thread will be trying to do an operation on it while another is too). An excellent article that covers the current Swift ways of doing mutex locks is Matt Gallagher's Mutexes and closure capture in Swift. It's a great read, and has both simple and advanced solutions and their tradeoffs.
One other suggestion: You're allocating or reallocating a few arrays every 2 ms. It's unnecessary, and adds undue stress on the memory pools under the hood, I'd think. I suggest not doing the append and dropFirst calls. Try rewriting such that you have a single array that holds 50,000 doubles and never changes size. Simply change values in the array, and keep 2 indexes so that you always know where the "start" and the "end" of the data set are within the array. That is, pretend the next array element after the last is the first array element (pretend the array loops around to the front). Then you're not churning memory at all, and it'll operate much quicker too. You can surely find Array extensions people have written to make this trivial to use. Every 150 ms you can copy the data into a second pre-allocated array in the correct order for your graph UI to consume, or just pass the two indexes to your graph UI if you own your graph UI and can adjust it to accommodate.
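The fixed-size, wrap-around array described above is a ring buffer. The idea does not depend on the language, so here is a minimal sketch in C#; the same structure carries over to Swift almost line for line. The RingBuffer class and its member names are hypothetical, and the class is not thread-safe by itself, so the locking rule above still applies.

```csharp
using System;

public sealed class RingBuffer
{
    private readonly double[] _items;
    private int _start;   // index of the oldest stored value
    private int _count;   // number of valid values, never more than capacity

    public RingBuffer(int capacity)
    {
        if (capacity <= 0) throw new ArgumentOutOfRangeException("capacity");
        _items = new double[capacity];
    }

    public int Count { get { return _count; } }

    // Stores a value without allocating. When the buffer is full,
    // the oldest value is overwritten.
    public void Add(double value)
    {
        int end = (_start + _count) % _items.Length;
        _items[end] = value;
        if (_count == _items.Length)
            _start = (_start + 1) % _items.Length;
        else
            _count++;
    }

    // Copies the newest values, oldest first, into a caller-owned array
    // and returns how many values were copied.
    public int CopyLatest(double[] destination)
    {
        int n = Math.Min(destination.Length, _count);
        int first = (_start + _count - n) % _items.Length;
        for (int i = 0; i < n; i++)
            destination[i] = _items[(first + i) % _items.Length];
        return n;
    }
}
```

For the ECG case this would mean one buffer of capacity 50,000 plus one pre-allocated 300-element array that is refilled with CopyLatest each time the graph is due for a redraw.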
I don't have time right now to write a code example that covers all of this (maybe someone else does), but I'll try to revisit this tomorrow. It'd actually be a lot better for you if you made a renewed stab at it yourself, and then asked a new question on Stack Overflow if you get stuck.
Update: As @Smartcat correctly pointed out, this solution can cause memory issues if the main thread is not fast enough to consume the arrays at the same pace the worker thread produces them.
The problem seems to be caused by ecgView's measurements property: you are writing to it on the thread receiving the data, while the view tries to read from it on the main thread, and simultaneous accesses to the same data from multiple threads are (unfortunately) likely to produce race conditions.
In conclusion, you need to make sure that both reads and writes happen on the same thread, which can easily be achieved by moving the setter call into the async dispatch:
let ecgViewMeasurements = Array(measurements.suffix(300))
DispatchQueue.main.async {
self.ecgView.measurements = ecgViewMeasurements
self.ecgView.setNeedsDisplay()
}
According to what you say, I will assume the delegate is calling the measurementUpdated method from a concurrent thread.
If that's the case, and the problem is really related to threading, this should fix your problem:
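// Create the queue once and reuse it. Creating a new DispatchQueue on every call
// would give each block its own queue, so nothing would be serialized.
private let measurementQueue = DispatchQueue(label: "MySerialQueue")
func measurementUpdated(_ measurement: Double) {
measurementQueue.async {
self.measurements.append(measurement)
guard self.measurements.count >= 300 else { return }
let latest = Array(self.measurements.suffix(300))
DispatchQueue.main.async {
// Touch the view only on the main thread
self.ecgView.measurements = latest
self.ecgView.setNeedsDisplay()
}
guard self.measurements.count >= 50000 else { return }
let olderMeasurementsPrefix = self.measurements.count - 50000
self.measurements = Array(self.measurements.dropFirst(olderMeasurementsPrefix))
print("Measurement Count : \(self.measurements.count)")
}
}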
This puts the code on a serial queue, which ensures that only one instance of this block runs at a time.
I'm looking for a method to wait for the GPU to finish its work in DirectX9. Something equivalent to the glFinish command in OpenGL...
I already know that it's not something I should do, but I have to! I'm writing a threaded Graphics Engine integrated in WPF and I need to make sort of an off-screen rendering in order to give a valid surface to a D3DImage. The frames are very long to compute (more than 100ms) and the rendering of the WPF Image sometimes occurs while the frame is not fully computed by my Engine even if I lock everything the right way. I'm almost sure it's just a Finish issue but I didn't find out how to do that.
So far, I tried to launch a DX9 query like this :
using SlimDX.Direct3D9;
public class GraphicsDevice: Device
{
...
public void Finish()
{
var query = new Query(this, QueryType.Event);
EndScene();
while (!query.CheckStatus(true)) ;
}
}
But it does not seem to work...
So, first question without talking about WPF, do you know how to wait for the GPU to finish what has been sent to the driver?
Thanks!
This was the solution.
I was not aware that it actually works!
I used an event query to 'mark' my last call to the GPU.
Then I added a loop that keeps flushing the GPU instructions and waits for the event query to finally be signaled by the GPU, using the GetData/CheckStatus methods.
I am using the SharpDX.WPF project for the WPF abilities, it seems like an easy to understand low-overhead library, compared to the Toolkit that comes with SharpDX (which has the same issue!)
First: I fixed the SharpDX.WPF project for the latest SharpDX using the following: https://stackoverflow.com/a/19791534/442833
Then I made the following hacky adjustment to DXElement.cs, a solution that was also done here:
private Query queryForCompletion;
public void Render()
{
if (Renderer == null || IsInDesignMode)
return;
var test = Renderer as D3D11;
if (queryForCompletion == null)
{
queryForCompletion = new Query(test.Device,
new QueryDescription {Type = QueryType.Event, Flags = QueryFlags.None});
}
Renderer.Render(GetDrawEventArgs());
Surface.Lock();
test.Device.ImmediateContext.End(queryForCompletion);
// wait until drawing completes
Bool completed;
var counter = 0;
while (!(test.Device.ImmediateContext.GetData(queryForCompletion, out completed)
&& completed))
{
Console.WriteLine("Yielding..." + ++counter);
Thread.Yield();
}
//Surface.Invalidate();
Surface.AddDirtyRect(new Int32Rect(0, 0, Surface.PixelWidth, Surface.PixelHeight));
Surface.Unlock();
}
Then I render 8000 cubes in a cube pattern...
Yielding...
gets printed to the console quite often, but the flickering is still there.
I am assuming that WPF is nice enough to show the image using a different thread before the rendering is done, not sure though...
This same issue also happens when I use the Toolkit variant of WPF support with SharpDX.
Images to demonstrate the issue:
Bad
Better
Almost
Intended
Note: It switches randomly between the images above. I am also using really old hardware, which makes the flickering much more apparent (GeForce Quadro FX 1700).
I made a repo which contains the exact same source code I am using to get this issue:
https://github.com/ManIkWeet/FlickeringIssue/
Related to D3DImage locking, note that the D3DImage.TryLock API has rather unconventional semantics which most developers would not expect:
Beware!
You must call Unlock even in the case where TryLock indicates failure (i.e., returns false)
Although perhaps more of an alarming design choice than a bug per se, misunderstanding this behavior will trivially result in D3DImage deadlocks and hangs, and thus might be responsible for much of the frustration people experience in attempting to get D3DImage working properly.
The following code is a correct WPF D3D render with no flicker in my app:
void WPF_D3D_render(IntPtr pSurface)
{
if (TryLock(new Duration(default(TimeSpan))))
{
SetBackBuffer(D3DResourceType.IDirect3DSurface9, pSurface);
AddDirtyRect(new Int32Rect(0, 0, PixelWidth, PixelHeight));
}
Unlock(); // <--- !
}
Yes, this unintuitive code is actually correct; D3DImage.TryLock(0) leaks one internal D3D buffer lock every time it returns failure. You don't have to take my word for it, here's the CLR code from PresentationCore.dll v4.0.30319:
private bool LockImpl(Duration timeout)
{
bool flag = false;
if (_lockCount == uint.MaxValue)
throw new InvalidOperationException();
if (_lockCount == 0)
{
if (timeout == Duration.Forever)
flag = _canWriteEvent.WaitOne();
else
flag = _canWriteEvent.WaitOne(timeout.TimeSpan, false);
UnsubscribeFromCommittingBatch();
}
_lockCount++;
return flag;
}
Notice that the internal _lockCount field is incremented regardless of whether the function returns success or failure. You have to call Unlock() yourself, as shown in the first code example above, if you want to avoid certain deadlock. Failing to do so is nasty to debug, too, because the component won't (potentially) deadlock until the next render pass, by which time the relevant evidence is long gone.
The unusual behavior does not seem to be mentioned at MSDN, but to be fair, that documentation doesn't note that you have to call Unlock() if the call is successful, either.
The problem is not the locking mechanism. Normally you call Present to show the image, and Present waits until all drawing is finished. With D3DImage you are not using the Present() method. Instead of presenting, you lock, add a dirty rect, and unlock the D3DImage.
The rendering is done asynchronously, so when you unlock, the draw actions might not be finished. This causes the flicker effect; sometimes you see items half drawn. A poor solution (which I've tested) is adding a small delay before unlocking. It helped a little, but it wasn't a neat solution. It was terrible!
Solution:
I continued with something else: I was experimenting with MSAA (anti-aliasing), and the first problem I faced was that MSAA cannot be done on the dx11/dx9 shared texture, so I decided to render to a new texture (dx11) and copy it to the dx9 shared texture. I slammed my head on the table, because now it was anti-aliased AND flicker-free!! Don't forget to call Flush() before adding a dirty rect.
So, creating a copy of the texture: DXDevice11.Device.ImmediateContext.ResolveSubresource(_dx11RenderTexture, 0, _dx11BackpageTexture, 0, ColorFormat); (_dx11BackpageTexture is shared texture) will wait until the rendering is ready and will create a copy.
This is how I got rid of the flickering....
I think you are not locking properly. As far as I understand the MSDN documentation you are supposed to lock during the entire rendering not just at the end of it:
While the D3DImage is locked, your application can also render to the Direct3D surface assigned to the back buffer.
The information you find on the net about D3DImage/SharpDX is somewhat confusing because the SharpDX guys don't really like the way D3DImage is implemented (can't blame them), so there are statements about this being a "bug" on Microsoft's side when it's actually just improper usage of the API.
Yes, locking during rendering has performance issues, but it is probably not possible to fix them without porting WPF to DirectX11 and implementing something like a SwapChainPanel which is available in UWP apps. (WPF itself still runs on DirectX9)
If the locking is a performance issue for you, one idea I had (but never tested) is that you could render to an offscreen surface and reduce the lock duration to copying that surface over to the D3DImage. No idea if that would help performance-wise, but it's something to try.
I have an import file method in a WPF app that reads a file and inserts some records in a DB.
This method runs in a BackgroundWorker object.
I have a progress bar being updated inside a Dispatcher.Invoke call. If I run it as is, it takes ~1 minute to import 200k records; if I don't show any progress, it takes just 4 to 5 seconds! And if I use Dispatcher.BeginInvoke with Background priority, the import itself takes the same 4 to 5 seconds, but the progress bar and a counter keep updating for ~1 minute. So, obviously, the UI is the problem here.
The other problem is that I need to show progress, so I was wondering if there is any way to use Dispatcher.BeginInvoke but first check whether there is anything in the queue and, if so, just skip the update. That would behave like this: in the 1st second, 1% done; 2 seconds later, 50% done; and in the 4th second, 100% done.
Any help on this?
thanks!!!
The problem is that your callbacks are queuing up on the Dispatcher. Each one will cause the screen to repaint, and because they are at Background priority the next one will wait for that repaint to complete before being processed, so you will have to repaint once per callback, which can be slow.
Instead of trying to wait until nothing at all is in the dispatcher queue, just wait until the previous progress callback has been handled before posting a new one. This will ensure you never have more than one active at a time, so they can't queue up.
You can do this by setting a flag when you post the callback and clearing it once it has been processed. For example:
private void backgroundWorker_DoWork(object sender, DoWorkEventArgs e)
{
var pending = false;
for (int i = 0; i < 1000000; i++)
{
// Do some work here
// ...
// Only report progress if there is no progress report pending
if (!pending)
{
// Set a flag so we don't post another progress report until
// this one completes, and then post a new progress report
pending = true;
var currentProgress = i;
Dispatcher.BeginInvoke(new Action(() =>
{
// Do something with currentProgress
progressBar.Value = currentProgress;
// Clear the flag so that the BackgroundWorker
// thread will post another progress report
pending = false;
}), DispatcherPriority.Background);
}
}
}
I would simply update a progress counter in the background thread (it only writes to the counter), and have the UI read (only read) the counter on a timer every 500 ms or so... There is no reason to update faster than that. Also, because one thread only writes and the other only reads, there are no threading issues to deal with. The code becomes massively simpler, cleaner, and more maintainable.
-Chert Pellett
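A minimal sketch of that polling approach, assuming a hypothetical ProgressCounter class shared between the worker and the UI. Interlocked is used so that the UI thread reliably sees the latest value; the DispatcherTimer usage in the trailing comment is an illustration, and `progressBar` there is a placeholder name.

```csharp
using System.Threading;

public sealed class ProgressCounter
{
    private int _processed;

    // Called by the background thread after each record. It never touches the UI.
    public void Increment()
    {
        Interlocked.Increment(ref _processed);
    }

    // Called by the UI thread from a timer. It only reads.
    public int Current
    {
        get { return Interlocked.CompareExchange(ref _processed, 0, 0); }
    }
}

// UI side (sketch; DispatcherTimer lives in System.Windows.Threading):
//   var timer = new DispatcherTimer { Interval = TimeSpan.FromMilliseconds(500) };
//   timer.Tick += (s, e) => progressBar.Value = counter.Current;
//   timer.Start();
```

With this shape the import loop never queues anything on the dispatcher, so the number of UI updates depends only on the timer interval, not on the number of records.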
Impossible to say without seeing code, but
I have a progress bar being updated inside a Dispatcher.Invoke call
Why? That's what ReportProgress is for.
If I had to guess (and I do), I'd say you're reporting progress too often. For example, don't report progress after every record, but after batches of 100 or whatever.
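Batched reporting could look like the sketch below. BatchedProgress, `processRecord` and `report` are hypothetical names; with a BackgroundWorker, `report` would typically call ReportProgress (which requires WorkerReportsProgress to be true).

```csharp
using System;

public static class BatchedProgress
{
    // Processes 'total' records and calls report(done) once per full batch,
    // plus once at the very end. Returns the number of reports made.
    public static int Run(int total, int batchSize,
                          Action<int> processRecord, Action<int> report)
    {
        if (batchSize <= 0) throw new ArgumentOutOfRangeException("batchSize");
        int reports = 0;
        for (int i = 0; i < total; i++)
        {
            processRecord(i);
            int done = i + 1;
            if (done % batchSize == 0 || done == total)
            {
                report(done);   // e.g. worker.ReportProgress(done * 100 / total)
                reports++;
            }
        }
        return reports;
    }
}
```

For 200,000 records and a batch size of 1,000, this makes 200 progress reports instead of 200,000.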
I just solved the same case, but using the object returned by BeginInvoke, and I think it’s quite elegant too!
DispatcherOperation uiOperation = null;
while (…)
{
…
if (uiOperation == null || uiOperation.Status == DispatcherOperationStatus.Completed || uiOperation.Status == DispatcherOperationStatus.Aborted)
{
uiOperation = uiElement.Dispatcher.BeginInvoke(…);
}
}
The progress bars become a little choppier (less smooth), but it flies. In my case, the code parses line-by-line from a text file using StreamReader.ReadLine(). Updating the progress bar after reading every line would cause the read operations to complete before the progress bar was even halfway filled. Using the synchronous Dispatcher.Invoke(…) would slow down the entire operation to 100 KiB/s, but the progress bar would accurately track the progress. Using the solution above, my application finished parsing 8,000 KiB in a second with just 3 progress bar updates.
One difference from using BackgroundWorker.ReportProgress(…) is that the progress bar can show finer detail in longer-running operations. BackgroundWorker.ReportProgress(…) is limited to reporting progress in increments of 1% from 0% to 100%. If your progress bar represents more than 100 operations, finer values are desirable. Of course, that could also be achieved by not using the percentProgress argument and passing in a userState to BackgroundWorker.ReportProgress(…) instead.
Is there some reason that identical math operations would take significantly longer in one Silverlight app than in another?
For example, I have some code that takes a list of points and transforms them (scales and translates them) and populates another list of points. It's important that I keep the original points intact, hence the second list.
Here's the relevant code (scale is a double and origin is a point):
public Point transformPoint(Point point) {
// translate, then scale the x
point.X = (point.X - origin.X) * scale;
// translate, then scale the y
point.Y = (point.Y - origin.Y) * scale;
// return the point
return point;
}
Here's how I'm doing the loop and timing, in case it's important:
DateTime startTime = DateTime.Now;
foreach (Point point in rawPoints) transformedPoints.Add(transformPoint(point));
Debug.Print("ASPX milliseconds: {0}", (DateTime.Now - startTime).Milliseconds);
On a run of 14356 points (don't ask, it's modeled off a real world number in the desktop app), the breakdown is as follows:
Silverlight app #1: 46 ms
Silverlight app #2: 859 ms
The first app is an otherwise empty app that is doing the loop in the MainPage constructor. The second is doing the loop in a method in another class, and the method is called during an event handler in the GUI thread, I think. But should any of that matter, considering that identical operations are happening within the loop itself?
There may be something huge I'm missing in how threading works or something, but this discrepancy doesn't make sense to me at all.
In addition to the other comments and answers I'm going to read between the lines a little.
In the first app you have pretty much this code in isolation running in the MainPage constructor. IOW you've created a fresh Silverlight app and slapped this code in it, and that's it.
In the second app you have more actual real-world stuff. At the very least you have this code running as the result of a button click on a rudimentary UI. Therein lies the clue.
Take a blank app and drop a button on it. Run it and click the button; what does the button do? There are animations attached to the visual states of the button. This animation (or other animations or loops) is likely running in parallel with your code when you click the button. Timers (whether you do it properly with Stopwatch or not) record elapsed time, not just the time your thread takes. Hence when other threads are doing other things (like animations) your timing will be off.
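For reference, a minimal sketch of timing with Stopwatch, assuming System.Diagnostics.Stopwatch is available on your target runtime; the Timing helper is a hypothetical name. One detail matters when reading the numbers in the question: TimeSpan.Milliseconds is only the millisecond component of the interval (0 to 999), while TimeSpan.TotalMilliseconds is the whole duration. A run that really took 1859 ms would show up as 859 through the Milliseconds property.

```csharp
using System;
using System.Diagnostics;

public static class Timing
{
    // Returns the wall-clock time, in milliseconds, that 'work' took.
    // Like any timer it measures elapsed time, so work done by other
    // threads in the meantime still affects the result.
    public static double MeasureMilliseconds(Action work)
    {
        Stopwatch stopwatch = Stopwatch.StartNew();
        work();
        stopwatch.Stop();
        return stopwatch.Elapsed.TotalMilliseconds;
    }
}
```

Running the measurement several times and ignoring the first run (which includes JIT compilation) usually gives a more representative number.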
My first suspicion would be that Silverlight App #2 triggers a garbage collection. Scaling ~15,000 points should be taking a millisecond, not nearly a second.
Try to reduce memory allocations in your code. Can transformedPoints be an array, rather than a dynamically grown data structure?
You can also look at the GC performance counters, but simply reducing the memory allocation may turn out to be simpler.
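One way to cut the allocations is to size the output once, up front. The sketch below uses a hypothetical Vec2 struct in place of the framework's Point so that it stands alone; the arithmetic matches transformPoint from the question.

```csharp
using System.Collections.Generic;

public struct Vec2
{
    public double X;
    public double Y;
    public Vec2(double x, double y) { X = x; Y = y; }
}

public static class PointTransform
{
    // Writes the transformed points into a single pre-sized array, so the
    // output never has to grow and be copied while the loop runs.
    // The input list is left untouched.
    public static Vec2[] TransformAll(IList<Vec2> raw, Vec2 origin, double scale)
    {
        var result = new Vec2[raw.Count];
        for (int i = 0; i < raw.Count; i++)
        {
            result[i] = new Vec2((raw[i].X - origin.X) * scale,
                                 (raw[i].Y - origin.Y) * scale);
        }
        return result;
    }
}
```

If a List is needed downstream, constructing it with its final capacity, for example `new List<Point>(rawPoints.Count)`, avoids the repeated regrowth in the same way.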
Could it be possible your code is not being inlined in the CLR by the app that is running slower?
I'm not sure how the CLR in SL handles inlining, but here is a link to some of the prerequisites for inlining in 3.5 SP1.
http://udooz.net/blog/2009/04/clr-improvements-in-net-35-sp1/