Running several sequences of Tasks serially using TPL - silverlight

I have a class that returns an IEnumerable<Task>. I then execute these tasks in order. Let's say the class is TaskProvider.
public class TaskProvider {
public IEnumerable<Task> SomeThingsToDo() { return work; }
}
I am executing these with the following:
public void ExecuteTasks(IEnumerable<Task> tasks)
{
var enumerator = tasks.GetEnumerator();
ExecuteNextTask(enumerator);
}
static void ExecuteNextTask(IEnumerator<Task> enumerator)
{
bool moveNextSucceeded = enumerator.MoveNext();
if (!moveNextSucceeded) return;
enumerator
.Current
.ContinueWith(x => ExecuteNextTask(enumerator));
}
Now I have a situation where I might have multiple instances of TaskProvider, each generating a list of tasks. I want each list of tasks to be executed in order, meaning that all the tasks from one provider finish before the next one starts.
Then, most importantly, I need to know when all the tasks are completed.
What's the TPL way of accomplishing this?
(FWIW, I'm using the Async CTP for Silverlight.)

Here's the approach I took, and so far all my tests are passing.
First, I created a unioned enumerable of all the tasks from the various providers:
var tasks = from provider in providers
from task in provider.SomeThingsToDo()
select task;
I believe that part of my original problem was that I did a ToList (more or less) and thus began the execution of the tasks prematurely.
Next, I added a callback to ExecuteTasks and ExecuteNextTask. Admittedly, not as clean as I'd hoped. Here's the revised implementation:
public void ExecuteTasks(IEnumerable<Task> tasks, Action callback)
{
var enumerator = tasks.GetEnumerator();
ExecuteNextTask(enumerator, callback);
}
static void ExecuteNextTask(IEnumerator<Task> enumerator, Action callback)
{
bool moveNextSucceeded = enumerator.MoveNext();
if (!moveNextSucceeded)
{
if (callback != null) callback();
return;
}
enumerator
.Current
.ContinueWith(x => ExecuteNextTask(enumerator, callback));
}
I didn't need a thread-safe structure for storing the list of tasks, because the list is generated only once.
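For comparison, here is a minimal sketch of the same lazy chaining idea outside the TPL. It is a Java analog, not the Silverlight code above: it uses java.util.concurrent.CompletableFuture, and the names SerialRunner, runNext and runSerially are hypothetical. Each not-yet-started task is modeled as a Supplier, so nothing begins until its predecessor has finished, and the returned future plays the role of the completion callback.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

public class SerialRunner {
    // Starts the next task only after the previous one has completed,
    // and completes `done` once the iterator is exhausted.
    // Assumption: unlike ContinueWith in the original, this sketch stops
    // at the first failed task and reports the failure through `done`.
    static void runNext(Iterator<Supplier<CompletableFuture<Void>>> it,
                        CompletableFuture<Void> done) {
        if (!it.hasNext()) {
            done.complete(null);
            return;
        }
        it.next().get().whenComplete((ignored, error) -> {
            if (error != null) {
                done.completeExceptionally(error);
            } else {
                runNext(it, done);
            }
        });
    }

    // Returns a future that completes when every task has run, in order.
    static CompletableFuture<Void> runSerially(
            Iterable<Supplier<CompletableFuture<Void>>> tasks) {
        CompletableFuture<Void> done = new CompletableFuture<>();
        runNext(tasks.iterator(), done);
        return done;
    }

    // Demo: tasks from two hypothetical providers, flattened in provider order.
    static List<String> demo() {
        List<String> log = Collections.synchronizedList(new ArrayList<>());
        List<Supplier<CompletableFuture<Void>>> tasks = new ArrayList<>();
        for (String name : new String[] {"a1", "a2", "b1", "b2"}) {
            tasks.add(() -> CompletableFuture.runAsync(() -> log.add(name)));
        }
        runSerially(tasks).join(); // wait until all tasks are completed
        return log;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Because each task is created only when its Supplier is invoked, materializing the list up front does not start any work early, which is the problem the ToList call caused above.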

At worst, you could have a static ConcurrentQueue of IEnumerables that your ExecuteNextTask method works its way through.
Something like:
public static class ExecuteController {
    private static readonly ConcurrentQueue<IEnumerable<Task>> TaskLists = new ConcurrentQueue<IEnumerable<Task>>();
    public static void ExecuteTaskList(IEnumerable<Task> tasks) {
        TaskLists.Enqueue(tasks);
        TryStartExec();
    }
    public static void TryStartExec() {
        // Check whether there is a new task list and, if so, execute it with your code.
        // No lock is needed around the dequeue: ConcurrentQueue<T>.TryDequeue is atomic.
        IEnumerable<Task> next;
        if (TaskLists.TryDequeue(out next)) {
            // Execute `next` here with your ExecuteTasks code. To keep the lists
            // strictly serial, also track whether a list is already running.
        }
    }
}


Unbounded Collection based stream in Flink

Is it possible to create an unbounded collection stream in Flink? For example, if we add an element to a map, Flink should process it just as it would with the socket stream. The job should not exit once the initial elements have been read.
You can create a custom SourceFunction that never terminates (until cancel() is called) and emits elements as they appear. You'd want to have a class that looks something like:
class MyUnboundedSource extends RichParallelSourceFunction<MyType> {
    ...
    private transient volatile boolean running;
    ...
    @Override
    public void run(SourceContext<MyType> ctx) throws Exception {
        while (running) {
            // Call some method that returns the next record, if available.
            MyType record = getNextRecordOrNull();
            if (record != null) {
                ctx.collect(record);
            } else {
                Thread.sleep(NO_DATA_SLEEP_TIME());
            }
        }
    }
    @Override
    public void cancel() {
        running = false;
    }
}
Note that you'd need to worry about saving state for this to support at-least-once or exactly-once generation of records.
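As a rough illustration of what could sit behind getNextRecordOrNull(), here is a minimal sketch that backs the polling loop with a thread-safe queue. To stay runnable without Flink on the classpath, it replaces SourceContext with a plain java.util.function.Consumer, and the class name QueueBackedSource and its add() method are hypothetical. It also assumes the code that adds elements runs in the same JVM as the source (for example, local execution); on a cluster the queue would not be shared.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.function.Consumer;

public class QueueBackedSource {
    private final Queue<String> pending = new ConcurrentLinkedQueue<>();
    private volatile boolean running = true;

    // Called by whoever produces new elements; safe from any thread.
    public void add(String record) {
        pending.add(record);
    }

    // Returns the next record, or null if nothing is available right now.
    private String getNextRecordOrNull() {
        return pending.poll();
    }

    // Mirrors run(SourceContext): keeps emitting until cancel() is called.
    public void run(Consumer<String> collector) throws InterruptedException {
        while (running) {
            String record = getNextRecordOrNull();
            if (record != null) {
                collector.accept(record);
            } else {
                Thread.sleep(10); // back off while the queue is empty
            }
        }
    }

    public void cancel() {
        running = false;
    }

    // Demo: two records are present up front, a third arrives from another thread.
    static List<String> demo() {
        QueueBackedSource source = new QueueBackedSource();
        List<String> seen = new ArrayList<>();
        source.add("a");
        source.add("b");
        Thread producer = new Thread(() -> source.add("c"));
        producer.start();
        try {
            // The loop does not end after the initial elements; it is
            // cancelled here only once the third record has been emitted.
            source.run(record -> {
                seen.add(record);
                if (seen.size() == 3) {
                    source.cancel();
                }
            });
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```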

Flink executes dataflow twice

I'm new to Flink and I work with DataSet API. After a whole bunch of processing as the last stage I need to normalize one of the values by dividing it by its maximum value. So, I have used the .max() operator to take the max and later I'm passing the result as constructor's argument to the MapFunction.
This works, however all the processing is performed twice. One job is executed to find max values, and later another job is executed to create final result (starting execution from the beginning)... Is there any workaround to execute whole dataflow only once?
final List<Tuple6<...>> maxValues = result.max(2).collect();
assert maxValues.size() == 1;
result.map(new NormalizeAttributes(maxValues.get(0))).writeAsCsv(...)
@FunctionAnnotation.ForwardedFields("f0; f1; f3; f4; f5")
@FunctionAnnotation.ReadFields("f2")
private static class NormalizeAttributes implements MapFunction<Tuple6<...>, Tuple6<...>> {
    private final Tuple6<...> maxValues;
    public NormalizeAttributes(Tuple6<...> maxValues) {
        this.maxValues = maxValues;
    }
    @Override
    public Tuple6<...> map(Tuple6<...> value) throws Exception {
        value.f2 /= maxValues.f2;
        return value;
    }
}
collect() immediately triggers an execution of the program up to the dataset requested by collect(). If you later call env.execute() or collect() again, the program is executed a second time.
Besides the side effect of execution, using collect() to distribute values to subsequent transformations also has the drawback that data is transferred to the client and later back into the cluster. Flink offers so-called broadcast variables to ship a DataSet as a side input into another transformation.
Using Broadcast variables in your program would look as follows:
DataSet<Tuple6<...>> maxValues = result.max(2);
result
    .map(new NormAttrs()).withBroadcastSet(maxValues, "maxValues")
    .writeAsCsv(...);
The NormAttrs function would look like this:
private static class NormAttrs extends RichMapFunction<Tuple6<...>, Tuple6<...>> {
    private Tuple6<...> maxValues;
    @Override
    public void open(Configuration config) {
        // The broadcast set holds a single element (the max tuple), so read index 0.
        maxValues = (Tuple6<...>) getRuntimeContext().getBroadcastVariable("maxValues").get(0);
    }
    @Override
    public Tuple6<...> map(Tuple6<...> value) throws Exception {
        value.f2 /= maxValues.f2;
        return value;
    }
}
You can find more information about Broadcast variables in the documentation.

Awaiting IEnumerable<Task<T>> individually C#

In short, I have a Task enumerable, and I would like to run each Task within the array in an await fashion. Each Task will perform a slow network operation and from my end I simply need to update the WinForm UI once the task is finished.
Below is the code I'm currently using, however I think this is more of a hack than an actual solution:
private void btnCheckCredentials_Click(object sender, EventArgs e)
{
// GetNetCredentials() is irrelevant to the question...
List<NetworkCredential> netCredentials = GetNetCredentials();
// This call is not awaited. Displays warning!
netCredentials.ForEach(nc => AwaitTask(ValidateCredentials(nc)));
}
public async Task<bool> ValidateCredentials(NetworkCredential netCredential)
{
// Network-reliant, slow code here...
}
public async Task AwaitTask(Task<bool> task)
{
await task;
// Dumbed-down version of displaying the task results
txtResults.Text += task.Result.ToString();
}
On the second line of btnCheckCredentials_Click(), this warning is shown:
Because this call is not awaited, execution of the current method continues before the call is completed. Consider applying the 'await' operator to the result of the call.
This actually works the way I wanted to, since I do not want to wait for the operation to complete. Instead I just want to fire away the tasks, and then do something as soon as each one of them finishes.
The Task.WhenAny() or Task.WhenAll() methods do not function as I expect, since I would like to know of every task finishing, as soon as it finishes. Task.WaitAll() or Task.WaitAny() are blocking and therefore undesirable as well.
Edit: All tasks should start simultaneously. They may then finish in any order.
Are you looking for Task.WhenAll?
await Task.WhenAll(netCredentials.Select(nc => AwaitTask(ValidateCredentials(nc))));
You can do all the completion processing you need in AwaitTask.
The await task; is a bit awkward. I'd do it like this:
public async Task AwaitTask(NetworkCredential credential)
{
    var result = await ValidateCredentials(credential);
    // Dumbed-down version of displaying the task results
    txtResults.Text += result.ToString();
}
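The shape of this answer (start every operation at once, handle each result as soon as it is ready, then wait for all of them) can be sketched outside C# as well. The following is a Java analog using CompletableFuture, not WinForms code: HandleAsCompleted and validate are hypothetical names, validate stands in for the slow network call, and the sketch does not model marshalling back to the UI thread, which await handles for you in the C# version.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;

public class HandleAsCompleted {
    // Hypothetical stand-in for the slow, network-reliant check.
    static boolean validate(String user) {
        return user.length() > 3;
    }

    static List<String> demo() {
        List<String> results = Collections.synchronizedList(new ArrayList<>());
        List<CompletableFuture<Void>> handled = new ArrayList<>();
        for (String user : new String[] {"alice", "bob", "carol"}) {
            // All validations start right away; each result is recorded
            // as soon as its own validation finishes, in any order.
            handled.add(CompletableFuture
                    .supplyAsync(() -> validate(user))
                    .thenAccept(ok -> results.add(user + "=" + ok)));
        }
        // Counterpart of awaiting Task.WhenAll over the per-item handlers.
        CompletableFuture.allOf(handled.toArray(new CompletableFuture[0])).join();

        // Completion order is not deterministic, so sort before returning.
        List<String> sorted = new ArrayList<>(results);
        Collections.sort(sorted);
        return sorted;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```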
You can do this with Task.WhenAny, marking the handler as async (async void is fine here since it is an event handler):
private async void btnCheckCredentials_Click(object sender, EventArgs e)
{
// GetNetCredentials() is irrelevant to the question...
List<NetworkCredential> netCredentials = GetNetCredentials();
var credentialTasks = netCredentials
.Select(cred => ValidateCredentials(cred))
.ToList();
while (credentialTasks.Count > 0)
{
var finishedTask = await Task.WhenAny(credentialTasks);
// Do stuff with finished task.
credentialTasks.Remove(finishedTask);
}
}
You can fire and forget each task and add a callback that runs when the task completes.
private void btnCheckCredentials_Click(object sender, EventArgs e)
{
    List<NetworkCredential> netCredentials = GetNetCredentials();
    foreach (var credential in netCredentials)
    {
        ValidateCredentials(credential).ContinueWith(x => { ... });
    }
}
So instead of a lambda expression, you can create a callback method and know exactly when a particular task has finished.

How to properly canalize multithreaded message flow in a single threaded service?

In a WPF application, I have a 3rd party library that is publishing messages.
The messages are like :
public class DialectMessage
{
public string PathAndQuery { get; private set; }
public byte[] Body { get; private set; }
public DialectMessage(string pathAndQuery, byte[] body)
{
this.PathAndQuery = pathAndQuery;
this.Body = body;
}
}
And I setup the external message source from my app.cs file :
public partial class App : Application
{
static App()
{
MyComponent.MessageReceived += MessageReceived;
MyComponent.Start();
}
private static void MessageReceived(Message message)
{
//handle message
}
}
These messages can be published from multiple threads at a time, so the event handler may be called several times concurrently.
I have a service object that has to parse the incoming messages. This service implements the following interface:
internal interface IDialectService
{
void Parse(Message message);
}
And I have a default static instance in my app.cs file :
private readonly static IDialectService g_DialectService = new DialectService();
To simplify the code of the parser, I would like to ensure that only one message at a time is parsed.
I also want to avoid locking in my event handler, as I don't want to block the 3rd party object.
Because of these requirements, I cannot call g_DialectService.Parse directly from my message event handler.
What is the correct way to ensure this single-threaded execution?
My first thought is to wrap my parsing operations in a producer/consumer pattern. To reach this goal, I've tried the following:
Declare a BlockingCollection in my app.cs :
private readonly static BlockingCollection<Message> g_ParseOperations = new BlockingCollection<Message>();
Change the body of my event handler to add an operation :
private static void MessageReceived(Message message)
{
g_ParseOperations.Add(message);
}
Create a new thread that pumps the collection from my app constructor:
static App()
{
MyComponent.MessageReceived += MessageReceived;
MyComponent.Start();
Task.Factory.StartNew(() =>
{
Message message;
while (g_ParseOperations.TryTake(out message))
{
g_DialectService.Parse(message);
}
});
}
However, this code does not seem to work. The service's Parse method is never called.
Moreover, I'm not sure whether this pattern will allow me to shut down the application properly.
What do I have to change in my code to ensure everything works?
PS: I'm targeting .NET 4.5
[Edit] After some searching, and thanks to ken2k's answer, I can see that I was wrongly calling TryTake instead of Take.
My updated code is now:
private readonly static CancellationTokenSource g_ShutdownToken = new CancellationTokenSource();
private static void MessageReceived(Message message)
{
g_ParseOperations.Add(message, g_ShutdownToken.Token);
}
static App()
{
MyComponent.MessageReceived += MessageReceived;
MyComponent.Start();
Task.Factory.StartNew(() =>
{
while (!g_ShutdownToken.IsCancellationRequested)
{
var message = g_ParseOperations.Take(g_ShutdownToken.Token);
g_DialectService.Parse(message);
}
});
}
protected override void OnExit(ExitEventArgs e)
{
g_ShutdownToken.Cancel();
base.OnExit(e);
}
This code acts as expected. Messages are processed in the correct order. However, as soon as I exit the application, I get a "CancelledException" on the Take method, even though I test IsCancellationRequested right before.
The documentation says about BlockingCollection.TryTake(out T item):
If the collection is empty, this method immediately returns false.
So basically your loop exits immediately. What you may want is to call the TryTake overload with a timeout parameter instead, and exit your loop when a mustStop variable becomes true:
bool mustStop = false; // Set to true elsewhere when you exit your program (make it a volatile field so the worker thread sees the change)
...
while (!mustStop)
{
    Message yourMessage;
    // Waits up to 500 ms if there's nothing in the collection. This avoids
    // burning 100% CPU in the while loop when the collection is empty.
    if (yourCollection.TryTake(out yourMessage, 500))
    {
        // Parse yourMessage here
    }
}
For your edited question: if you mean you received an OperationCanceledException, that's OK; it's exactly how methods that take a CancellationToken parameter must behave :) The IsCancellationRequested check runs before Take starts blocking, so a cancellation that arrives while Take is waiting surfaces as the exception. Just catch the exception and exit gracefully.
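To make the timed-take loop concrete, here is a minimal, self-contained sketch of the same producer/consumer shape. It is a Java analog rather than .NET code: LinkedBlockingQueue.poll(timeout, unit) plays the role of TryTake with a timeout, and the names SingleConsumer, publish and stopAndJoin are hypothetical. One design choice that goes beyond the answer above: the loop drains any remaining messages before it exits.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class SingleConsumer {
    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    // Touched only by the worker thread until it has been joined.
    private final List<String> parsed = new ArrayList<>();
    private volatile boolean mustStop = false;
    private final Thread worker = new Thread(this::pump, "parser");

    public void start() {
        worker.start();
    }

    // Called from any producer thread; it never waits for the parser.
    public void publish(String message) {
        queue.add(message);
    }

    private void pump() {
        try {
            // Keep going until asked to stop AND the queue has been drained.
            while (!mustStop || !queue.isEmpty()) {
                // Waits up to 100 ms instead of spinning on an empty queue.
                String message = queue.poll(100, TimeUnit.MILLISECONDS);
                if (message != null) {
                    parsed.add(message); // stands in for Parse(message)
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // treat interruption as shutdown
        }
    }

    // Signals shutdown, waits for the worker, and returns what it handled.
    public List<String> stopAndJoin() {
        mustStop = true;
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return parsed;
    }

    static List<String> demo() {
        SingleConsumer consumer = new SingleConsumer();
        consumer.start();
        consumer.publish("one");
        consumer.publish("two");
        consumer.publish("three");
        return consumer.stopAndJoin();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Only the worker thread ever calls the parsing step, so the parser itself needs no locking, and producers return as soon as the message is queued.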

How to share an Array between all Classes in an application?

I want to share an Array so that all classes can "get" and "change" the data inside it, something like a global array or a multi-access array. How is this possible with ActionScript 3.0?
There are a couple of ways to solve this. One is to use a global variable (as suggested in unkiwii's answer) but that's not a very common approach in ActionScript. More common approaches are:
Class variable (static variable)
Create a class called DataModel or similar, and define an array variable on that class as static:
public class DataModel {
public static var myArray : Array = [];
}
You can then access this from any part in your application using DataModel.myArray. This is rarely a great solution because (like global variables) there is no way for one part of your application to know when the content of the array is modified by another part of the application. This means that even if your data entry GUI adds an object to the array, your data list GUI will not know to show the new data, unless you implement some other way of telling it to redraw.
Singleton wrapping array
Another way is to create a class called ArraySingleton, which wraps the actual array and provides access methods to it, and an instance of which can be accessed using the very common singleton pattern of keeping the single instance in a static variable.
public class ArraySingleton {
private var _array : Array;
private static var _instance : ArraySingleton;
public static function get INSTANCE() : ArraySingleton {
if (!_instance)
_instance = new ArraySingleton();
return _instance;
}
public function ArraySingleton() {
_array = [];
}
public function get length() : uint {
return _array.length;
}
public function push(object : *) : void {
_array.push(object);
}
public function itemAt(idx : uint) : * {
return _array[idx];
}
}
This class wraps an array, and a single instance can be accessed through ArraySingleton.INSTANCE. This means that you can do:
var arr : ArraySingleton = ArraySingleton.INSTANCE;
arr.push('a');
arr.push('b');
trace(arr.length); // traces '2'
trace(arr.itemAt(0)); // traces 'a'
The great benefit of this is that you can dispatch events when items are added or when the array is modified in any other way, so that all parts of your application can be notified of such changes. You will likely want to expand on the example above by implementing more array-like methods, such as pop(), shift(), unshift(), etc.
Dependency injection
A common pattern in large-scale application development is called dependency injection, and basically means that by marking your class in some way (AS3 meta-data is often used) you can signal that the framework should "inject" a reference into that class. That way, the class doesn't need to care about where the reference is coming from, but the framework will make sure that it's there.
A very popular DI framework for AS3 is Robotlegs.
NOTE: I discourage the use of Global Variables!
But here is your answer
You can go to your default package, create a file with the same name as your global variable, and declare the global variable public:
//File: GlobalArray.as
package {
public var GlobalArray:Array = [];
}
And that's it! You have a global variable. You can access it from anywhere in your code like this:
function DoSomething() {
GlobalArray.push(new Object());
GlobalArray.pop();
for each (var object:* in GlobalArray) {
//...
}
}
As this question was linked recently, I would like to add something. I was advised to use a singleton ages ago, and I gave up on it as soon as I understood how namespaces and references work and that basing everything on global variables is a bad idea.
Alternative
Note that this is just a showcase, and I do not advise you to use such an approach all over the place.
As an alternative to the singleton, you could have:
public class Global {
public static const myArray:Alternative = new Alternative();
}
and use it almost like singleton:
var ga:Alternative = Global.myArray;
ga.e.addEventListener(GDataEvent.NEW_DATA, onNewData);
ga.e.addEventListener(GDataEvent.DATA_CHANGE, onDataChange);
ga.push(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, "ten");
trace(ga[5]); // 5
And your Alternative.as would look similar to singleton one:
package adnss.projects.tchqs
{
    import flash.utils.Proxy;
    import flash.utils.flash_proxy;
    public class Alternative extends Proxy
    {
        private var _data:Array = [];
        private var _events:AltEventDisp = new AltEventDisp();
        private var _dispatching:Boolean = false;
        public var blockCircularChange:Boolean = true;
        public function Alternative() {}
        override flash_proxy function getProperty(id:*):* {
            var i:int = id;
            // Negative indices count from the end of the array.
            return _data[i += (i < 0) ? _data.length : 0];
            //return _data[id]; // version without negative-index access; var i:int could then be removed
        }
        override flash_proxy function setProperty(id:*, value:*):void {
            var i:int = id;
            if (_dispatching) { throw new Error("You cannot set data while a DATA_CHANGE event is dispatching"); }
            i += (i < 0) ? _data.length : 0;
            if (i > 9) { throw new Error("You can override only the first 10 items without using push."); }
            _data[i] = value;
            if (blockCircularChange) _dispatching = true;
            _events.dispatchEvent(new GDataEvent(GDataEvent.DATA_CHANGE, i));
            _dispatching = false;
        }
        public function push(...rest):void {
            // c is the number of items just added.
            var c:uint = -_data.length + _data.push.apply(null, rest);
            _events.dispatchEvent(new GDataEvent(GDataEvent.NEW_DATA, _data.length - c, c));
        }
        public function get length():uint { return _data.length; }
        public function get e():AltEventDisp { return _events; }
        public function toString():String { return String(_data); }
    }
}
import flash.events.EventDispatcher;
/**
 * Dispatched after data at an existing index is replaced.
 * @eventType adnss.projects.tchqs.GDataEvent
 */
[Event(name = "dataChange", type = "adnss.projects.tchqs.GDataEvent")]
/**
 * Dispatched after new data is pushed into the array.
 * @eventType adnss.projects.tchqs.GDataEvent
 */
[Event(name = "newData", type = "adnss.projects.tchqs.GDataEvent")]
class AltEventDisp extends EventDispatcher { }
The only difference from the singleton is that you can actually have multiple instances of this class, so you can reuse it like this:
public class Global {
public static const myArray:Alternative = new Alternative();
public static const myArray2:Alternative = new Alternative();
}
to have two separate global arrays, or even use it as an instance variable at the same time.
Note
Wrapping an array like this and using methods like myArray.get(x) or myArray[x] is obviously slower than accessing a raw array (see all the additional steps we take in setProperty).
public static const staticArray:Array = [1,2,3];
On the other hand, you don't have any control over this, and the content of the array can be changed from anywhere.
Caution about events
I would add that if you want to involve events in accessing data this way, you should be careful. As with every sharp blade, it's easy to get cut.
For example, consider what happens when you do this:
private function onDataChange(e:GDataEvent):void {
trace("dataChanged at:", e.id, "to", Global.myArray[e.id]);
Global.myArray[e.id]++;
trace("new onDataChange is called before function exits");
}
The function is called after data in the array was changed, and inside that function you are changing the data again. Basically, it's similar to doing something like this:
function f(x:Number) {
f(++x);
}
You can see what happens in such a case if you toggle myArray.blockCircularChange. Sometimes you would intentionally want such recursion, but it is likely that you will do it by accident. Unfortunately, Flash will suddenly stop dispatching such events without even telling you why, and this can be confusing.
Why is using global variables bad in most scenarios?
I guess there is plenty of information about that all over the internet, but to be complete I will add a simple example.
Consider that you have some view in your app where you display text, or graphics, or most likely game content. Say you have a chess game. Maybe you have separated logic and graphics into two classes, but you want both to operate on the same pawns. So you create your Global.pawns variable and use it in both the Graphics and Logic classes.
Everything is fine and dandy and works flawlessly. Now you come up with a great idea: add an option for the user to play two matches at once, or even more. All you have to do is create another instance of your match... right?
Well, you are doomed at this point, because every single instance of your class will use the same Global.pawns array. Not only is this variable global, but you have also limited yourself to a single instance of each class that uses it :/
So before you use any global variables, think twice about whether the thing you want to store in them is really global and universal across your entire app.
