So I was using an array to hold objects and iterating over them to call a particular method. As the objects in this array could be enumerated/mutated on different threads, I was using an NSRecursiveLock.
I then realised that as this method is always called on every object in this array, I could just use an NSNotification to trigger it, and do away with the array and lock.
I am just double-checking that this is a good idea. NSNotification will be faster than a lock, right?
Thanks
First, it seems that the locking should be done on the objects, not on the array. You are not mutating the array, but the objects, so you need one mutex per object. That gives you finer granularity and allows concurrent updates to different objects to proceed in parallel (they couldn't with a single global lock).
Second, a recursive lock is complete overkill. If you wish to have mutual exclusion on the objects, each object should have a standard mutex. If what you are doing inside the critical section is CPU-bound and really short, you might consider using a spinlock (it won't even trap into the OS; only use it for short, CPU-bound critical sections, though). Recursive mutexes are meant to be used when a thread can (because of its logic) acquire another lock on an object it has already locked itself (hence the name).
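To make the first two points concrete, here is a minimal sketch in C using POSIX threads (the Cocoa analogue would be one NSLock per object); the type and function names are illustrative:

/* One plain (non-recursive) mutex per object, assuming POSIX threads. */
#include <pthread.h>

typedef struct {
    pthread_mutex_t lock;   /* protects only this object's state */
    int state;
} object_t;

void object_init(object_t *o)
{
    pthread_mutex_init(&o->lock, NULL);   /* default attrs: non-recursive */
    o->state = 0;
}

/* The method invoked on every object: threads updating different
   objects no longer contend with one another. */
void object_update(object_t *o)
{
    pthread_mutex_lock(&o->lock);
    o->state++;               /* short, CPU-bound critical section */
    pthread_mutex_unlock(&o->lock);
}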
Third, an NSNotification (https://developer.apple.com/library/mac/documentation/Cocoa/Reference/Foundation/Classes/NSNotification_Class/) will allocate memory, post the notification to a notification center, do locking internally to implement that (the adding to the center), and finally dispatch the notifications from the center and deallocate the notification. So it is heavier than plain, simple locking. It "hides" the synchronization APIs inside the center, but it does not eliminate them, and you pay the extra price of memory allocation/deallocation.
If you wish to modify the array itself (add/remove), then you should also synchronize those operations, but that would be unfortunate, as accesses to independent entries of the array would then collide.
Hope that helps.
I have a personal project which I implement using the Boehm GC. I need to implement a sort of event type, which should hold references to other events. But I also need to ensure that the events pointed to are still collectable, thus I need weak references for this.
Let's say we have events A, B and C. I configure these events to signal event X whenever any of them is signaled. This means A, B and C must hold a reference to event X. What I want is that if event X is unreachable the events A, B and C don't need to signal it anymore. Thus a weak reference is what I thought of.
Is there any other way to do this? I don't want to change the GC, but if necessary (as long as the allocation interface remains clean) I could.
The project is written in C. If need be, I will provide more info. Notably, if there is any way to implement such events directly with this semantics, there's no need for actual weak references (events MAY have a reference cycle though while they are not signaled).
The Boehm GC does not have a concept of weak references per se. However, it does not scan memory allocated by the system malloc for references to managed objects, so pointers stored in such memory do not prevent the pointed-to object from being collected. Of course, that approach means that the objects containing the pointers will not be managed by the collector.
Alternatively, it should be possible to abuse GC_MALLOC_ATOMIC() or GC_malloc_explicitly_typed() to obtain a managed object that can contain pointers to other managed objects without preventing those other objects from being collected. That involves basically lying to GC about whether some members are pointers, so as to prevent them from being scanned.
Either way, you also require some mechanism for receiving notice when weakly-referenced objects are collected, so as to avoid attempting to access them afterward. GC has an interface for registering finalizer callbacks to be invoked before an object is collected, and that looks like your best available option for the purpose.
Overall, I think what you're asking for is doable, but with a lot of DIY involved. At a high level (a sketch follows this list),
use GC_MALLOC_ATOMIC() to allocate a wrapper object around a pointer to the weakly referenced object. Allocating it this way allows the wrapper to itself be managed by GC, without the pointer within being scanned during GC's reachability analyses.
use GC_register_finalizer to register a finalizer function that sets the wrapper's pointer to NULL when GC decides that the pointed-to object is inaccessible.
users of the wrapper are obligated to check whether the pointer within is NULL before attempting to dereference it.
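Putting those pieces together, here is a minimal sketch; the GC calls are the real Boehm GC API, but weakref_t and the helper names are illustrative, not part of the library:

/* Weak reference via an atomic (unscanned) wrapper plus a finalizer. */
#include <gc.h>        /* or <gc/gc.h>, depending on the installation */
#include <stddef.h>

typedef struct weakref {
    void *target;      /* never scanned: the wrapper is atomic memory */
} weakref_t;

/* Finalizer: runs when target becomes unreachable; clears the wrapper. */
static void weakref_clear(void *obj, void *client_data)
{
    weakref_t *ref = client_data;
    (void)obj;
    ref->target = NULL;
}

weakref_t *weakref_create(void *target)
{
    /* Atomic allocation: GC manages the wrapper's lifetime but never
       scans its contents, so this pointer is not a strong reference. */
    weakref_t *ref = GC_MALLOC_ATOMIC(sizeof *ref);
    ref->target = target;
    /* Note: this replaces any finalizer already registered on target;
       supporting several weak refs per object would require chaining
       through the old-finalizer out-parameters (here passed as NULL). */
    GC_register_finalizer(target, weakref_clear, ref, NULL, NULL);
    return ref;
}

/* Users must check for NULL before dereferencing (last point above). */
void *weakref_get(weakref_t *ref)
{
    return ref->target;
}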
In my project, I have a data provider which delivers data every 2 milliseconds. The following is the delegate method in which the data is received.
func measurementUpdated(_ measurement: Double) {
    measurements.append(measurement)
    guard measurements.count >= 300 else { return }
    ecgView.measurements = Array(measurements.suffix(300))
    DispatchQueue.main.async {
        self.ecgView.setNeedsDisplay()
    }
    guard measurements.count >= 50000 else { return }
    let olderMeasurementsPrefix = measurements.count - 50000
    measurements = Array(measurements.dropFirst(olderMeasurementsPrefix))
    print("Measurement Count : \(measurements.count)")
}
What I am trying to do is that when the array has more than 50000 elements, delete the older measurements at the first n indices of the array, for which I am using Array's dropFirst method.
But, I am getting a crash with the following message:
Fatal error: Can't form Range with upperBound < lowerBound
I think the issue is due to threading: both appending and deletion might happen at the same time, since the delegate fires every 2 milliseconds. Can you suggest an optimized way to resolve this issue?
So to really fix this, we need to first address two of your claims:
1) You said, in effect, that measurementUpdated() would be called on the main thread (you said both append and dropFirst would be called on the main thread). You also said several times that measurementUpdated() would be called every 2ms. You do not want to be calling a method every 2ms on the main thread. You'll pile up quite a lot of calls very quickly and get many delays in their processing, as the main thread is going to have UI work to do, and that always eats up time.
So, first rule: measurementUpdated() should always be called on another thread. Keep it on the same thread each time, though.
Second rule: the entire code path, from whatever collects the data to where measurementUpdated() is called, must also be on a non-main thread. It can be on the same thread that calls measurementUpdated(), but doesn't have to be.
Third rule: you do not need your UI graph to update every 2ms. The human eye cannot perceive UI changes faster than about 150ms. Also, the device's main thread will get totally bogged down trying to re-render as frequently as every 2ms. I bet your graph UI can't even render a single pass in 2ms! So let's give your main thread a break by only updating the graph every, say, 150ms. Measure the current time in ms and compare it against the last time you updated the graph from this routine.
Fourth rule: don't change any array (or any object) in two different threads without doing a mutex lock, as they'll sometimes collide (one thread will be trying to do an operation on it while another is too). An excellent article that covers all the current swift ways of doing mutex locks is Matt Gallagher's Mutexes and closure capture in Swift. It's a great read, and has both simple and advanced solutions and their tradeoffs.
One other suggestion: you're allocating or reallocating a few arrays every 2ms. That's unnecessary and adds undue stress on the memory pools under the hood, I'd think. I suggest not doing the append and dropFirst calls. Try rewriting such that you have a single array that holds 50,000 doubles and never changes size. Simply change values in the array, and keep two indexes so that you always know where the "start" and the "end" of the data set are within the array, i.e. pretend the next array element after the last is the first (pretend the array loops around to the front). Then you're not churning memory at all, and it'll operate much quicker too. You can surely find Array extensions people have written to make this trivial to use. Every 150ms you can copy the data into a second pre-allocated array in the correct order for your graph UI to consume, or just pass the two indexes to your graph UI if you own your graph UI and can adjust it to accommodate them.
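To illustrate just that circular-buffer idea (not the threading or UI parts), here is a minimal sketch in C; the names and the capacity constant are illustrative:

/* Fixed-capacity ring buffer of doubles: it never reallocates. */
#include <stddef.h>

#define RING_CAPACITY 50000

typedef struct {
    double data[RING_CAPACITY];
    size_t start;    /* index of the oldest sample */
    size_t count;    /* number of valid samples */
} ring_t;

/* Append a sample; once full, silently overwrite the oldest one. */
void ring_push(ring_t *r, double value)
{
    size_t end = (r->start + r->count) % RING_CAPACITY;
    r->data[end] = value;
    if (r->count < RING_CAPACITY)
        r->count++;
    else
        r->start = (r->start + 1) % RING_CAPACITY;   /* drop the oldest */
}

/* i = 0 is the oldest sample, i = count - 1 the newest. */
double ring_get(const ring_t *r, size_t i)
{
    return r->data[(r->start + i) % RING_CAPACITY];
}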
I don't have time right now to write a code example that covers all of this (maybe someone else does), but I'll try to revisit this tomorrow. It'd actually be a lot better for you if you made a renewed stab at it yourself, and then ask us a new question (on a new StackOverflow) if you get stuck.
Update: as @Smartcat correctly pointed out, this solution has the potential of causing memory issues if the main thread is not fast enough to consume the arrays at the same pace the worker thread produces them.
The problem seems to be caused by ecgView's measurements property: you are writing to it on the thread receiving the data, while the view tries to read from it on the main thread, and simultaneous accesses to the same data from multiple threads are (unfortunately) likely to generate race conditions.
In conclusion, you need to make sure that both reads and writes happen on the same thread, which can easily be achieved by moving the setter call within the async dispatch:
let ecgViewMeasurements = Array(measurements.suffix(300))
DispatchQueue.main.async {
    self.ecgView.measurements = ecgViewMeasurements
    self.ecgView.setNeedsDisplay()
}
According to what you say, I will assume the delegate is calling the measurementUpdated method from a concurrent thread.
If that's the case, and the problem is really related to threading, this should fix your problem:
// One serial queue, created once, so that every call is funneled through
// the same queue (creating a new queue on each call would not serialize
// anything).
private let measurementQueue = DispatchQueue(label: "MySerialQueue")

func measurementUpdated(_ measurement: Double) {
    measurementQueue.async {
        self.measurements.append(measurement)
        guard self.measurements.count >= 300 else { return }
        self.ecgView.measurements = Array(self.measurements.suffix(300))
        DispatchQueue.main.async {
            self.ecgView.setNeedsDisplay()
        }
        guard self.measurements.count >= 50000 else { return }
        let olderMeasurementsPrefix = self.measurements.count - 50000
        self.measurements = Array(self.measurements.dropFirst(olderMeasurementsPrefix))
        print("Measurement Count : \(self.measurements.count)")
    }
}
This puts the code on a serial queue, created once so that all calls share it. This way you can ensure that this block of code will run only one invocation at a time.
I have two loops:
One loop gets data from a device and processes it: scales the received variables and calculates extra data.
The second loop visualizes the data and stores it.
There are lots of different variables that need to be passed between those two loops, about 50 of them. I need the second loop to have access only to the newest values of the data, and it needs to be able to read those variables whenever they are needed for visualization.
What is the best way to share such a vector between two loops?
There are various ways of sharing data.
The fastest and simplest is a local variable; however, that is rather uncontrolled, and you need to make sure the variables are written in only one place (plus you need an indicator).
One of the most advanced options is to create a class for your data and use an instance of it (create it as a by-reference class, otherwise it won't matter), with a public 'get' method.
In between you have several other options:
queues
semaphores
property nodes
global variables
shared variables
notifiers
events
TCP/IP
In short, there is no best way; it all depends on your skills and application.
As long as you're considering loops within the SAME application, there ARE good and bad ideas, though:
queues (OK, has most features)
notifiers (OK)
events (OK)
FGVs (OK, but keep an eye on massively parallel access hindering exec)
semaphores (that's not data comms)
property nodes (very inefficient, prone to race cond.)
global variables (prone to race cond.)
shared variables (badly implemented by NI, prone to race cond.)
TCP/IP (slow, awkward, affected by firewall config)
The quick and dirty way to do this is to write each value to an indicator in the producer loop - these indicators can be hidden offscreen, or in a page of a tab control, if you don't want to see them - and read a local variable of each one in the consumer loop. However if you have 50 different values it may become hard to maintain this code if you need to change or extend it.
As Ton says there are many different options but my suggestion would be:
Create a cluster control, with named elements, containing all your data
Save this cluster as a typedef
Create a notifier using this cluster as the data type
Bundle the data into the cluster (by name) and write this to the notifier in the producer loop
Read the cluster from the notifier in the consumer loop, unbundle it by name and do what you want with each element.
Using a cluster means you can easily pass it to different subVIs to process different elements if you like, and saving as a typedef means you can add, rename or alter the elements and your code will update to match. In your consumer loop you can use the timeout setting of the notifier read to control the loop timing, if you want. You can also use the notifier to tell the loops when to exit, by force-destroying it and trapping the error.
Two ways:
Use a display loop with an SEQ (Single Element Queue).
Use an event structure with a User Event. (Do not put two event structures in the same loop!! Use another one.)
In either case, use an enum with a case structure and a variant to cast the data to the expected type.
(A notifier isn't reliable for streaming data, because it is a lossy scheme. Leave it only for triggering small actions.)
If all of your variables can be bundled together in a single cluster to send at once, then you should use a single element queue. If your requirements change later such that the transmission cannot be lossy, then it's a matter of changing the input to the Obtain Queue VI (with a notifier you'd have to swap out all of the VIs). Setting up individual indicators and local variables would be pretty darn tedious. Also, not good style.
If the loops are inside the same VI, then:
The simplest solution would be local variables.
A little bit better is to use shared variables.
Better still is to use functional global variables (FGVs).
The best solution would be using an SEQ (Single Element Queue).
Anyway, for better understanding, please go through this paper.
I've read these patterns but found them not to work. I get a rare exception that an item in the foreach was changed.
lock (myList)
{
    foreach (var a in myList) { }
    myList = new List<T>(); // or myList.Clear()
}
I also tried this:
foreach (var a in myList.ToList()) { }
And that also generated exceptions. There are some other patterns described in this thread, but I want to confirm/understand why the above patterns didn't work.
I've read a bit about how to properly lock a list. The exception doesn't occur often, just very rarely, and there was also a memory leak at the time.
1. Do I need to use the lock everywhere I modify myList, or does the lock prevent anyone from editing myList? This may be the source of confusion.
2. Is there a difference between locking myList directly and casting it and locking on its SyncRoot?
See here
Properly locking a List<T> in MultiThreaded Scenarios?
Generally, if you have a shared resource, then you need to lock the mutex that protects the resource whenever you use it. It does not matter whether you are just reading or writing. If the mutex is left unlocked in even one of the places where you use the shared resource, you have a problem. For example, if you only lock a shared resource while modifying it, then it is possible that some thread is reading it while another is modifying it, a situation known as a race condition.
In your particular case, yes, you need to lock myList everywhere you modify it. And not only where you modify it, but also everywhere you read it.
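The principle is language-independent. As a minimal illustration in C (assuming POSIX threads; the names are illustrative), note that the reader takes the very same mutex as the writer:

/* "Lock on every access": reads and writes both take the mutex. */
#include <pthread.h>
#include <stdlib.h>

typedef struct {
    pthread_mutex_t lock;
    double *items;
    size_t count;
} shared_list;

/* Reader: without the lock it could observe a half-updated list. */
double list_sum(shared_list *l)
{
    double sum = 0.0;
    pthread_mutex_lock(&l->lock);
    for (size_t i = 0; i < l->count; i++)
        sum += l->items[i];
    pthread_mutex_unlock(&l->lock);
    return sum;
}

/* Writer: replaces the contents under the same lock. */
void list_replace(shared_list *l, double *items, size_t count)
{
    pthread_mutex_lock(&l->lock);
    free(l->items);
    l->items = items;
    l->count = count;
    pthread_mutex_unlock(&l->lock);
}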
Can someone show me how to create a non-blocking timer to delete the data of a struct?
I've this struct:
struct info {
    char buf;
    int expire;
};
Now, when the expire time elapses, I need to delete the data in my struct. The fact is that at the same time, my program is doing something else. So how can I implement this, even avoiding the use of signals?
It won't work. The time it takes to delete the structure is most likely much less than the time it would take to arrange for the structure to be deleted later. The reason is that in order to delete the structure later, some structure has to be created to hold the information needed to find the structure later when we get around to deleting it. And then that structure itself will eventually need to be freed. For a task so small, it's not worth the overhead of dispatching.
In a different case, where the deletion is really complicated, it may be worth it. For example, if the structure contains lists or maps with numerous sub-elements that must be traversed to destroy each one, then it might be worth dispatching a thread to do the deletion.
The details vary depending on what platform and threading standard you're using. But the basic idea is that somewhere you have a function that causes a thread to be tasked with running a particular chunk of code.
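For instance, with POSIX threads it might look like the following sketch (delete_async and deletion_worker are illustrative names, and the cleanup body stands in for whatever traversal the structure actually needs):

/* Hand the structure to a detached worker thread for teardown. */
#include <pthread.h>
#include <stdlib.h>

static void *deletion_worker(void *arg)
{
    struct info *p = arg;
    /* ... traverse and free any sub-elements of p here ... */
    free(p);
    return NULL;
}

/* Returns 0 on success; the caller must not touch p afterwards. */
int delete_async(struct info *p)
{
    pthread_t tid;
    pthread_attr_t attr;
    int rc;
    pthread_attr_init(&attr);
    pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
    rc = pthread_create(&tid, &attr, deletion_worker, p);
    pthread_attr_destroy(&attr);
    return rc;
}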
Update: Hmm, wait, a timer? If code is not going to access it, why not delete it now? And if code is going to access it, why are you setting the timer now? Something's fishy with your question. Don't even think of arranging to have anything deleted until everything is 100% finished with it.
If you don't want to use signals, you're going to need threads of some kind. Any more specific answer will depend on what operating system and toolchain you're using.
I think the idea is to have a timer that expires, as in client-server logic: you need to delete those entries whose time has expired, and when a timer expires, you need to delete that data.
If so, it can be implemented in a couple of ways.
a) Single-threaded: you create a queue sorted on (expiry time - now), so that the shortest span receives its callback first. You can implement the timer queue using a map in C++. When your other work is done, call the timer function to check whether any expired request is in your queue; if yes, delete its data. The prototypes might look like set_timer(void (*pf)(void)); to set the callback and add_timer(void *context, long time_to_expire); to add a timer.
b) Multi-threaded: the add_timer logic is the same; it accesses the global map and adds entries after taking a lock. The timer thread sleeps (on a condition variable) for the shortest interval in the map. Meanwhile, if anything is added to the timer queue, it gets a notification from the thread that added the data. Why does it need to sleep on a condition variable? Because it might be given a timer with a shorter interval than the existing minimum.
For example, suppose the first timer is for 5 seconds from now and the second timer is for 3 seconds from now. If the timer thread only slept, rather than waiting on the condition variable, it would wake up after 5 seconds, whereas it is expected to wake up after 3 seconds.
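A minimal single-threaded sketch of approach (a) in C, assuming the main loop polls it when idle; the names (timer_entry, add_timer, check_timers) are illustrative:

/* Sorted pending-timer list, polled from the main loop (non-blocking). */
#include <stdlib.h>
#include <time.h>

struct timer_entry {
    time_t expire;                  /* absolute expiry time */
    void (*callback)(void *ctx);    /* invoked on expiry */
    void *ctx;
    struct timer_entry *next;
};

static struct timer_entry *pending = NULL;

/* Insert in ascending order of expiry, so the head is always the soonest. */
void add_timer(void (*cb)(void *), void *ctx, long seconds_from_now)
{
    struct timer_entry *t = malloc(sizeof *t);
    struct timer_entry **p = &pending;
    t->expire = time(NULL) + seconds_from_now;
    t->callback = cb;
    t->ctx = ctx;
    while (*p && (*p)->expire <= t->expire)
        p = &(*p)->next;
    t->next = *p;
    *p = t;
}

/* Call this whenever the main loop is idle: it fires any expired
   timers and returns immediately, so nothing ever blocks. */
void check_timers(void)
{
    time_t now = time(NULL);
    while (pending && pending->expire <= now) {
        struct timer_entry *t = pending;
        pending = t->next;
        t->callback(t->ctx);   /* e.g. free the expired struct's data */
        free(t);
    }
}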
Hope this clarifies your question.
Cheers,