How can I define a thread safe global array with minimal modifications?
I want like every access to it to be accomplished by using mutex and synchronized block.
Something like this as 'T' will be some type (note that 'sync' keyword is not currently defined AFAIK):
sync Array!(T) syncvar;
And every access to it will be simmilar to this:
Mutex __syncvar_mutex;
//some func scope....
synchronized(__syncvar_mutex) { /* edits 'syncvar' safely */ }
My naive attempt was to do something like this:
import std.typecons : Proxy:
synchronized class Array(T)
{
static import std.array;
private std.array.Array!T data;
mixin Proxy!data;
}
Sadly, it doesn't work because of https://issues.dlang.org/show_bug.cgi?id=14509
Can't say I am very surprised though as automagical handling of multi-threading via hidden mutexes is very unidiomatic in modern D and the very concept of synchronized classes is mostly a relict from D1 times.
You can implement same solution manually, of course, by defining own SharedArray class with all necessary methods and adding locks inside the methods before calling internal private plain Array methods. But I presume you want something that work more out of the box.
Can't invent anything better right here and now (will think about it more) but it is worth noting that in general it is encouraged in D to create data structures designed for handling shared access explicitly instead of just protecting normal data structures with mutexes. And, of course, most encouraged approach is to not shared data at all using message passing instead.
I will update the answer if anything better comes to my mind.
It is fairly easy to make a wrapper around array that will make it thread-safe. However, it is extremely difficult to make a thread-safe array that is not a concurrency bottleneck.
The closest thing that comes to mind is Java's CopyOnWriteArrayList class, but even that is not ideal...
You can wrap the array inside a struct that locks the access to the array when a thread acquires a token and until it releases it.
The wrapper/locker:
acquire(): is called in loop by a thread. As it returns a pointer, the thread knows that it has the token when the method returns a non null value.
release(): is called by a thread after processing the data whose access has been acquired previously.
.
shared struct Locker(T)
{
private:
T t;
size_t token;
public:
shared(T) * acquire()
{
if (token) return null;
else
{
import core.atomic;
atomicOp!"+="(token, 1);
return &t;
}
}
void release()
{
import core.atomic;
atomicOp!"-="(token, 1);
}
}
and a quick test:
alias LockedIntArray = Locker!(size_t[]);
shared LockedIntArray intArr;
void arrayTask(size_t cnt)
{
import core.thread, std.random;
// ensure the desynchronization of this job.
Thread.sleep( dur!"msecs"(uniform(4, 20)));
shared(size_t[])* arr = null;
// wait for the token
while(arr == null) {arr = intArr.acquire;}
*arr ~= cnt;
import std.stdio;
writeln(*arr);
// release the token for the waiting threads
intArr.release;
}
void main(string[] args)
{
import std.parallelism;
foreach(immutable i; 0..16)
{
auto job = task(&arrayTask, i);
job.executeInNewThread();
}
}
With the downside that each block of operation over the array must be surrounded with an acquire/release pair.
You have the right idea. As an array, you need to be able to both edit and retrieve information. I suggest you take a look at the read-write mutex and atomic utilities provided by Phobos. A read operation is fairly simple:
synchronize on mutex.readLock
load (with atomicLoad)
copy the item out of the synchronize block
return the copied item
Writing should be almost exactly the same. Just syncronize on mutex.writeLock and do a cas or atomicOp operation.
Note that this will only work if you copy the elements in the array during a read. If you want to get a reference, you need to do additional synchronization on the element every time you access or modify it.
Related
I walked away from WWDC 2016 with the understanding that we should be wary about using C-based struct API directly from Swift. In Concurrent Programming With GCD in Swift 3, talking about C-based locks, they were very specific:
... And in Swift, since you have the entire Darwin module at your disposition, you will actually see the struct based traditional C locks. However, Swift assumes that anything that is struct can be moved, and that doesn't work with a mutex or with a lock. So we really discourage you from using these kind of locks from Swift. ...
... And if you want something ... that looks like the locks that you have in C, then you have to call into Objective-C and introduce a base class in Objective-C that has your lock as an ivar.
And then you will expose lock and unlock methods, and a tryLock if you need it as well, that you will be able to call from Swift when you will subclass this class. ...
#implementation LockableObject {
os_unfair_lock _lock;
}
- (void)lock { os_unfair_lock_lock(&_lock); }
- (void)unlock { os_unfair_lock_unlock(&_lock); }
#end
However, watching WWDC 2019 Developing a Great Profiling Experience, I notice that the author is using os_unfair_lock directly from Swift, without this Objective-C wrapper, effectively:
private var workItemsLock = os_unfair_lock()
func subWorkItem(...) {
...
os_unfair_lock_lock(&self.workItemsLock)
...
os_unfair_lock_unlock(&self.workItemsLock)
...
}
Empirically, this sort of direct use of os_unfair_lock appears to work, but that doesn’t mean anything. Respecting the caveat in the 2016 WWDC video, I have refrained from using os_unfair_lock directly from Swift.
So, the question is, are they being (ever so slightly) sloppy in the use of this API in this 2019 sample? Or was the 2016 video incorrect in its claims? Or has the handling of C-based struct changed since Swift 3, now rendering this pattern safe?
The API example of using private var workItemsLock = os_unfair_lock() can fail at runtime.
The threading primitives from C need a stable memory location so to use them or another struct that has one of these primitives directly as a member of it you have to use UnsafePointer. The reason for this is that UnsafePointer APIs once they have allocated a chunk of memory that memory is stable and cannot be moved or trivially be copied by the compiler.
If you change the example like this it is now valid
private var workItemsLock: UnsafeMutablePointer<os_unfair_lock> = {
// Note, somewhere else this will need to be deallocated
var pointer = UnsafeMutablePointer<os_unfair_lock>.allocate(capacity: 1)
pointer.initialize(to: os_unfair_lock())
return pointer
}()
func subWorkItem(...) {
...
os_unfair_lock_lock(self.workItemsLock)
...
os_unfair_lock_unlock(self.workItemsLock)
...
}
I have a set of functions within a module that need access to some shared initialization-time state. Effectively I'd like to model this with a static mutable vector like:
static mut defs: Vec<String> = vec![];
fn initialize() {
defs.push("One".to_string());
defs.push("Two".to_string());
}
(Example: http://is.gd/TyNQVv, fails with "mutable statics are not allowed to have destructors".)
My question is similar to Is it possible to use global variables in Rust?, but uses a Vec (i.e. a type with destructor), so the Option-based solution to that question doesn't seem to apply. Namely, this fails with the same error as my first attempt:
static mut defs: Option<Vec<String>> = None;
fn initialize() {
let init_defs = vec![];
init_defs.push("One".to_string());
init_defs.push("Two".to_string());
defs = Some(init_defs);
}
Is there a way to get access to a static ("global") vector that is populated at initialization time and visible at runtime?
Are there other patterns I should be considering to support this use case? Passing explicit references to the state vector is possible, but would clutter up a very large number of function signatures that all need access to this state.
You can use lazy_static for this purpose:
lazy_static! {
static ref defs: Vec<String> = {
let mut init = vec!["One".to_string(), "Two".to_string()];
// init.push(...); etc. etc.
init
}
}
That initialises a vector on the first access, and it is immutable after that. If you wish to modify it later, wrapping it in a std::sync::Mutex is a good first step.
Are there other patterns I should be considering to support this use case? Passing explicit references to the state vector is possible, but would clutter up a very large number of function signatures that all need access to this state.
One pattern to consider is creating a context object that stores all the info the functions need, e.g.
struct Context {
defs: Vec<String>
}
and then passing around Context ensures everyone knows what they need to know. You can even consider putting all/many/some of the functions as methods on Context, e.g.
impl Context {
fn foo(&self) {
if self.defs.len() > 10 {
println!("lots of defs");
}
}
// ...
}
This pattern is especially good if you need to modify the context (automatically ensures thread safety), and/or if you wish to have several independent instances in a single process.
So i have a program that does these calculations with numbers. The program is threaded, and the number of threads are specified from the user.
I will give a close example
static void *program_thread(void *thread)
{
bool somevar = true;
if(somevar)
{
work = getwork();
}
dowork(work);
if(condition1 blah blah)
somevar = false; /* disable getwork */
if(condition2)
somevar = true; /* condition was either met or not met, so we request
new work either way */
}
Then with pthreads(and i will skip some code) i do
int main(blah)
{
if (pthread_create(&thr->pth, NULL, program_thread, thread_number)) {
printf("%s","program thread create failed");
return 1;
}
}
Now i will start explaining. The number of threads created are specified from the user, so i do a for loop and create as many threads as i need.
Each thread calls
work = getwork();
Thus getting independant work to do, however the CPU is slow for this kind of job. It tries to compute something by trying 2^32 numbers(which is from 1 to 4 294 967 296)
But my CPU can only do around 3 million numbers per second, and by the time it reaches 4 billion numbers, it's restarted(for new work).
So i then thought of a better method. Instead of each thread getting totally different work, all the threads should get the same work and split the numbers they need to try.
The problem is, that i can't controll what work it get's, so i must fetch
work = getwork();
Before initiating the threads. The question is HOW? Using pthread_create obviously...but then what?
You get more than one way to do it:
split your work package into smaller parts (thus, your getWork returns a new, smaller work)
store your work in a common place, that you access from your thread using a reader-writer pattern
from the pthread API, the 4th parameter is given to your thread, you can do something like the following code :
Work = getWork();
if (pthread_create(&thr->pth, NULL, program_thread, (void*) &work))
...
And your program_thread function would be like that
static void *program_thread(void *pxThread)
{
Work* pWork = (Work*) pxThread;
...
Of course, you need to check the validaty of the pointer and common stuff (in my example, I created it on stack which is most probably a bad idea). Note that your code is givig a thread_number as a pointer, which is usually a bad idea. If you want to have more information transfered to your thread, simply hide it into a structure.
I'm not sure I fully understood your issue, but this could give you some hints most probably. Please note also that when doing multithreading, you need to take into account specific issues like race conditions, concurrent access and more complex lifecycle of objects...
can anyone in this forum give an example in C how two threads process data from one textfile.
As an example, I have one textfile that contains a paragraph. I have two threads that will process the data in the said file. One thread will count the number of lines in the paragraph. The second thread will count the numeric characters.
thanks
If you asked in C++ I could give you a code example, but I havent done ANSI C in a very long time so I will give you the design and pseudo code.
Please keep in mind this is really bad pseudo code that is meant to give an example. I'm not questioning WHY you would want to do this. For all I know it could be an excercise with threads or because you "feel like it".
Example 1
int integerCount = 0;
int lineCount = 0;
numericThread()
{
// By flagging the file as readonly you should
// be able to open it as many times as you wish
handle h = openfile ("textfile.txt". readonly);
while (!eof(h)) {
String word = readWord (h);
int outInteger
if (stringToInteger(word, outInteger)) {
++integerCount;
}
}
}
lineThread()
{
// By flagging the file as readonly you should
// be able to open it as many times as you wish
handle h = openfile ("textfile.txt". readonly);
while (!eof(h)) {
String word = readWord (h);
if (word.equals("\n") {
++lineCount ;
}
}
}
If for some reason you aren't able to open the file twice in readonly you will need to maintain a queue for each thread, having the main thread put words into each threads queue. The threads will then pull from the queue.
Example 2
int integerCount = 0;
int lineCount = 0;
queue numericQueue;
queue lineQueue;
numericThread()
{
while (!numericQueue.closed()) {
String word = numericQueue.pop();
int outInteger
if (stringToInteger(word, outInteger)) {
++integerCount;
}
}
}
lineThread()
{
while (!lineQueue.closed()) {
String word = lineQueue.pop();
if (word.equals("\n") {
++lineCount ;
}
}
}
mainThread()
{
handle h = openfile ("textfile.txt". readonly);
while (!eof(h)) {
String word = readWord(h);
numericQueue.push(word);
lineQueue.push(word);
}
numericQueue.close();
lineQueue.close();
}
There are lots of ways to do this. You can make different design decisions depending on how fast or simple or elegant or overengineered you want this to be. One way, as posted by Andrew Finnell is to have each thread open the file and read it completely independently. In theory this isn't great because you are doing expensive IO twice but in practice it's probably fine because the OS has likely cached the contents of whichever read executes first. Double IO is still more expensive than average because it involves a lot of needless system calls, but again in practice it will be irrelevant unless you have a very large file.
Another model of how to do this would be for each thread to have an input queue, or a shared global queue. The main thread reads the file and places each line in turn on the queue(s), and perhaps main doubles as one of your worker threads. This is more complicated because access to the queue(s) must be synchronized, or some lockless queue implementation must be used. In the case of a shared global queue, there is less duplication of data but now the lifecycle of that data is more complicated.
Just to point out how many ways such a simple thing can be done, you could go the overengineering route and make each thread generic. Instead of placing data on the queue(s) you place both data (or pointers to data) and function pointers and let each thread execute the callback. This kind of model might might sense if you plan on adding lots more kind of things to compute but want to limit the number of threads you will use.
I don't think you will see much performance difference in using 2 threads over one. Either way, you don't want both threads to read the file. Read the file first, then pass a COPY of the stream to the methods you want and process both. The threads will not have access to the same stream of data at the same time so you'll need to use 2 copies of the textfile.
P.S. It's possible that depending on the size of the file, you will actually loose performance using 2 threads.
I'm a bit confused about how to implement my state machine.
I already know it's hierarchical since some states share the same action.
I determine what I need to do by these parameters:
Class (Values are: Base, Derived, Specific)
OpCode
Parameter 1 - optional
Parameter 2 - optional
My hierarchy is determined by the Class and the OpCode represents the action.
Derived can use the OpCodes of Base and Specific can use OpCodes of both Base and Derived.
The naive implementation is the following:
void (*const state_table [MAX_CLASSES][MAX_OPCODES]) (state *) {
{base_state1, base_state2, NULL, NULL},
{base_state1, base_state2, derived_state1, NULL},
{base_state1,base_state2, derived_state1, specific_state3},
};
void dispatch(state *s)
{
if (state_table[s->Class][s->OpCode] != NULL)
state_table[s->Class][s->OpCode](s);
}
This will turn unmaintainable really quick.
Is there another way to map the state to a superclass?
EDIT:
Further calcualtion leads me to think that I'll probably use most if not all OpCodes but I will not use all of the Classes available to me.
Another clarification:
Some OpCodes might be shared through multiple derived and base Classes.
For example:
I have a Class called Any
which is a Base class. It has the
OpCodes: STATE_ON, STATE_OFF, STATE_SET.
I have another Class called
MyGroup which is a Derived class. It has the OpCodes:
STATE_FLIP, STATE_FLOP.
The third Class is a Specific
class called ThingInMyGroup which
has the OpCode:
STATE_FLIP_FLOP_AND_FLOOP.
So a message with class Any is sent from the server, recieved in all clients and processed.
A message with class MyGroup is sent from the server, recieved in all clients and processed only on clients that belong to MyGroup, any OpCodes that are valid for the Any class are valid for the MyGroup class.
A message with class ThingInMyGroup is sent from the server, recieved in all clients and processed only on clients that belong to MyGroup and are a ThingInMyGroup*, any **OpCodes that are valid for the Any class and MyGroup class are valid for the ThingInMyGroup class.
After a message is received the client will ACK/NACK accordingly.
I prefer not to use switch cases or const arrays as they will become unmaintainable when they get bigger.
I need a flexible design that allows me:
To specify which OpCodes are available
for each Class.
To specify a superclass for each Class and through that specification to allow me to call the function pointer that is represented by the current OpCode.
There are several ways to deal with this. Here is one:
edit -- with general purpose hierarchy added
typedef unsigned op_code_type;
typedef void (*dispatch_type)(op_code_type);
typedef struct hierarchy_stack hierarchy_stack;
struct hierarchy_stack {
dispatch_type func;
hierarchy_stack *tail;
};
void dispatch(state *s, hierarchy_stack *stk) {
if (!stk) {
printf("this shouldn't have happened");
} else {
stk->func(s, stk->tail);
}
}
void Base(state *s, hierarchy_stack *stk ) {
switch (s->OpCode) {
case bstate1:
base_state1(s);
break;
case bstate2:
base_state(2);
break;
default:
dispatch(s, stk);
}
}
void Derived(state *s, hierarchy_stack *stk ) {
switch(s->opcode) {
case dstate1:
deriveds_state1(s);
break;
default:
dispatch(s, stk);
}
}
...
NOTE : All function calls are tail calls.
This localizes your "class"es a good bit so that if you decide that Derived needs 100 more methods/opcodes then you only have to edit methods and the enum (or whatever) that you use to define opcodes.
Another, more dynamic way, to deal with this would be to have a parent pointer within each "class" that pointed to the "class" that would handle anything that it could not handle.
The 2D table approach is fast and flexible (Derived could have a different handler than Base for opcode 0), but it grows fast.
I wrote a little tool that generates code similar to your naive implementation based on a mini-language. The language just specified the state-opcode-action relationships, all of the actions were just C functions conforming to a typedef.
It didn't handle the HSM aspect, but this would be relatively easy to add to a language.
I'd recommend taking this approach -- create a little language that gives you a clean way to describe the state machine, and then generate code based on that machine description. That way when you need to insert a new state a month from now, the whole thing isn't a tangled mess to edit.
Let me know if you want the code and I'll make sure it's still available somewhere.