Establishing Data Consistency


I have a fixed-size FIFO array that stores newly added data. My main function updates this array continuously while one thread works on the data. I want the thread to work on the latest data passed to it while the main function keeps updating the array. The code below tries to demonstrate what I mean. Thread 1 contains a while(1) loop of its own. I update the queue on the main thread because Thread 1 has a sleep duration. There may be a simple answer, but my brain has stopped working for now.
int main(){
    pthread_create(Thread1);
    while(1) {
        QueuePut(Some_Value);
        arguments_of_Thread1.input = Queue;
        ...
    }
    return 0;
}

Related

Global Variables in Flink

I want to use a FIFO queue of size 2 to store elements of a datastream. At any instance, I need the previous element that came in the stream and not the current element. To do this, I have created a queue outside the stream code and I am enqueuing the current element. When my queue has two elements, I dequeue it and use the first element.
The problem I am facing is that I am not able to enqueue into the queue, as it is declared outside my stream code. I guess this is because streaming uses multiple JVMs and my queue would be declared in only one JVM.
Below is a sample code:
val queue = Queue[Array[Double]]() //Global Queue
val ws = dataStream.map(row => {
    queue.enqueue(row)
    println(queue.size) //Prints 0 always
    if(queue.size == 2){
        result = operate(queue(0))
        queue.dequeue
    }
    result
})
Here, nothing is getting enqueued and the size of the queue is always 0.
Is there a way we can create global variables in Flink which are distributed across all the JVMs? If not, is there any other way to implement this logic?

Surprisingly enough, it worked when I replaced Queue with Scala List.

C - Mutex data structure with multithreading

I have a problem that I can't solve.
I have to make a data structure shared by several threads. The problems are:
The threads run simultaneously and should insert data into a particular structure, but every object must be inserted under mutual exclusion, because an object that is already present must not be re-inserted.
I have thought of making an array where threads put the key of the object they are working on; if another thread wants to put the same key, it should wait for the current thread to finish.
So, in other words, every thread executes this function to lock an element:
void lock_element(key_t key){
    pthread_mutex_lock(&mtx_array);
    while(array_busy==1){
        pthread_cond_wait(&var_array,&mtx_array);
    }
    array_busy=1;
    if((search_insert((int)key))==-1){
        // the element is present in array and i can't insert,
        // and i must wait for the array to be freed.
        // (i think that the problem is here)
    }
    array_busy=0;
    pthread_cond_signal(&var_array);
    pthread_mutex_unlock(&mtx_array);
}
After I finish with the object, I free the key in the array with the following function:
void unlock_element(key_t key){
    pthread_mutex_lock(&mtx_array);
    while(array_busy==1){
        pthread_cond_wait(&var_array,&mtx_array);
    }
    array_busy=1;
    zeroed((int)key);
    array_busy=0;
    pthread_cond_signal(&var_array);
    pthread_mutex_unlock(&mtx_array);
}
With this approach, the result changes on every execution (for example, the program inserts 300 objects on one run and 100 objects on another).
Thanks for the help!
UPDATE:
@DavidSchwartz @Asthor I modified the code as follows:
void lock_element(key_t key){
    pthread_mutex_lock(&mtx_array);
    while((search_insert((int)key))==-1){
        //wait
        pthread_cond_wait(&var_array,&mtx_array);
    }
    pthread_mutex_unlock(&mtx_array);
}
and...
void unlock_element(key_t key){
    pthread_mutex_lock(&mtx_array);
    zeroed((int)key);
    pthread_cond_signal(&var_array);
    pthread_mutex_unlock(&mtx_array);
}
But it does not work. It behaves the same way as before.
I also noticed strange behavior in the function search_insert(key):
int search_insert(int key){
    int k=0;
    int found=0;
    int fre=-1;
    while(k<7 && found==0){
        if(array[k]==key){
            found=1;
        } else if(array[k]==-1) fre=k;
        k++;
    }
    if (found==1) {
        return -1; //we can't put the key in the array
    }else {
        if(fre==-1) exit(EXIT_FAILURE);
        array[fre]=key;
        return 0;
    }
}
It never goes into the branch:
if(found == 1)

You have a couple of options.
The simplest option is just to hold the mutex during the entire operation. You should definitely choose this option unless you have strong evidence that you need greater concurrency.
Often, it's possible to just allow more than one thread to do the work. This pattern works like this:
1. Acquire the mutex.
2. Check if the object is in the collection. If so, use the object from the collection.
3. Otherwise, release the mutex.
4. Generate the object.
5. Acquire the mutex again.
6. Check if the object is in the collection. If not, add it and use the object you generated.
7. Otherwise, throw away the object you generated and use the one from the collection.
This may result in two threads doing the same work. That may be unacceptable in your use case either because it's impossible (some work can only be done once) or because the gain in concurrency isn't worth the cost of the duplicated work.
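A minimal sketch of this check, generate, re-check pattern in C might look like the following. The names `collection`, `generate_object` and `get_or_create` are hypothetical, keys are assumed to be small integers in the range 0 to NUM_KEYS-1, and the "object" is just a heap-allocated int standing in for whatever the real work produces.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define NUM_KEYS 16                     /* assumed key range: 0..NUM_KEYS-1 */

static pthread_mutex_t mtx = PTHREAD_MUTEX_INITIALIZER;
static int *collection[NUM_KEYS];       /* NULL means "not in the collection" */

/* Hypothetical stand-in for the expensive work of building an object. */
static int *generate_object(int key)
{
    int *obj = malloc(sizeof *obj);
    if (obj != NULL)
        *obj = key * 10;
    return obj;
}

static int *get_or_create(int key)
{
    /* Steps 1-3: look under the mutex, then release it. */
    pthread_mutex_lock(&mtx);
    int *obj = collection[key];
    pthread_mutex_unlock(&mtx);
    if (obj != NULL)
        return obj;

    /* Step 4: generate without holding the mutex (may duplicate work). */
    int *fresh = generate_object(key);
    if (fresh == NULL)
        return NULL;

    /* Steps 5-7: re-check, then either publish ours or keep the winner's. */
    pthread_mutex_lock(&mtx);
    if (collection[key] == NULL)
        collection[key] = fresh;
    obj = collection[key];
    pthread_mutex_unlock(&mtx);

    if (obj != fresh)
        free(fresh);                    /* another thread got there first */
    return obj;
}
```

Two threads calling `get_or_create` with the same key may both run `generate_object`, but only one result ends up in the collection and both callers receive that same pointer.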
If nothing else works, you can go with the more complex solution:
1. Acquire the mutex.
2. Check if the object is in the collection. If so, use the object in the collection.
3. Check if any other thread is working on the object. If so, block on the condition variable and go to step 2.
4. Indicate that we are working on the object.
5. Release the mutex.
6. Generate the object.
7. Acquire the mutex.
8. Remove the indication that we are working on the object.
9. Add the object to the collection.
10. Broadcast the condition variable.
11. Release the mutex.
This can be implemented with a separate collection just to track which objects are in progress or you can add a special version of the object to the collection that contains a value that indicates that it's in progress.
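The more complex solution could be sketched roughly as below, using the second variant (a per-key state value that can mean "in progress"). This is only an illustration under assumptions: keys are small integers, and `key_state`, `begin_work`, `finish_work`, `worker` and `run_demo` are made-up names, not anything from the question's code.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

#define NUM_KEYS 16                       /* assumed key range: 0..NUM_KEYS-1 */

enum key_state { KEY_ABSENT, KEY_IN_PROGRESS, KEY_DONE };

static pthread_mutex_t mtx = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cv  = PTHREAD_COND_INITIALIZER;
static enum key_state state[NUM_KEYS];    /* zero-initialised: KEY_ABSENT */
static int generated[NUM_KEYS];           /* how often each key was generated */

/* Returns true if the caller claimed the key and must generate the object,
 * false if the object is already in the collection. */
static bool begin_work(int key)
{
    pthread_mutex_lock(&mtx);
    while (state[key] == KEY_IN_PROGRESS) /* someone else is working on it */
        pthread_cond_wait(&cv, &mtx);     /* wait, then re-check */
    bool must_generate = (state[key] == KEY_ABSENT);
    if (must_generate)
        state[key] = KEY_IN_PROGRESS;     /* mark that we are working on it */
    pthread_mutex_unlock(&mtx);
    return must_generate;
}

static void finish_work(int key)
{
    pthread_mutex_lock(&mtx);
    state[key] = KEY_DONE;                /* object is now in the collection */
    pthread_cond_broadcast(&cv);          /* wake every waiter so it re-checks */
    pthread_mutex_unlock(&mtx);
}

static void *worker(void *arg)
{
    int key = *(int *)arg;
    if (begin_work(key)) {
        generated[key]++;                 /* only the claiming thread gets here */
        finish_work(key);
    }
    return NULL;
}

/* Starts eight threads that all want the same key; returns the number of
 * times the object was actually generated. */
static int run_demo(int key)
{
    pthread_t tid[8];
    for (int i = 0; i < 8; i++)
        pthread_create(&tid[i], NULL, worker, &key);
    for (int i = 0; i < 8; i++)
        pthread_join(tid[i], NULL);
    return generated[key];
}
```

The broadcast (rather than a single signal) matters here because waiters may be waiting for different keys, and each one has to re-check its own condition after waking.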

This answer is based on some assumptions about your code.
Consider this scenario. You have 2 threads trying to insert their objects. Thread 1 and thread 2 both get objects with index 0. We then present 2 possible scenarios.
A:
Thread 1 starts, grabs the mutex and proceeds to insert their object. They finish, letting the next thread through from the mutex, which is 2. Thread 1 tries to get the mutex again to release the index but is blocked as thread 2 has it. Thread 2 tries to insert their object but fails due to the index being taken, so the insert never happens. It releases the mutex and thread 1 can grab it, releasing the index. However thread 2 has already attempted and failed to insert the object it had, meaning that we only get 1 insertion in total.
B:
Second scenario. Thread 1 starts, grabs the mutex, inserts the object, releases the mutex. Before thread 2 grabs it, thread 1 again grabs it, clearing the index and releases the mutex again. Thread 2 then successfully grabs the mutex, inserts the object it had before releasing the mutex. In this scenario we get 2 inserts.
In the end, the issue is that nothing happens inside the if statement when a thread fails to insert an object, so the thread never does what it is meant to do. That is why you get fewer insertions than expected.

What's the best way to asynchronously return a result (as a struct) that hasn't been fully "set up" (or processed) yet

Alright, I honestly have tried looking up "Asynchronous Functions in C" (Results are for C# exclusively), but I get nothing for C. So I'm going to ask it here, but if there are better, already asked questions on StackExchange or what-have-you, please direct me to them.
So I'm teaching myself about concurrency and asynchronous functions and all that, so I'm attempting to create my own thread pool. So far, I'm still in the planning phase of it, and I'm trying to find a clear path to travel on, however I don't want a hand-out of code, I just want a nudge in the right direction (or else the exercise is pointless).
What would be the best way to asynchronously return from a function that isn't really "ready"? In that, it will return almost immediately, even if it's currently processing the task given by the user. The "task" is going to be a callback and arguments to fit the necessary pthread_t arguments needed, although I'll work on attributes later. The function returns a struct called "Result", which contains the void * return value and a byte (unsigned char) called "ready" which will hold values 0 and 1. So while "Result" is not "ready", then the user shouldn't attempt to process the item yet. Then again, the "item" can be NULL if the user returns NULL, but "ready" lets the user know it finished.
struct Result {
    /// Determines whether or not it has been processed.
    unsigned char ready;
    /// The return type, NULL until ready.
    void *item;
};
The struct isn't really complete, but it's a basic prototype embodying what I'm attempting to do. This isn't really the issue here though, although let me know if it's the wrong approach.
Next I have to actually process the thing, while not blocking until everything is finished. As I said, the function will create the Result, then asynchronously process it and return immediately (by returning this result). The problem is asynchronously processing. I was thinking of spawning another thread inside of the thread_pool, but I feel it's missing the point of a thread pool as it no longer remains simple.
Here's what I was thinking (which I've a feeling is grossly over-complicated). In the function add_task, spawn a new thread (Thread A) with a passed sub_process struct then return the non-processed but initialized result. In the spawned thread, it will also spawn another thread (see the problem? This is Thread B) with the original callback and arguments, join Thread A with Thread B to capture its return value, which is then stored in the result's item member. Since the result will be pointing to the very same struct the user holds, it shouldn't be a problem.
My problem is that it spawns 2 threads instead of doing the work in 1, so I'm wondering if I'm doing this wrong and complicating things. Is there a better way to do this? Does the pthread library have a function that will do this asynchronously for me? Anyway, the prototype Sub_Process struct is below.
/// Makes it easier than having to retype everything.
typedef void *(*thread_callback)(void *args);
struct Sub_Process {
    /// Result to be processed.
    struct Result *result;
    /// Thread callback to be processed
    thread_callback cb;
    /// Arguments to be passed to the callback
    void *args;
};
Am I doing it wrong? I've a feeling I'm missing the whole point of a Thread_Pool. Another question is, is there a way to spawn a thread that is created, but waiting and not doing anything? I was thinking of handling this by creating all of the threads by having them just wait in a processing function until called, but I've a feeling this is the wrong way to go about this.
To further elaborate, I'll also post some pseudocode of what I'm attempting here
Notes: Was recommended I post this question here for an answer, so it's been copy and pasted, lemme know if there is any faulty editing.
Edit: No longer spawns another thread, instead calls callback directly, so the extra overhead of another thread shouldn't be a problem.

I presume your intention is that a thread will request the asynchronous work to be performed, then go on to perform some different work itself until the point where it requires the result of the asynchronous operation in order to proceed.
In this case, you need a way for the requesting thread to stop and wait for the Result to be ready. You can do this by embedding a mutex and condition variable pair inside the Result:
struct Result {
    /// Lock to protect contents of `Result`
    pthread_mutex_t lock;
    /// Condition variable to signal result being ready
    pthread_cond_t cond;
    /// Determines whether or not it has been processed.
    unsigned char ready;
    /// The return type, NULL until ready.
    void *item;
};
When the requesting thread reaches the point that it requires the asynchronous result, it uses the condition variable:
pthread_mutex_lock(&result->lock);
while (!result->ready)
    pthread_cond_wait(&result->cond, &result->lock);
pthread_mutex_unlock(&result->lock);
You can wrap this inside a function that waits for the result to be available, destroys the mutex and condition variable, frees the Result structure and returns the return value.
The corresponding code in the thread pool thread when the processing is finished would be:
pthread_mutex_lock(&result->lock);
result->item = item;
result->ready = 1;
pthread_cond_signal(&result->cond);
pthread_mutex_unlock(&result->lock);
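Putting the two fragments together, a self-contained sketch might look like this. The helper names `result_new`, `result_complete` and `result_wait_and_free`, as well as the demo worker, are assumptions made for illustration; a single short-lived thread stands in for a pool thread.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

struct Result {
    pthread_mutex_t lock;   /* protects ready and item */
    pthread_cond_t cond;    /* signalled when ready becomes 1 */
    unsigned char ready;
    void *item;
};

/* Hypothetical helper: allocate a Result that is not ready yet. */
static struct Result *result_new(void)
{
    struct Result *r = malloc(sizeof *r);
    if (r == NULL)
        return NULL;
    pthread_mutex_init(&r->lock, NULL);
    pthread_cond_init(&r->cond, NULL);
    r->ready = 0;
    r->item = NULL;
    return r;
}

/* Called by the pool thread once the callback has returned. */
static void result_complete(struct Result *r, void *item)
{
    pthread_mutex_lock(&r->lock);
    r->item = item;
    r->ready = 1;
    pthread_cond_signal(&r->cond);
    pthread_mutex_unlock(&r->lock);
}

/* Called by the requesting thread: block until ready, clean up, return item. */
static void *result_wait_and_free(struct Result *r)
{
    pthread_mutex_lock(&r->lock);
    while (!r->ready)
        pthread_cond_wait(&r->cond, &r->lock);
    void *item = r->item;
    pthread_mutex_unlock(&r->lock);

    pthread_mutex_destroy(&r->lock);
    pthread_cond_destroy(&r->cond);
    free(r);
    return item;
}

static int answer = 42;     /* dummy return value for the demo */

static void *demo_worker(void *arg)
{
    result_complete(arg, &answer);
    return NULL;
}

static int run_result_demo(void)
{
    struct Result *r = result_new();
    if (r == NULL)
        return -1;
    pthread_t tid;
    if (pthread_create(&tid, NULL, demo_worker, r) != 0)
        return -1;
    int *p = result_wait_and_free(r);   /* blocks until the worker is done */
    pthread_join(tid, NULL);
    return *p;
}
```

In this sketch the requesting thread owns the Result and is the only one allowed to free it, so a Result must be waited on exactly once.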
Another question is, is there a way to spawn a thread that is created,
but waiting and not doing anything? I was thinking of handling this by
creating all of the threads by having them just wait in a processing
function until called, but I've a feeling this is the wrong way to go
about this.
No, you're on the right track here. The mechanism to have the thread pool threads wait around for some work to be available is the same as the above - condition variables.
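As a rough sketch of that idea, pool threads can sleep on a condition variable until a task is queued. Everything named here (`add_task`, `pool_worker`, `pool_shutdown`, the fixed `QUEUE_CAP`, the counting demo task) is an assumption for illustration, and the queue is a simple fixed-size ring buffer rather than anything a production pool would necessarily use.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

typedef void *(*thread_callback)(void *args);
struct task { thread_callback cb; void *args; };

#define QUEUE_CAP   16
#define NUM_WORKERS 2

static pthread_mutex_t q_lock     = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  q_nonempty = PTHREAD_COND_INITIALIZER;
static struct task queue[QUEUE_CAP];     /* ring buffer of pending tasks */
static size_t q_head, q_count;
static bool shutting_down;

/* Queue a task; returns false if the queue is full. */
static bool add_task(thread_callback cb, void *args)
{
    bool ok = false;
    pthread_mutex_lock(&q_lock);
    if (q_count < QUEUE_CAP) {
        queue[(q_head + q_count) % QUEUE_CAP] = (struct task){ cb, args };
        q_count++;
        ok = true;
        pthread_cond_signal(&q_nonempty); /* wake one idle worker */
    }
    pthread_mutex_unlock(&q_lock);
    return ok;
}

/* Each pool thread idles in pthread_cond_wait until work arrives. */
static void *pool_worker(void *unused)
{
    (void)unused;
    for (;;) {
        pthread_mutex_lock(&q_lock);
        while (q_count == 0 && !shutting_down)
            pthread_cond_wait(&q_nonempty, &q_lock);
        if (q_count == 0) {               /* shutting down, queue drained */
            pthread_mutex_unlock(&q_lock);
            return NULL;
        }
        struct task t = queue[q_head];
        q_head = (q_head + 1) % QUEUE_CAP;
        q_count--;
        pthread_mutex_unlock(&q_lock);
        t.cb(t.args);                     /* run the task outside the lock */
    }
}

static void pool_shutdown(void)
{
    pthread_mutex_lock(&q_lock);
    shutting_down = true;
    pthread_cond_broadcast(&q_nonempty);  /* wake every idle worker */
    pthread_mutex_unlock(&q_lock);
}

/* Demo task: count how many tasks were executed. */
static pthread_mutex_t count_lock = PTHREAD_MUTEX_INITIALIZER;
static int tasks_run;

static void *count_task(void *args)
{
    (void)args;
    pthread_mutex_lock(&count_lock);
    tasks_run++;
    pthread_mutex_unlock(&count_lock);
    return NULL;
}

/* Runs ntasks demo tasks; returns how many ran, or -1 on a mismatch. */
static int run_pool_demo(int ntasks)
{
    pthread_t workers[NUM_WORKERS];
    int accepted = 0;
    for (int i = 0; i < NUM_WORKERS; i++)
        pthread_create(&workers[i], NULL, pool_worker, NULL);
    for (int i = 0; i < ntasks; i++)
        if (add_task(count_task, NULL))
            accepted++;
    pool_shutdown();                      /* workers drain the queue, then exit */
    for (int i = 0; i < NUM_WORKERS; i++)
        pthread_join(workers[i], NULL);
    return accepted == tasks_run ? tasks_run : -1;
}
```

The workers are created once and spend their idle time blocked in `pthread_cond_wait`, which costs no CPU, so no extra thread has to be spawned per task.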

Advantages of a separate thread in C program

I have a capture program which, in addition to capturing data and writing it into a file, also prints some statistics. The function that prints the statistics
static void report(void)
{
    /*Print statistics*/
}
is called roughly every second using an ALARM that expires every second. So the program looks like
void capture_program()
{
    while(1)
    {
        /*Main loop*/
        if(doreport)
            report();
    }
}
The expiry of the timer sets the doreport flag. If this flag is set, report() is called, which clears the flag.
Now my question is
I am planning to move the reporting function to a separate thread. The main motivation to
do this is that the reporting function executes some code that is under a lock. Now if another process is holding the lock, then this will block, causing the capture process to drop packets. So I think it might be a better idea to move the reporting to a thread.
2) If I were to implement the reporting in a separate thread, should I still have to use
timers inside the thread to do reporting every second?
OR
Is there a better way to do that by making the thread wake up at 1-second intervals?

What are the advantages in moving the reporting function to a separate thread?
If your reporting function is trivial, for example if you just need to print something, I don't think a separate thread will help a lot.
If I were to implementing the reporting in a separate thread, should I still have to use timers inside the thread to do reporting every second?
You don't need timers; the thread can simply sleep for one second between reports, like this:
static void report(void)
{
    while (1) {
        /*Print statistics*/
        sleep(1);
    }
}
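If the reporting does move into its own thread, the sleeping loop could be started and stopped roughly as follows. This is a sketch under assumptions: the interval is passed in milliseconds (so it can be shortened for testing, where the question would use 1000), `nanosleep` replaces `sleep`, and `keep_running`, `report_once`, `report_thread` and `run_report_demo` are hypothetical names. The demo's polling loop only stands in for the capture loop.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <time.h>

static atomic_int keep_running = 1;   /* cleared to ask the thread to stop */
static atomic_int reports_done;       /* number of reports produced so far */

struct report_args { long interval_ms; };

static void report_once(void)
{
    /* Print statistics would go here. */
    atomic_fetch_add(&reports_done, 1);
}

/* Thread entry point: report, sleep for the interval, repeat. */
static void *report_thread(void *arg)
{
    const struct report_args *cfg = arg;
    struct timespec delay = {
        .tv_sec  = cfg->interval_ms / 1000,
        .tv_nsec = (cfg->interval_ms % 1000) * 1000000L
    };
    while (atomic_load(&keep_running)) {
        report_once();
        nanosleep(&delay, NULL);
    }
    return NULL;
}

/* Starts the reporter, waits until at least min_reports were produced,
 * stops it, and returns the total number of reports. */
static int run_report_demo(long interval_ms, int min_reports)
{
    struct report_args cfg = { interval_ms };
    pthread_t tid;
    if (pthread_create(&tid, NULL, report_thread, &cfg) != 0)
        return -1;

    struct timespec tick = { .tv_sec = 0, .tv_nsec = 1000000L };
    while (atomic_load(&reports_done) < min_reports)
        nanosleep(&tick, NULL);       /* stand-in for the capture loop */

    atomic_store(&keep_running, 0);
    pthread_join(tid, NULL);
    return atomic_load(&reports_done);
}
```

Because the reporter now blocks on its own, a lock held by another process delays only the statistics output and not the capture loop. The period drifts slightly, since each cycle lasts the interval plus the time spent reporting.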

Queue of variable length array or struct

How would one go about creating a queue that can hold an array, or rather an array with a variable number of rows?
char data[n][2][50];
//Could be any non 0 n e.g:
n=1; data = {{"status","ok"}};
// or
n=3; {{"lat", "180.00"},{"long","90.123"},{"status","ok"}};
// and so on
n to be added to the queue. Or is there even a better solution than what I'm asking? A queue is easy enough to write (or find re-usable examples of) for single data items but I'm not sure what method I would use for the above. Maybe a struct? That would solve for array and n...but would it solve for variable array?
More broadly, the problem I'm trying to solve is this.
I need to communicate with a web server using POST. I have the code for this already written; however, I don't want to keep the main thread busy every time this task needs doing, especially since I need to make other checks, such as whether the connection is up. If it isn't, I need to back off and wait, or try to bring it back online.
My idea was to have a single separate thread dedicated to this task. I figured creating a queue would be the best way for the main thread to let the child thread know what to do.
The data will be a variable number of string pairs. like:
Main
//Currently does
char data[MAX_MESSAGES_PER_POST][2][50];
...
assembles array
sendFunction(ptrToArray, n);
resumes execution with large and unpredictable delay
//Hopefully will do
...
queue(what needs doing)
carry on executing with (almost) no delay
Child
while(1)
{
    if(allOtherConditionsMet()) //Device online and so forth
    {
        if(!empty(myQueue))
        {
            //Do work and dequeue
        }
    }
    else
    {
        //Try and make condition ok. Like reconnect dongle.
    }
    // sleep/Back off for a while
}

You could use an existing library, like GLib, which is cross-platform. If you used GLib's asynchronous queues, you'd do something like:
The first thread to create the queue executes:
GAsyncQueue *q = g_async_queue_new ();
Other threads can reference (show intent to use the queue) with:
g_async_queue_ref (q);
After this, any thread can 'push' items to the queue with:
struct queue_item i; /* must stay valid until a consumer has popped it */
g_async_queue_push (q, ( (gpointer) (&i)));
And any thread can 'pop' items from the queue with:
struct queue_item *d = g_async_queue_pop (q);
/* Blocks until item is available. */
Once a thread finishes using the queue and doesn't care any more about it, it calls:
g_async_queue_unref (q);
Even the thread which created the queue needs to do this.
There are a bunch of other useful functions, which you can all read about on the page documenting them. Synchronization (locking/consistency/atomicity of operations) is taken care of by the library itself.
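Since such a queue only carries a pointer, the variable number of string pairs from the question has to live in memory that outlives the pushing function. One possible shape for a queue item, sketched here without GLib, is a heap-allocated struct with a flexible array member. The names `post_message`, `message_new` and `FIELD_LEN` are hypothetical; the 50-byte field size is taken from the question.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define FIELD_LEN 50                     /* field size used in the question */

struct post_message {
    size_t n;                            /* number of key/value pairs */
    char pairs[][2][FIELD_LEN];          /* flexible array member: n rows */
};

/* Allocate one message and copy n key/value pairs into it.
 * Strings longer than FIELD_LEN - 1 characters are truncated.
 * The consumer that pops the message is expected to free() it. */
static struct post_message *message_new(size_t n, const char *kv[][2])
{
    assert(n > 0);                       /* the question requires non-zero n */
    struct post_message *m = malloc(sizeof *m + n * sizeof m->pairs[0]);
    if (m == NULL)
        return NULL;
    m->n = n;
    for (size_t i = 0; i < n; i++)
        for (int j = 0; j < 2; j++)
            snprintf(m->pairs[i][j], FIELD_LEN, "%s", kv[i][j]);
    return m;
}
```

The main thread would build one message per POST and push the returned pointer. The child thread pops it, reads `m->n` and `m->pairs`, sends the request, and frees the message with a single `free(m)`.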
