Multithreading in C with CLion (Windows) - c

I have a section of code which is really time-consuming (~2s), and can be executed without the last one have finished.
for (x = last; x <= end && !found; x++) {
found = combine(combinaciones, x, end, &current);
}
Note that combine() it's the function I was talking about. But before the loop starts again (this for it's inside another loop) all the functions must have finished.
Also, combine() receives some arguments, combinaciones it's a pointer, and current an integer (by reference). The integer gets modified over time (and must get modified on the other active functions) and combinaciones it's a linked list (size of which is current-1). The 2nd and 3rd argument are just integers. The return it's a boolean that terminates the loop, but I don't matter if they make some extra operations.
I stress the importance of current. If current is equal to 5, combinaciones's 5th element will be the created, and then current will be 6. I mean that if two functions are allocating memory they souldn't allocate to the same position.
If that can't be done I can make the operations in parallel, and then allocate the memory on the main().
So, I started searching about multithreading and I found <pthread.h>, but it only worked on Linux (not on CLion). I found something caled fork, I don't know either if can be used. Would you tell me a Windows multithreading API, with an example, and the CLion implementation?
The code is huge and I don't think that would help a lot, with I have said should be enought. However, if something is unclear let me know. The code itself it's published on GitHub, but the (few) comments it has are in Spanish.
Thanks for your help!

Using Wsl (as #Singh said) I could use pthread.h. Then I just adapted my code. A very annoying bug that I had was that "IDs" (current variable) were assigned non-continuously (the first thread had "3", and the second one had "2"). I was able to fix it by using a global variable (pthread_mutex_t) and pthread_mutex_lock(), pthread_mutex_unlock(). For more information you can still check my code on GitHub.

Related

Backtracking algorithm crashes with hard cases (C)

I am actually making an algorithm that takes as an input a file containing tetriminoses (figures from tetris) and arranges them in smallest possible square.
Still I encounter a weird problem :
The algorithm works for less than 10 tetriminoses (every time) but start crashing with 11, 12... (I concluded that it depends on how complicated the solution is, as it finds some 14 and 15 solutions).
But the thing is, if I add an optimisation flag like -Ofast (the program is written in C) it works for every input I give him no matter how much time it takes (sometimes more than a hour..).
First I had a lot of leaks (I was using double linked list) so I changed for an Int Array, no more leaks, but same problem.
I tried using the debugger but it makes no sense (see picture) :
Debugger says my variables do not exist anymore but all I do is increment or decrement them.
For this example it just stopped while everything is fine (values of variables are correct)
Here is the link of my main function (the one that do the backtracking):
https://github.com/Caribou123/fillitAG/blob/master/19canplace/solve.c
The rest of program (same repository) consist of functions to put tetriminoses in my array, remove them from it, or print the result.
Basically I try placing a tetri, if I have enough space, I place the next one, otherwise I remove the last one and place it to the next available position etc..
Also I first thought that I were trying to place something outside of the Array, so now my Array is way bigger than it should be and filled with -1 for invalid cases (so in the worst case I just rewrite a -1), 0 for free ones, and integers values from 1 to 26 for figures.
The fact that the program works with the flag -Ofast really troubles me, as the algorithm seems to work perfectly, what could cause my program to crash ?
Here is how I tracked the number of recursions, by adding two static variables
And here is the output
(In case you want to test it yourself, use the 19canplace folder, and compile with : gcc *.c libft/libft.a)
Thanks in advance for your time,
Artiom

Two approaches to writing functions

I am asking this question in the context of the C language, though it applies really to any language supporting pointers or pass-by-reference functionality.
I come from a Java background, but have written enough low-level code (C and C++) to have observed this interesting phenomenon. Supposing we have some object X (not using "object" here in the strictest OOP sense of the word) that we want to fill with information by way of some other function, it seems there are two approaches to doing so:
Returning an instance of that object's type and assigning it, e.g. if X has type T, then we would have:
T func(){...}
X = func();
Passing in a pointer / reference to the object and modifying it inside the function, and returning either void or some other value (in C, for instance, a lot of functions return an int corresponding to the success/failure of the operation). An example of this here is:
int func(T* x){...x = 1;...}
func(&X);
My question is: in what situations makes one method better than the other? Are they equivalent approaches to accomplishing the same outcome? What are the restrictions of each?
Thanks!
There is a reason that you should always consider using the second method, rather than the first. If you look at the return values for the entirety of the C standard library, you'll notice that there's almost always an element of error handling involved in them. For example, you have to check the return value of the following functions before you assume they've succeeded:
calloc, malloc and realloc
getchar
fopen
scanf and family
strtok
There are other non-standard functions that follow this pattern:
pthread_create, etc.
socket, connect, etc.
open, read, write, etc.
Generally speaking, a return value conveys a number of items successfully read/written/converted or a flat-out boolean success/fail value, and in practice you'll almost always need such a return value, unless you're going to exit(EXIT_FAILURE); at any errors (in which case I would rather not use your modules, because they give me no opportunity to clean up within my own code).
There are functions that don't use this pattern in the standard C library, because they use no resources (e.g. allocations or files) and so there's no chance of any error. If your function is a basic translation function (e.g. like toupper, tolower and friends which translate single character values), for example, then you don't need a return value for error handling because there are no errors. I think you'll find this scenario quite rare indeed, but if that is your scenario, by all means use the first option!
In summary, you should always highly consider using option 2, reserving the return value for a similar use, for the sake of consistent with the rest of the world, and because you might later decide that you need the return value for communicating errors or number of items processed.
Method (1) passes the object by value, which requires that the object be copied. It's copied when you pass it in and copied again when it's returned. Method (2) passes only a pointer. When you're passing a primitive, (1) is just fine, but when you're passing an object, a struct, or an array, that's just wasted space and time.
In Java and many other languages, objects are always passed by reference. Behind the scenes, only a pointer is copied. This means that even though the syntax looks like (1), it actually works like (2).
I think I got you.
These to approach are very different.
The question you have to ask your self when ever you trying to decide which approach to take is :
Which class would have the responsibility?
In case you passing the reference to the object you are decapul the creation of the object to the caller and creating this functionality to be more serviceability and you would be able to create a util class that all of the functions inside will be stateless, they are getting object manipulate the input and returning it.
The other approach is more likely and API, you are requesting an opperation.
For an example, you are getting array of bytes and you would like to convert it to string, you would probably would chose the first approch.
And if you would like to do some opperation in DB you would chose the second one.
When ever you will have more than 1 function from the first approch that cover the same area you would encapsulate it into a util class, same applay to the second, you will encapsulate it into an API.
In method 2, we call x an output parameter. This is actually a very common design utilized in a lot of places...think some of the various built-in C functions that populate a text buffer, like snprintf.
This has the benefit of being fairly space-efficient, since you won't be copying structs/arrays/data onto the stack and returning brand new instances.
A really, really convenient quality of method 2 is that you can essentially have any number of "return values." You "return" data through the output parameters, but you can also return a success/error indicator from the function.
A good example of method 2 being used effectively is in the built-in C function strtol. This function converts a string to a long (basically, parses a number from a string). One of the parameters is a char **. When calling the function, you declare char * endptr locally, and pass in &endptr.
The function will return either:
the converted value if it was successful,
0 if it failed, or
LONG_MIN or LONG_MAX if it was out of range
as well as set the endptr to point to the first non-digit it found.
This is great for error reporting if your program depends on user input, because you can check for failure in so many ways and report different errors for each.
If endptr isn't null after the call to strtol, then you know precisely that the user entered a non-integer, and you can print straight away the character that the conversion failed on if you'd like.
Like Thom points out, Java makes implementing method 2 simpler by simulating pass-by-reference behavior, which is just pointers behind the scenes without the pointer syntax in the source code.
To answer your question: I think C lends itself well to the second method. Functions like realloc are there to give you more space when you need it. However, there isn't much stopping you from using the first method.
Maybe you're trying to implement some kind of immutable object. The first method will be the choice there. But in general, I opt for the second.
(Assuming we are talking about returning only one value from the function.)
In general, the first method is used when type T is relatively small. It is definitely preferable with scalar types. It can be used with larger types. What is considered "small enough" for these purposes depends on the platform and the expected performance impact. (The latter is caused by the fact that the returned object is copied.)
The second method is used when the object is relatively large, since this method does not perform any copying. And with non-copyable types, like arrays, you have no choice but to use the second method.
Of course, when performance is not an issue, the first method can be easily used to return large objects.
An interesting matter is optimization opportunities available to C compiler. In C++ language compilers are allowed to perform Return Value Optimizations (RVO, NRVO), which effectively turn the first method into the second one "under the hood" in situations when the second method offers better performance. To facilitate such optimizations C++ language relaxes some address-identity requirements imposed on the involved objects. AFAIK, C does not offer such relaxations, thus preventing (or at least impeding) any attempts at RVO/NRVO.
Short answer: take 2 if you don't have a necessary reason to take 1.
Long answer: In the world of C++ and its derived languages, Java, C#, exceptions help a lot. In C world, there is not very much you can do. Following is an sample API I take from CUDA library, which is a library I like and consider well designed:
cudaError_t cudaMalloc (void **devPtr, size_t size);
compare this API with malloc:
void *malloc(size_t size);
in old C interfaces, there are many such examples:
int open(const char *pathname, int flags);
FILE *fopen(const char *path, const char *mode);
I would argue to the end of the world, the interface CUDA is providing is much obvious and lead to proper result.
There are other set of interfaces that the valid return value space actually overlaps with the error code, so the designers of those interfaces scratched their heads and come up with not brilliant at all ideas, say:
ssize_t read(int fd, void *buf, size_t count);
a daily function like reading a file content is restricted by the definition of ssize_t. since the return value has to encode error code too, it has to provide negative number. in a 32bit system, the max of ssize_t is 2G, which is very much limited the number of bytes you can read from your file.
If your error designator is encoded inside of the function return value, I bet 10/10 programmers won't try to check it, though they really know they should; they just don't, or don't remember, because the form is not obvious.
And another reason, is human beings are very lazy and not good at dealing if's. The documentation of these functions will describe that:
if return value is NULL then ... blah.
if return value is 0 then ... blah.
yak.
In the first form, things changes. How do you judge if the value has been returned? No NULL or 0 any more. You have to use SUCCESS, FAILURE1, FAILURE2, or something similar. This interface forces users to code more safer and makes the code much robust.
With these macro, or enum, it's much easier for programmers to learn about the effect of the API and the cause of different exceptions too. With all these advantages, there actually is no extra runtime overhead for it too.
I will try to explain :)
Let say you have to load a giant rocket into semi,
Method 1)
Truck driver places a truck on a parking lot, and goes on to find a hookers, you are stack with putting the load onto forklift or some kind of trailer to bring it to the track.
Method 2)
Truck driver forgets hooker and backs truck up right to the rocket, then you need just to push it in.
That is the difference between those two :). What it boils down to in programming is:
Method 1)
Caller function reserves and address for called function to return its return value to, but how is calling function going to get that value does not matter, will it have to reserve another address or not does not matter, I need something returned, it is your job to get it to me :). So called function goes and reserves the address for its calculations and than stores the value in address then returns value to caller. So caller goes and say oh thank you let me just copy it to the address I reserved earlier.
Method 2)
Caller function says "Hey I will help you, I will give you the address that I have reserved, store what ever calculations you do in it", this way you save not only memory but you save in time.
And I think second is better, and here is why:
So let say that you have struct with 1000 ints inside of it, method 1 would be pointless, it will have to reserve 2*100*32 bits of memory, which is 6400 plus you have to copy it to first location than copy it to second one. So if each copy takes 1 millisecond you will need to way 6.4 seconds to store and copy variables. Where if you have address you only have to store it once.
They are equivalent to me but not in the implementation.
#include <stdio.h>
#include <stdlib.h>
int func(int a,int b){
return a+b;
}
int funn(int *x){
*x=1;
return 777;
}
int main(void){
int sx,*dx;
/* case static' */
sx=func(4,6); /* looks legit */
funn(&sx); /* looks wrong in this case */
/* case dynamic' */
dx=malloc(sizeof(int));
if(dx){
*dx=func(4,6); /* looks wrong in this case */
sx=funn(dx); /* looks legit */
free(dx);
}
return 0;
}
In a static' approach it is more comfortable to me doing your first method. Because I don't want to mess with the dynamic part (with legit pointers).
But in a dynamic' approach I'll use your second method. Because it is made for it.
So they are equivalent but not the same, the second approach is clearly made for pointers and so for the dynamic part.
And so far more clear ->
int main(void){
int sx,*dx;
sx=func(4,6);
dx=malloc(sizeof(int));
if(dx){
sx=funn(dx);
free(dx);
}
return 0;
}
than ->
int main(void){
int sx,*dx;
funn(&sx);
dx=malloc(sizeof(int));
if(dx){
*dx=func(4,6);
free(dx);
}
return 0;
}

Value in C variable disappears

I have a C program that is giving me trouble. It is a plugin for the X-Plane flight simulator. You can view the whole code here. The basic function is pulling some information from curl, running and recording information about a flight, and then finally compiling a report of the flight.
The problem is that there is one variable that is misbehaving. I declare it at the beginning of the code because it is needed across multiple functions.
char fltno[9];
The content is set using a plugin function to get the value from a text box. It takes the value from the FltNoText widget, 8 characters long, and assigns it to the fltno variable.
XPGetWidgetDescriptor( FltNoText, fltno, 8);
I build some messages using the following method that include the fltno variable.
messg = malloc(snprintf(NULL, 0, "Message stuff %s", fltno) + 1);
sprintf(messg, "Message stuff %s", fltno);
This works just fine throughout the running of the program. Right up until the last time it is needed. This is the section that begins with:
if (inParam1 == (long)SendButton)
This will run at the end. When running this section, the fltno variable returns no text. There are many other variables which I am using in the same way, and they all seem to work fine. I ran the program over a short flight and it worked fine, but the two times I have run it on a longer flight, the variable has returned blank in that section.
Let me know if I need to explain more. You can probably tell I haven't written much C so suggestions are appreciated.
More Info
It was suggested that the adjacently decalared variables could be overflowing into fltno.
The Ppass[64] variable is copied from Ppass_def, which is set in the code and is definitely smaller than 64.
strcpy(Ppass, Ppass_def);
The tailnum[41] variable is read from a plugin function.
XPLMGetDatab( tailnum_ref, tailnum, 0, 40 );
Both of these variables are set at the beginning of the program, so it doesn't make sense that fltno only misbehaves at the very end.
Update 1
Thanks for the comments. As suggested by rodrigo, I replaced the longest malloc/sprintf instance with calls to some functions. Not sure if what I did is what you had in mind, but it seems to work at least as well as it did before. You can see the new code on the Github link.
I also did more testing and narrowed down a bit where the problem could be. The fltno variable is fine when run the last time at line 1193, but by the end of the case at line 1278, it is blank. I will try to do more testing later to narrow it down further.
Update 2
After more testing I narrowed it down to line 1292. Before that line the fltno is fine, after that it becomes blank.
strcpy(INlon, dlon);
The dlon variable is declared within the if statement where the switch is.
char dlon[12];
This variable is set every time that section runs, so it's strange that there are no problems until the end. It's set from a function that takes the decimal degrees latitude or longitude and returns a string like "N33 45.9622".
strcpy(dlon, degdm(lon, 1));
The INlon variable is declared at the beginning of the program, and this is the only time it is set.
char INlon[12];
Any ideas about why this part messes up the fltno variable?
Update 3
Thanks for the suggestions about strncpy and an alternative to the double sprintf calls. I changed a few other things and it seems to work now, I will do a full test later to be sure.
A key part that I really should have caught before now is that the latitude/longitude strings were only 12 long, which is too short. When the longitude is over 100 degrees, the string could be "W100 22.5678", which is 12 characters. This caused the end NULL to be cut off and was the source of some of my problems.
char INlat[13];
I noticed this when I was getting something like "W100 22.5678ABC1234" from those variables, where "ABC1234" is the fltno variable.
I used a pointer for dlat to avoid size problems, and use strncpy to make sure to not spill over to other variables.
strncpy(INlat, dlat, sizeof(INlat));
Finally, I found the asprintf function to replace the double instances of sprintf. I understand this won't work on all systems but it's a simple fix for now.
char * purl = NULL;
asprintf(&purl, "DATA1=%s&DATA2=%s", DATA1v1, DATA2);
So thanks for your help in fixing my code (assuming that it's working now). I feel like I knew enough to be able to fix it (once I was pointed in the right direction) but I'm still not sure exactly what was going on. If anyone wants to post an explanation of what was going wrong (as far as you can tell) I would be happy to accept an answer.

evalWithTimeout ignored when calling C / Fortran routines?

I'm working with "igraph" package, and the "evalWithTimeout" function in "R.utils".
I'm trying to do maximal clique detection, which I know it can get terrible (as terrible O(3^n) being n the number of nodes) so I encapsulated in a timeOut, but it gets ignored.
Minimal code to reproduce the problem
library(igraph)
library(R.utils)
g<-erdos.renyi.game(1e6,1e7,type="gnm")
o<-evalWithTimeout(maximal.cliques(g),timeout=1)
This should stop after one second. However it doesn't. I wonder if this is due to the use of underlying C / Fortran code (which is what maximal.cliques does). If so, how can i solve this?
This won't work with most C code, because R cannot interrupt C code, unless the C code cooperates. evalWithTimeout calls setTimeLimit, and this is from the manual page from setTimeLimit:
Time limits are checked whenever a user interrupt could occur.
This will happen frequently in R code and during Sys.sleep, but
only at points in compiled C and Fortran code identified by the
code author.
It is not trivial to make C code interruptible, because you need to deallocate all allocated memory.
I suggest to report a bug at https://github.com/igraph/igraph/issues and request to make maximal.cliques interruptible.

To find all possible permutations of a given string

This is just to work out a problem which looks pretty interesting. I tried to think over it, but couldn't find the way to solve this, in efficient time. May be my concepts are still building up... anyways the question is as follows..
Wanted to find out all possible permutation of a given string....... Also, share if there could be any possible variations to this problem.
I found out a solution on net, that uses recursion.. but that doesn't satisfies as it looks bit erroneous.
the program is as follows:-
void permute(char s[], int d)
{
int i;
if(d == strlen(s))
printf("%s",s);
else
{
for(i=d;i<strlen(s);i++)
{
swap(s[d],s[i]);
permute(s,d+1);
swap(s[d],s[i]);
}
}
}
If this program looks good (it is giving error when i ran it), then please provide a small example to understand this, as i am still developing recursion concepts..
Any other efficient algorithm, if exists, can also be discussed....
And Please,, this is not a HW........
Thanks.............
The code looks correct, though you only have the core of the algorithm, not a complete program. You'll have to provide the missing bits: headers, a main function, and a swap macro (you could make swap a function by calling it as swap(s, d, i)).
To understand the algorithm, it would be instructive to add some tracing output, say printf("permute(%s, %d)", s, d) at the beginning of the permute function, and run the program with a 3- or 4-character string.
The basic principle is that each recursive call to permute successively places each remaining element at position d; the element that was at position d is saved by putting it where the aforementioned remaining element was (i.e. the elements are swapped). For each placement, permute is called recursively to generate all desired substrings after the position d. So the top-level call (d=0) to permute successively tries all elements in position 0, second-level calls (d=1) try all elements in position 1 except for the one that's already in position 0, etc. The next-to-deepest calls (d=n-1) have a single element to try in the last position, and the deepest calls (d=n) print the resulting permutation.
The core algorithm requires Θ(n·n!) running time, which is the best possible since that's the size of the output. However this implementation is less efficient that it could be because it recomputes strlen(s) at every iteration, for a Θ(n²·n!) running time; the simple fix of precomputing the length would yield Θ(n·n!). The implementation requires Θ(n) memory, which is the best possible since that's the size of the input.
For an explanation of the recursion see Gilles answer.
Your code has some problems. First it will be hard to implement the required swap as a function in C, since C lacks the concept of call by reference. You could try to do this with a macro, but then you'd either have to use the exclusive-or trick to swap values in place, or use a temporary variable.
Then your repeated use of strlen on every recursion level blows up your complexity of the program. As you give it this is done at every iteration of every recursion level. Since your string even changes (because of the swaps) the compiler wouldn't even be able to notice that this is always the same. So he wouldn't be able to optimize anything. Searching for the terminating '\0' in your string would dominate all other instructions by far if you implement it like that.

Resources