How can I parallelize multiple for loops in OpenMP?

I am very new to coding and have managed to put this together from different sources. The code runs in C, but I am unable to get it to work with OpenMP. The results are inconsistent: I get the correct key once, and then the next four or five attempts do not produce the correct key. Any help is very welcome.
This is my code.
#include "omp.h"
#include <openssl/conf.h>
#include <openssl/err.h>
#include <openssl/evp.h>
#include <string.h>
#include <time.h>
void handleErrors(void)
{
ERR_print_errors_fp(stderr);
abort();
}
int encrypt(unsigned char *plaintext, int plaintext_len, unsigned char *key,
unsigned char *iv, unsigned char *ciphertext)
{
EVP_CIPHER_CTX *ctx;
int len;
int ciphertext_len;
if (!(ctx = EVP_CIPHER_CTX_new()))
handleErrors();
if (1 != EVP_EncryptInit_ex(ctx, EVP_aes_128_cbc(), NULL, key, iv))
handleErrors();
if (1 != EVP_EncryptUpdate(ctx, ciphertext, &len, plaintext, plaintext_len))
handleErrors();
ciphertext_len = len;
if (1 != EVP_EncryptFinal_ex(ctx, ciphertext + len, &len))
handleErrors();
ciphertext_len += len;
EVP_CIPHER_CTX_free(ctx);
return ciphertext_len;
}
int main(void)
{
int total_thread, thread_id;
double start_time, end_time;
start_time = omp_get_wtime();
printf("Starting of the program, start_time = %f\n", start_time);
/* A 128 bit key */
unsigned char *key = (unsigned char *)"secret##########";
/* A 128 bit IV */
unsigned char *iv = (unsigned char *)"\x01\x02\x03\x04\x05\x06\x07\x08";
/* Message to be encrypted */
unsigned char *plaintext =
(unsigned char *)"This is a really really top secret!";
/* Buffer for ciphertext. Ensure the buffer is long enough for the
ciphertext which may be longer than the plaintext, dependant on the
* algorithm and mode */
unsigned char ciphertext[128];
unsigned char ciphertextnew[128];
#pragma omp parallel
/* Encrypt the plaintext */
encrypt(plaintext, strlen((char *)plaintext), key, iv, ciphertext);
/* Do something useful with the ciphertext here */
printf("Ciphertext is:\n");
BIO_dump_fp(stdout, (const char *)ciphertext, 16);
/*char pbuffer[1024];*/
char password[17] = "################";
char alphabet[] = "ectres";
// int count;
#pragma omp parallel for
for (int s = 0; s < 6; s++)
{
password[0] = alphabet[s];
for (int t = 0; t < 6; t++)
{
password[1] = alphabet[t];
for (int u = 0; u < 6; u++)
{
password[2] = alphabet[u];
for (int v = 0; v < 6; v++)
{
password[3] = alphabet[v];
for (int w = 0; w < 6; w++)
{
password[4] = alphabet[w];
for (int x = 0; x < 6; x++)
{
password[5] = alphabet[x];
encrypt(plaintext, strlen((char *)plaintext),
password, iv, ciphertextnew);
if (strncmp(ciphertext, ciphertextnew, 16) == 0)
{
printf("\n%s", password);
printf(" Here is the correct key!\n\n");
end_time = omp_get_wtime();
total_thread = omp_get_num_threads();
thread_id = omp_get_thread_num();
printf("\nProgram start = %f\n", start_time);
printf("\nProgram end = %f\n", end_time);
printf("\n\nProgram runtime = %f seconds\n\n",
end_time - start_time);
printf("\nTotal number of threads = %d\n",
total_thread);
exit(0);
}
printf("\n%s", password);
}
}
}
}
}
}
return 0;
}
// add padding to key
void pad(char *s, int length)
{
int l;
l = strlen(s); /* its length */
while (l < length)
{
s[l] = '#'; /* insert a space */
l++;
}
s[l] = '\0'; /* strings needs to be terminated in null */
}

As mentioned by @Matthieu Brucher, you have an issue with password being shared.
Another one is the exit(0); statement. You can only parallelize structured-blocks with a single exit point at the bottom (i.e. more or less a statement block without any exit, return, goto...). So the exit statement would not be legit. It seems logical: if a thread hits the exit, what are the other ones supposed to do? How do they know that they have to exit too?
There are however specific directives to cancel a parallel loop, pretty much what the break; statement would do. The omp cancel directive will signal all threads to break from the parallel loop or parallel region. The omp cancellation point is the point where threads will check if a cancellation has been requested. Note that cancellation is disabled by default and only takes effect when the program runs with the environment variable OMP_CANCELLATION set to true.
You'll have to find where the cancellation point should be put: going there too often has a cost in terms of overhead (putting it in the innermost loop may not be efficient), but not often enough means that a thread may keep running for too long before realizing that it should break from its loop (putting it in the outermost loop means that threads will almost never check for cancellation).
char password_found[17] = "";
int flag_found = 0;
// Cancellation only takes effect when the program runs with OMP_CANCELLATION=true
#pragma omp parallel shared(password_found, flag_found)
{
    char password[17] = "################"; // thread-private candidate key
    unsigned char ciphertextnew[128];         // thread-private output buffer (a shared one would be a data race)
    // The s,t,u iterations are distributed among the threads
    #pragma omp for collapse(3)
    for (int s = 0; s < 6; s++)
    for (int t = 0; t < 6; t++)
    for (int u = 0; u < 6; u++)
    {
        password[0] = alphabet[s];
        password[1] = alphabet[t];
        password[2] = alphabet[u];
        // For every s,t,u, a single thread will loop through these
        for (int v = 0; v < 6; v++)
        for (int w = 0; w < 6; w++)
        for (int x = 0; x < 6; x++)
        {
            password[3] = alphabet[v];
            password[4] = alphabet[w];
            password[5] = alphabet[x];
            encrypt(plaintext, strlen((char *)plaintext),
                    (unsigned char *)password, iv, ciphertextnew);
            // memcmp rather than strncmp: ciphertext is binary and may contain zero bytes
            if (memcmp(ciphertext, ciphertextnew, 16) == 0)
            {
                printf("\n%s", password);
                printf(" Here is the correct key!\n\n");
                #pragma omp critical
                {
                    flag_found = 1;
                    strcpy(password_found, password); // Copy thread-private copy to shared variable
                }
                // Now, signal everyone to stop working on the loop
                #pragma omp cancel for
            }
            else
                printf("\n%s is bad", password);
        } // end inner group of loops
        // Threads will check here if they should stop
        #pragma omp cancellation point for
    } // end of the outer group of loops
} // end of parallel region
// Do something now //
if (flag_found){
    printf("\nThe password is %s\n", password_found);
    return 0;
}else{
    printf("Password not found\n");
    return 1;
}
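As a self-contained illustration of the same pattern (a private candidate buffer per thread, a shared result, and loop cancellation), here is a minimal sketch. The names brute_force3 and candidate_matches are hypothetical, and the plain memcmp stands in for the encrypt-and-compare step of the question. It cancels the work-sharing loop with cancel for, and it assumes compilation with OpenMP enabled (for example -fopenmp with GCC) and OMP_CANCELLATION=true at run time; without them it should still return the same result, just without the parallelism or the early exit.

```c
#include <string.h>

/* Hypothetical stand-in for the expensive check (AES encryption in the question). */
static int candidate_matches(const char *candidate, const char *target)
{
    return memcmp(candidate, target, 3) == 0;
}

/* Tries every 3-letter string over `alphabet`. On success copies the match
   (including the terminating '\0') into `found` and returns 1; otherwise returns 0. */
int brute_force3(const char *alphabet, const char *target, char found[4])
{
    int n = (int)strlen(alphabet);
    int flag_found = 0;            /* shared: written only inside the critical section */

    #pragma omp parallel
    {
        char candidate[4] = "###"; /* declared inside the region, so private to each thread */

        /* The (a, b) pairs are distributed among the threads. */
        #pragma omp for collapse(2)
        for (int a = 0; a < n; a++)
            for (int b = 0; b < n; b++)
            {
                candidate[0] = alphabet[a];
                candidate[1] = alphabet[b];
                /* Each thread walks the innermost loop on its own. */
                for (int c = 0; c < n; c++)
                {
                    candidate[2] = alphabet[c];
                    if (candidate_matches(candidate, target))
                    {
                        #pragma omp critical
                        {
                            flag_found = 1;
                            memcpy(found, candidate, 4);
                        }
                        /* Ask all threads to abandon the loop. */
                        #pragma omp cancel for
                    }
                }
                /* Once per (a, b) pair, check whether someone cancelled. */
                #pragma omp cancellation point for
            }
    }
    return flag_found;
}
```

Calling brute_force3("ectres", "sec", out) with a 4-byte out buffer is the intended use. Checking for cancellation once per outer iteration is a compromise between overhead and reaction time, as discussed above.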

Related

How to order split command line arguments in lexicographical order using a function?

I'm working on a program that takes command line arguments and splits them in half and then orders them in lexicographical order.
For example:
hello, world!
would turn into:
he
ld!
llo
wor
I have a main function that reads through the arguments, a function that splits the arguments, and finally a function that is supposed to order the halves in lexicographical order. I can't get this to compile because of argument type errors in the lexicographicalSort function and an incompatible pointer type in main. How exactly would I correct these errors? Also, is there anything here that would cause logical errors? This is what I have so far:
#include<stdio.h>
#include<string.h>
#include<stdlib.h>
int splitString(char arg[], int n)
{
int len = strlen(arg);
int len1 = len/2;
int len2 = len - len1; // Compensate for possible odd length
char *s1 = malloc(len1 + 1); // one for the null terminator
memcpy(s1, arg, len1);
s1[len1] = '\0';
char *s2 = malloc(len2 + 1); // one for the null terminator
memcpy(s2, arg + len1, len2);
s2[len2] = '\0';
printf("%s\n", s1);
printf("%s\n", s2);
free(s1);
free(s2);
return 0;
}
int lexicographicalSort(char *arg[], int n)
{
char temp[50];
for(int i = 0; i < n; ++i)
scanf("%s[^\n]",arg[i]);
for(int i = 0; i < n - 1; ++i)
for(int j = i + 1; j < n ; ++j)
{
if(strcmp(arg[i], arg[j]) > 0)
{
strcpy(temp, arg[i]);
strcpy(arg[i], arg[j]);
strcpy(arg[j], temp);
}
}
for(int i = 0; i < n; ++i)
{
puts(arg[i]);
}
return 0;
}
int main(int argc, char *argv[])
{
if (argc > 1)
{
for (int i = 1; i < argc; i++)
{
int j = 1;
int k = strlen(argv[i]);
splitString(argv[i], j);
lexicographicalSort(argv[i], j);
}
}
}
The basic scheme is simple. Make an array of tuples {start_pointer, length}. Parse the args to split them, filling in the array as appropriate. Then sort with qsort, or any other sort of your choice.
#include <stdio.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>
char *s = "hello, world! . hello.....";
char *pc;
int i, n, nargs;
struct pp{
char *p;
int l;
};
struct pp args[10], hargs[20];
struct pp *pargs;
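int cmp(const void * v0, const void * v1) {
const struct pp *pv0 = v0, *pv1 = v1;
int m = pv0->l < pv1->l ? pv0->l : pv1->l; /* length of the shorter piece */
int r = strncmp(pv0->p, pv1->p, m); /* compare the common prefix first */
return r ? r : pv0->l - pv1->l; /* on a tie, the shorter piece sorts first */
}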
int main(void)
{
for(pc = s, i = 0; *pc; ++i){
sscanf(pc, "%*[^ ]%n", &n);
if(n > 0){
args[i].p = pc;
args[i].l = n;
}
for(pc += n, n = 0; isspace(*pc); ++pc);
}
for(nargs = i, i = 0; i < nargs; ++i)
printf("%d arg is: %.*s\n", i, args[i].l, args[i].p);
putchar('\n');
for(i = 0, pargs = hargs; i < nargs; ++i){
if(args[i].l == 1){
pargs->p = args[i].p;
pargs->l = 1;
pargs = pargs + 1;
}else {
pargs->p = args[i].p;
pargs->l = args[i].l / 2;
pargs = pargs + 1;
pargs->p = args[i].p + args[i].l / 2;
pargs->l = args[i].l - args[i].l / 2;
pargs = pargs + 1;
}
}
putchar('\n');
for(nargs = pargs - hargs, i = 0; i < nargs; ++i)
printf("%d arg is: %.*s\n", i, hargs[i].l, hargs[i].p);
qsort(hargs, nargs, sizeof(struct pp), cmp);
putchar('\n');
for(i = 0; i < nargs; ++i)
printf("%d arg is: %.*s\n", i, hargs[i].l, hargs[i].p);
return 0;
}
https://rextester.com/GSH22767
When splitting a C string, one needs one extra char to store the extra null terminator. There is one answer that bypasses this by storing the length. For completeness, this is closer to your original intention: allocating enough space to copy the programme's arguments. It is probably slower, but one is free to use the strings elsewhere in the programme.
#include <stdlib.h> /* malloc free EXIT qsort */
#include <stdio.h> /* fprintf */
#include <string.h> /* strlen memcpy */
#include <errno.h> /* errno */
static int strcompare(const void *a, const void *b) {
const char *a_str = *(const char *const*)a, *b_str = *(const char *const*)b;
return strcmp(a_str, b_str);
}
int main(int argc, char **argv) {
char *spacev = 0, **listv = 0;
size_t spacec = 0, listc = 0;
int is_done = 0;
do { /* "Try." */
int i;
char *sv;
size_t j;
/* This requires argc > 1. */
if(argc <= 1) { errno = EDOM; break; }
/* Allocate maximum space. */
for(i = 1; i < argc; i++) spacec += strlen(argv[i]) + 2;
if(!(spacev = malloc(spacec)) || !(listv = malloc(argc * 2 * sizeof *listv))) break;
sv = spacev;
/* Copy and split the arguments. */
for(i = 1; i < argc; i++) {
const char *const word = argv[i];
const size_t word_len = strlen(word),
w0_len = word_len / 2, w1_len = word_len - w0_len;
if(w0_len) {
listv[listc++] = sv;
memcpy(sv, word, w0_len);
sv += w0_len;
*(sv++) = '\0';
}
if(w1_len) {
listv[listc++] = sv;
memcpy(sv, word + w0_len, w1_len);
sv += w1_len;
*(sv++) = '\0';
}
}
/* Sort. */
qsort(listv, listc, sizeof *listv, &strcompare);
for(j = 0; j < listc; j++) printf("%s\n", listv[j]);
is_done = 1;
} while(0); if(!is_done) {
perror("split");
} {
free(spacev);
free(listv);
}
return is_done ? EXIT_SUCCESS : EXIT_FAILURE;
}
It is simpler than your original; instead of allocating each string individually, it counts the maximum number of chars needed (each argument's length plus two for two null terminators) and allocates the block all at once (spacev). The pointers to the new list also need allocating; the maximum is 2 * argc pointers. Once one copies and splits the argument list, one has an actual array of strings that one can qsort.

Implementing syllabification algorithm but is really slow

I implemented a simple syllabification algorithm following the Improved Lansky algorithm, but it's really slow when I need to run it on a corpus of over 2 million words. Could someone point me in the direction of what causes it to be so slow? The algorithm is below:
Everything after the last vowel (vowel group) belongs to the last syllable
Everything before the first vowel (vowel group) belongs to the first syllable
If the number of consonants between vowels is even (2n), they are divided into halves: the first half belongs to the left vowel(s) and the second half to the right vowel(s) (n/n).
If the number of consonants between vowels is odd (2n + 1), they are divided into n / n + 1 parts.
If there is only one consonant between vowels, it belongs to the left vowel(s).
#include <stdio.h>
#include <string.h>
#define VOWELS "aeiou"
int get_n_consonant_between(char *word, int length) {
int count = 0;
int i = 0;
while (i++ < length) {
if (strchr(VOWELS, *word)) break;
word++;
count++;
}
return count;
}
void syllabification(char *word, int n_vowel_groups) {
int i = 0, length = strlen(word), consonants;
int syllables = 0, vowel_group = 0, syl_length = 0;
char *syllable = word;
char hola[length];
memset(hola, 0, length);
if (n_vowel_groups < 2) {
printf("CAN'T BE SPLIT INTO SYLLABLES\n\n");
return;
}
while (i < length) {
if (strchr(VOWELS, word[i])) {
syl_length++;
i++;
if (vowel_group) continue;
vowel_group = 1;
}
else {
if (vowel_group) {
consonants = get_n_consonant_between(word + i, length - i);
if (consonants == 1) {
// printf("only one consonant\n");
syl_length++;
strncpy(hola, syllable, syl_length);
i++;
}
else {
int count = consonants / 2;
if ((consonants % 2) == 0) { /* number of consonants is 2n, first half belongs to the left vowel */
syl_length += count;
}
else {
syl_length += count;
}
strncpy(hola, syllable, syl_length);
i += count;
}
syllables++;
if (syllables == n_vowel_groups) {
printf("syllable done %d: %s\n", syllables, syllable);
break;
}
printf("syllable %d: %s\n", syllables, hola);
syllable = word + i;
syl_length = 0;
memset(hola, 0, length);
}
else {
syl_length++;
i++;
}
vowel_group = 0;
}
}
}
int count_vowel_groups(char *word) {
int i, nvowels = 0;
int vowel_group = 0;
for (i = 0; i < strlen(word); i++) {
if (strchr(VOWELS, word[i])) {
if (vowel_group) continue;
vowel_group = 1;
}
else {
if (vowel_group) nvowels++;
vowel_group = 0;
}
}
// printf("%d vowel groups\n", nvowels);
return nvowels;
}
void repl() {
char *line = NULL;
size_t len = 0;
int i = 0;
int count;
FILE *file = fopen("../syllables.txt", "r");
while(i++ < 15) {
getline(&line, &len, file);
printf("\n\n%s\n", line);
count = count_vowel_groups(line);
syllabification(line, count);
}
}
int main(int argc, char *argv[]) {
// printf("Syllabification test:\n");
repl();
}
This is a lot of code to go through to even check if the implementation is
correct, mainly because I don't know the terminology of the algorithm (like
what exactly a vowel group is). I've looked it up, and Google returns a lot of
research papers (of which I can only see the abstracts) on syllabification of
different languages, so I'm not sure if the code is correct at all.
But I have a few suggestions that might make your code faster:
Move all your strlen(word) calls out of the for-loop conditions. Save the length
in a variable and use that variable instead. So from
for (i = 0; i < strlen(word); i++)
to
size_t len = strlen(word);
for(i = 0; i < len; i++)
Don't use strchr for checking if a character is a vowel. I'd use a lookup
table for this:
// as global variable
char vowels[256];
int main(void)
{
vowels['a'] = 1;
vowels['e'] = 1;
vowels['i'] = 1;
vowels['o'] = 1;
vowels['u'] = 1;
...
}
and when you want to check if a character is a vowel:
// 0x20 | c makes c a lower-case character
if(vowels[(unsigned char)(0x20 | word[i])]) {
syl_length++;
i++;
if (vowel_group) continue;
vowel_group = 1;
}
The first suggestion might give you only a small performance increase: compilers are
pretty clever and might optimize that anyway. The second suggestion might give
you more, because it's just a lookup. In the worst case
strchr has to go through the whole "aeiou" array on every call. [1]
I also suggest that you profile your code.
Footnotes
[1] I've made a very crude program that compares the runtimes of the
suggestions. I added a few extra bits of code in the hope that the compiler
doesn't optimize the functions too aggressively.
#include <stdio.h>
#include <string.h>
#include <time.h>
int test1(time_t t)
{
char text[] = "The lazy dog is very lazy";
for(size_t i = 0; i < strlen(text); ++i)
t += text[i];
return t;
}
int test2(time_t t)
{
char text[] = "The lazy dog is very lazy";
size_t len = strlen(text);
for(size_t i = 0; i < len; ++i)
t += text[i];
return t;
}
#define VOWELS "aeiou"
char vowels[256];
int test3(time_t t)
{
char text[] = "The lazy dog is very lazy";
size_t len = strlen(text);
for(size_t i = 0; i < len; ++i)
{
if (strchr(VOWELS, text[i]))
t += text[i];
t += text[i];
}
return t;
}
int test4(time_t t)
{
char text[] = "The lazy dog is very lazy";
size_t len = strlen(text);
for(size_t i = 0; i < len; ++i)
{
if(vowels[0x20 | text[i]])
t += text[i];
t += text[i];
}
return t;
}
int main(void)
{
vowels['a'] = 1;
vowels['e'] = 1;
vowels['i'] = 1;
vowels['o'] = 1;
vowels['u'] = 1;
long times = 50000000;
long tmp = 0;
clock_t t1 = 0, t2 = 0, t3 = 0, t4 = 0;
for(long i = 0; i < times; ++i)
{
clock_t start,end;
time_t t = time(NULL);
start = clock();
tmp += test1(t);
end = clock();
t1 += end - start;
//t1 += ((double) (end - start)) / CLOCKS_PER_SEC;
start = clock();
tmp += test2(t);
end = clock();
t2 += end - start;
start = clock();
tmp += test3(t);
end = clock();
t3 += end - start;
start = clock();
tmp += test4(t);
end = clock();
t4 += end - start;
}
printf("t1: %lf %s\n", ((double) t1) / CLOCKS_PER_SEC, t1 < t2 ? "wins":"loses");
printf("t2: %lf %s\n", ((double) t2) / CLOCKS_PER_SEC, t2 < t1 ? "wins":"loses");
printf("t3: %lf %s\n", ((double) t3) / CLOCKS_PER_SEC, t3 < t4 ? "wins":"loses");
printf("t4: %lf %s\n", ((double) t4) / CLOCKS_PER_SEC, t4 < t3 ? "wins":"loses");
printf("tmp: %ld\n", tmp);
return 0;
}
The results are:
$ gcc b.c -ob -Wall -O0
$ ./b
t1: 10.866770 loses
t2: 7.588057 wins
t3: 10.801546 loses
t4: 8.366050 wins
$ gcc b.c -ob -Wall -O1
$ ./b
t1: 7.409297 loses
t2: 7.082418 wins
t3: 11.415080 loses
t4: 7.847086 wins
$ gcc b.c -ob -Wall -O2
$ ./b
t1: 6.292438 loses
t2: 5.855348 wins
t3: 9.306874 loses
t4: 6.584076 wins
$ gcc b.c -ob -Wall -O3
$ ./b
t1: 6.317390 loses
t2: 5.922087 wins
t3: 9.436450 loses
t4: 6.722685 wins
There are a few things you could do:
1) Profile the program and see where it spends most of its time.
2) Concentrate on the most repetitive parts of the code.
3) Avoid multiple scans.
4) Do not perform unnecessary operations. Examples:
a)
Do you always need to memset hola?
memset(hola, 0, length);
It seems to me that you can get rid of it (for example, by writing a single terminating '\0' after the strncpy instead).
b)
for (i = 0; i < strlen(word); i++) {
No need to calculate strlen(word) inside the loop.
You can do it outside:
int len = strlen(word);
for (i = 0; i < len; i++) {
You can get really good hints from profiling; follow them and zoom in on the bottlenecks.
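The suggestions from both answers (a lookup table instead of strchr, no repeated strlen, no memset, a single scan) can be combined. The following is only a minimal sketch that assumes the five rules exactly as listed in the question; the names is_vowel and syllabify and the '-' separator are hypothetical choices, not part of the original program.

```c
#include <stddef.h>
#include <string.h>

/* Table lookup: non-zero for a, e, i, o, u in either case. */
static int is_vowel(char c)
{
    static const char table[256] = {
        ['a'] = 1, ['e'] = 1, ['i'] = 1, ['o'] = 1, ['u'] = 1,
        ['A'] = 1, ['E'] = 1, ['I'] = 1, ['O'] = 1, ['U'] = 1
    };
    return table[(unsigned char)c];
}

/* Copies `word` into `out` with '-' between syllables, visiting each
   character once. `out` must hold at least 2 * strlen(word) + 1 bytes.
   Returns the number of pieces written, or 0 if `out` is too small. */
size_t syllabify(const char *word, char *out, size_t out_size)
{
    size_t len = strlen(word);            /* computed once, not per iteration */
    size_t i = 0, start = 0, o = 0, pieces = 1;

    if (out_size < 2 * len + 1)
        return 0;

    while (i < len && !is_vowel(word[i])) /* leading consonants: first syllable */
        i++;

    while (i < len) {
        while (i < len && is_vowel(word[i]))   /* skip the vowel group */
            i++;
        size_t run_start = i;
        while (i < len && !is_vowel(word[i]))  /* measure the consonant run */
            i++;
        if (i == len)
            break;                        /* trailing consonants: last syllable */

        size_t run = i - run_start;
        /* One consonant goes left; otherwise the left side gets n of 2n or 2n + 1. */
        size_t left = (run == 1) ? 1 : run / 2;
        size_t cut = run_start + left;

        memcpy(out + o, word + start, cut - start);
        o += cut - start;
        out[o++] = '-';
        start = cut;
        pieces++;
    }
    memcpy(out + o, word + start, len - start); /* the last syllable */
    o += len - start;
    out[o] = '\0';
    return pieces;
}
```

With these rules, "banana" becomes "ban-an-a" and "extra" becomes "ex-tra". Lines read with getline still carry the trailing newline, so it would need to be stripped first; otherwise it is treated as a consonant of the last syllable.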

How to write a word at a time to *char?

I'm trying to implement a version of memset to write a word at a time instead of byte-by-byte.
The code I'm currently working with is:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
/* set in memory one word at a time
dest: destination
ch: the character to write
count: how many copies of ch to make in dest */
void wmemset(char *dest, int ch, size_t count) {
long long int word; // 64-bit word
unsigned char c = (unsigned char) ch; // explicit conversion
int i, loop, remain;
for (i = 0, word = 0; i < 8 ; ++i) {
word |= ((long long int)c << i * 8);
}
loop = count / 8;
remain = count % 8;
for (i = 0; i < loop; ++i, dest += 8) {
*dest = word; // not possible to set 64-bit to char
}
for (i = 0; i < remain; ++i, ++dest) {
*dest = c;
}
}
int main() {
char *c;
c = (char *) malloc(100);
wmemset(c, 'b', 100);
for (int i = 0; i < 100; i++)
printf("%c", c[i]);
}
I thought the 64-bit word would overflow into the following bytes that the pointer refers to, but it doesn't seem so.
How do I set one word at a time?
EDIT: added a few more comments and added main function.
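The assignment *dest = word stores only one byte, because *dest has type char and the 64-bit value is converted (truncated) to that type before the store. One portable way to store eight bytes at a time is to memcpy the word into the destination; compilers typically turn a fixed-size 8-byte memcpy into a single store, though that is an optimization rather than a guarantee. The sketch below assumes this approach; the name word_memset is hypothetical and was chosen to avoid clashing with the standard wmemset from <wchar.h>.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fills `count` bytes at `dest` with the byte value `ch`, eight bytes at a time. */
void *word_memset(void *dest, int ch, size_t count)
{
    unsigned char *p = dest;
    unsigned char c = (unsigned char)ch;
    uint64_t word = 0;

    /* Replicate the byte into all eight bytes of the word. */
    for (int i = 0; i < 8; ++i)
        word |= (uint64_t)c << (i * 8);

    /* Store whole words. memcpy avoids the alignment and aliasing problems
       that casting `dest` to a 64-bit pointer type could cause. */
    while (count >= sizeof word) {
        memcpy(p, &word, sizeof word);
        p += sizeof word;
        count -= sizeof word;
    }

    /* Store the remaining 0..7 bytes one at a time. */
    while (count > 0) {
        *p++ = c;
        count--;
    }
    return dest;
}
```

Called as word_memset(c, 'b', 100) on the 100-byte buffer from the question, this writes twelve full words followed by four single bytes.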

How to handle integer overflow in C

I'm trying to count the anagrams of a word. The expected answers are, for example:
The word "at" should have two anagrams.
ordeals should have 5040 anagrams.
abcdABCDabcd should have 29937600 anagrams.
abcdefghijklmnopqrstuvwxyz should have 403291461126605635584000000 anagrams.
abcdefghijklmabcdefghijklm should have 49229914688306352000000 anagrams.
My program seems to work for the first three examples but not for the last two. How can I change the program to make it work?
#include <stdio.h>
#include <memory.h>
int contains(char haystack[], char needle) {
size_t len = strlen(haystack);
int i;
for (i = 0; i < len; i++) {
if (haystack[i] == needle) {
return 1;
}
}
return 0;
}
unsigned long long int factorial(unsigned long long int f) {
if (f == 0)
return 1;
return (f * factorial(f - 1));
}
int main(void) {
char str[1000], ch;
unsigned long long int i;
unsigned long long int frequency = 0;
float answer = 0;
char visited[1000];
int indexvisited = 0;
printf("Enter a string: ");
scanf("%s", str);
for (i = 0; str[i] != '\0'; ++i);
unsigned long long int nominator = 1;
for (int j = 0; j < i; ++j) {
ch = str[j];
frequency = 0;
if (!contains(visited, ch)) {
for (int k = 0; str[k] != '\0'; ++k) {
if (ch == str[k])
++frequency;
}
printf("Frequency of %c = %lld\n", ch, frequency);
visited[indexvisited] = ch;
visited[++indexvisited] = '\0';
nominator = nominator * factorial(frequency);
}
}
printf("Number of anagrams = %llu\n", (factorial( i )/nominator ) );
return 0;
}
Even though an unsigned long long is pretty big, it's not unbounded. With the usual 64-bit width, its maximum value is around 1.8*10^19. If your source string is 26 characters long, you calculate factorial(26), which is around 4*10^26, far bigger than will fit in an unsigned long long.
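To see where the limit is hit, the multiplication can be checked before it is performed. This is a minimal sketch, not a fix for the whole program; checked_factorial is a hypothetical helper name. It relies on the fact that result * i fits exactly when result <= ULLONG_MAX / i.

```c
#include <limits.h>

/* Stores n! in *out and returns 1, or returns 0 (leaving *out untouched)
   if n! does not fit in an unsigned long long. */
int checked_factorial(unsigned n, unsigned long long *out)
{
    unsigned long long result = 1;

    for (unsigned i = 2; i <= n; i++) {
        if (result > ULLONG_MAX / i)
            return 0;          /* result * i would wrap around */
        result *= i;
    }
    *out = result;
    return 1;
}
```

With a 64-bit unsigned long long, 20! (2432902008176640000) is the largest factorial that fits, so factorial(26) in the question silently wraps around. Exact results beyond that need a wider type or an arbitrary-precision representation.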
When you need to work with ridiculously large numbers, you have to split the value up. I'd say that using a long double to store the mantissa and an unsigned long to store the power of ten would do the trick:
4*10^26 == ld 4, lui 26 == ld * 10^lui
This could be useful for calculations, though I'm not sure how to represent the exact result; it will overflow everything but a string.
Just for the fun, here's the best I could come up with using only built-in datatypes. Instead of calculating factorials over and over (and, btw, avoid recursion for such things!), it has an "intelligent" n over k function. Note that it attempts to detect an overflow, but this is not really reliable.
#include <inttypes.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
typedef size_t AsciiCountTable[0x80];
static int countAsciiCharacters(const char *input, AsciiCountTable table)
{
const char *c = input;
while (*c)
{
if ((unsigned char)*c > 0x7f)
{
// not an ascii character
return 0;
}
++table[(int)*c];
++c;
}
return 1;
}
static unsigned long long nOverK(size_t n, size_t k)
{
unsigned long long result = 1;
size_t barrier = n - k;
if (k > barrier) barrier = k;
for (size_t i = n; i > barrier; --i)
{
result *= i;
}
for (size_t i = 2; i <= n - barrier; ++i)
{
result /= i;
}
return result;
}
int main(int argc, char **argv)
{
if (argc < 2)
{
fprintf(stderr, "Usage: %s <word>\n", argv[0]);
return EXIT_FAILURE;
}
AsciiCountTable countTable = {0};
if (!countAsciiCharacters(argv[1], countTable))
{
fputs("Only ASCII characters allowed.\n", stderr);
return EXIT_FAILURE;
}
size_t positions = strlen(argv[1]);
unsigned long long permutations = 1;
for (int i = 0; i < 0x80; ++i)
{
size_t n = positions;
size_t k = countTable[i];
if (k > 0)
{
unsigned long long temp = permutations;
permutations *= nOverK(n, k);
if (temp > permutations)
{
fputs("Overflow detected.\n", stderr);
return EXIT_FAILURE;
}
positions -= k;
}
}
printf("permutations: %llu\n", permutations);
return EXIT_SUCCESS;
}

Threading across multiple files

My program reads in files and uses threads to compute the highest prime number. When I put a print statement into the getNum() function, my numbers are printed out. However, the program seems to just lag no matter how many threads I request. Each file has 1 million integers in it. Does anyone see something obviously wrong with my code? Basically, the code gives each thread 1000 integers to check before assigning work to a new thread. I am still a C newbie and am just learning the ropes of threading. My code is a mess right now because I have been switching things around constantly.
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <string.h>
#include <pthread.h>
#include <math.h>
#include <semaphore.h>
//Global variable declaration
char *file1 = "primes1.txt";
char *file2 = "primes2.txt";
char *file3 = "primes3.txt";
char *file4 = "primes4.txt";
char *file5 = "primes5.txt";
char *file6 = "primes6.txt";
char *file7 = "primes7.txt";
char *file8 = "primes8.txt";
char *file9 = "primes9.txt";
char *file10 = "primes10.txt";
char **fn; //file name variable
int numberOfThreads;
int *highestPrime = NULL;
int fileArrayNum = 0;
int loop = 0;
int currentFile = 0;
sem_t semAccess;
sem_t semAssign;
int prime(int n)//check for prime number, return 1 for prime 0 for nonprime
{
int i;
for(i = 2; i <= sqrt(n); i++)
if(n % i == 0)
return(0);
return(1);
}
int getNum(FILE* file)
{
int number;
char* tempS = malloc(20 *sizeof(char));
fgets(tempS, 20, file);
tempS[strlen(tempS)-1] = '\0';
number = atoi(tempS);
free(tempS);//free memory for later call
return(number);
}
void* findPrimality(void *threadnum) //main thread function to find primes
{
int tNum = (int)threadnum;
int checkNum;
char *inUseFile = NULL;
int x=1;
FILE* file;
while(currentFile < 10){
if(inUseFile == NULL){//inUseFIle being used to check if a file is still being read
sem_wait(&semAccess);//critical section
inUseFile = fn[currentFile];
sem_post(&semAssign);
file = fopen(inUseFile, "r");
while(!feof(file)){
if(x % 1000 == 0 && tNum !=1){ //go for 1000 integers and then wait
sem_wait(&semAssign);
}
checkNum = getNum(file);
/*
*
*
*
* I think the issue is here
*
*
*
*/
if(checkNum > highestPrime[tNum]){
if(prime(checkNum)){
highestPrime[tNum] = checkNum;
}
}
x++;
}
fclose(file);
inUseFile = NULL;
}
currentFile++;
}
}
int main(int argc, char* argv[])
{
if(argc != 2){ //checks for number of arguements being passed
printf("To many ARGS\n");
return(-1);
}
else{//Sets thread cound to user input checking for correct number of threads
numberOfThreads = atoi(argv[1]);
if(numberOfThreads < 1 || numberOfThreads > 10){
printf("To many threads entered\n");
return(-1);
}
time_t preTime, postTime; //creating time variables
int i;
fn = malloc(10 * sizeof(char*)); //create file array and initialize
fn[0] = file1;
fn[1] = file2;
fn[2] = file3;
fn[3] = file4;
fn[4] = file5;
fn[5] = file6;
fn[6] = file7;
fn[7] = file8;
fn[8] = file9;
fn[9] = file10;
sem_init(&semAccess, 0, 1); //initialize semaphores
sem_init(&semAssign, 0, numberOfThreads);
highestPrime = malloc(numberOfThreads * sizeof(int)); //create an array to store each threads highest number
for(loop = 0; loop < numberOfThreads; loop++){//set initial values to 0
highestPrime[loop] = 0;
}
pthread_t calculationThread[numberOfThreads]; //thread to do the work
preTime = time(NULL); //start the clock
for(i = 0; i < numberOfThreads; i++){
pthread_create(&calculationThread[i], NULL, findPrimality, (void *)i);
}
for(i = 0; i < numberOfThreads; i++){
pthread_join(calculationThread[i], NULL);
}
for(i = 0; i < numberOfThreads; i++){
printf("this is a prime number: %d \n", highestPrime[i]);
}
postTime= time(NULL);
printf("Wall time: %ld seconds\n", (long)(postTime - preTime));
}
}
Yes, I am trying to find the highest prime overall. I have made some headway in the last few hours, restructuring the program as spudd said. Currently I am getting a segmentation fault due to my use of structures; I am trying to save the largest individual primes in the struct while giving them the right indices. This is the revised code. In short, the first thread creates all the worker threads and gives them access points into a very large integer array, which they go through to find prime numbers. I want to implement semaphores around the while loop so that every 2000 lines, or at the end, the threads update a global prime number.
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <string.h>
#include <pthread.h>
#include <math.h>
#include <semaphore.h>
//Global variable declaration
char *file1 = "primes1.txt";
char *file2 = "primes2.txt";
char *file3 = "primes3.txt";
char *file4 = "primes4.txt";
char *file5 = "primes5.txt";
char *file6 = "primes6.txt";
char *file7 = "primes7.txt";
char *file8 = "primes8.txt";
char *file9 = "primes9.txt";
char *file10 = "primes10.txt";
int numberOfThreads;
int entries[10000000];
int entryIndex = 0;
int fileCount = 0;
char** fileName;
int largestPrimeNumber = 0;
//Register functions
int prime(int n);
int getNum(FILE* file);
void* findPrimality(void *threadNum);
void* assign(void *num);
typedef struct package{
int largestPrime;
int startingIndex;
int numberCount;
}pack;
//Beging main code block
int main(int argc, char* argv[])
{
if(argc != 2){ //checks for number of arguements being passed
printf("To many threads!!\n");
return(-1);
}
else{ //Sets thread cound to user input checking for correct number of threads
numberOfThreads = atoi(argv[1]);
if(numberOfThreads < 1 || numberOfThreads > 10){
printf("To many threads entered\n");
return(-1);
}
int threadPointer[numberOfThreads]; //Pointer array to point to entries
time_t preTime, postTime; //creating time variables
int i;
fileName = malloc(10 * sizeof(char*)); //create file array and initialize
fileName[0] = file1;
fileName[1] = file2;
fileName[2] = file3;
fileName[3] = file4;
fileName[4] = file5;
fileName[5] = file6;
fileName[6] = file7;
fileName[7] = file8;
fileName[8] = file9;
fileName[9] = file10;
FILE* filereader;
int currentNum;
for(i = 0; i < 10; i++){
filereader = fopen(fileName[i], "r");
while(!feof(filereader)){
char* tempString = malloc(20 *sizeof(char));
fgets(tempString, 20, filereader);
tempString[strlen(tempString)-1] = '\0';
entries[entryIndex] = atoi(tempString);
entryIndex++;
free(tempString);
}
}
//sem_init(&semAccess, 0, 1); //initialize semaphores
//sem_init(&semAssign, 0, numberOfThreads);
time_t tPre, tPost;
pthread_t coordinate;
tPre = time(NULL);
pthread_create(&coordinate, NULL, assign, (void**)numberOfThreads);
pthread_join(coordinate, NULL);
tPost = time(NULL);
}
}
void* findPrime(void* pack_array)
{
pack* currentPack= pack_array;
int lp = currentPack->largestPrime;
int si = currentPack->startingIndex;
int nc = currentPack->numberCount;
int i;
int j = 0;
for(i = si; i < nc; i++){
while(j < 2000 || i == (nc-1)){
if(prime(entries[i])){
if(entries[i] > lp)
lp = entries[i];
}
j++;
}
}
return (void*)currentPack;
}
void* assign(void* num)
{
int y = (int)num;
int i;
int count = 10000000/y;
int finalCount = count + (10000000%y);
int sIndex = 0;
pack pack_array[(int)num];
pthread_t workers[numberOfThreads]; //thread to do the workers
for(i = 0; i < y; i++){
if(i == (y-1)){
pack_array[i].largestPrime = 0;
pack_array[i].startingIndex = sIndex;
pack_array[i].numberCount = finalCount;
}
pack_array[i].largestPrime = 0;
pack_array[i].startingIndex = sIndex;
pack_array[i].numberCount = count;
pthread_create(&workers[i], NULL, findPrime, (void *)&pack_array[i]);
sIndex += count;
}
for(i = 0; i< y; i++)
pthread_join(workers[i], NULL);
}
//Functions
int prime(int n)//check for prime number, return 1 for prime 0 for nonprime
{
int i;
for(i = 2; i <= sqrt(n); i++)
if(n % i == 0)
return(0);
return(1);
}
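The design described above (split one large array into slices, let each worker find the largest prime in its slice, then combine the results) can be sketched without semaphores, because each worker writes only to its own struct and pthread_join orders those writes before the final comparison. This is a minimal sketch under those assumptions, not the original program; slice, scan_slice, largest_prime and MAX_WORKERS are hypothetical names, and each slice is described by a start pointer plus a count.

```c
#include <pthread.h>
#include <stddef.h>

#define MAX_WORKERS 10 /* the question limits the thread count to 10 */

typedef struct {
    const unsigned long *data; /* first element of this worker's slice */
    size_t count;              /* number of elements in the slice */
    unsigned long largest;     /* result: largest prime found, 0 if none */
} slice;

static int is_prime(unsigned long n)
{
    if (n < 2) return 0;
    if (n % 2 == 0) return n == 2;
    for (unsigned long i = 3; i <= n / i; i += 2) /* i <= n / i means i * i <= n */
        if (n % i == 0) return 0;
    return 1;
}

/* Thread function: scans one slice and records its largest prime. */
static void *scan_slice(void *arg)
{
    slice *s = arg;
    unsigned long best = 0;
    for (size_t i = 0; i < s->count; i++)
        if (s->data[i] > best && is_prime(s->data[i]))
            best = s->data[i];
    s->largest = best;
    return NULL;
}

/* Returns the largest prime in data[0..n), or 0 if there is none. */
unsigned long largest_prime(const unsigned long *data, size_t n, int workers)
{
    pthread_t tid[MAX_WORKERS];
    int started[MAX_WORKERS];
    slice part[MAX_WORKERS];
    unsigned long best = 0;
    size_t start = 0;

    if (workers < 1) workers = 1;
    if (workers > MAX_WORKERS) workers = MAX_WORKERS;

    for (int i = 0; i < workers; i++) {
        part[i].data = data + start;
        /* The last worker also takes the remainder. */
        part[i].count = (i == workers - 1) ? n - start : n / (size_t)workers;
        part[i].largest = 0;
        started[i] = pthread_create(&tid[i], NULL, scan_slice, &part[i]) == 0;
        if (!started[i])
            scan_slice(&part[i]); /* fall back to scanning in this thread */
        start += part[i].count;
    }
    for (int i = 0; i < workers; i++) {
        if (started[i])
            pthread_join(tid[i], NULL);
        if (part[i].largest > best)
            best = part[i].largest;
    }
    return best;
}
```

Cheap comparison before the primality test (data[i] > best) skips most of the expensive checks, and on many systems the program has to be built with -pthread.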
Here is my latest update. I am having issues with my threads running: thread 0 is the only thread that completes.
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <string.h>
#include <pthread.h>
#include <math.h>
#include <semaphore.h>
//Global variable declaration
char *file1 = "primes1.txt";
char *file2 = "primes2.txt";
char *file3 = "primes3.txt";
char *file4 = "primes4.txt";
char *file5 = "primes5.txt";
char *file6 = "primes6.txt";
char *file7 = "primes7.txt";
char *file8 = "primes8.txt";
char *file9 = "primes9.txt";
char *file10 = "primes10.txt";
sem_t semHold;
int numberOfThreads;
long unsigned int entries[10000000];
unsigned int entryIndex = 0;
int fileCount = 0;
char** fileName;
long unsigned int largestPrimeNumber = 0;
//Register functions
int prime(unsigned int n);
int getNum(FILE* file);
void* findPrimality(void *threadNum);
void* assign(void *num);
typedef struct package{
long unsigned int largestPrime;
unsigned int startingIndex;
unsigned int numberCount;
}pack;
pack pack_array[10];
//Beging main code block
int main(int argc, char* argv[])
{
if(argc != 2){ //checks the number of arguments being passed
printf("To many threads!!\n");
return(-1);
}
else{ //Sets thread count to user input, checking for a valid number of threads
numberOfThreads = atoi(argv[1]);
if(numberOfThreads < 1 || numberOfThreads > 10){
printf("To many threads entered\n");
return(-1);
}
int threadPointer[numberOfThreads]; //Pointer array to point to entries
int i;
fileName = malloc(10 * sizeof(char*)); //create file array and initialize
fileName[0] = file1;
fileName[1] = file2;
fileName[2] = file3;
fileName[3] = file4;
fileName[4] = file5;
fileName[5] = file6;
fileName[6] = file7;
fileName[7] = file8;
fileName[8] = file9;
fileName[9] = file10;
FILE* filereader;
long unsigned int currentNum;
sem_init(&semHold, 0, 1);
for(i = 0; i < 10; i++){
filereader = fopen(fileName[i], "r");
while(fscanf(filereader, "%lu" , &currentNum)!= EOF){
entries[entryIndex] = currentNum;
// while(entryIndex < 5){
//char* tempString = malloc(20 *sizeof(long unsigned int));
//fgets(tempString, 20, filereader);
//tempString[strlen(tempString)-1] = '\0';
//currentNum = atoi(tempString);
//printf("Test %lu\n",currentNum);
//entries[entryIndex] = atoi(tempString);
//entryIndex++;
//free(tempString);
//}
entryIndex++;
}
}
printf("Test %lu\n",entries[9999999]);
//sem_init(&semAccess, 0, 1); //initialize semaphores
//sem_init(&semAssign, 0, numberOfThreads);
time_t tPre, tPost;
pthread_t coordinate;
tPre = time(NULL);
pthread_create(&coordinate, NULL, assign, (void**)numberOfThreads);
pthread_join(coordinate, NULL);
tPost = time(NULL);
printf("Largest prime = %lu , time: %ld\n", largestPrimeNumber,(long)(tPost-tPre));
}
}
void* findPrime(void* pack_array)
{
pack* currentPack = pack_array;
unsigned int lp = currentPack->largestPrime;
unsigned int si = currentPack->startingIndex;
unsigned int nc = currentPack->numberCount;
int i;
printf("Starting index Count: %d\n", si);
for(i = si; i < nc; i++){
if(i%100000==0)
printf("Here is i: %d\n", i);
if(entries[i]%2 != 0){
if(entries[i] > currentPack->largestPrime){
if(prime(entries[i])){
currentPack->largestPrime = entries[i];
printf("%lu\n", currentPack->largestPrime);
if(currentPack->largestPrime > largestPrimeNumber)
sem_wait(&semHold);
largestPrimeNumber = currentPack->largestPrime;
sem_post(&semHold);
}
}
}
}
}
void* assign(void* num)
{
int y = (int)num;
int i;
int count = 10000000/y;
int finalCount = count + (10000000%y);
int sIndex = 0;
printf("This is count: %d\n", count);
printf("This is final count: %d\n", finalCount);
pthread_t workers[y]; //thread to do the workers
for(i = 0; i < y; i++){
printf("for thread %d Starting index: %d\n", i, sIndex);
if(i == (y-1)){
pack_array[i].largestPrime = 0;
pack_array[i].startingIndex = sIndex;
pack_array[i].numberCount = finalCount;
}
pack_array[i].largestPrime = 0;
pack_array[i].startingIndex = sIndex;
pack_array[i].numberCount = count;
pthread_create(&workers[i], NULL, findPrime, (void *)&pack_array[i]);
printf("thread created\n");
sIndex += count;
}
for(i = 0; i < y; i++)
pthread_join(workers[i], NULL);
}
//Functions
int prime(unsigned int n)//check for prime number, return 1 for prime 0 for nonprime
{
int i;
for(i = 2; i <= sqrt(n); i++)
if(n % i == 0)
return(0);
return(1);
}
OK, here's part of my solution. It's missing most of main and some other simple pieces. If you choose to base your code on it, you can do one of two things: load all the data before starting your workers, or have the main thread load it while the workers are running. I did the latter in my complete version. However, you'll have to do some work to handle that correctly, because as written the workers never exit.
Also, you might want to try adapting your single-array code above based on this.
If you load all the data before starting the workers, you don't need the condition variable, and the workers can simply exit when next_chunk is NULL. I recommend figuring out how to load while the workers are running, though, because it will be more efficient.
Hint: pthread_cond_broadcast()
The actual worker function is also missing.
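#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
// A singly linked list of chunks of 1000 numbers
// we use it as a queue of data to be processed
// (chunks are pushed and popped at the head, so the order is last in, first out)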
struct number_chunk
{
struct number_chunk *next;
int size;
int nums[1000];
};
pthread_mutex_t cnklst_mutex = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t data_available = PTHREAD_COND_INITIALIZER;
struct number_chunk *next_chunk = NULL;
void load_chunks(char *filename)
{
FILE *in = fopen(filename, "r");
int done = 0;
int i;
if(in == NULL) {
fprintf(stderr, "Failed to open file %s\n", filename);
return;
}
// read in all the chunks of 1000 numbers from the file
while(!done) {
struct number_chunk *cnk = malloc(sizeof(struct number_chunk)); // allocate a new chunk
if(cnk == NULL) { // out of memory, stop reading this file
fprintf(stderr, "Out of memory while reading %s\n", filename);
break;
}
cnk->next = NULL;
for(i=0; i < 1000; i++) { // do the actual reading
char tmp[20];
if(fgets(tmp, 20, in) == NULL) { // end of file, leave the read loop
done = 1;
break;
}
cnk->nums[i] = atoi(tmp);
}
// need to do this so that the last chunk in a file can have less than 1000 numbers in it
cnk->size = i;
if(i == 0) { // end of file fell exactly on a chunk boundary, nothing to queue
free(cnk);
break;
}
// add it to the list of chunks to be processed
pthread_mutex_lock(&cnklst_mutex);
cnk->next = next_chunk;
next_chunk = cnk;
pthread_cond_signal(&data_available); // wake a waiting worker
pthread_mutex_unlock(&cnklst_mutex);
}
fclose(in);
}
struct number_chunk *get_chunk()
{
struct number_chunk *cnk = NULL;
pthread_mutex_lock(&cnklst_mutex);
//FIXME: once all the work is done, the workers block here forever and never exit.
// We need to return NULL when all the work that there will ever be
// is done. Alternatively, load everything before starting the workers and
// get rid of all the condition variable stuff.
while(next_chunk == NULL)
pthread_cond_wait(&data_available, &cnklst_mutex);
cnk = next_chunk;
if(next_chunk != NULL) next_chunk = next_chunk->next;
pthread_mutex_unlock(&cnklst_mutex);
return cnk;
}
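A minimal sketch of one way to address the FIXME in get_chunk() above, assuming the loader sets a flag once it has read the last file and then wakes every waiting worker with pthread_cond_broadcast(). The names loading_done, finish_loading() and get_chunk_or_null() are made up for illustration and are not part of the code above; the struct, mutex and condition variable are repeated only so the snippet compiles on its own.

```c
#include <pthread.h>
#include <stdlib.h>

/* Same chunk layout as in the code above. */
struct number_chunk {
    struct number_chunk *next;
    int size;
    int nums[1000];
};

static pthread_mutex_t cnklst_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t data_available = PTHREAD_COND_INITIALIZER;
static struct number_chunk *next_chunk = NULL;
static int loading_done = 0; /* hypothetical flag, protected by cnklst_mutex */

/* Loader side: call once, after the last file has been read. */
void finish_loading(void)
{
    pthread_mutex_lock(&cnklst_mutex);
    loading_done = 1;
    pthread_cond_broadcast(&data_available); /* wake every waiting worker */
    pthread_mutex_unlock(&cnklst_mutex);
}

/* Worker side: returns NULL only when the list is empty
   and the loader has said that no more data will arrive. */
struct number_chunk *get_chunk_or_null(void)
{
    struct number_chunk *cnk;

    pthread_mutex_lock(&cnklst_mutex);
    while (next_chunk == NULL && !loading_done)
        pthread_cond_wait(&data_available, &cnklst_mutex);
    cnk = next_chunk;
    if (cnk != NULL)
        next_chunk = cnk->next;
    pthread_mutex_unlock(&cnklst_mutex);
    return cnk;
}
```

A worker would then loop on get_chunk_or_null() and leave its loop on the first NULL. The broadcast matters because several workers may be asleep when loading finishes, and a single signal would wake only one of them.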
My workers report the final max prime at the end of their run: each one looks at a single global variable and updates it only if the highest prime it found is larger. Obviously you'll need to synchronize that access.
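A rough sketch of that reporting step, assuming each worker keeps its own local maximum and merges it into the shared value exactly once. The names max_mutex, largest_prime, report_local_max(), demo_worker() and run_demo() are hypothetical, and the three "results" are made-up values, not real output. The point to notice is that the comparison and the assignment both happen while the lock is held.

```c
#include <pthread.h>

static pthread_mutex_t max_mutex = PTHREAD_MUTEX_INITIALIZER; /* hypothetical */
static int largest_prime = 0; /* hypothetical shared result */

/* Called once by each worker just before it exits. */
void report_local_max(int local_max)
{
    pthread_mutex_lock(&max_mutex);
    if (local_max > largest_prime) /* compare and update under the same lock */
        largest_prime = local_max;
    pthread_mutex_unlock(&max_mutex);
}

/* Stand-in worker: arg points to the largest prime this worker "found". */
static void *demo_worker(void *arg)
{
    report_local_max(*(int *)arg);
    return NULL;
}

/* Starts three workers with made-up results and returns the merged maximum.
   Error checking on pthread_create is left out to keep the sketch short. */
int run_demo(void)
{
    int found[3] = {13, 101, 47};
    pthread_t tid[3];
    int i;

    for (i = 0; i < 3; i++)
        pthread_create(&tid[i], NULL, demo_worker, &found[i]);
    for (i = 0; i < 3; i++)
        pthread_join(tid[i], NULL);
    return largest_prime; /* safe to read: all workers have been joined */
}
```

Because each worker takes the lock only once, at the end, contention stays negligible compared with locking on every prime found.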
Also note that this uses a mutex rather than a semaphore, because pthread_cond_wait() requires one. If you haven't covered condition variables yet, just drop that stuff and load everything before starting your workers.
Also, since this is homework, read my code and try to understand it, then try to write your own without looking at it again.
I would have changed it more, but I'm not sure how, because it's already basically a really generic producer/consumer example that's missing some bits :P
Another thing to try, if you adopt the same strategy I did and have the main thread load while the workers work: add a second condition variable and a counter to limit the number of chunks in the queue, and have your workers wake up the main thread if they run out of work.
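A hedged sketch of what that bounded queue could look like. MAX_QUEUED, space_available, queued, put_chunk_bounded() and get_chunk_bounded() are names invented here, and the limit of 8 is arbitrary. This variant wakes the loader every time a worker removes a chunk rather than only when the queue runs dry, which is a simplification of the idea above. Shutdown handling is left out, so get_chunk_bounded() still blocks when there is no data.

```c
#include <pthread.h>
#include <stdlib.h>

#define MAX_QUEUED 8 /* hypothetical limit on chunks waiting in the list */

/* Same chunk layout as in the code above. */
struct number_chunk {
    struct number_chunk *next;
    int size;
    int nums[1000];
};

static pthread_mutex_t cnklst_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t data_available = PTHREAD_COND_INITIALIZER;
static pthread_cond_t space_available = PTHREAD_COND_INITIALIZER; /* second condition variable */
static struct number_chunk *next_chunk = NULL;
static int queued = 0; /* number of chunks currently in the list */

/* Loader side: blocks while the list already holds MAX_QUEUED chunks. */
void put_chunk_bounded(struct number_chunk *cnk)
{
    pthread_mutex_lock(&cnklst_mutex);
    while (queued >= MAX_QUEUED)
        pthread_cond_wait(&space_available, &cnklst_mutex);
    cnk->next = next_chunk;
    next_chunk = cnk;
    queued++;
    pthread_cond_signal(&data_available); /* wake one waiting worker */
    pthread_mutex_unlock(&cnklst_mutex);
}

/* Worker side: blocks while the list is empty. */
struct number_chunk *get_chunk_bounded(void)
{
    struct number_chunk *cnk;

    pthread_mutex_lock(&cnklst_mutex);
    while (next_chunk == NULL)
        pthread_cond_wait(&data_available, &cnklst_mutex);
    cnk = next_chunk;
    next_chunk = cnk->next;
    queued--;
    pthread_cond_signal(&space_available); /* tell the loader there is room again */
    pthread_mutex_unlock(&cnklst_mutex);
    return cnk;
}
```

Capping the queue keeps memory use bounded when the loader reads faster than the workers can test numbers for primality.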
