Unexpected results printing array of int pointers in C - c

I have a simple C program which uses a varying number of pthreads to find the first N prime numbers, adding candidates found to be prime to an array. The array is passed as an arg to each thread, with each thread executing nearly all of its code in a critical section. I have no issue with the prime checking, and the printing to screen of prime numbers and the result_number variable as they are found works. However, when N primes are found, and the array is printed, I find that (roughly) every second time the program executes, some (variably from 1 to 5) of the early prime number array elements (generally restricted to those < 17) are printed out as extremely large or negative numbers, with the majority of primes printing fine. Only the code of the thread function is below (not checkPrime or main), as everything else seems to work fine.
Also, if the program is executed with a single thread (i.e. no sharing of the array between multiple threads for updating), this peculiarity never occurs.
result_number, candidate, N are all global vars.
void *primeNums(void *arg) {
    pthread_mutex_lock(&mutex);
    int *array = (int *) arg;
    int is_prime = 0;
    int j = 0;
    while (result_number <= N) {
        candidate++;
        is_prime = checkPrime(candidate);
        if (is_prime == 1) {
            array[result_number] = candidate;
            if (result_number == N) {
                while (j < N) {
                    printf("%d\n", array[j]);
                    j++;
                }
            }
            /* Test verification output; always accurate */
            printf("Result number: %d = %d\n", result_number, candidate);
            result_number++;
        }
    }
    pthread_mutex_unlock(&mutex);
}
I genuinely do not believe other Qs cover this, as I have looked (and would have preferred finding an answer to writing my own question). There is a chance that I am not searching properly, admittedly.
EDIT: Example unwanted output:
-386877696
3
-395270400
32605
11
13
...
Continues on fine from here.

I have a hunch that if, instead of passing in array and using the global result_number, you pass in array + result_number and loop on a local variable: i = 0; while(i < N) {...}, you may solve your problem. (assuming array + result_number + N - 1 doesn't go out of bounds).
The problem with using global variables to hold the range information for each thread is that the thread function primeNums() modifies some of them. So if you start your first thread (thread #1) with result_number set to the start of the range you want thread #1 to process, thread #1 will keep changing its value while you reset it to give to thread #2. So thread #2 won't be processing the range you want it to process.
I assume you want each thread to process a separate range of indexes in your array? Currently you are passing a pointer to the beginning of the array to the function and using a global variable to hold the index into the array of the chunk you want to have processed by that thread.
To avoid using that global index all you have to do is pass a pointer to an offset into the middle of your array where you want processing to begin. So rather than pass in the value that points to the beginning element (array) pass in a value that points to the element you want to start processing from (array + result_number).
If you do that, inside your function primeNums() it acts as if the pointer you passed in is the beginning of an array (even though it's somewhere in the middle) and you can run your loop from 0 to N because you have already added result_number before you called the function.
Having said all that, I suspect you are still not going to be able to process this array in parallel (if that is indeed what you are trying to do) because each thread relies on candidate being set to the largest value from the previous thread...
To protect candidate from being simultaneously changed by the code that launches the other threads (if you do that), you can take a copy of that variable after you synchronize on your mutex (lock). But, to be honest, I am not sure this algorithm is going to let you parallelise this processing.
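As a rough illustration of the "pass array + offset and loop on a local variable" idea, here is a minimal sketch. It swaps in a slightly different formulation from the question: instead of sharing result_number and candidate, each thread tests its own disjoint range of candidates and writes 0/1 flags into its own slice of a shared array, and the caller collects the primes in order afterwards. The names struct job, worker, check_prime and primes_below are hypothetical, and check_prime is only a stand-in for the question's checkPrime, which was not posted.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical per-thread job: test candidates first .. first+count-1 and
   store 1 (prime) or 0 (not prime) in flags[0 .. count-1]. */
struct job {
    int first;             /* first candidate this thread tests */
    int count;             /* how many candidates it tests */
    unsigned char *flags;  /* start of this thread's slice of the shared array */
};

/* Stand-in for the question's checkPrime (trial division). */
static int check_prime(int n)
{
    if (n < 2) return 0;
    for (int d = 2; (long)d * d <= n; d++)
        if (n % d == 0) return 0;
    return 1;
}

static void *worker(void *arg)
{
    struct job *job = arg;
    /* Only a local index and this thread's own slice are touched,
       so no mutex is needed here. */
    for (int i = 0; i < job->count; i++)
        job->flags[i] = (unsigned char)check_prime(job->first + i);
    return NULL;
}

/* Writes up to n primes smaller than limit into out, in increasing order.
   Returns how many were written, or -1 on bad arguments or failure. */
int primes_below(int *out, int n, int limit, int nthreads)
{
    if (limit < 0 || nthreads < 1) return -1;
    unsigned char *flags = calloc((size_t)limit + 1, 1);
    pthread_t *tid = malloc((size_t)nthreads * sizeof *tid);
    struct job *jobs = malloc((size_t)nthreads * sizeof *jobs);
    int found = -1;
    if (flags && tid && jobs) {
        int chunk = (limit + nthreads - 1) / nthreads;  /* ceiling division */
        int started = 0;
        for (int t = 0; t < nthreads; t++) {
            int first = t * chunk;
            if (first > limit) first = limit;           /* keep pointer in range */
            int count = limit - first < chunk ? limit - first : chunk;
            jobs[t] = (struct job){ first, count, flags + first };
            if (pthread_create(&tid[t], NULL, worker, &jobs[t]) != 0) break;
            started++;
        }
        for (int t = 0; t < started; t++)
            pthread_join(tid[t], NULL);
        if (started == nthreads) {
            found = 0;
            for (int c = 0; c < limit && found < n; c++)
                if (flags[c]) out[found++] = c;
        }
    }
    free(flags);
    free(tid);
    free(jobs);
    return found;
}
```

Calling primes_below(out, 10, 100, 4) should fill out with the first ten primes, provided the limit is large enough to contain them. The array is only read after every thread has been joined, so no thread can still be writing to it when it is printed.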

Related

How to solve a runtime error happening when I use a big size of static array

my development environment : visual studio
Now, I have to create a input file and print random numbers from 1 to 500000 without duplicating in the file. First, I considered that if I use a big size of local array, problems related to heap may happen. So, I tried to declare as a static array. Then, in main function, I put random numbers without overlapping in the array and wrote the numbers in input file accessing array elements. However, runtime errors(the continuous blinking of the cursor in the console window) continue to occur.
The source code is as follows.
#define SIZE 500000
int sort[500000];
int main()
{
    FILE* input = NULL;
    input = fopen("input.txt", "w");
    if (sort != NULL)
    {
        srand((unsigned)time(NULL));
        for (int i = 0; i < SIZE; i++)
        {
            sort[i] = (rand() % SIZE) + 1;
            for (int j = 0; j < i; j++)
            {
                if (sort[i] == sort[j])
                {
                    i--;
                    break;
                }
            }
        }
        for (int i = 0; i < SIZE; i++)
        {
            fprintf(input, "%d ", sort[i]);
        }
        fclose(input);
    }
    return 0;
}
When I tried to reduce the array size from 1 to 5000, it has been implemented. So, Carefully, I think it's a memory out phenomenon. Finally, I'd appreciate it if you could comment on how to solve this problem.
“First, I considered that if I use a big size of local array, problems related to heap may happen.”
That does not make any sense. Automatic local objects generally come from the stack, not the heap. (Also, “heap” is the wrong word; a heap is a particular kind of data structure, but the malloc family of routines may use other data structures for managing memory. This can be referred to simply as dynamically allocated memory or allocated memory.)
However, runtime errors(the continuous blinking of the cursor in the console window)…
Continuous blinking of the cursor is normal operation, not a run-time error. Perhaps you are trying to say your program continues executing without ever stopping.
#define SIZE 500000
...
sort[i] = (rand() % SIZE) + 1;
The C standard only requires rand to generate numbers from 0 to 32767. Some implementations may provide more. However, if your implementation does not generate numbers up to 499,999, then it will never generate the numbers required to fill the array using this method.
Also, using % to reduce the rand result skews the distribution. For example, if we were reducing modulo 30,000, and rand generated numbers from 0 to 44,999, then rand() % 30000 would generate the numbers from 0 to 14,999 each two times out of every 45,000 and the numbers from 15,000 to 29,999 each one time out of every 45,000.
for (int j = 0; j < i; j++)
So this algorithm attempts to find new numbers by rejecting those that duplicate previous numbers. When working on the last of n numbers, the average number of tries is n, if the selection of random numbers is uniform. When working on the second-to-last number, the average is n/2. When working on the third-to-last, the average is n/3. So the average number of tries for all the numbers is n + n/2 + n/3 + n/4 + n/5 + … + 1.
For 5,000 elements, this sum is around 45,472.5. For 500,000 elements, it is around 6,849,790. So your program will average around 150 times as many tries with 500,000 elements as with 5,000. However, each try also takes longer: For the first selection, you check against zero prior elements for duplicates. For the second, you check against one prior element. For selection n, you check against n−1 elements. So, for the last of 500,000 elements, you check against 499,999 elements, and, on average, you have to repeat this 500,000 times. So the last selection takes around 500,000•499,999 = 249,999,500,000 units of work.
Refining this estimate, for each selection i, a successful attempt that gets completely through the loop of checking requires checking against all i−1 prior numbers. An unsuccessful attempt will average going halfway through the prior numbers. Selection i takes n/(n+1−i) attempts on average, so there is one successful check of i−1 numbers and, on average, n/(n+1−i) − 1 unsuccessful checks of an average of (i−1)/2 numbers each.
For 5,000 numbers, the average number of checks will be around 107,455,347. For 500,000 numbers, the average will be around 1,649,951,055,183. Thus, your program with 500,000 numbers takes more than 15,000 times as long as with 5,000 numbers.
When I tried to reduce the array size from 1 to 5000, it has been implemented.
I think you mean that with an array size of 5,000, the program completes execution in a short amount of time?
So, Carefully, I think it's a memory out phenomenon.
No, there is no memory issue here. Modern general-purpose computer systems easily handle static arrays of 500,000 int.
Finally, I'd appreciate it if you could comment on how to solve this problem.
Use a Fisher-Yates shuffle: Fill the array A with the integers from 1 to SIZE. Set a counter, say d, to the number of selections completed so far, initially zero. Then pick a random index r from d to SIZE-1, that is, one of the SIZE-d positions not yet selected. Move the number in that position to the front of the unselected part by swapping A[r] with A[d]. Then increment d. Repeat until d reaches SIZE-1.
This will swap a random element of the initial array into A[0], then a random element from those remaining into A[1], then a random element from those remaining into A[2], and so on. (We stop when d reaches SIZE-1 rather than when it reaches SIZE because, once d reaches SIZE-1, there is only one more selection to make, but there is also only one number left, and it is already in the last position in the array.)
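The shuffle described above could be sketched as follows. This is only an illustration under some assumptions: fill_shuffled and rand_below are hypothetical names, and because RAND_MAX is only guaranteed to be at least 32767, the helper glues together three 15-bit pieces of rand() and rejects out-of-range values so that no modulo skew is introduced. The low bits of rand() are of poor quality on some implementations, so a better generator may be preferable in real code.

```c
#include <stdint.h>
#include <stdlib.h>

/* Uniform random integer in [0, n), for 1 <= n < 2^45. Built from three
   15-bit pieces of rand(), since RAND_MAX may be as small as 32767. */
static uint64_t rand_below(uint64_t n)
{
    const uint64_t range = (uint64_t)1 << 45;   /* 3 x 15 bits */
    const uint64_t limit = range - range % n;   /* largest multiple of n */
    uint64_t r;
    do {
        r = ((uint64_t)(rand() & 0x7FFF) << 30)
          | ((uint64_t)(rand() & 0x7FFF) << 15)
          |  (uint64_t)(rand() & 0x7FFF);
    } while (r >= limit);                       /* reject to avoid modulo skew */
    return r % n;
}

/* Fills a[0..n-1] with the integers 1..n in random order. */
void fill_shuffled(int *a, size_t n)
{
    for (size_t i = 0; i < n; i++)
        a[i] = (int)(i + 1);
    /* d counts the selections completed so far; stop when one element is left. */
    for (size_t d = 0; d + 1 < n; d++) {
        size_t r = d + (size_t)rand_below(n - d);  /* index in [d, n-1] */
        int tmp = a[d];
        a[d] = a[r];
        a[r] = tmp;
    }
}
```

With this, the program would call srand once, then fill_shuffled(sort, SIZE), and write the array to the file. Each element is swapped at most once per step, so the work grows linearly with SIZE instead of quadratically.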

Making a character array rotate its cells left/right n times

I'm totally new here but I heard a lot about this site and now that I've been accepted for a 7 months software development 'bootcamp' I'm sharpening my C knowledge for an upcoming test.
I've been assigned a question on a test that I've passed already, but I did not finish that question and it bothers me quite a lot.
The question was a task to write a program in C that moves a character (char) array's cells by 1 to the left (it doesn't quite matter in which direction for me, but the question specified left). And I also took upon myself NOT to use a temporary array/stack or any other structure to hold the entire array data during execution.
So a 'string' or array of chars containing '0' '1' '2' 'A' 'B' 'C' will become
'1' '2' 'A' 'B' 'C' '0' after using the function once.
Writing this was no problem, I believe I ended up with something similar to:
void ArrayCharMoveLeft(char arr[], int arrsize, int times) {
    int i;
    for (i = 0; i <= arrsize ; i++) {
        ArraySwap2CellsChar(arr, i, i+1);
    }
}
As you can see the function is somewhat modular since it allows to input how many times the cells need to move or shift to the left. I did not implement it, but that was the idea.
As far as I know there are 3 ways to make this:
Loop ArrayCharMoveLeft times times. This feels instinctively inefficient.
Use recursion in ArrayCharMoveLeft. This should resemble the first solution, but I'm not 100% sure on how to implement this.
This is the way I'm trying to figure out: No loop within loop, no recursion, no temporary array, the program will know how to move the cells x times to the left/right without any issues.
The problem is that after swapping say N times of cells in the array, the remaining array size - times are sometimes not organized. For example:
Using ArrayCharMoveLeft with 3 as times with our given array mentioned above will yield
ABC021 instead of the expected value of ABC012.
I've run the following function for this:
int i;
char* lastcell;
if (!(times % arrsize))
{
    printf("Nothing to move!\n");
    return;
}
times = times % arrsize;
// Input checking. in case user inputs multiples of the array size, auto reduce to array size remainder
for (i = 0; i < arrsize-times; i++) {
    printf("I = %d ", i);
    PrintArray(arr, arrsize);
    ArraySwap2CellsChar(arr, i, i+times);
}
As you can see the for runs from 0 to array size - times. If this function is used, say with an array containing 14 chars. Then using times = 5 will make the for run from 0 to 9, so cells 10 - 14 are NOT in order (but the rest are).
The worst thing about this is that the remaining cells always maintain the sequence, but at different position. Meaning instead of 0123 they could be 3012 or 2301... etc.
I've run different arrays on different times values and didn't find a particular pattern such as "if remaining cells = 3 then use ArrayCharMoveLeft on remaining cells with times = 1).
It always seem to be 1 out of 2 options: the remaining cells are in order, or shifted with different values. It seems to be something similar to this:
times   shift+direction to align
1       0
2       0
3       0
4       1R
5       3R
6       5R
7       3R
8       1R
the numbers change with different times and arrays. Anyone got an idea for this?
even if you use recursion or loops within loops, I'd like to hear a possible solution. Only firm rule for this is not to use a temporary array.
Thanks in advance!
If irrespective of efficiency or simplicity for the purpose of studying you want to use only exchanges of two array elements with ArraySwap2CellsChar, you can keep your loop with some adjustment. As you noted, the given for (i = 0; i < arrsize-times; i++) loop leaves the last times elements out of place. In order to correctly place all elements, the loop condition has to be i < arrsize-1 (one less suffices because if every element but the last is correct, the last one must be right, too). Of course when i runs nearly up to arrsize, i+times can't be kept as the other swap index; instead, the correct index j of the element which is to be put at index i has to be computed. This computation turns out somewhat tricky, due to the element having been swapped already from its original place. Here's a modified variant of your loop:
for (i = 0; i < arrsize-1; i++)
{
    printf("i = %d ", i);
    int j = i+times;
    while (arrsize <= j) j %= arrsize, j += (i-j+times-1)/times*times;
    printf("j = %d ", j);
    PrintArray(arr, arrsize);
    ArraySwap2CellsChar(arr, i, j);
}
Use standard library functions memcpy, memmove, etc as they are very optimized for your platform.
Use the correct type for sizes - size_t not int
char *ArrayCharMoveLeft(char *arr, const size_t arrsize, size_t ntimes)
{
    ntimes %= arrsize;
    if(ntimes)
    {
        char temp[ntimes];
        memcpy(temp, arr, ntimes);
        memmove(arr, arr + ntimes, arrsize - ntimes);
        memcpy(arr + arrsize - ntimes, temp, ntimes);
    }
    return arr;
}
But you want it without the temporary array (more memory efficient, very bad performance-wise):
char *ArrayCharMoveLeft(char *arr, size_t arrsize, size_t ntimes)
{
    ntimes %= arrsize;
    while(ntimes--)
    {
        char temp = arr[0];
        memmove(arr, arr + 1, arrsize - 1);
        arr[arrsize -1] = temp;
    }
    return arr;
}
https://godbolt.org/z/od68dKTWq
https://godbolt.org/z/noah9zdYY
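For comparison, there is also a well-known way to rotate in place with no temporary array and no nested loop. It is a different technique from the ones posted above: reverse the first times cells, reverse the remaining cells, then reverse the whole array. The sketch below uses the hypothetical names rotate_left and reverse_range, and swaps cells directly rather than through ArraySwap2CellsChar.

```c
#include <stddef.h>

/* Reverses the cells arr[lo] .. arr[hi-1] in place. */
static void reverse_range(char *arr, size_t lo, size_t hi)
{
    while (lo + 1 < hi) {
        hi--;
        char tmp = arr[lo];
        arr[lo] = arr[hi];
        arr[hi] = tmp;
        lo++;
    }
}

/* Rotates arr[0..size-1] to the left by times cells using three reversals.
   Only one char of extra storage is used, and each cell is swapped
   at most twice. */
void rotate_left(char *arr, size_t size, size_t times)
{
    if (size == 0)
        return;
    times %= size;                    /* multiples of size change nothing */
    if (times == 0)
        return;
    reverse_range(arr, 0, times);     /* 012ABC -> 210ABC (times = 3) */
    reverse_range(arr, times, size);  /*        -> 210CBA */
    reverse_range(arr, 0, size);      /*        -> ABC012 */
}
```

For the array 0123456789ABCDEF and times = 5, this gives 56789ABCDEF01234, the same result as the swap-based loop.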
Disclaimer: I'm not sure if it's common to share a full working code here or not, since this is literally my first question asked here, so I'll refrain from doing so assuming the idea is answering specific questions, and not providing an example solution for grabs (which might defeat the purpose of studying and exploring C). This argument is backed by the fact that this specific task is derived from a programming test used by a programming course and its purpose is to filter out applicants who aren't fit for intense 7 months training in software development. If you still wish to see my code, message me privately.
So, with a great amount of help from @Armali I'm happy to announce the question is answered! Together we came up with a function that takes an array of characters in C (string), and without using any previously written libraries (such as strings.h), or even a temporary array, it rotates all the cells in the array N times to the left.
Example: using ArrayCharMoveLeft() on the following array with N = 5:
Original array: 0123456789ABCDEF
Updated array: 56789ABCDEF01234
As you can see the first cell (0) is now the sixth cell (5), the 2nd cell is the 7th cell and so on. So each cell was moved to the left 5 times. The first 5 cells 'overflow' to the end of the array and now appear as the Last 5 cells, while maintaining their order.
The function works with various array lengths and N values.
This is not any sort of achievement, but rather an attempt to execute the task with as few variables as possible (only 4 ints, besides the char array, also counting the sub function used to swap the cells).
It was achieved using a nested loop, so by no means is it efficient runtime-wise, just memory-wise, while still using self-coded functions, with no external libraries used (except stdio.h).
Refer to Armali's posted solution, it should get you the answer for this question.

Conceptual thread issue

I'm generating hashes (MD5) of numbers from 1 to N in some threads. According to the first letter of the hash, the number that generates it is stored in an array. E.g, the number 1 results in c4ca4238a0b923820dcc509a6f75849b and the number 2 in c81e728d9d4c2f636f067f89cc14862c, so they are stored in a specific array of hashes that starts with "c".
The problem is that I need to generate them sorted from the lower to the higher. It is very expensive to sort them after the sequence is finished, N can be as huge as 2^40. As I'm using threads the sorting never happens naturally. E.g. One thread can generate the hash of the number 12 (c20ad4d76fe97759aa27a0c99bff6710) and store it on "c" array and other then generates the hash of the number 8 (c9f0f895fb98ab9159f51fd0297e236d) and store it after the number 12 in "c" array.
I can't simply verify the last number on the array because as far as the threads are running they can be very far away from each other.
Is there any solution for this thread problem? Any solution that is faster than order the array after all the threads are finished would be great.
I'm implementing this in C.
Thank you!
Instead of having one array for each prefix (eg. "c"), have one array per thread for each prefix. Each thread inserts only into its own arrays, so it will always insert the numbers in increasing order and the individual thread arrays will remain sorted.
You can then quickly (O(N)) coalesce the arrays at the end of the process, since the individual arrays will all be sorted. This will also speed up the creation process, since you won't need any locking around the arrays.
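The coalescing step could look roughly like the sketch below, which merges the per-thread arrays for one prefix into a single sorted array. The name merge_sorted and its parameters are hypothetical. It scans the head of every array on each step, which is fine for a handful of threads; with many threads a min-heap over the heads would be the usual refinement.

```c
#include <stddef.h>
#include <stdint.h>

/* Merges k ascending arrays parts[0..k-1], with lengths lens[0..k-1],
   into out. pos must point to k scratch counters. out needs room for
   the sum of all lengths. Returns the number of values written. */
size_t merge_sorted(const uint64_t *const *parts, const size_t *lens,
                    size_t k, size_t *pos, uint64_t *out)
{
    size_t written = 0;
    for (size_t t = 0; t < k; t++)
        pos[t] = 0;
    for (;;) {
        size_t best = k;  /* k means "no array has anything left" */
        for (size_t t = 0; t < k; t++) {
            if (pos[t] < lens[t] &&
                (best == k || parts[t][pos[t]] < parts[best][pos[best]]))
                best = t;
        }
        if (best == k)
            break;
        out[written++] = parts[best][pos[best]++];
    }
    return written;
}
```

For example, if one thread's "c" array holds 1 and 8 and another thread's holds 2 and 12, the merged array comes out as 1, 2, 8, 12.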
Since you mentioned pthreads I'm going to assume you're using gcc (this is not necessarily the case but it's probably the case). You can use the __sync_fetch_and_add to get the value for the end of the array and add one to it in one atomic operation. It would go something like the following:
insertAt = __sync_fetch_and_add(&size[hash], 1);
arrayOfInts[insertAt] = val;
The only problem you'll run into is if you need to resize the arrays (not sure if you know the array size beforehand or not). For that you will need a lock (most efficiently one lock per array) that you lock exclusively while reallocating the array and non-exclusively when inserting. Particularly this could be done with the following functions (which assume programmer does not release an unlocked lock):
// Flag 2 indicates exclusive lock
void lockExclusive(int* lock)
{
    while(!__sync_bool_compare_and_swap(lock, 0, 2));
}
void releaseExclusive(int* lock)
{
    *lock = 0;
}
// Flag 8 indicates locking
// Flag 1 indicates non-exclusive lock
void lockNonExclusive(int* lock, int* nonExclusiveCount)
{
    while((__sync_fetch_and_or(lock, 9) & 6) != 0);
    __sync_add_and_fetch(nonExclusiveCount, 1);
    __sync_and_and_fetch(lock, ~8);
}
// Flag 4 indicates unlocking
void releaseNonExclusive(int* lock, int* nonExclusiveCount)
{
    while((__sync_fetch_and_or(lock, 4) & 8) != 0);
    // Clear the non-exclusive flag only when the last holder leaves
    if(__sync_sub_and_fetch(nonExclusiveCount, 1) == 0)
        __sync_and_and_fetch(lock, ~1);
    // Clear the unlocking flag
    __sync_and_and_fetch(lock, ~4);
}

Strange segmentation fault

I've defined a random function (int random(int sup, int seed)) which returns a value between 0 and sup-1.
I've defined a struct, point, of which pos_parents and population are 2-dimensional arrays.
The swap functions swap elements of the v array, which is a array of "indexes". All is done in order to sort out par_n members into pos_parents out of population members without sorting twice the same member.
This gives segmentation fault.
If I replace the variable r inside population[v[r]][j] with an explicit value, then it all functions. How is this possible? I've tried the random function and it doesn't seem to have any problem.
In addition, when the segmentation fault occurs, printf doesn't even print anything for the first loop.
point population[pop_size][array_size];
point pos_parents[4*par_n][array_size];
int v[pop_size];
for (i=0; i<4*par_n;i++)
    v[i]=i;
for(t=0;t<time_limit;t++) //The cycle of life
{
    for(i=0;i<4*par_n;i++)
    {
        r=random(pop_size-i,i);
        printf("%d\t",r);
        for(j=0;j<array_size;j++)
        {
            pos_parents[i][j]=population[v[r]][j];
        }
        swap(&(v[r]),&(v[pop_size-1-i]));
    }
When executing i type 3(route locations-array size), 8(pop_size), 1(time limit), 1 (par_n)
This is the entire code (less than 150 lines), always insert 1 to time_limit, because I haven't still completed the cycle.
https://docs.google.com/open?id=0ByylOngTmkJddVZqbGs1cS1IZkE
P.S.
I'm trying to write an evolutionary algorithm, for route optimization
The loop with v[i] = i; goes from 0 to 4 * par_n, but v is an array of size pop_size. That looks like an out-of-bounds problem waiting to strike. And the same again for the counter i in r = random(pop_size - i, i);, since i is used in v[i].
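To make the mismatch concrete: with the inputs given in the question (pop_size 8, par_n 1), the posted initialization loop only sets v[0] to v[3], while r can be as large as 7, so v[r] may read a slot that was never set and then index population with whatever happens to be there. A minimal sketch of the selection with every slot initialized is below. The name pick_distinct is hypothetical, and rand() % remaining stands in for the question's own random function, which was not posted here.

```c
#include <stdlib.h>

/* Picks count distinct indexes from 0..pop_size-1 and stores them in out.
   v is scratch space for pop_size ints.
   Returns 0 on success, -1 on bad arguments. */
int pick_distinct(int *v, int pop_size, int count, int *out)
{
    if (pop_size <= 0 || count < 0 || count > pop_size)
        return -1;
    /* Initialize every slot of v, not just the first count of them. */
    for (int i = 0; i < pop_size; i++)
        v[i] = i;
    for (int i = 0; i < count; i++) {
        int remaining = pop_size - i;
        int r = rand() % remaining;   /* slight modulo skew, fine for a sketch */
        out[i] = v[r];
        /* Move the chosen index out of the remaining range. */
        v[r] = v[remaining - 1];
        v[remaining - 1] = out[i];
    }
    return 0;
}
```

Each out[i] could then be used as the row of population to copy into pos_parents[i].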

Party log register problem

There was a party with a log register in which the entry and exit times of all the guests were recorded. You have to find the time at which the maximum number of guests was at the party.
The input is the entry and exit time of each of the n guests: [1,4] [2,5] [9,12] [5,9] [5,12]
The output should be t=5, as there were at most 3 guests present at once, namely guests 2, 4 and 5 (numbering from 1).
what i tried so far is
main()
{
    int ret;
    int a[5]={1,2,9,5,5};
    int b[5]={4,6,12,9,12};
    int i,j;
    int runs=5;
    int cur = 0,p1 = 0,p2 = 0;
    printf("input is ");
    for(i=0;i<5;i++)
    {
        printf("(");
        printf("%d,%d",a[i],b[i]);
        printf(")");
    }
    while(runs--)
    {
        while(p1<5 && p2<5)
        {
            if(a[p1] <= b[p2])
            {
                cur ++;
                p1 ++ ;
            }
            else {
                cur --;
                p2 ++ ;
            }
            ret = cur ;
        }
    }
    printf("\n the output is %d",ret);
}
i am getting 3 as output..which is completely wrong! where am i making error?
Several things are problematic with your code. Here's a few pointers on where to improve it:
Your algorithm itself is doubtful. Assume that your first guest is the party host and stays from 1 until the end time of the party. With your current code, p2 will never change and you will ignore all other guests' leave times.
Even if your algorithm worked, it would assume that your input is sorted. By iterating p1/p2 you implicitly assume growing times in your array, which is already wrong for your sample input. So you ought to sort the input first.
You are assigning the result ret at each iteration of your main loop. This ignores whether the current state (cur) is the maximum number of guests or not. Hint: If you are to compute a maximum of something and don't have any maximum computation in your code, there may be something missing.
Here's a different idea: Assuming you can spare an array of size maxtime, create an array filled with 0s. Process your input by incrementing the array entry at the time a guest arrives and decrementing it at the time a guest leaves. For example, the first 5 minutes would then look like [1, 1, 0, -1, 1, ...]. Then it's much simpler to walk linearly through the array and compute the maximum prefix sum. It's also much easier this way to compute the full time-interval for how long this maximum number of guests was present.
(If you want to go more fancy and have a much larger total time interval to cover, instead rely on a map with times as keys. Initialize like the array, then process the keys in sorted order.)
You are printing an index instead of the actual time. Try printing a[ret].
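The array idea above could be sketched like this. One assumption needs stating loudly: to reproduce the expected answer of t=5 with 3 guests, a guest is counted as still present at their exit time, so the decrement is applied at exit + 1 rather than at the exit time itself. The name busiest_time and its parameters are hypothetical.

```c
#include <stdlib.h>

/* Returns the earliest time at which the most guests are present and
   stores that maximum in *best_count. Guest i is counted as present from
   in[i] through out[i] inclusive (an assumption). All times must lie in
   0..maxtime. Returns -1 on bad input, allocation failure or no guests. */
int busiest_time(const int *in, const int *out, int n, int maxtime,
                 int *best_count)
{
    if (n < 0 || maxtime < 0)
        return -1;
    int *delta = calloc((size_t)maxtime + 2, sizeof *delta);
    if (!delta)
        return -1;
    for (int i = 0; i < n; i++) {
        if (in[i] < 0 || out[i] > maxtime || in[i] > out[i]) {
            free(delta);
            return -1;
        }
        delta[in[i]]++;        /* guest arrives */
        delta[out[i] + 1]--;   /* guest is gone from the next time unit on */
    }
    int cur = 0, best = 0, best_t = -1;
    for (int t = 0; t <= maxtime; t++) {
        cur += delta[t];       /* prefix sum = guests present at time t */
        if (cur > best) {      /* keep the maximum, not just the last value */
            best = cur;
            best_t = t;
        }
    }
    free(delta);
    *best_count = best;
    return best_t;
}
```

With the entry times 1, 2, 9, 5, 5 and the exit times 4, 5, 12, 9, 12 from the problem statement, this reports time 5 with 3 guests present. No sorting of the input is needed, because the array is indexed by time.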
