How can I make this more efficient? (Merging arrays in C) - c

The program is supposed to merge two arrays and place them in an output array. What I have is:
void Merge(int *arr1, int *arr2, int *output, int arr1size, int arr2size) {
int arr2count = 0, arr1count = 0;
while (arr1count < arr1size) {
if (arr2count >= arr2size) { /* dump arr1 because arr2 is done */
*output++ = *arr1++;
arr1count++;
}
else if (*arr1 < *arr2) {
*output++ = *arr1++;
arr1count++;
}
else {
*output++ = *arr2++;
arr2count++;
}
}
while (arr2count++ < arr2size) { /* dump arr2 */
*output++ = *arr2++;
}
}
How can I make this more efficient? I mean, strip out literally any bit of code to make it even slightly more efficient.
For argument's sake, consider the triple while loop implementation (shown below) less efficient.
while (arr1count < arr1size && arr2count < arr2size) { .... }
while (arr1count < arr1size) { .... }
while (arr2count < arr2size) { .... }
Also, this must use pointer notation, not array notation (I wish...)

I tried removing variables and increments. Note that these are minor improvements; the algorithm still takes O(m+n) time.
Edit: incorporated breaking out of the loop, as mentioned by user2048454.
Edit2: removed the two trailing while loops and replaced them with memcpy. Thanks to FUZxxl.
#include <stdio.h>
#include <string.h>
void Merge2(int *arr1, int *arr2, int *output, int *a1last, int *a2last) {
while (arr1 < a1last && arr2 < a2last) {
if (*arr1 < *arr2) {
*output++ = *arr1++;
}
else {
*output++ = *arr2++;
}
}
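/* Replaced the trailing while loops with memcpy(). At most one array still
   has elements left here, so one of these two calls copies zero bytes. */
memcpy(output,arr1,sizeof(int)*(a1last-arr1));
memcpy(output,arr2,sizeof(int)*(a2last-arr2));
}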
int main()
{
int a[]={1,3,5,7};
int b[]={2,4,6,8};
int c[10];
int i;
Merge2(a,b,c,&a[4],&b[4]); //&a[4] points to the end address of the array. Do not access value at that address, it is "out of bounds"
for(i=0; i<8; i++)
printf("%d ",c[i]);
printf("\n");
return 0;
}

Something like this?
void Merge(int *arr1, int *arr2, int *output, int arr1size, int arr2size) {
for (int i=0,i1=0,i2=0; i<arr1size+arr2size; i++) {
if (i1==arr1size) *output++ = *arr2++;
else if (i2==arr2size) *output++ = *arr1++;
else if (*arr1<*arr2) *output++ = *arr1++, i1++;
else *output++ = *arr2++, i2++;
}
}

Judging from the above code, the arrays are initially sorted and output should also contain sorted values.
One idea, given the restriction you mention, is to drop arr1count/arr2count and use end pointers instead:
arr1last/arr2last
where:
"arr1last=arr1+arr1size" and "arr2last=arr2+arr2size"
This way you don't have to increment a counter, and the compiler juggles fewer variables (the counters and sizes go away, the end pointers come in). Just compare arr1 < arr1last, and likewise for arr2.
Also, once your first if is true it stays true for the rest of the loop, so depending on the size of your arrays it might be worth breaking out at that point and going with the triple-loop implementation you mentioned. The two-loop implementation above can be inefficient if, say, arr2size=1, arr1size=999 and arr2[0] is among the first values in 'output', because every remaining iteration still re-checks that arr2 is exhausted.
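The two suggestions above (end pointers instead of counters, and three loops instead of two) could be combined roughly as follows. This is only a sketch: the name merge_end_ptr and the size_t parameters are assumptions for illustration, and both inputs are assumed to be sorted already.

```c
#include <stddef.h>

/* Sketch: merge two sorted int arrays using end pointers instead of counters.
   merge_end_ptr is a hypothetical name, not taken from the question. */
void merge_end_ptr(const int *arr1, const int *arr2, int *output,
                   size_t arr1size, size_t arr2size)
{
    const int *arr1last = arr1 + arr1size; /* one past the end of arr1 */
    const int *arr2last = arr2 + arr2size; /* one past the end of arr2 */

    /* Main loop: runs only while both arrays still have elements. */
    while (arr1 < arr1last && arr2 < arr2last) {
        if (*arr1 < *arr2)
            *output++ = *arr1++;
        else
            *output++ = *arr2++;
    }
    /* At most one of these two loops does any work. */
    while (arr1 < arr1last)
        *output++ = *arr1++;
    while (arr2 < arr2last)
        *output++ = *arr2++;
}
```

Once one array runs out, the leftover loops copy the rest of the other without any comparison between elements, which is the case the arr2size=1, arr1size=999 example is about.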

Related

How to write a C function to detect cycles in a void* array

I'm trying to implement the C function int contains_cycle(void *const array[], size_t length) to detect whether there are any "cycles" in an array of void pointers. Every element of this array either points to an element of the same array or is NULL. Pointers still quite overwhelm me and I've got no idea where to start.
To clarify what I mean by a cycle, here are some examples. For illustration, the first element is always at address 0x1 and pointers are 1 byte in size.
{NULL, 0x3, 0x2} -> should return 1, cycle between array[1] and array [2]
{0x2, 0x3, 0x1} -> should return 1, cycle between all the elements
{0x2, 0x3, NULL} -> should return 0, no cycle
I would appreciate any help and if my goal is still not quite clear, I am happy to explain more.
My idea would be to iterate over the array and somehow "follow" the pointers to see if I end up at the starting point again. If that's the case for at least one element, I've found a cycle.
Yes. You just "follow the pointers", but you need to know whether you followed to a pointer that you already hit.
So my idea to solve your problem is to make a struct that contains an index instead of a pointer because this makes life so much easier...
typedef struct {
size_t toIndex;
bool marked;
} Entry;
Then I create a new array of all these entries with the same length as the original. I calculate the toIndex that I store in the struct using the current element's pointer minus the address of the array's beginning.
bool contains_cycle(void* array[], size_t length) {
Entry newArray[length];
for(size_t i = 0; i < length; ++i) {
size_t toIndex = ((size_t) array[i] - (size_t) &array[0] ) / sizeof *array;
newArray[i] = (Entry) { toIndex, false };
}
After that I look for the first index where the pointer is not null
size_t index = 0;
for(size_t i = 0; i < length; ++i) {
if (array[i] == NULL) continue;
index = i;
break;
}
Now we let a loop run until we hit an index that is out of bounds (this implicitly detects a NULL element) and check whether the current element is already marked. If so, return true.
while(index < length) {
if (newArray[index].marked) return true;
newArray[index].marked = true;
index = newArray[index].toIndex;
}
If the loop exits without returning, you know that no cycle is reachable from that starting index. You now need to check whether a cycle is reachable from any other index that you haven't marked yet. But I'm too lazy to implement that now. Go try this yourself :)
For now I just return false
return false;
}
I tried to replicate your examples in the main function.
#include <stdio.h>
#include <stdbool.h>
typedef struct {
size_t toIndex;
bool marked;
} Entry;
bool contains_cycle(void* array[], size_t length) {
Entry newArray[length];
for(size_t i = 0; i < length; ++i) {
size_t toIndex = ((size_t) array[i] - (size_t) &array[0] ) / sizeof *array;
newArray[i] = (Entry) { toIndex, false };
}
size_t index = 0;
for(size_t i = 0; i < length; ++i) {
if (array[i] == NULL) continue;
index = i;
break;
}
while(index < length) {
if (newArray[index].marked) return true;
newArray[index].marked = true;
index = newArray[index].toIndex;
}
return false;
}
int main() {
void* example1[3];
void* example2[3];
void* example3[3];
example1[0] = NULL;
example1[1] = &example1[2];
example1[2] = &example1[1];
example2[0] = &example2[1];
example2[1] = &example2[2];
example2[2] = &example2[0];
example3[0] = &example3[1];
example3[1] = &example3[2];
example3[2] = NULL;
printf("%d ", contains_cycle(example1, 3));
printf("%d ", contains_cycle(example2, 3));
printf("%d ", contains_cycle(example3, 3));
}
I'm certain that there can be a faster way but the one above does work with your examples
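The step left open above (starting the walk from every index, not just the first non-NULL one) might look roughly like this. It is a sketch, not the answer's own method: it uses the signature from the question, introduces a hypothetical helper next_index, and swaps the marked flags for a step limit. It assumes every non-NULL element really points into the same array, as the question states.

```c
#include <stddef.h>

/* Hypothetical helper: index that array[i] points to, or `length` for NULL. */
static size_t next_index(void *const array[], size_t length, size_t i)
{
    if (array[i] == NULL)
        return length; /* out-of-bounds sentinel */
    return (size_t)((void *const *)array[i] - array);
}

/* Returns 1 if following the pointers from any start index never ends. */
int contains_cycle(void *const array[], size_t length)
{
    for (size_t start = 0; start < length; ++start) {
        size_t index = start;
        /* A walk without repeats visits at most `length` elements, so it
           must leave the array within `length` steps. */
        for (size_t steps = 0; steps < length && index < length; ++steps)
            index = next_index(array, length, index);
        if (index < length)
            return 1; /* still inside after `length` steps: a cycle */
    }
    return 0;
}
```

This runs in O(n^2) time but needs no extra array. Keeping the marked flags and remembering which elements are already known to lead to NULL would bring it down to O(n).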

LRS using C program

So I want to write a function in C that finds the longest repeated non-overlapping substring in a given string. For example: input banana, output an.
I was thinking of comparing parts of the string array with each other and checking for repeats. Is that a viable approach? How would I compare a substring with the rest of the string? I want to avoid using suffix trees if possible.
#include <stdio.h>
#include <string.h>
void stringcheck(char a[],int len, int s1, int s2)
{
int i=s1+1;
int j=s2+1;
if(j<=len&&a[i]==a[j])
{
printf("%c",a[i]);
stringcheck(a,len,i,j);
}
}
void dupcheck(char a[], int len, int start)
{
for(int i=start;i<len-1;i++)
{
for(int j=i+1;j<=len;j++)
{
if(a[i]==a[j])
{
printf("%c",a[i]);
stringcheck(a,len,i,j);
i=len;
}
}
}
}
int main()
{
char input[99];
scanf("%s",input);
int start=0;
int len =strlen(input);
dupcheck(input,len,start);
return 0;
}
Yes, this is a valid approach.
You can compare the string character by character, so there is no need to actually store a substring.
You can see a dynamic-programming solution in C++ that takes this approach here: https://www.geeksforgeeks.org/longest-repeating-and-non-overlapping-substring/
This solution can be converted to C without many changes.
Another variant is to store each candidate substring by its indexes.
You can then compare it against the rest of the string and keep the longest match; however, this takes O(n^3), whereas the solution above runs in O(n^2).
Edit: I converted the solution to C:
#include <stdio.h>
#include <string.h>
void longestRepeatedSubstring(char * str, char * res)
{
int n = strlen(str);
int LCSRe[n+1][n+1];
int res_length = 0; // To store length of result
int i, j, index = 0;
// Setting all to 0
memset(LCSRe, 0, sizeof(LCSRe));
// building table in bottom-up manner
for (i=1; i<=n; i++)
{
for (j=i+1; j<=n; j++)
{
// (j-i) > LCSRe[i-1][j-1] to remove
// overlapping
if (str[i-1] == str[j-1] &&
LCSRe[i-1][j-1] < (j - i))
{
LCSRe[i][j] = LCSRe[i-1][j-1] + 1;
// updating maximum length of the
// substring and updating the finishing
// index of the suffix
if (LCSRe[i][j] > res_length)
{
res_length = LCSRe[i][j];
index = (i>index) ? i : index;
}
}
else
LCSRe[i][j] = 0;
}
}
// If we have non-empty result, then insert all
// characters from first character to last
// character of string
j=0;
if (res_length > 0) {
for (i = index - res_length + 1; i <= index; i++) {
res[j] = str[i-1];
j++;
}
}
res[j]=0;
}
// Driver program to test the above function
int main()
{
char str[] = "banana";
char res[20];
longestRepeatedSubstring(str, res);
printf("%s",res);
return 0;
}
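For comparison, the slower index-based variant mentioned above (compare character by character from two start positions and remember only the best start and length) could be sketched like this. The name lrs_bruteforce is made up for illustration, and the caller is assumed to provide a res buffer of at least strlen(str)/2 + 1 bytes.

```c
#include <stddef.h>
#include <string.h>

/* Sketch of the O(n^3) variant: try every pair of start positions i < j and
   extend the match while the first occurrence stays entirely before j. */
size_t lrs_bruteforce(const char *str, char *res)
{
    size_t n = strlen(str);
    size_t best_len = 0, best_start = 0;

    for (size_t i = 0; i < n; ++i) {
        for (size_t j = i + 1; j < n; ++j) {
            size_t k = 0;
            /* i + k < j is the non-overlap condition. */
            while (j + k < n && i + k < j && str[i + k] == str[j + k])
                ++k;
            if (k > best_len) {
                best_len = k;
                best_start = i;
            }
        }
    }
    memcpy(res, str + best_start, best_len);
    res[best_len] = '\0';
    return best_len; /* 0 means no repeated substring was found */
}
```

It needs no table at all, so it trades the O(n^2) memory of the dynamic-programming version for extra running time.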

Changing the contents of an array in a recursive function

I am having trouble understanding something about recursion and arrays.
Basically, the program computes the maximum total weight of items that can be placed in two boxes. I know it's far from perfect as it is right now, but that is not the point.
Generally everything works properly. However, I now want to see the contents of each box when the weight is maximal. For this purpose I tried using arr1 and arr2.
I don't understand why I get different results for arr1 and arr2 depending on where I copy into them (the first option gives me what I want, the second does not).
This is the program:
#define N 5
int help(int items[N][2], int box1[N], int box2[N], int rules[N][N],
int optimal,int current_weight,int item,int arr1[],int arr2[])
{
if (item == N)
{
if(current_weight>optimal) //This is the first option
{
memcpy(arr1,box1,sizeof(int)*N);
memcpy(arr2,box2,sizeof(int)*N);
}
return current_weight;
}
int k = items[item][1]; int sol;
for (int i = 0; i <= k; i++)
{
for (int j = 0; i+j <= k; j++)
{
box1[item] += i; box2[item] += j;
if (islegal(items, box1, box2, rules))
{
sol = help(items, box1, box2, rules, optimal,
current_weight + (i + j)*items[item][0],item+1,arr1,arr2);
if (sol > optimal)
{
optimal = sol;
memcpy(arr1,box1,sizeof(int)*N); //This is the second option
memcpy(arr2,box2,sizeof(int)*N);
}
}
box1[item] -= i; box2[item] -= j;
}
}
return optimal;
}
int insert(int items[N][2], int rules[N][N])
{
int box1[N] = { 0 }; int arr1[N] = { 0 };
int box2[N] = { 0 }; int arr2[N] = { 0 };
int optimal = 0;
int x = help(items, box1, box2, rules,0, 0,0,arr1,arr2);
print(arr1, N);
print(arr2, N);
return x;
}
Can anyone explain what causes the difference? Why is the first option correct and the second not? I couldn't figure it out on my own.
Thanks a lot.
This doesn't work because when you pass box1 and box2 to help, they are mutated by help. It's pretty obvious from the algorithm that you want them to not be mutated. So, we can do as follows:
int help(int items[N][2], int box1in[N], int box2in[N], int rules[N][N],
int optimal,int current_weight,int item,int arr1[],int arr2[])
{
int box1[N];
int box2[N];
memcpy(box1, box1in, sizeof(int)*N);
memcpy(box2, box2in, sizeof(int)*N);
Your algorithm may still have problems but that problem is now removed.
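The underlying point is that an array parameter in C is really a pointer to the caller's storage, so any write inside the function is visible to the caller. A minimal sketch of the difference, with made-up function names (add_in_place, add_on_copy) and a small array size chosen only for illustration:

```c
#include <string.h>

#define BOXES 3 /* illustrative size, unrelated to N in the question */

/* Writes through the pointer: the caller's array changes. */
void add_in_place(int box[BOXES], int amount)
{
    box[0] += amount;
}

/* Copies first, as in the fix above: the caller's array is left untouched. */
int add_on_copy(const int box_in[BOXES], int amount)
{
    int box[BOXES];
    memcpy(box, box_in, sizeof box);
    box[0] += amount;
    return box[0];
}
```

After int a[BOXES] = {0}; add_in_place(a, 5); the caller sees a[0] == 5, while add_on_copy(a, 2) returns 7 and leaves a[0] at 5.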

Understanding returning values functions C

I'm trying to understand how a function's return value works, using the following exercise that was given to me.
It goes like this:
Write a function that, given an array of characters v and its size dim, returns the capital letter that is most often followed by the next letter in alphabetical order.
For example, given the string "B T M N M P S T M N" the function should return M (because it is followed by N twice).
This is how I thought about writing the function:
I treat the characters in the array as integers thanks to their ASCII codes, so I wrote an int function that returns an integer, which I then print as a char.
It seems to work: with the string BTMNMPSTMN the function prints M. But with the string 'ABDPE' the function returns P, which is not what I wanted; it should return 'A'.
I think I'm misunderstanding something in my code or about how return values work.
Any help would be appreciated.
The code goes like this:
#include <stdio.h>
int maxvolte(char a[],int DIM) {
int trovato;
for(int j=0;j<DIM-1;j++) {
if (a[j]- a[j+1]==-1) {
trovato=a[j];
}
}
return trovato;
}
int main()
{
int dim;
scanf("%d",&dim);
char v[dim];
scanf("%s",v);
printf("%c",maxvolte(v,dim));
return 0;
}
P.S.
I was unable to fill the array in a for loop with scanf("%c", &v[i]) or getchar(), because the program stops almost immediately: the '\n' is interpreted as a character. So I tried reading a string instead. That works, but I'd like to understand, or at least see an example of, how to store an array of characters properly.
Any help or tip would be appreciated.
There are a few things I think you did not get right.
First, you need to consider that there can be multiple pairs of characters satisfying a[j] - a[j+1] == -1. Your loop overwrites trovato every time it finds one, so it returns the letter of the last such pair rather than the most frequent one.
Second, you assume any input will produce a valid answer. There could be no such pair at all, for example with ACE as input.
Here is my fix based on your code. It does not address the second issue, but you can take it as a starting point.
#include <stdio.h>
#include <assert.h>
int maxvolte(char a[],int DIM) {
int count[26] = {0};
for(int j=0;j<DIM-1;j++) {
if (a[j] - a[j+1]==-1) {
int index = a[j] - 'A'; // assume all input are valid, namely only A..Z letters are allowed
++count[index];
}
}
int max = -1;
int index = -1;
for (int i = 0; i < 26; ++i) {
if (count[i] > max) {
max = count[i];
index = i;
}
}
assert (max != -1);
return index + 'A';
}
int main()
{
int dim;
scanf("%d",&dim);
char v[dim + 1]; /* one extra byte for the terminating '\0' that %s writes */
scanf("%s",v);
printf("answer is %c\n",maxvolte(v,dim));
return 0;
}
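One way the second issue (no qualifying pair at all) could be handled is to return a sentinel instead of a letter. The sketch below is an assumption-laden variant, not the answer's code: the name most_followed and the '\0' sentinel are invented here, and it relies on 'A'..'Z' being contiguous, as in ASCII.

```c
#include <stddef.h>

/* Sketch: returns the capital letter most often followed by its successor,
   or '\0' when no letter in the input is followed by its successor. */
char most_followed(const char *a, size_t dim)
{
    int count[26] = {0};

    for (size_t j = 0; j + 1 < dim; ++j) {
        /* Ignore anything that is not 'A'..'Y'; 'Z' has no successor. */
        if (a[j] >= 'A' && a[j] < 'Z' && a[j + 1] == a[j] + 1)
            ++count[a[j] - 'A'];
    }

    int best = -1;
    int max = 0; /* a letter must occur in at least one pair to win */
    for (int i = 0; i < 26; ++i) {
        if (count[i] > max) {
            max = count[i];
            best = i;
        }
    }
    return best < 0 ? '\0' : (char)('A' + best);
}
```

The caller can then test the result against '\0' before printing it. On ties, the letter that comes first in the alphabet wins.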
#include <stdio.h>
int maxvolte(char a[],int DIM) {
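int hold;
int freq;
int max = 0;
int result = 0; /* stays 0 if no letter is followed by its successor */
int i, j;
for(j=0; j<DIM-1; j++) { /* stop at DIM-1 so that a[j+1] stays in bounds */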
hold = a[j];
freq = 0;
if(a[j]-a[j+1] == -1) {
freq++;
}
for(i=j+1; i<DIM-1; i++) { //search another couple
if(hold==a[i]) {
if(a[i]-a[i+1] == -1) {
freq++;
}
}
}
if(freq>max) {
result = hold;
max=freq;
}
}
return result;
}
int main()
{
char v[] = "ABDPE";
int dim = sizeof(v) / sizeof(v[0]);
printf("\nresult : %c", maxvolte(v,dim));
return 0;
}
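On the P.S. in the question about reading characters one at a time: a leading space in the scanf format (" %c") makes scanf skip whitespace, including the '\n' left over from earlier input, before it reads the character. A small sketch, using a hypothetical helper read_letters that takes a FILE * so it can be used with stdin or any other stream:

```c
#include <stdio.h>

/* Reads up to dim non-whitespace characters from `in` into v.
   Returns how many characters were actually stored. */
int read_letters(FILE *in, char v[], int dim)
{
    int n = 0;
    /* The space before %c skips blanks, tabs and newlines. */
    while (n < dim && fscanf(in, " %c", &v[n]) == 1)
        ++n;
    return n;
}
```

Calling read_letters(stdin, v, dim) after reading dim with scanf("%d", &dim) would fill v with exactly the typed letters. No terminating '\0' is stored, so v is a plain character array rather than a string.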

Segmentation Fault when returning integer

I recently joined the Stack Overflow community because I had to ask this question. I've been searching the website for possible explanations and solutions, but so far nothing has enlightened me the way I hoped. My error is probably caused by a very specific line of code. I'm trying to create a function that reads an array of struct votes (the struct contains an integer member number, char *category and char *nominee) and copies all the votes that share the same number and category to another array of structs. Basically, it should show all the repeated votes.
typedef struct
{
int member;
char *categ;
char *nom;
} Vote;
Vote vote(int member, char *categ, char *nom)
{
Vote result;
result.member = member;
result.categ = categ;
result.nom = nom;
return result;
}
int votes_count(Vote *v, int n, Vote *v1)
{
int result = 0;
int *index = malloc(sizeof(int) * 1000);
int a = 0;
for (int i = 0; i < n; ++i)
{
for (int j = 0; j < n; ++j)
{
if (a == 0 && v[i].member == v[j].member && strcmp(v[i].categ, v[j].categ) == 0)
{
v1[result++] = vote(v[j].member, str_dup(v[j].categ), str_dup(v[j].nom));
index[a++] = j;
}
for (int b = 0; b < a; ++b)
{
if( a > 0 && v[i].member == v[j].member && strcmp(v[i].categ, v[j].categ) == 0 && j != index[b])
{
v1[result++] = vote(v[j].member, str_dup(v[j].categ), str_dup(v[j].nom));
index[a++] = j;
}
}
}
}
return result;
}
Afterwards, it returns the number of elements in the new array, which contains all the repetitions. I want to use an array of ints to save all the line indexes so that the function doesn't read and copy lines it has already accounted for.
Sorry if the code is hard to understand; if needed I can edit it to be more understandable. Thanks for any answers.
P.S.: I'm Portuguese, sorry in advance for grammar mistakes.
If your only intention is to harvest the duplicates, you only need to compare each element to the elements that came before it.
You don't need the index[] array.
For simplicity, I used two integer arrays; you should change them to your struct arrays and also change the compare function.
unsigned fetchdups(int orig[], int dups[], unsigned count)
{
unsigned this, that, ndup=0;
for (this=1; this<count; this++){
for (that=0; that<this; that++){
/* change this to your compare() */
if(orig[that] == orig[this]) break;
}
if (this == that) continue; /* no duplicate */
dups[ndup++] = this;
}
return ndup;
}
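Adapted to the Vote struct from the question, the same idea might look roughly like this. It is a sketch under assumptions: same_vote and fetch_dup_votes are made-up names, the duplicates are copied shallowly (the strings are shared with the original array rather than duplicated with str_dup), and dups is assumed to have room for count entries.

```c
#include <stddef.h>
#include <string.h>

typedef struct {
    int member;
    char *categ;
    char *nom;
} Vote;

/* Two votes are "the same" when member and category match. */
static int same_vote(const Vote *a, const Vote *b)
{
    return a->member == b->member && strcmp(a->categ, b->categ) == 0;
}

/* Copies every vote that repeats an earlier (member, category) pair. */
size_t fetch_dup_votes(const Vote *orig, Vote *dups, size_t count)
{
    size_t ndup = 0;
    for (size_t cur = 1; cur < count; ++cur) {
        for (size_t prev = 0; prev < cur; ++prev) {
            if (same_vote(&orig[prev], &orig[cur])) {
                dups[ndup++] = orig[cur]; /* shallow copy */
                break;                    /* count each repeat once */
            }
        }
    }
    return ndup;
}
```

Because each element is compared only with the ones before it, no bookkeeping array is needed and nothing is ever written past count entries of dups.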

Resources