Mergesort, weird behavior - c

Here's what I'm trying to do.
There are 3 arrays: cost[], node1[] and node2[].
These entries correspond to edges of a graph: node1[i], node2[i] and cost[i] specify that there is an edge going from vertex node1[i] to node2[i] with an edge weight of cost[i].
I'm trying to sort these edges with respect to their weights, i.e. sort the cost[] array using merge sort. However, whenever I change an entry in the cost[] array, I also want to change the corresponding entries in the node1[] and node2[] arrays, since the nodes have to move together with their weights. E.g. if node1[]={1,2,3}, node2[]={2,3,1} and cost[]={7,4,8}, then after sorting the cost array they should look like node1[]={2,1,3}, node2[]={3,2,1} and cost[]={4,7,8}.
Here's my code.
#include<stdio.h>
#include<stdlib.h>
int merge_sort(int arr[],int low,int high,int node1[],int node2[])
{
int mid;
if(low<high) {
mid=(low+high)/2;
// Divide and Conquer
merge_sort(arr,low,mid,node1,node2);
merge_sort(arr,mid+1,high,node1,node2);
// Combine
merge(arr,low,mid,high,node1,node2);
}
return 0;
}
int merge(int arr[],int l,int m,int h,int node1[],int node2[])
{
int arr1[80000],arr2[80000]; // Two temporary arrays to
int arr3[70000],arr4[70000];
int arr5[70000],arr6[70000];
int n1,n2,i,j,k;
n1=m-l+1;
n2=h-m;
for(i=0; i<n1; i++)
{
arr1[i]=arr[l+i];
arr3[i]=node1[l+i];
arr5[i]=node2[l+i];
}
for(j=0; j<n2; j++)
{
arr2[j]=arr[m+j+1];
arr4[i]=node1[m+j+1];
arr6[i]=node2[m+j+1];
}
arr1[i]=99999; // To mark the end of each temporary array
arr2[j]=99999;
arr3[i]=99999;
arr4[j]=99999;
arr5[i]=99999;
arr6[j]=99999;
i=0;
j=0;
for(k=l; k<=h; k++) { //process of combining two sorted arrays
if(arr1[i]<=arr2[j])
{
arr[k]=arr1[i++];
//node1[k]=arr3[i++]; COMMENTED LINES!!!!!!!!!!!
//node2[k]=arr5[i++];
}
else
{
arr[k]=arr2[j++];
//node1[k]=arr4[j++]; COMMENTED LINES!!!!!!!!~!
//node2[k]=arr6[j++];
}
}
return(0);
}
int main(void)
{
int i,j,n,vert1,vert2,weight;
scanf("%d",&n);
int adjmat[n+1][n+1],cluster[n+1][n+1];
int *cost,*node1,*node2;
node1=malloc(sizeof(int)*1000000);
node2=malloc(sizeof(int)*1000000);
cost=malloc(sizeof(int)*1000000);
for(i=0;i<n+1;i++)
for(j=0;j<n+1;j++)
{
adjmat[i][j]=0;
cluster[i][j]=0;
}
for(i=1;i<n+1;i++)
cluster[i][0]=i;
for(i=1;i<(n+1)*(n+1);i++)
{
scanf("%d %d %d",&vert1,&vert2,&weight);
node1[i]=vert1;
node2[i]=vert2;
cost[i]=weight;
if(node1[i]==node1[i-1] && node2[i]==node2[i-1] && cost[i]==cost[i-1])
break;
// printf("%d %d %d\n",node1[i],node2[i],cost[i]);
adjmat[vert1][vert2]=weight;
adjmat[vert2][vert1]=weight;
}
printf("\n%d\n",i);
merge_sort(cost,1,124751,node1,node2);
for(j=1;j<i;j++)
printf("%d %d %d\n",node1[j],node2[j],cost[j]);
return(0);
}
Whenever I comment out the marked lines in the merge function, the code manages to sort the cost array. However, whenever I uncomment these lines, somehow everything ends up as 0, i.e. all entries of the node1, node2 and cost arrays are 0. Could anyone tell me why this is happening? Thanks!

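You have probably forgotten to take care of the side effect of the i++ operation: every i++ (or j++) advances the index, so with the commented lines restored the index moves three times per iteration instead of once. There is no need at all to work with side effects at that place, so don't do that; read all three values with the same index and increment it once afterwards. Note also that the second copy loop writes arr4[i] and arr6[i] where it should write arr4[j] and arr6[j].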
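A minimal sketch of a merge that keeps the three arrays in step, assuming the edges are stored exactly as in the question. The names sort_edges and merge_edges are hypothetical, and the scratch buffers are allocated per merge instead of using fixed-size arrays and sentinel values, which is a design choice of this sketch and not the asker's method.

```c
#include <stdlib.h>
#include <string.h>

/* Merges the sorted runs [lo..mid] and [mid+1..hi] of cost[], moving the
   matching node1[]/node2[] entries along with every cost.
   Returns 0 on success, -1 if the scratch buffer cannot be allocated. */
static int merge_edges(int cost[], int node1[], int node2[],
                       int lo, int mid, int hi)
{
    int n = hi - lo + 1;
    int *tc = malloc(3 * (size_t)n * sizeof *tc);  /* one block, three views */
    if (tc == NULL)
        return -1;
    int *t1 = tc + n;
    int *t2 = t1 + n;
    int i = lo, j = mid + 1, k = 0;

    while (i <= mid || j <= hi) {
        /* Take from the left run when the right one is exhausted or when
           the left element is not larger (this keeps the sort stable). */
        int from_left = (j > hi) || (i <= mid && cost[i] <= cost[j]);
        int src = from_left ? i++ : j++;  /* the index advances exactly once */
        tc[k] = cost[src];
        t1[k] = node1[src];
        t2[k] = node2[src];
        k++;
    }
    memcpy(cost + lo, tc, (size_t)n * sizeof *tc);
    memcpy(node1 + lo, t1, (size_t)n * sizeof *t1);
    memcpy(node2 + lo, t2, (size_t)n * sizeof *t2);
    free(tc);
    return 0;
}

/* Sorts cost[lo..hi] ascending and permutes node1[]/node2[] the same way. */
int sort_edges(int cost[], int node1[], int node2[], int lo, int hi)
{
    if (lo >= hi)
        return 0;
    int mid = lo + (hi - lo) / 2;
    if (sort_edges(cost, node1, node2, lo, mid) != 0)
        return -1;
    if (sort_edges(cost, node1, node2, mid + 1, hi) != 0)
        return -1;
    return merge_edges(cost, node1, node2, lo, mid, hi);
}
```

With the example from the question, sort_edges(cost, node1, node2, 0, 2) would be the call for three edges stored at indices 0 to 2. Storing each edge in a single struct and sorting an array of structs would avoid the parallel arrays altogether.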

Related

Recursive function, switching to the next index based on what is contained at index i, j (C)

I am trying to create a recursive function; the code is at the bottom of the question.
What I am trying to accomplish is:
go through j one by one at index i; if arr[i][j] != i, switch to the index given by the value of arr[i][j], and keep doing this recursively, printing the current value of arr[i][j], i and j.
The point of this code is that the input can be any two-dimensional array, and it should be checked whether it can be sorted in one special way.
So in {{1,3,1},{2,1,2},{3,2,3}} all three sub-arrays can be sorted to {{1,1,1},{2,2,2},{3,3,3}}, which is what I would like to see and check,
but in the array {{1,3,1},{2,1,1},{3,2,3}} the sub-arrays at index i=1 and i=2 cannot be sorted, while i=3 can be sorted.
So is it possible to do this recursively? I am getting garbage for current_NEq in the printf.
Is there anything wrong with this code?
Code
#include <stdio.h>
int t=0;
int recursive(int i,int n,int m,int arr[n][m],int index,int current)
{
for(int j=0;i<3;j++)
{
if(arr[i][j]!=i)
{
printf("i=%d j=%d , index=%d current_NEq=%d \n",i,j,i,arr[i][j]);
recursive(arr[i][j],n,m,arr,arr[i][j],arr[i][j]);
}
else
{
t=arr[i][j];
}
}
}
int main()
{
int x[][3]={{1,3,1},{2,1,2},{3,2,2}};
for(int i=1;i<=3;i++)
{
recursive(i,3,3,x,(0)+1,0);
}
printf("Hello, World!\n");
return 0;
}

QuickSort not sorting properly

I have this program to receive a struct, store it and then sort it. I first tried Shell sort, but then went for the quicksort algorithm. However, when I print the array after sorting, it is still unsorted. Bear in mind I'm trying to sort it by 'num_aluno'.
CODE
typedef struct ALUNO
{
char primeiro_nome[15];
char segundo_nome[15];
int num_aluno;
float nota;
}ALUNO;
void swap(ALUNO* a, ALUNO* b)
{
ALUNO t=*a;
*a=*b;
*b=t;
}
int partition(ALUNO *studentList, int low, int high)
{
int pivot= studentList[high].num_aluno;
int i=(low-1);
int j;
for(j=low;j<=high-1;j++)
{
if(studentList[j].num_aluno<=pivot);
{
i++;
swap(&studentList[i], &studentList[j]);
}
}
swap(&studentList[i+1], &studentList[high]);
return(i+1);
}
void quickSort(ALUNO *studentList, int low, int high)
{
if(low<high)
{
int pi=partition(studentList, low, high);
quickSort(studentList, low, pi-1);
quickSort(studentList, pi+1, high);
}
}
int main()
{
ALUNO *studentList=NULL;
int currentPos, studentListSize=1;
//float grade_sum=0;
studentList=(ALUNO*)calloc(studentList, studentListSize*sizeof(ALUNO));
printf("Insira o numero de alunos \n");
scanf("%d", &studentListSize);
studentList=(ALUNO*)realloc(studentList, studentListSize*sizeof(ALUNO));
for(currentPos=0;currentPos<studentListSize;currentPos++)
{
newStudent(studentList, currentPos);
}
quickSort(studentList, 0, studentListSize);
for(currentPos=0;currentPos<studentListSize;currentPos++)
{
printStudent(studentList,currentPos);
}
free(studentList);
return 0;
}
Any help would be appreciated
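The reason it doesn't sort the list is that i and j always have the same value in the partition function: the stray semicolon after if(studentList[j].num_aluno<=pivot) ends the if statement, so the block below it runs on every iteration.
You should start j from high-1, not from low.
Also, you are not considering the values of studentList[i]. You must guarantee that it is larger than the pivot; otherwise you may swap away a value that is smaller than the pivot and already in the left part of the array.
Here is a corrected version. Note also that main should call quickSort(studentList, 0, studentListSize-1), since high is an inclusive index.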
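int partition(ALUNO *studentList, int low, int high)
{
int pivot = studentList[high].num_aluno;
int i = low;
int j = high - 1;
while(i <= j)
{
/* stops at high at the latest, because studentList[high] holds the pivot */
while(studentList[i].num_aluno < pivot)
i++;
/* the j >= i test keeps j from running below low */
while(j >= i && studentList[j].num_aluno > pivot)
j--;
if(i <= j){
swap(&studentList[i], &studentList[j]);
i++;
j--;
}
}
/* everything left of i is <= pivot, everything from i on is >= pivot */
swap(&studentList[i], &studentList[high]);
return i; /* the pivot now sits at index i */
}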
One more piece of advice, in case you have not heard of it: do not always choose the first or last value as the pivot. Instead, use the median-of-three strategy.
Median-of-three strategy: pick the first, middle and last elements of the unsorted array and put those three in sorted order. Then take the middle value and use it as the pivot. This makes QuickSort's worst-case time complexity (O(n^2)) much less likely, for example on already sorted input.
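The median-of-three step described above could be sketched as follows. This is only an illustration on a plain int array; median_of_three and swap_int are hypothetical names, and for the ALUNO array you would compare the num_aluno fields instead.

```c
/* Exchanges two ints. */
static void swap_int(int *x, int *y)
{
    int t = *x;
    *x = *y;
    *y = t;
}

/* Puts a[lo], a[mid] and a[hi] in ascending order and returns mid,
   the index that now holds the median of the three values. */
int median_of_three(int a[], int lo, int hi)
{
    int mid = lo + (hi - lo) / 2;

    if (a[mid] < a[lo])
        swap_int(&a[mid], &a[lo]);   /* now a[lo] <= a[mid] */
    if (a[hi] < a[lo])
        swap_int(&a[hi], &a[lo]);    /* now a[lo] is the smallest */
    if (a[hi] < a[mid])
        swap_int(&a[hi], &a[mid]);   /* now a[mid] <= a[hi] */
    return mid;
}
```

A partition function that expects the pivot at index high, like the one above, could call this first and then swap a[mid] with a[high] before partitioning.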

Radix sort gives wrong answer by changing just one loop of count subroutine

It seems a very trivial problem, but after a lot of thinking I still can't figure it out. I wrote these two programs for radix sort.
Code 1
#include <stdio.h>
#include <malloc.h>
#define BUCKET_SIZE 10
void prin(int* arr,int n)
{
int i;
for(i=0;i<n;i++)
printf("%d ",*(arr+i));
printf("\n");
}
int maxi(int* arr,int n)
{
int i,max=0;
for(i=0;i<n;i++)
{
if(arr[i]>max)
max=arr[i];
}
return max;
}
int* count(int *arr,int n,int k)
{
int* count,i,index;
int* output;
count=(int*)calloc(BUCKET_SIZE-1,sizeof(int));
output=(int*)malloc(n*sizeof(int));
for(i=0;i<n;i++)
{
index=(arr[i]/k)%10;
count[index]++;
}
for(i=0;i<BUCKET_SIZE;i++)
count[i]+=count[i-1];
for(i=n-1;i>=0;i--)
{
index=(arr[i]/k)%10;
output[count[index]-1]=arr[i];
count[index]--;
}
return output;
}
int* radixsort(int* arr,int n)
{
int i,max,k=1;
max=maxi(arr,n);
while(max>0)
{
max/=10;
arr=count(arr,n,k);
k=k*10;
}
return arr;
}
void main()
{
int n,i;
scanf("%d",&n);
int* arr;
arr=(int*)malloc(n*sizeof(int));
for(i=0;i<n;i++)
scanf("%d",(arr+i));
arr=radixsort(arr,n);
prin(arr,n);
}
Now if I change the count subroutine as below, the code no longer sorts the given array, and I can't figure out why. I am still traversing the whole array and still calculating the right index, so my elements should be placed correctly and I should get a sorted array.
Code 2
Only count function last loop changed.
int* count(int *arr,int n,int k)
{
int* count,i,index;
int* output;
count=(int*)calloc(BUCKET_SIZE-1,sizeof(int));
output=(int*)malloc(n*sizeof(int));
for(i=0;i<n;i++)
{
index=(arr[i]/k)%10;
count[index]++;
}
for(i=0;i<BUCKET_SIZE;i++)
count[i]+=count[i-1];
for(i=0;i<n;i++)
{
index=(arr[i]/k)%10;
output[count[index]-1]=arr[i];
count[index]--;
}
return output;
}
When I do just counting sort, both functions work well. Can someone point out where I am going wrong with radix sort, or what I am missing, and why both versions work for counting sort?
Thanks.
In the final loop of your count function, when these lines copy the contents of each "bucket", they write the last element of the output "bucket" first, followed by the next-to-last, ending with the first element:
output[count[index]-1]=arr[i];
count[index]--;
In the first version of your program, since you visit the elements of the input array starting at the end of the array and working your way back toward the beginning, you encounter the last element of each bucket first (and therefore put it in the last position in the output bucket), then the next-to-last element (which you put in the next-to-last position in the output), and so forth. The first element of each bucket is the last copied and is copied to the first position in the bucket.
In the second version of your program, you continue to fill in the spaces in each output bucket from back to front, but you read the input from front to back. This has the result of putting the first element of each bucket in the last position within that bucket, and the last element of the bucket in the first position.
That is, each time you run the count function it reverses the order of elements within each bucket.
If you want to copy the input array reading it from front to back, you need to fill in each output bucket from front to back by using ++count[index] instead of --count[index]. You also have to start each entry of count[index] at a lower number so that you write to the correct locations.
Aside: your program does a lot more allocation than it needs to, and doesn't free any memory, so you have a potentially massive memory leak.
You might consider passing already-allocated arrays into count instead of always allocating new ones.
Here is a front to back example, that also replaces the original array with a sorted array, freeing the original array. An alternative would be to do a one time allocation of a second working array, radix sort back and forth between original and working arrays, then keep the sorted array, and free the "other" array.
#include <stdio.h>
#include <stdlib.h>
#define BUCKET_SIZE 10
void prin(int* arr, int n)
{
int i;
for(i = 0; i < n; i++)
printf("%d ", arr[i]);
printf("\n");
}
int maxi(int* arr, int n)
{
int i,max = 0;
for(i = 0; i < n; i++)
{
if(arr[i] > max)
max = arr[i];
}
return max;
}
/* replaces array with sorted array, frees original array */
void count(int** parr, int n, int k)
{
int* count, i, index;
int* arr = *parr;
int* output;
int sum, cur;
count=calloc(BUCKET_SIZE, sizeof(int));
output=malloc(n*sizeof(int));
for(i = 0; i < n; i++){
index = (arr[i]/k)%10;
count[index]++;
}
sum = 0;
for(i = 0; i < BUCKET_SIZE; i++){
cur = count[i];
count[i] = sum;
sum += cur;
}
for(i = 0; i < n; i++){
index = (arr[i]/k)%10;
output[count[index]++] = arr[i];
}
free(arr);
free(count);
*parr = output;
}
void radixsort(int** parr,int n)
{
int max,k=1;
max=maxi(*parr,n);
while(max>0)
{
max/=10;
count(parr,n,k);
k=k*10;
}
}
int main()
{
int n,i;
int* arr;
scanf("%d",&n);
arr = malloc(n*sizeof(int));
for(i = 0; i < n; i++)
scanf("%d",&arr[i]);
radixsort(&arr,n);
prin(arr,n);
free(arr);
return 0;
}

counting number of swaps in insertion sort

In the problem given here, I have to count the total number of swaps required while sorting an array using insertion sort.
Here is my approach:
#include <stdio.h>
int main()
{
int t, N, swaps, temp, i, j;
scanf("%d", &t);
while(t--){
scanf("%d", &N);
int arr[N];
swaps = 0;
for(i=0; i<N; ++i){
scanf("%d", &temp);
j=i;
while(j>0 && arr[j-1] > temp){
arr[j] = arr[j-1];
++swaps;
--j;
}
arr[j] = temp;
}
printf("%d\n", swaps);
}
return 0;
}
But this solution gives Time Limit Exceeded.
How can I make it faster?
And what are the other, better solutions to this problem?
This is a standard problem called inversion count.
It can be solved using merge sort in O(n*lg(n)). Here is my code for counting the inversions:
#include <stdio.h>
#include <limits.h>
int a[200001];
long long int count;
void Merge(int p,int q,int r)
{
int n1,n2,i,j,k,li,ri;
n1=q-p+1;
n2=r-q;
int l[n1+1],rt[n2+1];
for(i=0;i<n1;i++)
l[i]=a[p+i];
for(i=0;i<n2;i++)
rt[i]=a[q+1+i];
l[n1]=INT_MAX; /* sentinels: l and rt are int arrays, so use INT_MAX (from limits.h) */
rt[n2]=INT_MAX;
li=0;ri=0;
for(i=p;i<=r;i++)
{
if(l[li]<=rt[ri])
a[i]=l[li++];
else
{
a[i]=rt[ri++];
count+=n1-li;
}
}
}
void mergesort(int p,int r)
{
if(p<r)
{
int q=(p+r)/2;
mergesort(p,q);
mergesort(q+1,r);
Merge(p,q,r);
}
}
int main()
{
int n,i;
scanf("%d",&n);
for(i=0;i<n;i++)
scanf("%d",&a[i]);
count=0;
mergesort(0,n-1);
printf("%lld\n",count);
}
Basically, the inversion count problem is to find the number of pairs (i, j) with j > i such that a[i] > a[j].
To understand the idea behind this, you should know the basic merge sort algorithm:
http://en.wikipedia.org/wiki/Merge_sort
Idea:
Use divide and conquer.
divide: split the sequence of size n into two lists of size n/2
conquer: count the inversions in the two lists recursively
combine: this is the tricky part (doing it in linear time)
To combine, use merge-and-count. Suppose the two lists are A and B, and they are already sorted. Produce an output list L from A and B while also counting the number of inversions, i.e. pairs (a,b) where a is in A, b is in B and a > b.
The idea is similar to "merge" in merge sort: merge two sorted lists into one output list, but also count the inversions.
Every time a_i is appended to the output, no new inversions are encountered, since a_i is no larger than anything left in list B. If b_j is appended to the output, then it is smaller than all the remaining items in A, so we increase the inversion count by the number of elements remaining in A.
This reminds me of a similar problem you may want to look at: http://www.spoj.pl/problems/YODANESS/
In your problem, you can't afford the time to swap everything when many swaps are required. (Imagine the input in reverse order, 9,8,7,6,...: then you would basically have to swap everything with everything.)
I think in your case, each number must be swapped with all the numbers to the left of it that are larger than it.
I suggest you use a range tree http://en.wikipedia.org/wiki/Range_tree
The great thing about a range tree is each node can know how many nodes are to its left and to its right. You could ask the tree "how many numbers are there greater than 10" very efficiently and that's how many swaps you would have for a 9 say.
The trick is to build the range tree as you move from i=0 to i=N-1. At each point you can query the tree against the ith number before inserting the ith number into the range tree.
good luck!
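A minimal sketch of the "query, then insert" loop described above. It swaps in a Fenwick tree (binary indexed tree) in place of the range tree the answer mentions, since it supports the same "how many earlier numbers are greater than x" query with very little code. It assumes every value lies in 1..max_value; count_inversions_bit is a hypothetical name.

```c
#include <stdlib.h>

/* Counts pairs (i, j) with i < j and a[i] > a[j].
   Assumes every value lies in 1..max_value.
   Returns -1 if the tree cannot be allocated. */
long long count_inversions_bit(const int *a, int n, int max_value)
{
    long long inversions = 0;
    /* tree[] is a Fenwick tree over values: it answers
       "how many values <= x have been inserted so far". */
    int *tree = calloc((size_t)max_value + 1, sizeof *tree);
    if (tree == NULL)
        return -1;

    for (int i = 0; i < n; i++) {
        /* Query before inserting: earlier values that are <= a[i]. */
        int not_greater = 0;
        for (int x = a[i]; x > 0; x -= x & -x)
            not_greater += tree[x];

        /* i values were inserted before a[i]; the rest are greater. */
        inversions += i - not_greater;

        /* Insert a[i]. */
        for (int x = a[i]; x <= max_value; x += x & -x)
            tree[x]++;
    }
    free(tree);
    return inversions;
}
```

Each query and each insert takes O(log max_value) steps, so the whole count is O(n log max_value). If the values are large or negative, they would first have to be mapped to ranks 1..n.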
I wrote the same code in C++ and it gets accepted; it takes about 4.2 seconds on SPOJ (http://www.spoj.com/submit/CODESPTB/).
Here is the code snippet:
//http://www.spoj.com/problems/CODESPTB/
//mandeep singh #msdeep14
#include<iostream>
using namespace std;
int insertionsort(int arr[], int s)
{
int current,i,j,count=0;
for(i=1;i<s;i++)
{
current=arr[i];
for(j=i-1;j>=0;j--)
{
if(current<arr[j])
{
arr[j+1]=arr[j];
count++;
}
else
break;
}
arr[j+1]=current;
}
return count;
}
int main()
{
int t,n,i,res;
int arr[100000];
cin>>t;
while(t--)
{
cin>>n;
for(i=0;i<n;i++)
{
cin>>arr[i];
}
res=insertionsort(arr,n);
cout<<res<<endl;
}
return 0;
}
#include <stdio.h>
int main() {
int N, swaps, temp[100], i, j;
scanf("%d", & N);
int arr[N];
swaps = 0;
for (i = 0; i < N; i++) {
scanf("%d", & temp[i]);
j = i;
while (j > 0 && arr[j - 1] > temp[i]) {
arr[j] = arr[j - 1];
++swaps;
--j;
}
arr[j] = temp[i];
}
printf("%d", swaps);
return 0;
}

Kruskal C implementation

I have implemented Kruskal's algorithm in C using an adjacency matrix graph representation. The problem is that it keeps failing with a segmentation fault. I've been trying to figure out what is wrong for quite a while and can't find the problem. Could anyone else take a look, please?
Thanks.
Here is my code:
#include <stdio.h>
#include <stdlib.h>
#define MAXVERT 10
#define MAXEDGES 20
#define INF 100000
/*graph representation using an Adjacency matrix*/
typedef struct AdjMatrix
{
int nodes;
int adjMat[MAXVERT][MAXVERT];
} graph;
/*function prototypes*/
int find(int node, int *trees);
void merge(int i, int j, int *trees);
void printminimal(int min[][3], int n);
/*main algorithm*/
void kruskal(graph *g)
{
int EDGES[MAXEDGES][3]; /*graph edges*/
int MINEDGES[MAXVERT-1][3]; /*edges already in the minimal spanning tree*/
int nextedge=0;
int numedges=0;
int trees[MAXVERT]; /*tree subsets*/
int i, j, k;
int temp;
for(i=0;i<g->nodes;i++)
trees[i]=i;
k=0;
for(i=0; i<g->nodes; i++)
for(j=0; j<g->nodes; j++)
{
if(i<j)
{
EDGES[k][0]=i;
EDGES[k][1]=j;
EDGES[k][2]=g->adjMat[i][j];
k++;
}
else
break;
}
/*Bubblesort*/
for(i=0; i<g->nodes; i++)
for(j=0; j<i; j++)
{
if(EDGES[j][2] > EDGES[j+1][2])
{
temp=EDGES[j][0];
EDGES[j][0]=EDGES[j+1][0];
EDGES[j+1][0]=temp;
temp=EDGES[j][1];
EDGES[j][1]=EDGES[j+1][1];
EDGES[j+1][1]=temp;
temp=EDGES[j][2];
EDGES[j][2]=EDGES[j+1][2];
EDGES[j+1][2]=temp;
}
}
while(numedges < (g->nodes-1))
{
i=find(EDGES[nextedge][0], trees);
j=find(EDGES[nextedge][1], trees);
if((i!=j)&&(EDGES[nextedge][2]!=-1)) /*check if the nodes belong to the same subtree*/
{
merge(i,j,trees);
MINEDGES[numedges][0]=EDGES[nextedge][0];
MINEDGES[numedges][1]=EDGES[nextedge][1];
MINEDGES[numedges][2]=EDGES[nextedge][2];
numedges++;
}
nextedge++;
}
}
int find(int node, int *trees)
{
if(trees[node]!=node)
return trees[node];
else
return node;
}
void merge(int i, int j, int *trees)
{
if(i<j)
trees[j]=i;
else
trees[i]=j;
}
void printminimal(int min[][3], int n)
{
int i, weight=0;
printf("Minimal tree:\n(");
for(i=0;i<n;i++)
{
printf("(V%d,V%d), ", min[i][0],min[i][1]);
weight+=min[i][2];
}
printf(")\n Total weight sum of the minimal tree is: %d", weight);
}
int main(void)
{
int i,j;
graph *g=(graph *)malloc(sizeof(graph));
/*int adjMat[8][8] = {0,INF,INF,11,INF,1,7,
INF,0,INF,3,INF,4,8,INF,
INF,INF,0,INF,INF,INF,12,INF,
INF,3,INF,0,15,INF,INF,INF,
11,INF,INF,INF,0,20,INF,INF,
INF,4,INF,INF,20,0,INF,INF,
1,8,12,INF,INF,INF,0,5,
7,INF,INF,INF,INF,INF,5,0};*/
for(i=0;i<4;i++)
for(j=0;j<i;j++)
{
if(i==j)
{
g->adjMat[i][j]=0;
continue;
}
printf("%d-%d= ", i, j);
scanf("%d", &(g->adjMat[i][j]));
g->adjMat[j][i]=g->adjMat[i][j];
}
g->nodes=4;
kruskal(g);
}
In the kruskal function, where you intend to populate the EDGES array, you don't:
for(i=0; i<g->nodes; i++)
for(j=0; j<g->nodes; j++)
{
if(i<j)
{
EDGES[k][0]=i;
EDGES[k][1]=j;
EDGES[k][2]=g->adjMat[i][j];
k++;
}
else
break;
}
For j == 0, i is never < j, so you immediately break out of the inner loop. I suspect it should be i > j in the condition.
Since EDGES is uninitialised, find tries to access an unspecified element of trees.
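One way to avoid the early break, as a minimal sketch: start the inner loop at i+1, so every unordered pair of vertices is visited exactly once and no self-loops are stored. collect_edges is a hypothetical helper, not part of the original program; MAXVERT is taken from the question.

```c
#define MAXVERT 10

/* Writes one row {u, v, weight} into edges[] for every pair u < v and
   returns the number of rows written.
   edges must have room for nodes * (nodes - 1) / 2 rows. */
int collect_edges(int nodes, int adj[MAXVERT][MAXVERT], int edges[][3])
{
    int k = 0;

    for (int i = 0; i < nodes; i++) {
        /* Starting at i + 1 skips the diagonal and the mirrored half. */
        for (int j = i + 1; j < nodes; j++) {
            edges[k][0] = i;
            edges[k][1] = j;
            edges[k][2] = adj[i][j];
            k++;
        }
    }
    return k;
}
```

The return value is the number of edges actually stored, which is also the count the sorting step would need to cover. Note that MAXVERT vertices can produce up to 45 such pairs, more than the MAXEDGES value of 20 in the question, so the EDGES array would have to be sized accordingly for larger graphs.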
I had to add the following to kruskal to get it to compile with gcc:
int *dvra = trees;
You can then compile it with debug information:
gcc -g -o kruskal kruskal.c
and run it through gdb:
gdb kruskal
You can then type run and enter to start the program. I entered 1,2,3,... when prompted for values.
This then gives:
Program received signal SIGSEGV, Segmentation fault.
0x0000000000400a92 in find (node=32767, trees=0x7fffffffe110) at test.c:86
86 if(trees[node]!=node)
Hmm, that's curious. trees holds only 10 items (the value of the MAXVERT define), so accessing node 32767 goes out of bounds. If you enter 32767 in a calculator program and switch to the programming (hexadecimal) mode, you will find it is 7FFF (SHRT_MAX, the maximum 16-bit signed integer value). That's also interesting.
NOTE: You can investigate variable values by using the print command (e.g. print node) and the backtrace using the bt command.
These are coming from the while loop in kruskal (the only place that is calling find), so we need to investigate where that value is coming from. Let's quit gdb (press 'q' and enter, then confirm with 'y' and enter).
Add the following to the while loop and run the resulting program:
printf("%d: nextedge=%d EDGES[nextedge][0]=%d EDGES[nextedge][1]=%d\n", numedges, nextedge, EDGES[nextedge][0], EDGES[nextedge][1]);
which gives:
0: nextedge=0 EDGES[nextedge][0]=-557487152 EDGES[nextedge][1]=32767
So it looks like EDGES[0] is not being initialized, which points to the if(i<j) condition in the initialization loop above the bubblesort. OK, so let's trace what is happening in the initialization loop by adding the following inside the if block:
printf("EDGES[%d]: 0=%d 1=%d\n", k, i, j);
Rerunning this, we see that there are no lines associated with this statement, so it is not getting executed.
Changing the if condition to:
if(i<=j)
causes the statement to be executed and the segmentation fault to go away.
