i'm doing a school project implementing some sorting algorithms in C codes, and i'm working on a Binary Insertion Sort on some generic data type arrays (so i'm using void* items and void** arrays).
I have a binarySearch function that returns the index i would have to insert an item into the array to preserve the ordering of its elements (according to a given function that i pass to the sorting function), and this works correctly.
int binary_search(void **arr, void *item, long start, long end, int data_size, compFunc compare)
{
int s = start, e = end;
while (s <= e)
{
long middle = s + (e - s) / 2;
int comparison = compare(item, arr[middle]);
if (comparison == 0)
return middle;
else if (comparison > 0)
s = middle + 1;
else
e = middle - 1;
}
return s;
}
Then i have the binaryInsertSort function
void binary_insert_sort(void **arr, long arr_size, int data_size, compFunc compare)
{
long explored, j, pos;
void *current = malloc(sizeof(void*)), *holder;
if(!current){
perror("Error allocating memory\n");
exit(EXIT_FAILURE);
}
for (explored = 1; explored < arr_size; explored++)
{
memcpy(current, arr[explored], data_size);
j = explored - 1;
pos = binary_search(arr, current, 0, j, data_size, compare);
while (j >= pos)
{
holder = malloc(sizeof(void *));
memcpy(holder, (*arr) + (j + 1)*data_size, data_size);
//CRASHES HERE
memcpy((*arr) + (j + 1) * data_size, (*arr) + (j * data_size), data_size);
memcpy((*arr) + (j * data_size), holder, data_size);
j--;
free(holder);
}
}
free(current);
}
I call these functions like this
int main(int argc, char const *argv[])
{
char* arr[] = {"a", "b", "f", "d", "c", "g", "e", "1"};
int n = sizeof(arr)/sizeof(char*);
binary_insert_sort((void**)arr, n, sizeof(char*), string_compare);
print_string_array(arr, n);
return 0;
}
But it always crashes after the first memcpy, when I try to move arr[j+1] into arr[j]. I tried some debugging and, after printing (int)(arr[1]-arr[0]) (which should print the size of a cell, if I understand correctly), I noticed that cells have size 2 rather than the expected sizeof(char*) = 4, so there are problems accessing them correctly with *arr + j*data_size, since I'm moving j * 2 cells rather than j.
Why does this happen?
I apologize if I'm missing something basic, or if my English or formatting isn't right; this is my first time asking.
i'm doing a school project implementing some sorting algorithms in C codes, and i'm working on a Binary Insertion Sort on some generic data type arrays (so i'm using void* items and void** arrays).
You clarified in comments that what you mean is that you want a function that can sort arrays having any element type. This is exactly what the standard library's qsort() function does, so you should look to it for guidance on how such a function might look and work.
In particular, you need to understand that C has no generic data type. A void * can point to an object of any type, but void * itself is a specific, complete type, not generic in any way.* Thus, using void * items does not serve your purpose at all. Not even if you wanted to sort only arrays of pointers, because the C language does not guarantee that different pointer types have the same representation as each other, or even the same size, except only that char * and void * are required to have the same size and representation.
In other words, no, you don't have void * items, and you don't want to sort a "void ** array". And therefore no, your binarySearch() function for an array of void * does not serve your purposes -- however well it does its job, it's the wrong job.
Following qsort(), here's a signature that would serve your purpose:
typedef int (*compFunc)(const void *, const void *);
void binary_insert_sort(void *arr, size_t element_count, size_t element_size,
compFunc compare);
Note that the array to sort is conveyed via a pointer to its first element, received by the function as a pointer of type void * -- not void **. This does present an issue, however: you cannot perform pointer arithmetic or array indexing on a void *, because these operations are defined in terms of the size of the pointed-to type, which is unknown in this case because void is an incomplete type. But it should not be a particular surprise that such an issue arises, because the whole point of the exercise is to sort objects whose size is not known when the function is compiled.
So what do you do? The traditional approach would be via converting to char *:
#define ELEMENT_POINTER(base, index, size) ((char *) (base) + (index) * (size))
void *element_3_for_example = ELEMENT_POINTER(arr, 3, element_size);
So a comparison would then look like this ...
int result = compare(ELEMENT_POINTER(arr, i, element_size),
ELEMENT_POINTER(arr, j, element_size));
... and a swap might look like this:
void *temp = malloc(element_size);
// ...
memcpy(temp, ELEMENT_POINTER(arr, i, element_size), element_size);
memcpy(ELEMENT_POINTER(arr, i, element_size), ELEMENT_POINTER(arr, j, element_size), element_size);
memcpy(ELEMENT_POINTER(arr, j, element_size), temp, element_size);
// ...
free(temp);
I'll leave it to you to work the actual exercise in terms of those or similar constructs.
As for the actual question ...
why does this happen?
It's because your function is confused about whether the elements of the array are themselves the items being sorted or whether they are pointers to the elements being sorted. Erroneously assuming the latter, it makes the further questionable choice of swapping the data instead of the pointers. In this particular case, the data happen to be pointers after all, though that would not always be the case. The data they point to are arrays containing string literals. These arrays are not the expected size, so you have bounds overruns on both reading and writing, and they are not writable anyway, which is probably the specific source of the error.
* One could consider void to be a generic data type, as indeed this answer could be taken to demonstrate. But you cannot declare an object to have type void, nor access an object via an lvalue of type void, so this is largely moot.
Your approach is wrong from the start because of the function declarations and their calls, for example
binary_insert_sort((void**)arr, n, sizeof(char*), string_compare);
^^^^^^^^^^^
That is, if within the function you dereference the pointer, for example
arr[explored]
then the expression has the type void *. So if the original array has, for example, the type
char arr[] = "hello";
then the expression above uses incorrect pointer arithmetic. That is, instead of evaluating the address as arr + explored * sizeof( char ), the address is evaluated as arr + explored * sizeof( void * ).
Also the function is inefficient. There are too many memory allocations in the while loop
while (j >= pos)
{
holder = malloc(sizeof(void *));
//...
The function should be declared at least like
void binary_insert_sort( void *arr, size_t arr_size, size_t data_size, compFunc compare);
The function binary_search has a redundant parameter start. You are calling the function always passing 0 as its argument for the parameter start
pos = binary_search(arr, current, 0, j, data_size, compare);
Instead of the parameters start and end it is enough to pass the number of elements in the sub-array. So the function could be declared like
int binary_search( const void *arr, const void *item, size_t arr_size, size_t data_size, compFunc compare);
Or similarly to the standard C function bsearch like
int binary_search( const void *item, const void *arr, size_t arr_size, size_t data_size, compFunc compare);
If the sub-array already contains an element equal to the searched element, the function should return the position after that existing element rather than the position of the element itself.
Ok so thanks to the help of the previous comments i managed to get it somewhat going, here is the updated code
#define GET_PTR(base, offset, size) ((char *)(base) + (offset) * (size))
long binary_search(void *arr, void *item, size_t end, int data_size, compFunc compare)
{
long s = 0, e = end, middle;
int comparison;
while (s <= e)
{
middle = s + (e - s) / 2;
comparison = compare(item, GET_PTR(arr, middle, data_size));
if (comparison == 0)
return middle;
else if (comparison > 0)
s = middle + 1;
else
e = middle - 1;
}
return s;
}
void swap(void *base, long ind1, long ind2, size_t data_size)
{
void *temp = malloc(data_size);
memcpy(temp, GET_PTR(base, ind2, data_size), data_size);
memcpy(GET_PTR(base, ind2, data_size), GET_PTR(base, ind1, data_size), data_size);
memcpy(GET_PTR(base, ind1, data_size), temp, data_size);
free(temp);
}
void binary_insert_sort(void *arr, size_t arr_size, size_t data_size, compFunc cmp)
{
void *current = malloc(data_size), *holder;
long explored, shifting, bs_pos;
for (explored = 1; explored < arr_size; explored++)
{
current = GET_PTR(arr, explored, data_size);
shifting = explored - 1;
bs_pos = binary_search(arr, current, shifting, data_size, cmp);
while (shifting >= bs_pos)
{
swap(arr, shifting, shifting + 1, data_size);
shifting--;
}
}
free(current);
}
and this seems to work, it manages to correctly sort an array of int and of a 'Person' struct when called like this
int main(int argc, char const *argv[])
{
// char *string_arr[] = {"a", "b", "f", "d", "c", "g", "e", "1"};
// int char_size = sizeof(arr) / sizeof(arr[0]);
// print_string_array((char**)arr, char_size);
// binary_insert_sort((void*)arr, char_size, sizeof(char*), string_compare);
// print_string_array((char**)arr, char_size);
int int_arr[] = {5, 3, 2, 4, 0, 6, 14, -4, 0, 32, 2, -1};
int int_size = sizeof(int_arr) / sizeof(int_arr[0]);
print_int_array((int*)int_arr, int_size);
binary_insert_sort((void*)int_arr, int_size, sizeof(int), int_compare);
print_int_array((int*)int_arr, int_size);
Person p_arr[] = {{3, "fabio"}, {5, "marco"}, {0, "giulio"}, {2, "alberto"}, {1, "gabri"}};
int p_size = sizeof(p_arr)/sizeof(p_arr[0]);
for(int i =0;i<p_size;i++) print_person(&(p_arr[i]));
// binary_insert_sort(p_arr, p_size, sizeof(Person), person_int_cmp);
binary_insert_sort(p_arr, p_size, sizeof(Person), person_string_cmp);
for(int i =0;i<p_size;i++) print_person(&(p_arr[i]));
return 0;
}
But for some reason, the string array just wont sort: i put some debug prints and to me it seems like the problem is that when i compare them, what's really being compared is their address, as the value of comparison in binary_search is always 1 when it's sorting char_arr
Here are the functions being used:
typedef struct {
int id;
char* name;
} Person;
int person_int_cmp(void* p1, void* p2)
{
Person *a = (Person*)p1;
Person *b = (Person*)p2;
return a->id - b->id;
}
int person_string_cmp(void* p1, void* p2)
{
Person *a = (Person*)p1;
Person *b = (Person*)p2;
return strcmp(a->name, b->name);
}
void print_person(void* p){
Person* a = (Person*)p;
printf("%d: %s\n", a->id, a->name);
}
int string_compare(void *a, void *b)
{
char * str1 = (char *)a;
char * str2 = (char *)b;
return strcmp(str1, str2);
}
int int_compare(void *n1, void *n2)
{
int n01 = *(int *)n1, n02 = *(int *)n2;
return n01 - n02;
}
void print_int_array(int *arr, long size)
{
printf("\t[");
for (int i = 0; i < size; i++) printf("%d -> ", arr[i]);
printf("]\n");
}
void print_string_array(char** arr, long size)
{
printf("\t[");
for (int i = 0; i < size; i++) printf("%s -> ", arr[i]);
printf("]");
}
Am i treating the string array in some wrong way i am not aware of? What's even weirder to me is that it's able to correctly sort the Person array by the names (which are strings) but not an array of just strings
If anyone can point me what i'm doing wrong thanks so much
I was reading about function pointer. That it contains address of instructions. And there I encountered one question to find an element in array using function pointer. Here is the code.
#include <stdio.h>
#include <stdbool.h>
bool compare(const void* a, const void* b)
{
return (*(int*)a == *(int*)b);
}
int search(void* arr, int arr_size, int ele_size, void* x, bool compare(const void*, const void*))
{
char* ptr = (char*)arr; // Here why not int *ptr = (int*)arr;
int i;
for (i = 0; i < arr_size; i++)
{
if (compare(ptr + i * ele_size, x))
{
return i;
}
}
return -1;
}
int main()
{
int arr[] = { 2, 5, 7, 90, 70 };
int n = sizeof(arr) / sizeof(arr[0]);
int x = 7;
printf("Returned index is %d ", search(arr, n, sizeof(int), &x, compare));
return 0;
}
In the search function char *ptr = (char*)arr; is used which is giving perfect answer = 2.
But when I have used int *ptr = (int*)arr; it gives -1 as answer.
Why is this? Can anyone explain this?
A char is the smallest addressable unit in any C program, and on most systems it is a single 8-bit byte. Casting to char * treats the array as a generic sequence of bytes and uses ele_size to calculate the byte position of each element with ptr + i*ele_size.
If you use int *ptr then the byte-position calculation will be wrong by a factor of sizeof(int) (typically 4), since the pointer arithmetic will be done in units of the base type (int instead of char).
The function search knows nothing about what is the type of elements of the array pointed to by the pointer arr of the type void *.
So casting the pointer to the type int * does not make sense. If you do so, the expression ptr + i*ele_size, which uses pointer arithmetic, will produce an incorrect result.
That it contains address of instructions
There is a subtle difference between normal (object) pointers and function pointers. It is not possible to access the single instructions of a function - they do not have the same length.
With other pointers the increment (arithmetic) is adapted to the type, whether as p[i] or p + i or *(p+i).
Side note: there still is int at the bottom of the call chain:
return (*(int*)a == *(int*)b);
code:
int arr[5] = {1,2,3,4,5};
int (*p)[5] = &arr;
printf("p:%p\n",p);
printf("*p:%p\n",*p);
result: p = *p = arr = 0x7ffee517c830 they are all the address of the array
The right way to use p to visit arr[i] is *(*p+i)
The type of pointer p is int(*)[5], so p points to an array whose type is int [5]. But we can't say that p points to an invisible shell of arr; p is a variable after all. It stores the address of arr, which is also the address of arr[0], the first element of arr.
I thought *p will get me 1, which is the first element of arr.
The dereference operation means take the value in p as address and get the value from this address. Right?
So p stores the address of arr,which is 0x7ffee517c830 here, and 1 is stored in this address. Isn't **p illegal? The first dereference give us 1, and second dereference will use 1 as address which is illegal.
What am I missing?
The result of *p is an lvalue expression of array type. Using (*p) is exactly the same as using arr in any expression you could now think of.
For example:
&*p means &arr
**p means *arr (which is legal).
(*p)[i] means arr[i].
sizeof *p means sizeof arr.
Arrays are not special in this regard. You can see the same phenomenon with int x; int *q = &x;. Now *q and x have exactly the same meaning.
Regarding your last paragraph, I think you are confusing yourself by imagining pointers as glorified integers. Some people teach pointers this way but IMO it is not a good teaching technique because it causes the exact confusion you are now having.
If you dereference an int(*)[5] you get an int[5] and that's all there is to it. The data type matters in dereferencing. It does not make sense to talk about "dereferencing 0x7ffee517c830". Again this is not peculiar to arrays; if you dereference a char ***, you get a char ** etc.
The only way in which arrays are "different" in this discussion is what happens if you try to do arithmetic on them, or output them, etc. If you supply an int[5] as a printf argument for example, there is implicit conversion to int * pointing at the first of those 5 ints. This conversion also happens when applying the * operator to an int[5], which is why you get an int out of that.
p is declared as a 'pointer to int[5]'.
arr is declared as an 'int[5]'
so the initializer p = &arr; is not really that strange. If you substituted any primitive type for int[5] you wouldn't bat an eye.
*p is another handle on arr, so (*p)[0] is 1.
This really only comes up in weird cases. It's most natural where you dereference the pointer-to-array using the subscript operator. Here's a contrived example where I want to pass a table as argument.
#include <stdio.h>
/* Prints rows first..last (inclusive) of a table whose rows hold 2 ints. */
void print_row_range(int (*tab)[2], int first, int last)
{
    int i;
    for (i = first; i <= last; i++)
    {
        printf("{%d, %d}\n", tab[i][0], tab[i][1]);
    }
}
int main(int argc, char *argv[])
{
    int arr[3][2] = {{1,2},{3,4},{5,6}};
    print_row_range(arr, 1, 2);
}
This example treats the table as an array of rows.
Dereferencing doesn't give you a value. It gives you an object, which can be used as a value of its type if such a conversion exists.
*p, being identical to arr, is an array of 5 ints, so if you want to get an integer from the array, you must index or dereference it again, as in (*p)[3].
Consider a bigger example:
int arr[5][5];
int (*p)[5] = arr;
Now you get arr[0] with *p, which itself is an array of 5. Here comes the difference:
*( p+1) == arr[1];
*(*p+1) == arr[0][1];
^ ^^^
Got the point?
One use case is allocating a 2D (or higher-dimensional) array through a pointer to array, with only one malloc call:
#include <stdio.h>
#include <stdlib.h>
static int (*foo(size_t n))[42] {
return malloc(sizeof *foo(0) * n);
// return malloc(sizeof(int [n][42])); works too
}
int main(void) {
size_t n = 42;
int (*p)[42] = foo(n);
if (!p) {
return 1;
}
printf("p:");
int accu = 0;
for (size_t i = 0; i < n; i++) {
for (size_t j = 0; j < sizeof *p / sizeof **p; j++) {
p[i][j] = accu++;
printf(" %d", p[i][j]);
}
}
printf("\n");
free(p);
}
I think this is very funny.
One more with VLA:
#include <stdio.h>
#include <stdlib.h>
static void *foo(size_t elem, size_t n, size_t m) {
return malloc(elem * n * m);
}
int main(void) {
size_t n = 42;
int (*p)[n] = foo(sizeof **p, n, n);
if (!p) {
return 1;
}
printf("p:");
int accu = 0;
for (size_t i = 0; i < n; i++) {
for (size_t j = 0; j < sizeof *p / sizeof **p; j++) {
p[i][j] = accu++;
printf(" %d", p[i][j]);
}
}
printf("\n");
free(p);
}
I'm a java student who's currently learning about pointers and C.
I tried to make a simple palindrome tester in C using a single array and pointer arithmetic.
I got it to work without a loop (for example, for an array of size 10, *(test) == *(test+9) was true).
Having trouble with my loop. School me!
#include<stdio.h>
//function declaration
//int palindrome(int *test);
int main()
{
int output;
int numArray[10] = {0,2,3,4,1,1,4,3,2,0};
int *ptr;
ptr = &numArray[0];
output = palindrome(ptr);
printf("%d", output);
}
//function determine if string is a palindrome
int palindrome(int *test) {
int i;
for (i = 0; i <= (sizeof(test) / 2); i++) {
if (*(test + i) == *(test + (sizeof(test) - i)))
return 1;
else
return 0;
}
}
The name of an array decays to a pointer to its first element in most expressions. If you lose that pointer, you have no way to access the elements of the array, so you can simply pass the name of the array as the argument to the function.
In the palindrome function:
you have used sizeof(test)/2. Here test is a pointer, so sizeof(test) is the size of the pointer itself (typically 4 or 8 bytes), not the number of elements in the array, and hence you should not use it to calculate the middle element.
The size of a pointer is the same regardless of how many elements it points to.
Why do you copy your pointer in another variable?
int *ptr;
ptr = &numArray[0];
Just send it to your function:
palindrome(numArray);
And sizeof(test) gives you the size of a pointer in memory, which is not what you want. You have to pass the size as a parameter of your function.
int palindrome(int *test, int size){
...
}
Finally your code must look like this:
#include<stdio.h>
int palindrome(int *test, int size);
int main()
{
int output;
int numArray[10] = {0,2,3,4,1,1,4,3,2,0};
output = palindrome(numArray, 10);
printf("%d", output);
}
//function determine if string is a palindrome
int palindrome(int *test, int size) {
    int i;
    for (i = 0; i < size / 2; i++) {
        if (*(test + i) != *(test + (size - 1) - i))
            return 0;
    }
    return 1;
}
Okay so I need to sort several quite long strings in C. So I say to myself "why, you'd better use that handy dandy qsort function! Better write yourself a string_comparator for it!"
So of course I do and here she is:
int string_comparator(const void* el1, const void* el2) {
char* x = (char*) el1;
char* y = (char*) el2;
int str_len = strlen(x);
int i = 0;
for (; i < str_len; i++) {
//when there are non-equal chars
if (x[i] != y[i]) {
break;
}
}
return x[i] - y[i];
}
So of course I pass my handy dandy string_comparator function to the C qsort function as such:
qsort(list.words, list.num_words, sizeof(char*), string_comparator);
list is a struct that holds a char** (words) and ints which refer to the number of words held by it (such as num_words)
Now I have the problem where my list is not getting sorted alphabetically like I had hoped! I put a bunch of printf statements in my comparator and it printed out garbage values for the strings every time so I'm fairly sure that is the problem. But why is that the problem?? I've used qsort before (never to sort words..just sorting characters) and from what I understand this should work...What's going wrong here?
I appreciate any suggestions!
This is a common mistake when using qsort(). Here are the corrections:
char *x = *(char **) el1;
char *y = *(char **) el2;
Because list.words has type char **, not type char *, right?
Another example of qsort()
Here's how you sort an array of int with qsort():
int int_comparator(const void *el1, const void *el2)
{
int x = *(int *) el1;
int y = *(int *) el2;
return x - y;
}
void sort_ints(int *a, size_t n)
{
// these two lines are both "correct"
// the second line is more "obviously correct"
// qsort(a, n, sizeof(int), int_comparator);
qsort(a, n, sizeof(*a), int_comparator);
}
Now, if you go through and replace int with char *, you have to replace int * with char **.