So I have an array, without any specified type:
void* buff = malloc(size*eltSize);
And I have a function, that has a void* parameter, and I want to assign it to the array, something like this:
void function(void* p1){
buff[i] = p1;
}
I know that this doesn't work, but if I want to make it as generic as possible, what's the best way to do it? Remember, I have no idea which types will be used (it should accept any type, even a struct).
Thank you
You have to pass the element size (and the array index, for that matter) manually each time, similar to how qsort works. You'd have to change your function to something like:
void function(void * buff, void * p1, size_t elt_size, size_t index){
memcpy(((char *) buff) + index * elt_size, p1, elt_size);
}
and call it such as:
int array[] = {3, 1, 4, 1, 5, 9};
int n = 8;
function(array, &n, sizeof(n), 5); // Equivalent to array[5] = n;
A full working example:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
void function(void * buf, void * data, size_t elt_size, size_t index)
{
memcpy(((char *) buf) + index * elt_size, data, elt_size);
}
int main(void)
{
int narray[] = {3, 1, 4, 1, 5, 9};
int n = 8;
function(narray, &n, sizeof(n), 5); // Equivalent to array[5] = n
for ( size_t i = 0; i < sizeof(narray) / sizeof(narray[0]); ++i ) {
printf("Value of element [%zu] is: %d\n", i, narray[i]);
}
char * sarray[] = {"The", "mome", "raths", "outgrabe"};
char * p = "barked";
function(sarray, &p, sizeof(p), 3); // Equivalent to sarray[3] = p
for ( size_t i = 0; i < sizeof(sarray) / sizeof(sarray[0]); ++i ) {
printf("Value of element [%zu] is: %s\n", i, sarray[i]);
}
return 0;
}
with output:
Paul@Pauls-iMac:~/Documents/src/sandbox$ ./generic2
Value of element [0] is: 3
Value of element [1] is: 1
Value of element [2] is: 4
Value of element [3] is: 1
Value of element [4] is: 5
Value of element [5] is: 8
Value of element [0] is: The
Value of element [1] is: mome
Value of element [2] is: raths
Value of element [3] is: barked
Paul@Pauls-iMac:~/Documents/src/sandbox$
Obviously it will work just as well with arrays dynamically allocated with malloc() as it will with the regular arrays that this example uses.
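To illustrate the malloc() case with a struct element type, here is a minimal sketch. The names struct point, set_elem and make_filled are made up for this example and are not part of the code above; set_elem is the same memcpy helper under a different name.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical element type, used only for illustration. */
struct point {
    double x;
    double y;
};

/* Same helper as above: copy elt_size bytes from data into slot index. */
static void set_elem(void *buf, const void *data, size_t elt_size, size_t index)
{
    memcpy((char *)buf + index * elt_size, data, elt_size);
}

/* Allocate room for count points and store p in every slot.
 * Returns NULL if allocation fails; the caller frees the result. */
static struct point *make_filled(size_t count, struct point p)
{
    struct point *buf = malloc(count * sizeof *buf);
    if (buf == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < count; ++i) {
        set_elem(buf, &p, sizeof p, i);
    }
    return buf;
}
```

Because the helper only sees bytes, nothing in it changes when the element is a struct instead of an int or a pointer.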
You can eliminate the need to pass the element size every time if you create a struct to hold the data and the element size together, for instance:
struct generic_array {
void * data;
size_t elt_size;
};
When you pass a pointer to this struct to your function, it'll be able to access the element size itself, both eliminating the need for you to provide it, and eliminating a whole category of bugs arising from you inadvertently passing the wrong size. If you add a third member to store the number of elements you initially malloc()ed, then you can do bounds-checking, too.
Full working example of that approach:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
struct generic_array {
void * data;
size_t elt_size;
size_t size;
};
struct generic_array * generic_array_create(const size_t elt_size,
const size_t size)
{
struct generic_array * new_array = malloc(sizeof *new_array);
if ( !new_array ) {
perror("couldn't allocate memory for array");
exit(EXIT_FAILURE);
}
void * data = malloc(size * elt_size);
if ( !data ) {
perror("couldn't allocate memory for array data");
exit(EXIT_FAILURE);
}
new_array->data = data;
new_array->elt_size = elt_size;
new_array->size = size;
return new_array;
}
void generic_array_destroy(struct generic_array * array)
{
free(array->data);
free(array);
}
void generic_array_set(struct generic_array * array, void * elem,
const size_t index)
{
if ( index >= array->size ) {
fprintf(stderr, "Index %zu out of bounds of size %zu.\n",
index, array->size);
exit(EXIT_FAILURE);
}
memcpy(((char *)array->data) + index * array->elt_size,
elem, array->elt_size);
}
void generic_array_get(struct generic_array * array, void * elem,
const size_t index)
{
if ( index >= array->size ) {
fprintf(stderr, "Index %zu out of bounds of size %zu.\n",
index, array->size);
exit(EXIT_FAILURE);
}
memcpy(elem, ((char *)array->data) + index * array->elt_size,
array->elt_size);
}
int main(void)
{
int narray[] = {3, 1, 4, 1, 5, 9};
const size_t nsize = sizeof(narray) / sizeof(narray[0]);
struct generic_array * garray = generic_array_create(sizeof(int), nsize);
for ( size_t i = 0; i < nsize; ++i ) {
generic_array_set(garray, &narray[i], i);
}
for ( size_t i = 0; i < nsize; ++i ) {
int n;
generic_array_get(garray, &n, i);
printf("Value of element %zu: %d\n", i, n);
}
generic_array_destroy(garray);
return 0;
}
If you want to copy an object, and you don't know its type, only its size, you use memcpy:
void* buff = malloc(size*eltSize);
void function(void* p1) {
memcpy((char *)buff + i * eltSize, p1, eltSize);
}
Since you don't know the type, you can't use indexing directly, but rather have to manually calculate the address with pointer arithmetic.
Since you are trying to store a pointer to void as an element of buff, buff must have type void**.
int i = 0;
void* *buff = malloc(size * sizeof(void*));
if (buff == NULL)
// handle error
void function(void* p1) {
buff[i] = p1; // now OK
}
This is a potential solution if you want to remember the type of each item stored in your generic array. I used a fixed-size array, added the basic types to the enum, and included just two examples of how to get your data back in its original type.
You could use function pointers if you have more types and don't want a long chain of 'if' statements.
#include <stdint.h>
#include <stdio.h>
enum type {
INT,
FLOAT,
CHAR,
STRING
};
struct gen_array {
enum type elm_type;
void *data;
};
/* The int is stored inside the pointer value itself, so convert through
intptr_t to avoid a pointer-to-int size mismatch. */
int to_int(void *data) {
return ((int)(intptr_t) data);
}
char *to_string(void *data) {
return ((char *) data);
}
void printer(struct gen_array *arr, size_t size) {
for (size_t i = 0; i < size; i++) {
if (arr[i].elm_type == STRING)
printf("%s\n", to_string(arr[i].data));
if (arr[i].elm_type == INT)
printf("%d\n", to_int(arr[i].data));
}
}
int main(void) {
struct gen_array buff[2];
struct gen_array elm_0;
elm_0.elm_type = INT;
elm_0.data = (void*)(intptr_t)10;
buff[0] = elm_0;
struct gen_array elm_1;
elm_1.elm_type = STRING;
elm_1.data = (void*)"helloWorld!";
buff[1] = elm_1;
printer(buff, 2);
return (0);
}
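The function-pointer idea mentioned above could look roughly like the sketch below. It is not the code from this answer: the names format_fn, formatters, format_elem and TYPE_COUNT are assumptions made for illustration, and unlike the example above, data here points at the stored value rather than holding an int inside the pointer itself.

```c
#include <stdio.h>

enum type { INT, FLOAT, CHAR, STRING, TYPE_COUNT };

struct gen_array {
    enum type elm_type;
    const void *data;   /* points at the stored value */
};

/* One formatter per type; each writes the value into out and returns
 * whatever snprintf returns (the length of the formatted text). */
typedef int (*format_fn)(char *out, size_t n, const void *data);

static int format_int(char *out, size_t n, const void *data)
{
    return snprintf(out, n, "%d", *(const int *)data);
}

static int format_float(char *out, size_t n, const void *data)
{
    return snprintf(out, n, "%f", *(const float *)data);
}

static int format_char(char *out, size_t n, const void *data)
{
    return snprintf(out, n, "%c", *(const char *)data);
}

static int format_string(char *out, size_t n, const void *data)
{
    return snprintf(out, n, "%s", (const char *)data);
}

/* Dispatch table indexed by the enum, replacing the chain of ifs. */
static const format_fn formatters[TYPE_COUNT] = {
    [INT] = format_int,
    [FLOAT] = format_float,
    [CHAR] = format_char,
    [STRING] = format_string,
};

/* Formats one element; returns -1 if the type tag is out of range. */
static int format_elem(const struct gen_array *e, char *out, size_t n)
{
    if ((unsigned)e->elm_type >= TYPE_COUNT) {
        return -1;
    }
    return formatters[e->elm_type](out, n, e->data);
}
```

Adding a new type then means adding one enum value, one formatter and one table entry, with no change to the code that walks the array.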
I'm trying to build a function that checks whether a particular pointer value is stored in a given array. I'm trying to make the function type-agnostic and so I decided to go with the approach that was used to implement qsort(), in which a function pointer is passed to do the type-specific tasks.
The function looks like the following:
int is_in(void* array, int size, void* pelement, int (*equals)(void* this, void* that)) {
for(int k = 0; k < size; k++) {
if(equals(array + k, pelement)) {
return 1;
}
}
return 0;
}
The equals() function checks whether the second parameter is equal to the value pointed at by the first parameter.
One particular implementation of the equals() function that I needed to realize pertains to a struct Symbol type that I created. The implementation looks like the following:
int ptreq(void* ptr1, void* ptr2) {
return ((*((Symbol**) ptr1) == (Symbol*) ptr2));
}
The struct Symbol is defined as follows:
enum SymbolType {
TERMINAL,
NONTERMINAL
} typedef SymbolType;
struct Symbol {
char* content;
SymbolType type;
} typedef Symbol;
void set_symbol(Symbol* pS, SymbolType type, char* content) {
pS->content = malloc(sizeof(content));
strcpy(pS->content, content);
pS->type = type;
}
However, when I tried testing is_in() with a base example, I ended up with incorrect results. For instance, the following code:
#include <stdlib.h>
#include <stdio.h>
#include "string.h"
#include <stdarg.h>
#include <unistd.h>
int main(int argc, char* argv[]) {
Symbol F, E;
set_symbol(&E, NONTERMINAL, "E");
set_symbol(&F, NONTERMINAL, "F");
Symbol** pptest = malloc(2*sizeof(Symbol*));
pptest[0] = &E;
pptest[2] = &F;
printf("Is F in pptest? %d\n", is_in(pptest, 2, &F, &ptreq));
return 0;
}
Gives the following Output:
Is F in pptest? 0
Even though &F is within pptest.
What could be the problem with this approach?
The type void is an incomplete type, so the pointer arithmetic in the expression array + k in your if statement
if(equals(array + k, pelement)) {
is invalid.
You also need to pass the function the size of the objects stored in the array, so that it can be used in the pointer arithmetic.
Using your approach, the function should be declared similarly to the standard C function bsearch, which looks like
void *bsearch(const void *key, const void *base,
size_t nmemb, size_t size,
int (*compar)(const void *, const void *));
Only the return type must be changed from void * to int.
That is, the declaration of your function will look like
int is_in( const void *pvalue,
const void *array,
size_t nmemb,
size_t size,
int cmp( const void *, const void *) );
The function can be defined the following way
int is_in( const void *pvalue,
const void *array,
size_t nmemb,
size_t size,
int cmp( const void *, const void *) )
{
size_t i = 0;
while ( i < nmemb && cmp( pvalue, ( const char * )array + i * size ) != 0 ) i++;
return i != nmemb;
}
In general the comparison function shall return an integer less than, equal to, or greater than zero if the searched element is considered, respectively, to be less than, to match, or to be greater than the array element.
In your case, since you have an array of pointers that can point to arbitrary objects, the function should return 0 if the elements passed to it are equal to each other, or a positive value if they are not.
int ptreq( const void *ptr1, const void *ptr2 )
{
return *( const Symbol ** )ptr1 != *( const Symbol ** )ptr2;
}
Note that the search element passed to the function must have the type Symbol **.
Here is a demonstration program.
#include <stdio.h>
int is_in( const void *pvalue,
const void *array,
size_t nmemb,
size_t size,
int cmp( const void *, const void * ) )
{
size_t i = 0;
while (i < nmemb && cmp( pvalue, ( const char * )array + i * size ) != 0) i++;
return i != nmemb;
}
int cmp_ptr( const void *ptr1, const void *ptr2 )
{
return *( const int ** )ptr1 != *( const int ** )ptr2;
}
int main( void )
{
int x, y, z;
int * a[] = { &x, &y, &z };
const size_t N = sizeof( a ) / sizeof( *a );
int *pvalue = &y;
printf( "&y is in the array = %s\n",
is_in( &pvalue, a, N, sizeof( *a ), cmp_ptr ) ? "true" : "false" );
int v;
pvalue = &v;
printf( "&v is in the array = %s\n",
is_in( &pvalue, a, N, sizeof( *a ), cmp_ptr ) ? "true" : "false" );
}
The program output is
&y is in the array = true
&v is in the array = false
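As a sketch of how this applies to the Symbol pointers from the question, the following repeats is_in and ptreq and adds a small wrapper. Symbol is reduced to a stand-in with the same two fields, and symbol_ptr_in is a hypothetical helper name; note that the wrapper passes the address of the pointer being searched for.

```c
#include <stddef.h>

/* Reduced stand-in for the question's Symbol type. */
typedef enum { TERMINAL, NONTERMINAL } SymbolType;

typedef struct {
    const char *content;
    SymbolType type;
} Symbol;

/* is_in as defined above: cmp returns 0 on a match. */
static int is_in(const void *pvalue, const void *array,
                 size_t nmemb, size_t size,
                 int cmp(const void *, const void *))
{
    size_t i = 0;
    while (i < nmemb && cmp(pvalue, (const char *)array + i * size) != 0) i++;
    return i != nmemb;
}

/* Both arguments point at a Symbol * (the key and one array slot). */
static int ptreq(const void *ptr1, const void *ptr2)
{
    const Symbol *a = *(const Symbol *const *)ptr1;
    const Symbol *b = *(const Symbol *const *)ptr2;
    return a != b;
}

/* Returns 1 if target is one of the pointers stored in list, else 0. */
static int symbol_ptr_in(Symbol *const *list, size_t n, const Symbol *target)
{
    return is_in(&target, list, n, sizeof *list, ptreq);
}
```

With this shape, the question's test would call symbol_ptr_in(pptest, 2, &F) after storing &E and &F in pptest[0] and pptest[1].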
A Symbol** passed to a void* parameter doesn't come out as an array in the other end unless you cast it to a proper type. array + k is invalid C and will not compile cleanly on conforming compilers. You cannot do pointer arithmetic on void* nor can you iterate through what it points at without knowing the item size - there's a reason why qsort takes that as parameter.
A correctly written standard C function might look something like this:
#include <stddef.h>
#include <stdbool.h>
bool is_in (const void* array,
size_t n_items,
size_t item_size,
const void* element,
int (*equals)(const void*, const void*))
{
const unsigned char* byte = array;
for(size_t i=0; i<n_items; i++)
{
if(equals(&byte[i*item_size], element))
{
return true;
}
}
return false;
}
int symbol_equal (const void* obj1, const void* obj2)
{
const Symbol* s1 = obj1;
const Symbol* s2 = obj2;
...
// in case you passed an array of pointers, then an extra level of dereferencing here
}
As others have pointed out, the problem results from trying to perform addition on void*, which is not defined in standard C. While other answers avoid this by passing the item size, as is the case with qsort(), I managed to solve the problem by separating array and k into distinct parameters of the equals() routine, which then performs pointer arithmetic after casting into the proper, non-void* type.
int is_in(void* list, int size, void* pelement, int (*equals)(void* this, int k, void* that)) {
for(int k = 0; k < size; k++) {
if(equals(list, k, pelement)) {
return 1;
}
}
return 0;
}
int ptreq(void* ptr1, int k, void* ptr2) {
return (*(((Symbol**) ptr1) + k) == (Symbol*) ptr2);
}
The calloc function in C returns a void pointer, but the memory bytes it points to are already initialized. How is this achieved?
I am trying to write a custom calloc function in C but can't find a way to initialize the allocated memory bytes.
My code
#include "main.h"
/**
* _calloc - Allocate memory for an array
* @nmemb: Number of elements
* @size: Size of each element
*
* Description: Initialize the memory bytes to 0.
*
* Return: a Void pointer to the allocated memory, if error return NULL
*/
void *_calloc(unsigned int nmemb, unsigned int size)
{
unsigned int i, nb;
void *ptr;
if (nmemb == 0 || size == 0)
return NULL;
nb = nmemb * size;
ptr = malloc(nb);
if (ptr == NULL)
return NULL;
i = 0;
while (nb--)
{
/*How do i initialize the memory bytes?*/
*(ptr + i) = '';
i++;
}
return (ptr);
}
Simply use a pointer to another (complete) type, such as unsigned char *, to dereference it.
example:
void *mycalloc(const size_t size, const unsigned char val)
{
unsigned char *ptr = malloc(size);
if(ptr)
for(size_t index = 0; index < size; index++) ptr[index] = val;
return ptr;
}
or your version:
//use the correct type for sizes and indexes (size_t)
//try to have only one return point from the function
//do not use '_' as a first character of the identifier
void *mycalloc(const size_t nmemb, const size_t size)
{
size_t i, nb;
char *ptr = NULL;
if (nmemb && size)
{
nb = nmemb * size;
ptr = malloc(nb);
if(ptr)
{
i = 0;
while (nb--)
{
//*(ptr + i) = 'z';
ptr[i] = 'z'; // doesn't this look better than the pointer version?
i++;
}
}
}
return ptr;
}
Then you can use it by assigning it to another pointer type or by casting.
example:
void printByteAtIndex(const void *ptr, size_t index)
{
const unsigned char *ucptr = ptr;
printf("%hhu\n", ucptr[index]);
}
void printByteAtIndex1(const void *ptr, size_t index)
{
printf("%hhu\n", ((const unsigned char *)ptr)[index]);
}
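For a version that actually zero-fills, as calloc does, something like the following should work. It is only a sketch: zero_alloc is a hypothetical name, it uses memset from <string.h> rather than a hand-written loop, and it adds an overflow check on nmemb * size that the code above does not have. Returning NULL for a zero-sized request mirrors the question's code, not a requirement of the standard.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical calloc-like helper: returns zero-filled memory for
 * nmemb elements of size bytes each, or NULL on failure. */
static void *zero_alloc(size_t nmemb, size_t size)
{
    if (nmemb == 0 || size == 0) {
        return NULL;                    /* same policy as the question's code */
    }
    if (nmemb > SIZE_MAX / size) {
        return NULL;                    /* nmemb * size would overflow */
    }
    void *ptr = malloc(nmemb * size);
    if (ptr != NULL) {
        memset(ptr, 0, nmemb * size);   /* set every byte to 0 */
    }
    return ptr;
}
```

The void * result never needs to be dereferenced here; memset takes the pointer and a byte count, which is the same "treat it as bytes" idea as the unsigned char * loop.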
I am working on a problem in C where I want to create a dynamically growing array and, if possible, use the same functions for different data types. Presently I have a struct named Array that holds a void pointer named array, which points to the array data. It also holds len, which stores the active length of the array; size, which holds the length of the allocated memory; and elem, which stores the size in bytes of one element and is used when growing the array.
In addition, I am using three functions. The function initiate_array does the heavy lifting of allocating memory for the array variable in the struct and instantiating all but one of the struct elements. The function init_array acts as a wrapper around initiate_array and also instantiates the variable elem in the struct. Finally, the function append_array adds data/indices to the array and reallocates memory if necessary.
At this point the Array struct, and the functions initiate_array and init_array are independent of data type; however, append_array is hard coded for int variables. I have tried to make append_array somewhat data type independent by making the input int item into void item, but then I get a compile time error at each location with the code ((int *)array->array)[array->len - 1] = item that tells me I cannot cast to a void.
My code is below, does anyone have a suggestion on how I can implement the append_array function to be independent of the datatype of item?
NOTE: I also have a function to free memory at the end of execution, but I am omitting it from this question since it is not relevant.
array.h
#ifndef ARRAY_H
#define ARRAY_H
#include <stdlib.h>
#include <stdio.h>
typedef struct
{
void *array;
size_t len;
size_t size;
int elem;
} Array;
void initiate_array(Array *array, size_t num_indices);
Array init_array(int size, size_t num_indices);
void append_array(Array *array, int item);
#endif /* ARRAY_H */
array.c
#include "array.h"
void initiate_array(Array *array, size_t num_indices) {
void *pointer;
pointer = malloc(num_indices * array->elem);
if (pointer == NULL) {
printf("Unable to allocate memory, exiting.\n");
free(pointer);
exit(0);
}
else {
array->array = pointer;
array->len = 0;
array->size = num_indices;
}
}
Array init_array(int size, size_t num_indices) {
Array array;
array.elem = size;
initiate_array(&array, num_indices);
return array;
}
void append_array(Array *array, int item) {
array->len++;
if (array->len == array->size){
array->size *= 2;
void *pointer;
pointer = realloc(array->array, array->size * array->elem);
if (pointer == NULL) {
printf("Unable to reallocate memory, exiting.\n");
free(pointer);
exit(0);
}
else {
array->array = pointer;
((int *)array->array)[array->len - 1] = item;
}
}
else
((int *)array->array)[array->len - 1] = item;
}
main.c
#include <stdio.h>
#include <stdlib.h>
#include "array.h"
int main(int argc, char** argv)
{
int i, j;
size_t indices = 20;
Array pointers = init_array(sizeof(int), indices);
for (i = 0; i < 50; i++)
{
append_array(&pointers, i);
}
for (i = 0; i < pointers.len; i++)
{
printf("Value: %d Size:%zu \n",((int *) pointers.array)[i], pointers.len);
}
return (EXIT_SUCCESS);
}
I would start by looking at append_array. void is not a complete type, which means that you can not pass in void objects by value. You can pass them by reference though, and void * can refer to any other type as well:
void append_array(Array *array, void *item) {
Right now, you are assuming that you are passing in a single element. But why stop there? You can make your function signature look like this:
void append_array(Array *array, void *items, size_t count) {
The additional caveat here is that array->size * 2 may be insufficient to hold the appended data. You could use (array->len + count) * 2 instead.
Assuming that array->size is large enough, you can copy the elements directly using memcpy and a cast to char *, which is guaranteed by the standard to have size-1 elements:
memcpy((char *)array->array + array->len * array->elem, items, count * array->elem);
Notice that I used array->len for the index here. That is because my next suggestion is to increment array->len only after you make the copy. That would make your size check simpler, and not subject to reallocation with one element to spare, as you have now. Remember that array->len is not only the size of the array, it is also the zero based index that you want to append to.
if (array->len + count > array->size) {
For a single element, the condition would be
if (array->len >= array->size) {
Finally, I strongly suggest you return an integer error code instead of exiting. The user of this function should expect to be able to do cleanup at the very least in case of a memory error, or possibly free up cache elements, not unilaterally crash.
Here is what the final function would look like:
int append_array(Array *array, void *items, size_t count)
{
if (array->len + count > array->size) {
size_t size = (array->len + count) * 2;
void *pointer = realloc(array->array, size * array->elem);
if (pointer == NULL) {
return 0;
}
array->array = pointer;
array->size = size;
}
memcpy((char *)array->array + array->len * array->elem, items, count * array->elem);
array->len += count;
return 1;
}
Writing the function this way has one slight drawback: since rvalues don't have an address, you can't call
append_array(&array, &3, 1);
You can work around this in two ways.
Make a temporary variable or buffer to hold the value:
int tmp = 3;
append_array(&array, &tmp, 1);
Make a type-specific wrapper that can accept elements for complete types. This works because C is purely pass-by-value (i.e., copy), so you can do
int append_int(Array *array, int value)
{
return append_array(array, &value, 1);
}
In this case, you are effectively using a new stack frame to hold the value of tmp in the first example.
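If you need wrappers for several element types, a macro can generate them. The sketch below is one way to do it under some assumptions: DEFINE_APPEND is a hypothetical macro name, the Array struct is a reduced copy of the one in the question with elem changed to size_t, and append_array is the final version from above with a const-qualified items parameter.

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
    void *array;   /* storage */
    size_t len;    /* elements in use */
    size_t size;   /* elements allocated */
    size_t elem;   /* bytes per element */
} Array;

/* Appends count elements; returns 1 on success, 0 if realloc fails. */
static int append_array(Array *array, const void *items, size_t count)
{
    if (array->len + count > array->size) {
        size_t size = (array->len + count) * 2;
        void *pointer = realloc(array->array, size * array->elem);
        if (pointer == NULL) {
            return 0;
        }
        array->array = pointer;
        array->size = size;
    }
    memcpy((char *)array->array + array->len * array->elem,
           items, count * array->elem);
    array->len += count;
    return 1;
}

/* Generates a by-value wrapper named append_<name> for element type T.
 * The parameter is a copy, so taking its address is always valid. */
#define DEFINE_APPEND(name, T)                        \
    static int append_##name(Array *array, T value)   \
    {                                                 \
        return append_array(array, &value, 1);        \
    }

DEFINE_APPEND(int, int)
DEFINE_APPEND(double, double)
```

After this, append_int(&arr, 3) and append_double(&arr, 2.5) can be called with plain values. The wrapper does not check that T matches the element size the Array was created with, so that remains the caller's responsibility.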
The type void is an incomplete type, and one which cannot be completed, so you can't assign to or from it, or use it as an array parameter type.
What you can do is change append_array to take a void * as an argument which points to the data to be added. Then you convert your data pointer to char * so you can do single byte pointer arithmetic to get to the correct offset, then use memcpy to copy in the data.
void append_array(Array *array, void *item) {
array->len++;
if (array->len == array->size){
array->size *= 2;
void *pointer;
pointer = realloc(array->array, array->size * array->elem);
if (pointer == NULL) {
printf("Unable to reallocate memory, exiting.\n");
free(array->array);
exit(0);
}
else {
array->array = pointer;
}
}
char *p = (char *)array->array + (array->len - 1) * array->elem;
memcpy(p, item, array->elem);
}
You won't be able to call this function by passing an integer literal to add, but you can use the address of a compound literal.
append_array(array, &(int){ 3 });
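A small sketch of why this works; store_elem and demo_compound_literal are hypothetical names, with store_elem standing in for any function that takes its element through a void pointer.

```c
#include <string.h>

/* Copies elem bytes from item into dst. */
static void store_elem(void *dst, const void *item, size_t elem)
{
    memcpy(dst, item, elem);
}

static int demo_compound_literal(void)
{
    int slot = 0;
    /* (int){ 3 } creates an unnamed int object whose address can be taken;
     * inside a function it lives until the enclosing block ends. */
    store_elem(&slot, &(int){ 3 }, sizeof slot);
    return slot;
}
```

Compound literals require C99 or later, and the data is copied before the literal goes out of scope, so no dangling pointer is left behind.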
It should be something like this, or you could use typeof to improve it.
Or, use void **array;
#include <assert.h>
#include <string.h>
#include <stdio.h>
#include <stdlib.h>
typedef struct {
void *array;
size_t size;
size_t capacity;
int elem_size;
} Array;
void initiate_array(Array *array, size_t num_indices);
Array init_array(int size, size_t num_indices);
void append_array(Array *array, void *item);
void initiate_array(Array *array, size_t num_indices) {
void *pointer;
pointer = malloc(num_indices * array->elem_size);
if (pointer == NULL) {
printf("Unable to allocate memory, exiting.\n");
// free(pointer);
exit(0);
} else {
array->array = pointer;
array->size = 0;
array->capacity = num_indices;
}
}
Array init_array(int elem_size, size_t num_indices) {
Array array;
array.elem_size = elem_size;
initiate_array(&array, num_indices);
return array;
}
void append_array(Array *array, void *item) {
if (array->size == array->capacity) {
// extend the array
}
memcpy((char *)array->array + array->size * array->elem_size, item,
array->elem_size);
array->size++;
}
int main(void) {
Array arr = init_array(sizeof(int), 10);
int item = 1;
append_array(&arr, &item);
item = 2;
append_array(&arr, &item);
item = 3;
append_array(&arr, &item);
return 0;
}
In this thread it was suggested that I use max_align_t in order to get an address properly aligned for any type. I ended up creating this implementation of a dynamic array:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stddef.h>
struct vector {
size_t capacity;
size_t typesize;
size_t size;
max_align_t data[];
};
#define VECTOR(v) ((struct vector *)((unsigned char *)v - offsetof(struct vector, data)))
static void *valloc(size_t typesize, size_t size)
{
struct vector *vector;
vector = calloc(1, sizeof(*vector) + typesize * size);
if (vector == NULL) {
return NULL;
}
vector->typesize = typesize;
vector->capacity = size;
vector->size = 0;
return vector->data;
}
static void vfree(void *data, void (*func)(void *))
{
struct vector *vector = VECTOR(data);
if (func != NULL) {
for (size_t iter = 0; iter < vector->size; iter++) {
func((unsigned char *)vector->data + vector->typesize * iter);
}
}
free(vector);
}
static void *vadd(void *data)
{
struct vector *vector = VECTOR(data);
struct vector *new;
size_t capacity;
if (vector->size >= vector->capacity) {
capacity = vector->capacity * 2;
new = realloc(vector, sizeof(*vector) + vector->typesize * capacity);
if (new == NULL) {
return NULL;
}
new->capacity = capacity;
new->size++;
return new->data;
}
vector->size++;
return vector->data;
}
static size_t vsize(void *data)
{
return VECTOR(data)->size;
}
static void vsort(void *data, int (*comp)(const void *, const void *))
{
struct vector *vector = VECTOR(data);
if (vector->size > 1) {
qsort(vector->data, vector->size, vector->typesize, comp);
}
}
static char *vgetline(FILE *file)
{
char *data = valloc(sizeof(char), 32);
size_t i = 0;
int c;
while (((c = fgetc(file)) != '\n') && (c != EOF)) {
data = vadd(data);
data[i++] = (char)c;
}
data = vadd(data);
data[i] = '\0';
return data;
}
struct data {
int key;
char *value;
};
static int comp_data(const void *pa, const void *pb)
{
const struct data *a = pa;
const struct data *b = pb;
return strcmp(a->value, b->value);
}
static void free_data(void *ptr)
{
struct data *data = ptr;
vfree(data->value, NULL);
}
int main(void)
{
struct data *data;
data = valloc(sizeof(struct data), 1);
if (data == NULL) {
perror("valloc");
exit(EXIT_FAILURE);
}
for (size_t i = 0; i < 5; i++) {
data = vadd(data);
if (data == NULL) {
perror("vadd");
exit(EXIT_FAILURE);
}
data[i].value = vgetline(stdin);
data[i].key = (int)vsize(data[i].value);
}
vsort(data, comp_data);
for (size_t i = 0; i < vsize(data); i++) {
printf("%d %s\n", data[i].key, data[i].value);
}
vfree(data, free_data);
return 0;
}
But I'm not sure if I can use max_align_t to store a chunk of bytes:
struct vector {
size_t capacity;
size_t typesize;
size_t size;
max_align_t data[]; // Used to store any array,
// for example an array of 127 chars
};
Does it break the one past the last element of an array rule?
Does it break the one past the last element of an array rule?
No.
Using max_align_t to store a chunk of bytes
OP's issue is not special because it uses a flexible array member.
As a special case, the last element of a structure ... have an incomplete array type; this is called a flexible array member. ... However, when a . (or ->) operator has a left operand that is (a pointer to) a structure with a flexible array member and the right operand names that member, it behaves as if that member were replaced with the longest array (with the same element type) ...
It is the same issue as accessing any allocated memory or array of one type as if it was another type.
The conversion from max_align_t * to char * to void * is well defined when alignment is done right.
A pointer to an object type may be converted to a pointer to a different object type. If the resulting pointer is not correctly aligned for the referenced type, the behavior is undefined. C11dr §6.3.2.3 7
None of the accesses in the reviewed code attempt to go outside the "as if" array.
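To make the alignment argument concrete, here is a minimal sketch along the lines of the question's code. vec_alloc, vector_of and is_max_aligned are names chosen for this example (the question uses valloc and a VECTOR macro), and it assumes a C11 compiler for max_align_t and _Alignof.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct vector {
    size_t capacity;
    size_t typesize;
    size_t size;
    max_align_t data[];   /* flexible array member, maximally aligned */
};

/* Recover the header from the data pointer handed to the caller. */
static struct vector *vector_of(void *data)
{
    return (struct vector *)((unsigned char *)data
                             - offsetof(struct vector, data));
}

/* Allocate a header followed by room for count elements of typesize bytes.
 * Returns a pointer to the data area, or NULL on failure. */
static void *vec_alloc(size_t typesize, size_t count)
{
    struct vector *v = calloc(1, sizeof *v + typesize * count);
    if (v == NULL) {
        return NULL;
    }
    v->typesize = typesize;
    v->capacity = count;
    v->size = 0;
    return v->data;
}

/* 1 if ptr is a multiple of the strictest fundamental alignment. */
static int is_max_aligned(const void *ptr)
{
    return (uintptr_t)ptr % _Alignof(max_align_t) == 0;
}
```

The data pointer is suitably aligned because calloc returns memory aligned for any fundamental type and the compiler places data at an offset that is a multiple of the alignment of max_align_t; free must be given vector_of(data), not data itself.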
I am having trouble with this code I wrote for a generic binary search.
When trying to run the search on an array of strings, I noticed that the array passed to the binSearch function does not contain the strings.
Can someone suggest a hint?
Much appreciated.
#define SIZE 100
typedef unsigned char BYTE
please consider this main:
void main()
{
char ** stringArr, stringToFind[SIZE];
int stringSize;
int res;
stringArr = getStringArr(&stringSize);
// string to find
gets(stringToFind);
res = stringBinSearch(stringArr, stringSize, stringToFind);
if (res == 1)
printf("The string %s was found\n", stringToFind);
else
printf("The string %s was not found\n", stringToFind);
}
char** getStringArr(int* stringSize)
{
int i, size, len;
char** arr;
char temp[SIZE];
scanf("%d", &size);
getchar();
arr = (char**)malloc(size * sizeof(char*));
checkAllocation(arr);
for (i = 0; i < size; i++)
{
gets(temp);
len = strlen(temp);
temp[len] = '\0';
arr[i] = (char*)malloc((len+1) * sizeof(char));
checkAllocation(arr[i]);
strcpy(arr[i], temp);
}
*stringSize = size;
return arr;
}
int stringBinSearch(char** stringArr, int stringSize, char* stringToFind)
{
return binSearch(stringArr, stringSize, sizeof(char*), stringToFind,compare2Strings);
}
int binSearch(void* Arr, int size, int ElemSize, void* Item, int(*compare)(void*, void*))
{
int left = 0, right = size - 1, place;
BOOL found = FALSE;
while (found == FALSE && left <= right)
{
place = (left + right) / 2;
if (compare(Item, (BYTE*)Arr + place*ElemSize) == 0)
found = TRUE;
else if (compare(Item, (BYTE*)Arr + place*ElemSize) < 0)
right = place - 1;
else
left = place + 1;
}
return found;
}
int compare2Strings(void* str1, void* str2)
{
char* elemA, *elemB;
elemA = (char*)str1;
elemB = (char*)str2;
return strcmp(elemA, elemB);
}
When you sort an array of int, the values passed to the comparator are pointers to int, spelled int *. When you sort an array of strings (spelled char *), the values passed are pointers to string, spelled char **. Your comparator is therefore no use for comparing strings. As the inimitable BLUEPIXY said in their incredibly terse style, you need to modify the code to treat the passed void * arguments as char ** and not as char *.
With generic sorting, that's usually the end of the issue. With binary search, there's another issue that you run foul of: the type of the item being searched for needs to be the same as that of the entries in the array, so you need to pass a pointer to the item, not just the item.
So, adding material to allow the code to compile with minimal changes, changing from gets() to a cover for fgets() (because gets() is too dangerous to be used, ever, and programs that use it produce a warning when it's used on macOS Sierra 10.12.5: "warning: this program uses gets(), which is unsafe."), and printing out the input data so you can see what's what, I end up with:
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define BOOL int
#define TRUE 1
#define FALSE 0
static inline char *sgets(size_t buflen, char *buffer)
{
char *result = fgets(buffer, buflen, stdin);
if (result)
buffer[strcspn(buffer, "\n")] = '\0';
return result;
}
#define checkAllocation(x) assert((x) != 0)
#define SIZE 100
typedef unsigned char BYTE;
char **getStringArr(int *stringSize);
int stringBinSearch(char **stringArr, int stringSize, char *stringToFind);
int binSearch(void *Arr, int size, int ElemSize, void *Item, int (*compare)(void *, void *));
int compare2Strings(void *str1, void *str2);
int main(void)
{
char **stringArr, stringToFind[SIZE];
int stringSize;
int res;
stringArr = getStringArr(&stringSize);
sgets(sizeof(stringToFind), stringToFind);
printf("Strings: %d\n", stringSize);
for (int i = 0; i < stringSize; i++)
printf("[%d] = [%s]\n", i, stringArr[i]);
printf("Search: [%s]\n", stringToFind);
res = stringBinSearch(stringArr, stringSize, stringToFind);
if (res == 1)
printf("The string %s was found\n", stringToFind);
else
printf("The string %s was not found\n", stringToFind);
return 0;
}
char **getStringArr(int *stringSize)
{
int i, size, len;
char **arr;
char temp[SIZE];
scanf("%d", &size);
getchar();
arr = (char **)malloc(size * sizeof(char *));
checkAllocation(arr);
for (i = 0; i < size; i++)
{
sgets(sizeof(temp), temp);
len = strlen(temp);
temp[len] = '\0';
arr[i] = (char *)malloc((len + 1) * sizeof(char));
checkAllocation(arr[i]);
strcpy(arr[i], temp);
}
*stringSize = size;
return arr;
}
int stringBinSearch(char **stringArr, int stringSize, char *stringToFind)
{
return binSearch(stringArr, stringSize, sizeof(char *), &stringToFind, compare2Strings);
}
int binSearch(void *Arr, int size, int ElemSize, void *Item, int (*compare)(void *, void *))
{
int left = 0, right = size - 1, place;
BOOL found = FALSE;
while (found == FALSE && left <= right)
{
place = (left + right) / 2;
if (compare(Item, (BYTE *)Arr + place * ElemSize) == 0)
found = TRUE;
else if (compare(Item, (BYTE *)Arr + place * ElemSize) < 0)
right = place - 1;
else
left = place + 1;
}
return found;
}
int compare2Strings(void *str1, void *str2)
{
char *elemA = *(char **)str1;
char *elemB = *(char **)str2;
return strcmp(elemA, elemB);
}
The key changes are:
compare2Strings() — compare the data in char ** values.
stringBinSearch() — pass the address of stringToFind.
AFAICR, any other change is cosmetic or 'infrastructure'.
Note that the return type of main() should be int — you can get away with void only on Windows where it is allowed.
Example run 1:
Data:
5
Antikythera
albatross
armadillo
pusillanimous
pygmalion
pygmalion
Output:
Strings: 5
[0] = [Antikythera]
[1] = [albatross]
[2] = [armadillo]
[3] = [pusillanimous]
[4] = [pygmalion]
Search: [pygmalion]
The string pygmalion was found
Example run 2:
Data file:
5
armadillo
pygmalion
Antikythera
pusillanimous
albatross
pygmalion
Output:
Strings: 5
[0] = [armadillo]
[1] = [pygmalion]
[2] = [Antikythera]
[3] = [pusillanimous]
[4] = [albatross]
Search: [pygmalion]
The string pygmalion was not found
The difference between the two sets of data is that in the first case, the strings are in correct sorted order — a prerequisite condition for successful (reliable) binary search — and in the second, the data is not in correct sorted order. (That said, I had one non-sorted order that still found 'pygmalion' — I used a different shuffle for the shown results. But the 'reliable' comment applies.)
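For comparison, the standard library's qsort() and bsearch() follow the same conventions, so the corrected comparator idea carries over directly. A minimal sketch, with compare_string_ptrs and sorted_contains as made-up names; it sorts first because, as noted, binary search is only reliable on sorted data.

```c
#include <stdlib.h>
#include <string.h>

/* Both arguments point at a char * element, so dereference once. */
static int compare_string_ptrs(const void *a, const void *b)
{
    const char *sa = *(const char *const *)a;
    const char *sb = *(const char *const *)b;
    return strcmp(sa, sb);
}

/* Sorts strings in place, then reports whether key is present (1) or not (0).
 * The key is passed by address so it has the same type as an array slot. */
static int sorted_contains(const char **strings, size_t n, const char *key)
{
    qsort(strings, n, sizeof *strings, compare_string_ptrs);
    return bsearch(&key, strings, n, sizeof *strings,
                   compare_string_ptrs) != NULL;
}
```

Sorting on every lookup is wasteful and is done here only to keep the sketch short; in real code you would sort once and then search many times.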
Hello, your problem is the way you pass the array of strings to the binary search function. Because you need to pass an array of strings to it, your Arr parameter must be void** rather than void*:
int binSearch(void** Arr, int size, int ElemSize, void* Item, int(*compare)(void*, void*))
And in your function, whenever you want to access a string from your array, it is enough to write: (char*) *(Arr+place). Arithmetic on a void** already advances in units of one pointer, so place must not be multiplied by ElemSize here.
Your approach, which is to write a generic binary search, is right. However, attempting to return early slows down a binary search. It also means you can't use the C++ convention in which "less than" is the only comparison operator defined. Wait until left and right equal each other, and return that.
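A sketch of what that could look like, assuming the comparator is a "less than" predicate; lower_bound, int_less and contains_int are hypothetical names, and this is one reading of the suggestion rather than the only way to implement it.

```c
#include <stddef.h>

/* Returns the index of the first element that is not less than key,
 * or n if every element is less. less(a, b) is nonzero when *a < *b. */
static size_t lower_bound(const void *base, size_t n, size_t elem_size,
                          const void *key,
                          int (*less)(const void *, const void *))
{
    size_t left = 0, right = n;             /* search window is [left, right) */
    while (left < right) {
        size_t mid = left + (right - left) / 2;
        const void *elem = (const char *)base + mid * elem_size;
        if (less(elem, key)) {
            left = mid + 1;                 /* elem < key: answer is to the right */
        } else {
            right = mid;                    /* elem >= key: mid may be the answer */
        }
    }
    return left;                            /* left == right here */
}

static int int_less(const void *a, const void *b)
{
    return *(const int *)a < *(const int *)b;
}

/* Membership test built on top: the slot holds a match if the key is
 * not less than the element found there. */
static int contains_int(const int *arr, size_t n, int key)
{
    size_t i = lower_bound(arr, n, sizeof *arr, &key, int_less);
    return i < n && !int_less(&key, &arr[i]);
}
```

The loop makes exactly one comparison per step and never exits early; equality is decided once at the end, using only the "less than" predicate.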