C - Populate a generic struct inside a function without malloc - c

I'm trying to build a generic function that can populate a struct without any dynamic memory allocation.
The following code is a naive example of what I'm trying to do.
This code will not compile as incomplete type 'void' is not assignable.
Please note that this is a toy example to highlight my problems. I don't really want to convert colours; I just want to highlight that the structures will be different in data types and size.
#include <stdio.h>
typedef struct {
int r;
int g;
int b;
} rgb_t;
typedef struct {
float c;
float m;
float y;
float k;
} cmyk_t;
typedef enum { RGB, CMYK } color_t;
void convert_hex_to_color(long hex, color_t colorType, void* const out) {
if (colorType == RGB) {
rgb_t temp = { 0 };
// Insert some conversion math here....
temp.r = 1;
temp.g = 2;
temp.b = 3;
*out = temp; //< [!]
} else
if (colorType == CMYK) {
cmyk_t temp = { 0 };
// Insert some conversion math here....
temp.c = 1.0;
temp.m = 2.0;
temp.y = 3.0;
temp.k = 4.0;
*out = temp; //< [!]
}
}
int main(void) {
// Given
long hex = 348576;
rgb_t mydata = { 0 };
convert_hex_to_color(hex, RGB, (void*)(&mydata));
// Then
printf("RGB = %i,%i,%i\r\n", mydata.r, mydata.g, mydata.b);
return 0;
}
For some additional context, I'm using C11 on an embedded system target.
What is the best[1] way to do this? Macro? Union?
Regards,
Gabriel
[1] I would define "best" as a good compromise between readability and safety.

The reason for the error is it is invalid to store via a void pointer: the compiler does not know what to store. You could cast the pointer as *(rgb_t *)out = temp; or *(cmyk_t *)out = temp;
Alternately, you could define temp as a pointer to the appropriate structure type and initialize it directly from out, without the cast that is not needed in C:
void convert_hex_to_color(long hex, color_t colorType, void *out) {
if (colorType == RGB) {
rgb_t *temp = out;
// Insert some conversion math here....
temp->r = 1;
temp->g = 2;
temp->b = 3;
} else
if (colorType == CMYK) {
cmyk_t *temp = out;
// Insert some conversion math here....
temp->c = 1.0;
temp->m = 2.0;
temp->y = 3.0;
temp->k = 4.0;
}
}
Note that the cast to void * at the call site is not needed in C either:
int main(void) {
// Given
long hex = 348576;
rgb_t mydata = { 0 };
convert_hex_to_color(hex, RGB, &mydata);
// Then
printf("RGB = %i,%i,%i\r\n", mydata.r, mydata.g, mydata.b);
return 0;
}

rgb_t temp = {0};
That declares a local variable of type rgb_t. So far so good, though the initializer is redundant here because every member is assigned afterwards.
*out = temp;
Here is your problem: out is a void *, so *out has the incomplete type void, and C cannot assign to it because it knows neither the size nor the layout of the destination. This has nothing to do with malloc, as your title suggests; it is just the basic language specification. A void * converts implicitly to any other object pointer type, but it cannot be dereferenced directly.
So if you're copying a structure (rgb_t on the right side), the destination has to be of the same type. So change the line to this:
*(rgb_t *)out = temp;
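Applied to the function from the question, the cast might look like the sketch below. The constant values stand in for the real conversion math, exactly as in the question, and the caller remains responsible for passing a pointer that matches colorType.

```c
typedef struct { int r, g, b; } rgb_t;
typedef struct { float c, m, y, k; } cmyk_t;
typedef enum { RGB, CMYK } color_t;

/* The compiler cannot check that 'out' really points to the struct type
   selected by colorType; that contract is entirely up to the caller. */
void convert_hex_to_color(long hex, color_t colorType, void *out) {
    (void)hex; /* placeholder: the real conversion math would use hex */
    if (colorType == RGB) {
        rgb_t temp = { 1, 2, 3 };
        *(rgb_t *)out = temp;    /* the cast tells the compiler what to store */
    } else if (colorType == CMYK) {
        cmyk_t temp = { 1.0f, 2.0f, 3.0f, 4.0f };
        *(cmyk_t *)out = temp;
    }
}
```

Calling convert_hex_to_color(hex, RGB, &mydata) with an rgb_t mydata then fills in all three members.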

The "best" way is not to mix unrelated structures in the same function, or in the same memory area for that matter. That's just messy design.
If you need to keep a uniform API for two different forms of data, then a typesafe function-like macro might be one idea. You can fake such a macro to have a syntax similar to passing the data by pointer:
void convert_hex_to_color(long hex, type* data)
But then use C11 _Generic to actually determine the correct type to use, rather than using dangerous void pointers. Since you can't pass parameters "by reference" to macros, you'd have to sneak in a variable assignment in there. Example:
#include <stdio.h>
typedef struct {
int r;
int g;
int b;
} rgb_t;
typedef struct {
float c;
float m;
float y;
float k;
} cmyk_t;
void convert_hex_to_color(long hex, void* data);
/*
Pretty prototype just for code documentation purposes.
Never actually defined or called - the actual macro will "mock" this function.
*/
#define convert_hex_to_color(hex, output) ( *(output) = _Generic(*(output), \
rgb_t: (rgb_t){ .r=1, .g=2, .b=3 }, \
cmyk_t: (cmyk_t){ .c=1.0, .m=2.0, .y=3.0, .k=4.0 } ) )
int main(void) {
// Given
long hex = 348576;
rgb_t myrgb = { 0 };
cmyk_t mycmyk = { 0 };
convert_hex_to_color(hex, &myrgb);
convert_hex_to_color(hex, &mycmyk);
printf("RGB = %i,%i,%i\r\n", myrgb.r, myrgb.g, myrgb.b);
printf("CMYK = %f,%f,%f,%f\r\n", mycmyk.c, mycmyk.m, mycmyk.y, mycmyk.k);
return 0;
}
Output:
RGB = 1,2,3
CMYK = 1.000000,2.000000,3.000000,4.000000
Just be aware that _Generic's handling of type qualifiers (const etc.) was shaky in C11: some C11 compilers treated const rgb_t differently from rgb_t, while others treated them the same. This was one of the "bug fixes" in C17, so use C17 if available.

Frame challenge: It seems like you want to perform a different operation depending on the type that you pass into this function. Instead of using an enum to tell it what type you're passing in, and branching based on that enum, make use of C11's _Generic to handle that, and you don't even need to explicitly tell it what the type is on each call:
#include <stdio.h>
typedef struct {
int r;
int g;
int b;
} rgb_t;
typedef struct {
float c;
float m;
float y;
float k;
} cmyk_t;
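// 'static' matters: a plain 'inline' definition in C provides no external definition and can fail to link
static inline void convert_hex_to_color_rgb(long hex, rgb_t *const out) {
(void) hex; // or whatever you're planning to do with 'hex'
out->r = 1;
out->g = 2;
out->b = 3;
}
static inline void convert_hex_to_color_cmyk(long hex, cmyk_t *const out) {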
(void) hex; // or whatever you're planning to do with 'hex'
out->c = 1.0;
out->m = 2.0;
out->y = 3.0;
out->k = 4.0;
}
#define convert_hex_to_color(hex, out) _Generic((out), \
rgb_t *: convert_hex_to_color_rgb((hex), (rgb_t *)(out)), \
cmyk_t *: convert_hex_to_color_cmyk((hex), (cmyk_t *)(out)) \
)
int main(void) {
// Given
long hex = 348576;
rgb_t mydata = { 0 };
cmyk_t mydatac = { 0 };
convert_hex_to_color(hex, &mydata);
convert_hex_to_color(hex, &mydatac);
// Then
printf("RGB = %i,%i,%i\r\n", mydata.r, mydata.g, mydata.b);
printf("CMYK = %f,%f,%f,%f\r\n", mydatac.c, mydatac.m, mydatac.y, mydatac.k);
return 0;
}
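A common variation, sketched here with the same function names, lets _Generic select only the function and performs the call once, outside the selection. Because the arguments are no longer written inside each branch, the per-branch pointer casts become unnecessary.

```c
typedef struct { int r, g, b; } rgb_t;
typedef struct { float c, m, y, k; } cmyk_t;

static void convert_hex_to_color_rgb(long hex, rgb_t *out) {
    (void)hex; /* placeholder for the real conversion math */
    out->r = 1; out->g = 2; out->b = 3;
}

static void convert_hex_to_color_cmyk(long hex, cmyk_t *out) {
    (void)hex; /* placeholder for the real conversion math */
    out->c = 1.0f; out->m = 2.0f; out->y = 3.0f; out->k = 4.0f;
}

/* _Generic yields a function designator chosen by the static type of 'out';
   the argument list after the closing parenthesis then calls it. */
#define convert_hex_to_color(hex, out)            \
    _Generic((out),                               \
        rgb_t *:  convert_hex_to_color_rgb,       \
        cmyk_t *: convert_hex_to_color_cmyk       \
    )((hex), (out))
```

Passing a pointer to any other type is a compile-time error, since no association matches and there is no default branch.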

Related

What's the name of this technique and does it violate strict-aliasing rules or invoke UB?

I have come up with some code which makes use of a self-referential struct (the 1st element of the struct is a pointer to a function that takes an instance of the struct as its one and only argument).
It has been useful for passing disparate routines to another to invoke because the invoking routine doesn't need to know the exact argument makeup of the passed routines (see the process_string call sites in the code below). The passed/invoked routines themselves are responsible for unpacking (casting) the args in a way meaningful to them.
At the bottom of this post is some sample code making use of this technique. It produces the following output when compiled with gcc -std=c99 -Wpedantic -Wall -Wextra -Wconversion:
nread: 5
vals[0]: 0.000000
vals[1]: 0.000000
vals[2]: 0.000000
vals[3]: 78.900000
vals[4]: 32.100000
vals[5]: 65.400000
vals[6]: 87.400000
vals[7]: 65.000000
12.3 12.3
34.5 34.5
56.7 56.7
78.9 78.9
32.1 32.1
65.4 65.4
87.4 87.4
65.0 65.0
My questions are:
What is the name of this technique? As you can see from the code, I've been using the name functor but I'm not sure that is correct. It looks a little like a closure but I don't think it is, since it just points to its arguments rather than carrying along copies of them.
Does the code violate the strict-aliasing rule?
Does the code invoke Undefined Behavior?
And now for the code:
#include <stdio.h>
typedef struct functor_s functor_t;
typedef int (func_t)(functor_t);
struct functor_s { func_t * _0; void * _1; void * _2; void * _3; void * _4; };
void process_string(char * buf, int skip, functor_t ftor) {
for (int i = skip; i < 8; ++i) {
ftor._4 = buf + i*5;
ftor._3 = &i;
(void)ftor._0(ftor);
}
}
int scan_in_double(functor_t in) {
// unpack the args
const char * p = in._4;
int offset = *(int*)in._3;
int * count = in._1;
double * dest = in._2;
// do the work
return *count += sscanf(p, "%lg", dest + offset);
}
int print_repeated(functor_t in) {
// unpack the args
const char * p = in._4;
// do the work
char tmp[10] = {0};
sscanf(p, "%s", tmp);
printf("%s %s\n", tmp, tmp);
return 0;
}
int main()
{
char line[50] = "12.3 34.5 56.7 78.9 32.1 65.4 87.4 65.0";
int nread = 0;
double vals[8] = {0};
functor_t ftor1 = { scan_in_double, &nread, vals };
process_string(line, 3, ftor1);
// check that it worked properly
printf("nread: %d\n", nread);
for (int i = 0; i < 8; ++i) {
printf("vals[%d]: %f\n", i, vals[i]);
}
functor_t ftor2 = { print_repeated };
process_string(line, 0, ftor2);
return 0;
}
EDIT: In response to @supercat's suggestion (https://stackoverflow.com/a/63332205/1206102), I reworked my example to pass a double-indirect function pointer (which incidentally made the self-referentiality unnecessary) and added an extra case: scanning in ints. The ability to scan in different types better illustrates the need for a void* arg in both the functor struct & the function pointer sig. Here's the new code:
#include <stdio.h>
typedef int (func_t)(int offset, const char * src, void * extra);
typedef struct { func_t * func; void * data; } ftor_t;
typedef struct { int * count; double * dest; } extra_dbl_t;
typedef struct { int * count; int * dest; } extra_int_t;
void process_string(char * buf, int skip, func_t ** func) {
ftor_t * ftor = (ftor_t*)func; // <---- strict-alias violation? or UB?
for (int i = skip; i < 8; ++i) {
(void)ftor->func(i, buf+i*5, ftor->data);
}
}
int scan_in_double(int offset, const char * src, void * extra) {
extra_dbl_t * in = extra;
return *in->count += sscanf(src, "%lg", in->dest + offset);
}
int scan_in_int(int offset, const char * src, void * extra) {
extra_int_t * in = extra;
return *in->count += sscanf(src, "%d", in->dest + offset);
}
int print_repeated(int offset, const char * src, void * extra) {
// extra not used
char tmp[10] = {0};
sscanf(src, "%s", tmp);
printf("%s %s\n", tmp, tmp);
return 0;
}
int main()
{
// contrived strings to make the simplistic +5 in process_string work
// (the real process_string would use whitespace to non-whitespace
// transition)
char dbl_line[50] = "12.3 34.5 56.7 78.9 32.1 65.4 87.4 65.0";
char int_line[50] = "1234 3456 5678 7890 3210 6543 8743 6501";
int n_ints_read = 0;
int int_vals[8] = {0};
extra_int_t int_data = { .count=&n_ints_read, .dest=int_vals };
ftor_t ftor0 = { scan_in_int, &int_data };
process_string(int_line, 0, &ftor0.func);
// check that it worked properly
printf("n_ints_read: %d\n", n_ints_read);
for (int i = 0; i < 8; ++i) {
printf("int_vals[%d]: %d\n", i, int_vals[i]);
}
int n_dbls_read = 0;
double dbl_vals[8] = {0};
extra_dbl_t dbl_data = { .count=&n_dbls_read, .dest=dbl_vals };
ftor_t ftor1 = { scan_in_double, &dbl_data };
process_string(dbl_line, 3, &ftor1.func);
// check that it worked properly
printf("n_dbls_read: %d\n", n_dbls_read);
for (int i = 0; i < 8; ++i) {
printf("dbl_vals[%d]: %f\n", i, dbl_vals[i]);
}
ftor_t ftor2 = { print_repeated }; // no extra data req'd
process_string(dbl_line, 0, &ftor2.func);
return 0;
}
But if I accept a ptr to the struct/functor instead:
void process_string(char * buf, int skip, ftor_t * ftor) {
for (int i = skip; i < 8; ++i) {
(void)ftor->func(i, buf+i*5, ftor->data);
}
}
And change the call site to:
process_string(dbl_line, 0, &ftor2); // not &ftor2.func
Then there's no pointer casting in process_string(), and therefore no strict-alias violation. I think.
In both cases, the new output is:
n_ints_read: 8
int_vals[0]: 1234
int_vals[1]: 3456
int_vals[2]: 5678
int_vals[3]: 7890
int_vals[4]: 3210
int_vals[5]: 6543
int_vals[6]: 8743
int_vals[7]: 6501
n_dbls_read: 5
dbl_vals[0]: 0.000000
dbl_vals[1]: 0.000000
dbl_vals[2]: 0.000000
dbl_vals[3]: 78.900000
dbl_vals[4]: 32.100000
dbl_vals[5]: 65.400000
dbl_vals[6]: 87.400000
dbl_vals[7]: 65.000000
12.3 12.3
34.5 34.5
56.7 56.7
78.9 78.9
32.1 32.1
65.4 65.4
87.4 87.4
65.0 65.0
What is the name of this technique?
Obfuscation.
It has similarities with closures and with argument currying, but I wouldn't characterize it as either one.
It also has similarities with object-oriented program structure and practice, but the focus on intentionally hiding the argument types has no particular place in that regime.
And there is a hint of callback function, too.
Overall, though, it's just an over-abstracted mess.
It has been useful for passing disparate routines to another to invoke
because the invoking routine doesn't need to know the exact argument
makeup of the passed routines
I think you're fooling yourself.
Your functor_t indeed doesn't carry any information about the types that the parameters need to have, and it places only an upper bound on the number of them, but that's nothing to cheer about. The user of each instance still needs to know those things in order to use the object correctly, and the functor hides them not only from the user, but also from the compiler, such that neither one can easily check whether the user has set up the parameters correctly. The user furthermore does not benefit from any of the default argument conversions that happen in a direct function call, so they need to ensure exact type matching.
The only way I see something like this making sense is as more or less a pure callback interface, where the same user packages both the function to call and the arguments to pass to it -- or some specific ones of them, at least -- into an object, then stores or passes that off for some other function to call later. But such callback interfaces are usually structured differently, without including the function in the object alongside the arguments, and they do not go out of their way to hide data types.
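For comparison, a conventional callback interface might look like the sketch below. All names (visit_fn, for_each_int, add_to_sum) are hypothetical and not taken from the question: the function pointer and an opaque context pointer are passed side by side, and the typed arguments the invoker supplies stay visible in the callback's signature.

```c
#include <stddef.h>

/* Hypothetical callback type: 'value' is typed, and only the caller's own
   context travels as an untyped pointer. */
typedef void (*visit_fn)(int value, void *ctx);

/* Calls fn once per element; it never looks inside ctx. */
void for_each_int(const int *items, size_t n, visit_fn fn, void *ctx) {
    for (size_t i = 0; i < n; ++i)
        fn(items[i], ctx);
}

/* One possible callback: by agreement with whoever registers it,
   ctx points to a long accumulator. */
void add_to_sum(int value, void *ctx) {
    long *sum = ctx;
    *sum += value;
}
```

The same party that chooses add_to_sum also supplies the long it writes to, so the only type knowledge hidden from the compiler is the one conversion from void * inside the callback.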
Does the code violate the strict-aliasing rule?
Not inherently, but strict-aliasing violations will arise if pointers to the wrong types of objects are stored in a functor's parameter members, and the functor's function is then called.
Does the code invoke Undefined Behavior?
Not inherently, but yes in the event of a strict-aliasing violation.
You should pass a pointer to the first member of the method structure (i.e. a double-indirect function pointer), rather than passing the structure by value. This will avoid the need for any of the code which needs to pass through or invoke that method pointer to care about anything other than the fact that the structure leads off with a function pointer. The actual function should receive as an argument (probably the first one) a copy of the pointer to the structure, which it can then use to retrieve any other parameters it needs.
If you want to pass around a function-pointer-plus-arguments structure rather than using a double-indirect pointer, I'd suggest having a structure contain a function pointer and a void* rather than trying to have the pass-through code care about anything beyond that.
Here's a demo of what I have in mind:
#include <stdint.h>
#include <string.h>
#include <stdio.h>
typedef void (*streamOutFunc)(void *, void const *dat, uint32_t len);
struct StringStream
{
streamOutFunc func;
char *dest;
uint32_t size,len,totlen;
};
void putStringStreamFunc(void *param, void const *dat, uint32_t len)
{
struct StringStream *it = param;
uint32_t maxLen = it->size - it->len;
uint32_t newTot = it->totlen + len;
if (newTot < len)
newTot = -1;
if (len > maxLen)
len = maxLen;
memcpy(it->dest+it->len, dat, len);
it->totlen = newTot;
it->len += len;
}
struct FileStream
{
streamOutFunc func;
FILE *f;
};
void putFileStreamFunc(void *param, void const *dat, uint32_t len)
{
struct FileStream *it = param;
fwrite(dat, len, 1, it->f);
}
void outputSomething(streamOutFunc *stream, void const *dat, uint32_t len)
{
(*stream)(stream, "Message: [", (sizeof "Message: [")-1);
(*stream)(stream, dat, len);
(*stream)(stream, "]\n", (sizeof "]\n")-1);
}
int main(void)
{
char msgBuff[20];
struct StringStream myStringStream =
{putStringStreamFunc, msgBuff, sizeof msgBuff, 0, 0};
outputSomething(&myStringStream.func, "TESTING 12345", (sizeof "TESTING 12345")-1);
struct FileStream myFileStream =
{putFileStreamFunc, stdout};
outputSomething(&myFileStream.func, msgBuff, myStringStream.len);
}
For a definition of functor see https://en.wikipedia.org/wiki/Functor. This does not seem fitting here.
Essentially this is how you can implement object oriented programming in C.
You see this technique in the Linux kernel to describe device drivers. The driver descriptor contains pointers to functions and some additional data, e.g:
static struct platform_driver meson_rng_driver = {
        .probe  = meson_rng_probe, // a function
        .driver = {
                .name = "meson-rng",
                .of_match_table = meson_rng_of_match,
        },
};
Linux collects these driver descriptors in linker-generated lists.
In object-oriented terms, the structure definition (here struct platform_driver) represents an interface, a structure instance filled with actual function pointers represents a class, and the functions pointed to are the methods of that class. The data fields hold the class-level variables.
There is no undefined behavior involved. There is no violation of strict aliasing.
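A stripped-down version of that pattern, unrelated to the kernel's actual definitions, might look like this. The names shape, shape_ops and rect are invented for the illustration.

```c
/* Hypothetical "interface": every shape provides an area function. */
struct shape;
struct shape_ops {
    double (*area)(const struct shape *self);
};

/* Common header that every concrete shape embeds as its first member. */
struct shape {
    const struct shape_ops *ops;
};

struct rect {
    struct shape base;   /* must come first so &rect.base and &rect coincide */
    double w, h;
};

static double rect_area(const struct shape *self) {
    /* Valid because 'base' is the first member of struct rect. */
    const struct rect *r = (const struct rect *)self;
    return r->w * r->h;
}

/* The "class": one shared table of methods for all rect objects. */
static const struct shape_ops rect_ops = { .area = rect_area };

/* Dispatches through the function pointer without knowing the concrete type. */
double shape_area(const struct shape *s) {
    return s->ops->area(s);
}
```

A rect initialized as { { &rect_ops }, 2.0, 3.0 } can be handed to shape_area as &r.base, and the call lands in rect_area.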

Static to Dynamic nature in C

I have implemented naive Bayes but I did it in static memory allocation.
I wanted to convert into dynamic but my small brain is not able to do that.
#define COLS 4 //including class label
#define BINS 100
#define CLASS_COL 0
#define CLASS 2
The idea is to fetch above value from a configuration file and then set it.
struct each_col //Probability for each feature based on classes
{
double col_PB[BINS][CLASS];
};
struct NB_Class_Map
{
char label[250];
unsigned int label_value;
double class_PB;
};
struct NB //Probability for entire feature
{
struct NB_Class_Map classes[CLASS];
struct each_col cols[COLS];
};
struct NB nb = {0}; //global value
The function to train NB:
long strhash(const char *str)
{
long hash = 5381;
int c;
printf("IN: %s ",str);
while (c = *str++)
hash = ((hash << 5) + hash) + c; /* hash * 33 + c */
printf("OUT: %ld ||",hash);
return hash;
}
int setup_train_NB(vector<vector<string> > &data)
{
//Finding the feature count
static int class_label = -1;
for(unsigned int i=0;i<data.size();i++)
{
unsigned int Class;
printf("\n===========New ROW==============\n");
int k;
for(k=0;k<CLASS;k++)
{
if(strcmp(data[i][CLASS_COL].c_str(), nb.classes[k].label) == 0)
{
printf("MATCHED\n");
Class = nb.classes[k].label_value;
break;
}
}
if(k==CLASS)
{
printf("NOT MATCHED\n");
class_label++;
nb.classes[class_label].label_value = class_label;
strcpy( nb.classes[class_label].label, data[i][CLASS_COL].c_str());
Class = nb.classes[class_label].label_value;
}
printf("Class: %d ||\n", Class);
for(unsigned j=0;j<data[0].size();j++)
{
printf("\n===========New COLUMN==============\n");
if(j == CLASS_COL)
{
nb.classes[Class].class_PB++;
continue;
}
unsigned int bin = strhash((data[i][j].c_str()))%BINS;
printf("Bin: %d ||", bin);
printf("Class: %d ||\n", Class);
nb.cols[j].col_PB[bin][Class]++; //[feature][BINS][CLASS]
}
}
//Finding the feature PB
for(unsigned int i=0;i<COLS;i++)
{
if(i==CLASS_COL)
continue;
for(unsigned j=0;j<BINS;j++)
{
for(unsigned k=0;k<CLASS;k++)
{
// nb.cols[i].col_PB[j][k] /= nb.classes[k].class_PB; //without laplacian smoothing
nb.cols[i].col_PB[j][k] = (nb.cols[i].col_PB[j][k] + 1) / (nb.classes[k].class_PB + COLS - 1); //with laplace smoothing
}
}
}
int k = 0;
int sum = 0;
while(k<CLASS)
{
sum += nb.classes[k].class_PB;
k++;
}
//Finding the class PB
k = 0;
while(k<CLASS)
{
nb.classes[k].class_PB /= sum;
k++;
}
return 0;
}
The program is supposed to be written in C, but for the moment I use vector to fetch the data from a CSV file. Please ignore that for now. The actual question is how I can remove those hardcoded #define values and still declare my structs.
Although it does not matter, the CSV file looks like this, and it may change in terms of the number of columns and labels. The first line is ignored and not put into data.
Person,height,weight,foot
male,654,180,12
female,5,100,6
female,55,150,8
female,542,130,7
female,575,150,9
What I am actually doing is this: each value is put into a bin, and then for each of those values I find the probability for the CLASS/label, i.e. male = 0, female = 1.
Basically:
Define variables instead of preprocessor macro constants: size_t cols; size_t bins; etc.
Replace your 1-dimensional fixed-size arrays with pointers (initialized to NULL!) and length variables. Alternatively, you could use a struct mytype_span { size_t length; mytype* data; }
Replace your 2-dimensional fixed-size arrays with "1-dimensional" pointers (also initialized to NULL of course) and pairs of dimension variables. Again, you could use a struct.
Replace your 2-d array accesses a[x][y] with a "linearized" access, i.e. a[x * row_length_of_a + y] (or you could do this in an inline function which takes the relevant arguments, or a struct mytype_span)
When you've read your configuration values from, um, wherever - set the relevant length variables (see above).
Use the malloc() library function to allocate the correct amount of space; remember to check the malloc() return value to make sure it's not NULL before using the pointer values!
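The steps above can be sketched as follows. The struct table type and its helper functions are hypothetical names, and calloc is used instead of malloc so that the counters start at zero.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical runtime-sized 2-d table of doubles, stored row-major
   in a single allocation. */
struct table {
    size_t rows, cols;
    double *data;   /* rows * cols elements, or NULL */
};

/* Returns 0 on success, -1 on overflow or allocation failure. */
int table_init(struct table *t, size_t rows, size_t cols) {
    t->rows = 0;
    t->cols = 0;
    t->data = NULL;
    if (cols != 0 && rows > SIZE_MAX / cols)
        return -1;                      /* rows * cols would overflow */
    t->data = calloc(rows * cols, sizeof *t->data);
    if (t->data == NULL)
        return -1;
    t->rows = rows;
    t->cols = cols;
    return 0;
}

/* Linearized equivalent of a[r][c]. */
double *table_at(struct table *t, size_t r, size_t c) {
    return &t->data[r * t->cols + c];
}

void table_free(struct table *t) {
    free(t->data);
    t->data = NULL;
}
```

With bins and classes read from the configuration file, table_init(&t, bins, classes) replaces one col_PB[BINS][CLASS] array, and col_PB[bin][Class]++ becomes (*table_at(&t, bin, Class))++.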
Your use of struct is probably wrong, except for struct NB_Class_Map. You shouldn't use a struct merely to put big arrays into the same variable. Instead, define one variable for each array, outside any struct, and replace each array with a pointer. Then you can allocate memory for that pointer, e.g.:
struct mydata {
type1 field1;
type2 field2;
etc...
} *myarray;
myarray = calloc(number_of_record_you_need, sizeof(struct mydata));
// here, error checking code, etc.
Now, having done that, if you really want, you may put your different pointers into a global structure, but each of your tables should be allocated separately.
Edit (about your variables) :
NB has no real interest as a structure. It's just 2 variables you glued together:
struct NB_Class_Map classes[CLASS];
struct each_col cols[COLS];
NB.Cols is not really a structure. It's just a three dimensional array
double cols[COLS][BINS][CLASS];
The only real structure is
struct NB_Class_Map
{
char label[250];
unsigned int label_value;
double class_PB;
};
So you just have to replace
struct NB_Class_Map classes[CLASS];
with
struct NB_Class_Map *classes;
and
double cols[COLS][BINS][CLASS];
with
double (*cols)[BINS][CLASS];
Or if you want a type name:
typedef double each_col[BINS][CLASS];
each_col *cols;
and allocate memory for classes and cols with calloc.
Now, if you really want this struct NB :
typedef double each_col[BINS][CLASS];
struct NB
{
struct NB_Class_Map *classes;
each_col *cols;
};
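Allocating that struct NB could then look like the sketch below. The helpers nb_alloc and nb_free are hypothetical. Note that in this layout BINS and CLASS are still compile-time constants; only the number of class records and the number of columns are chosen at run time.

```c
#include <stdlib.h>

#define BINS 100
#define CLASS 2

struct NB_Class_Map {
    char label[250];
    unsigned int label_value;
    double class_PB;
};

typedef double each_col[BINS][CLASS];

struct NB {
    struct NB_Class_Map *classes;  /* n_classes records */
    each_col *cols;                /* n_cols tables of BINS x CLASS doubles */
};

/* Hypothetical helper: returns 0 on success, -1 on allocation failure.
   Assumes n_classes and n_cols are both greater than zero. */
int nb_alloc(struct NB *nb, size_t n_classes, size_t n_cols) {
    nb->classes = calloc(n_classes, sizeof *nb->classes);
    nb->cols = calloc(n_cols, sizeof *nb->cols);
    if (nb->classes == NULL || nb->cols == NULL) {
        free(nb->classes);
        free(nb->cols);
        nb->classes = NULL;
        nb->cols = NULL;
        return -1;
    }
    return 0;
}

void nb_free(struct NB *nb) {
    free(nb->classes);
    free(nb->cols);
    nb->classes = NULL;
    nb->cols = NULL;
}
```

Element access changes only slightly: nb.cols[j].col_PB[bin][Class] from the question becomes nb.cols[j][bin][Class].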

How could I know an Uncertain type parameters' size

const static int VECTOR_BASIC_LENGTH = 20;
struct m_vector
{
void* my_vector;
size_t my_capacity;
size_t my_head;
};
typedef struct m_vector Vector;
Vector creat_Vector(size_t size,void *judge)
{
Vector _vector;
size = size?size:VECTOR_BASIC_LENGTH;
_vector.my_capacity = size;
_vector.my_head = 0;
//How I write the following two lines
_vector.my_vector = malloc(sizeof(*judge) * size);
return _vector;
}
The type of judge is uncertain, so I pass a void pointer as a parameter. I need the size of *judge to allocate memory for _vector.my_vector. For example, if I use:
int *a;
creat_Vector(5,a);
I want the following line:
_vector.my_vector = malloc(sizeof(*judge)*size);
is equal to:
_vector.my_vector = malloc(sizeof(*a)*5);
How can I achieve this in pure C?
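Two things in your code need attention.
First, sizeof(*judge) cannot do what you want: judge is a void *, so the pointed-to type is unknown inside the function, and sizeof(void) is not valid standard C. The caller has to pass the element size explicitly. Second, returning the local _vector by value is legal (the struct is copied to the caller), but it leaves no return value for reporting an allocation failure, so you may prefer to fill in a caller-provided struct and return a status code.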
One suggestion would be:
int init_Vector(Vector* _vect, size_t size, unsigned int ptr_size)
{
size = size?size:VECTOR_BASIC_LENGTH;
_vect->my_capacity = size;
_vect->my_head = 0;
_vect->my_vector = malloc(size*ptr_size);
if (_vect->my_vector) {
return 0;
}
return 1;
}
Then:
Vector _vector;
char *a;
if (init_Vector(&_vector, 5, sizeof(char)) == 0) {
printf("Success!\n");
}
else {
printf("Failure!\n");
/* treat appropriately (return error code/exit) */
}
/* do whatever with a (if needed) and _vector*/
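To get close to the creat_Vector(5, a) call style from the question, the sizeof can be taken at the call site, where the pointer's type is still known. In this sketch the elem_size member and the INIT_VECTOR_FOR macro are additions for illustration and appear in neither the question nor the answer above.

```c
#include <stdlib.h>

#define VECTOR_BASIC_LENGTH 20

typedef struct m_vector {
    void  *my_vector;
    size_t my_capacity;
    size_t my_head;
    size_t elem_size;   /* added for illustration: remembers the element size */
} Vector;

/* Returns 0 on success, 1 if the allocation failed. */
int init_Vector(Vector *vect, size_t size, size_t elem_size) {
    size = size ? size : VECTOR_BASIC_LENGTH;
    vect->my_capacity = size;
    vect->my_head = 0;
    vect->elem_size = elem_size;
    vect->my_vector = malloc(size * elem_size);
    return vect->my_vector ? 0 : 1;
}

/* Hypothetical convenience macro: sizeof *(ptr) is computed at compile time
   from the pointer's static type; ptr itself is never dereferenced. */
#define INIT_VECTOR_FOR(vect, size, ptr) \
    init_Vector((vect), (size), sizeof *(ptr))
```

With int *a; the call INIT_VECTOR_FOR(&v, 5, a) allocates room for five ints, which is the effect the question asked for.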

"dynamic array of static arrays"

How do you specify a dynamic array of static arrays in C?
I want to make a struct holding two dynamic arrays of static arrays.
struct indexed_face_set {
double * [3] vertices;
int * [3] faces;
};
This should hold a dynamic list of vertices, which are each 3 doubles, and a dynamic list of faces, which are each 3 ints.
The syntax is, well... C's approach to declarations is not the cleanest, and C++ inherited it:
double (*vertices)[3];
That declaration means that vertices is a pointer to double [3] objects. Note that the parentheses are needed; otherwise (as in double *vertices[3]) it would mean an array of 3 double*.
After some time you end up getting used to the inside-out way parentheses work in declarations...
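With that declaration, a run-time number of 3-element arrays can be allocated in one block and indexed with ordinary two-level subscripts. In the sketch below the count members and the ifs_alloc and ifs_release helpers are hypothetical additions.

```c
#include <stdlib.h>

struct indexed_face_set {
    size_t n_vertices;       /* element counts: added here for illustration */
    size_t n_faces;
    double (*vertices)[3];   /* points to the first of n_vertices double[3] arrays */
    int    (*faces)[3];      /* points to the first of n_faces int[3] arrays */
};

/* Returns 0 on success, -1 on allocation failure.
   Assumes nv and nf are both greater than zero. */
int ifs_alloc(struct indexed_face_set *ifs, size_t nv, size_t nf) {
    ifs->vertices = malloc(nv * sizeof *ifs->vertices);  /* nv * 3 doubles */
    ifs->faces    = malloc(nf * sizeof *ifs->faces);     /* nf * 3 ints */
    if (ifs->vertices == NULL || ifs->faces == NULL) {
        free(ifs->vertices);
        free(ifs->faces);
        ifs->vertices = NULL;
        ifs->faces = NULL;
        ifs->n_vertices = ifs->n_faces = 0;
        return -1;
    }
    ifs->n_vertices = nv;
    ifs->n_faces = nf;
    return 0;
}

void ifs_release(struct indexed_face_set *ifs) {
    free(ifs->vertices);
    free(ifs->faces);
    ifs->vertices = NULL;
    ifs->faces = NULL;
    ifs->n_vertices = ifs->n_faces = 0;
}
```

After a successful ifs_alloc(&s, 4, 2), s.vertices[3][2] is the z coordinate of the fourth vertex and s.faces[1][0] is the first index of the second face.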
For the specific case of a structure containing two arrays each of dimension 3, it would be simpler to make the arrays a part of the structure, rather than dynamically allocating them separately:
struct indexed_face_set
{
double vertices[3];
int faces[3];
};
However, there certainly could be cases where it makes sense to handle dynamic array allocation. In that case, you need a pointer to an array in the structure (and not an array of pointers). So, you would need to write:
struct indexed_face_set
{
double (*vertices)[3];
int (*faces)[3];
};
To allocate a complete struct indexed_face_set, you need to use something like new_indexed_face_set() and to free one you need to use something like destroy_indexed_face_set():
struct indexed_face_set *new_indexed_face_set(void)
{
struct indexed_face_set *new_ifs = malloc(sizeof(*new_ifs));
if (new_ifs != 0)
{
double (*v)[3] = malloc(sizeof(*v));
int (*f)[3] = malloc(sizeof(*f));
if (v == 0 || f == 0)
{
free(v);
free(f);
free(new_ifs);
new_ifs = 0;
}
else
{
new_ifs->vertices = v;
new_ifs->faces = f;
}
}
return(new_ifs);
}
void destroy_indexed_face_set(struct indexed_face_set *ifs)
{
if (ifs != 0)
{
free(ifs->vertices);
free(ifs->faces);
free(ifs);
}
}
Then you can use it like this:
void play_with_ifs(void)
{
struct indexed_face_set *ifs = new_indexed_face_set();
if (ifs != 0)
{
(*ifs->vertices)[0] = 3.14159;
(*ifs->vertices)[1] = 2.71813;
(*ifs->vertices)[2] = 1.61803;
(*ifs->faces)[0] = 31;
(*ifs->faces)[1] = 30;
(*ifs->faces)[2] = 29;
do_something_fancy(ifs);
destroy_indexed_face_set(ifs);
}
}
Note that the notation using pointers to arrays is moderately messy; one reason why people do not often use them.
You could use this fragment as the body of a header:
#ifndef DASS_H_INCLUDED
#define DASS_H_INCLUDED
struct indexed_face_set;
extern void play_with_ifs(void);
extern void do_something_fancy(struct indexed_face_set *ifs);
extern void destroy_indexed_face_set(struct indexed_face_set *ifs);
extern struct indexed_face_set *new_indexed_face_set(void);
#endif /* DASS_H_INCLUDED */
It doesn't need any extra headers included; it does not need the details of the structure definition for these functions. The header guards shown make it safe to include more than once.
Because the code above is a bit messy when it comes to using the arrays, most people would use a simpler notation. The header above can be left unchanged, but the code could be changed to:
struct indexed_face_set
{
double *vertices;
int *faces;
};
struct indexed_face_set *new_indexed_face_set(void)
{
struct indexed_face_set *new_ifs = malloc(sizeof(*new_ifs));
if (new_ifs != 0)
{
double *v = malloc(3 * sizeof(*v));
int *f = malloc(3 * sizeof(*f));
if (v == 0 || f == 0)
{
free(v);
free(f);
free(new_ifs);
new_ifs = 0;
}
else
{
new_ifs->vertices = v;
new_ifs->faces = f;
}
}
return(new_ifs);
}
void destroy_indexed_face_set(struct indexed_face_set *ifs)
{
if (ifs != 0)
{
free(ifs->vertices);
free(ifs->faces);
free(ifs);
}
}
void play_with_ifs(void)
{
struct indexed_face_set *ifs = new_indexed_face_set();
if (ifs != 0)
{
ifs->vertices[0] = 3.14159;
ifs->vertices[1] = 2.71813;
ifs->vertices[2] = 1.61803;
ifs->faces[0] = 31;
ifs->faces[1] = 30;
ifs->faces[2] = 29;
do_something_fancy(ifs);
destroy_indexed_face_set(ifs);
}
}
This is much simpler to understand and use and would generally be regarded as more idiomatic C.
Since the size of each array is fixed, there's no particular need to record the size in the structure. If the sizes varied at runtime, and especially if some indexed face sets had, say, 8 vertices and 6 faces (cuboid?), then you might well want to record the sizes of the arrays in the structure. You'd also specify the number of vertices and number of faces in the call to new_indexed_face_set().
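A variable-size version along those lines might look like the sketch below, keeping the flat double * and int * layout. The member names n_vertices and n_faces are assumptions, and both counts are assumed to be greater than zero.

```c
#include <stdlib.h>

struct indexed_face_set {
    size_t  n_vertices;
    size_t  n_faces;
    double *vertices;   /* 3 * n_vertices doubles: x, y, z per vertex */
    int    *faces;      /* 3 * n_faces ints: three entries per face */
};

/* Returns NULL if any allocation fails. */
struct indexed_face_set *new_indexed_face_set(size_t n_vertices, size_t n_faces) {
    struct indexed_face_set *ifs = malloc(sizeof *ifs);
    if (ifs == NULL)
        return NULL;
    ifs->n_vertices = n_vertices;
    ifs->n_faces = n_faces;
    ifs->vertices = malloc(3 * n_vertices * sizeof *ifs->vertices);
    ifs->faces = malloc(3 * n_faces * sizeof *ifs->faces);
    if (ifs->vertices == NULL || ifs->faces == NULL) {
        free(ifs->vertices);
        free(ifs->faces);
        free(ifs);
        return NULL;
    }
    return ifs;
}

void destroy_indexed_face_set(struct indexed_face_set *ifs) {
    if (ifs != NULL) {
        free(ifs->vertices);
        free(ifs->faces);
        free(ifs);
    }
}
```

Component j of vertex i is then ifs->vertices[3 * i + j], and loops can use ifs->n_vertices and ifs->n_faces as their bounds.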

qsort for structure array

I am trying to sort the array of structs below by error rate while retaining the sid and did information. There is no compilation error, but I get a segfault at run time. I wonder what has gone wrong...
#include <stdio.h>
#include <stdlib.h>
struct linkdata {
int sid;
int did;
double err;
};
typedef struct linkdata LD;
typedef int (*qsort_func_t)(const void *, const void *);
static int compareByErr (const void * a, const void * b)
{
fprintf(stderr, "aerr=%.3f, berr=%.3f\n", (*(LD**)a)->err, (*(LD**)b)->err);
int aerr = (*(LD**)a)->err;
int berr = (*(LD**)b)->err;
return aerr - berr;
}
int main() {
int idx;
int numnode;
struct linkdata* perr;
qsort_func_t qsort_func = compareByErr;
numnode = 3;
perr = (LD*) malloc (numnode*numnode*sizeof(LD));
perr[0].sid = 0; perr[0].did = 1; perr[0].err = 0.642;
perr[1].sid = 0; perr[1].did = 2; perr[1].err = 0.236;
perr[2].sid = 0; perr[2].did = 3; perr[2].err = 0.946;
idx = 3;
qsort(perr, idx, sizeof(perr), compareByErr);
int i;
for (i=0; i<idx; i++){
fprintf(stderr,"err[%d][%d] = %.3f\n", perr[i].sid, perr[i].did, perr[i].err);
}
free(perr);
}
There are many errors in the code.
1. compareByErr
The a and b parameters of the compareByErr function are objects of LD*, not LD**. You did an unnecessary dereferencing. Try to change that function to:
static int compareByErr (const void * a, const void * b)
{
fprintf(stderr, "aerr=%.3f, berr=%.3f\n", ((LD*)a)->err, ((LD*)b)->err);
int aerr = ((LD*)a)->err;
int berr = ((LD*)b)->err;
return aerr - berr;
}
2. compareByErr
There is another problem: you implicitly convert the double to int. Since all those "errors" are 0.???, they are all truncated to 0, so every pair compares equal and the array is left effectively unsorted. Change it to:
double aerr = ((LD*)a)->err;
double berr = ((LD*)b)->err;
return aerr < berr ? -1 : aerr > berr ? 1 : 0;
3. malloc
You are allocating space for numnode*numnode = 9 nodes, but only 3 are needed. Change that to:
perr = (LD*) malloc (numnode * sizeof(LD));
4. qsort
The 3rd argument is the size of each element of the array, not sizeof(perr), which is just the size of a pointer (typically 4 or 8 bytes). Change that line to:
qsort(perr, idx, sizeof(*perr), compareByErr);
// ^
to actually get the element size.
The idx seems unnecessary. You could just use numnode here.
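Taken together, the four fixes give something like the following sketch. The sort_by_err wrapper is a hypothetical helper added so that the element size is written in exactly one place.

```c
#include <stdlib.h>

typedef struct linkdata {
    int sid;
    int did;
    double err;
} LD;

/* qsort hands the comparator pointers to the array elements themselves,
   so for an array of LD each argument is really a const LD *. */
static int compareByErr(const void *a, const void *b) {
    const LD *lda = a, *ldb = b;
    return lda->err < ldb->err ? -1 : lda->err > ldb->err ? 1 : 0;
}

/* Sorts n records in ascending order of err. */
void sort_by_err(LD *items, size_t n) {
    qsort(items, n, sizeof *items, compareByErr);
}
```

For the three records in the question (err = 0.642, 0.236, 0.946), the sorted order of did is 2, 1, 3.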
Your comparison function expects to be sorting an array of pointers to structs, but you're not doing that. This problem is covered by the other answers.
What they didn't mention is that you're also using the wrong sizeof for the sort. Since the array is an array of structs, you must tell qsort that the size of a member is the size of a struct. Change sizeof perr to sizeof *perr
Also, converting the floats to ints before comparing them results in them all being equal because they're all zero...
You're mis-treating the arguments to your comparator callback.
This:
fprintf(stderr, "aerr=%.3f, berr=%.3f\n", (*(LD**)a)->err, (*(LD**)b)->err);
should be:
{
const LD *lda = a, *ldb = b;
fprintf(stderr, "aerr=%.3f, berr=%.3f\n", lda->err, ldb->err);
/* ... */
}
Of course you don't have to introduce new variables of the proper type, but it makes the subsequent code that much easier. I always do this.
Further, this:
int aerr = (*(LD**)a)->err;
int berr = (*(LD**)b)->err;
return aerr - berr;
is admirably terse, but it can hide integer overflow issues that are a bit scary, and here it also truncates the double values to int. I would recommend:
return (lda->err < ldb->err) ? -1 : (lda->err > ldb->err);
This uses an explicit literal to generate the -1 value, while relying on comparisons generating 0 or 1 for the two other cases.
