Is copying partially initialized structures well defined in C? - c

I recently learned that copying partially initialized structures through trivial construction or assignment is undefined in C++. Does the same hold true in C or does the standard guarantee that initialization and assignment behave like memcpy?
typedef struct { int i; int j; } A;
void foo() {
A x;
x.i = 0;
// Leave x.j indeterminate. Is the following well defined?
A y = x;
y.j = y.i + 1;
}

x is not "partially initialized"; it is not initialized at all.
Reading x in your initializer for y propagates the indeterminate state of its value to y. If int could have trap representations on your platform, this would already be an error.
But then you don't read the indeterminate field y.j, so there is no problem with that particular assignment.
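If you would rather not depend on that reasoning at all, one option is to give every member a value at the point of declaration. A minimal sketch, assuming the same struct A as in the question; the function name copy_with_successor is hypothetical:

```c
#include <assert.h>

typedef struct { int i; int j; } A;

/* Hypothetical helper: the source struct is fully initialized before
   it is copied, so no indeterminate value is ever read. */
A copy_with_successor(int start)
{
    A x = { .i = start };   /* members without an initializer (x.j) become 0 */
    assert(x.j == 0);       /* follows from the initializer rules */

    A y = x;                /* every member of x now has a determinate value */
    y.j = y.i + 1;
    return y;
}
```

Calling copy_with_successor(0) yields a struct with i equal to 0 and j equal to 1, the same result the question's foo was aiming for.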

Related

Default values in struct c [duplicates]

What does the code below print when running the function print_test()?
struct test {
int a, b, c;
};
void print_test(void) {
struct test t = {1};
printf("%d/%d/%d", t.a, t.b, t.c);
}
Solution: 1/0/0
Why are b and c initialized to zero even though I did not do it? Are there any default values for the struct?
However, the struct is not global and the member is not static. Why are they automatically zero-initialized? And if the members had a type other than int, what value would they be initialized to?
If you don't specify enough initializers for all members of a struct, you'll face Zero initialization, which will initialize remaining members to 0. I think by today's standards this seems a bit odd, especially because C++'s initialization syntax has evolved and matured a lot over the years. But this behavior remains for backwards-compatibility.
I think we need to know two points at this stage:
There is nothing different between regular variables and structs if they are at local scope, i.e. have automatic storage duration. They will contain garbage values. Using those values could invoke undefined behaviour.
The only thing that makes structs different is that if you initialise at least one of the members, the rest of the members will get set to zero i.e. initialised as if they had static storage duration. But that's not the case when none of the members are initialised.
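The second point also answers the "what about other data types" part of the question: members without an initializer are set as if they had static storage duration, whatever their type. A small sketch of that rule; the struct mixed and the function name are made up for illustration:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical struct with members of several different types. */
struct mixed {
    int    n;
    double d;
    char  *p;
    char   buf[4];
};

/* Checks what a partial initializer does to the remaining members. */
void check_zero_fill(void)
{
    struct mixed m = { 7 };      /* only m.n has an explicit initializer */

    assert(m.n == 7);
    assert(m.d == 0.0);          /* arithmetic members become zero */
    assert(m.p == NULL);         /* pointer members become null pointers */
    assert(m.buf[0] == 0 && m.buf[3] == 0);  /* nested arrays are zeroed too */
}
```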
It depends on your declaration.
If your declaration is outside any function or with the static keyword (more precisely, has static storage duration), the initial value of x is a null pointer (which may be written either as 0 or as NULL).
If it's inside a function i.e. it has automatic storage duration, its initial value is random (garbage).
consider the following code:
#include<stdio.h>
#include<unistd.h>
struct point {
int x, y;
char a;
double d;
};
typedef struct point Point;
int main(void){
Point p1;
printf("\nP1.x: %d\n", p1.x);
printf("\nP1.y: %d\n", p1.y);
printf("\nP1.a: %d\n", p1.a);
printf("\nP1.d: %lf\n", p1.d);
Point p2 = {1};
printf("\nP2.x: %d\n", p2.x);
printf("\nP2.y: %d\n", p2.y);
printf("\nP2.a: %d\n", p2.a);
printf("\nP2.d: %lf\n", p2.d);
}
The output is :
P1.x: 0
P1.y: 66900
P1.a: 140
P1.d: 0.000000
P2.x: 1
P2.y: 0
P2.a: 0
P2.d: 0.000000
A Good read: C and C++ : Partial initialization of automatic structure

std::array equivalent in C

I'm new to C and C++, and I've read that at least in C++ it's preferable to use std::array or std::vector when using vectors and arrays, especially when passing these into a function.
In my research I found the following, which makes sense. I suppose using std::vector would fix the problem of indexing outside of the variable's scope.
void foo(int arr[10]) { arr[9] = 0; }
void bar() {
int data[] = {1, 2};
foo(data);
}
The above code is wrong but the compiler thinks everything is fine and
issues no warning about the buffer overrun.
Instead use std::array or std::vector, which have consistent value
semantics and lack any 'special' behavior that produces errors like
the above.
(answer from bames53, thanks btw!)
What I want to code is
float foo(int X, int Y, int l){
// X and Y are arrays of length l
float z[l];
for (int i = 0; i < l; i ++){
z[i] = X[i]+Y[i];
}
return z;
}
int bar(){
int l = 100;
int X[l];
int Y[l];
float z[l];
z = foo(X,Y,l);
return 0;
}
I want this to be coded in C, so my question is: is there a std::vector construct for C? I couldn't find anything on that.
Thanks in advance, also please excuse my coding (I'm green as grass in C and C++)
Standard C has nothing like std::vector or other container structures. All you get is built-in arrays and malloc.
I suppose using std::vector would fix the problem of indexing outside of the variable's scope.
You might think so, but you'd be wrong: Indexing outside of the bounds of a std::vector is just as bad as with a built-in array. The operator[] of std::vector doesn't do any bounds checking either (or at least it is not guaranteed to). If you want your index operations checked, you need to use arr.at(i) instead of arr[i].
Also note that code like
float z[l];
...
return z;
is wrong because there are no array values in C (or C++, for that matter). When you try to get the value of an array, you actually get a pointer to its first element. But that first element (and all other elements, and the whole array) is destroyed when the function returns, so this is a classic use-after-free bug: The caller gets a dangling pointer to an object that doesn't exist anymore.
The customary C solution is to have the caller deal with memory allocation and pass an output parameter that the function just writes to:
void foo(float *z, const int *X, const int *Y, int l){
// X and Y are arrays of length l
for (int i = 0; i < l; i ++){
z[i] = X[i]+Y[i];
}
}
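The caller's side of this pattern can be sketched as follows. The names add_arrays and sum_of_last are hypothetical, and the array contents are arbitrary sample values:

```c
#include <assert.h>
#include <stddef.h>

/* Same idea as foo above: the caller owns all three buffers. */
void add_arrays(float *z, const int *x, const int *y, size_t len)
{
    assert(z && x && y);
    for (size_t i = 0; i < len; ++i)
        z[i] = (float)(x[i] + y[i]);
}

/* Hypothetical caller: the arrays live in its own frame and outlive
   the call to add_arrays, so nothing dangles. */
float sum_of_last(void)
{
    enum { LEN = 4 };
    int x[LEN] = { 1, 2, 3, 4 };
    int y[LEN] = { 10, 20, 30, 40 };
    float z[LEN];

    add_arrays(z, x, y, LEN);
    return z[LEN - 1];           /* 4 + 40 */
}
```

Because the output buffer belongs to the caller, the same function works unchanged whether z is an automatic array, a static array, or memory obtained from malloc.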
That said, there are some libraries that provide dynamic data structures for C, but they necessarily look and feel very different from C++ and std::vector (e.g. I know about GLib).
Your question might be sensitive for some programmers of the language.
Using constructs of one language into another can be considered cursing as different languages have different design decisions.
C++ and C share a huge part, in a way that C code can (without a lot of modifications) be compiled as C++. However, if you learn to master C++, you will realize that a lot of strange things happen because of how C works.
Back to the point: C++ contains a standard library with containers such as std::vector. These containers make use of several C++ constructs that aren't available in C:
RAII (the fact that a Destructor gets executed when the instance goes out-of-scope) will prevent a memory leak of the allocated memory
Templates will allow type safety to not mix doubles, floats, classes ...
Function and operator overloading will allow different signatures for the same function (like erase)
Member functions
None of these exist in C, so several adaptations are required to get a data structure that behaves almost the same.
In my experience, most C projects have their own generic version of data structures, often based on void*. Often this will look similar to:
#include <stdlib.h>

typedef struct Vector
{
    void **data;      /* array of element pointers owned by the vector */
    long size;
    long capacity;
} Vector;

Vector *CreateVector(void)
{
    /* calloc zero-fills the struct; returns NULL on failure */
    return calloc(1, sizeof(Vector));
}

void DestroyVector(Vector *v)
{
    if (!v)
        return;
    for (long i = 0; i < v->size; ++i)
        free(v->data[i]);
    free(v->data);    /* free(NULL) is a no-op */
    free(v);
}
// ...
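The part elided by "// ..." is usually the growth logic. A minimal sketch for a concrete element type, assuming a doubling strategy; the IntVec type and its functions are hypothetical and not taken from any library:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical growable array of int. */
typedef struct {
    int   *data;
    size_t size;
    size_t capacity;
} IntVec;

/* Appends value, doubling the capacity when the buffer is full.
   Returns 0 on success, -1 if the allocation fails. */
int intvec_push(IntVec *v, int value)
{
    assert(v);
    if (v->size == v->capacity) {
        size_t new_cap = v->capacity ? v->capacity * 2 : 4;
        int *p = realloc(v->data, new_cap * sizeof *p);
        if (!p)
            return -1;           /* the old buffer is still valid */
        v->data = p;
        v->capacity = new_cap;
    }
    v->data[v->size++] = value;
    return 0;
}

/* Releases the buffer and resets the vector to its empty state. */
void intvec_free(IntVec *v)
{
    assert(v);
    free(v->data);
    v->data = NULL;
    v->size = v->capacity = 0;
}
```

A vector starts out as IntVec v = {0}; since realloc with a null pointer behaves like malloc, no separate creation function is needed in this sketch.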
Alternatively, you could mix C and C++.
vector.h
#ifdef __cplusplus
extern "C" {
#endif
typedef struct Vector
{
    void *cppVector;   /* opaque pointer to the C++ object */
} Vector;
Vector CreateVector(void);
void DestroyVector(Vector v);
#ifdef __cplusplus
}
#endif
vectorimplementation.cpp
#include "vector.h"
#include <cstdlib>
#include <memory>
#include <vector>
struct CDataFree
{
    void operator()(void *ptr) const { std::free(ptr); }
};
using CData = std::unique_ptr<void, CDataFree>;
Vector CreateVector()
{
    Vector v;
    v.cppVector = static_cast<void*>(std::make_unique<std::vector<CData>>().release());
    return v;
}
void DestroyVector(Vector v)
{
    auto cppV = static_cast<std::vector<CData>*>(v.cppVector);
    // Re-wrapping the pointer lets the unique_ptr destructor delete the vector.
    auto freeAsUniquePtr = std::unique_ptr<std::vector<CData>>(cppV);
}
// ...
The closest equivalent of std::array in C is probably a preprocessor macro definition like
#define ARRAY(type,name,length) \
type name[(length)]
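A different approximation, which is not what the macro above does: wrapping a fixed-size array in a struct gives it value semantics, because structs (unlike bare arrays) can be assigned, passed, and returned by value. A sketch under that assumption; ARR_LEN, FloatArr, and add_fixed are hypothetical names:

```c
#include <assert.h>

enum { ARR_LEN = 4 };

/* Hypothetical wrapper: the array travels with the struct. */
typedef struct { float v[ARR_LEN]; } FloatArr;

/* Returns the element-wise sum by value, so no pointer to a dead
   local array ever escapes the function. */
FloatArr add_fixed(const int *x, const int *y)
{
    FloatArr out;

    assert(x && y);
    for (int i = 0; i < ARR_LEN; ++i)
        out.v[i] = (float)(x[i] + y[i]);
    return out;
}
```

The length is fixed at compile time, which matches std::array rather than std::vector, and large wrappers are expensive to copy.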

My C programming is rusty and I'm having some issues

I am trying to declare a data structure in c and set some variables but I'm having a bit of trouble.
struct point {
float *x;
float *y;
float *z;
};
this struct is 24 bytes long so that's fine by me.
const unsigned int sz = 1<<24;
struct point _points[sz];
for(int i = 0; i < sz; ++i)
{
_points[i].x = get_rand_float();
_points[i].y = get_rand_float();
_points[i].z = get_rand_float();
}
// get_rand_float() returns a pointer to float;
The problem that I am having is that the application will crash.
After playing with the code a bit, it seems that maybe 1<<24 is too large? Bringing it down to 1<<14, the program runs just fine.
That brings me to another question, why would 1<<24 or about 16 million ints cause my program to crash? It's a fairly trivial program just int main boilerplate and this struct?
You don't want a structure of pointers to floats:
struct point {
float *x;
float *y;
float *z;
};
You want a structure of floats:
struct point {
float x;
float y;
float z;
};
In your code, sz is an int variable, not an array. So, technically, you cannot use the array subscript operator on sz. That code should not compile.
Maybe, you wanted to write something like
_points[i].x = get_rand_float();
But then again, it depends on get_rand_float() return type. It has to return a float * (which is not very likely seeing the function name).
If get_rand_float() returns a float value and you want to store the returned value, then you don't need pointers as your structure members. You can simply use float x; and so on.
One possible problem is that your array of points is too large for the stack. See here. You could fix that by a dynamic memory allocation, something like:
struct point *_points = malloc(sz*sizeof(struct point));
And of course, don't forget to free the memory when finished.
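A fuller sketch of the heap-based version, assuming the struct holds plain floats and that get_rand_float() returns a float. Since that function is not shown in the question, rand_unit_float below is a stand-in, and make_points is a hypothetical name:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

struct point { float x, y, z; };

/* Stand-in for get_rand_float(): returns a float in [0, 1]. */
static float rand_unit_float(void)
{
    return (float)rand() / (float)RAND_MAX;
}

/* Allocates and fills count points on the heap.
   Returns NULL on failure; the caller releases the result with free(). */
struct point *make_points(size_t count)
{
    struct point *pts;

    assert(count > 0);
    if (count > SIZE_MAX / sizeof *pts)
        return NULL;             /* count * sizeof *pts would overflow */
    pts = malloc(count * sizeof *pts);
    if (!pts)
        return NULL;
    for (size_t i = 0; i < count; ++i) {
        pts[i].x = rand_unit_float();
        pts[i].y = rand_unit_float();
        pts[i].z = rand_unit_float();
    }
    return pts;
}
```

Unlike an automatic array, a failed heap allocation is reported through the NULL return value, so the program can handle it instead of crashing.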
EDIT: Based on your edit, your crash occurs when you use a massive struct size (1<<24), or 16,777,216 items in the struct. Borrowing an answer from here:
Size limitation of C structure
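It would appear that you may be exceeding what the C standard guarantees: a hosted implementation is only required to support objects of up to 65535 bytes (C99 5.2.4.1), and anything larger is up to the implementation. Since 1<<14 elements works and 1<<24 does not, try sizes in between to find where it starts to fail. Keep in mind that the limit you hit is in bytes (number of elements times sizeof(struct point)), not in elements.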
On a side note, it would be helpful if you post the actual error message you get so we have a better idea of what is going on. :)
---Pre-Author Edit Answer---
Assuming get_rand_float() returns what it's supposed to, the problem is that sz is an int, not a struct. It should look like:
int sz = 24;
struct point _points[sz];
for(int i = 0; i < sz; ++i)
{
_points[i].x = get_rand_float();
_points[i].y = get_rand_float();
_points[i].z = get_rand_float();
}
Others have pointed out your two main problems (size too large, and you should be using float not float* in your struct). But there is another potential problem, too: you should not get into the habit of beginning an identifier name with an underscore, because, from Section 7.1.3 of the 1999 C standard:
All identifiers that begin with an underscore and either an uppercase letter or another underscore are always reserved for any use.
All identifiers that begin with an underscore are always reserved for use as identifiers with file scope in both the ordinary and tag name spaces.
Each macro name in any of the following subclauses (including the future library directions) is reserved for use as specified if any of its associated headers is included; unless explicitly stated otherwise (see 7.1.4).
All identifiers with external linkage in any of the following subclauses (including the future library directions) are always reserved for use as identifiers with external linkage.
Each identifier with file scope listed in any of the following subclauses (including the future library directions) is reserved for use as a macro name and as an identifier with file scope in the same name space if any of its associated headers is included.
Leaving aside the large size of the array, your program is mainly crashing because there is no memory allocated for the pointer variables x, y and z.
You have to allocate memory to the variables before assigning any values.
for(int i = 0; i < sz; ++i)
{
sz[i].x = get_rand_float(); <--- getting crash here!
sz[i].y = get_rand_float();
sz[i].z = get_rand_float();
}
for(i=0;i<sz;i++)
{
_points[i].x =(float *) malloc(sizeof(float));
_points[i].y = (float *) malloc(sizeof(float));
_points[i].z = (float *) malloc(sizeof(float));
*( _points[i].x) = get_rand_float();
*( _points[i].y) = get_rand_float();
*( _points[i].z) = get_rand_float();
}
for(i=0;i<sz;i++)
{
printf("%f %f %f ",*( _points[i].x), *(_points[i].y), *(_points[i].z));
printf("\n");
}
You can make your program simple by taking float as members of the structure instead of float pointers.
struct point {
float x;
float y;
float z;
};
int main()
{
int i;
for(i=0;i<sz;i++)
{
_points[i].x = get_rand_float();
_points[i].y = get_rand_float();
_points[i].z = get_rand_float();
}
for(i=0;i<sz;i++)
{
printf("%f %f %f ", _points[i].x, _points[i].y, _points[i].z);
printf("\n");
}
return 0;
}

What does it mean that automatic structures and arrays may now also be initialized?

In the beginning of chapter 6: Structures of the book by Brian W. Kernighan and Dennis M. Ritchie, there is a paragraph that I can't understand.
The main change made by the ANSI standard is to define structure assignment - structures may be copied and assigned to, passed to functions, and returned by functions. This has been supported by most compilers for many years, but the properties are now precisely defined. Automatic structures and arrays may now also be initialized.
What does it mean that automatic structures and arrays may now also be initialized? I'm pretty sure that automatic, namely local variables should be initialized manually. Would you please help me understand what it means?
In pre-standard C (meaning 'before the C89 standard', or a long time ago), you could not write:
int function(int i)
{
int array[4] = { 1, 2, 3, 4 };
struct { int x; int y; } x = { 1, 2 };
struct { int x; int y; } a[] = { { 2, 3 }, { 3, 4 } };
...code using array, x, a...
return x.y + a[!!i].x + array[3];
}
Now you are allowed to do all these.
Also, in K&R 1st Edition C (circa 1978), you were not allowed to write:
int another()
{
struct { int x; int y; } a, b;
a.x = 1;
a.y = 0;
b = a; /* Not allowed in K&R 1 */
some_other_func(a, b); /* Not allowed in K&R 1 */
some_other_func(&a, &b); /* Necessary in K&R 1 */
...
}
You could also only return pointers to structures (as well as pass pointers to structures). IIRC, some C compilers actually allowed the non-pointer notation, but converted the code behind the scenes to use pointers. However, the structure assignment and structure passing and structure returning limitations were removed from C well before the standard (shortly after K&R 1 was published). But not all compilers supported these features because there wasn't a standard for them to conform to.
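All three operations that the standard made precise (assignment, passing by value, returning by value) can be seen together in a short sketch; struct pair and the function names are made up for illustration:

```c
#include <assert.h>

struct pair { int x; int y; };

/* Takes a struct by value and returns a new struct by value. */
struct pair swapped(struct pair p)
{
    struct pair r = { p.y, p.x };
    return r;
}

void check_struct_copy(void)
{
    struct pair a = { 1, 0 };
    struct pair b;

    b = a;                       /* structure assignment */
    b = swapped(b);              /* pass by value, return by value */

    assert(a.x == 1 && a.y == 0);    /* the original is untouched */
    assert(b.x == 0 && b.y == 1);
}
```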
It means you can do stuff like this:
typedef struct { int a; float b; } foo;
int main(void) {
foo f = { 42, 3.14 }; // Initialization
...
}

Why is sizeof(type) the size of a pointer, not the size of the type itself?

In this code, why is sizeof(x) the size of a pointer, not the size of the type x?
typedef struct {
...
} x;
void foo() {
x *x = malloc(sizeof(x));
}
Because C says:
(C99, 6.2.1p7) "Any other identifier has scope that begins just after the completion of its declarator."
So in your example, the scope of the object x starts right after the x *x:
x *x = /* scope of object x starts here */
malloc(sizeof(x));
To convince yourself, put another object declaration of type x right after the declaration of the object x: you will get a compilation error:
void foo(void)
{
x *x = malloc(sizeof(x)); // OK
x *a; // Error, x is now the name of an object
}
Otherwise, as Shahbaz noted in the comments of another answer, this is still not a correct use of malloc. You should call malloc like this:
T *a = malloc(sizeof *a);
and not
T *a = malloc(sizeof a);
This is because sizeof(x) uses the innermost definition of x, which is the pointer. To avoid this problem, don't use the same name for a type and a variable.
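The effect can be observed directly. In the sketch below, the 64-byte body is an arbitrary stand-in for the elided struct, and record_size is a hypothetical helper that only captures the value of sizeof(x) as seen inside the initializer:

```c
#include <assert.h>
#include <stddef.h>

typedef struct { char bytes[64]; } x;   /* stand-in for the elided struct */

/* Hypothetical helper: stores n in *out and returns a null pointer. */
static void *record_size(size_t n, size_t *out)
{
    assert(out);
    *out = n;
    return NULL;
}

/* Here the name x still refers to the type. */
size_t size_of_type(void)
{
    return sizeof(x);
}

/* Inside the initializer, x already names the pointer being declared. */
size_t size_in_initializer(void)
{
    size_t seen = 0;
    x *x = record_size(sizeof(x), &seen);

    (void)x;                     /* silence the unused-variable warning */
    return seen;
}
```

With this stand-in, size_of_type() reports the size of the struct, while size_in_initializer() reports the size of a pointer to it.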
It is a bad idea to not give different things different names (not only in programming):
The academic reason for the behavior observed has already been mentioned by my dear fellow annotators.
To give clear advice: name different things differently (here: types and variables):
typedef struct {
...
} X;
void foo() {
X *x = malloc(sizeof(X));
}
An even more flexible way to code this example would be (as also already mentioned by Shahbaz's comment):
typedef struct {
...
} X;
void foo() {
X *x = malloc(sizeof(*x));
}
The latter example allows you to change the type of x without changing the code doing the allocation.
The drawback of this approach is that you could switch the type of x from a pointer to an array, or vice versa, without being notified by the compiler, and break your code in doing so.

Resources