I've asked a similar question on structs here but I'm trying to figure out how C handles things like assigning variables and why it isn't allowed to assign them to each other if they are functionally the same.
Let's say I have two arrays:
int x[10];
int y[10];
Why won't x = y compile? If they are both the same "signature" like that, then shouldn't you be able to assign them back and forth?
Can I declare these in a way that would allow me to do that in C? It makes sense to me that you would be able to, so maybe there is a way that this can be done? Typedefs for structs seemed to be the solution; would it be the same for array declaration and assignment?
I appreciate your help, guys. I'm new to Stackoverflow but it has been a really good resource for me so far!
Simply put, arrays are not assignable. They are a "non-modifiable lvalue". This of course begs the question: why? Please refer to this question for more information:
Why does C++ support memberwise assignment of arrays within structs, but not generally?
Arrays are not pointers. x here does refer to an array, though in many circumstances this "decays" (is implicitly converted) to a pointer to its first element. Likewise, y too is the name of an array, not a pointer.
You can do array assignment within structs:
struct data {
int arr[10];
};
struct data x = {/* blah */};
struct data y;
y = x;
But you can't do it directly with arrays. Use memcpy.
int x [sz];
int *y = x;
This compiles, and y will point to the first element of x. Note that y aliases x here; no elements are copied.
Some messages here say that the name of an array yields the address of its first element. It's not always true:
#include <stdio.h>
int
main(void)
{
int array[10];
/*
* Print the size of the whole array then the size of a pointer to the
* first element.
*/
printf("%u %u\n", (unsigned int)sizeof array, (unsigned int)sizeof &array[0]);
/*
* You can take the address of array, which gives you a pointer to the whole
* array. The difference between ``pointer to array'' and ``pointer to the
* first element of the array'' matters when you're doing pointer arithmetic.
*/
printf("%p %p\n", (void*)(&array + 1), (void*)(array + 1));
return 0;
}
Output:
40 4
0xbfbf2ca4 0xbfbf2c80
In order to assign arrays you will have to assign the values inside the array.
i.e. the effect you want from x = y is obtained with
for(int i = 0; i < 10; ++i)
{
x[i] = y[i];
}
In an attempt to complement Blank's answer, I devised the following program:
localhost:~ david$ cat test.c
#include <stdlib.h>
#include <stdio.h>
int main (int argc, char * argv [])
{
struct data {
int c [2];
} x, y;
x.c[0] = x.c[1] = 0;
y.c[0] = y.c[1] = 1;
printf("x.c %p %i %i\n", x.c, x.c[0], x.c[1]);
printf("y.c %p %i %i\n", y.c, y.c[0], y.c[1]);
x = y;
printf("x.c %p %i %i\n", x.c, x.c[0], x.c[1]);
printf("y.c %p %i %i\n", y.c, y.c[0], y.c[1]);
return 0;
}
When executed, the following is output:
x.c 0x7fff5fbff870 0 0
y.c 0x7fff5fbff860 1 1
x.c 0x7fff5fbff870 1 1
y.c 0x7fff5fbff860 1 1
The point is to illustrate how the copy of structures' values occurs.
Declaring "int x[10]" says "reserve some room for 10 integers"; in most expressions the name x then yields a pointer to that location. So for the copy to make sense you'd need to operate on the memory pointed to, rather than on 'the name of the memory location'.
So for copying here you'd use a for loop or memcpy().
I've used C compilers where that would compile just fine...and when run the code would make x point to y's array.
You see, in C the name of an array is converted, in most expressions, to a pointer that points to the start of the array. As a result, arrays and pointers are interchangeable in many contexts (though not all: sizeof and & still see the whole array). You can take any pointer and index it like an array.
Back when C was being developed in the early '70s, it was meant for relatively small programs that were barely above assembly language in abstraction. In that environment, it was damn handy to be able to easily go back and forth between array indexing and pointer math. Copying whole arrays of data, on the other hand, was a very expensive thing to do, and hardly something to be encouraged or abstracted away from the user.
Yes, in these modern times it would make way more sense to have the name of the array be shorthand for "the whole array", rather than for "a pointer to the front of the array". However, C wasn't designed in these modern times. If you want a language that was, try Ada. x := y there does exactly what you would expect; it copies one array's contents to the other.
I wish to have a type which can be used as two different array structures - depending on context. They are not to be used interchangeably whilst the program is executing, rather when the program is executed with a particular start-up flag the type will be addressed as one of the array types
(for example):
array1[2][100]
or
array2[200];
I am not interested in how the data is organised (well I am but it is not relevant to what I wish to achieve)
union m_arrays
{
uint16_t array1[2][100];
uint16_t array2[200];
};
or do I have to use a pointer and alloc it at runtime?
uint16_t * array;
array = malloc(200 * sizeof(uint16_t));
uint16_t m_value =100;
*(array + 199) = m_value;
//equivalent uint16_t array1[1][99] == *(array + 199);
//equivalent uint16_t array2[199] == *(array + 199);
I haven't tried anything as yet
A union holds only one of its members at any given time. That is, only one member can be "bound" at a time (this is just an abstraction, since C keeps no record of which member is "active").
In general, the size of the union is the size in bytes of its largest member (plus any padding the implementation adds).
Let me give an example:
#include <stdio.h>
typedef union m_arrays
{
int array1[2][100];
int array2[400];
} a;
int main()
{
printf("%zu", sizeof(a));
return 0;
}
In this example, this would print 1600 (assuming int is 4 bytes long, which ultimately depends on the architecture), which is the size in bytes of the largest member, array2. So, YES, you can have a union of arrays in C.
Yes, this does work, and it's actually precisely because of how arrays are different from pointers. I'm sure you've heard that arrays in C are really just pointers, but the truth is that there are some important differences.
First, an array is the storage itself, not something that points at storage. A local array like the ones here typically lives on the stack, and malloc can't give you a named array: it returns an address, which you hold in a pointer. A pointer can point anywhere; you can even set it to an arbitrary integer if you want (though there's no guarantee you can access the memory that it points to).
Second, because arrays are fixed length, the compiler can and does allocate them for you when you declare them. Importantly, this comes with the guarantee that the whole array is in one contiguous memory block. So if you declare int arr[2][100], you'll have 200 int slots allocated in a row. That means you can treat any multidimensional array as a single-dimensional array if you want to, e.g. instead of arr[y][x] you could do arr[0][y*100+x]. You could also do something like int* arr2 = arr and then treat arr2 as a regular array even though arr actually decays to an int (*)[100], not an int* (you'll get a warning for doing either of these things, and strictly speaking the standard doesn't sanction indexing past the end of a row; my point is that it works in practice because of how arrays are laid out).
The third, and probably most important difference, is a consequence of the second. When you have an array in a struct or union, the struct/union isn't just holding a pointer to the first element. It holds the entire array. This is often used for copying arrays or returning them from functions. What this means for you is that what you want to do works despite what someone who's heard that arrays are pointers might initially think. If arrays were just an address and they were initialized by allocating at that address, there would be two different arrays initialized at two different places, and having the pointers to them in a union would mean one gets overwritten and now you have an array somewhere that you can't access.
So when this all comes together, your union of arrays basically has one array with two different ways of accessing the data (which is what you want if I'm not mistaken). A little example:
#include <stdio.h>
int main(void) {
union {
int arr1[4];
int arr2[2][2];
} u;
u.arr1[0] = 1;
u.arr1[1] = 2;
u.arr1[2] = 3;
u.arr1[3] = 4;
printf("%d %d\n%d %d\n", u.arr2[0][0], u.arr2[0][1], u.arr2[1][0], u.arr2[1][1]);
return 0;
}
Output:
1 2
3 4
We can also quickly walk through why this wouldn't work with pure pointers. Let's say we instead had a union like this:
union {
int* arr1;
int** arr2;
} u;
Then we might initialize with u.arr1 = (int*) malloc(4 * sizeof (int));. Then we could use arr1 like a normal array. But what happens when we try to use arr2? Well, arr2[y][x] is of course syntactic sugar for *(*(arr2+y)+x). Once it's dereferenced that first time, we now have an int, since the address points to an int. So when we add x to that int and try to dereference again, we're trying to dereference an int. C will try to do it, and if you're very unlucky it will succeed; I say unlucky because then you'll be messing with arbitrary memory. What's more likely is a segfault because whatever int is there is most likely not an address your program has access to.
So this is one of the first times I've worked with multi-dimensional arrays in C. I have a 3D array of ints that is being filled and I want to check to make sure the values are turning out alright, but I'm having some issues after the array is filled.
Some similar code of what's going on to explain the situation:
int a = 10;
int b = 20;
int c = 30;
int* valuesPtr;
valuesPtr = (int*) malloc(a * b * c * sizeof(int));
functionThatFillsValues(valuesPtr); // this function expects an int *, and works.
Everything here is working properly. Then, I'm printing out the value like so:
printf("0,0,0: %d\n", *valuesPtr);
This successfully prints the value at [0][0][0]. Doing the following works:
printf("0,0,1?: %d\n", *(valuesPtr+1));
Now, I'm to understand from https://www.geeksforgeeks.org/pointer-array-array-pointer/ there's a few ways to get values from where this pointer is pointing to.
The following works: printf("0,0,0: %d\n", valuesPtr[0]); but as soon as I try to go any further than that, it breaks on make. None of the following work:
printf("0,0,0: %d\n", valuesPtr[0][0]);
printf("0,0,0: %d\n", valuesPtr[0][0][0]);
printf("0,1,0: %d\n", valuesPtr[0][1]);
printf("0,0,1: %d\n", valuesPtr[0][0][1]);
printf("%d\n", *(*(valuesPtr+1)+1));
printf("%d\n", *(*(*(valuesPtr+1)+1)+1));
The [] ones get the error "subscripted value is neither array nor pointer", and the *() attempts get the error "invalid type argument of ‘unary *’ (have ‘int’)". I'm guessing that's because in both cases it resolves valuesPtr[0] and *(valuesPtr) as an int and then can't go any further. I'm guessing the issue is that valuesPtr is declared as an int *, so the compiler believes it's just a pointer to an int. How can I get it to understand that it's a pointer to a 3D array, either by getting the pointers to the 2D or 1D arrays inside, or just by getting it to understand I can use [][][] notation?
Would something like this be looking at the right spot?
printf("i,j,k: %d", *(valuesPtr+(i*(c*b))+(j*(c))+k));
There are two different things you can do.
Option #1 is to simply use linear offsets to access your elements. For that, you'd have to compute a single offset from your triplet. Let me show you an example:
printf("2, 3, 4: %d\n", valuesPtr[(2 * b + 3) * c + 4]);
This is a bit cumbersome, but it's the traditional way of doing things, and it works.
An alternative, if you're using C99 or later, is to use what's called a variably-modified type. Note that this requires support for VLAs, which is optional since C11, and thus some implementations (notably, MSVC) might not support them.
If you declared the array manually, it'd be int arr[a][b][c];. Because of the array-to-pointer decay rules, this would become an int (*) [b][c] when used in an expression — that is, a pointer to an array of b arrays of c elements. So you can just declare your pointer type to be of that type:
int (* valuesPtr) [b][c] = malloc(sizeof *valuesPtr * a);
// note that, since each element is already an array of [b][c], you only
// need to multiply by a to get the final size
printf("0, 0, 0: %d\n", valuesPtr[0][0][0]);
printf("1, 0, 0: %d\n", valuesPtr[1][0][0]);
printf("0, 1, 0: %d\n", valuesPtr[0][1][0]);
printf("0, 0, 1: %d\n", valuesPtr[0][0][1]);
If I understand your problem correctly, you are working with a 3d data set. However, you are forced to utilize a 1D array that maps to a 3D array. Perhaps the following will help:
You might think about building a (free) union of a 1D array and a 3D array so that you may refer to the data in whichever manner is more convenient. That is, you can fill the 3D array using triple indices, but for 1D purposes you can locate the first integer in the linear array.
Since the size of the arrays is identical, the union will contain nothing extra, and it shouldn't cost much to reference your data in either way. A simple example:
// FreeUnion.c
#include <stdio.h>
#define a 10 // defines for the sizing below
#define b 20
#define c 30
union blob {
int a3d[a][b][c]; // a 3d array
int a1d[a*b*c]; // a 1d array
} arrun;
int main(void) {
arrun.a1d[0] = 0;
arrun.a1d[1] = 1;
arrun.a1d[2] = 2;
printf("one: %i\n", arrun.a1d[1]);
printf("triplet 0: %i, %i, %i \n", arrun.a3d[0][0][0],
arrun.a3d[0][0][1], arrun.a3d[0][0][2]);
}
I am new to structures and I'm trying to do some tutorials to see if I understood well what I've been learning. Here's the code I wrote:
#include <stdio.h>
#include <stdlib.h>
typedef struct variables{
float Vx;
float Vy;
float Vz;
}velocity;
int main(){
velocity *pv;
pv = (velocity*)malloc(sizeof(velocity));
pv[0].Vx = 1;
pv[0].Vy = 2;
pv[0].Vz = 3;
free(pv);
return 0;
}
So I have two questions:
Did I allocate the three variables in correct way?
Since I'm using the array notation why should I ever write [0]
instead of [1] or [2] or so on?
To answer the first question: yes, your code is completely correct. (You even free'd it properly, I'm a bit proud!)
As for the second question, I'm a bit unsure what you mean, but when you call malloc(N * sizeof(type)) where N is some integer (in your case, it would just be 1), you are in essence just creating an array of N elements of type. So pv[0] is the first and only element in this array when N=1, and pv[1], pv[2] etc don't exist.
You should, however, use the syntax pv->Vx instead of pv[0].Vx.
Your code is correct, but the syntax you are using is a bit odd. The operator [n] means: take the address stored in the pointer (in your case the value in pv), advance it by n elements, and dereference it. Since you are not advancing the address (n = 0), you can just dereference it. You do this with *pv, or simply with pv->. You only need the [] operator when you have allocated more than one struct and want to select one of them. pv[3] would then be the same as *(pv+3). But you first have to allocate more space if you want to use a pointer as an array:
malloc(sizeof(velocity) * 4)
Yes, you used your variables in the proper way.
In your code, you've only allocated memory for one instance of the struct, so it's the same whether you write pv[0].Vx or pv->Vx. If you allocate memory for n instances, you can use pv[k].Vx, where 0<=k<=n-1.
SideNote: Please do not cast the return value of malloc().
I know there are several questions about this which give good (and working) solutions, but none IMHO that says clearly what the best way to achieve this is.
So, suppose we have some 2D array :
int tab1[100][280];
We want to make a pointer that points to this 2D array.
To achieve this, we can do :
int (*pointer)[280]; // pointer creation
pointer = tab1; // assignment
pointer[5][12] = 517; // use
int myint = pointer[5][12]; // use
or, alternatively :
int (*pointer)[100][280]; // pointer creation
pointer = &tab1; // assignment
(*pointer)[5][12] = 517; // use
int myint = (*pointer)[5][12]; // use
OK, both seem to work well. Now I would like to know:
what is the best way, the 1st or the 2nd?
are both equal for the compiler? (speed, perf...)
is one of these solutions eating more memory than the other?
what is the more frequently used by developers?
//defines an array of 280 pointers (1120 or 2240 bytes)
int *pointer1 [280];
//defines a pointer (4 or 8 bytes depending on 32/64 bits platform)
int (*pointer2)[280]; //pointer to an array of 280 integers
int (*pointer3)[100][280]; //pointer to an 2D array of 100*280 integers
Using pointer2 or pointer3 produces the same binary, except for manipulations such as ++pointer2, as pointed out by WhozCraig.
I recommend using typedef (producing same binary code as above pointer3)
typedef int myType[100][280];
myType *pointer3;
Note: Since C++11, you can also use keyword using instead of typedef
using myType = int[100][280];
myType *pointer3;
in your example:
myType *pointer; // pointer creation
pointer = &tab1; // assignment
(*pointer)[5][12] = 517; // set (write)
int myint = (*pointer)[5][12]; // get (read)
Note: If the array tab1 is declared within a function body, it is placed in call stack memory. But the stack size is limited. Using arrays bigger than the free stack space produces a stack overflow crash.
The full snippet is online-compilable at gcc.godbolt.org
int main()
{
//defines an array of 280 pointers (1120 or 2240 bytes)
int *pointer1 [280];
static_assert( sizeof(pointer1) == 2240, "" );
//defines a pointer (4 or 8 bytes depending on 32/64 bits platform)
int (*pointer2)[280]; //pointer to an array of 280 integers
int (*pointer3)[100][280]; //pointer to an 2D array of 100*280 integers
static_assert( sizeof(pointer2) == 8, "" );
static_assert( sizeof(pointer3) == 8, "" );
// Use 'typedef' (or 'using' if you use a modern C++ compiler)
typedef int myType[100][280];
//using myType = int[100][280];
int tab1[100][280];
myType *pointer; // pointer creation
pointer = &tab1; // assignment
(*pointer)[5][12] = 517; // set (write)
int myint = (*pointer)[5][12]; // get (read)
return myint;
}
Both your examples are equivalent. However, the first one is less obvious and more "hacky", while the second one clearly states your intention.
int (*pointer)[280];
pointer = tab1;
pointer points to a 1D array of 280 integers. In your assignment, you actually assign the address of the first row of tab1. This works since an array is implicitly converted to a pointer to its first element (and the first element of tab1 is a row).
When you are using pointer[5][12], C treats pointer as an array of arrays (pointer[5] is of type int[280]), so there is another implicit conversion here (at least semantically).
In your second example, you explicitly create a pointer to a 2D array:
int (*pointer)[100][280];
pointer = &tab1;
The semantics are clearer here: *pointer is a 2D array, so you need to access it using (*pointer)[i][j].
Both solutions use the same amount of memory (1 pointer) and will most likely run equally fast. Under the hood, both pointers will even point to the same memory location (the first element of the tab1 array), and it is possible that your compiler will even generate the same code.
The first solution is "more advanced" since one needs quite a deep understanding on how arrays and pointers work in C to understand what is going on. The second one is more explicit.
int *pointer[280]; // Creates an array of 280 pointers to int.
On a 32-bit OS each pointer takes 4 bytes, so 4 * 280 = 1120 bytes.
int (*pointer)[100][280]; // Creates only one pointer, which points to an array of [100][280] ints.
Here only 4 bytes are used.
Coming to your question, int (*pointer)[280]; and int (*pointer)[100][280]; are different, though both point to the same 2D array of [100][280].
If int (*pointer)[280]; is incremented, it points to the next 1D array (the next row), whereas incrementing int (*pointer)[100][280]; steps over the whole 2D array and points to the first byte after it. Accessing that byte may cause problems if that memory doesn't belong to your process.
OK, these are actually four different questions. I'll address them one by one:
are both equal for the compiler? (speed, perf...)
Yes. The pointer dereference and the decay from type int (*)[100][280] to int (*)[280] are always a no-op for your CPU. I wouldn't put it past a bad compiler to generate bogus code anyway, but a good optimizing compiler should compile both examples to the exact same code.
is one of these solutions eating more memory than the other?
As a corollary to my first answer, no.
what is the more frequently used by developers?
Definitely the variant without the extra (*pointer) dereference. For C programmers it is second nature to assume that any pointer may actually be a pointer to the first element of an array.
what is the best way, the 1st or the 2nd?
That depends on what you optimize for:
Idiomatic code uses variant 1. The declaration is missing the outer dimension, but all uses are exactly as a C programmer expects them to be.
If you want to make it explicit that you are pointing to an array, you can use variant 2. However, many seasoned C programmers will think that there's a third dimension hidden behind the innermost *. Having no array dimension there will feel weird to most programmers.
This program is supposed to take in a three-digit number and reverse its digits. 123 would become 321.
The logic is correct, and the program compiles correctly. :) However, the logic of the pointers does not come easily to me.
My prof explains things with "stack diagrams" and I find them to be helpful. I created this program based off another program because I noticed the similarities between this and a different program I made, but how does the pointing work?
#include <stdio.h>
void reverse_number(int in_val, int *out_val) {
int ones, tens, hundreds;
ones = in_val % 10;
tens = (in_val % 100 - ones) / 10;
hundreds = (in_val - (ones + tens)) / 100;
*out_val = (ones * 100) + (tens * 10) + hundreds;
}
int main() {
int in_val;
int out_val;
printf("Give a three digit num to reverse: \n");
scanf("%d", &in_val);
reverse_number(in_val, &out_val);
printf("New number is: %d \n", out_val);
return 0;
}
Also, I am now beginning to understand how to write programs based on a kind of template with these pointers, and I understand very basically what the star inside a parameter means (declared as a pointer variable).
For example, I know that m = &q; gives variable m the address of another variable q and I know that m = *g; would mean that the value at the address g would go into m but I am really unfamiliar with how these work in the context of a function and a main file.
If someone could lay out the fundamental logic of how it would work (in this program) that would be awesome. As a math major, I can understand the operations of the math and stuff. The pointers don't exactly confuse me, but it seems to me that there are ways to do this without needing to deal with the address of a variable, etc.
When I run it, it compiles, and even works. See: http://ideone.com/RHWwI
So it must be how you're compiling it. What is your compiler error?
Well, since you understood & and * operators perfectly, the rest is very very simple.
Let's say you have:
int q;
int *m;
m = &q;
Then if you say:
int *m2;
m2 = m;
m2 will contain the same value as m, that is, it will have the address of q. Therefore, *m and *m2 will give you the same value (which is the value of q). You do understand that * is the inverse operator of &, right? So *(&q) = q and &(*m) = m (in the latter case, m needs to be a pointer for * to be applicable).
So, how does this work with functions? Simple! When you pass arguments to functions, you pass them by value. When you pass by pointer, you are actually passing by value, the pointer of the variable.
So let's examine your function call in detail:
reverse_number(in_orig, &out_orig);
I renamed your in_val and out_val in the main to in_orig and out_orig so it won't get mixed with those of reverse_number.
Now, &out_orig is the address of out_orig. When passed as arguments, this gets copied into out_val argument of reverse_number. This is exactly like writing:
int *out_val = &out_orig;
Now, if you had the above line in your main, you could just write *out_val = something; and it would change out_orig, right? Well, since you have the address of out_orig in out_val, then who cares if *out_val is being set in main or reverse_number?
So you see? When you have a pointer, you can just copy it around, whether by copying it to another variable or passing it as argument of a function (which is basically the same thing), you can still access the same variable it is pointing to. After all, all the copies have the same value: address of out_orig. Now if you want to access it in a function or in main, it doesn't really matter.
Edit: * in pointer definition
* can be used to define a pointer too and this has nothing to do with the previous usage of * as an operator that gets the value of an address.
This is just definition, so you have to learn it:
If you have a value of type type (for example int), then the address of that variable (using operator &) has type type * (in this example int *). Since a pointer takes that address, the type of the pointer is type *.
On the contrary, if a pointer has type type * (for example int *), then getting the value where the pointer points to (using operator *) has type type (in this example int).
In summary, you can say something like this:
operator &, adds one * to the type of the variable
operator *, removes one * from the type of the expression
So let's see some examples:
int x;
x has type int
&x has type int *
float *y;
y has type float *
&y has type float **
*y has type float
struct Data ***d;
d has type struct Data ***
&d has type struct Data ****
*d has type struct Data **
*(*d) has type struct Data *
*(*(*d)) has type struct Data
If you noticed, I said that & adds one * to the type of variable, but * removes one * from the type of expression. Why is that? Because & gives the address of a variable. Of course, because nothing else has an address. For example a+b (possibly) doesn't have any address in the memory, and if it does, it's just temporary and useless.
operator * however, works on addresses. No matter how you compute the address, operator * works on it. Examples:
*(0x12345678) -> Note that even if the compiler lets you do this,
your program will most likely crash
*d -> like we saw before
*(d+4) -> This is the same as writing d[4]
And now you know why arrays and pointers are so often treated as one
In case of a dynamic 2d array:
*(*(d+4)+6) -> This is the same as writing d[4][6]