Memcpy into an array inside a malloced structure - c

This is my scenario.
struct X {
char A[10];
int foo;
};
struct X *x;
char B[10]; // some content in it.
x = malloc(sizeof(struct X));
To copy contents from B to A, why is the following syntax correct:
memcpy(x->A, B, sizeof(x->A));
Here x->A is treated as a pointer, so we don't need &. But then we should need sizeof(*x->A) right? But with * it is not working, while without * it's working fine.
Is it like sizeof operator does not treat A like a pointer?

A is NOT a pointer, it's an array. So sizeof(x->A) is the correct syntax: it gives the size of the whole array, i.e., 10.
It's true that in many situations, an array name is converted to a pointer to the first element. But sizeof(arrayname) is NOT one of them.

sizeof(*x->A) gives you the size of a char (1 byte), while sizeof(x->A) gives you the size of the entire array (10 bytes).

sizeof(*x->A) is equivalent to sizeof(x->A[0]).
sizeof(*x->A) is 1 byte here, so memcpy would copy only one byte.
This is why sizeof(x->A) is the correct choice.

Though in many cases an array name decays to a pointer (like the first argument to memcpy() in your example), there are a few contexts where it doesn't, and the operand of sizeof is one of them. Another is the operand of the unary & operator. C++ has more such contexts (e.g. the initializer for an array reference).

Just to add to the previous comments:
sizeof(x->A) is correct. sizeof(*x->A) is not, because -> has higher precedence than *, so x->A is evaluated first and * then dereferences it to its first element (a single char, one byte).
Note that sizeof counts every byte of the array, including any terminating '\0'; it is strlen that leaves the terminator out. If A held the string "Hello" in a char[6], sizeof would give 6 and strlen would give 5.
So sizeof(x->A) already covers the whole array, and adding one to the count would read and write past the end of both 10-byte arrays. The call should stay as:
memcpy(x->A, B, sizeof(x->A));

Related

I understand that array types have no = operator, but why is my cast of a pointer to a pointer-to-array not working as expected?

Why does this code not seem to work the way I expect?
char *c="hello";
char *x=malloc(sizeof(char)*5+1);
memcpy(x,(char(*)[2])c,sizeof("hello"));
printf("%s\n",x);
On this question I got the comment: "You cannot cast a pointer to an array. But you can cast it to a pointer to array. Try (char(*)[2])c". So I am casting to a pointer to an array of two char, expecting it to take only the first two characters from c, because that is what I thought (char(*)[2])c is supposed to do. If not, what am I missing? I also expected junk after the first two characters in x, because I did not call memset. Why do I get the full "hello" written by memcpy even though I cast c to (char(*)[2])?
How can I extract a specific range of characters from a string by casting to an array type? Why can't it be done?
Converting a pointer does not change the memory the pointer points to. Converting the c to char [2] or char (*)[2] will not separate two characters from c.
c is char * that points to the first character of "hello".
(char (*)[2]) c says to take that address and convert it to the type “pointer to an array of 2 char”. The result points to the same address as before; it just has a different type. (There are some technical C semantic issues involved in type conversions and aliasing, but I will not discuss those in this answer.)
memcpy(x,(char(*)[2])c,sizeof("hello")); passes that address to memcpy. Due to the declaration of memcpy, that address is automatically converted to const void *. So the type is irrelevant (barring the technical issues mentioned above); whether you pass the original c or the converted (char (*)[2]) c, the result is a const void * to the same address.
sizeof "hello" is 6, because "hello" creates an array that contains six characters, including the terminating null character. So memcpy copies six bytes from "hello" into x.
Then x[5]='\0'; is redundant because the null character is already there.
To copy n characters from position p in a string, use memcpy(x, c + p, n);. In this case, you will need to manually append a null character if it is not included in the n characters. You may also need to guard against going beyond the end of the string pointed to by c.

memcpy start index really needed?

The question is: when copying a byte array using memcpy(), must we explicitly give the starting (0th) index of the destination buffer, or does simply naming the buffer suffice? The examples below show what I mean, assuming we want to copy the source buffer to the start of the destination buffer.
BYTE *pucInputData; // we have some data here
BYTE ucOutputData[20] = {0};
Code 1
memcpy((void*)&ucOutputData, (void*)pucInputData, 20);
Code 2
memcpy((void*)&ucOutputData[0], (void*)pucInputData, 20);
In your case, assuming this is a C code snippet and ucOutputData is an array,
memcpy(ucOutputData, pucInputData, 20);
memcpy(&ucOutputData[0], pucInputData, 20);
are the same and can be used interchangeably. In this context the name of the array gives you the address of the first element in the array.
Now, following the very useful discussion in the comments below, it is worth mentioning that
memcpy(&ucOutputData, pucInputData, 20);
will also do the job here, but there is a fundamental difference between using the array name and using the address of the array name. Considering the example in the question, for a definition like BYTE ucOutputData[20],
ucOutputData decays to a pointer to the first element of an array of 20 BYTEs (type BYTE *).
&ucOutputData is a pointer to an array of 20 BYTEs (type BYTE (*)[20]).
So they are of different types, and C respects the type of an expression. Hence, to avoid any possible misuse and misconception, the recommended and safe way is to use either of the first two expressions.
FWIW, the casts here are not needed. Any object pointer type can be implicitly and safely converted to void * in C.
No, both of your examples are sub-optimal.
Remember that all data pointers in C convert to/from void * (which is the type of the first argument to memcpy()) without loss of information and that no cast is necessary to do so.
Also remember that the name of an array evaluates to the address of the first element in many contexts, such as here.
Also remember to use sizeof when you can; never introduce a literal constant when you don't have to.
So, the copy should just be:
memcpy(ucOutputData, pucInputData, sizeof ucOutputData);
Note that we use sizeof without parentheses; it's an operator, not a function. Also, we apply it to the destination buffer, which seems the safer choice.
Since the expression &array[0] yields the same pointer as array in this context, and because any object pointer can be implicitly converted to void *, you should do this instead:
memcpy(ucOutputData, pucInputData, 20);
Moreover, since you are writing over the entire ucOutputData, you do not need to zero out its content, so it's OK to drop the initializer:
BYTE ucOutputData[20]; // no "= {0}" part
A native array decays to a pointer without a cast, so in the snippet below the three assignments to p all result in the same value; p will point to the beginning of the array. No explicit cast is needed because the conversion to void * is implicit.
typedef char BYTE;
BYTE ucOutputData[20] = {0};
void *p = &ucOutputData;
p = ucOutputData;
p = &ucOutputData[0];

Pointers to struct and array in C

Questions are based on the following code :
struct t
{
int * arr;
};
int main()
{
struct t *a = malloc(5*sizeof(struct t));
a[2].arr = malloc(sizeof(int));//line 1
a[2].arr[1] = 3; //line 2
}
In line 2 I'm accessing the array arr using the . (dot) operator and not the -> operator. Why does this work?
When I rewrite line 2 as (a+2)->arr[1] = 3, this works. But if I write it as (a+2)->(*(arr+1)) = 3, I get the message "expected identifier before '(' token". Why is this happening?
For line 1, the dot operator works in this case, because the array access dereferences the pointer for you. *(a+2) == a[2]. These two are equivalent in both value and type.
The "->" operator expects an identifier after it; specifically, the right operand must be a member of the type the left operand points to. Read the message carefully: it really is just complaining about your use of parentheses. (Example using the . operator instead: a[2].(arr) is invalid, a[2].arr is just dandy.)
Also, if we can extrapolate meaning from your code, despite its compilation errors, there is the potential for memory related run time issues as well.
-> dereferences a pointer and accesses its pointee. As you seem to know a[1] is equivalent to *(a + 1), where the dereference already takes place.
The expression (a+2)->arr[1] is equivalent to *((a+2)->arr + 1).
You allocated space for one single int for a[2].arr, then wrote to the second one. Oops.
a[2] is not a pointer. The indexing operator ([]) dereferences the pointer (a[2] is equivalent to *(a+2)).
(*(arr+1)) is an expression, not a member name. If you want to do it that way, first get the pointer (a+2)->arr + 1, then dereference it: *((a+2)->arr+1). Of course, since you've only malloced enough memory for one int, this will attempt to access unallocated memory. If you malloc(sizeof(int)*2), it should work.

Accessing arrays in an array of pointers

Let's say I have an array of pointers in C. For instance:
char** strings
Each pointer in the array points to a string of a different length.
If I will do, for example: strings + 2, will I get to the third string, although the lengths may differ?
Yes, you will (assuming that the array has been filled correctly). Imagine the double pointer situation as a table. You then have the following, where each string is at a completely different memory address. Please note that all addresses have been made up, and probably won't be real in any system.
strings[0] = 0x1000000
strings[1] = 0xF0
...
strings[n] = 0x5607
0x1000000 -> "Hello"
0xF0 -> "World"
Note here that none of the actual text is stored in the strings array itself. The storage at those addresses contains the actual text.
For this reason, strings + 2 adds two to the strings pointer, which yields the address of strings[2]. Dereferencing that gives strings[2], which is a memory address that can then be used to access the string.
strings + 2 is the address of the 3rd element of the buffer pointed to by strings.
*(strings + 2) or strings[2] is the 3rd element which is again a pointer to a buffer of characters.
I think you want to reach the third element through the expression
strings[2];
Look at the type of that expression:
Its type is char *.
According to the standard,
an expression of type 'n-element array of t' decays to a pointer to t, except when it is the operand of the unary & operator or of the sizeof operator.
So strings[2] is equivalent to *(strings + 2): it yields the value stored in the third slot, which is itself a pointer, i.e. the address of the third string.
But
strings+2;
has type char ** and yields the address of the third slot itself, i.e. the address of the third element of the array of pointers whose base address is stored in strings.
Your question does not show any assignment to char **strings, so I am answering on the assumption that it has been initialised to point to an array of pointers.
If it has not been initialised, it makes no sense to evaluate
*(strings + 2)
because strings does not point to anything valid.

array of pointers

Consider the following code:
#include <stdio.h>
int main()
{
static int a[]={0,1,2,3,4};
int *p[]={a,a+1,a+2,a+3}; /* clear up to this extent */
printf(("%u\n%u\n%u",p,*p,*(*p))); /* how does this statement work? */
return 0;
}
Also, is it necessary to print the addresses with %u, or can we use %d as well?
Okay, you've created an array of integers and populated it with the integers from 0 to 4. Then you created a 4 element array of pointers to integers, and initialized it so its four elements point to the first four elements of a. So far, so good.
Then the printf is very strange. printf is passed a single argument, namely ("%u\n%u\n%u",p,*p,*(*p)). This is a comma expression, which means that the comma-separated operands are evaluated in turn and only the value of the last one is kept. The string literal is therefore discarded and printf receives **p, an int, where it expects a format string, so I'd expect at least a diagnostic from the compiler. However, without the extraneous parentheses, you have:
printf("%u\n%u\n%u\n",p, *p, *(*p));
This is legal. Three values are passed to printf, interpreted as unsigned integers (which actually only works on some systems, since what you are passing in the first two cases are pointers, and they aren't guaranteed to be the same size as unsigned ints) and printed.
Those values are p, *p and **p. p is an array, so in this context it converts to the address of its first element, p[0]. *p is what that pointer points to, namely p[0]; likewise *(p+1) is p[1], and so on. The value stored in p[0] is the address of a[0], so another address is printed. The third argument is **p, which is the value stored at *p, i.e. a[0], which is 0.
Do you have an extra pair of parens in your printf statement?
Anyway, you can think of this statement:
printf("%u\n%u\n%u",p,*p,*(*p));
like following a trail of pointers.
p is the pointer itself; printing it should print the pointer's value, which is the address of what it points to. In your case it's an array of (int *)'s.
*p is a dereferencing operation. It allows access to the object that p points to. In the other answers you see notes made about *p being equivalent to p[0]. That's because p is pointing to the beginning of your structure, which is the start of the array.
**p is a dereferencing operation on the pointer object that p points to. Extending the example in the previous point, you can say that **p is equivalent to *(p[0]) which is equivalent to *(a) which is equivalent to a[0].
One tip that might help when trying to decipher these sorts of statements is to keep the precedence rules of C in mind and insert parens between expressions to break the statement up. For **p, inserting parens gives *(*p), which makes it clear that you are following two pointers to the final destination.
With those extra parentheses, the commas become comma operators, so only the final **p is passed to printf. printf expects its first argument to be a pointer to a character string, and on many systems pointers and integers have the same size, so the integer 0 is interpreted as a NULL pointer and printf prints nothing at all. Or it crashes. That's the trouble with undefined behavior.
Your printf() arguments work like so:
p is an address (it's an array of pointers)
*p is also an address (it's equivalent to p[0], which is just a)
*(*p) is an integer (it's a[0])
My memory on C pointers is a tiny bit rusty, but let me see if I can recall.
p should print as a memory location: the address of the array p itself.
*p dereferences p (goes to that memory location and returns the value there). Since p behaves as a pointer to pointers here, dereferencing it gives the first value in the array definition, which is the address of a.
**p dereferences *p. *p is the address of a. If we dereference that, we get the value we put in the first location of a, which is 0.

Resources