memcpy start index really needed? - c

When copying a byte array with memcpy(), do we need to explicitly take the address of element 0 of the destination buffer, or is naming the buffer enough? The examples below show what I mean. In both cases the goal is to copy the source buffer to the start of the destination buffer.
BYTE *pucInputData; // we have some data here
BYTE ucOutputData[20] = {0};
Code 1
memcpy((void*)&ucOutputData, (void*)pucInputData, 20);
Code 2
memcpy((void*)&ucOutputData[0], (void*)pucInputData, 20);

In your case, considering this a C code snippet, and ucOutputData is an array
memcpy(ucOutputData, pucInputData, 20);
memcpy(&ucOutputData[0], pucInputData, 20);
Both are the same and can be used interchangeably. The name of the array essentially gives you the address of the first element in the array.
Now, as per the very useful discussion in the comments below, it is worth mentioning that
memcpy(&ucOutputData, pucInputData, 20);
will also do the job here, but there is a fundamental difference between the usage of array name and address of array name. Considering the example in the question, for a definition like BYTE ucOutputData[20],
ucOutputData decays to a pointer to the first element of an array of 20 BYTEs (type BYTE *).
&ucOutputData is a pointer to an array of 20 BYTEs.
So, they are of different types, and C respects the type of an expression. Hence, to avoid any possible misuse and misconception, the recommended and safe way is to use either of the first two expressions.
FWIW, the casts here are really not needed. Any object pointer type can be implicitly and safely converted to void * in C.
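The type difference described above can be made visible with pointer arithmetic. What follows is only a minimal sketch: it assumes BYTE is a typedef for unsigned char (the question does not show its definition), and the helper function names are hypothetical.

```c
#include <stddef.h>

typedef unsigned char BYTE;   /* assumption: BYTE is not defined in the question */

/* Returns 1 if all three spellings yield the same starting address. */
int same_start_address(void)
{
    BYTE ucOutputData[20] = {0};
    void *a = ucOutputData;       /* array decays to BYTE *       */
    void *b = &ucOutputData[0];   /* BYTE *                       */
    void *c = &ucOutputData;      /* BYTE (*)[20]                 */
    return a == b && b == c;
}

/* Bytes skipped by "+ 1" on the pointer to the first element. */
size_t element_stride(void)
{
    BYTE ucOutputData[20] = {0};
    return (size_t)((char *)(ucOutputData + 1) - (char *)ucOutputData);
}

/* Bytes skipped by "+ 1" on the pointer to the whole array. */
size_t array_stride(void)
{
    BYTE ucOutputData[20] = {0};
    return (size_t)((char *)(&ucOutputData + 1) - (char *)&ucOutputData);
}
```

The addresses compare equal, but stepping the element pointer moves by one BYTE while stepping the array pointer moves by the whole 20-byte array, which is why the two should not be treated as interchangeable outside of calls like memcpy.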

No, both of your examples are sub-optimal.
Remember that all data pointers in C convert to/from void * (which is the type of the first argument to memcpy()) without loss of information and that no cast is necessary to do so.
Also remember that the name of an array evaluates to the address of the first element in many contexts, such as here.
Also remember to use sizeof when you can, never introduce a literal constant when you don't have to.
So, the copy should just be:
memcpy(ucOutputData, pucInputData, sizeof ucOutputData);
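Note that sizeof is used without parentheses here: it is an operator, not a function, and parentheses are only required around a type name. It is applied to the destination buffer, which seems the safer choice. This works because ucOutputData is an array in scope, not a pointer.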
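As a rough illustration of the sizeof-on-destination advice, the sketch below wraps the copy in a hypothetical helper. It assumes BYTE is unsigned char and that the caller really supplies at least 20 bytes of input; neither is stated in the question.

```c
#include <string.h>

typedef unsigned char BYTE;   /* assumption: BYTE is not defined in the question */

/* Copies exactly sizeof ucOutputData bytes from pucInputData and
   returns the last byte copied. The caller must provide at least
   20 readable bytes (an assumption of this sketch). */
BYTE copy_and_get_last(const BYTE *pucInputData)
{
    BYTE ucOutputData[20];   /* no initializer: every byte is overwritten */
    memcpy(ucOutputData, pucInputData, sizeof ucOutputData);
    return ucOutputData[sizeof ucOutputData - 1];
}
```

If the array size is later changed, the memcpy length follows automatically because no literal 20 appears in the call.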

Since an expression &array[0] is the same as array, and because any pointer can be implicitly converted to void*, you should do this instead:
memcpy(ucOutputData, pucInputData, 20);
Moreover, since you are writing over the entire ucOutputData, you do not need to zero out its content, so it's OK to drop the initializer:
BYTE ucOutputData[20]; // no "= {0}" part

An array name decays to a pointer implicitly, and the address of the array is the same as the address of its first element, so in the snippet below the three assignments to p all result in the same value; p will point to the beginning of the array. No explicit cast is needed because the conversion to void * is implicit.
typedef char BYTE;
BYTE ucOutputData[20] = {0};
void *p = &ucOutputData;
p = ucOutputData;
p = &ucOutputData[0];

Related

If array types have no = operator, I understand that, but why does casting a pointer/array to a pointer to array not work as expected?

Why does this code not work the way I expect?
char *c="hello";
char *x=malloc(sizeof(char)*5+1);
memcpy(x,(char(*)[2])c,sizeof("hello"));
printf("%s\n",x);
On this question I got the comment: "You cannot cast a pointer to an array. But you can cast it to a pointer to array. Try (char*[2])c". So I am casting to a pointer to an array of two char, expecting to get only the first two characters from c, because that is what (char(*)[2])c is supposed to do. If not, what am I missing? I expected junk after the first two characters because I did not call memset. Why do I get the full "hello" written by memcpy even though I cast the source to (char(*)[2])?
How can I extract a specific range of characters from a string by casting to an array type? Why can't it be done?
Converting a pointer does not change the memory the pointer points to. Converting the c to char [2] or char (*)[2] will not separate two characters from c.
c is char * that points to the first character of "hello".
(char (*)[2]) c says to take that address and convert it to the type “pointer to an array of 2 char”. The result points to the same address as before; it just has a different type. (There are some technical C semantic issues involved in type conversions and aliasing, but I will not discuss those in this answer.)
memcpy(x,(char(*)[2])c,sizeof("hello")); passes that address to memcpy. Due to the declaration of memcpy, that address is automatically converted to const void *. So the type is irrelevant (barring the technical issues mentioned above); whether you pass the original c or the converted (char (*)[2]) c, the result is a const void * to the same address.
sizeof "hello" is 6, because "hello" creates an array that contains six characters, including the terminating null character. So memcpy copies six bytes from "hello" into x.
Then x[5]='\0'; is redundant because the null character is already there.
To copy n characters from position p in a string, use memcpy(x, c + p, n);. In this case, you will need to manually append a null character if it is not included in the n characters. You may also need to guard against going beyond the end of the string pointed to by c.
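The advice above (copy from c + p, append the terminator yourself, and guard against running past the end) could be sketched as follows. The function name copy_range and its clamping behavior are assumptions chosen for illustration, not something from the original answer.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated, null-terminated copy of up to n characters
   of c, starting at position p. Out-of-range requests are clamped to the
   end of the string. Returns NULL on allocation failure; the caller
   frees the result. (copy_range is a hypothetical helper.) */
char *copy_range(const char *c, size_t p, size_t n)
{
    size_t len = strlen(c);
    if (p > len)
        p = len;              /* start past the end: copy nothing */
    if (n > len - p)
        n = len - p;          /* never read beyond the string */

    char *x = malloc(n + 1);
    if (x == NULL)
        return NULL;
    memcpy(x, c + p, n);
    x[n] = '\0';              /* memcpy did not copy a terminator */
    return x;
}
```

For example, copy_range("hello", 0, 2) yields "he", which is what the cast to char (*)[2] was expected to produce but cannot.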

C conventions - how to use memset on array field of a struct

I would like to settle an argument about proper usage of memset when zeroing an array field in a struct (language is C).
Let say that we have the following struct:
struct my_struct {
    int a[10];
};
Which of the following implementations is more correct?
Option 1:
void f (struct my_struct * ptr) {
    memset(&ptr->a, 0, sizeof(ptr->a));
}
Option 2:
void f (struct my_struct * ptr) {
    memset(ptr->a, 0, sizeof(ptr->a));
}
Notes:
If the field were of a primitive type (such as int) or another struct, option 2 would not work; if it were a pointer (int *), option 1 would not work.
Please advise,
For a non-compound type, you would not use memset at all, because direct assignment would be easier and potentially faster. It would also allow for compiler optimizations a function call does not.
For arrays, variant 2 works, because an array is implicitly converted to a pointer to its first element in most expressions.
For pointers, note that in variant 2 the value of the pointer is used, not the address of the pointer itself, while for an array, a pointer to its first element is used.
Variant 1 yields the address of the object itself. For a pointer, that is the address of the pointer (whether this "works" depends on your intention); for an array, it is the address of the array, which is always the same as the address of its first element, although the type differs (irrelevant here, as memset takes void * and internally treats it as unsigned char *).
So: it depends. For an array, I do not see much difference, except that the address operator might confuse readers not so familiar with operator precedence (and it is more to type). As a personal opinion: I prefer the simpler syntax, but would not complain about the other.
Note that memset with any value other than 0 rarely makes sense for anything but character arrays, because the value is written to every byte. Even with 0, an array of pointers is not guaranteed to end up holding null pointers, since all-bits-zero need not be the null pointer representation.
IMO, option 1 is preferable because the same pattern works for any object, not just arrays:
memset(&obj, 0, sizeof obj);
You can tell just from this statement that it does not cause a buffer overflow -- i.e. does not access out of bounds. It's still possible that this doesn't do what was intended (e.g. if obj is a pointer and it was intended to set what the pointer was pointing to), but at least the damage is contained.
However if you accidentally use memset(p, 0, sizeof p) on a pointer then you may write past the end of the object being pointed to; or if the object is bigger than sizeof p, you leave the object in a weird state.
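To make the comparison concrete, here is a small sketch of both options applied to the struct from the question. The function names and the all_zero checker are hypothetical; the point is only that both spellings clear the same bytes as long as sizeof is applied to the array member itself.

```c
#include <string.h>

struct my_struct {
    int a[10];
};

/* Option 1: address of the array, type int (*)[10]. */
void zero_with_address(struct my_struct *ptr)
{
    memset(&ptr->a, 0, sizeof ptr->a);
}

/* Option 2: array decays to a pointer to its first element, type int *. */
void zero_with_decay(struct my_struct *ptr)
{
    memset(ptr->a, 0, sizeof ptr->a);
}

/* Returns 1 if every element of the array member is zero. */
int all_zero(const struct my_struct *ptr)
{
    for (size_t i = 0; i < sizeof ptr->a / sizeof ptr->a[0]; i++)
        if (ptr->a[i] != 0)
            return 0;
    return 1;
}
```

In both functions sizeof ptr->a is the size of the whole int[10] member, so neither call can write outside it.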

Getting length of an array

I've been wondering how to get the number of elements of an array. Somewhere in this website I found an answer which told me to declare the following macro:
#define NELEMS(x) (sizeof(x) / sizeof(x[0]))
It works well for arrays defined as:
type arr[];
but not for the following:
type *arr = (type) malloc(32*sizeof(type));
it returns 1 in that case (it's supposed to return 32).
I would appreciate some hint on that
Pointers do not keep information about whether they point to a single element or the first element of an array.
So if you have a statement like this
type *arr = (type) malloc(32*sizeof(type));
then arr is not an array here. It is a pointer to the beginning of the dynamically allocated memory extent.
Or even if you have the following declarations
type arr[10];
type *p = arr;
then again the pointer knows nothing about whether it points to a single object or the first element of an array. You can at any time write, for example,
type obj;
p = &obj;
So when you deal with pointers that point to first elements of arrays you have to keep somewhere (in some other variable) the actual size of the referenced array.
As for arrays themselves then indeed you may use expression
sizeof( arr ) / sizeof( *arr )
or
sizeof( arr ) / sizeof( arr[0] )
But arrays are not pointers, though with rare exceptions they are converted to pointers to their first elements. The sizeof operator is one such exception: arrays used as operands of sizeof are not converted to pointers to their first elements.
The sizeof operator produces the size of the type of its operand. It does not count the amount of memory that a pointer points to.
To elaborate,
in case of type arr[32];, sizeof (arr) is essentially sizeof(type[32]).
in case of type *arr;, sizeof(arr) is essentially sizeof(type*)
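The two cases above can be checked directly. This is a minimal sketch using int in place of the generic type; the function names are hypothetical, and the macro is the one from the question with an extra pair of parentheses around x.

```c
#include <stddef.h>

#define NELEMS(x) (sizeof(x) / sizeof((x)[0]))

/* The array is in scope, so sizeof sees the type int[32]. */
size_t count_from_array(void)
{
    int arr[32];
    return NELEMS(arr);
}

/* Only a pointer is visible, so sizeof sees the type int *. */
size_t size_seen_through_pointer(void)
{
    int arr[32];
    int *p = arr;
    return sizeof(p);   /* sizeof(int *), not 32 * sizeof(int) */
}
```

Applying NELEMS to p instead of arr would divide sizeof(int *) by sizeof(int), which has nothing to do with the number of elements.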
To get the length of a string, you need to use strlen().
Remember, the definition of string is a null-terminated character array.
That said, in your code,
type *arr = (type) malloc(32*sizeof(type));
is very wrong: the cast is to type rather than type *. To avoid this kind of error, remove the cast. You should not cast the result of malloc and family.
These are the main reasons for not casting the returned value from malloc (and family of functions):
In C, the return type of those functions is void *. A void * can be assigned to any object pointer type without a cast.
During debugging and maintenance the receiving pointer type is often changed, and the place of that change is often not where malloc is called. If the returned value is cast, the cast then no longer matches and a bug is introduced into the code. This kind of bug can be very difficult to find.
There is no safe and sound way of finding the length of an array in C through a pointer, since no bookkeeping is done for it.
You will need to use some other data structure that does the bookkeeping for you in order to get the correct result every time.
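One common way to do that bookkeeping is to store the element count next to the pointer. The struct and function names below are hypothetical, and int stands in for the generic type; treat it as a sketch rather than a recommended library design.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical wrapper that keeps the element count with the pointer. */
struct int_buffer {
    int   *data;
    size_t count;
};

/* Allocates count ints. On failure, data is NULL and count is 0. */
struct int_buffer int_buffer_create(size_t count)
{
    struct int_buffer buf = { NULL, 0 };
    buf.data = malloc(count * sizeof *buf.data);   /* no cast needed in C */
    if (buf.data != NULL)
        buf.count = count;
    return buf;
}

/* Releases the memory and resets the bookkeeping. */
void int_buffer_destroy(struct int_buffer *buf)
{
    free(buf->data);
    buf->data = NULL;
    buf->count = 0;
}
```

Code that receives a struct int_buffer reads buf.count instead of trying to recover the length with sizeof.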

Clarification about copying an array by referencing a pointer

So I have this array in a header file like this:
// header.h
static const unsigned int array1[]={0x00,0x01,0x02,0x03};
And:
// file.c
int main(void)
{
    const unsigned int *ptrArray;
    ptrArray = &array1[0];
    return 0;
}
Correct me if I am wrong. I assume: to find the number of bytes of array elements, instead of sizeof(array1) the equivalent will be sizeof(*ptrArray), right?
And to access the elements of the array, instead of array[i], it will now be:
*(ptrArray) for the first element,
*(ptrArray+1) for the 2nd element so on right?
The type of *ptrArray is unsigned int, therefore sizeof(*ptrArray) is the same as sizeof(unsigned int). So it won't tell you anything about the number of elements in array1.
Whilst you can write *(ptrArray+1), etc., you should just write ptrArray[1]!
A pointer is not an array, and an array is not a pointer. An array can decay into a pointer when convenient, but it is still a complete type.
So, the type of *someIntPointer is int, not an array, even if that pointer happens to point to the first element in an array. sizeof(someArray) works as you would expect because it knows that the type is actually an array.
sizeof won't behave in the same way for your pointer: your example will give you the size of the datatype: unsigned int.
And while you can use pointer arithmetic to reference elements through ptrArray, you can just as well use standard array subscripting: ptrArray[0], ptrArray[1], ... and in most cases you're better off doing so.
sizeof returns the size of the pointer itself for regular pointer types. If you apply sizeof to a dereferenced pointer, you get the size of one element (i.e. sizeof(unsigned int)). You will need to either keep track of the number of elements in the array yourself, or use sizeof on the array itself where its declaration is visible.
As for accessing, you could do it that way, but you can just use the bracket notation as you would with a normal array.
Arrays are not pointers, but the compiler knows when to treat an array as an array and when to convert it to a pointer: that is how sizeof knows how big an array is, while you can still pass the array to functions that expect a pointer (when you do this, the function receives a pointer to the first element). The same does not work in reverse, however: the compiler will never treat something declared as a pointer as an array.
By the way, [] just simplifies to pointer arithmetic: a[i] is *(a + i). Since addition is commutative, you can add an int to a pointer or a pointer to an int. You can thus (but probably shouldn't) write weird things like 1[ptrArray].
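Putting the answers together: compute the element count where the array type is visible, then pass the count along with the pointer. The sketch below reuses array1 from the question; array1_count and sum_elements are hypothetical names.

```c
#include <stddef.h>

static const unsigned int array1[] = {0x00, 0x01, 0x02, 0x03};

/* The count must be computed where array1 is visible as an array. */
size_t array1_count(void)
{
    return sizeof array1 / sizeof array1[0];
}

/* Sums count elements through a pointer; the count travels separately
   because the pointer carries no length information. */
unsigned int sum_elements(const unsigned int *ptrArray, size_t count)
{
    unsigned int total = 0;
    for (size_t i = 0; i < count; i++)
        total += ptrArray[i];      /* same as *(ptrArray + i) */
    return total;
}
```

A call such as sum_elements(array1, array1_count()) works on the whole array, while sum_elements(&array1[0], 2) only looks at the first two elements.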

memcpy fails but assignment doesn't on character pointers

Actually, memcpy works just fine when I use pointers to characters, but stops working when I use pointers to pointers to characters.
Can somebody please help me understand why memcpy fails here, or better yet, how I could have figured it out myself. I am finding it very difficult to understand the problems arising in my c/c++ code.
char *pc = "abcd";
char **ppc = &pc;
char **ppc2 = &pc;
setStaticAndDynamicPointers(ppc, ppc2);
char c;
c = (*ppc)[1];
assert(c == 'b'); // assertion doesn't fail.
memcpy(&c,&(*ppc[1]),1);
if(c!='b')
puts("memcpy didn't work."); // this gets printed out.
c = (*ppc2)[3];
assert(c=='d'); // assertion doesn't fail.
memcpy(&c, &(*ppc2[3]), 1);
if(c != 'd')
puts("memcpy didn't work again.");
memcpy(&c, pc, 1);
assert(c == 'a'); // assertion doesn't fail, even though used memcpy
void setStaticAndDynamicPointers(char **charIn, char **charIn2)
{
    // sets the first arg to a pointer to static memory.
    // sets the second arg to a pointer to dynamic memory.
    char stat[5];
    memcpy(stat, "abcd", 5);
    *charIn = stat;
    char *dyn = new char[5];
    memcpy(dyn, "abcd", 5);
    *charIn2 = dyn;
}
Your comment implies that char stat[5] should be static, but it isn't. As a result, *charIn points to a block that is allocated on the stack, and when the function returns, that block is out of scope. Did you mean static char stat[5]?
char stat[5];
is a stack variable which goes out of scope; it is not static memory, despite the comment "sets the first arg to a pointer to static memory". You need to malloc/new some memory and put abcd into it, like you do for charIn2.
Just like what Preet said, I don't think the problem is with memcpy. In your function "setStaticAndDynamicPointers", you are setting a pointer to an automatic variable created on the stack of that function call. By the time the function exits, the memory pointed to by the "stat" variable will no longer exist. As a result, the first argument *charIn will point to something that no longer exists. You may want to read in greater detail about stack frames (or activation records).
You have effectively created a dangling pointer to a stack variable in that code. If you want to test copying values into a stack var, make sure it's created in the caller function, not within the called function.
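One way to follow that advice is to let the caller own the buffer and have the function only fill it. The sketch below is written in C and uses a hypothetical function name, fill_buffer, in place of the stack-buffer half of setStaticAndDynamicPointers.

```c
#include <stddef.h>
#include <string.h>

/* Fills a caller-owned buffer with "abcd" (including the terminator).
   Because the caller owns the storage, it stays valid after the call.
   Returns 0 on success, -1 if the buffer is too small.
   (fill_buffer is a hypothetical replacement, not the original code.) */
int fill_buffer(char *buf, size_t size)
{
    static const char text[] = "abcd";
    if (size < sizeof text)
        return -1;
    memcpy(buf, text, sizeof text);   /* 5 bytes: 4 chars + '\0' */
    return 0;
}
```

The caller would declare char stat[5]; in its own scope, call fill_buffer(stat, sizeof stat), and only then take pointers to it.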
In addition to the definition of 'stat', the main problem in my eyes is that *ppc[3] is not the same as (*ppc)[3]. What you want is the latter (the fourth character from the string pointed to by ppc), but in your memcpy()s you use the former, the first character of the fourth string in the "string array" ppc (obviously ppc is not an array of char*, but you force the compiler to treat it as such).
When debugging such problems, I usually find it helpful to print the memory addresses and contents involved.
Note that the parentheses in the expressions in your assignment statements are in different locations from the parentheses in the memcpy expressions. So it's not too surprising that they do different things.
When dealing with pointers, you have to keep the following two points firmly in the front of your mind:
#1 The pointer itself is separate from the data it points to. The pointer is just a number. The number tells us where, in memory, we can find the beginning of some other chunk of data. A pointer can be used to access the data it points to, but we can also manipulate the value of the pointer itself. When we increase (or decrease) the value of the pointer itself, we are moving the "destination" of the pointer forward (or backward) from the spot it originally pointed to. This brings us to the second point...
#2 Every pointer variable has a type that indicates what kind of data is being pointed to. A char * points to a char; an int * points to an int; and so on. A pointer can even point to another pointer (char **). The type is important, because when the compiler applies arithmetic operations to a pointer value, it automatically accounts for the size of the data type being pointed to. This allows us to deal with arrays using simple pointer arithmetic:
int values[] = {1,2,3,4};
int *ip = values; // points to the first element
assert( *ip == 1 ); // true
ip += 2; // advances by 2 elements, i.e. 2*sizeof(int) bytes
// (8 bytes where sizeof(int) is 4)
assert(*ip == 3); // true
This works because the array is just a list of identical elements (in this case ints), laid out sequentially in a single contiguous block of memory. The pointer starts out pointing to the first element in the array. Pointer arithmetic then allows us to advance the pointer through the array, element-by-element. This works for pointers of any type (except arithmetic is not allowed on void *).
In fact, this is exactly how the compiler translates the use of the array indexer operator []. It is literally shorthand for a pointer addition with a dereference operator.
assert( ip[2] == *(ip+2) ); // true
So, How is all this related to your question?
Here's your setup...
char *pc = "abcd";
char **ppc = &pc;
char **ppc2 = &pc;
For now, I've simplified by removing the call to setStaticAndDynamicPointers. (There's a problem in that function too, so please see @Nim's answer, and my comment there, for additional details about the function.)
char c;
c = (*ppc)[1];
assert(c == 'b'); // assertion doesn't fail.
This works because (*ppc) says "give me whatever ppc points to". That's the equivalent of ppc[0]. It's all perfectly valid.
memcpy(&c,&(*ppc[1]),1);
if(c!='b')
puts("memcpy didn't work."); // this gets printed out.
The problematic part —as others have pointed out— is &(*ppc[1]), which taken literally means "give me a pointer to whatever ppc[1] points to."
First of all, let's simplify... operator precedence says that: &(*ppc[1]) is the same as &*ppc[1]. Then & and * are inverses and cancel each other out. So &(*ppc[1]) simplifies to ppc[1].
Now, given the above discussion, we're now equipped to understand why this doesn't work: In short, we're treating ppc as though it points to an array of pointers, when in fact it only points to a single pointer.
When the compiler encounters ppc[1], it applies the pointer arithmetic described above, and comes up with a pointer to the memory that immediately follows the variable pc -- whatever that memory may contain. (The behavior here is always undefined).
So the problem isn't with memcpy() at all. Your call to memcpy(&c,&(*ppc[1]),1) is dutifully copying 1 byte (as requested) from the memory that's pointed to by the bogus pointer ppc[1], and writing it into the character variable c.
As others have pointed out, you can fix this by moving your parentheses around:
memcpy(&c,&((*ppc)[1]),1)
I hope the explanation was helpful. Good luck!
Although the previous answers raise valid points, I think the other thing you need to look at is your operator precedence rules when you memcpy:
memcpy(&c, &(*ppc2[3]), 1);
What happens here? It might not be what you intended. The array subscript operator has higher precedence than the dereference operator, so ppc2[3] is evaluated first, which is pointer arithmetic equivalent to *(ppc2 + 3). You then dereference that value and pass the resulting address into memcpy. This is not the same as (*ppc2)[3]. The result on my machine is an access violation error (XP/VS2005), but in general this is undefined behaviour. However, if you dereference the same way you did previously:
memcpy(&c, &((*ppc2)[3]), 1);
Then that access violation goes away and I get proper results.
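The difference between the two spellings becomes observable without undefined behaviour if the pointer really does point into an array of strings. The sketch below uses a hypothetical two-element array for that purpose; it is not the setup from the question, where ppc points to a single pointer.

```c
#include <string.h>

/* Hypothetical data: a real array of strings, so both index
   expressions below stay inside valid memory. */
static const char *strings[] = { "abcd", "wxyz" };

/* (*pp)[1]: dereference first, then index.
   Second character of the first string. */
char second_char_of_first(void)
{
    const char **pp = strings;
    char c;
    memcpy(&c, &(*pp)[1], 1);
    return c;
}

/* *pp[1]: index first, then dereference.
   First character of the second string. */
char first_char_of_second(void)
{
    const char **pp = strings;
    char c;
    memcpy(&c, &*pp[1], 1);
    return c;
}
```

With the data above, the first function reads the 'b' from "abcd" and the second reads the 'w' from "wxyz", so memcpy behaves identically in both; only the address expression differs.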

Resources