Getting length of an array - c

I've been wondering how to get the number of elements of an array. Somewhere in this website I found an answer which told me to declare the following macro:
#define NELEMS(x) (sizeof(x) / sizeof(x[0]))
It works well for arrays defined as:
type arr[];
but not for the following:
type *arr = (type) malloc(32*sizeof(type));
it returns 1 in that case (it's supposed to return 32).
I would appreciate some hint on that

Pointers do not keep information about whether they point to a single element or the first element of an array.
So if you have a statement like this
type *arr = (type) malloc(32*sizeof(type));
then arr is not an array. It is a pointer to the beginning of the dynamically allocated memory extent.
Or even if you have the following declarations
type arr[10];
type *p = arr;
then again the pointer knows nothing about whether it points to a single object or the first element of an array. You can at any time write, for example,
type obj;
p = &obj;
So when you deal with pointers that point to first elements of arrays you have to keep somewhere (in some other variable) the actual size of the referenced array.
As for arrays themselves, you may indeed use the expression
sizeof( arr ) / sizeof( *arr )
or
sizeof( arr ) / sizeof( arr[0] )
But arrays are not pointers, though they are very often converted to pointers to their first elements, with rare exceptions. The sizeof operator is one such exception: arrays used as the operand of sizeof are not converted to pointers to their first elements.
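To make the sizeof exception concrete, here is a minimal sketch. The function names are hypothetical and int stands in for the placeholder type used above. The first function applies the macro to a real array; the second shows what sizeof sees after the array has been converted to a pointer.

```c
#include <stddef.h>

#define NELEMS(x) (sizeof(x) / sizeof((x)[0]))

/* sizeof sees the whole array object, so the macro yields the element count. */
size_t nelems_of_array(void)
{
    int arr[32] = {0};
    return NELEMS(arr);          /* sizeof(int[32]) / sizeof(int) */
}

/* Once the array has been converted to a pointer, sizeof only sees the pointer. */
size_t size_seen_through_pointer(void)
{
    int arr[32] = {0};
    int *p = arr;                /* array-to-pointer conversion happens here */
    return sizeof p;             /* same as sizeof(int *), unrelated to 32 */
}
```

The second result depends on the platform's pointer size, which is why the macro cannot recover the element count from a pointer.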

The sizeof operator produces the size of the type of its operand. It does not report the amount of memory allocated behind a pointer (representing the array).
To elaborate,
in case of type arr[32];, sizeof (arr) is essentially sizeof(type[32]).
in case of type *arr;, sizeof(arr) is essentially sizeof(type*)
To get the length of a string, you need to use strlen().
Remember, the definition of string is a null-terminated character array.
That said, in your code,
type *arr = (type) malloc(32*sizeof(type));
is very wrong: the cast converts the returned pointer to type, not to type *. To avoid this kind of error, do not cast the result of malloc().

And remove the cast. You should not cast the result of malloc and family.
These are the main reasons for not casting the returned value from malloc (and family of functions).
In C, the return type of those functions is 'void*'. A void * can be assigned to any object pointer type without a cast.
During debugging and during maintenance the receiving pointer type is often changed. The origin of that change is often not where the malloc function is called. If the returned value is cast, then a bug is introduced to the code. This kind of bug can be very difficult to find.
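A cast-free allocation might look like the sketch below. The helper name alloc_ints is hypothetical, and int stands in for the question's placeholder type. Writing sizeof *arr instead of sizeof(int) means the size expression follows the pointer's type if that type is changed later.

```c
#include <stdlib.h>

/* Allocates room for n ints without casting malloc's result.
   Returns NULL if the allocation fails. */
int *alloc_ints(size_t n)
{
    int *arr = malloc(n * sizeof *arr);   /* void * converts implicitly */
    return arr;
}
```

The caller still has to check for NULL and call free() when done.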

There is no safe and sound way of finding the length of a dynamically allocated array in C from the pointer alone, since the language exposes no bookkeeping for it.
You will need to use some other data structure that does the bookkeeping for you in order to get the correct result every time.
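One possible shape for such a bookkeeping structure is sketched below. The struct and function names are hypothetical and this is only one of many reasonable designs: the pointer and its element count simply travel together.

```c
#include <stdlib.h>

/* A pointer together with the number of elements it refers to. */
struct int_buffer {
    int   *data;
    size_t len;
};

/* Returns a buffer of n zeroed ints; data is NULL and len is 0 on failure. */
struct int_buffer int_buffer_create(size_t n)
{
    struct int_buffer b = { NULL, 0 };
    b.data = calloc(n, sizeof *b.data);
    if (b.data != NULL)
        b.len = n;
    return b;
}

/* Releases the storage and resets the bookkeeping. */
void int_buffer_destroy(struct int_buffer *b)
{
    free(b->data);
    b->data = NULL;
    b->len = 0;
}
```

With this, code that receives a struct int_buffer reads b.len instead of trying to apply sizeof to a pointer.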

Related

Trouble incrementing and decrementing a malloced array of multiple data types in C

In my Computer Science course, we have been taught a method of storing a value in the 0th element of a malloced array, then incrementing the array so that things such as the size of the array can be stored in that element and retrieved later. I have tried using a modified version of this method to store various datatypes in these incremented elements.
Here is an example of how such an array is created:
int *array;
array = malloc(sizeof(int) + sizeof(double) + (n * sizeof(int)))
*(array) = n;
array++;
(double*)array++;
return array;
In this example, the sizeof(int) and sizeof(double) in the malloc statement are the elements that will store things, such as the size of the array in the int element, and in the double element we can store something like the average of all the numbers in the array (excluding these two elements of course)
(n * sizeof(int)) is for creating the rest of the elements in the array, where n is the number of elements, and sizeof(int) is the desired data type for these elements, and in theory this should work for an array of any data type.
Now, here is the trouble I am having:
I have created another function to retrieve the size of the array, but I am having trouble decrementing and incrementing the array. Here is my code:
getArraySize(void* array){
(double*)array--;//Decrement past the double element
(int*)array--;//Decrement past the int element
int size = *((int*)array);//Acquire size of the array
(int*)array++;//Increment past int element
(double*)array++;//Increment past the double element
return size;}
This function fails to get the size of the array, and I have realized it is because the compiler first increments the array then type casts it. However, when i try to fix such increment/decrement statements as follows:
((int*)array)++;
I get an error that says lvalue required as increment operand. I do not know how to fix this notation in such a way that it will increment and decrement correctly. Any suggestions would be much appreciated.
In my Computer Science course, we have been taught a method of storing a value in the 0th element of a malloced array, then incrementing the array so that things such as the size of the array can be stored in that element and retrieved later.
Sorry to hear that, since this is utter nonsense. Use struct instead.
What's worse than the task being nonsense, however, is that it also invokes undefined behavior (see the C standard 6.5.6). You cannot do pointer arithmetic with pointers that are not pointing to an array with the same type as the pointer itself.
In addition, this may lead to misaligned access. Depending on CPU, misalignment could cause needlessly slow code or instruction traps leading to a program crash. Misaligned access is also undefined behavior.
Also, storing the result of various operations on a data type, such as average, inside the data type itself doesn't make any sense at all. They would have to be updated as soon as a value changes, which causes needless bloat and ineffective code.
Forget about all this nonsense immediately. Your program cannot get fixed or repaired, since the very idea behind it is fundamentally wrong. Do this instead:
typedef struct
{
int i;
double d;
int array[];
} something;
something* s = malloc(sizeof(something) + sizeof(int[n]));
s->i = ...;
s->d = ...;
for(int i=0; i<n; i++)
s->array[i] = ...;
...
free(s);
Specifically, your code invokes undefined behavior per C17 6.5.6 §7 and §8:
For the purposes of these operators, a pointer to an object that is not an element of an
array behaves the same as a pointer to the first element of an array of length one with the
type of the object as its element type.
When an expression that has integer type is added to or subtracted from a pointer, the
result has the type of the pointer operand. /--/
If both the pointer operand and the result point to elements of the same array object... /--/ ...otherwise, the
behavior is undefined.
There is also the issue of pointer aliasing, but (by luck?) it doesn't apply in this case, since data allocated on the heap doesn't have an "effective type" until written to. As long as you write to a specific address with the same pointer type, it is not undefined behavior.
The relevant part regarding misalignment is C17 6.3.2.3/7:
A pointer to an object type may be converted to a pointer to a different object type. If the resulting pointer is not correctly aligned for the referenced type, the behavior is undefined.
What you can do to reach your goal, and which is (in my opinion) more readable anyway:
array -= sizeof(double); // get to position where double starts
array -= sizeof(int); // get to position where int starts
NOTE
This only works on some compilers (arithmetic on void* is a compiler extension, not standard C) and only within getArraySize, since you cast the array pointer to void*. So this is also not advisable at all.
But I really think that this is NOT the way to go, and I also recommend using a struct instead, as @Lundin points out.
If you call your getArraySize function with any other pointer or with a pointer of the expected array but not at the right position, it will most likely end up in segmentation faults
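For comparison, here is a hedged sketch of how the size lookup could work with the struct approach recommended above. The type and function names are hypothetical, and the header fields are just the ones the question mentions (a size and an average). No pointer is ever moved backwards, so the arithmetic and alignment problems do not arise.

```c
#include <stdlib.h>

typedef struct {
    size_t size;       /* number of elements in values[] */
    double average;    /* maintained by the caller, if wanted at all */
    int    values[];   /* flexible array member (C99) */
} int_array;

/* Allocates the header and n elements in one block; NULL on failure. */
int_array *int_array_create(size_t n)
{
    int_array *a = malloc(sizeof *a + n * sizeof a->values[0]);
    if (a != NULL) {
        a->size = n;
        a->average = 0.0;
        for (size_t i = 0; i < n; i++)
            a->values[i] = 0;
    }
    return a;
}

/* The size is an ordinary member: no casts or pointer stepping needed. */
size_t int_array_size(const int_array *a)
{
    return a->size;
}
```

The whole object is released with a single free(a).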

Can malloc() be used to define the size of an array?

Here consider the following sample of code:
int *a = malloc(sizeof(int) * n);
Can this code be used to define an array a containing n integers?
int *a = malloc(sizeof(int) * n);
Can this code be used to define an array a containing n integers?
That depends on what you mean by "define an array".
A declaration like:
int arr[10];
defines a named array object. Your pointer declaration and initialization does not.
However, the malloc call (if it succeeds and returns a non-NULL result, and if n > 0) will create an anonymous array object at run time.
But it does not "define an array a". a is the name of a pointer object. Given that the malloc call succeeds, a will point to the initial element of an array object, but it is not itself an array.
Note that, since the array object is anonymous, there's nothing to which you can apply sizeof, and no way to retrieve the size of the array object from the pointer. If you need to know how big the array is, you'll need to keep track of it yourself.
(Some of the comments suggest that the malloc call allocates memory that can hold n integer objects, but not an array. If that were the case, then you wouldn't be able to access the elements of the created array object. See N1570 6.5.6p8 for the definition of pointer addition, and 7.22.3p1 for the description of how a malloc call can create an accessible array.)
int *a = malloc(sizeof(int) * n);
Assuming the malloc() call succeeds, you can use the pointer a like an array using array notation (e.g. a[0] = 5;). But a is not an array itself; it's just a pointer to an int (which may be the first int in a block of memory that can store multiple ints).
Your comment
But I can use an array a in my program with no declaration otherwise
suggests this is what you are mainly asking about.
In C language,
p[i] == *(p + i) == *(i + p) == i[p]
as long as one of i or p is of pointer type and the other is an integer (p can be an array as well, as it'd be converted into a pointer in such an expression). Hence, you'd be able to index a like you'd access an array. But a is actually a pointer.
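The identity above can be checked with a small sketch (the function name is hypothetical). It compares addresses as well as values, so it only returns 1 if every spelling designates the same element.

```c
#include <stddef.h>

/* Returns 1 when p[i], *(p + i), *(i + p) and i[p] all designate
   the same element; p must point into an array with more than i elements. */
int subscript_forms_agree(const int *p, size_t i)
{
    return &p[i] == p + i
        && &i[p] == i + p
        && p[i] == i[p];
}
```

It works the same whether the argument is an array name (converted to a pointer at the call) or a pointer returned by malloc().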
Yes. That is exactly what malloc() does.
The important distinction is that
int array[10];
declares array as an array object with enough room for 10 integers. In contrast, the following:
int *pointer;
declares pointer as a single pointer object.
It is important to distinguish that one of them is a pointer and the other is an actual array, and that arrays and pointers are closely related but are different things. However, saying that there is no array in the following is also incorrect:
pointer = malloc(sizeof (int) * 10);
Because what this piece of code does is precisely to allocate an array object with room for 10 integers. The pointer pointer contains the address of the first element of that array. (C99 draft, section 7.20.3 "Memory management functions")
Interpreting your question very literally, the answer is No: To "define an array" means something quite specific; an array definition looks something like:
int a[10];
Whereas what you have posted is a memory allocation. It allocates a space suitable for holding an array of 10 int values, and stores a pointer to the first element within this space - but it doesn't define an array; it allocates one.
With that said, you can use the array element access operator, [], in either case. For instance the following code snippets are legal:
int a[10];
for (int i = 0; i < 10; i++) a[i] = 0;
and
int *a = malloc(sizeof(int) * n);
for (int i = 0; i < n; i++) a[i] = 0;
There is a subtle difference between what they do however. The first defines an array, and sets all its elements to 0. The second allocates storage which can hold an equivalently-typed array value, and uses it for this purpose by initialising each element to 0.
It is worth pointing out that the second example does not check for an allocation error, which is generally considered bad practice. Also, it constitutes a potential memory leak if the allocated storage is not later freed.
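A version of the second snippet that does check for allocation failure could look like this sketch. The helper name make_zeroed is hypothetical, and treating n == 0 as a failure is a choice made here, not something the standard requires.

```c
#include <stdlib.h>

/* Returns n ints set to 0, or NULL if n is 0 or the allocation fails.
   The caller owns the result and must free() it. */
int *make_zeroed(size_t n)
{
    if (n == 0)
        return NULL;

    int *a = malloc(n * sizeof *a);
    if (a == NULL)
        return NULL;             /* allocation error reported to the caller */

    for (size_t i = 0; i < n; i++)
        a[i] = 0;
    return a;
}
```

Pairing every successful call with exactly one free() addresses the leak mentioned above.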
In the language the Standard was written to describe (as distinct from the language that would be described by a pedantic literal reading of it), the intention was that malloc(n) would return a pointer that, if cast to a T*, could be treated as a pointer to the first element of a T[n/sizeof(T)]. Per N1570 7.22.3:
The pointer returned if the allocation succeeds is suitably aligned so that it may be assigned to a pointer to any type of object with a fundamental alignment requirement and then used to access such an object or an array of such objects in the space allocated (until the space is explicitly deallocated).
The definitions of pointer addition and subtraction, however, do not speak of acting upon pointers that are "suitably aligned" to allow access to arrays of objects, but rather speak of pointers to elements of actual array objects. If a program allocates space for 20 int objects, I don't think the Standard actually says that the resulting pointer would behave in all respects as though it were a pointer to element [0] of an int[20], as distinct from e.g. a pointer to element [0][0] of an int[4][5]. An implementation would have to be really obtuse not to allow it to be used as either, of course, but I don't think the Standard actually requires such treatment.

memcpy start index really needed?

The question is: when copying a byte array using memcpy(), must we explicitly specify the starting (0th) index of the destination buffer, or does simply naming the buffer suffice? Let me show examples of what I'm talking about, assuming we are trying to copy the source buffer to the start of the destination buffer.
BYTE *pucInputData; // we have some data here
BYTE ucOutputData[20] = {0};
Code 1
memcpy((void*)&ucOutputData, (void*)pucInputData, 20);
Code 2
memcpy((void*)&ucOutputData[0], (void*)pucInputData, 20);
In your case, considering this a C code snippet, and ucOutputData is an array
memcpy(ucOutputData, pucInputData, 20);
memcpy(&ucOutputData[0], pucInputData, 20);
both are the same and can be used interchangeably. The name of the array essentially gives you the address of the first element in the array.
Now, as per the very useful discussion in the comments below, it is worth mentioning that
memcpy(&ucOutputData, pucInputData, 20);
will also do the job here, but there is a fundamental difference between the usage of array name and address of array name. Considering the example in the question, for a definition like BYTE ucOutputData[20],
ucOutputData evaluates to a pointer to the first element of an array of 20 BYTEs.
&ucOutputData is a pointer to an array of 20 BYTEs.
So, they are of different types, and C respects the type of the variable. Hence, to avoid any possible misuse and misconception, the recommended and safe way is to use either of the first two expressions.
FWIW, the casts here are really not needed. Any object pointer type can be implicitly and safely converted to void * in C.
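The type difference becomes visible as soon as pointer arithmetic is involved. The sketch below assumes BYTE is unsigned char (the question does not show its definition) and uses hypothetical function names. It measures how many bytes adding 1 moves each pointer.

```c
#include <stddef.h>

typedef unsigned char BYTE;   /* assumption: the question does not define BYTE */

/* Bytes advanced by adding 1 to the array name (a BYTE * after conversion). */
size_t step_from_array_name(void)
{
    BYTE ucOutputData[20] = {0};
    return (size_t)((char *)(ucOutputData + 1) - (char *)ucOutputData);
}

/* Bytes advanced by adding 1 to the array's address (a BYTE (*)[20]). */
size_t step_from_array_address(void)
{
    BYTE ucOutputData[20] = {0};
    return (size_t)((char *)(&ucOutputData + 1) - (char *)&ucOutputData);
}
```

Both start at the same address, but the first steps over one BYTE and the second over the whole 20-byte array. memcpy() only sees the starting address, which is why all three spellings happen to work there.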
No, both of your examples are sub-optimal.
Remember that all data pointers in C convert to/from void * (which is the type of the first argument to memcpy()) without loss of information and that no cast is necessary to do so.
Also remember that the name of an array evaluates to the address of the first element in many contexts, such as here.
Also remember to use sizeof when you can, never introduce a literal constant when you don't have to.
So, the copy should just be:
memcpy(ucOutputData, pucInputData, sizeof ucOutputData);
Note that we use sizeof without parentheses, it's not a function. Also we use it on the destination buffer, which seems the safer choice.
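Put together, the recommendation could be sketched as follows. BYTE is assumed to be unsigned char and the function name is hypothetical. The caller must guarantee that the source holds at least as many bytes as the destination array.

```c
#include <string.h>

typedef unsigned char BYTE;   /* assumption: the question does not define BYTE */

/* Copies 20 bytes from pucInputData into a local buffer and returns the
   last byte copied, just so the result can be inspected. */
BYTE copy_and_get_last(const BYTE *pucInputData)
{
    BYTE ucOutputData[20];

    /* No casts, no &...[0], and the length comes from the destination. */
    memcpy(ucOutputData, pucInputData, sizeof ucOutputData);

    return ucOutputData[sizeof ucOutputData - 1];
}
```

If the array's length is ever changed, the memcpy() call adapts without further edits. Note that sizeof only gives the array size here because ucOutputData is a real array in scope, not a pointer parameter.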
Since an expression &array[0] is the same as array, and because any pointer can be implicitly converted to void*, you should do this instead:
memcpy(ucOutputData, pucInputData, 20);
Moreover, since you are writing over the entire ucOutputData, you do not need to zero out its content, so it's OK to drop the initializer:
BYTE ucOutputData[20]; // no "= {0}" part
A native array can decay to a pointer without conversion, so in the snippet below, the three assignments to p all result in the same value; p will point to the beginning of the array. No explicit cast is needed because casting to void* is implicit.
typedef char BYTE;
BYTE ucOutputData[20] = {0};
void *p = &ucOutputData;
p = ucOutputData;
p = &ucOutputData[0];

Clarification about copying an array by referencing a pointer

So I have this array in a header file like this:
// header.h
static const unsigned int array1[]={0x00,0x01,0x02,0x03};
And:
// file.c
main()
{
const unsigned int *ptrArray;
ptrArray = &array1[0];
}
Correct me if I am wrong. I assume: to find the number of bytes of array elements, instead of sizeof(array1) the equivalent will be sizeof(*ptrArray), right?
And to access the elements of the array, instead of array[i], it will now be:
*(ptrArray) for the first element,
*(ptrArray+1) for the 2nd element so on right?
The type of *ptrArray is unsigned int, therefore sizeof(*ptrArray) is the same as sizeof(unsigned int). So it won't tell you anything about the number of elements in array1.
Whilst you can write *(ptrArray+1), etc., you should just write ptrArray[1]!
A pointer is not an array, and an array is not a pointer. An array can decay into a pointer when convenient, but it is still a complete type.
So, the type of *someIntPointer is int, not an array, even if that pointer happens to point to the first element in an array. sizeof(someArray) works as you would expect because it knows that the type is actually an array.
sizeof won't behave in the same way for your pointer: your example will give you the size of the datatype: unsigned int.
And while you can use pointer arithmetic to reference elements through ptrArray, you can just as well use standard array subscripting: ptrArray[0], ptrArray[1], ... and in most cases you're better off doing so.
sizeof will return the size of the pointer for regular pointer types. If you apply sizeof to a dereferenced pointer, you will get the size of the element (i.e. sizeof(unsigned int)). You will need to either keep track of the number of elements in the array yourself, or use sizeof on the array itself where its declaration is visible.
As for accessing, you could do it that way, but you can just use the bracket notation as you would with a normal array.
Arrays behave like a special class of pointer in many contexts. The compiler knows when to treat an array as an array and when to treat it as a pointer: that's how it knows how big an array is, but you can still pass it to functions that expect a pointer (when you do this, you get a pointer to the first element). The same does not work in reverse, however: the compiler will never treat a pointer declared as a pointer as an array.
By the way, [] just simplifies to pointer arithmetic. You can add a pointer to an int, but you can also add an int to a pointer. You can thus (but probably shouldn't) do weird things like 1[ptrArray]
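Pulling the answers together, a minimal sketch: compute the element count where array1 is still visible as an array, then pass that count along with the pointer. The names array1_len and sum_elements are hypothetical.

```c
#include <stddef.h>

static const unsigned int array1[] = {0x00, 0x01, 0x02, 0x03};

/* Element count, computed where array1 is still visible as an array. */
static const size_t array1_len = sizeof array1 / sizeof array1[0];

/* Sums n elements through a pointer; the count must come from the caller,
   because sizeof(*ptrArray) is only sizeof(unsigned int). */
unsigned int sum_elements(const unsigned int *ptrArray, size_t n)
{
    unsigned int total = 0;
    for (size_t i = 0; i < n; i++)
        total += ptrArray[i];     /* same as *(ptrArray + i) */
    return total;
}
```

A call such as sum_elements(array1, array1_len) passes the array name, which converts to a pointer to its first element.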

Pointer arithmetic and arrays: what's really legal?

Consider the following statements:
int *pFarr, *pVarr;
int farr[3] = {11,22,33};
int varr[3] = {7,8,9};
pFarr = &(farr[0]);
pVarr = varr;
At this stage, both pointers are pointing at the start of each respective array address. For *pFarr, we are presently looking at 11 and for *pVarr, 7.
Equally, if I request the contents of each array through *farr and *varr, I also get 11 and 7.
So far so good.
Now, let's try pFarr++ and pVarr++. Great. We're now looking at 22 and 8, as expected.
But now...
Trying to move up farr++ and varr++ ... and we get "wrong type of argument to increment".
Now, I recognize the difference between an array pointer and a regular pointer, but since their behaviour is similar, why this limitation?
This is further confusing to me when I also consider that in the same program I can call the following function in an ostensibly correct way and in another incorrect way, and I get the same behaviour, though in contrast to what happened in the code posted above!?
working_on_pointers ( pFarr, farr ); // calling with expected parameters
working_on_pointers ( farr, pFarr ); // calling with inverted parameters
void working_on_pointers ( int *pExpect, int aExpect[] ) {
printf("%i", *pExpect); // displays the contents of pExpect ok
printf("%i", *aExpect); // displays the contents of aExpect ok
pExpect++; // no warnings or errors
aExpect++; // no warnings or errors
printf("%i", *pExpect); // displays the next element or an overflow element (with no errors)
printf("%i", *aExpect); // displays the next element or an overflow element (with no errors)
}
Could someone help me to understand why array pointers and pointers behave in similar ways in some contexts, but different in others?
So many thanks.
EDIT: Noobs like myself could further benefit from this resource: http://www.panix.com/~elflord/cpp/gotchas/index.shtml
The difference is because for farr++ to have any effect, the compiler would need to store somewhere that farr now evaluates to the address of the second element of the array. But there is no place for that information. The compiler only allocates space for 3 integers.
Now when you declare that a function parameter is an array, the function parameter won't be an array. The function parameter will be a pointer. There are no array parameters in C. So the following two declarations are equivalent
void f(int *a);
void f(int a[]);
It doesn't even matter what number you put between the brackets - since the parameter really will be a pointer, the "size" is just ignored.
This is the same for functions - the following two are equivalent and have a function pointer as parameter:
void f(void (*p)());
void f(void p());
While you can call both a function pointer and a function (so they are used similarly), you also won't be able to write to a function, because it's not a pointer - it merely converts to a pointer:
f = NULL; // error!
Much the same way you can't modify an array.
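The parameter adjustment described above is why aExpect++ compiles inside the function even though farr++ does not outside it. A small sketch, with a hypothetical function name:

```c
/* The parameter is written with array syntax, but its type is int *. */
int second_element(int aExpect[])
{
    aExpect++;          /* legal: aExpect is a pointer variable local to the call */
    return *aExpect;    /* the caller must have passed at least two elements */
}
```

Incrementing aExpect changes only the function's own copy of the pointer. The caller's array stays where it is, and passing either farr or pFarr gives the same result.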
In C, you cannot assign to arrays. So, given:
T data[N];
where T is a type and N is a number, you cannot say:
data = ...;
Given the above, and that data++; is trying to assign to data, you get the error.
There is one simple rule in C about arrays and pointers. It is that, in value contexts, the name of an array is equivalent to a pointer to its first element, and in object contexts, the name of an array is equivalent to an array.
Object context is when you take the size of an array using sizeof, or when you take its address (&data), or at the time of initialization of an array. In all other contexts, you are in value context. This includes passing an array to a function.
So, your function:
void working_on_pointers ( int *pExpect, int aExpect[] ) {
is equivalent to
void working_on_pointers ( int *pExpect, int *aExpect ) {
The function can't tell if it was passed an array or a pointer, since all it sees is a pointer.
There are more details in the answers to the following questions:
type of an array,
sizeof behaving unexpectedly,
Also see this part of C for smarties website, which is very well-written.
Trying to increment farr or varr fails because neither one is a pointer. Each is an array. The name of an array, when evaluated by itself (except as the operand of the sizeof or address-of operator) evaluates to a value (an rvalue) that's of the correct type to be assigned to a pointer. Trying to increment it is a bit like trying to increment 17. You can increment an int that contains the value 17, but incrementing 17 itself won't work. The name of an array is pretty much like that.
As for your second part, it's pretty simple: if you attempt to declare a function parameter of array type, the compiler silently "adjusts" it to a pointer type. As such, in your working_on_pointers, aExpect and pExpect have exactly the same type. Despite the array-style notation, you've defined aExpect as having type 'pointer to int'. Since the two are the same, it's entirely expected that they'll act the same.
Have a look at this answer I posted in relation to differences between pointers and arrays here on SO.
Hope this helps.
Okay, I may be wrong, but arrays and pointers can often be used interchangeably.
int * ptr = (int *)malloc(2* sizeof(int));
ptr[0]=1;
ptr[1]=2;
printf ("%d\n", ptr[0]);
printf ("%d\n", ptr[1]);
Here I declared a pointer and now I am treating it as an array.
moreover:
As a consequence of this definition, there is no apparent difference in the behavior of the "array subscripting" operator [] as it applies to arrays and pointers. In an expression of the form a[i], the array reference "a" decays into a pointer, following the rule above, and is then subscripted just as would be a pointer variable in the expression p[i] (although the eventual memory accesses will be different, as explained in question 2.2). In either case, the expression x[i] (where x is an array or a pointer) is, by definition, identical to *((x)+(i)).
reference: http://www.lysator.liu.se/c/c-faq/c-2.html
You need to understand the basic concept of arrays.
When you declare an array, i.e.
int farr[]
its name behaves, for this purpose, much like a pointer declared as
int * const farr
i.e. a "constant" pointer to integer (an array is not literally a pointer, but the analogy explains the error). So when you do farr++ you are trying to modify something that cannot be modified, hence the compiler gives you an error.
To see this, try declaring a pointer with the above declaration: you will not be able to perform the modifying arithmetic (such as ++) that is legal on normal pointers.
P.S:
It's been quite a while since I coded in C, so I am not sure about the exact syntax, but the bottom line is the difference between a pointer and a constant pointer.

Resources