Is it possible to put a C array inside itself? - c

In many programming languages (including JavaScript, Java, and Ruby), it's possible to put an array inside itself. Here, I'm trying to put a C integer array inside itself at its 3rd index, but I'm not sure if this is supported in the C programming language:
#include <stdio.h>
int main(void) {
int arr[] = {1, 1, 2};
arr[2] = arr; //now I'm trying to put arr into itself.
printf("%i", arr[2]); //this prints a negative number each time I run the program
printf("%i", arr[2][0]); //prog.c:7:24: error: subscripted value is neither array nor pointer nor vector
return 0;
}
Is it possible to put a C array inside itself, or is it not possible at all?

No, it's not possible for an array of int to contain itself.
There are some (likely non-portable) tricks you can play, like making one of the elements of the array be a converted pointer to the array:
int arr[10];
arr[5] = (int)arr;
but this doesn't make the array contain itself. The expression arr, since it's of array type, is implicitly converted ("decays") to a pointer to its first element in most contexts, including this one. So, assuming the conversion doesn't lose any information, you can retrieve a pointer to the first element of arr by converting arr[5] back to type int*. Note that this only gives you a pointer to arr's first element; it loses any information about the length of arr. And it's very common for an int* pointer value not to fit into an int without loss of information (on 64-bit systems, it's common for int* to be 64 bits and int to be 32 bits).
Integers, pointers, and arrays are three very different things. They are not simply interchangeable.
Recommended reading: section 6 of the comp.lang.c FAQ; it does a very good job of explaining the often confusing relationship between arrays and pointers in C.
Even in languages like Java and Ruby, an array can't actually contain itself. It can contain a reference to itself -- though the language might provide syntactic sugar that hides the fact that it's a reference. In C, such references are generally explicit.
What you can do is define a data structure that contains a pointer to an object of its own type. This is generally done with structures. For example:
struct tree_node {
int data;
struct tree_node *left;
struct tree_node *right;
};
This being C, you have to manage memory for your tree nodes explicitly, using malloc() to allocate and free() to deallocate -- or you can use an existing library that does that for you.

It actually is possible, if the length of an int on your system is equal to or greater than the length of a pointer, such that you can cast the pointer to an int to store it - and if you remember to cast it back before you try to dereference it as a pointer. Neglecting to do the latter was the cause of your error message.
Note that in such a simple scheme you will have to separately keep track of which elements are values and which are pointers. Only if you can guarantee that the length of a pointer is less than the length of an element (or that the valid range of a userspace pointer is further constrained) could you reserve some bits to indicate if an element is a literal value or a pointer.
Here's an example of how you might explicitly cast it back, based on prior knowledge that this is what would be appropriate for that element:
printf("%i", ((int *)arr[2])[0]);
To accomplish this more cleanly, you may want to make an array not of ints, but instead of unions, such that each element is a union of an int and a pointer - meaning there are two formally recognized possible views of the same memory. But you will still need a scheme to keep track of the applicable type of each element.

Related

mallocing array of structs creates too small of an array

I'm a little new to structs in C and I'm having a problem with creating an array to store them. As the title says when I try to malloc out an array of structs my array ends up being too small by quite a large margin.
Here is my struct:
struct Points
{
char file_letter;
char *operation;
int cycle_time;
};
And here is how I'm trying to create the array:
struct Points *meta_data;
meta_data = malloc(number_of_delims * sizeof(struct Points));
number_of_delims is an int representing the number of Points I'm trying to create and therefore the number of elements in my array.
With number_of_delims being 64 I get an array size of about 8.
Note: this is more or less a project for school and I can't use typedef when declaring my struct as the prof. wants each struct explicitly declared as one each time it is used. This may actually be the source of my problem but we'll see!
struct Points *meta_data;
At this point we have a declaration of an object, meta_data that has type struct Points *... and struct Points *, being a pointer type, typically requires 8 bytes on common implementations. This is observable through the following program:
#include <stdio.h>
struct Points;
int main(void) {
struct Points *meta_data;
printf("sizeof meta_data: %zu\n", sizeof meta_data);
}
Remember, the sizeof operator evaluates the size of the type of the expression, which in this case is a pointer. Pointers don't carry size information about the arrays they point into. You need to keep track of that yourself (e.g. by pairing number_of_delims with meta_data, if you require both values later on).
With number_of_delims being 64 I get an array size of about 8.
No. You get an array size of exactly 64, as you've expected. Your pointer doesn't automatically carry that size information around with it (you're expected to keep track of it yourself), so there is no portable way to come to the conclusion that your allocation can store 64 elements. The only way you could come to this conclusion is erroneously (i.e. by attempting to use sizeof, which as I've explained doesn't work as you expect).
As an exercise, what happens if you declare a pointer to an array of 64 struct Points, like so?
struct Points (*foo)[64] = NULL;
For a start, how many elements can NULL contain? What is sizeof foo and sizeof *foo? Do you see what I mean when I say sizeof evaluates the size of the type of an expression?

Can malloc() be used to define the size of an array?

Here consider the following sample of code:
int *a = malloc(sizeof(int) * n);
Can this code be used to define an array a containing n integers?
int *a = malloc(sizeof(int) * n);
Can this code be used to define an array a containing n integers?
That depends on what you mean by "define an array".
A declaration like:
int arr[10];
defines a named array object. Your pointer declaration and initialization does not.
However, the malloc call (if it succeeds and returns a non-NULL result, and if n > 0) will create an anonymous array object at run time.
But it does not "define an array a". a is the name of a pointer object. Given that the malloc call succeeds, a will point to the initial element of an array object, but it is not itself an array.
Note that, since the array object is anonymous, there's nothing to which you can apply sizeof, and no way to retrieve the size of the array object from the pointer. If you need to know how big the array is, you'll need to keep track of it yourself.
(Some of the comments suggest that the malloc call allocates memory that can hold n integer objects, but not an array. If that were the case, then you wouldn't be able to access the elements of the created array object. See N1570 6.5.6p8 for the definition of pointer addition, and 7.22.3p1 for the description of how a malloc call can create an accessible array.)
int *a = malloc(sizeof(int) * n);
Assuming the malloc() call succeeds, you can use the pointer a like an array using the array notation (e.g. a[0] = 5;). But a is not an array itself; it's just a pointer to an int (which may be the first int in a block of memory that can store multiple ints).
Your comment
But I can use an array a in my program with no declaration otherwise
suggests this is what you are mainly asking about.
In C language,
p[i] == *(p + i) == *(i + p) == i[p]
as long as one of i or p is of pointer type and the other is of integer type (p can be an array as well, since it would be converted to a pointer in almost any expression). Hence, you're able to index a the way you'd index an array. But a is actually a pointer.
Yes. That is exactly what malloc() does.
The important distinction is that
int array[10];
declares array as an array object with enough room for 10 integers. In contrast, the following:
int *pointer;
declares pointer as a single pointer object.
It is important to distinguish that one of them is a pointer and the other is an actual array, and that arrays and pointers are closely related but are different things. However, saying that there is no array in the following is also incorrect:
pointer = malloc(sizeof (int) * 10);
Because what this piece of code does is precisely to allocate an array object with room for 10 integers. The pointer pointer contains the address of the first element of that array. (C99 draft, section 7.20.3 "Memory management functions")
Interpreting your question very literally, the answer is No: To "define an array" means something quite specific; an array definition looks something like:
int a[10];
Whereas what you have posted is a memory allocation. It allocates a space suitable for holding an array of n int values, and stores a pointer to the first element within this space - but it doesn't define an array; it allocates one.
With that said, you can use the array element access operator, [], in either case. For instance the following code snippets are legal:
int a[10];
for (int i = 0; i < 10; i++) a[i] = 0;
and
int *a = malloc(sizeof(int) * n);
for (int i = 0; i < n; i++) a[i] = 0;
There is a subtle difference between what they do however. The first defines an array, and sets all its elements to 0. The second allocates storage which can hold an equivalently-typed array value, and uses it for this purpose by initialising each element to 0.
It is worth pointing out that the second example does not check for an allocation error, which is generally considered bad practice. Also, it constitutes a potential memory leak if the allocated storage is not later freed.
In the language the Standard was written to describe (as distinct from the language that would be described by a pedantic literal reading of it), the intention was that malloc(n) would return a pointer that, if cast to a T*, could be treated as a pointer to the first element of a T[n/sizeof(T)]. Per N1570 7.22.3:
The pointer returned if the allocation succeeds is suitably aligned so that it may be assigned to a pointer to any type of object with a fundamental alignment requirement and then used to access such an object or an array of such objects in the space allocated (until the space is explicitly deallocated).
The definitions of pointer addition and subtraction, however, do not speak of acting upon pointers that are "suitably aligned" to allow access to arrays of objects, but rather speak of pointers to elements of actual array objects. If a program allocates space for 20 int objects, I don't think the Standard actually says that the resulting pointer would behave in all respects as though it were a pointer to element [0] of an int[20], as distinct from e.g. a pointer to element [0][0] of an int[4][5]. An implementation would have to be really obtuse not to allow it to be used as either, of course, but I don't think the Standard actually requires such treatment.

multiArray and multiArray[0] and &multiArray[0] same?

On the 6th line, when I write multiArray instead of multiArray[0], the program still works, and I don't understand why. I had thought that multiArray is a pointer to multiArray[0], which is a pointer to multiArray[0][0]. So multiArray alone would be a pointer to a pointer, and multiArray[0] would be a pointer to a 4-element int array. So it seems that multiArray and multiArray[0] must be different. But in the code below, both work. The print function I wrote expects a pointer to a 4-element int array, so only multiArray[0] should work and multiArray should not. But both work, and I don't understand that.
#include <stdio.h>
void printArr(int(*ptr)[4]);
int i, k;
int main(void){
int multiArray[3][4] = { { 1, 5, 2, 4 }, { 0, 6, 3, 14 }, { 132, 4, 22, 5 } };
int(*point)[4] = multiArray[0];
for (k = 0; k < 3; k++)
{
printArr(point++);
}
getchar();
}
void printArr(int(*ptr)[4]){
int *temp = (int *)ptr;
for (i = 0; i < 4; i++)
{
printf("%d ", *temp);
temp++;
}
puts("\n");
}
Someone else wrote "Multi-dimensional arrays are syntactic sugar for 1-D arrays".
This is sort of like saying that int is just syntactic sugar for an unsigned char[4]. You could do away with expressions like 4 + 5 and get the same result by manipulating arrays of 4 bytes.
You could even say that C is just syntactic sugar for a Universal Turing Machine script, if you want to take this concept a bit further.
The reality is that multi-dimensional arrays are a part of the type system in C, and they have syntax associated with them. There's more than one way to skin a cat.
Moving on, the way C arranges what we are calling a multi-dimension array is to say: "Arrays can only have one dimension, but the element type may itself be another array". We say "multi-dimension array" as a matter of convenience, but the syntax and the type system actually reflect the one-dimensional nature of the array.
So, int multiArray[3][4] is an array of 3 elements. Each of those elements is an array of 4 ints.
In memory, an array's elements are stored contiguously -- regardless of what the element type is. So, the memory layout is an array of 4 int, immediately followed by another array of 4 int, and finally another array of 4 int.
There are 12 contiguous int in memory, and in the C type system they are grouped up into 3 groups of 4.
You will note that the first int of the 12 is also the first int of the first group of 4. This is why we find that if we ask "What is the memory location of the first int?", "What is the memory location of the first group of 4 ints?", and "What is the memory location of the entire block of 12 ints?", we get the same answer every time. (In C, the memory location of a multi-byte object is considered to start at the location of its first byte).
Now, to talk about the pointer syntax and representation. In C, a pointer tells you where in memory an object can be found. There are two aspects to this: the memory location of the object, and what type of object it is. (The size of the object is a corollary of the type).
Some presentations only focus on the first of those; they will say things like "A pointer is just a number". But that is forgetting about the type information, which is a crucial part of a pointer.
When you print the pointer with %p, you lose the type information. You're just putting out the location in memory of the first byte. So they all look the same, despite the fact that the three pointers are pointing at differently-sized objects (which overlap each other like matryoshka dolls).
In most implementations of C, the type information is all computed at compile-time, so if you try to understand C by comparing source code with assembly code (some people do this), you only see the memory-location part of the pointer. This can lead to misunderstanding if you forget that the type information is also crucial.
Footnote: All of this is independent of a couple of syntax quirks that C has, which have caused a lot of confusion over the years (but are also useful sometimes). The expression x is a shortcut for &x[0] if x is an array, except when used as the operand of & or sizeof. (Otherwise this would be a recursive definition!) The second quirk is that if you write what looks like an array declarator in a function formal parameter list, it is actually as if you wrote a pointer declarator. I stress again that these are just syntax oddities; they are not saying anything fundamental about the nature of arrays and pointers, which is actually not that complicated. The language would work just as well without both of these quirks.
Multidimensional arrays var_t arr[size_y][size_x] provide a means of declaring and accessing array elements (memory) in a convenient manner. But all multidimensional arrays are internally contiguous memory blocks.
You may say that arr[y][x] names the same element as ((var_t *)arr)[y*size_x + x].
In terms of address, the pointers that arr and arr[0] decay to are the same, though their types differ: arr[0] decays to int *, while arr (for the int arr[2][2] below) decays to int (*)[2]. Using the latter type gives you the correct pointer arithmetic (++ on such a pointer will move the address by 8 bytes, not 4, assuming a 4-byte int).
Try this:
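#include <stdio.h>
void t1(int* param)
{
printf("t1: %d\n", *param);
}
void t2(int** param)
{
printf("t2: %d\n", **param);
}
int main(void) {
int arr[2][2] = { { 1, 2 } , { 3, 4 } };
t1(arr); // type mismatch (int (*)[2] passed as int *; compilers warn or reject), but the address is right: prints 1
t1(arr[0]); // works ok: arr[0] decays to int *
t2(arr); // type mismatch; likely seg fault, since the ints in arr get treated as a pointer
t2(arr[0]); // same problem: an int * is not an int **
}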
int(*point)[4] = multiArray[0];
This works because both multiArray[0] and multiArray evaluate to the same address: that of the first element of the array, multiArray[0][0].
However, in this case you may get a warning from the compiler, because the type of multiArray[0] (after decay) is int * while the type of point is int (*)[4] (pointer to an array of 4 integers).

Why is char*p[10] considered char** p by the compiler? [duplicate]

I've been fiddling around to see if there's any way to retain information about an array's length automatically when passed into a function (see my other question here: Why is this array size "workaround" giving me a warning?), but my question is more about a warning that gcc is giving that doesn't make sense to me.
According to this website (EDIT: I misread the website), char *p[10] declares a pointer to a 10-wide array of chars. But when I tried to pass in a pointer to an array into a function, I got this error message from the compiler:
Here is the rest of the program:
I know that when an array is passed into a function, it decays into a pointer (losing information about its length), but it seems that the declaration itself is decaying. What's going on here?
EDIT: When I replace the char *p[10] with char (*p)[10], it doesn't give the warning anymore, and more importantly, it displays the proper array length: 10. I guess my questions are 1) Why do the parentheses change things? and 2) Is this a well-known workaround or am I relying on some behavior of the compiler that isn't guaranteed? (i.e. that array length info can be passed by indirectly passing in a pointer to it?)
In fact char *p[10] is an array, of length 10, of pointers to char. You are looking for char (*p)[10]. That is a pointer to an array, of length 10, of char.
You might find http://cdecl.org/ a useful resource to help you test your understanding of declarations.
Regarding the discussion surrounding dynamic arrays, you are going to have to accept that once you allocate an array dynamically, the system provides no means for you to recover the length of the array. It is your responsibility to remember that information.
The subject of your question has been answered already, but I wanted to address the heart of it, which is "can I encode the length of an array in its type?" That is in fact what a pointer-to-array does. The real question is whether you can actually gain any brevity or safety from this. Consider that in each scope where you have a declaration of your type, the length still needs to be known a priori. To show you what I mean, let's generalize your example slightly by making 10 a compile-time constant N.
#include <assert.h>
#include <stddef.h>
#define N 10
size_t arraylength(char (*arrayp)[N]) {
return sizeof(*arrayp);
}
int main(void) {
char array[N];
assert( arraylength(&array) == N ); //always true
}
So far so good. We didn't have to pass the length of array anywhere. But it's easy to see that anywhere the expression sizeof(*arrayp) is used, we also could have written N. And any place we declare a char(*)[ ], the bracketed length must come from somewhere.
So what if N isn't a compile time constant, and array is either a VLA or a pointer-to-array from malloc? We can still write and call arraylength, but it looks like this:
size_t arraylength(size_t N, char (*arrayp)[N]) {
return sizeof(*arrayp);
}
int main(void) {
size_t N = length_from_somewhere();
char array[N];
assert( arraylength(sizeof(array), &array) == N );
}
In defining arraylength, N must still be visible before the declaration of arrayp. In either case, we can't avoid having N visible outside of the declaration of arrayp. In fact, we didn't gain anything over writing arraylength(size_t N, char* array) and passing array directly (which is a bit silly given the purpose of this function). Both times arraylength could equally have been written return N;
Which isn't to say that array pointers are useless as parameters to functions -- in the opposite situation, when you want to enforce a length, they can provide type checking to make sure somefunc(char (*)[10]); receives a pointer to an array that is really (sans shady casting) 10 elements long, which is stronger than what a construct like [static 10] provides.
Also keep in mind that all of the length measurements above depend on the underlying type being char where length == size. For any larger type, taking the length requires the usual arithmetic e.g.
sizeof(*arrayp)/sizeof((*arrayp)[0])
In C, arrays decay to pointers to their first elements on most uses. In particular, what a function receives is always just a pointer to the first element; the size of the array is not passed with it.
Get a good text on C and read up on arrays.
I've been fiddling around to see if there's any way to retain information about an array's length automatically when passed into a function
The problem is so annoying that lots of programmers would love to have an answer. Unfortunately, this is not possible.
It seems that the declaration itself is decaying
Pointer to an array is not the same as a pointer to a pointer; that is why you are getting an error.
There is no decaying going on in your code, because you are not passing an array in your code sample: instead, you are trying to pass a pointer to an array, &p. The pointer to an array of characters is not compatible with the expected parameter type of the function, which is char**. The array size in the parameter declaration is ignored.
You need to keep in mind two things:
1. Arrays are not pointers.
2. Array names decay to pointers (in most cases) when passed as arguments to functions.
So, when you declare
int a[10]; // a is an array of 10 ints
int *b; // b is a pointer to int
a and b are of different types: the former is of type int [10], while the latter is of type int *.
In case of function parameter
void foo1 (int a[10]); // You are not actually passing the entire array,
void foo2 (int a[]); // which is why you can omit the first dimension,
void foo3 (int *a); // and the compiler interprets the two declarations above as this third one.
In all of the above function declarations, a has the same type, int *.
Now in your case
unsigned long arraySize(char *p[10]);
you can declare it as
unsigned long arraySize(char *p[]);
and hence
unsigned long arraySize(char **p);
All are equivalent.
char *p[10], char *p[], and char **p are exactly equivalent, but only when they are declared as a parameter of a function; otherwise, char *p[10] (an array of 10 pointers to char) and char **p (a pointer to pointer to char) are entirely different types.
Suggested reading: C-FAQ: 6. Arrays and Pointers explains this in detail.
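An array name is not itself a pointer, but in most expressions it is converted to a pointer to the array's first element. For example, given int arr[10]={0};
the expression arr evaluates to the address of arr[0]; hence arr equals &arr[0].
When you pass arraySize(&p), you are actually passing a pointer to the whole array, which is a different type from a pointer to its first element.
To pass a pointer to the first element, write arraySize(&p[0]) or arraySize(p).
Note that an array is not a modifiable lvalue; you can't assign to it or increment it.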
int arr[10];
arr++;
is invalid.
In your case you can't find the size of an array inside a function by passing the array name; sizeof would return the size of a pointer (typically 4 or 8 bytes, depending on your platform).
The usual method is to pass the size along with the array:
func(array_name , array_size);

Questions about pointers and arrays

Sanity-check questions:
I did a bit of googling and discovered the correct way to return a one-dimensional integer array in C is
int * function(args);
If I did this, the function would return a pointer, right? And if the return value is r, I could find the nth element of the array by typing r[n]?
If I had the function return the number "3", would that be interpreted as a pointer to the address "3?"
Say my function was something like
int * function(int * a);
Would this be a legal function body?
int * b;
b = a;
return b;
Are we allowed to just assign arrays to other arrays like that?
If pointers and arrays are actually the same thing, can I just declare a pointer without specifying the size of the array? I feel like
int a[10];
conveys more information than
int * a;
but aren't they both ways of declaring an array? If I use the latter declaration, can I assign values to a[10000000]?
Main question:
How can I return a two-dimensional array in C? I don't think I could just return a pointer to the start of the array, because I don't know what dimensions the array has.
Thanks for all your help!
1. Yes.
2. Yes, but it would require a cast: return (int *)3;
3. Yes, but you are not assigning an array to another array; you are assigning a pointer to a pointer.
4. Pointers and arrays are not the same thing. int a[10] reserves space for ten ints. int *a is an uninitialized variable pointing to who knows what. Accessing a[10000000] through it is undefined behavior and will most likely crash your program, as you are trying to access memory that you don't have access to or that doesn't exist.
Main question: one way to return a 2D array is to return a pointer-to-pointer (pointing to an array of row pointers, which is not the same layout as int a[rows][cols]): int ** f() {}
1. Yes; array indexing is done in terms of pointer arithmetic: a[i] is defined as *(a + i); we find the address of the i'th element after a and dereference the result. So a could be declared as either a pointer or an array.
2. It would be interpreted as an address, yes (most likely an invalid address). You would need to cast the literal 3 as a pointer, because values of type int and int * are not compatible.
3. Yes, it would be legal. Pointless, but legal.
4. Pointers and arrays are not the same thing; in most circumstances, an expression of array type will be converted ("decay") to an expression of pointer type and its value will be the address of the first element of the array. Declaring a pointer by itself is not sufficient, because unless you initialize it to point to a block of memory (either the result of a malloc call or another array) its value will be indeterminate, and may not point to valid memory.
You really don't want to return arrays; remember that an array expression is converted to a pointer expression, so you're returning the address of the first element. However, if the array is local to the function, it no longer exists once the function exits, and the pointer value is no longer valid. It's better to pass the array you want to modify as an argument to the function, such as
void foo (int *a, size_t asize)
{
size_t i;
for (i = 0; i < asize; i++)
a[i] = some_value();
}
Pointers contain no metadata about the number of elements they point to, so you must pass that as a separate parameter.
For a 2D array, you'd do something like
void foo(size_t rows, size_t columns, int (*a)[columns])
{
size_t i, j;
for (i = 0; i < rows; i++)
for (j = 0; j < columns; j++)
a[i][j] = some_value();
}
This assumes you're using a C99 compiler or a C2011 compiler that supports variable length arrays; otherwise the number of columns must be a constant expression (i.e., known at compile time).
These answers certainly call for a bit more depth. The better you understand pointers, the less bad code you will write.
An array and a pointer are not the same, EXCEPT when they are. Off the top of my head:
int a[2][2] = { 1, 2, 3, 4 };
int (* p)[2] = a;
ASSERT (p[1][1] == a[1][1]);
Array "a" functions exactly the same way as pointer "p." And the compiler knows just as much from each, specifically an address, and how to calculate indexed addresses. But note that array a can't take on new values at run time, whereas p can. So the "pointer" aspect of a is gone by the time the program runs, and only the array is left. Conversely, p itself is only a pointer, it can point to anything or nothing at run time.
Note that the syntax for the pointer declaration is complicated. (That is why I came to stackoverflow in the first place today.) But the need is simple. You need to tell the compiler how to calculate addresses for elements past the first column. (I'm using "column" for the rightmost index.) In this case, we might assume it needs to increment the address ((2*1) + 1) to index [1][1].
However, there are a couple more things the compiler knows (hopefully) that you might not.
The compiler knows two things: 1) whether the elements are stored sequentially in memory, and 2) whether there really are additional arrays of pointers, or just one pointer/address to the start of the array.
In general, a compile time array is stored sequentially, regardless of dimension(s), with no extra pointers. But to be sure, check the compiler documentation. Thus if the compiler allows you to index a[0][2] it is actually a[1][0], etc. A run time array is however you make it. You can make one dimensional arrays of whatever length you choose, and put their addresses into other arrays, also of whatever length you choose.
And, of course, one reason to muck with any of these is that you are choosing among run time multiplies, shifts, or pointer dereferences to index the array. If pointer dereferences are the cheapest, you might need to make arrays of pointers so there is no need to do arithmetic to calculate row addresses. One downside is that it requires memory to store the additional pointers. And note that if the column length is a power of two, the address can be calculated with a shift instead of a multiply. So this might be a good reason to pad the length up--and the compiler could, at least theoretically, do this without telling you! And it might depend on whether you select optimization for speed or space.
Any architecture that is described as "modern" and "powerful" probably does multiplies as fast as dereferences, and these issues go away completely--except for whether your code is correct.

Resources