While reading a book, I learned that an array's size must be given at the time of declaration, or the array must be allocated from the heap with malloc at runtime. I wrote this program in C:
#include<stdio.h>
int main() {
int n, i;
scanf("%d", &n);
int a[n];
for (i=0; i<n; i++) {
scanf("%d", &a[i]);
}
for (i=0; i<n; i++) {
printf("%d ", a[i]);
}
return 0;
}
This code works fine.
My question is how this code can work correctly. Doesn't it violate the basic rule of C that an array's size must be fixed before runtime, or else the array must be allocated with malloc() at runtime? I'm doing neither of these, so why does it work properly?
The answer to my question is variable length arrays, which are supported in C99. But if I play around with my code and put the statement int a[n]; above scanf("%d", &n); then it stops working. Why is that, if variable length arrays are supported in C?
The C99 standard supports variable length arrays. The length of these arrays is determined at runtime.
Since C99 you can declare variable length arrays at block scope.
Example:
void foo(int n)
{
int array[n];
// Initialize the array
for (int i = 0; i < n; i++) {
array[i] = 42;
}
}
C will be happy as long as you've declared the array and allocated memory for it before you use it. One of the "features" of C is that it doesn't validate array indices, so it's the responsibility of the programmer to ensure that all memory accesses are valid.
Variable length arrays are a new feature added to C in C99.
"Variable length" here means that the size of the array is decided at run time, not compile time. It does not mean that the size of the array can change after it is created. The array is logically created where it is declared. So your code does the following:
int n, i;
Create two variables n and i. Initially these variables are uninitialised.
scanf("%d", &n);
Read a value into n.
int a[n];
Create an array "a" whose size is the current value of n.
If you swap the second and third steps, you try to create an array whose size is determined by an uninitialised value. This is not likely to end well.
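To illustrate that the size is captured at the declaration and does not follow later changes to n, here is a minimal sketch. It assumes a compiler with C99 VLA support; the function name vla_length_after_change is made up for this example.

```c
#include <stddef.h>

/* Hypothetical helper: declares a VLA of n ints, then changes n,
   and reports how many elements the array still has. */
size_t vla_length_after_change(int n)
{
    if (n <= 0)
        return 0;                    /* a VLA size must be positive */

    int a[n];                        /* the size is taken from n right here */
    n = n * 2;                       /* later changes to n do not resize a */
    (void)n;                         /* n is not used again in this sketch */

    return sizeof a / sizeof a[0];   /* for a VLA, sizeof is evaluated at run time */
}
```

Calling vla_length_after_change(5) should report 5 elements, not 10, because the array was created while n was still 5.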
The C standard does not specify exactly how the array is stored, but in practice most compilers (I believe there are some exceptions) will allocate it on the stack. The normal way to do this is to copy the stack pointer into a "frame pointer" as part of the function preamble. This then allows the function to dynamically modify the stack pointer while keeping track of its own stack frame.
Variable length arrays are a feature that should be used with caution. Compilers typically do not insert any form of overflow checking on stack allocations. Operating systems typically insert a "guard page" after the stack to detect stack overflows and either raise an error or grow the stack, but a sufficiently large array can easily skip over the guard page.
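One way to apply that caution is to range-check the size before the declaration is reached. The following is only a sketch: the limit MAX_VLA_ELEMS and the function sum_first_n are assumptions chosen for illustration, and a real program would pick a limit that suits its own stack budget.

```c
/* Assumed limit for this sketch; not taken from any standard. */
#define MAX_VLA_ELEMS 1024

/* Hypothetical helper: returns 0 + 1 + ... + (n - 1), computed through a VLA,
   or -1 if n is not in the range 1..MAX_VLA_ELEMS. */
long sum_first_n(int n)
{
    if (n <= 0 || n > MAX_VLA_ELEMS)
        return -1;              /* refuse before any stack space is reserved */

    int a[n];                   /* safe: n is known to be small and positive */
    long sum = 0;
    for (int i = 0; i < n; i++) {
        a[i] = i;
        sum += a[i];
    }
    return sum;
}
```

The check has to come before the declaration, since the compiler will not insert one for you.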
Related
I am trying to understand the difference between the stack and the heap whilst learning programming in C.
To do this, I have attempted to implement the binary search. I want to obtain the input data set of the binary search from the user. In order to do this, I want that the user be able to define the size of the data set (array in this case). Once I obtain the size of the array, I initialise the array and then ask the user to fill it up with values.
My (potentially wrong) understanding of the stack is, that the array can be initialised on the stack if its size is known at compile time. So, I tried the following to implement the user input on the stack (and it 'works'):
void getUserInput(int sizeArray, int inputArray[], int userIn[]){
/* get input data set and the element to be searched for from user */
printf("Enter data elements separated by a comma, e.g. 1,2,3. Press Enter at the end. Do not enter more than %d numbers: \n", sizeArray);
for (int i = 0; i < sizeArray; i++){
scanf("%d, ", &inputArray[i]);
}
printf("\nEnter the number to be searched for: \n");
scanf("%d", &userIn[1]);
printf("\nFor iterative implementation, enter 0. For recursive implementation, enter 1 :\n");
scanf("%d", &userIn[0]);
}
int main(int arg, char **argv){
int sizeArray;
int userIn[2]; // userIn[1]: element to be searched for; userIn[0]: iterative or recursive implementations
printf("Enter size of input array: \n");
scanf("%d", &sizeArray);
int inputArray[sizeArray];
getUserInput(sizeArray, inputArray, userIn);
// more code ...
}
For an implementation on the heap, I attempted to use dynamic memory allocation (it also 'works'):
int main(int arg, char **argv) {
int sizeArray;
int userIn[2]; // userIn[1]: element to be searched for; userIn[0]: iterative or recursive implementations
printf("Enter size of input array: \n");
scanf("%d", &sizeArray);
int *inputArray;
inputArray = (int*) malloc(sizeArray * sizeof(int));
if(inputArray == NULL) {
printf("\n\nError! Memory not allocated, enter size of array again:\n");
scanf("%d", &sizeArray);
inputArray = (int*) malloc(sizeArray * sizeof(int));
}
getUserInput(sizeArray, inputArray, userIn);
// more code...
free(inputArray); // free memory allocated by malloc on the heap
}
Now, I wanted to combine both approaches into one file, so I created a switch to switch between the stack and heap implementations, as follows:
int main(int arg, char **argv) {
/* Obtain input */
int stackHeap; // 0 = implementation on stack; 1 = implementation on heap
printf("Implementation on stack or heap? Enter 0 for stack, 1 for heap: \n");
scanf("%d", &stackHeap);
int sizeArray;
int userIn[2]; // userIn[1]: element to be searched for; userIn[0]: iterative or recursive implementations
printf("Enter size of input array: \n");
scanf("%d", &sizeArray);
int *inputArray;
if (stackHeap == 0) {
inputArray = &inputArray[sizeArray];
printf("input array = %p\n", inputArray);
} else {
inputArray = (int*) malloc(sizeArray * sizeof(int));
printf("input array = %p\n", inputArray);
if(inputArray == NULL) {
printf("\n\nError! Memory not allocated, enter size of array again:\n");
scanf("%d", &sizeArray);
inputArray = (int*) malloc(sizeArray * sizeof(int));
}
}
getUserInput(sizeArray, inputArray, userIn);
// more code...
}
Currently the stack approach doesn't work. Instead of inputArray = &inputArray[sizeArray], I tried initialising the inputArray within the if statement. This is however not allowed, since it is then only valid within the scope of the if statement. I think I am getting confused as how to use the pointer *inputArray to initialise the array on the stack.
I have been reading about pointers and arrays in C, which is why I thought implementing this would be fun. I would very much appreciate any feedback you have (gladly also any fundamental errors I have made - I am quite new to this topic).
Thank you very much!
I am trying to understand the difference between the stack and the heap whilst learning programming in C.
That was your first mistake. Stack and heap are not C-language concepts, and the C language does not depend on any such distinctions between areas of memory. With that said, many C implementations effectively do rely on these, so understanding something about them may prove worthwhile. As far as learning the C language goes, however, it would be better to focus on the relevant native C concepts.
In this particular area, those would be the connected concepts of objects' lifetime and storage duration. An object's lifetime is the portion of the program's execution during which that object exists and retains its last-stored value, which is a function of its storage duration and, for some objects, the timing of their creation. Storage duration is determined by the location of an object's declaration, if any, and by any storage-class specifier (such as static) applied to it.
Objects declared inside a code block without the static (or extern) storage-class specifier have "automatic" storage duration and live from entry into the block to termination of the block's execution (a variable-length array instead lives from its declaration to the end of the block). On stack-based machines, storage for such objects is typically allocated on the stack. The other storage durations are "static", "allocated", and "thread", and on stack-based machines, storage for objects with these durations typically is implemented outside the stack, which some would describe as on the heap.
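As a rough illustration of three of these storage durations side by side, consider the sketch below. All names (file_scope_counter, next_id, make_int) are invented for this example.

```c
#include <stdlib.h>

int file_scope_counter;          /* static storage duration: exists for the whole program */

/* Hypothetical helper: returns 1, 2, 3, ... on successive calls. */
int next_id(void)
{
    static int last = 0;         /* static: keeps its value between calls */
    int step = 1;                /* automatic: created anew on every call */
    last += step;
    return last;
}

/* Hypothetical helper: returns an allocated int holding value, or NULL.
   The caller is responsible for calling free() on the result. */
int *make_int(int value)
{
    int *p = malloc(sizeof *p);  /* allocated: lives until it is freed */
    if (p != NULL)
        *p = value;
    return p;
}
```

Only the object behind make_int's return value can safely be handed back to a caller through a pointer and released later; step disappears as soon as next_id returns.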
My (potentially wrong) understanding of the stack is, that the array can be initialised on the stack if its size is known at compile time.
That's somewhat imprecise.
On one hand, if the wanted size is known at compile time then you can certainly write a valid declaration for an automatic-duration array with that size. However, that does not guarantee that the program can actually obtain the wanted space at run time. This is an especial problem when objects with very large storage size are declared: when program execution reaches the point where such an object must be allocated, stack-oriented C implementations may not have enough stack space available. This is the "stack overflow" for which this site is named.
On the other hand, all conforming C99 implementations and most C11 and C17 implementations support variable-length arrays, which have size determined at run time and can be the types of automatic objects. This is, in fact, what you have employed for variable inputArray in your first example. That code is valid. Some implementations may provide other mechanisms for allocating variable-length data as automatic storage, too. See alloca(), for example.
For an implementation on the heap, I attempted to use dynamic memory allocation
This gets you an object with "allocated" storage duration. On implementations that you can characterize as maintaining a stack / heap distinction, storage for such objects is indeed ordinarily on the heap.
Now, I wanted to combine both approaches into one file, so I created a switch to switch between the stack and heap implementations
That's doable, but not the way you're trying to do it. Having declared ...
int *inputArray;
... you need to assign that pointer to point to something before you can access its elements. You do that in the one case by allocating memory with malloc, but in the other case, this ...
inputArray = &inputArray[sizeArray];
... does not do at all what you want. Appearing as it does in an expression rather than a declaration, that inputArray[sizeArray] does not cause any memory to be allocated. Instead, it attempts to access an int at index sizeArray in the object to which inputArray points. Since inputArray has not, at that point, been assigned a value, that produces undefined behavior. Thus, indeed,
Currently the stack approach doesn't work.
And you are also right that
initialising the inputArray within the if statement [is] not allowed, since it is then only valid within the scope of the if statement.
So when you say,
I think I am getting confused as how to use the pointer *inputArray to initialise the array on the stack.
I am inclined to agree. In fact, I think you are missing a key point, which is that you can't in either case use the pointer to initialize an array, nor really anything like that. Rather, you must allocate the array and, more or less separately, assign the pointer to point to it.
You clearly know how to do that with malloc, though I suppose you weren't thinking about that in those terms. The way you are doing it works because the object allocated via malloc has "allocated" storage duration, meaning that its lifetime lasts until it is deallocated. To do similar with an automatic object, you need one that has sufficient lifetime for your purpose, which in this case means that it must be declared directly within the body of main(), not in any nested block. You can achieve that like this:
int sizeArray;
int sizeAutomatic = 1;
printf("Enter size of input array: \n");
scanf("%d", &sizeArray);
if (stackHeap == 0) {
sizeAutomatic = sizeArray;
}
int automaticArray[sizeAutomatic];
int *inputArray;
if (stackHeap == 0) {
inputArray = automaticArray;
// or, equivalently: inputArray = &automaticArray[0];
} else {
inputArray = malloc(sizeArray * sizeof(int));
if(inputArray == NULL) {
// handle allocation error ...
}
}
printf("input array = %p\n", (void *)inputArray);
In the event that the user elects the automatic allocation option, inputArray will point to the first element of automaticArray, effectively making the two aliases for each other. In the other case, inputArray will point to an allocated object. The automaticArray object will still exist in the latter case, but it will have length 1 and be unused. This one-int overhead is part of the price you pay for allowing the user to decide which allocation approach to use.
Your programs using just stack allocation or just heap allocation are both fine. The only potential problem you have with the stack allocation is that if the array is too big (e.g. a few MB) it can overflow the stack, while the heap has a much higher limit.
The problem with the combined program is that this doesn't make sense:
inputArray = &inputArray[sizeArray];
Using the array index operator actually involves dereferencing a pointer, and at this point inputArray doesn't point anywhere.
You need to pick one or the other, either an array or a pointer that points to the start of a dynamically allocated array.
#include <stdlib.h> // malloc, free, size_t
typedef enum
{
STACK,
HEAP,
}type;
size_t getSize(void) // implement user input
{
return 100;
}
type getType(void) // implement user input
{
return HEAP;
}
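int main(void)
{
size_t stackSize = 1; // a VLA needs a size greater than zero, even when it goes unused
size_t size;
type tp;
int *inputArray;
size = getSize();
tp = getType();
if(tp == STACK) stackSize = size;
int array[stackSize];
if(tp == STACK)
inputArray = array;
else
inputArray = malloc(size * sizeof(*inputArray));
/* ... */
if(tp == HEAP) free(inputArray); // only the malloc'ed buffer needs freeing
}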
TL;DR: don't use VLAs unless you need their special semantics. They're tricky and somewhat error prone; using the heap is generally easier and safer.
The basic problem that you're running into is that defining a VLA is a declaration, so there's no way to conditionally create one that lives on after the conditional block. That's pretty much by design -- the block serves to limit the lifetime so it can be freed automatically.
So your choices are to create one unconditionally (and just not use it if stackHeap is true) or to move all your ...more code... into the conditional block. The latter would be easiest to do by putting it all in a function (e.g. processArray(int sizeArray, int inputArray[], ...), much like your getUserInput function). Then you just need two calls to the function, one in the if case and one in the else case.
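A sketch of that second option might look like the following. The names process_array and run are hypothetical, and process_array merely stands in for the "...more code..." part, whose real contents are not shown in the question.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the "...more code...":
   fills arr with 1..n and returns the sum. */
static long process_array(int n, int arr[])
{
    long sum = 0;
    for (int i = 0; i < n; i++) {
        arr[i] = i + 1;
        sum += arr[i];
    }
    return sum;
}

/* use_heap == 0: VLA declared inside the if block; otherwise malloc.
   Returns -1 on a bad size or a failed allocation. */
long run(int use_heap, int n)
{
    if (n <= 0)
        return -1;

    if (!use_heap) {
        int vla[n];                      /* lives only inside this block ...   */
        return process_array(n, vla);    /* ... and is used only inside it     */
    } else {
        int *heap = malloc((size_t)n * sizeof *heap);
        if (heap == NULL)
            return -1;
        long result = process_array(n, heap);
        free(heap);
        return result;
    }
}
```

Both branches hand process_array a pointer to the first element, so the function does not need to know where the storage came from.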
So I am learning how to program in C, and am starting to learn about dynamic memory allocation. What I know is that not all the time will your program know how much memory it needs at run time.
I have this code:
#include <stdio.h>
#include <stdlib.h>
int main() {
int r, c, i, j;
printf("Rows?\n");
scanf("%d", &r);
printf("Columns?\n");
scanf("%d", &c);
int array[r][c];
for (i = 0; i < r; i++)
for (j = 0; j < c; j++)
array[i][j] = rand() % 100 + 1;
return 0;
}
So if I wanted to create a 2D array, I can just declare one and put numbers in the brackets. But here in this code, I am asking the user how many rows and columns they would like, then declaring an array with those variables, I then filled up the rows and columns with random integers.
So my question is: Why don't I have to use something like malloc here? My code doesn't know how many rows and columns I am going to put in at run time, so why do I have access to that array with my current code?
So my question is: why don't I have to use something like malloc here? My code doesn't know how many rows and columns I am going to put in at run time, so why do I have access to that array with my current code?
You are using a C feature called "variable-length arrays". It was introduced in C99 as a mandatory feature, but support for it is optional in C11 and C18. This alternative to dynamic allocation carries several limitations with it, among them:
because the feature is optional, code that unconditionally relies on it is not portable to implementations that do not support the feature
implementations that support VLAs typically store local VLAs on the stack, which is prone to producing stack overflows if at runtime the array dimension is large. (Dynamically-allocated space is usually much less sensitive to such issues. Large, fixed-size automatic arrays can be an issue too, but the potential for trouble with these is obvious in the source code, and it is less likely to evade detection during testing.)
the program still needs to know the dimensions of your array before its declaration, and the dimensions at the point of the declaration are fixed for the lifetime of the array. Unlike dynamically-allocated space, VLAs cannot be resized.
there are contexts that accommodate ordinary, fixed length arrays, but not VLAs, such as file-scope variables.
Your array is allocated on the stack, so when the function (in your case, main()) exits the array vanishes into the air. Had you allocated it with malloc() the memory would be allocated on the heap, and would stay allocated forever (until you free() it). The size of the array IS known at run time (but not at compile time).
In your program, the array is allocated with automatic storage, aka on the stack, it will be released automatically when leaving the scope of definition, which is the body of the function main. This method, passing a variable expression as the size of an array in a definition, introduced in C99, is known as variable length array or VLA.
If the size is too large, or is not positive, the definition will have undefined behavior, for example causing a stack overflow.
To avoid such potential side effects, you could check the values of the dimensions and use malloc or calloc:
#include <stdio.h>
#include <stdlib.h>
int main() {
int r, c, i, j;
printf("Rows?\n");
if (scanf("%d", &r) != 1)
return 1;
printf("Columns?\n");
if (scanf("%d", &c) != 1)
return 1;
if (r <= 0 || c <= 0) {
printf("invalid matrix size: %dx%d\n", r, c);
return 1;
}
int (*array)[c] = calloc(r, sizeof(*array));
if (array == NULL) {
printf("cannot allocate memory for %dx%d matrix\n", r, c);
return 1;
}
for (i = 0; i < r; i++) {
for (j = 0; j < c; j++) {
array[i][j] = rand() % 100 + 1;
}
}
free(array);
return 0;
}
Note that int (*array)[c] = calloc(r, sizeof(*array)); also relies on a variable length array type: array is a pointer to arrays of c ints, although the storage itself comes from the heap. sizeof(*array) is sizeof(int[c]), which evaluates at run time to (sizeof(int) * c), so the space allocated for the matrix is sizeof(int) * c * r as expected.
The point of dynamic memory allocation (malloc()) is not that it allows for supplying the size at run time, even though that is also one of its important features. The point of dynamic memory allocation is that the allocation survives the function return.
In object oriented code, you might see functions like this:
Object* makeObject() {
Object* result = malloc(sizeof(*result));
result->someMember = ...;
return result;
}
This creator function allocates memory of a fixed size (sizeof is evaluated at compile time!), initializes it, and returns the allocation to its caller. The caller is free to store the returned pointer wherever it wants, and some time later, another function
void destroyObject(Object* object) {
... //some cleanup
free(object);
}
is called.
This is not possible with automatic allocations: If you did
Object* makeObject() {
Object result;
result.someMember = ...;
return &result; //Wrong! Don't do this!
}
the variable result ceases to exist when the function returns to its caller, and the returned pointer will be dangling. When the caller uses that pointer, your program exhibits undefined behavior, and pink elephants may appear.
Also note that space on the call stack is typically rather limited. You can ask malloc() for a gigabyte of memory, but if you try to allocate the same amount as an automatic array, your program will most likely segfault. That is the second raison d'être for malloc(): to provide a means to allocate large memory objects.
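For a concrete version of the create/destroy pattern, here is a small sketch. The Point type and the functions make_point and destroy_point are invented for illustration, since the original Object is left unspecified.

```c
#include <stdlib.h>

/* Hypothetical type for illustration only. */
typedef struct {
    int x;
    int y;
} Point;

/* Allocates and initializes a Point; returns NULL if malloc fails. */
Point *make_point(int x, int y)
{
    Point *result = malloc(sizeof *result);
    if (result == NULL)
        return NULL;
    result->x = x;
    result->y = y;
    return result;           /* the storage outlives this function */
}

/* Releases a Point obtained from make_point. */
void destroy_point(Point *p)
{
    free(p);                 /* free(NULL) is a no-op, so NULL is accepted */
}
```

A caller would typically write Point *p = make_point(3, 4);, check p against NULL, use it for as long as needed, and finish with destroy_point(p);.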
The classic way of handling a 2D array in 'C' where the dimensions might change is to declare it as a sufficiently sized one dimensional array and then have a routine / macro / calculation that calculates the element number of that 1D array given the specified row, column, element size, and number of columns in that array.
So, let's say you want to calculate the address offset in a table for 'specifiedRow' and 'specifiedCol' and the array elements are of 'tableElemSize' size and the table has 'tableCols' columns. That offset could be calculated as such:
addrOffset = specifiedRow * tableCols * tableElemSize + (specifiedCol * tableElemSize);
You could then add this to the address of the start of the table to get a pointer to the element desired.
This assumes that you address the table as an array of bytes. If you index through a pointer to the element type (an int *, say) instead, pointer arithmetic already scales by the element size, so 'tableElemSize' is not needed. It depends upon how you want to lay it out in memory.
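The offset calculation can be sketched as below, assuming a row-major layout. The function names flat_index, byte_offset and get_elem are made up for this example.

```c
#include <stddef.h>

/* Element index of (row, col) in a row-major table with `cols` columns. */
size_t flat_index(size_t row, size_t col, size_t cols)
{
    return row * cols + col;
}

/* Byte offset of the same element when each element is elem_size bytes. */
size_t byte_offset(size_t row, size_t col, size_t cols, size_t elem_size)
{
    return flat_index(row, col, cols) * elem_size;
}

/* With a typed pointer, indexing scales by sizeof(int) automatically,
   so no element size appears here. */
int get_elem(const int *table, size_t row, size_t col, size_t cols)
{
    return table[flat_index(row, col, cols)];
}
```

For a table with 4 columns, row 2 and column 3 map to element 11 of the 1D array.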
I do not think that the way that you are doing it is something that is going to be portable across a lot of compilers and would suggest against it. If you need a two dimensional array where the dimensions can be dynamically changed, you might want to consider something like the MATRIX 'object' that I posted in a previous thread.
How I can merge two 2D arrays according to row in c++
Another solution would be dynamically allocated array of dynamically allocated arrays. This takes up a bit more memory than a 2D array that is allocated at compile time and the elements in the array are not contiguous (which might matter for some endeavors), but it will still give you the 'x[i][j]' type of notation that you would normally get with a 2D array defined at compile time. For example, the following code creates a 2D array of integers (error checking left out to make it more readable):
int **x;
int i, j;
int count;
int rows, cols;
rows = /* read a value from user or file */
cols = /* read a value from user or file */
x = calloc(rows, sizeof(int *));
for (i = 0; i < rows; i++)
x[i] = calloc(cols, sizeof(int));
/* Initialize the 2D array */
count = 0;
for (i = 0; i < rows; i++) {
for (j = 0; j < cols; j++) {
count++;
x[i][j] = count;
}
}
One thing that you need to remember here is that because we are using an array of arrays, we cannot guarantee that each of the arrays is going to be in the next block of memory, especially if other allocations and frees have happened in the meantime (as might happen if your code is multithreaded). Even without that though, the memory is not going to be contiguous from one array to the next array (although the elements within each array will be). There is overhead associated with the memory allocation and that shows up if you look at the address of the 2D array and the 1D arrays that make up the rows. You can see this by printing out the address of the 2D array and each of the 1D arrays like this:
printf("Main Array: %p\n", (void *) x);
for (i = 0; i < rows; i++)
printf(" %p [%04ld]\n", (void *) x[i], (long) ((intptr_t) x[i] - (intptr_t) x)); /* intptr_t needs <stdint.h> */
When I tested this with a 2D array with 4 columns, I found that each row took up 24 bytes even though it only needs 16 bytes for the 4 integers in the columns.
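Since the example above leaves out error checking and cleanup, here is one possible way to wrap the same array-of-arrays approach with both. The helper names alloc_matrix and free_matrix are hypothetical.

```c
#include <stdlib.h>

/* Hypothetical helper: frees the first `rows` row arrays,
   then the array of row pointers itself. */
void free_matrix(int **x, int rows)
{
    if (x == NULL)
        return;
    for (int i = 0; i < rows; i++)
        free(x[i]);
    free(x);
}

/* Hypothetical helper: allocates a zero-filled rows x cols matrix,
   or returns NULL if the sizes are invalid or memory runs out. */
int **alloc_matrix(int rows, int cols)
{
    if (rows <= 0 || cols <= 0)
        return NULL;

    int **x = calloc((size_t)rows, sizeof *x);
    if (x == NULL)
        return NULL;

    for (int i = 0; i < rows; i++) {
        x[i] = calloc((size_t)cols, sizeof *x[i]);
        if (x[i] == NULL) {
            free_matrix(x, i);   /* release only the rows allocated so far */
            return NULL;
        }
    }
    return x;
}
```

Every calloc needs a matching free, so the row count has to be remembered until free_matrix is called.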
I want to create a variable length array for my code in the Visual Studio 2010 environment.
I tried code that uses an array of length x, where x is passed in by the caller, but I get these errors:
"error C2466:cannot allocate an array of constant size 0" ,"error C2133: 'v_X_array' : unknown size".
func1(int x)
{
int v_X_array[x];
int i;
for (i=0; i<x; i++)
{
v_X_array[i] = i;
}
}
I expect the answer as v_X_array[0] = 0, v_X_array[1] =1, v_X_array[2]=2 ... v_X_array[10]=10 ; for x = 10;
How can I do this?
Note: as calloc and malloc should not be used.
If you need your code to be portable, you cannot use that kind of array definition to handle memory areas.
Without going into specific implementations, you have two generic approaches that you can use:
Define an array big enough for the worst case. This is tightly dependent on the application, so you are on your own.
Define the "array" using dynamic allocation. With that, you can define memory areas of any arbitrary size.
If you choose option 2:
a. Do not forget to de-allocate the memory when you no longer need it.
b. To avoid frequent allocation and de-allocation, you may define the buffer once (perhaps bigger than necessary for the current call) and use it several times. You may end up with the same result as option 1 above: define a large array from the start.
Since you should not use dynamic allocation ("calloc and malloc should not be used"), then you are left with option 1.
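Option 1 could look roughly like this. It is only a sketch: the bound MAX_X is an assumed worst case, and the return value was added here so the result can be checked; neither comes from the original question. Declarations are kept at the top of the block so the code also compiles as C89.

```c
#define MAX_X 64   /* assumed worst case for this sketch */

/* Fills v_X_array[0..x-1] with 0..x-1 using a fixed-size local buffer.
   Returns the last value stored, or -1 if x is outside 1..MAX_X. */
int func1(int x)
{
    int v_X_array[MAX_X];    /* constant size: no VLA and no malloc needed */
    int i;

    if (x <= 0 || x > MAX_X)
        return -1;           /* reject sizes the buffer cannot hold */

    for (i = 0; i < x; i++)
    {
        v_X_array[i] = i;
    }
    return v_X_array[x - 1];
}
```

The range check is what makes the fixed buffer safe; without it, a large x would write past the end of the array.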
I expect the ans as v_X_array[0] = 0, v_X_array[1] =1, v_X_array[2]=2 ... v_X_array[10]=10 ; for x = 10;
You expect to store 11 values in an array which can hold only 10?
You can't allocate an array of an unknown size.
So you need to allocate it dynamically "at run-time".
You can make this allocation using "new" in C++ or "malloc" in C.
For example:
In C++ if you want to allocate an array of an unknown size you should do the following:
int* v_X_array = new int[x];
int i;
for (i=0; i<x; i++)
{
v_X_array[i] = i;
}
The reason that we use integer pointer is that "new" returns the base address of the array "the address of the first element", so the only thing that can store addresses is pointers.
In C if you want to allocate an array of an unknown size you should do the following:
int* v_X_array = (int*) malloc(x*sizeof(int));
int i;
for(i=0; i<x; i++)
{
v_X_array[i] = i;
}
The malloc function takes a single argument which specifies the number of bytes to be allocated and returns a void pointer. In C that pointer converts implicitly to int*, so the (int*) cast is optional; it is required only if the code is compiled as C++.
For more explanations, look at the next section:
If we need to allocate an array of 20 integers it could be as follow: "malloc(20*sizeof(int))" where 20 is the number of allocated elements and sizeof(int) is the size of the type you want to allocate. If successful it returns a pointer to memory allocated. If it fails, it returns a null pointer.
I wrote the following C code snippet:
#include <stdio.h>
void function(int size){
int array[size];
for(int i = 0; i < size; i++){
printf("%d ", array[i]);
}
}
int main(){
int array_size;
scanf("%d",&array_size);
function(array_size);
return 0;
}
Why is it possible to generate an array of dynamic size this way? Normally I would use malloc, but this works as well. Why is it allowed to use the non-constant variable size for the size of an array?
This is what is known as a "Variable-length array".
Variable-length automatic arrays have been supported since C99.
They are declared like a normal array, but the length is not constant, and storage is allocated at the point of declaration.
More on this can be found at gcc.gnu.org
Why it is possible to generate an array of dynamic size this way ?
No. This
int array[size]; /* this doesn't get stored in heap section */
where size is a run-time integer value, is not a dynamic array; it is called a variable length array (VLA), and it was introduced in C99. A dynamic array is created only by calling malloc() or calloc(), which obtain memory from the heap section of primary memory.
Why it is allowed to use the non constant variable size for the size of an array?
Yes, from C99 onwards a VLA can have a non-constant variable as its size. But unlike a dynamic array (which can be resized with realloc()), a VLA cannot be resized once it is declared.
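To show the resizing that a malloc'ed array allows and a VLA does not, here is a small sketch built on realloc(). The helper name resize_ints is made up for this example.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: resizes *arr to hold new_count ints.
   Returns 1 on success and 0 on failure; on failure *arr is left
   untouched and still valid. */
int resize_ints(int **arr, size_t new_count)
{
    if (new_count == 0 || new_count > SIZE_MAX / sizeof **arr)
        return 0;                    /* reject empty or overflowing sizes */

    int *tmp = realloc(*arr, new_count * sizeof **arr);
    if (tmp == NULL)
        return 0;                    /* the old block is still allocated */

    *arr = tmp;                      /* existing elements are preserved */
    return 1;
}
```

Assigning the result of realloc() to a temporary first avoids losing the original pointer when the call fails.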
I have a function that needs external parameters and afterwards creates variables that are heavily used inside that function. E.g. the code could look like this:
void abc(const int dim);
void abc(const int dim) {
double arr[dim] = { 0.0 };
for (int i = 0; i != dim; ++i)
arr[i] = i;
// heavy usage of the arr
}
int main() {
const int par = 5;
abc(par);
return 0;
}
But I am getting a compiler error, because the allocation on the stack needs compile-time constants. When I tried allocating manually on the stack with _malloca, the time performance of the code worsened (compared to the case when I declare the constant par inside the abc() function). And I don't want the array arr to be on the heap, because it is supposed to contain only small amount of values and it is going to get used quite often inside the function. Is there some way to combine the efficiency while keeping the possibility to pass the size parameter of an array to the function?
EDIT: I am using MSVC compiler and I received an error C2131: expression did not evaluate to a constant in VC 2017.
If you're using a modern C compiler that implements all of C99, or C11 with the optional variable-length array feature, this would work, with one little modification:
void abc(const int dim);
void abc(const int dim) {
double arr[dim];
for (int i = 0; i != dim; ++i)
arr[i] = i;
// heavy usage of the arr
}
int main(void) {
const int par = 5;
abc(par);
return 0;
}
I.e. double arr[dim] would work - it doesn't have a compile-time constant size, but it is enough to know its size at runtime. However, such a VLA cannot be initialized.
Unfortunately MSVC is not a modern C compiler, and at MS they don't want to implement VLAs themselves (I even suspect they're a big part of why VLAs were made optional in C11), so you'd need to define the array in main and then pass a pointer to it to the function abc; or, if the size is globally constant, use an actual compile-time constant, i.e. a #define.
However, you're not showing the actual code that you're having performance problems with. It might very well be that the compiler can produce optimized output if it knows the number of iterations - if that is true, then the "globally defined size" might be the only way to get excellent performance.
Unfortunately the Microsoft Compiler does not support variable length arrays.
If the array is not too large you could allocate by the largest possible size needed and pass a pointer to that stack array and a dimension to the function. This approach could help limit the number of allocations.
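A sketch of the "largest possible size plus a pointer and a dimension" idea follows. MAX_DIM and the helper abc_fill are assumptions made for illustration, and abc returns a sum here only so that the result can be checked.

```c
#define MAX_DIM 16   /* assumed upper bound for this sketch */

/* Works on a caller-provided buffer of at least dim doubles;
   returns the sum of the values written. */
static double abc_fill(double *arr, int dim)
{
    double sum = 0.0;
    for (int i = 0; i != dim; ++i) {
        arr[i] = i;
        sum += arr[i];
    }
    return sum;
}

/* Hypothetical wrapper: owns a fixed-size stack buffer and checks dim.
   Returns -1.0 if dim is outside 0..MAX_DIM. */
double abc(int dim)
{
    double arr[MAX_DIM] = { 0.0 };   /* constant size, so an initializer is allowed */

    if (dim < 0 || dim > MAX_DIM)
        return -1.0;
    return abc_fill(arr, dim);
}
```

Because MAX_DIM is a compile-time constant, this should compile with MSVC as well, at the cost of always reserving the worst-case amount of stack.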
Another option is to implement a simple heap allocated global pool for functions of this type to use. The pool would allocate a large contiguous chunk on the heap and then you can get a pointer to your reservation in the pool. The benefit of this approach is you will not have to worry about over allocation on the stack causing a segmentation fault (which can happen with variable length arrays).
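A very small pool of that kind could be sketched as follows. Everything here (the DoublePool type and the pool_* functions) is hypothetical, and the pool is passed around as a struct rather than stored in a global, purely to keep the example self-contained.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical bump-style pool: one heap block handed out in slices. */
typedef struct {
    double *base;      /* start of the heap block */
    size_t capacity;   /* total number of doubles in the block */
    size_t used;       /* number of doubles handed out so far */
} DoublePool;

/* Allocates the block once; returns 1 on success, 0 on failure. */
int pool_init(DoublePool *p, size_t capacity)
{
    p->base = malloc(capacity * sizeof *p->base);
    p->capacity = (p->base != NULL) ? capacity : 0;
    p->used = 0;
    return p->base != NULL;
}

/* Returns a slice of count doubles, or NULL if the pool is exhausted. */
double *pool_take(DoublePool *p, size_t count)
{
    if (count > p->capacity - p->used)
        return NULL;
    double *slice = p->base + p->used;
    p->used += count;
    return slice;
}

/* Makes the whole block available again; earlier slices become invalid. */
void pool_reset(DoublePool *p)
{
    p->used = 0;
}

/* Returns the block to the heap. */
void pool_destroy(DoublePool *p)
{
    free(p->base);
    p->base = NULL;
    p->capacity = 0;
    p->used = 0;
}
```

A function like abc could call pool_take at entry and the caller could call pool_reset afterwards, so the only malloc happens once at start-up.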