Having a little trouble understanding memory allocation in C - c

So I am learning how to program in C, and am starting to learn about dynamic memory allocation. What I know is that your program will not always know at compile time how much memory it needs.
I have this code:
#include <stdio.h>
#include <stdlib.h>
int main() {
int r, c, i, j;
printf("Rows?\n");
scanf("%d", &r);
printf("Columns?\n");
scanf("%d", &c);
int array[r][c];
for (i = 0; i < r; i++)
for (j = 0; j < c; j++)
array[i][j] = rand() % 100 + 1;
return 0;
}
So if I wanted to create a 2D array, I can just declare one and put numbers in the brackets. But here in this code, I am asking the user how many rows and columns they would like, then declaring an array with those variables, I then filled up the rows and columns with random integers.
So my question is: Why don't I have to use something like malloc here? My code doesn't know how many rows and columns I am going to put in at run time, so why do I have access to that array with my current code?

So my question is: why don't I have to use something like malloc here?
My code doesn't know how many rows and columns I am going to put in at
run time, so why do I have access to that array with my current code?
You are using a C feature called "variable-length arrays". It was introduced in C99 as a mandatory feature, but support for it is optional in C11 and C18. This alternative to dynamic allocation carries several limitations with it, among them:
because the feature is optional, code that unconditionally relies on it is not portable to implementations that do not support the feature
implementations that support VLAs typically store local VLAs on the stack, which is prone to producing stack overflows if at runtime the array dimension is large. (Dynamically-allocated space is usually much less sensitive to such issues. Large, fixed-size automatic arrays can be an issue too, but the potential for trouble with these is obvious in the source code, and it is less likely to evade detection during testing.)
the program still needs to know the dimensions of your array before its declaration, and the dimensions at the point of the declaration are fixed for the lifetime of the array. Unlike dynamically-allocated space, VLAs cannot be resized.
there are contexts that accommodate ordinary, fixed length arrays, but not VLAs, such as file-scope variables.
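One way to hedge against the first two limitations is to use a VLA only when the implementation supports it and the requested size is small, and to fall back to malloc otherwise. The following is a minimal sketch, not a definitive recipe; the names sum_first_n and fill_and_sum and the MAX_STACK_ELEMS threshold are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical limit on how many ints we are willing to put on the stack. */
#define MAX_STACK_ELEMS 1024

/* Helper: stores 0..n-1 in buf and returns the sum of the elements. */
static long fill_and_sum(int *buf, size_t n)
{
    long total = 0;
    assert(buf != NULL);
    for (size_t i = 0; i < n; i++) {
        buf[i] = (int)i;
        total += buf[i];
    }
    return total;
}

/* Returns 0 + 1 + ... + (n - 1), or -1 if memory could not be allocated. */
long sum_first_n(size_t n)
{
    if (n == 0)
        return 0;               /* a zero-length VLA would be undefined behavior */
#ifndef __STDC_NO_VLA__
    if (n <= MAX_STACK_ELEMS) {
        int buf[n];             /* small and supported: automatic storage */
        return fill_and_sum(buf, n);
    }
#endif
    int *buf = malloc(n * sizeof *buf);   /* large, or no VLA support: heap */
    if (buf == NULL)
        return -1;
    long total = fill_and_sum(buf, n);
    free(buf);
    return total;
}
```

With this shape, a call such as sum_first_n(5) takes the VLA path where VLAs exist, while sum_first_n(5000) always goes through malloc.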

Your array is allocated on the stack, so when the function (in your case, main()) exits the array vanishes into the air. Had you allocated it with malloc() the memory would be allocated on the heap, and would stay allocated forever (until you free() it). The size of the array IS known at run time (but not at compile time).

In your program, the array is allocated with automatic storage, aka on the stack. It will be released automatically when leaving the scope of definition, which is the body of the function main. This method, passing a variable expression as the size of an array in a definition, was introduced in C99 and is known as a variable length array or VLA.
If the size is too large, zero, or negative, the definition will have undefined behavior, for example causing a stack overflow.
To avoid such potential side effects, you could check the values of the dimensions and use malloc or calloc:
#include <stdio.h>
#include <stdlib.h>
int main() {
int r, c, i, j;
printf("Rows?\n");
if (scanf("%d", &r) != 1)
return 1;
printf("Columns?\n");
if (scanf("%d", &c) != 1)
return 1;
if (r <= 0 || c <= 0) {
printf("invalid matrix size: %dx%d\n", r, c);
return 1;
}
int (*array)[c] = calloc(r, sizeof(*array));
if (array == NULL) {
printf("cannot allocate memory for %dx%d matrix\n", r, c);
return 1;
}
for (i = 0; i < r; i++) {
for (j = 0; j < c; j++) {
array[i][j] = rand() % 100 + 1;
}
}
free(array);
return 0;
}
Note that int (*array)[c] = calloc(r, sizeof(*array)); also relies on variable length array types: array is a pointer to arrays of c ints. Only the pointer itself has automatic storage; the matrix it points to lives on the heap. sizeof(*array) is sizeof(int[c]), which evaluates at run time to (sizeof(int) * c), so the space allocated for the matrix is sizeof(int) * c * r as expected.
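To see that sizeof really is evaluated at run time for such a type, here is a small sketch. The function name row_bytes is made up, and the code assumes a compiler with VLA support.

```c
#include <assert.h>
#include <stddef.h>

/* Size in bytes of one row of c ints, computed at run time. */
size_t row_bytes(size_t c)
{
    assert(c > 0);              /* an array type needs a positive length */
    return sizeof(int[c]);      /* the operand is a VLA type, so this is not a constant */
}
```

row_bytes(4) equals 4 * sizeof(int), so calloc(r, row_bytes(c)) would request the same amount of memory as the calloc call in the program above.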

The point of dynamic memory allocation (malloc()) is not that it allows for supplying the size at run time, even though that is also one of its important features. The point of dynamic memory allocation is that the allocation survives the function return.
In object oriented code, you might see functions like this:
Object* makeObject() {
Object* result = malloc(sizeof(*result));
result->someMember = ...;
return result;
}
This creator function allocates memory of a fixed size (sizeof is evaluated at compile time!), initializes it, and returns the allocation to its caller. The caller is free to store the returned pointer wherever it wants, and some time later, another function
void destroyObject(Object* object) {
... //some cleanup
free(object);
}
is called.
This is not possible with automatic allocations: If you did
Object* makeObject() {
Object result;
result.someMember = ...;
return &result; //Wrong! Don't do this!
}
the variable result ceases to exist when the function returns to its caller, and the returned pointer will be dangling. When the caller uses that pointer, your program exhibits undefined behavior, and pink elephants may appear.
Also note that space on the call stack is typically rather limited. You can ask malloc() for a gigabyte of memory, but if you try to allocate the same amount as an automatic array, your program will most likely segfault. That is the second raison d'être for malloc(): to provide a means to allocate large memory objects.
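For reference, the creator and destroyer pattern described above can be made concrete as in the sketch below. The struct layout, the value parameter, and the getMember accessor are assumptions added for illustration; only the names Object, someMember, makeObject, and destroyObject come from the snippets above.

```c
#include <assert.h>
#include <stdlib.h>

/* Illustrative type; the member name follows the snippets above. */
typedef struct Object {
    int someMember;
} Object;

/* Allocates an Object on the heap; returns NULL if malloc fails. */
Object *makeObject(int value)
{
    Object *result = malloc(sizeof *result);
    if (result == NULL)
        return NULL;
    result->someMember = value;
    return result;              /* the allocation outlives this call */
}

/* Hypothetical accessor, only here so the stored value can be read back. */
int getMember(const Object *object)
{
    assert(object != NULL);
    return object->someMember;
}

/* Releases an Object obtained from makeObject; passing NULL is harmless. */
void destroyObject(Object *object)
{
    free(object);
}
```

The caller may keep the pointer returned by makeObject for as long as it likes, and must eventually hand it to destroyObject exactly once.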

The classic way of handling a 2D array in 'C' where the dimensions might change is to declare it as a sufficiently sized one dimensional array and then have a routine / macro / calculation that calculates the element number of that 1D array given the specified row, column, element size, and number of columns in that array.
So, let's say you want to calculate the address offset in a table for 'specifiedRow' and 'specifiedCol' and the array elements are of 'tableElemSize' size and the table has 'tableCols' columns. That offset could be calculated as such:
addrOffset = specifiedRow * tableCols * tableElemSize + (specifiedCol * tableElemSize);
You could then add this to the address of the start of the table to get a pointer to the element desired.
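This assumes that the table is addressed through a byte pointer (char * or unsigned char *), so every offset must be scaled by hand. If the pointer is instead typed to the element (an int *, or a pointer to some structure), pointer arithmetic already scales by the element size and 'tableElemSize' is not needed. It depends upon how you want to lay it out in memory.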
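The offset calculation can be sketched both ways: with a byte pointer, where the element size is multiplied in by hand, and with a typed pointer, where pointer arithmetic does the scaling. The function names below are made up for illustration; the parameter names follow the text above.

```c
#include <assert.h>
#include <stddef.h>

/* Byte-based addressing: every term is scaled by the element size. */
void *elem_addr_bytes(void *table, size_t tableCols, size_t tableElemSize,
                      size_t specifiedRow, size_t specifiedCol)
{
    assert(specifiedCol < tableCols);
    size_t addrOffset = specifiedRow * tableCols * tableElemSize
                      + specifiedCol * tableElemSize;
    return (unsigned char *)table + addrOffset;
}

/* Typed addressing: int * arithmetic already counts in units of sizeof(int). */
int *elem_addr_int(int *table, size_t tableCols,
                   size_t specifiedRow, size_t specifiedCol)
{
    assert(specifiedCol < tableCols);
    return table + specifiedRow * tableCols + specifiedCol;
}
```

For a table of ints, both functions yield the same address for the same row and column.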
I do not think that the way that you are doing it is something that is going to be portable across a lot of compilers and would suggest against it. If you need a two dimensional array where the dimensions can be dynamically changed, you might want to consider something like the MATRIX 'object' that I posted in a previous thread.
How I can merge two 2D arrays according to row in c++
Another solution would be dynamically allocated array of dynamically allocated arrays. This takes up a bit more memory than a 2D array that is allocated at compile time and the elements in the array are not contiguous (which might matter for some endeavors), but it will still give you the 'x[i][j]' type of notation that you would normally get with a 2D array defined at compile time. For example, the following code creates a 2D array of integers (error checking left out to make it more readable):
int **x;
int i, j;
int count;
int rows, cols;
rows = /* read a value from user or file */
cols = /* read a value from user or file */
x = calloc(rows, sizeof(int *));
for (i = 0; i < rows; i++)
x[i] = calloc(cols, sizeof(int));
/* Initialize the 2D array */
count = 0;
for (i = 0; i < rows; i++) {
for (j = 0; j < cols; j++) {
count++;
x[i][j] = count;
}
}
One thing that you need to remember here is that because we are using an array of arrays, we cannot guarantee that each of the arrays is going to be in the next block of memory, especially if other allocations and frees have happened in the meantime (as can occur when other threads are allocating memory concurrently). Even without that though, the memory is not going to be contiguous from one array to the next array (although the elements within each array will be). There is overhead associated with each memory allocation, and that shows up if you look at the address of the 2D array and the 1D arrays that make up the rows. You can see this by printing out the address of the 2D array and each of the 1D arrays like this:
/* needs <stdint.h> for intptr_t */
printf("Main Array: %p\n", (void *) x);
for (i = 0; i < rows; i++)
printf(" %p [%04ld]\n", (void *) x[i], (long) ((intptr_t) x[i] - (intptr_t) x));
When I tested this with a 2D array with 4 columns, I found that each row took up 24 bytes even though it only needs 16 bytes for the 4 integers in the columns.
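Since the snippet above leaves out error checking and never frees anything, here is a hedged sketch of the same array-of-arrays layout with both added. The names alloc_rows and free_rows are made up for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Frees the first `rows` row arrays and then the array of row pointers. */
void free_rows(int **x, size_t rows)
{
    if (x == NULL)
        return;
    for (size_t i = 0; i < rows; i++)
        free(x[i]);
    free(x);
}

/* Allocates rows x cols zero-initialized ints; returns NULL on failure. */
int **alloc_rows(size_t rows, size_t cols)
{
    assert(rows > 0 && cols > 0);
    int **x = calloc(rows, sizeof *x);
    if (x == NULL)
        return NULL;
    for (size_t i = 0; i < rows; i++) {
        x[i] = calloc(cols, sizeof *x[i]);
        if (x[i] == NULL) {
            free_rows(x, i);    /* release the rows allocated so far */
            return NULL;
        }
    }
    return x;
}
```

Every successful alloc_rows(rows, cols) call should be paired with one free_rows call using the same row count.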

Related

How do I create a variable length array in C in the Visual Studio 2010 environment?

I want to create a variable length array for my code in the Visual Studio 2010 environment.
I tried the code below, using an array of length x, where x is passed in by the caller. But I am getting these errors:
"error C2466:cannot allocate an array of constant size 0" ,"error C2133: 'v_X_array' : unknown size".
void func1(int x)
{
int v_X_array[x];
int i;
for (i=0; i<x; i++)
{
v_X_array[i] = i;
}
}
I expect the answer as v_X_array[0] = 0, v_X_array[1] =1, v_X_array[2]=2 ... v_X_array[10]=10 ; for x = 10;
How can I do this?
Note: calloc and malloc should not be used.
If you need your code to be portable, you cannot use that kind of array definition to handle memory areas.
Without going into specific implementations, you have two generic approaches that you can use:
Define an array big enough for the worst case. This is tightly dependent on the application, so you are on your own.
Define the "array" using dynamic allocation. With that, you can define memory areas of any arbitrary size.
If you choose option 2:
a. Do not forget to de-allocate the memory when you no longer need it.
b. To avoid frequent allocation and de-allocation, you may define the buffer once (perhaps bigger than necessary for the current call) and use it several times. You may end up with the same result as option 1 above - define a large array from the start.
Since you should not use dynamic allocation ("calloc and malloc should not be used"), then you are left with option 1.
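Option 1 could look like the sketch below. MAX_X is a made-up worst-case bound, and the function returns a sum only so that its effect is observable; neither is part of the original question. Declarations are kept at the top of the block because Visual Studio 2010 compiles C as C89.

```c
#include <assert.h>

#define MAX_X 64   /* hypothetical worst-case size; pick one that fits the application */

/* Fills v_X_array[0..x-1] with 0..x-1 and returns the sum, or -1 if x is out of range. */
int func1(int x)
{
    int v_X_array[MAX_X];   /* fixed size, so neither a VLA nor malloc is needed */
    int i;
    int sum = 0;

    if (x < 0 || x > MAX_X)
        return -1;
    for (i = 0; i < x; i++) {
        v_X_array[i] = i;
        sum += v_X_array[i];
    }
    assert(x == 0 || v_X_array[x - 1] == x - 1);
    return sum;
}
```

The price of this approach is that MAX_X ints are reserved on every call, whatever x is, and that any x above MAX_X has to be rejected.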
I expect the ans as v_X_array[0] = 0, v_X_array[1] =1, v_X_array[2]=2 ... v_X_array[10]=10 ; for x = 10;
You expect to store 11 values in an array which can hold only 10?
You can't allocate an array of an unknown size.
So you need to allocate it dynamically "at run-time".
You can make this allocation using "new" in C++ or "malloc" in C.
For example:
In C++ if you want to allocate an array of an unknown size you should do the following:
int* v_X_array = new int[x];
int i;
for (i=0; i<x; i++)
{
v_X_array[i] = i;
}
The reason that we use integer pointer is that "new" returns the base address of the array "the address of the first element", so the only thing that can store addresses is pointers.
In C if you want to allocate an array of an unknown size you should do the following:
int* v_X_array = (int*) malloc(x*sizeof(int));
int i;
for(i=0; i<x; i++)
{
v_X_array[i] = i;
}
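The malloc function takes a single argument which specifies the number of bytes to be allocated and returns a void pointer. In C a void pointer converts implicitly to any object pointer type, so the (int*) cast is optional; it is required only if the code is compiled as C++.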
For more explanations, look at the next section:
If we need to allocate an array of 20 integers it could be as follow: "malloc(20*sizeof(int))" where 20 is the number of allocated elements and sizeof(int) is the size of the type you want to allocate. If successful it returns a pointer to memory allocated. If it fails, it returns a null pointer.
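Putting the pieces together, a sketch with the null check and a clear ownership rule might look like this. The name make_sequence is made up, and the cast is left out because C does not need it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Returns a heap array holding 0..x-1, or NULL if the allocation fails.
   The caller owns the result and must free() it. */
int *make_sequence(size_t x)
{
    int *v_X_array;
    size_t i;

    assert(x > 0);
    v_X_array = malloc(x * sizeof *v_X_array);
    if (v_X_array == NULL)
        return NULL;
    for (i = 0; i < x; i++)
        v_X_array[i] = (int)i;
    return v_X_array;
}
```

A caller would check the result against NULL before using it and call free on it when done.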

Allocating 2D array of dimensions read from file

I would like to read 2 numbers n,m from text file and then allocate a 2D array with n rows and m columns.
Also, I would like to initialise the array in my main function in order to use it later in other functions, and do the reading and allocating in a different function, which I will call from the main function.
I know how to handle the reading, but I'm struggling with the array allocation.
I've read quite a few answers to similar questions here, but they didn't help me.
I've written the following code, but I'm not sure how to continue with it to get the desired result:
void func(int** array, int* rows, int* cols){
int n, m;
FILE *file;
fp = fopen("test.txt", "r");
if (file) {
/* reading numbers n and m */
*rows = n;
*cols = m;
**array = (int*)malloc(n * m * sizeof(int));
fclose(file);
}
}
int main() {
int rows, cols;
int** array;
func(&array, &rows, &cols);
return 0;
}
I thought perhaps I should first allocate a 2D array with calloc and then use realloc after reading n,m, but not sure if that's the best practise.
What is the best practise to allocate a 2D array based on dimensions I read from text file?
First the biggest goofs here:
Your function doesn't have any types in the function signature -- this should be rejected by the compiler
a 2D array is not the same as an array of pointers
what should && mean? & is the address of something, its result can't have an address because it isn't stored anywhere, so this doesn't make sense
If you want to dynamically allocate a real 2D array, you need to either have the second dimension fixed or use VLAs (which are optional in C11, but assuming support is quite safe) with a variable. Something like this:
// dimensions in `x` and `y`, should be of type `size_t`
int (*arr)[x] = malloc(y * sizeof *arr);
In any case, the second dimension is part of the type, so your structure won't work -- the calling code has to know this second dimension for passing a valid pointer.
Hint: This first part doesn't apply to the question any more, OP forgot to mention he's interested in C90 only. I added the appropriate tag, but leave the upper part of the answer for reference. The following applies to C90 as well:
You write int ** in your code, this would be a pointer to a pointer. You can create something that can be used like a 2D array by using a pointer to a pointer, but then, you can't allocate it as a single chunk.
The outer pointer will point to an array of pointers (say, the "row-pointers"), so for each of these pointers, you have to allocate an array of the actual values. This could look like the following:
// dimensions again `x` and `y`
int **arr = malloc(y * sizeof *arr);
for (size_t i = 0; i < y; ++i)
{
arr[i] = malloc(x * sizeof **arr);
}
A note on both snippets: these are minimal examples. For real code, you have to check the return value of malloc() each time. It could return a null pointer on failure.
If you want to have a contiguous block of memory in the absence of VLAs, there's finally the option to just use a regular array and calculate indices yourself, something like:
int *arr = malloc(x * y * sizeof *arr);
// access arr[8][15] when x is the second dimension:
arr[x*8 + 15] = 24;
This will generate (roughly) the same executable code as a real 2D array, but of course doesn't look that nice in your source.
Note this is not much more than a direct answer to your immediate question. Your code contains more goofs. You should really enable a sensible set of compiler warnings (e.g. with gcc or clang, use -Wall -Wextra -pedantic -std=c11 flags) and then fix each and every warning you get when you move on with your project.
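For the layout in the question, where the allocation happens in a helper and the result comes back through an out parameter, a hedged sketch using the flat-array approach could look as follows. The name alloc_matrix and its 0 / -1 return convention are assumptions; reading n and m from the file is left to the caller. Declarations come first so the code stays within C90.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Allocates rows * cols zero-initialized ints as one contiguous block.
   On success stores the block in *array and returns 0; on failure returns -1
   and leaves *array untouched. */
int alloc_matrix(int **array, size_t rows, size_t cols)
{
    int *block;

    assert(array != NULL);
    if (rows == 0 || cols == 0 || rows > (size_t)-1 / cols)
        return -1;              /* empty matrix, or rows * cols would overflow */
    block = calloc(rows * cols, sizeof *block);
    if (block == NULL)
        return -1;
    *array = block;
    return 0;
}
```

In main this would be used with a plain int *array: call alloc_matrix(&array, n, m), access element (i, j) as array[m * i + j], and free(array) at the end.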

Passing parameters to a function to efficiently create array allocated on the stack

I have a function that needs external parameters and afterwards creates variables that are heavily used inside that function. E.g. the code could look like this:
void abc(const int dim);
void abc(const int dim) {
double arr[dim] = { 0.0 };
for (int i = 0; i != dim; ++i)
arr[i] = i;
// heavy usage of the arr
}
int main() {
const int par = 5;
abc(par);
return 0;
}
But I am getting a compiler error, because the allocation on the stack needs compile-time constants. When I tried allocating manually on the stack with _malloca, the time performance of the code worsened (compared to the case when I declare the constant par inside the abc() function). And I don't want the array arr to be on the heap, because it is supposed to contain only small amount of values and it is going to get used quite often inside the function. Is there some way to combine the efficiency while keeping the possibility to pass the size parameter of an array to the function?
EDIT: I am using MSVC compiler and I received an error C2131: expression did not evaluate to a constant in VC 2017.
If you're using a modern C compiler, that implements the entire C99, or the C11 with variable-length array extension, this would work, with one little modification:
void abc(const int dim);
void abc(const int dim) {
double arr[dim];
for (int i = 0; i != dim; ++i)
arr[i] = i;
// heavy usage of the arr
}
int main(void) {
const int par = 5;
abc(par);
return 0;
}
I.e. double arr[dim] would work - it doesn't have a compile-time constant size, but it is enough to know its size at runtime. However, such a VLA cannot be initialized.
Unfortunately, MSVC is not a modern C compiler in this respect: Microsoft has chosen not to implement VLAs, and I even suspect they are a big part of why VLAs were made optional in C11. So you would need to define the array in main and then pass a pointer to it to the function abc; or, if the size is globally constant, use an actual compile-time constant, i.e. a #define.
However, you're not showing the actual code that you're having performance problems with. It might very well be that the compiler can produce optimized output if it knows the number of iterations - if that is true, then the "globally defined size" might be the only way to get excellent performance.
Unfortunately the Microsoft Compiler does not support variable length arrays.
If the array is not too large you could allocate by the largest possible size needed and pass a pointer to that stack array and a dimension to the function. This approach could help limit the number of allocations.
Another option is to implement a simple heap-allocated global pool for functions of this type to use. The pool would allocate a large contiguous chunk on the heap, and you can then get a pointer to your reservation in the pool. The benefit of this approach is that you will not have to worry about over-allocation on the stack causing a segmentation fault (which can happen with variable length arrays).
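The first suggestion (a worst-case array owned by the caller, passed along with the dimension actually used) could be sketched like this. MAX_DIM is a made-up bound, and abc returns a sum only to make the result observable; the summing loop is a stand-in for the real work.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DIM 16   /* hypothetical upper bound on dim */

/* Works on the first dim elements of a caller-provided buffer. */
double abc(double *arr, int dim)
{
    double sum = 0.0;

    assert(arr != NULL && dim >= 0 && dim <= MAX_DIM);
    for (int i = 0; i != dim; ++i)
        arr[i] = i;
    for (int i = 0; i != dim; ++i)   /* stand-in for the "heavy usage" of arr */
        sum += arr[i];
    return sum;
}
```

The caller declares double buf[MAX_DIM]; once, on its own stack frame, and calls abc(buf, par) as often as needed, so no allocation happens inside abc at all.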

Equivalence between Subscript Notation and Pointer Dereferencing

This is more than one question. I need to deal with an NxN matrix A of integers in C. How can I allocate the memory in the heap? Is this correct?
int **A=malloc(N*sizeof(int*));
for(int i=0;i<N;i++) *(A+i)= malloc(N*sizeof(int));
I am not absolutely sure if the second line of the above code should be there to initiate the memory.
Next, suppose I want to access the element A[i, j] where i and j are the row and column indices starting from zero. Is it possible to do it via dereferencing the pointer **A somehow? For example, something like *(A+n*i+j)? I know I have some conceptual gap here and some help will be appreciated.
not absolutely sure if the second line of the above code should be there to initiate the memory.
It needs to be there, as it actually allocates the space for the N rows, each holding the N ints you need.
The 1st allocation only allocates the row-indexing pointers.
to access the element A[i, j] where i and j are the row and column indices starting from zero. Is it possible to do it via dereferencing the pointer **
Sure, just do
A[1][1]
to access the 2nd element of the 2nd row.
This is identical to
*(*(A + 1) + 1)
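A small sketch that checks this equivalence for arbitrary indices; same_element is a made-up helper name.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if A[i][j] and *(*(A + i) + j) designate the same object. */
int same_element(int **A, size_t i, size_t j)
{
    assert(A != NULL && A[i] != NULL);
    return &A[i][j] == *(A + i) + j;
}
```

The function returns 1 for every valid pair of indices, because A[i][j] is defined by the language as *(*(A + i) + j).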
Unrelated to your question:
Although the code you show is correct, a more robust way to code this would be:
int ** A = malloc(N * sizeof *A);
for (size_t i = 0; i < N; i++)
{
A[i] = malloc(N * sizeof *A[i]);
}
size_t is the type of choice for indexing, as it is guaranteed to be large enough to hold any index value possible for the system the code is compiled for.
Also you want to add error checking to the two calls of malloc(), as it might return NULL in case of failure to allocate the amount of memory requested.
The declaration is correct, but the matrix won't occupy contiguous memory. It is an array of pointers, where each pointer can point to whatever location was returned by malloc. For that reason, addressing like *(A+n*i+j) does not make sense.
Assuming that the compiler has support for VLAs (which became optional in C11), the idiomatic way to define a contiguous matrix would be:
int (*matrixA)[N] = malloc(N * sizeof *matrixA);
In general, the syntax of matrix with N rows and M columns is as follows:
int (*matrix)[M] = malloc(N * sizeof *matrix);
Notice that neither M nor N has to be given as a constant expression (thanks to VLA pointers). That is, they can be ordinary (e.g. automatic) variables.
Then, to access elements, you can use ordinary index syntax like:
matrixA[0][0] = 100;
Finally, to release memory for such matrices use a single free, e.g.:
free(matrixA);
free(matrix);
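Combining allocation, indexing, and release into one hedged sketch. It assumes VLA support, and the function name fill_matrix_last is made up for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Allocates an N x M matrix as one block, stores i * M + j in element [i][j],
   and returns the value of the last element, or -1 if allocation fails. */
int fill_matrix_last(size_t N, size_t M)
{
    assert(N > 0 && M > 0);
    int (*matrix)[M] = malloc(N * sizeof *matrix);
    if (matrix == NULL)
        return -1;
    for (size_t i = 0; i < N; i++)
        for (size_t j = 0; j < M; j++)
            matrix[i][j] = (int)(i * M + j);
    /* Rows are adjacent: row 1 starts right after the last element of row 0. */
    assert(N < 2 || &matrix[1][0] == &matrix[0][0] + M);
    int last = matrix[N - 1][M - 1];
    free(matrix);               /* a single free releases the whole matrix */
    return last;
}
```

Because the rows sit back to back, the matrix can also be handed to code that expects a flat int array of N * M elements.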
You need to understand that 2D and higher arrays do not work well in C 89. Beginner books usually introduce 2D arrays in a very early chapter, just after 1D arrays, which leads people to assume that the natural way to represent 2-dimensional data is via a 2D array. In fact they have many tricky characteristics and should be considered an advanced feature.
If you don't know array dimensions at compile time, or if the array is large, it's almost always easier to allocate a 1D array and access via the logic
array[y*width+x];
so in your case, just call
int *A;
A = malloc(N * N * sizeof(int));
A[3*N+2] = 123; // set element A[3][2] to 123, but you can't use this syntax
It's important to note that the suggestion to use a flat array is just a suggestion, not everyone will agree with it, and 2D array handling is better in later versions of C. However I think you'll find that this method works best.

Difficulty in understanding variable-length arrays in C

I was reading a book when I found that array size must be given at the time of declaration or allocated from the heap using malloc at runtime. I wrote this program in C:
#include<stdio.h>
int main() {
int n, i;
scanf("%d", &n);
int a[n];
for (i=0; i<n; i++) {
scanf("%d", &a[i]);
}
for (i=0; i<n; i++) {
printf("%d ", a[i]);
}
return 0;
}
This code works fine.
My question is how this code can work correctly. Isn't it a violation of the basic concept of C that array size must be declared before runtime or allocated using malloc() at runtime? I'm not doing either of these two things, so why is it working properly?
The solution to my question is variable length arrays, which are supported in C99. But if I play around with my code and put the statement int a[n]; above scanf("%d", &n); then it stops working. Why is that, if variable length arrays are supported in C?
The C99 standard supports variable length arrays. The length of these arrays is determined at runtime.
Since C99 you can declare variable length arrays at block scope.
Example:
void foo(int n)
{
int array[n];
// Initialize the array
for (int i = 0; i < n; i++) {
array[i] = 42;
}
}
C will be happy as long as you've declared the array and allocated memory for it before you use it. One of the "features" of C is that it doesn't validate array indices, so it's the responsibility of the programmer to ensure that all memory accesses are valid.
Variable length arrays are a new feature added to C in C99.
"variable length" here means that the size of the array is decided at run-time, not compile time. It does not mean that the size of the array can change after it is created. The array is logically created where it is declared. So your code looks like.
int n, i;
Create two variables n and i. Initially these variables are uninitialised.
scanf("%d", &n);
Read a value into n.
int a[n];
Create an array "a" whose size is the current value of n.
If you swap the second and third steps you try to create an array whose size is determined by an uninitialised value. This is not likely to end well.
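That the size is fixed at the point of declaration, and does not follow later changes to n, can be checked with a small sketch. The function name is made up, and VLA support is assumed.

```c
#include <assert.h>
#include <stddef.h>

/* Declares a VLA of n ints, then changes n, and reports the array's length. */
size_t vla_length_after_change(int n)
{
    assert(n > 0);
    int a[n];                   /* the size is taken from n right here */
    n = n * 2;                  /* has no effect on the existing array */
    return sizeof a / sizeof a[0];
}
```

The returned length is the value n had when a was declared, not the doubled value.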
The C standard does not specify exactly how the array is stored, but in practice most compilers (I believe there are some exceptions) will allocate it on the stack. The normal way to do this is to copy the stack pointer into a "frame pointer" as part of the function prologue. This then allows the function to dynamically modify the stack pointer while keeping track of its own stack frame.
Variable length arrays are a feature that should be used with caution. Compilers typically do not insert any form of overflow checking on stack allocations. Operating systems typically insert a "guard page" after the stack to detect stack overflows and either raise an error or grow the stack, but a sufficiently large array can easily skip over the guard page.

Resources