Maximum size array program in C? - c

with the following code, I am trying to make an array of numbers and then sorting them. But if I set a high arraysize (MAX), the program stops at the last 'randomly' generated number and does not continue to the sorting at all. Could anyone please give me a hand with this?
#include <stdio.h>
#define MAX 2000000
int a[MAX];
int rand_seed=10;
/* from K&R
- returns random number between 0 and 62000.*/
int rand();
int bubble_sort();
int main()
{
int i;
/* fill array */
for (i=0; i < MAX; i++)
{
a[i]=rand();
printf(">%d= %d\n", i, a[i]);
}
bubble_sort();
/* print sorted array */
printf("--------------------\n");
for (i=0; i < MAX; i++)
printf("%d\n",a[i]);
return 0;
}
int rand()
{
rand_seed = rand_seed * 1103515245 +12345;
return (unsigned int)(rand_seed / 65536) % 62000;
}
int bubble_sort(void)
{
int t, x, y;
/* bubble sort the array */
for (x=0; x < MAX-1; x++)
for (y=0; y < MAX-x-1; y++)
if (a[y] > a[y+1])
{
t=a[y];
a[y]=a[y+1];
a[y+1]=t;
}
return 0;
}

The problem is that you are storing the array in global section, C doesn't give any guarantee about the maximum size of global section it can support, this is a function of OS, arch compiler.
So instead of creating a global array, create a global C pointer, allocated a large chunk using malloc. Now memory is saved in the heap which is much bigger and can grow at runtime.

Your array will land in BSS section for static vars. It will not be part of an image but program loader will allocate required space and fill it with zeros before your program starts 'real' execution. You can even control this process if using embedded compiler and fill your static data with anything you like. This array may occupy 2GB or your RAM and yet your exe file may be few kilobytes. I've just managed to use over 2GB array this way and my exe was 34KB. I can believe a compiler may warn you when you approach maybe 231-1 elements (if your int is 32bit) but static arrays with 2m elements are not a problem nowadays (unless it is embedded system but I bet it is not).
The problem might be that your bubble sort has 2 nested loops (as all bubble sorts) so trying to sort this array - having 2m elements - causes the program to loop 2*1012 times (arithmetic sequence):
inner loop:
1: 1999999 times
2: 1999998 times
...
2000000: 1 time
So you must swap elements
2000000 * (1999999+1) / 2 = (4 / 2) * 10000002 = 2*1012 times
(correct me if I am wrong above)
Your program simply remains too long in sort routine and you are not even aware of that. What you see it just last rand number printed and program not responding. Even on my really fast PC with 200K array it took around 1minute to sort it this way.
It is not related to your os, compiler, heaps etc. Your program is just stuck as your loop executes 2*1012 times if you have 2m elements.
To verify my words print "sort started" before sorting and "sort finished" after that. I bet the last thing you'll see is "sort started". In addition you may print current x value before your inner loop in bubble_sort - you'll see that it is working.

Dynamic Array
int *Array;
Array= malloc (sizeof(int) * Size);

The original C standard (ANSI 1989/ISO 1990) required that a compiler successfully translate at least one program containing at least one example of a set of environmental limits. One of those limits was being able to create an object of at least 32,767 bytes.
This minimum limit was raised in the 1999 update to the C standard to be at least 65,535 bytes.
No C implementation is required to provide for objects greater than that size, which means that they don't need to allow for an array of ints greater than
(int)(65535 / sizeof(int)).
In very practical terms, on modern computers, it is not possible to say in advance how large an array can be created. It can depend on things like the amount of physical memory installed in the computer, the amount of virtual memory provided by the OS, the number of other tasks, drivers, and programs already running and how much memory that are using. So your program may be able to use more or less memory running today than it could use yesterday or it will be able to use tomorrow.
Many platforms place their strictest limits on automatic objects, that is those defined inside of a function without the use of the 'static' keyword. On some platforms you can create larger arrays if they are static or by dynamic allocation.

Related

Does recursive functions have some limitations? eg: how many layers does the function require?

Made a recursive function which gives how many terms are there in a collatz sequence given a starting number, this is the code n=13 for exemple :
int collatz(long n,long o)
{
if (n!=1) {
if(n%2==0)
return collatz(n/2,o+1);
else
return collatz((n*3)+1,o+1);
} else
printf("%ld\t",o);
}
void main()
{
collatz(13,0);
}
the function runs as expected; however with some integers such as "n=113383" something overflows (I guess) and returns :
Process returned -1073741571 (0xC00000FD) execution time : 4.631 s
Press any key to continue.
Excuse my non technical explanation, many thanks !
There is no limitation to recursion depth in the C standard itself. You might cause a stack overflow, but the stack size is different in different environments. I think Windows has 1MB and Linux 8MB. It also depends on the size of the stack frame for the function, which in turn depends on how many variables it has and which type.
In your case, you have two long variables which probably is 8 bytes each. You also have the string "%ld\t" which is 5 bytes, which might end up on the stack, but I'm not sure. On top of that you have the overhead of two pointers to the function return address and to the previous stack frame and they are 8 bytes each on a 64 bit system. So the stack frame for your function will roughly be 32 bytes or so. Maybe a little bit more. So on a Linux system I'd guess that your function would crash at a depth of around 200'000.
If this is a problem, consider rewriting the function to a non-recursive variant. Look at Blaze answer for how that can be done for your case. And as andreee commented below:
Additional note: You can increase the stack size under Linux with ulimit -s (also possible: ulimit -s unlimited) and in MSVC you can set the /F compilation flag to increase the stack size of your program. For MinGW, see this post.
What happens here is a stack overflow. This happens because every call of the function creates a new stack frame, and if there are too many, the stack's memory runs out. You can fix it by using iteration instead of recursion.
Also, long might not be able to hold the numbers that the collatz sequence produces for a starting value of 113383 (it didn't for me with MSVC). Use long long instead, that is at least 64 bits big. All in all, it could look like this:
void collatz(long long n)
{
long o;
for (o = 0; n > 1; o++) {
if (n % 2 == 0)
n /= 2;
else
n = n * 3 + 1;
}
printf("%ld\t", o);
return;
}
int main()
{
collatz(113383);
return 0;
}
Note how instead of recursion we now have a for loop.

Random numbers using malloc function

Why aren't the numbers after the first one of each line of the matrix random numbers? Why are they zeros? For example if I print a line of this matrix I get4 0 0 0 0 but I should get the numbers after 4 as random numbers instead.
void readfile(FILE *input,int **matrix){
int i=0, num;
while(fscanf(input, "%d ", &num) == 1){
matrix[i] = malloc((num+1)*sizeof(int));
matrix[i][0] = num;
i++;
}
}
Why aren't the numbers after the first one of each line of the matrix random numbers?
Why should they be?
Yes, malloc returns a newly allocated block of uninitialized memory, but nobody said that it had to be random.
Indeed, typically at process start you are going to get blank pages, just zeroed out by the operating system and provided to your process (the OS cannot recycle pages from other processes without blanking them out for security reasons), while later you are more likely to get back pages that your process has freed, so generally containing old data from your own program.
All this is strictly non-contractual, and is often violated - for example, so-called "debug heaps" generally fill in free pages with a known pattern (e.g. 0xCDCDCDCD on Visual C++) to spot usages of uninitialized memory.
Long story short: don't make any kind of assumption about the content of memory provided by malloc.

Standard C: Storing arrays in off-chip RAM

I would like to know if I can choose the storage location of arrays in c. There are a couple of questions already on here with some helpful info, but I'm looking for some extra info.
I have an embedded system with a soft-core ARM cortex implemented on an FPGA.
Upon start-up code is loaded from memory and executed by the processor. My code is in assembley and contains some c functions. One particular function is a uART interrupt which I have included below
void UART_ISR()
{
int count, n=1000, t1=0, t2=1, display=0, y, z;
int x[1000]; //storage array for first 1000 terms of Fibonacci series
x[1] = t1;
x[2] = t2;
printf("\n\nFibonacci Series: \n\n %d \n %d \n ", t1, t2);
count=2; /* count=2 because first two terms are already displayed. */
while (count<n)
{
display=t1+t2;
t1=t2;
t2=display;
x[count] = t2;
++count;
printf(" %d \n",display);
}
printf("\n\n Finished. Sequence written to memory. Reading sequence from memory.....:\n\n");
for (z=0; z<10000; z++){} // Delay
for (y=0; y<1000; y++) { //Read variables from memory
printf("%d \n",x[y]);
}
}
So basically the first 1000 values of the Fibonacci series are printed and stored in array X and then values from the array are printed to the screen again after a short delay.
Please correct me if I'm wrong but the values in the array X are stored on the stack as they are computed in the for loop and retrieved from the stack when the array is read from memory.
Here is he memory map of the system
0x0000_0000 to 0x0000_0be0 is the code
0x0000_0be0 to 0x0010_0be0 is 1MB heap
0x0010_0be0 to 0x0014_0be0 is 256KB stack
0x0014_0be0 to 0x03F_FFFF is of-chip RAM
Is there a function in c that allows me to store the array X in the off-chip ram for later retrieval?
Please let me know if you need any more info
Thanks very much for helping
--W
No, not "in C" as in "specified by the language".
The C language doesn't care about where things are stored, it specifies nothing about the existance of RAM at particular addresses.
But, actual implementations in the form of compilers, assemblers and linkers, often care a great deal about this.
With gcc for instance, you can use the section variable attribute to force a variable into a particular section.
You can then control the linker to map that section to a particular memory area.
UPDATE:
The other way to do this is manually, by not letting the compiler in on the secret and doing it yourself.
Something like:
int *external_array = (int *) 0x00140be0;
memcpy(external_array, x, sizeof x);
will copy the required number of bytes to the external memory. You can then read it back by swapping the two first arguments in the memcpy() call.
Note that this is way more manual, low-level and fragile, compared to letting the compiler/linker dynamic duo Just Make it Work for you.
Also, it seems very unlikely that you want to do all of that work from an ISR.

What's the best way to set all elements of array to a value?

I have an array of ints and I'd like to set all values in the array to 'x' everytime a function is called.
I've looked at memset but that would only work for an array of bytes I think.
I could do the obvious for loop, but I'm guessing there is a standard lib function out there that will accomplish this better. Anyone know?
Just loop it, pretty much. Or memset to 0, if you know the value is zero (similar for other values for which you have knowledge of the bit representation). There won't be a standard lib solution, since the standard lib can't know of particular user types.
If you're on an x86 system, you can use some assembly for this. For instance, in gcc:
__asm__(
"rep stosb"
: "=a"('x'), "=c"(count), "=D"(array)
);
Should do the trick.
rep stosb takes the value in AL and assigns it to consecutive memory locations pointed to by ES:EDI. The number of the locations is specified in ECX.
As an aside, in recent processors Intel has made many efforts to improve the performance of MOVSB and STOSB, so this is a good way to go about it.
In addition to memset and looping (which are both O(n) time), it can be actually done in O(1) - but at the cost of triple the amount of memory, and more expensive look ups later on.
This article describes how it can be done.
The idea is to maintain additional stack (logically, implemented as array+ pointer to top) and array, the additional array will indicate when it was first initialized (a number from 0 to n) and the stack will indicate which elements were already modified.
When you access array[i], if stack[additionalArray[i]] == i && i < top the value of the array is array[i]. Otherwise - it is the "initialized" value.
When doing array[i] = x, if it was not initialized yet (as seen before), you should set additionalArray[i] = stack[top] and increase top.
This results in O(1) initialization, but as said it requires additional memory and each access is more expansive.
Below logic will helps you.
...
int a[100] = {0};
int b = 5;
memset_ex(a, 100, &b, sizeof(int));
...
memset_ex(void *buf, int buf_size, void *value, int size_of_type)
{
int i = 0;
for(i = 0; i <= (buf_size - size_of_type); i +=size_of_type)
{
memcpy((buf + i), value, size_of_type);
}
}

Purposely waste all of main memory to learn fragmentation

In my class we have an assignment and one of the questions states:
Memory fragmentation in C: Design, implement, and run a C-program that does the following: it allocated memory for a sequence of of 3m arrays of size 500000 elements each; then it deallocates all even-numbered arrays and allocates a sequence of m arrays of size 700000 elements each. Measure the amounts of time your program requires for the allocations of the first sequence and for the second sequence. Choose m so that you exhaust all of the main memory available to your program. Explain your timings
My implementation of this is as follows:
#include <iostream>
#include <time.h>
#include <algorithm>
void main(){
clock_t begin1, stop1, begin2, stop2;
double tdif = 0, tdif2 = 0;
for(int k=0;k<1000;k++){
double dif, dif2;
const int m = 50000;
begin1 = clock();
printf("Step One\n");
int *container[3*m];
for(int i=0;i<(3*m);i++)
{
int *tmpAry = (int *)malloc(500000*sizeof(int));
container[i] = tmpAry;
}
stop1 = clock();
printf("Step Two\n");
for(int i=0;i<(3*m);i+=2)
{
free(container[i]);
}
begin2 = clock();
printf("Step Three\n");
int *container2[m];
for(int i=0;i<m;i++)
{
int *tmpAry = (int *)malloc(700000*sizeof(int));
container2[i] = tmpAry;
}
stop2 = clock();
dif = (stop1 - begin1)/1000.00;
dif2 = (stop2 - begin2)/1000.00;
tdif+=dif;
tdif/=2;
tdif2+=dif2;
tdif2/=2;
}
printf("To Allocate the first array it took: %.5f\n",tdif);
printf("To Allocate the second array it took: %.5f\n",tdif2);
system("pause");
};
I have changed this up a few different ways, but the consistencies I see are that when I initially allocate the memory for 3*m*500000 element arrays it uses up all of the available main memory. But then when I tell it to free them the memory is not released back to the OS so then when it goes to allocate the m*700000 element arrays it does it in the page file (swap memory) so it does not actually display memory fragmentation.
The above code runs this 1000 times and averages it, it takes quite some time. The first sequence average took 2.06913 seconds and the second sequence took 0.67594 seconds. To me the second sequence is supposed to take longer to show how fragmentation works, but because of the swap being used this does not occur. Is there a way around this or am I wrong in my assumption?
I will ask the professor about what I have on monday but until then any help would be appreciated.
Many libc implementations (I think glibc included) don't release memory back to the OS when you call free(), but keep it so you can use it on the next allocation without a syscall. Also, because of the complexity of modern paging and virtual memory stratagies, you can never be sure where anything is in physical memory, which makes it almost imposible to intentionally fragment it (even if it comes fragmented). You have to remember, all virtual memory, and all physical memory are different beasts.
(The following is written for Linux, but probably applicable to Windows and OSX)
When your program makes the first allocations, let's say there is enough physical memory for the OS to squeeze all of the pages in. They aren't all next to each-other in physical memory -- they are scattered wherever they can be. Then the OS modifies the page table to make a set of continuous virtual addresses, that refer to the scattered pages around in memory. But here's the thing -- because you don't really use the first memory you allocate, it becomes a really good candidate for swapping out. So, when you come along to do the next allocations, the OS, running out of memory, will probably swap out some of those pages to make room for the new ones. Because of this, you are actually measuring disk speeds, and the efficiency of the operations systems paging mechanism -- not fragmentation.
Remember, an set of continuous virtual addresses is almost never physically continuous in practice (or even in memory).

Resources