Is it possible to increase char array while using it, WITHOUT malloc? - c

I have a char array, we know that that a char size is 1 byte. Now I have to collect some char -> getchar() of course and simultaneously increase the array by 1 byte (without malloc, only library: stdio.h)
My suggestion would be, pointing to the array and somehow increase that array by 1 till there are no more chars to get OR you run out of Memory...

Is it possible to increase char array while using it, WITHOUT malloc?
No.
You cannot increase the size of a fixed size array.
For that you need realloc() from <stdlib.h>, which it seems you are not "allowed" to use.

Is it possible to increase char array while using it, WITHOUT malloc?
Quick answer: No it is not possible to increase the size of an array without reallocating it.
Fun answer: Don't use malloc(), use realloc().
Long answer:
If the char array has static or automatic storage class, it is most likely impossible to increase its size at runtime because keeping it at the same address that would require objects that are present at higher addresses to be moved or reallocated elsewhere.
If the array was obtained by malloc, it might be possible to extend its size if no other objects have been allocated after it in memory. Indeed realloc() to a larger size might return the same address. The problem is it is impossible to predict and if realloc returns a different address, the current space has been freed so pointers to it are now invalid.
The efficient way to proceed with this reallocation is to increase the size geometrically, by a factor at a time, 2x, 1.5x, 1.625x ... to minimize the number of reallocations and keep linear time as the size of the array grows linearly. You would a different variable for the allocated size of the array and the number of characters that you have stored into it.
Here is an example:
#include <stdio.h>
#include <stdlib.h>
int main(void) {
char *a = NULL;
size_t size = 0;
size_t count = 0;
int c;
while ((c = getchar()) != EOF && c != '\n') {
if (count >= size) {
/* reallocate the buffer to 1.5x size */
size_t newsize = size + size / 2 + 16;
char *new_a = realloc(a, new_size);
if (new_a == NULL) {
fprintf("out of memory for %zu bytes\n", new_size);
free(a);
return 1;
}
a = new_a;
size = new_size;
}
a[count++] = c;
}
for (i = 0; i < count; i++) {
putchar(a[i]);
}
free(a);
return 0;
}

There are two ways to create space for the string without using dynamic memory allocation(malloc...). You can use a static array or an array with automatic storage duration, you need to specify a maximum amount, you might never reach. But always check against it.
#define BUFFER_SIZE 0x10000
Static
static char buffer[BUFFER_SIZE];
Or automatic (You need to ensure BUFFER_SIZE is smaller than the stack size)
int main() {
char buffer[BUFFER_SIZE];
...
};
There are also optimizations done by the operating system. It might lazily allocate the whole (static/automatic) buffer, so that only the used part is in the physical memory. (This also applies to the dynamic memory allocation functions.) I found out that calloc (for big chunks) just allocates the virtual memory for the program; memory pages are cleared only, when they are accessed (probably through some interrupts raised by the cpu). I compared it to an allocation with malloc and memset. The memset does unnessecary work, if not all bytes/pages of the buffer are accessed by the program.
If you cannot allocate a buffer with malloc..., create a static/automatic array with enough size and let the operating system allocate it for you. It does not occupy the same space in the binary, because it is just stored as a size.

Related

How to use or free dynamically allocated memory when I run the program multiple times?

How do I free dynamically allocated memory?
Suppose input (assume it is given by user) is 1000 and now if I allocate memory of 1000 and after this(second time) if user gives input as 500 can I reuse already allocated memory ?
If user now inputs value as say 3000 , how do I go with it ? can I reuse already allocated 1000 blocks of memory and then create another 2000 blocks of memory ? or should I create all 3000 blocks of memory ?
which of these is advisable?
#include <stdio.h>
#include <stdlib.h>
typedef struct a
{
int a;
int b;
}aa;
aa* ptr=NULL;
int main() {
//code
int input=2;
ptr=malloc(sizeof(aa)*input);
for(int i=0;i<input;i++)
{
ptr[i].a=10;
ptr[i].b=20;
}
for(int i=0;i<input;i++)
{
printf("%d %d\n",ptr[i].a,ptr[i].b);
}
return 0;
}
I believe, you need to read about the "lifetime" of allocated memory.
For allocator functions, like malloc() and family, (quoting from C11, chapter ยง7.22.3, for "Memory management functions")
[...] The lifetime of an allocated object extends from the allocation
until the deallocation. [....]
So, once allocated, the returned pointer to the memory remains valid until it is deallocated. There are two ways it can be deallocated
Using a call to free() inside the program
Once the program terminates.
So, the allocated memory is available, from the point of allocation, to the termination of the program, or the free() call, whichever is earlier.
As it stands, there can be two aspects, let me clarify.
Scenario 1:
You allocate memory (size M)
You use the memory
You want the allocated memory to be re-sized (expanded/ shrinked)
You use some more
You're done using
is this is the flow you expect, you can use realloc() to resize the allocated memory size. Once you're done, use free().
Scenario 2:
You allocate memory (size M)
You use the memory
You're done using
If this is the case, once you're done, use free().
Note: In both the cases, if the program is run multiple times, there is no connection between or among the allocation happening in each individual invocation. They are independent.
When you use dynamically allocated memory, and adjust its size, it is important to keep track of exactly how many elements you have allocated memory for.
I personally like to keep the number of elements in use in variable named used, and the number of elements I have allocated memory for in size. For example, I might create a structure for describing one-dimensional arrays of doubles:
typedef struct {
size_t size; /* Number of doubles allocated for */
size_t used; /* Number of doubles in use */
double *data; /* Dynamically allocated array */
} double_array;
#define DOUBLE_ARRAY_INIT { 0, 0, NULL }
I like to explicitly initialize my dynamically allocated memory pointers to NULL, and their respective sizes to zero, so that I only need to use realloc(). This works, because realloc(NULL, size) is exactly equivalent to malloc(NULL). I also often utilize the fact that free(NULL) is safe, and does nothing.
I would probably write a couple of helper functions. Perhaps a function that ensures there is room for at_least entries in the array:
void double_array_resize(double_array *ref, size_t at_least)
{
if (ref->size < at_least) {
void *temp;
temp = realloc(ref->data, at_least * sizeof ref->data[0]);
if (!temp) {
fprintf(stderr, "double_array_resize(): Out of memory (%zu doubles).\n", at_least);
exit(EXIT_FAILURE);
}
ref->data = temp;
ref->size = at_least;
}
/* We could also shrink the array if
at_least < ref->size, but usually
this is not needed/useful/desirable. */
}
I would definitely write a helper function that not only frees the memory used, but also updates the fields to reflect that, so that it is completely safe to call double_array_resize() after freeing:
void double_array_free(double_array *ref)
{
if (ref) {
free(ref->data);
ref->size = 0;
ref->used = 0;
ref->data = NULL;
}
}
Here is how a program might use the above.
int main(void)
{
double_array stuff = DOUBLE_ARRAY_INIT;
/* ... Code and variables omitted ... */
if (some_condition) {
double_array_resize(&stuff, 321);
/* stuff.data[0] through stuff.data[320]
are now accessible (dynamically allocated) */
}
/* ... Code and variables omitted ... */
if (weird_condition) {
/* For some reason, we want to discard the
possibly dynamically allocated buffer */
double_array_free(&stuff);
}
/* ... Code and variables omitted ... */
if (other_condition) {
double_array_resize(&stuff, 48361242);
/* stuff.data[0] through stuff.data[48361241]
are now accessible. */
}
double_array_free(&stuff);
return EXIT_SUCCESS;
}
If I wanted to use the double_array as a stack, I might do
void double_array_clear(double_array *ref)
{
if (ref)
ref->used = 0;
}
void double_array_push(double_array *ref, const double val)
{
if (ref->used >= ref->size) {
/* Allocate, say, room for 100 more! */
double_array_resize(ref, ref->used + 100);
}
ref->data[ref->used++] = val;
}
double double_array_pop(double_array *ref, const double errorval)
{
if (ref->used > 0)
return ref->data[--ref->used];
else
return errorval; /* Stack was empty! */
}
The above double_array_push() reallocates for 100 more doubles, whenever the array runs out of room. However, if you pushed millions of doubles, this would mean tens of thousands of realloc() calls, which is usually considered wasteful. Instead, we usually apply a reallocation policy, that grows the size proportionally to the existing size.
My preferred policy is something like (pseudocode)
If (elements in use) < LIMIT_1 Then
Resize to LIMIT_1
Else If (elements in use) < LIMIT_2 Then
Resize to (elements in use) * FACTOR
Else
Resize to (elements in use) + LIMIT_2
End If
The LIMIT_1 is typically a small number, the minimum size ever allocated. LIMIT_2 is typically a large number, something like 220 (two million plus change), so that at most LIMIT_2 unused elements are ever allocated. FACTOR is between 1 and 2; many suggest 2, but I prefer 3/2.
The goal of the policy is to keep the number of realloc() calls at an acceptable (unnoticeable) level, while keeping the amount of allocated but unused memory low.
The final note is that you should only try to keep around a dynamically allocated buffer, if you reuse it for the same (or very similar) purpose. If you need an array of a different type, and don't need the earlier one, just free() the earlier one, and malloc() a new one (or let realloc() in the helpers do it). The C library will try to reuse the same memory anyway.
On current desktop machines, something like a hundred or a thousand malloc() or realloc() calls is probably unnoticeable compared to the start-up time of the program. So, it is not that important to minimize the number of those calls. What you want to do, is keep your code easily maintained and adapted, so logical reuse and variable and type names are important.
The most typical case where I reuse a buffer, is when I read text input line by line. I use the POSIX.1 getline() function to do so:
char *line = NULL;
size_t size = 0;
ssize_t len; /* Not 'used' in this particular case! :) */
while (1) {
len = getline(&line, &size, stdin);
if (len < 1)
break;
/* Have 'len' chars in 'line'; may contain '\0'! */
}
if (ferror(stdin)) {
fprintf(stderr, "Error reading standard input!\n");
exit(EXIT_FAILURE);
}
/* Since the line buffer is no longer needed, free it. */
free(line);
line = NULL;
size = 0;

Obtain size of array via write permission check

To obtain the length of a null terminated string,we simply write len = strlen(str) however,i often see here on SO posts saying that to get the size of an int array for example,you need to keep track of it on your own and that's what i do normally.But,i have a question,could we obtain the size by using some sort of write permission check,that checks if we have writing permissions to a block of memory? for example :
#include <stdio.h>
int getSize(int *arr);
bool permissionTo(int *ptr);
int main(void)
{
int arr[3] = {1,2,3};
int size = getSize(arr) * sizeof(int);
}
int getSize(int *arr)
{
int *ptr = arr;
int size = 0;
while( permissionTo(ptr) )
{
size++;
ptr++;
}
return size;
}
bool permissionTo(int *ptr)
{
/*............*/
}
No, you can't. Memory permissions don't have this granularity on most, if not all, architectures.
Almost all CPU architectures manage memory in pages. On most things you'll run into today one page is 4kB. There's no practical way to control permissions on anything smaller than that.
Most memory management is done by your libc allocating a large:ish chunk of memory from the kernel and then handing out smaller chunks of it to individual malloc calls. This is done for performance (among other things) because creating, removing or modifying a memory mapping is an expensive operation especially on multiprocessor systems.
For the stack (as in your example), allocations are even simpler. The kernel knows that "this large area of memory will be used by the stack" and memory accesses to it just simply allocates the necessary pages to back it. All tracking your program does of stack allocations is one register.
If you are trying to achive, that an allocation becomes comfortable to use by carrying its own size around then do this:
Wrap malloc and free by prefixing the memory with its size internally (written from memory, not tested yet):
void* myMalloc(long numBytes) {
char* mem = malloc(numBytes+sizeof(long));
((long*)mem)[0] = numBytes;
return mem+sizeof(long);
}
void myFree(void* memory) {
char* mem = (char*)memory-sizeof(long);
free(mem)
}
long memlen(void* memory) {
char* mem = (char*)memory-sizeof(long);
return ((long*)mem)[0];
}

Why is memory size doubled when allocating a large number of char*?

I allocate a 2D array of char * and every string length is 12.
50 rows and 2000000 columns.
Lets calculate it:
50*2000000 * (12(length)+8(for pointer)). I use 64 bit.
50*2000000 * 20 =2000000000 bits .. -> 2 GB.
When I check the memory monitor it shows that the process takes 4 GB.
(All that happened after allocation)
This is the code:
int col=2000000,row=50,i=0,j=0;
char *** arr;
arr=(char***)malloc(sizeof(char**)*row);
for(i=0;i<row;i++)
{
arr[i]=(char ** )malloc(sizeof(char*)*col);
for(j=0;j<col;j++)
{
arr[i][j]=(char*)malloc(12);
strcpy(arr[i][j],"12345678901");
arr[i][j][11]='\0';
}
}
May that be from the paging in Linux?
Each call of malloc is taking more memory than you ask. Malloc needs to store somewhere its internal info about allocated place, like size of allocated space, some info about neighbors chunks, etc. Also (very probably) each returned pointer is aligned to 16 bytes. In my estimation each allocation of 12 bytes takes 32 bytes of memory. If you want to save memory allocate all strings in one malloc and split them into sizes per 12 at your own.
Try the following:
int col=2000000,row=50,i=0,j=0;
char *** arr;
arr= malloc(sizeof(*arr)*row);
for(i=0;i<row;i++)
{
arr[i]= malloc(sizeof(*arr[i])*col);
char *colmem = malloc(12 * col);
for(j=0;j<col;j++)
{
arr[i][j] = colmem + j*12;
strcpy(arr[i][j],"12345678901");
}
}
I would re-write the code from scratch. For some reason, around 99% of all C programmers don't know how to correctly allocate true 2D arrays dynamically. I'm not even sure I'm one of the 1% who do, but lets give it a shot:
#include <stdlib.h>
#include <string.h>
#include <stdio.h>
int main()
{
const int COL_N = 2000000;
const int ROW_N = 50;
char (*arr)[ROW_N] = malloc( sizeof(char[COL_N][ROW_N]) );
if(arr == NULL)
{
printf("Out of memory");
return 0;
}
for(int row=0; row<ROW_N; row++)
{
strcpy(arr[row], "12345678901");
puts(arr[row]);
}
free(arr);
return 0;
}
The important parts here are:
You should always allocate multi-dimensional arrays in adjacent memory cells or they are not arrays, but rather pointer-based lookup tables. Thus you only need one single malloc call.
This should save a bit of memory since you only need one pointer and it is allocated on the stack. No pointers are allocated on the heap.
Casting the return value of malloc is pointless (but not dangerous on modern compilers).
Ensure that malloc actually worked, particularly when allocating ridiculous amounts of memory.
strcpy copies the null termination, you don't need to do it manually.
There is no need for nested loops. You want to allocate a 2D array, not a 3D one.
Always clean up your own mess with free(), even though the OS might do it for you.

How to create an array of char pointer in C?

I have to create an array of char pointers each of them of size 10000000 using the best and optimized way to do this in C.
I think this will do (haven't checked for nulls though):
int i;
int num_arrays;
char **huge_char_array;
num_arrays = 10; //number of arrays you want.
huge_char_array = (char **)malloc(sizeof(char *) * num_arrays);
for(i = 0; i < num_arrays; i++)
{
huge_char_array[i] = (char *)malloc(sizeof(char) * 10000000);
}
I believe this is the optimal way because there is only one dynamic allocation, reducing overhead from fragmentation of the heap and the time taken to allocate. You can use the STRING_INDEX utility function to access the nth string.
Also, using calloc() instead of malloc() zeroes out buffer to ensure that all strings are NUL terminated.
#define STRING_SIZE 10000000
#define NUM_STRINGS 10
#define STRING_INDEX(array, string_idx) ((array) + (string_idx) * STRING_SIZE)
int main(int argc, char **argv) {
char *array_of_strings = calloc(NUM_STRINGS, STRING_SIZE);
// Access 8th character of 7th string
char c = STRING_INDEX(array_of_strings, 7)[8];
// Use array
// Free array when done
free(array_of_strings);
return 0;
}
I have been allocating 50K with each request I process this resulted in memory fragmentation. So I switched to allocate 50k * 100 = 5MB and reuse the pool when ever I need to allocate one more. If the requests increases beyond my pool capacity I double the pool size to avoid memory fragmentations. Based on this I would recommend allocating a huge chunk of memory maybe in your case 10 * 10000000 to reuse it. I am sorry but I cannot share the code that handle the pool here. Also use malloc and free and do not use new and delete.
malloc won't through exception it will simply return null if there are no memory available.
if you insist on using malloc for each element individually you can create a heap with maximum size to avoid fragmentations and in that case you will have to define the heap size.
please referee to MSDN for more details about how to create a heap and use it or the other allocation methods available for windows users.
http://msdn.microsoft.com/en-us/library/windows/desktop/aa366533(v=vs.85).aspx
char *arr[SOME_SIZE];
for(int i = 0; i < SOME_SIZE; ++i) {
arr[i] = malloc(10000000);
if(!arr[i])
// allocation failed, do something
}
Just realize that you're allocating ~9.5MB for every element in the array (i.e., SOME_SIZE * 10000000 total bytes)

How do I declare an array of undefined or no initial size?

I know it could be done using malloc, but I do not know how to use it yet.
For example, I wanted the user to input several numbers using an infinite loop with a sentinel to put a stop into it (i.e. -1), but since I do not know yet how many he/she will input, I have to declare an array with no initial size, but I'm also aware that it won't work like this int arr[]; at compile time since it has to have a definite number of elements.
Declaring it with an exaggerated size like int arr[1000]; would work but it feels dumb (and waste memory since it would allocate that 1000 integer bytes into the memory) and I would like to know a more elegant way to do this.
This can be done by using a pointer, and allocating memory on the heap using malloc.
Note that there is no way to later ask how big that memory block is. You have to keep track of the array size yourself.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(int argc, char** argv)
{
/* declare a pointer do an integer */
int *data;
/* we also have to keep track of how big our array is - I use 50 as an example*/
const int datacount = 50;
data = malloc(sizeof(int) * datacount); /* allocate memory for 50 int's */
if (!data) { /* If data == 0 after the call to malloc, allocation failed for some reason */
perror("Error allocating memory");
abort();
}
/* at this point, we know that data points to a valid block of memory.
Remember, however, that this memory is not initialized in any way -- it contains garbage.
Let's start by clearing it. */
memset(data, 0, sizeof(int)*datacount);
/* now our array contains all zeroes. */
data[0] = 1;
data[2] = 15;
data[49] = 66; /* the last element in our array, since we start counting from 0 */
/* Loop through the array, printing out the values (mostly zeroes, but even so) */
for(int i = 0; i < datacount; ++i) {
printf("Element %d: %d\n", i, data[i]);
}
}
That's it. What follows is a more involved explanation of why this works :)
I don't know how well you know C pointers, but array access in C (like array[2]) is actually a shorthand for accessing memory via a pointer. To access the memory pointed to by data, you write *data. This is known as dereferencing the pointer. Since data is of type int *, then *data is of type int. Now to an important piece of information: (data + 2) means "add the byte size of 2 ints to the adress pointed to by data".
An array in C is just a sequence of values in adjacent memory. array[1] is just next to array[0]. So when we allocate a big block of memory and want to use it as an array, we need an easy way of getting the direct adress to every element inside. Luckily, C lets us use the array notation on pointers as well. data[0] means the same thing as *(data+0), namely "access the memory pointed to by data". data[2] means *(data+2), and accesses the third int in the memory block.
The way it's often done is as follows:
allocate an array of some initial (fairly small) size;
read into this array, keeping track of how many elements you've read;
once the array is full, reallocate it, doubling the size and preserving (i.e. copying) the contents;
repeat until done.
I find that this pattern comes up pretty frequently.
What's interesting about this method is that it allows one to insert N elements into an empty array one-by-one in amortized O(N) time without knowing N in advance.
Modern C, aka C99, has variable length arrays, VLA. Unfortunately, not all compilers support this but if yours does this would be an alternative.
Try to implement dynamic data structure such as a linked list
Here's a sample program that reads stdin into a memory buffer that grows as needed. It's simple enough that it should give some insight in how you might handle this kind of thing. One thing that's would probably be done differently in a real program is how must the array grows in each allocation - I kept it small here to help keep things simpler if you wanted to step through in a debugger. A real program would probably use a much larger allocation increment (often, the allocation size is doubled, but if you're going to do that you should probably 'cap' the increment at some reasonable size - it might not make sense to double the allocation when you get into the hundreds of megabytes).
Also, I used indexed access to the buffer here as an example, but in a real program I probably wouldn't do that.
#include <stdlib.h>
#include <stdio.h>
void fatal_error(void);
int main( int argc, char** argv)
{
int buf_size = 0;
int buf_used = 0;
char* buf = NULL;
char* tmp = NULL;
char c;
int i = 0;
while ((c = getchar()) != EOF) {
if (buf_used == buf_size) {
//need more space in the array
buf_size += 20;
tmp = realloc(buf, buf_size); // get a new larger array
if (!tmp) fatal_error();
buf = tmp;
}
buf[buf_used] = c; // pointer can be indexed like an array
++buf_used;
}
puts("\n\n*** Dump of stdin ***\n");
for (i = 0; i < buf_used; ++i) {
putchar(buf[i]);
}
free(buf);
return 0;
}
void fatal_error(void)
{
fputs("fatal error - out of memory\n", stderr);
exit(1);
}
This example combined with examples in other answers should give you an idea of how this kind of thing is handled at a low level.
One way I can imagine is to use a linked list to implement such a scenario, if you need all the numbers entered before the user enters something which indicates the loop termination. (posting as the first option, because have never done this for user input, it just seemed to be interesting. Wasteful but artistic)
Another way is to do buffered input. Allocate a buffer, fill it, re-allocate, if the loop continues (not elegant, but the most rational for the given use-case).
I don't consider the described to be elegant though. Probably, I would change the use-case (the most rational).

Resources