Max Size for char buffer in C?

Is there a max size for a char buffer? I have a program that is collecting strings into a char buffer and writing them to a proc file. After a certain point it appears to stop writing - is there too much in the buffer? What is the max size, so I can work around this?
Here is the code. This is an LKM - is limits.h available from kernel space?
Foremost:
const char* input = "hooloo\n";
Next:
int read_info( char *page, char **start, off_t off, int count, int *eof, void *data )
{
    unsigned int mem;
    char answer_buf[strlen(input) + 1 + 14];

    name_added = vmalloc(strlen(input) + 1 + 14);
    strcpy(name_added, input);
    strcat(name_added, extension);
    mem = sprintf(answer_buf, "%s\n", name_added);
    memcpy(page, answer_buf, mem);
    return strlen(answer_buf) + 1;
}
Throughout my code there are more blocks like this that reallocate the buffer and add to it. Also, that read_info is the read handler for the procfile. The issue is that I keep adding to the buffer with code like the above, over and over - eventually when I cat my procfile the text cuts off; it doesn't keep going as long as I want it to )-=.

There's no concrete maximum size "in C" specifically. The theoretical (or "potential") maximum size of any object on a C platform is determined by the implementation and is usually derived from the properties of the underlying machine and OS.
On platforms with a flat memory model it will typically be limited by the size of the address space in theory, and by the amount of available free memory in practice.
On platforms with a segmented memory model it might be limited by the segment size, which is smaller than the address space. Implementations are free to work around that limit by "emulating" a flat memory model in code, so on such platforms the maximum object size can also depend on compilation settings.
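As a rough illustration (my own addition, not part of the original answer), you can print the theoretical ceilings your implementation advertises; SIZE_MAX and PTRDIFF_MAX are upper bounds on any single object, not a promise that such an object can actually be allocated:

#include <stdio.h>
#include <stdint.h>
#include <stddef.h>

int main(void)
{
    /* theoretical upper bounds implied by the implementation; real
       allocations will usually fail long before these sizes */
    printf("SIZE_MAX    = %zu\n", (size_t)SIZE_MAX);
    printf("PTRDIFF_MAX = %td\n", (ptrdiff_t)PTRDIFF_MAX);
    return 0;
}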

The only maximum size for a dynamically allocated char buffer will be available system memory.
A buffer on the stack will have its size constrained by maximum stack size. This will vary greatly depending on host OS.
When writing data to a file, are you checking the size returned by fwrite and calling it again to write the remainder of the buffer if necessary?
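As an illustration (a sketch of my own; the name write_all is made up), a loop that honours fwrite's return value and keeps writing the remainder could look like this:

#include <stdio.h>

/* write all 'len' bytes of 'buf' to 'out', retrying after short writes */
int write_all(FILE *out, const char *buf, size_t len)
{
    size_t written = 0;
    while (written < len) {
        size_t n = fwrite(buf + written, 1, len - written, out);
        if (n == 0)
            return -1;  /* a genuine write error */
        written += n;
    }
    return 0;
}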

You have a memory leak in your code!
The following memory is never freed:
name_added = vmalloc(strlen(input) + 1 + 14);
I don't understand why you allocate memory for the output at all.
And you do it twice, both on the stack and on the heap.
The caller has provided a buffer for the output. Use it!
Don't create copies!
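For illustration, a minimal sketch (mine, not tested; it assumes the same old read_proc-style callback and that input and extension are defined elsewhere in the module, as in the question) that writes straight into the page buffer the caller provides:

int read_info(char *page, char **start, off_t off, int count, int *eof, void *data)
{
    /* write directly into the buffer the proc layer hands us;
       snprintf keeps us within the 'count' bytes we were given */
    int len = snprintf(page, count, "%s%s\n", input, extension);

    *eof = 1;   /* everything fits in a single read */
    return len;
}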

I'd say it's at least able to handle 1,000 unique characters


Creating a memory buffer without malloc

I am working on an embedded system (ARM Cortex M3) where I do not have access to any sort of "standard library". In particular, I do not have access to malloc.
I have a function void doStuff(uint8_t *buffer) that accepts a pointer to a 512-bit buffer. I have tried doing the following:
uint8_t buffer[64] = {0};
doStuff((uint8_t *) &buffer));
but I'm not getting the expected results. Am I doing something wrong? Is there any alternative approach?
doStuff(buffer) should be fine, since the array buffer decays to a uint8_t * when passed.
Aside from this, you're closing one bracket too many after &buffer in your example.
If buffer is of variable size, you should pass the size into doStuff too; if it's of constant size, I'd still pass the size, just in case you change it one day.
This being said, you should do it the following way:
uint8_t buffer[64] = {0};
int len = 64;
doStuff(buffer, len);
A simple malloc(): have a char mem[MAXMEM]; and a freetable structure. Then write your own simplemalloc() that finds a big enough chunk of memory in the freetable and returns an offset into mem. simplefree() would then adjust the freetable.
EDIT:
If you need a lot of malloc()s, you may even divide your static mem into different chunks for different tasks (one chunk for exactly 100-byte allocs, one for the size of your favorite struct, and so on); this will speed up finding free memory.
If you are short on memory, you should implement a best-match search in simplemalloc(), which as a downside slows execution down.
If you have enough memory, you can implement a debug version which "allocates" a bit more memory and puts a marker such as XXX before the start and after the end of the simplemalloc()ed block. In simplefree() you can then check whether the marker has been overwritten, so you know you have a buffer over- or underflow to be aware of.
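A minimal sketch of the idea (my own, using fixed-size blocks to keep the freetable trivial; MAXMEM is replaced here by NBLOCKS * BLOCKSIZE, and all names are illustrative):

#include <stdint.h>
#include <stddef.h>

#define BLOCKSIZE 128
#define NBLOCKS   64

static uint8_t mem[NBLOCKS * BLOCKSIZE];   /* the static "heap" */
static uint8_t freetable[NBLOCKS];         /* 0 = free, 1 = in use */

void *simplemalloc(size_t sz)
{
    size_t i;
    if (sz > BLOCKSIZE)
        return NULL;                        /* this sketch only hands out single blocks */
    for (i = 0; i < NBLOCKS; i++) {
        if (!freetable[i]) {
            freetable[i] = 1;
            return &mem[i * BLOCKSIZE];
        }
    }
    return NULL;                            /* out of memory */
}

void simplefree(void *p)
{
    size_t off = (size_t)((uint8_t *)p - mem);
    freetable[off / BLOCKSIZE] = 0;
}

A real version would support variable-sized chunks and coalescing, as described above, but the free-table bookkeeping stays the same in spirit.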

Character arrays in C

I'm new to C. I have a question about character arrays (or strings) in C: when I want to create a character array, do I have to give its size at the same time?
We may not know the size that we actually need. For example, in a client-server program, if we want to declare a character array for the server program to receive a message from the client program, but we don't know the size of the message, we could do it like this:
char buffer[1000];
recv(fd,buffer, 1000, 0);
But what if the actual message is only of length 10. Will that cause a lot of wasted memory?
Yes, you have to decide the dimension in advance, even if you use malloc.
When you read from sockets, as in the example, you usually use a buffer with a reasonable size and dispatch the data into other structures as soon as you consume it. In any case, 1000 bytes is not much of a memory waste and is certainly faster than asking for one byte at a time from some memory manager :)
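For example (a sketch of my own; drain_socket is a made-up name), consuming each chunk as it arrives so the fixed buffer never needs to grow:

#include <stdio.h>
#include <sys/types.h>
#include <sys/socket.h>

/* read everything from the socket, consuming each chunk immediately */
void drain_socket(int fd)
{
    char buffer[1000];
    ssize_t n;

    while ((n = recv(fd, buffer, sizeof buffer, 0)) > 0)
        fwrite(buffer, 1, (size_t)n, stdout);   /* dispatch/consume the chunk */
}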
Yes, you have to give the size if you are not initializing the char array at the time of declaration. A better approach for your problem is to identify the optimum buffer size at run time and allocate the memory dynamically.
What you're asking about is how to dynamically size a buffer. This is done with a dynamic allocation such as using malloc() -- a memory allocator. Using it gives you an important responsibility though: when you're done using the buffer you must return it to the system yourself. If using malloc() [or calloc()], you return it with free().
For example:
char *buffer; // pointer to a buffer -- essentially an unsized array
buffer = (char *)malloc(size);
// use the buffer ...
free(buffer); // return the buffer -- do NOT use it any more!
The only problem left to solve is how to determine the size you'll need. If you're recv()'ing data that hints at the size, you'll need to break the communication into two recv() calls: first getting the minimum size all packets will have, then allocating the full buffer, then recv'ing the rest.
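A rough sketch of that two-step approach (my own; it assumes the sender prefixes each message with a 4-byte network-order length, which is one common way of hinting at the size):

#include <stdint.h>
#include <stdlib.h>
#include <sys/socket.h>
#include <arpa/inet.h>

char *recv_message(int fd, uint32_t *out_len)
{
    uint32_t netlen;

    /* first recv: the fixed-size header every message carries */
    if (recv(fd, &netlen, sizeof netlen, MSG_WAITALL) != (ssize_t)sizeof netlen)
        return NULL;
    *out_len = ntohl(netlen);

    /* now we know the size, allocate exactly that much */
    char *buffer = malloc(*out_len);
    if (buffer == NULL)
        return NULL;

    /* second recv: the rest of the message */
    if (recv(fd, buffer, *out_len, MSG_WAITALL) != (ssize_t)*out_len) {
        free(buffer);
        return NULL;
    }
    return buffer;   /* caller must free() this */
}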
When you don't know the exact amount of input data, do as follows:
1. Create a small buffer.
2. Allocate some memory for a "storage" (e.g. twice the buffer size).
3. Fill the buffer with data from the input stream (e.g. socket, file, etc.).
4. Copy the data from the buffer to the storage.
4.1. If there is not enough room in the storage, reallocate the memory (e.g. to twice its current size).
5. Repeat steps 3 and 4 until "END OF STREAM".
Your storage now contains the data. A rough sketch of this loop follows below.
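Here is that sketch (my own; it reads from a FILE * with fread, but the same pattern works with recv on a socket):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

char *slurp(FILE *stream, size_t *out_len)
{
    char buffer[1024];                      /* small buffer (step 1) */
    size_t cap = 2 * sizeof buffer;         /* storage, twice the buffer (step 2) */
    size_t len = 0;
    size_t n;
    char *storage = malloc(cap);

    if (storage == NULL)
        return NULL;

    while ((n = fread(buffer, 1, sizeof buffer, stream)) > 0) {   /* step 3 */
        if (len + n > cap) {                /* step 4.1: grow the storage */
            char *tmp;
            cap *= 2;
            tmp = realloc(storage, cap);
            if (tmp == NULL) {
                free(storage);
                return NULL;
            }
            storage = tmp;
        }
        memcpy(storage + len, buffer, n);   /* step 4: copy buffer into storage */
        len += n;
    }

    *out_len = len;
    return storage;                         /* caller frees this */
}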
If you don't know the size a priori, then you have no choice but to create the buffer dynamically using malloc (or whatever the equivalent mechanism is in your language of choice).
size_t buffer_size = ...; /* read from a DEFINE or from a config file */
char * buffer = malloc( sizeof( char ) * (buffer_size + 1) );
Creating a buffer of size m, but only receiving an input string of size n with n < m is not a waste of memory, but an engineering compromise.
If you create your buffer with a size close to the intended input, you risk having to refill the buffer many, many times for those cases where the input turns out to be much larger than the buffer. Typically, iterations over the buffer are tied up with I/O operations, so you might be saving some bytes (which is really nothing on today's hardware) at the expense of potentially increasing the problems at some other end, especially for client-server apps. If we were talking about resource-constrained embedded systems, that'd be another matter.
You should be worrying about getting your algorithms right and solid first. Then you worry, if you can, about shaving off a few bytes here and there.
For me, I'd rather create a buffer that is 2 to 10 times greater than the average input (not the smallest input as in your case, but the average), assuming my input tends to have a small standard deviation in size. Otherwise, I'd go 20 times the size or more (especially if memory is cheap and doing this minimizes hitting the disk or the NIC).
At the most basic level, one typically gets the size of the buffer as a configuration item read from a file (or passed as an argument), defaulting to a compile-time value if none is provided. Then you can adjust the size of your buffers according to the observed input sizes.
More elaborate algorithms (say TCP) adjust the size of their buffers at run-time to better accommodate input whose size might/will change over time.
Even if you use malloc you still have to specify a size. So instead you give a large number that is capable of holding the message, like:
char buffer[2000];
If the message turns out to be smaller or larger, you can reallocate the buffer to release the unused space or to gain more room.
example:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main()
{
    char *str;

    /* Initial memory allocation */
    str = (char *) malloc(15);
    strcpy(str, "tutorialspoint");
    printf("String = %s, Address = %p\n", str, (void *)str);

    /* Reallocating memory */
    str = (char *) realloc(str, 25);
    strcat(str, ".com");
    printf("String = %s, Address = %p\n", str, (void *)str);

    free(str);
    return(0);
}
Note: make sure to include the stdlib.h header, as in the example above.

How to get memory block length after malloc?

I thought that I couldn't retrieve the length of an allocated memory block like the simple .length function in Java. However, I now know that when malloc() allocates the block, it allocates extra bytes to hold an integer containing the size of the block. This integer is located at the beginning of the block; the address actually returned to the caller points to the location just past this length value. The problem is, I can't access that address to retrieve the block length.
#include <stdlib.h>
#include <stdio.h>

int main(void)
{
    char *str;
    str = (char *) malloc(sizeof(char) * 1000);

    int *length;
    length = (int *)(str - 4); /* because on a 32-bit system, an int is 4 bytes long */
    printf("Length of str: %d\n", *length);

    free(str);
}
Edit:
I finally did it. The reason it kept giving 0 as the length on my system is that my Ubuntu is 64-bit. I changed str-4 to str-8, and it works now.
If I change the size to 2000, it reports 2017 as the length. However, when I change it to 3000, it gives 3009. I am using GCC.
You don't have to track it yourself!
size_t malloc_usable_size (void *ptr);
But note that it returns the real size of the allocated memory block, not the size you passed to malloc!
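For example (a small sketch; malloc_usable_size() is a GNU/glibc extension declared in <malloc.h>, so this is not portable):

#include <stdio.h>
#include <stdlib.h>
#include <malloc.h>   /* malloc_usable_size() lives here on glibc */

int main(void)
{
    char *p = malloc(1000);

    if (p != NULL)
        printf("usable size: %zu\n", malloc_usable_size(p));  /* often >= 1000 */
    free(p);
    return 0;
}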
What you're doing is definitely wrong. While it's almost certain that the word just before the allocated block is related to the size, even so it probably contains some additional flags or information in the unused bits. Depending on the implementation, this data might even be in the high bits, which would cause you to read the entirely wrong length. Also it's possible that small allocations (e.g. 1 to 32 bytes) are packed into special small-block pages with no headers, in which case the word before the allocated block is just part of another block and has no meaning whatsoever in relation to the size of the block you're examining.
Just stop this misguided and dangerous pursuit. If you need to know the size of a block obtained by malloc, you're doing something wrong.
I would suggest you create your own malloc wrapper by compiling and linking a file which defines my_malloc(), and then redirecting the default to it as follows:
// my_malloc.c
#include <stdlib.h>

typedef struct {
    size_t size;
} Metadata;

void *my_malloc(size_t sz) {
    size_t size_with_header = sz + sizeof(Metadata);
    char *pointer = malloc(size_with_header); // call the real malloc here
    if (pointer == NULL)
        return NULL;
    // store the requested size in the header
    Metadata *header = (Metadata *)pointer;
    header->size = sz;
    // return the address starting after the header,
    // since this is what the user needs
    return pointer + sizeof(Metadata);
}

// in the rest of your code, route malloc() calls to the wrapper:
#define malloc(sz) my_malloc(sz)
Then you can always retrieve the allocated size by subtracting sizeof(Metadata) from the user pointer, casting the result to Metadata * and reading its size field:
Metadata *header = (Metadata *)((char *)ptr - sizeof(Metadata));
printf("Size allocated is: %zu\n", header->size); /* %zu is the right specifier for size_t */
You're not supposed to do that. If you want to know how much memory you've allocated, you need to keep track of it yourself.
Looking outside the block of memory returned to you (before the pointer returned by malloc, or after that pointer + the number of bytes you asked for) will result in undefined behavior. It might work in practice for a given malloc implementation, but it's not a good idea to depend on that.
This is not Standard C. However, it is supposed to work on Windows operating systems and might be available on other operating systems such as Linux (msize?) or Mac (alloc_size?) as well.
size_t _msize( void *memblock );
_msize() returns the size of a memory block allocated in the heap.
See this link:
http://msdn.microsoft.com/en-us/library/z2s077bc.aspx
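For example (a sketch of my own; _msize() is Microsoft-specific and declared in <malloc.h> when building with the MS C runtime):

#include <stdio.h>
#include <stdlib.h>
#include <malloc.h>   /* _msize() on Windows / MSVC */

int main(void)
{
    char *p = malloc(1000);

    if (p != NULL)
        printf("block size: %zu\n", _msize(p));   /* size of the heap block */
    free(p);
    return 0;
}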
This is implementation-dependent.
Every block you're allocating is preceded by a block descriptor. The problem is, its layout depends on the system architecture.
Try to find the block descriptor size for your own system. Take a look at your system's malloc man page.

How many chars can be in a char array?

#define HUGE_NUMBER ???
char string[HUGE_NUMBER];
do_something_with_the_string(string);
I was wondering what the maximum number is that I could use for a char array without risking memory problems, buffer overflows or the like. I want to read user input into it, ideally as much as possible.
See this response by Jack Klein (see original post):
The original C standard (ANSI 1989/ISO 1990) required that a compiler successfully translate at least one program containing at least one example of a set of environmental limits. One of those limits was being able to create an object of at least 32,767 bytes.
This minimum limit was raised in the 1999 update to the C standard to at least 65,535 bytes.
No C implementation is required to provide for objects greater than that size, which means that they don't need to allow for an array of ints greater than (int)(65535 / sizeof(int)).
In very practical terms, on modern computers, it is not possible to say in advance how large an array can be created. It can depend on things like the amount of physical memory installed in the computer, the amount of virtual memory provided by the OS, the number of other tasks, drivers, and programs already running, and how much memory they are using. So your program may be able to use more or less memory running today than it could yesterday or will be able to tomorrow.
Many platforms place their strictest limits on automatic objects, that is, those defined inside a function without the use of the 'static' keyword. On some platforms you can create larger arrays if they are static or dynamically allocated.
Now, to provide a slightly more tailored answer, DO NOT DECLARE HUGE ARRAYS TO AVOID BUFFER OVERFLOWS. That's close to the worst practice one can think of in C. Rather, spend some time writing good code, and carefully make sure that no buffer overflow will occur. Also, if you do not know the size of your array in advance, look at malloc, it might come in handy :P
It depends on where char string[HUGE_NUMBER]; is placed.
Is it inside a function? Then the array will be on the stack, and whether (and how far) your OS can grow the stack depends on the OS. So the general rule is: don't place huge arrays on the stack.
Is it outside a function? Then it is global (process memory). If the OS cannot allocate that much memory when it tries to load your program, your program will crash and will have no chance to notice that (so the following is better):
Large arrays should be malloc'ed. With malloc, the OS will return a null pointer if the allocation failed. Depending on the OS and its paging and memory-mapping scheme, this fails either when 1) there is no contiguous region of free address space large enough for the array, or 2) the OS cannot map enough free physical memory into what appears to your process as contiguous memory.
So, with large arrays do this:
char *largeArray = malloc(HUGE_NUMBER);
if (!largeArray) { /* do error recovery and display a message to the user */ }
Declaring arbitrarily huge arrays to avoid buffer overflows is bad practice. If you really don't know in advance how large a buffer needs to be, use malloc or realloc to dynamically allocate and extend the buffer as necessary, possibly using a smaller, fixed-sized buffer as an intermediary.
Example:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define PAGE_SIZE 1024 // 1K buffer; you can make this larger or smaller

/**
 * Read up to the next newline character from the specified stream.
 * Dynamically allocate and extend a buffer as necessary to hold
 * the line contents.
 *
 * The final size of the generated buffer is written to bufferSize.
 *
 * Returns NULL if the buffer cannot be allocated or if extending it
 * fails.
 */
char *getNextLine(FILE *stream, size_t *bufferSize)
{
    char input[PAGE_SIZE];   // fixed-size intermediate buffer
    int done = 0;
    char *targetBuffer = NULL;
    *bufferSize = 0;
    while (!done)
    {
        if (fgets(input, sizeof input, stream) != NULL)
        {
            char *tmp;
            char *newline = strchr(input, '\n');
            if (newline != NULL)
            {
                done = 1;
                *newline = 0;
            }
            // +1 leaves room for the terminating NUL
            tmp = realloc(targetBuffer, *bufferSize + strlen(input) + 1);
            if (tmp)
            {
                if (targetBuffer == NULL)
                    tmp[0] = 0;   // first chunk: start from an empty string
                targetBuffer = tmp;
                *bufferSize += strlen(input);
                strcat(targetBuffer, input);
            }
            else
            {
                free(targetBuffer);
                targetBuffer = NULL;
                *bufferSize = 0;
                fprintf(stderr, "Unable to allocate or extend input buffer\n");
                done = 1;
            }
        }
        else
        {
            done = 1;   // EOF or read error
        }
    }
    return targetBuffer;
}
If the array is going to be allocated on the stack, then you are limited by the stack size (typically 1MB on Windows, some of it will be used so you have even less). Otherwise I imagine the limit would be quite large.
However, making the array really big is not a solution to buffer overflow issues. Don't do it. Use functions that have a mechanism for limiting the amount of buffer they use to make sure you don't overstep your buffer, and make the size something more reasonable (1K for example).
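For example (my sketch): fgets() never writes more than the size you pass it, so a modest fixed-size buffer is safe for reading user input:

#include <stdio.h>

int main(void)
{
    char string[1024];   /* a reasonable fixed size */

    /* fgets() stops after sizeof string - 1 characters, so it cannot overflow */
    if (fgets(string, sizeof string, stdin) != NULL)
        printf("read: %s", string);
    return 0;
}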
You can use malloc() to get larger portions of memory than an array could normally handle.
Well, a buffer overflow wouldn't be caused by too large a value for HUGE_NUMBER so much as by one too small compared to what is written to it (write to index HUGE_NUMBER or higher, and you've overrun the buffer).
Aside from that it will depend upon the machine. There are certainly systems that could handle several millions in the heap, and a million or so on the stack (depending on other pressures), but there are also certainly some that couldn't handle more than a few hundred (small embedded devices would be an obvious example). While 65,535 is a standard-specified minimum, a really small device could specify that the standard was deliberately departed from for this reason.
In real terms, on a large machine, long before you actually run out of memory, you are needlessly putting pressure on the memory in a way that would affect performance. You would be better off dynamically sizing an array to an appropriate size.

Segmentation fault due to lack of memory in C

This code gives me segmentation fault about 1/2 of the time:
int main(int argc, char **argv) {
    float test[2619560];
    int i;

    for (i = 0; i < 2619560; i++)
        test[i] = 1.0f;
}
I actually need to allocate a much larger array. Is there some way of getting the operating system to allow me more memory?
I am using Linux Ubuntu 9.10
You are overflowing the default maximum stack size, which is 8 MB.
You can either increase the stack size - e.g. for 32 MB:
ulimit -s 32767
... or you can switch to allocation with malloc:
float *test = malloc(2619560 * sizeof test[0]);
Right now you're allocating (or at least trying to) 2619560*sizeof(float) bytes on the stack. At least in most typical cases, the stack can use only a limited amount of memory. You might try defining it static instead:
static float test[2619560];
This gets it out of the stack, so it can typically use any available memory instead. In other functions, defining something as static changes the semantics, but in the case of main it makes little difference (other than the mostly theoretical possibility of a recursive main).
Don't put such a large object on the stack. Instead, consider storing it in the heap, by allocation with malloc() or its friends.
2.6M floats isn't that many, and even on a 32-bit system you should be ok for address space.
If you need to allocate a very large array, be sure to use a 64-bit system (assuming you have enough memory!). 32-bit systems can only address about 3G per process, and even then you can't allocate it all as a single contiguous block.
It is a stack overflow.
You'd better use malloc to get memory larger than the stack size, which you can check with "ulimit -s".
