c programming more efficient use of big arrays - c

How would I make these big arrays more efficient? I am getting a segmentation fault when I add them, but when I remove them the segmentation fault goes away. I have several big arrays like this that are not shown. I need the arrays to be this big to handle the files that I am reading from. In the code below I used stdin instead of the file pointer I would normally use. I also free each big array after use.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
int main(void) {
int players_column_counter = 0;
int players_column1[100000] = {0};
char *players_strings_line_column2[100000] = {0};
char *players_strings_line_column3[100000] = {0};
char *players_strings_line_column4[100000] = {0};
char *players_strings_line_column5[100000] = {0};
char *players_strings_line_column6[100000] = {0};
char line[80] = {0};
while(fgets(line, 80, stdin) != NULL)
{
players_strings_line_column2[players_column_counter] =
malloc(strlen("string")+1);
strcpy(players_strings_line_column2[players_column_counter],
"string");
players_column_counter++;
}
free(*players_strings_line_column2);
free(*players_strings_line_column3);
free(*players_strings_line_column4);
free(*players_strings_line_column5);
free(*players_strings_line_column6);
return 0;
}

Read much more about C dynamic memory allocation. Learn to use malloc, calloc, and free. Notice that malloc and calloc (and realloc) can fail, and you need to handle that.
The call stack is limited in size (typically to one or a few megabytes; the actual limit is specific to the operating system and computer). It is unreasonable to have a call frame of more than a few kilobytes. In the question, one int array and five pointer arrays of 100000 elements each need roughly 4.4 megabytes on a typical 64-bit system (assuming a 4-byte int and 8-byte pointers), which already exceeds such a limit. But calloc or malloc might permit allocation of a few gigabytes (the actual limit depends upon your system), or at least hundreds of megabytes on current laptops or desktops. A local (automatic variable) array of more than a few hundred elements is almost always wrong (and is surely a very bad smell).
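As a minimal sketch of moving such arrays off the stack, the columns can be allocated on the heap and checked for failure. The struct and function names here (player_columns, player_columns_init, player_columns_free) are hypothetical and only two of the question's columns are shown.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical container for the question's columns; the names are
   illustrative and do not come from the original program. */
struct player_columns {
    size_t capacity;   /* number of rows allocated */
    int   *column1;    /* numeric column */
    char **column2;    /* string column, one heap string per row */
};

/* Allocate the columns on the heap instead of the stack.
   Returns 0 on success, -1 if any allocation fails. */
int player_columns_init(struct player_columns *pc, size_t capacity)
{
    assert(pc != NULL && capacity > 0);
    pc->capacity = capacity;
    pc->column1 = calloc(capacity, sizeof *pc->column1);
    pc->column2 = malloc(capacity * sizeof *pc->column2);
    if (pc->column1 == NULL || pc->column2 == NULL) {
        free(pc->column1);
        free(pc->column2);
        pc->column1 = NULL;
        pc->column2 = NULL;
        pc->capacity = 0;
        return -1;
    }
    /* Set every row pointer to NULL explicitly so it is safe to free. */
    for (size_t i = 0; i < capacity; i++)
        pc->column2[i] = NULL;
    return 0;
}

/* Free every string stored in column2, then the arrays themselves. */
void player_columns_free(struct player_columns *pc)
{
    assert(pc != NULL);
    if (pc->column2 != NULL)
        for (size_t i = 0; i < pc->capacity; i++)
            free(pc->column2[i]);   /* free(NULL) is a no-op */
    free(pc->column1);
    free(pc->column2);
    pc->column1 = NULL;
    pc->column2 = NULL;
    pc->capacity = 0;
}
```

Note that the free loop releases each row's string, not just the first one, which is what free(*players_strings_line_column2) in the question does.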
BTW, if your system has getline(3), you should probably use it. Likewise for strdup(3) and asprintf(3).
If your system doesn't have getline, strdup, or asprintf, you should consider implementing them, or borrowing a free software implementation of them.
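For a system without strdup, a fallback might look like the following sketch. The name dup_string is hypothetical, chosen to avoid clashing with a real strdup.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical fallback for strdup(3): returns a heap copy of s,
   or NULL if the allocation fails. The caller frees the result. */
char *dup_string(const char *s)
{
    assert(s != NULL);
    size_t len = strlen(s) + 1;      /* include the terminating '\0' */
    char *copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, s, len);
    return copy;
}
```

In the question's loop, this would replace the malloc(strlen(...)+1) followed by strcpy pair with one call whose result is checked against NULL.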
Compile with all warnings and debug info (e.g. gcc -Wall -Wextra -g with GCC). Improve your code to get no warnings. Use the debugger gdb (and valgrind). Beware of undefined behavior (such as buffer overflows) and of memory leaks.
Study the source code of existing free software (e.g. on github and/or some Linux distribution) for inspiration.

Related

Finding Dynamically Allocated Size in C [duplicate]

Is there a way in C to find out the size of dynamically allocated memory?
For example, after
char* p = malloc (100);
Is there a way to find out the size of memory associated with p?
There is no standard way to find this information. However, some implementations provide functions like msize to do this. For example:
_msize on Windows
malloc_size on MacOS
malloc_usable_size on systems with glibc
Keep in mind, though, that malloc will allocate at least the size requested, so you should check whether the msize variant for your implementation returns the size of the object or the memory actually allocated on the heap.
comp.lang.c FAQ list · Question 7.27 -
Q. So can I query the malloc package to find out how big an
allocated block is?
A. Unfortunately, there is no standard or portable way. (Some
compilers provide nonstandard extensions.) If you need to know, you'll
have to keep track of it yourself. (See also question 7.28.)
The C mentality is to provide the programmer with tools to help him with his job, not to provide abstractions which change the nature of his job. C also tries to avoid making things easier/safer if this happens at the expense of the performance limit.
Certain things you might like to do with a region of memory only require the location of the start of the region. Such things include working with null-terminated strings, manipulating the first n bytes of the region (if the region is known to be at least this large), and so forth.
Basically, keeping track of the length of a region is extra work, and if C did it automatically, it would sometimes be doing it unnecessarily.
Many library functions (for instance fread()) require a pointer to the start of a region, and also the size of this region. If you need the size of a region, you must keep track of it.
Yes, malloc() implementations usually keep track of a region's size, but they may do this indirectly, or round it up to some value, or not keep it at all. Even if they support it, finding the size this way might be slow compared with keeping track of it yourself.
If you need a data structure that knows how big each region is, C can do that for you. Just use a struct that keeps track of how large the region is as well as a pointer to the region.
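A minimal sketch of such a struct follows. The names sized_buf, sized_buf_alloc, and sized_buf_free are hypothetical, and the recorded size is the size requested, not whatever the allocator reserved internally.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical sized-buffer type: the size travels with the pointer. */
struct sized_buf {
    void  *data;
    size_t size;   /* bytes requested from malloc */
};

/* Allocate size bytes; on failure the result has data == NULL, size == 0. */
struct sized_buf sized_buf_alloc(size_t size)
{
    struct sized_buf b = { malloc(size), size };
    if (b.data == NULL)
        b.size = 0;
    return b;
}

/* Release the buffer and reset the bookkeeping. */
void sized_buf_free(struct sized_buf *b)
{
    assert(b != NULL);
    free(b->data);
    b->data = NULL;
    b->size = 0;
}
```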
Here's the best way I've seen to create a tagged pointer to store the size with the address. All pointer functions would still work as expected:
Stolen from: https://stackoverflow.com/a/35326444/638848
You could also implement a wrapper for malloc and free to add tags
(like allocated size and other meta information) before the pointer
returned by malloc. This is in fact the method that a c++ compiler
tags objects with references to virtual classes. Here is one working
example:
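#include <stdlib.h>
#include <stdio.h>
/* Allocate s bytes plus a size_t header that records s. */
void * my_malloc(size_t s)
{
size_t * ret = malloc(sizeof(size_t) + s);
if (ret == NULL)
return NULL;
*ret = s;
return &ret[1]; /* the caller sees the memory just past the header */
}
void my_free(void * ptr)
{
if (ptr != NULL)
free( (size_t*)ptr - 1);
}
size_t allocated_size(void * ptr)
{
return ((size_t*)ptr)[-1];
}
int main(void) {
int * array = my_malloc(sizeof(int) * 3);
if (array == NULL)
return 1;
printf("%zu\n", allocated_size(array)); /* %zu is the conversion for size_t */
my_free(array);
return 0;
}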
The advantage of this method over a structure with size and pointer
struct pointer
{
size_t size;
void *p;
};
is that you only need to replace the malloc and free calls. All
other pointer operations require no refactoring.
No, the C runtime library does not provide such a function.
Some libraries may provide platform- or compiler-specific functions that can get this information, but generally the way to keep track of this information is in another integer variable.
Everyone telling you it's impossible is technically correct (the best kind of correct).
For engineering reasons, it is a bad idea to rely on the malloc subsystem to tell you the size of an allocated block accurately. To convince yourself of this, imagine that you were writing a large application with several different memory allocators: maybe you use raw libc malloc in one part, C++ operator new in another part, and some specific Windows API in yet another part. So you've got all kinds of void* flying around. Writing a function that can work on any of these void*s is impossible, unless you can somehow tell from the pointer's value which of your heaps it came from.
So you might want to wrap up each pointer in your program with some convention that indicates where the pointer came from (and where it needs to be returned to). For example, in C++ we call that std::unique_ptr<void> (for pointers that need to be operator delete'd) or std::unique_ptr<void, D> (for pointers that need to be returned via some other mechanism D). You could do the same kind of thing in C if you wanted to. And once you're wrapping up pointers in bigger safer objects anyway, it's just a small step to struct SizedPtr { void *ptr; size_t size; } and then you never need to worry about the size of an allocation again.
However.
There are also good reasons why you might legitimately want to know the actual underlying size of an allocation. For example, maybe you're writing a profiling tool for your app that will report the actual amount of memory used by each subsystem, not just the amount of memory that the programmer thought he was using. If each of your 10-byte allocations is secretly using 16 bytes under the hood, that's good to know! (Of course there will be other overhead as well, which you're not measuring this way. But there are yet other tools for that job.) Or maybe you're just investigating the behavior of realloc on your platform. Or maybe you'd like to "round up" the capacity of a growing allocation to avoid premature reallocations in the future. Example:
SizedPtr round_up(void *p) {
size_t sz = portable_ish_malloced_size(p);
void *q = realloc(p, sz); // for sanitizer-cleanliness
assert(q != NULL && portable_ish_malloced_size(q) == sz);
return (SizedPtr){q, sz};
}
bool reserve(VectorOfChar *v, size_t newcap) {
if (v->sizedptr.size >= newcap) return true;
char *newdata = realloc(v->sizedptr.ptr, newcap);
if (newdata == NULL) return false;
v->sizedptr = round_up(newdata);
return true;
}
To get the size of the allocation behind a non-null pointer which has been returned directly from libc malloc — not from a custom heap, and not pointing into the middle of an object — you can use the following OS-specific APIs, which I have bundled up into a "portable-ish" wrapper function for convenience. If you find a common system where this code doesn't work, please leave a comment and I'll try to fix it!
#if defined(__linux__)
// https://linux.die.net/man/3/malloc_usable_size
#include <malloc.h>
size_t portable_ish_malloced_size(const void *p) {
return malloc_usable_size((void*)p);
}
#elif defined(__APPLE__)
// https://www.unix.com/man-page/osx/3/malloc_size/
#include <malloc/malloc.h>
size_t portable_ish_malloced_size(const void *p) {
return malloc_size(p);
}
#elif defined(_WIN32)
// https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/msize
#include <malloc.h>
size_t portable_ish_malloced_size(const void *p) {
return _msize((void *)p);
}
#else
#error "oops, I don't know this system"
#endif
#include <stdio.h>
#include <stdlib.h> // for malloc itself
int main() {
void *p = malloc(42);
size_t true_length = portable_ish_malloced_size(p);
printf("%zu\n", true_length);
}
Tested on:
Visual Studio, Win64 — _msize
GCC/Clang, glibc, Linux — malloc_usable_size
Clang, libc, Mac OS X — malloc_size
Clang, jemalloc, Mac OS X — works in practice but I wouldn't trust it (silently mixes jemalloc's malloc and the native libc's malloc_size)
Should work fine with jemalloc on Linux
Should work fine with dlmalloc on Linux if compiled without USE_DL_PREFIX
Should work fine with tcmalloc everywhere
Like everyone else already said: No there isn't.
Also, I would always avoid all the vendor-specific functions here, because when you find that you really need to use them, that's generally a sign that you're doing it wrong. You should either store the size separately, or not have to know it at all. Using vendor functions is the quickest way to lose one of the main benefits of writing in C, portability.
I would expect this to be implementation dependent.
If you know the layout of the allocator's header data structure, you could cast the memory just before the pointer to that type and read the size.
If you use malloc, then you cannot get the size.
On the other hand, if you use an OS API to dynamically allocate memory, like the Windows heap functions, then it's possible to do that.
I know this does not answer your specific question, but thinking outside the box, it occurs to me that you probably do not need to know. I don't mean that you have a bad or unorthodox implementation. I mean that (without seeing your code, I am only guessing) you probably only want to know whether your data can fit in the allocated memory. If that is the case, this solution might be better. It should not add much overhead and will solve your "fitting" problem, if that is indeed what you are handling:
tmp = realloc(p, required_size);
if (tmp != NULL) p = tmp;
realloc already preserves the old contents, up to the smaller of the old and new sizes, so no extra copy is needed.
Of course, you could just use:
p = realloc(p, required_size);
and be done with it, but then the original block is leaked if realloc fails and returns NULL.
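As a hedged sketch, the realloc call can be wrapped so that a failure never loses the original block. The helper name resize_block is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: resize *pp to new_size bytes.
   On success *pp is updated and 0 is returned. On failure -1 is returned
   and *pp still points at the old block, which remains valid. */
int resize_block(void **pp, size_t new_size)
{
    assert(pp != NULL && new_size > 0);
    void *tmp = realloc(*pp, new_size);
    if (tmp == NULL)
        return -1;
    *pp = tmp;
    return 0;
}
```

The existing contents are carried over by realloc itself, so the caller does not need a memcpy after a successful resize.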
Quuxplusone wrote: "Writing a function that can work on any of these void*s is impossible, unless you can somehow tell from the pointer's value which of your heaps it came from."
Actually in Windows _msize gives you the allocated memory size from the value of the pointer. If there is no allocated memory at the address an error is thrown.
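#include <stdlib.h>
#include <malloc.h> /* _msize (Windows CRT) */
int main()
{
char* ptr1 = NULL, * ptr2 = NULL;
size_t bsz;
ptr1 = (char*)malloc(10);
ptr2 = ptr1;
bsz = _msize(ptr2);
ptr1++;
//bsz = _msize(ptr1); /* error: ptr1 no longer points at the start of the block */
free(ptr2);
return 0;
}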
Thanks for the #define collection. Here is the macro version.
#include <stdlib.h>
#define MALLOC(bsz) malloc(bsz)
#define FREE(ptr) do { free(ptr); (ptr) = NULL; } while(0)
#ifdef __linux__
#include <malloc.h>
#define MSIZE(ptr) malloc_usable_size((void*)(ptr))
#elif defined __APPLE__
#include <malloc/malloc.h>
#define MSIZE(ptr) malloc_size((const void *)(ptr))
#elif defined _WIN32
#include <malloc.h>
#define MSIZE(ptr) _msize((void *)(ptr))
#else
#error "unknown system"
#endif
Note: _msize only works for memory allocated with calloc, malloc, etc., as stated in the Microsoft documentation:
The _msize function returns the size, in bytes, of the memory block
allocated by a call to calloc, malloc, or realloc.
And will throw an exception otherwise.
https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/msize?view=vs-2019
This code may work on some Windows installations, but it relies on undocumented heap layout and is not guaranteed:
template <class T>
int get_allocated_bytes(T* ptr)
{
return *((int*)ptr-4);
}
template <class T>
int get_allocated_elements(T* ptr)
{
return get_allocated_bytes(ptr)/sizeof(T);
}
I was struggling recently with visualizing the memory that was available to write to (i.e using strcat or strcpy type functions immediately after malloc).
This is not meant to be a very technical answer, but it could help you while debugging, as much as it helped me.
You can use the size you malloc'd in a memset: set an arbitrary value for the second parameter (so you can recognize it) and pass the pointer that you obtained from malloc.
Like so:
char* my_string = (char*) malloc(custom_size * sizeof(char));
if(my_string) { memset(my_string, 1, custom_size); }
You can then see in the debugger what your allocated memory looks like.
This may work, with a small update to your code:
char *next = (char *)(p + 1);
size = next - (char *)p;
But this yields 1 if p is a char*, that is, the size of a single element that p points to. If p is an int*, the result is 4 (with a 4-byte int).
There is no way to find out total allocation.

Should you free at the end of a C program [duplicate]

Suppose I have a program like the following
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char *argv[]) {
if (argc < 2) return 1;
long buflen = atol(argv[1]);
char *buf = malloc(buflen);
fread(buf, 1, buflen, stdin);
// Do stuff with buf
free(buf);
return 0;
}
Programs like these typically have more complex cleanup code, often including several calls to free and sometimes labels or even cleanup functions for error handling.
My question is this: Is the free(buf) at the end actually necessary? My understanding is that the kernel will automatically clean up unfreed memory when the program exits, but if this is the case, why is putting free at the end of code such a common pattern?
BusyBox provides a compilation option to disable calling free at the end of execution. If this isn't an issue, then why would anyone disable that option? Is it purely because programs like Valgrind detect memory leaks when allocated memory isn't freed?
Actually, as in absolutely? On a modern operating system, no. In some environments, yes.
It's always a good plan to clean up everything you allocate as this makes it very easy to scan for memory leaks. If you have outstanding allocations just prior to your exit you have a leak. If you don't free things because the OS does it for you then you don't know if it's a mistake or intended behaviour.
You're also supposed to check for errors from any function that might return them, like fread, but you don't, so you're already firmly in the danger zone here. Is this mission critical code where if it crashes Bad Things happen? If so you'll want to do everything absolutely by the book.
As Jean-François pointed out the way this trivial code is composed is a bad example. Most programs will look more like this:
void do_stuff_with_buf(char* arg) {
long buflen = atol(arg);
char *buf = malloc(buflen);
fread(buf, 1, buflen, stdin);
// Do stuff with buf
free(buf);
}
int main(int argc, char *argv[]) {
if (argc < 2)
return 1;
do_stuff_with_buf(argv[1]);
return 0;
}
Here it should be more obvious that the do_stuff_with_buf function should clean up for itself, it can't depend on the program exiting to release resources. If that function was called multiple times you shouldn't leak memory, that's just sloppy and can cause serious problems. A run-away allocation can cause things like the infamous Linux "OOM killer" to show up and go on a murder spree to free up some memory, something that usually leads to nothing but chaos and confusion.
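A sketch of a function that both cleans up after itself and checks for errors might look like this. The name read_and_process and its FILE* parameter are hypothetical, introduced so the function does not depend on stdin, and the single cleanup label is one common convention rather than the only one.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical variant of do_stuff_with_buf: read up to buflen bytes
   from in. Returns the number of bytes read, or -1 on allocation or
   read error. The buffer is always released before returning. */
long read_and_process(FILE *in, size_t buflen)
{
    long result = -1;
    size_t got = 0;
    char *buf = NULL;

    assert(in != NULL && buflen > 0);
    buf = malloc(buflen);
    if (buf == NULL)
        goto cleanup;
    got = fread(buf, 1, buflen, in);
    if (got < buflen && ferror(in))
        goto cleanup;            /* short read caused by an error */
    /* Do stuff with buf[0 .. got-1] here. */
    result = (long)got;
cleanup:
    free(buf);                   /* free(NULL) is a no-op */
    return result;
}
```

Every exit path goes through the same label, so calling the function repeatedly does not leak regardless of which step failed.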

Is malloc faster when I freed memory before

Suppose I allocate and free memory, and afterwards allocate memory that is at most the size of the previously freed part.
Can the second allocation be faster than the first?
Maybe because it already knows a memory region that is free?
Or because this part of the heap is still assigned to the process?
Are there other possible advantages?
Or does it generally make no difference?
Edit: As asked in the comments:
I am especially interested in gcc and MSVC.
My assumption was that the memory was not "redeemed" by the OS before.
As there is a lot of discussion about implementation-specific details, I'd like to make clear that this is a hypothetical question.
I don't intend to abuse this, but I just want to know IF this may occur and what the reasons for the hypothetical speedup might be.
On some common platforms like GCC x86_64, there are two kinds of malloc(): the traditional kind for small allocations, and the mmap kind for large ones. Large, mmap-based allocations will have less interdependence. But traditional small ones will indeed experience a big speedup in some cases when memory has previously been free()'d.
This is because as you suggest, free() does not instantly return memory to the OS. Indeed it cannot do so in general, because the memory might be in the middle of the heap which is contiguous. So on lots of systems (but not all), malloc() will only be slow when it needs to ask the OS for more heap space.
Memory allocation with malloc should be faster whenever you avoid making system calls like sbrk or mmap. You will at least save a context switch.
Try an experiment with the following program:
#include <stdlib.h>
int main() {
void* x = malloc(1024*1024);
free(x);
x = malloc(1024*1024);
}
and run it with the command strace ./a.out
When you remove the call to free, you will notice two additional brk system calls.
Here's a simple benchmark I compiled at -O1:
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char** argv){
for(int i=0;i<10000000;i++){
char volatile * p = malloc(100);
if(!p) { perror(0); exit(1); }
*p='x';
//free((char*)p);
}
return 0;
}
An iteration cost about 60ns with free and about 150ns without on my Linux.
Yes, mallocs after free can be significantly faster.
It depends on the allocated sizes. These small sizes will not be returned to the OS. For larger sizes that are powers of two, the glibc malloc starts mmaping and unmmapping and then I'd expect a slowdown in the freeing variant.

GCC how to detect stack buffer overflow

GCC has an option, -fstack-protector-strong, to detect stack smashing. However, it cannot always detect a stack buffer overflow. For the first function func, when I input a string longer than 10 characters, the program does not always crash. My question is whether there is a way to detect stack buffer overflow.
void func()
{
char array[10];
gets(array);
}
void func2()
{
char buffer[10];
int n = sprintf(buffer, "%s", "abcdefghpapeas");
printf("aaaa [%d], [%s]\n", n, buffer);
}
int main ()
{
func();
func2();
}
Overflows on the stack are either hard to detect or very expensive to detect. Choose your poison.
In a nutshell, when you have this:
char a,b;
char *ptr=&a;
ptr[1] = 0;
then the compiler accepts this without complaint, although the C standard makes it undefined behavior: the write most likely lands in space allocated on the stack which belongs to the function. It's just very dangerous.
So the solution might be to add a gap between a and b and fill that with a pattern. But, well, some people actually write code as above. So your compiler needs to detect that.
Alternatively, we could create a bit-map of all bytes that your code has really allocated and then instrument all the code to check against this map. Very safe, pretty slow, bloats your memory usage. On the positive side, there are tools to help with this (like Valgrind).
See where I'm going?
Conclusion: In C, there is no good way to automatically detect many memory problems because the language and the API are often too sloppy. The solution is to move code into helper functions that check their parameters rigorously, always do the right thing, and have good unit test coverage.
Always use snprintf() versions of functions if you have a choice. If old code uses the unsafe versions, change it.
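As a small sketch of why the bounded version helps: snprintf returns the length the full output would have had, so truncation can be detected. The helper name copy_checked is hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: format src into dst, which holds dst_size bytes.
   Returns 1 if the whole string fit, 0 if it was truncated or an
   output error occurred. dst is always null-terminated. */
int copy_checked(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL && dst_size > 0);
    int n = snprintf(dst, dst_size, "%s", src);
    return n >= 0 && (size_t)n < dst_size;
}
```

With the question's 10-byte buffer and the 14-character string "abcdefghpapeas", the call writes only the first 9 characters plus the terminator and reports that the string did not fit, instead of overflowing.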
You can use a tool called Valgrind
http://valgrind.org/
My question is where there is a way to detect stack buffer overflow...
void func()
{
char array[10];
gets(array);
}
void func2()
{
char buffer[10];
int n = sprintf(buffer, "%s", "abcdefghpapeas");
printf("aaaa [%d], [%s]\n", n, buffer);
}
Because you are using GCC, you can use FORTIFY_SOURCE.
FORTIFY_SOURCE uses "safer" variants of high risk functions like memcpy, strcpy and gets. The compiler uses the safer variants when it can deduce the destination buffer size. If the copy would exceed the destination buffer size, then the program calls abort(). If the compiler cannot deduce the destination buffer size, then the "safer" variants are not used.
To disable FORTIFY_SOURCE for testing, you should compile the program with -U_FORTIFY_SOURCE or -D_FORTIFY_SOURCE=0.
The C Standard has "safer" functions via ISO/IEC TR 24731-1, Bounds Checking Interfaces. On conforming platforms, you can simply call gets_s and sprintf_s. They offer consistent behavior (like always ensuring a string is null-terminated) and consistent return values (like 0 on success or an errno_t).
Unfortunately, GCC and glibc do not implement these interfaces, which are an optional part of the C Standard (Annex K of C11). Ulrich Drepper (one of the glibc maintainers) called bounds checking interfaces "horribly inefficient BSD crap", and they were never added. Hopefully it will change in the future.
First of all, do not use gets. By now almost everyone knows all the security and reliability problems that can occur with gets. But it's included here for historical reasons, and because it's a very good example of bad programming.
Let's look at all the problems with the code:
// Really bad code
char line[100];
gets(line);
Because gets does not do bounds checking, a line of 100 or more characters will overwrite memory. If you're lucky, the program will just crash. Otherwise it might exhibit strange behavior.
The gets function is so bad that the GNU gcc linker issues a warning whenever it's used.
/tmp/ccI5WJ5m.o(.text+0x24): In function `main':
: warning: the `gets' function is dangerous and should not be used.
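A bounded replacement built on fgets could look like the sketch below. The helper name read_line is hypothetical, and it takes a FILE* so it works with stdin or any other stream.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read one line from in into line, which holds
   size bytes, and strip the trailing newline if one was read.
   Returns 1 on success, 0 on end of file or error.
   fgets never stores more than size - 1 characters plus the terminator. */
int read_line(FILE *in, char *line, size_t size)
{
    assert(in != NULL && line != NULL && size > 0);
    if (fgets(line, (int)size, in) == NULL)
        return 0;
    line[strcspn(line, "\n")] = '\0';   /* drop the newline, if any */
    return 1;
}
```

An over-long input line is split across successive calls rather than written past the end of the buffer.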
Protect array accesses with assert
C/C++ does not do bounds checking.
For example:
int data[10];
int i = 20;
data[i] = 100; // memory corruption
Use the assert macro for the above code:
#include <assert.h>
int data[10];
int i = 20;
assert((i >= 0) && (i < sizeof(data) / sizeof(data[0]))); // aborts here
data[i] = 100;
Array overflows are one of the most common programming errors and are extremely frustrating to try and locate. This code doesn't eliminate them, but it does cause buggy code to abort early in a way that makes the problem tremendously easier to find.
And use snprintf(buffer, sizeof(buffer), "%s", "abcdefghpapeas") and some tools like valgrind or GDB.
Hope this helps you..

Determine size of dynamically allocated memory in C

Is there a way in C to find out the size of dynamically allocated memory?
For example, after
char* p = malloc (100);
Is there a way to find out the size of memory associated with p?
There is no standard way to find this information. However, some implementations provide functions like msize to do this. For example:
_msize on Windows
malloc_size on MacOS
malloc_usable_size on systems with glibc
Keep in mind though, that malloc will allocate a minimum of the size requested, so you should check if msize variant for your implementation actually returns the size of the object or the memory actually allocated on the heap.
comp.lang.c FAQ list · Question 7.27 -
Q. So can I query the malloc package to find out how big an
allocated block is?
A. Unfortunately, there is no standard or portable way. (Some
compilers provide nonstandard extensions.) If you need to know, you'll
have to keep track of it yourself. (See also question 7.28.)
The C mentality is to provide the programmer with tools to help him with his job, not to provide abstractions which change the nature of his job. C also tries to avoid making things easier/safer if this happens at the expense of the performance limit.
Certain things you might like to do with a region of memory only require the location of the start of the region. Such things include working with null-terminated strings, manipulating the first n bytes of the region (if the region is known to be at least this large), and so forth.
Basically, keeping track of the length of a region is extra work, and if C did it automatically, it would sometimes be doing it unnecessarily.
Many library functions (for instance fread()) require a pointer to the start of a region, and also the size of this region. If you need the size of a region, you must keep track of it.
Yes, malloc() implementations usually keep track of a region's size, but they may do this indirectly, or round it up to some value, or not keep it at all. Even if they support it, finding the size this way might be slow compared with keeping track of it yourself.
If you need a data structure that knows how big each region is, C can do that for you. Just use a struct that keeps track of how large the region is as well as a pointer to the region.
Here's the best way I've seen to create a tagged pointer to store the size with the address. All pointer functions would still work as expected:
Stolen from: https://stackoverflow.com/a/35326444/638848
You could also implement a wrapper for malloc and free to add tags
(like allocated size and other meta information) before the pointer
returned by malloc. This is in fact the method that a c++ compiler
tags objects with references to virtual classes. Here is one working
example:
#include <stdlib.h>
#include <stdio.h>
void * my_malloc(size_t s)
{
size_t * ret = malloc(sizeof(size_t) + s);
*ret = s;
return &ret[1];
}
void my_free(void * ptr)
{
free( (size_t*)ptr - 1);
}
size_t allocated_size(void * ptr)
{
return ((size_t*)ptr)[-1];
}
int main(int argc, const char ** argv) {
int * array = my_malloc(sizeof(int) * 3);
printf("%u\n", allocated_size(array));
my_free(array);
return 0;
}
The advantage of this method over a structure with size and pointer
struct pointer
{
size_t size;
void *p;
};
is that you only need to replace the malloc and free calls. All
other pointer operations require no refactoring.
No, the C runtime library does not provide such a function.
Some libraries may provide platform- or compiler-specific functions that can get this information, but generally the way to keep track of this information is in another integer variable.
Everyone telling you it's impossible is technically correct (the best kind of correct).
For engineering reasons, it is a bad idea to rely on the malloc subsystem to tell you the size of an allocated block accurately. To convince yourself of this, imagine that you were writing a large application, with several different memory allocators — maybe you use raw libc malloc in one part, but C++ operator new in another part, and then some specific Windows API in yet another part. So you've got all kinds of void* flying around. Writing a function that can work on any of these void*s impossible, unless you can somehow tell from the pointer's value which of your heaps it came from.
So you might want to wrap up each pointer in your program with some convention that indicates where the pointer came from (and where it needs to be returned to). For example, in C++ we call that std::unique_ptr<void> (for pointers that need to be operator delete'd) or std::unique_ptr<void, D> (for pointers that need to be returned via some other mechanism D). You could do the same kind of thing in C if you wanted to. And once you're wrapping up pointers in bigger safer objects anyway, it's just a small step to struct SizedPtr { void *ptr; size_t size; } and then you never need to worry about the size of an allocation again.
However.
There are also good reasons why you might legitimately want to know the actual underlying size of an allocation. For example, maybe you're writing a profiling tool for your app that will report the actual amount of memory used by each subsystem, not just the amount of memory that the programmer thought he was using. If each of your 10-byte allocations is secretly using 16 bytes under the hood, that's good to know! (Of course there will be other overhead as well, which you're not measuring this way. But there are yet other tools for that job.) Or maybe you're just investigating the behavior of realloc on your platform. Or maybe you'd like to "round up" the capacity of a growing allocation to avoid premature reallocations in the future. Example:
SizedPtr round_up(void *p) {
size_t sz = portable_ish_malloced_size(p);
void *q = realloc(p, sz); // for sanitizer-cleanliness
assert(q != NULL && portable_ish_malloced_size(q) == sz);
return (SizedPtr){q, sz};
}
bool reserve(VectorOfChar *v, size_t newcap) {
if (v->sizedptr.size >= newcap) return true;
char *newdata = realloc(v->sizedptr.ptr, newcap);
if (newdata == NULL) return false;
v->sizedptr = round_up(newdata);
return true;
}
To get the size of the allocation behind a non-null pointer which has been returned directly from libc malloc — not from a custom heap, and not pointing into the middle of an object — you can use the following OS-specific APIs, which I have bundled up into a "portable-ish" wrapper function for convenience. If you find a common system where this code doesn't work, please leave a comment and I'll try to fix it!
#if defined(__linux__)
// https://linux.die.net/man/3/malloc_usable_size
#include <malloc.h>
size_t portable_ish_malloced_size(const void *p) {
return malloc_usable_size((void*)p);
}
#elif defined(__APPLE__)
// https://www.unix.com/man-page/osx/3/malloc_size/
#include <malloc/malloc.h>
size_t portable_ish_malloced_size(const void *p) {
return malloc_size(p);
}
#elif defined(_WIN32)
// https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/msize
#include <malloc.h>
size_t portable_ish_malloced_size(const void *p) {
return _msize((void *)p);
}
#else
#error "oops, I don't know this system"
#endif
#include <stdio.h>
#include <stdlib.h> // for malloc itself
int main() {
void *p = malloc(42);
size_t true_length = portable_ish_malloced_size(p);
printf("%zu\n", true_length);
}
Tested on:
Visual Studio, Win64 — _msize
GCC/Clang, glibc, Linux — malloc_usable_size
Clang, libc, Mac OS X — malloc_size
Clang, jemalloc, Mac OS X — works in practice but I wouldn't trust it (silently mixes jemalloc's malloc and the native libc's malloc_size)
Should work fine with jemalloc on Linux
Should work fine with dlmalloc on Linux if compiled without USE_DL_PREFIX
Should work fine with tcmalloc everywhere
Like everyone else already said: no, there isn't.
Also, I would always avoid all the vendor-specific functions here, because when you find that you really need to use them, that's generally a sign that you're doing it wrong. You should either store the size separately, or not have to know it at all. Using vendor functions is the quickest way to lose one of the main benefits of writing in C, portability.
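As a minimal sketch of the "store the size separately" advice, the buffer below carries its size alongside the pointer, so no vendor function is needed. The names Buffer, buffer_init, buffer_fits and buffer_free are hypothetical and exist only for this illustration.

```c
#include <stdlib.h>

/* Hypothetical helper type: a buffer that remembers its own size. */
typedef struct {
    char  *data;  /* NULL when the buffer is empty */
    size_t size;  /* number of bytes the caller asked for */
} Buffer;

/* Allocate n zeroed bytes. Returns 0 on success, -1 on failure. */
int buffer_init(Buffer *b, size_t n) {
    b->data = calloc(n ? n : 1, 1);  /* never request 0 bytes */
    if (b->data == NULL) {
        b->size = 0;
        return -1;
    }
    b->size = n;
    return 0;
}

/* Returns 1 if len bytes fit in the buffer, 0 otherwise. */
int buffer_fits(const Buffer *b, size_t len) {
    return len <= b->size;
}

/* Release the memory and reset the bookkeeping. */
void buffer_free(Buffer *b) {
    free(b->data);
    b->data = NULL;
    b->size = 0;
}
```

A caller would check buffer_fits(&b, len) before copying len bytes into b.data, instead of asking the allocator how large the block is.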
I would expect this to be implementation dependent.
If you knew the layout of the allocator's header structure, you could cast the memory just before the pointer to that structure and read the size from it.
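The header idea can be illustrated without touching libc internals by writing your own header in front of each block. This is only a sketch under that assumption: SizeHeader, sized_malloc, sized_size and sized_free are hypothetical names, the code needs C11 for max_align_t, and it says nothing about how any real malloc lays out its own bookkeeping.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical header stored immediately before each user block.
   The union pads the header to the strictest fundamental alignment,
   so the user pointer that follows it stays suitably aligned. */
typedef union {
    size_t      size;
    max_align_t align;
} SizeHeader;

/* Allocate n bytes and remember n in the header. */
void *sized_malloc(size_t n) {
    if (n > SIZE_MAX - sizeof(SizeHeader)) return NULL;  /* overflow guard */
    SizeHeader *h = malloc(sizeof(SizeHeader) + n);
    if (h == NULL) return NULL;
    h->size = n;
    return h + 1;  /* user memory starts right after the header */
}

/* Step back to the header and read the stored size.
   Only valid for pointers returned by sized_malloc. */
size_t sized_size(const void *p) {
    const SizeHeader *h = (const SizeHeader *)p - 1;
    return h->size;
}

/* Free a block obtained from sized_malloc (NULL is allowed). */
void sized_free(void *p) {
    if (p != NULL) free((SizeHeader *)p - 1);
}
```

The cost is one header per allocation, and such pointers must never be passed to plain free or realloc.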
If you use malloc, then you cannot get the size.
On the other hand, if you use an OS API to allocate memory dynamically, such as the Windows heap functions, then it is possible to do that.
I know this does not answer your specific question, but thinking outside the box: it occurs to me that you probably do not need to know the size. I don't mean that you have a bad or unorthodox implementation. I mean that you probably (I am only guessing, without seeing your code) only want to know whether your data fits in the allocated memory. If that is the case, then this solution might be better. It should not add much overhead, and it will solve your "fitting" problem if that is indeed what you are handling:
tmp = realloc(p, required_size);
if (tmp != NULL) p = tmp; /* on failure, p is still valid and unchanged */
realloc already preserves the old contents (up to the smaller of the old and new sizes), so no separate memcpy is needed.
Of course you could just use:
p = realloc(p, required_size);
and be done with it, but that leaks the original block if realloc fails and returns NULL.
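The realloc approach above can be wrapped in a small helper that never loses the original block. This is a sketch, not the answerer's code: ensure_size is a hypothetical name, and treating a request for 0 bytes as 1 byte is an assumption made here to sidestep the zero-size realloc corner case.

```c
#include <stdlib.h>

/* Hypothetical helper: make *pp point at a block of at least `required`
   bytes, keeping the existing contents.
   Returns 0 on success (*pp may have moved).
   Returns -1 on failure (*pp is untouched and still owned by the caller). */
int ensure_size(char **pp, size_t required) {
    if (required == 0) required = 1;  /* assumption: avoid realloc(p, 0) */
    char *tmp = realloc(*pp, required);
    if (tmp == NULL) return -1;       /* old block is still valid, nothing leaked */
    *pp = tmp;
    return 0;
}
```

Because the result goes through a temporary, a failed call leaves the caller free to keep using or free the old pointer.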
Quuxplusone wrote in "Determine size of dynamically allocated memory in C": "Writing a function that can work on any of these void*s [is] impossible, unless you can somehow tell from the pointer's value which of your heaps it came from."
Actually in Windows _msize gives you the allocated memory size from the value of the pointer. If there is no allocated memory at the address an error is thrown.
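#include <malloc.h> /* _msize (Windows CRT) */
#include <stdlib.h> /* malloc, free */
int main(void)
{
    char *ptr1 = NULL, *ptr2 = NULL;
    size_t bsz;
    ptr1 = (char *)malloc(10);
    if (ptr1 == NULL) return 1;
    ptr2 = ptr1;
    bsz = _msize(ptr2); /* OK: ptr2 is the pointer returned by malloc */
    ptr1++;
    /* bsz = _msize(ptr1); */ /* error: ptr1 no longer points to the start of the block */
    (void)bsz;
    free(ptr2);
    return 0;
}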
Thanks for the #define collection. Here is the macro version.
#include <stdlib.h>
#define MALLOC(bsz) malloc(bsz)
#define FREE(ptr) do { free(ptr); (ptr) = NULL; } while (0)
#ifdef __linux__
#include <malloc.h>
#define MSIZE(ptr) malloc_usable_size((void *)(ptr))
#elif defined __APPLE__
#include <malloc/malloc.h>
#define MSIZE(ptr) malloc_size(ptr)
#elif defined _WIN32
#include <malloc.h>
#define MSIZE(ptr) _msize((void *)(ptr))
#else
#error "unknown system"
#endif
Note: _msize only works for memory allocated with calloc, malloc, or realloc, as stated in the Microsoft documentation:
"The _msize function returns the size, in bytes, of the memory block allocated by a call to calloc, malloc, or realloc."
It will throw an exception otherwise.
https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/msize?view=vs-2019
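This C++ code reads a value stored just before the pointer, relying on the heap's internal block layout, which is not a documented interface. It will probably work on most Windows installations, but it is not guaranteed to: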
template <class T>
int get_allocated_bytes(T* ptr)
{
return *((int*)ptr-4);
}
template <class T>
int get_allocated_elements(T* ptr)
{
return get_allocated_bytes(ptr)/sizeof(T);
}
I was struggling recently with visualizing the memory that was available to write to (i.e., when using strcat or strcpy type functions immediately after malloc).
This is not meant to be a very technical answer, but it could help you while debugging, as much as it helped me.
You can pass the size you malloc'd to memset, together with the pointer you obtained from malloc, and set an arbitrary value for the second parameter so you can recognize it.
Like so:
char* my_string = (char*) malloc(custom_size * sizeof(char));
if(my_string) { memset(my_string, 1, custom_size); }
You can then see in the debugger what your allocated memory looks like: every byte you are allowed to write to holds the marker value.
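As a rough stand-in for a debugger's memory view, the sketch below fills a fresh allocation with a marker byte and prints it as hex. The helper names alloc_filled and dump_bytes are hypothetical, and the marker value is arbitrary.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: allocate n bytes and fill them with a
   recognizable marker byte. Returns NULL if malloc fails. */
unsigned char *alloc_filled(size_t n, unsigned char marker) {
    unsigned char *p = malloc(n);
    if (p != NULL) memset(p, marker, n);
    return p;
}

/* Print n bytes as two-digit hex values, 16 per row. */
void dump_bytes(const unsigned char *p, size_t n) {
    for (size_t i = 0; i < n; i++) {
        char sep = (i % 16 == 15 || i + 1 == n) ? '\n' : ' ';
        printf("%02x%c", p[i], sep);
    }
}
```

Calling dump_bytes on the block after a later strcpy shows which marker bytes were overwritten and which are still untouched. This only reflects the size you requested, not whatever the allocator reserved internally.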
This may work, with a small update to your code:
char *next = (char *)(p + 1);
size = (size_t)(next - (char *)p);
But this yields only the size of the single element that p points to: 1 if p is a char*, or 4 if p is an int* (with a 4-byte int).
There is no way to find out total allocation.