This question already has answers here:
How to determine CPU and memory consumption from inside a process
(10 answers)
Is there any way to calculate memory consumption in C? I have checked other answers on Stack Overflow, but they were not satisfactory.
Something similar to what we have in Java:
// Get the Java runtime
Runtime runtime = Runtime.getRuntime();
// Run the garbage collector
runtime.gc();
// Calculate the used memory
long memory = runtime.totalMemory() - runtime.freeMemory();
System.out.println("Used memory is bytes: " + memory + "bytes");
System.out.println("Used memory is kilobytes: " + bytesTokilobytes(memory) +"kb");
The C language itself does not provide any means of doing this, although every specific platform gives you some support.
For example, on Windows you can look at Task Manager's Details tab. Right-click the list-view column headers to add or remove columns; some of them give insight into how much memory the process consumes. There are a lot of other tools, including commercial ones (search the web), that give a more detailed picture.
On Windows there is also a special API that allows you to write your own tool. A while ago I wrote one, but I do not want this answer to be an ad.
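If you only need the numbers programmatically, a minimal sketch using the documented PSAPI call GetProcessMemoryInfo looks like this (this is not the tool mentioned above; link against psapi):

#include <windows.h>
#include <psapi.h>
#include <stdio.h>

int main(void)
{
    PROCESS_MEMORY_COUNTERS pmc;

    /* Query memory statistics for the current process. */
    if (GetProcessMemoryInfo(GetCurrentProcess(), &pmc, sizeof(pmc))) {
        printf("Working set:      %zu bytes\n", (size_t)pmc.WorkingSetSize);
        printf("Peak working set: %zu bytes\n", (size_t)pmc.PeakWorkingSetSize);
        printf("Pagefile usage:   %zu bytes\n", (size_t)pmc.PagefileUsage);
    } else {
        fprintf(stderr, "GetProcessMemoryInfo failed: %lu\n", GetLastError());
    }
    return 0;
}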
The real question seems to be: can you get the C heap to report how much space it's currently holding? I don't know of a portable way to do this.
Or you could plug in a "debugging heap" implementation which tracks this number and provides an API to retrieve it. Debugging heaps are available as second-source libraries, and one may come with your compiler. (Many years ago I implemented a debugging heap as a set of macros which intercepted heap calls and redirected them through wrapper routines to perform several kinds of analysis; I wasn't maintaining a usage counter, but I could have done so.) Caution: anything allocated from a debugging heap must be returned to that heap, not the normal heap, and vice versa, or things get very ugly very quickly.
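As an illustration of that macro-interception idea (a sketch only; the names dbg_malloc, dbg_free, and dbg_used_bytes are made up, and redefining malloc/free as macros is pragmatic rather than strictly portable):

#include <stddef.h>
#include <stdlib.h>

static size_t g_heap_bytes = 0;   /* bytes currently handed out by this "heap" */

/* Header stored in front of every allocation; the union keeps the
   user payload suitably aligned. */
typedef union { size_t size; max_align_t align; } dbg_hdr;

static void *dbg_malloc(size_t n)
{
    dbg_hdr *h = malloc(sizeof *h + n);
    if (!h) return NULL;
    h->size = n;
    g_heap_bytes += n;
    return h + 1;
}

static void dbg_free(void *p)
{
    if (!p) return;
    dbg_hdr *h = (dbg_hdr *)p - 1;
    g_heap_bytes -= h->size;
    free(h);
}

size_t dbg_used_bytes(void) { return g_heap_bytes; }

/* Intercept the heap calls; anything allocated through these macros
   must be released through them as well (see the caution above). */
#define malloc(n) dbg_malloc(n)
#define free(p)   dbg_free(p)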
Or your compiler may have some other nonstandard way to retrieve this information. Check its documentation.
Related
Context: while compiling some C code, compilers may show high RAM consumption. Preliminary investigation shows that (at least some) C compilers do not immediately free memory that is no longer needed: even though such previously allocated memory is not used anymore, it is still kept in RAM. The compiler continues processing the C code, allocating more memory, until it reaches OOM (out of memory).
The core question: should C compilers immediately free memory that is no longer needed?
Rationale:
Efficient RAM utilization: if mem_X is no longer needed, free mem_X so that other processes (including the compiler itself) can use it.
The ability to compile "RAM-demanding" C code.
UPD 2021-08-25: I have memory-profiled a C compiler and found that it keeps "C preprocessor data" in RAM, in particular:
macro table (memory pool for macros);
scanner token objects (memory pool for tokens and for lists).
At a certain point X in the middle end (after the IR is built), these objects seem to be no longer needed and hence can be freed. (Currently, however, they are kept in RAM until a point X+1.) The benefit shows up on "preprocessor-heavy" C programs. Example: a "preprocessor-heavy" C program using "ad hoc polymorphism" implemented via the C preprocessor (a set of macros progressively implements all the machinery needed to support a common interface for an arbitrary, supported set of individually specified types). The number of "polymorphic" entries is ~50k * 12 = ~600k (yes, that number by itself does not say much). Results:
before the fix: at point X the C compiler keeps ~1.5 GB of unused "C preprocessor data" in RAM;
after the fix: at point X the C compiler frees those ~1.5 GB of unused "C preprocessor data" from RAM, letting OS processes (itself included) use them.
I don't know where you get your analysis from. Most parts, like the abstract syntax tree, are kept because they are used in all the different passes.
It might be that some compilers, especially simple ones, don't free things because it's not considered necessary for a C compiler: it's a one-shot, per-compilation-unit operation, and then the process ends.
Of course, if you build a compiler library, as tinycc did, you need to free everything, but even that might happen within a final custom heap clearance at the end of the compilation run.
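For illustration, here is a minimal sketch (names invented, not tinycc's actual code) of that one-shot clearance idea: every allocation is chained onto an arena, and the whole chain is released in a single pass when the compilation run ends.

#include <stddef.h>
#include <stdlib.h>

typedef struct arena_block {
    struct arena_block *next;
    max_align_t data[];          /* user payload, maximally aligned */
} arena_block;

typedef struct { arena_block *head; } arena;

static void *arena_alloc(arena *a, size_t n)
{
    arena_block *b = malloc(sizeof *b + n);
    if (!b) return NULL;
    b->next = a->head;           /* remember the block for the final sweep */
    a->head = b;
    return b->data;
}

static void arena_release(arena *a)   /* one-shot clearance at end of run */
{
    while (a->head) {
        arena_block *next = a->head->next;
        free(a->head);
        a->head = next;
    }
}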
I have not seen this ever be a problem in the real world. But I don't do embedded work, where a lack of resources can be something to worry about.
"allocating more memory, until it reaches OOM (out of memory)"
None of the compilers I use ever run out of memory. Please give an example of such behaviour.
If you are an Arduino user thinking about code that will not fit into memory: that is not a problem of the compiler, only of the programmer.
This question already has answers here:
When you exit a C application, is the malloc-ed memory automatically freed?
(9 answers)
I am using Visual C++ 2010 for a C project. I have the following code in main():
ARRAY2D a;
arr2_init(&a, 5, 5); /* 5x5 dynamic, multi-dimensional array (of type int) */
arr2_release(&a);
I'm not sure if I need the last line. Can I omit arr2_release() at the end of the program on modern OSes? I'm using Windows 7.
Yes, you can avoid manually releasing any resource which the runtime or the OS will clean up after you.
Still, please do not do so.
It is a valid optimisation for faster shutdown (and sometimes even for faster execution, in exchange for memory consumption), though you must be picky about which resources you leave around:
Memory and file descriptors are efficiently handled by the OS (ancient platforms not doing so have mostly succumbed to disuse; still, there are a few tiny systems not able to free these).
FILE buffers are efficiently cleaned up by the runtime.
Windows GUI resources are not efficiently cleaned up this way; it takes longer.
Anyway, do the cleanup and develop the right mindset: it makes searching for leaks much easier and transfers better to bigger and longer-running programs.
Premature optimisation is the root of all evil. (The experts-only option of optimising after measurement and careful consideration does not apply yet.)
Always free your memory. The operating system will release a process's resources when it terminates, which includes its memory. But that doesn't give you garbage collection (you have to use different languages for that). Also note that it only does so after your program has ended (as also stated in the comments), so as long as your program is running, the memory will not be freed if you don't do it.
Your code might be used as part of a bigger program someday, even if it's now only a few lines. So always make sure to release all resources you acquire. Also, as a C programmer, thinking about resource management should be a habit anyway.
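For reference, a hedged sketch of what such an init/release pair might look like (the question does not show the real ARRAY2D definitions, so these are illustrative only):

#include <stdlib.h>

typedef struct {
    int    *data;   /* rows * cols ints in one contiguous block */
    size_t  rows, cols;
} ARRAY2D;

int arr2_init(ARRAY2D *a, size_t rows, size_t cols)
{
    a->data = calloc(rows * cols, sizeof *a->data);
    if (!a->data) return -1;
    a->rows = rows;
    a->cols = cols;
    return 0;
}

void arr2_release(ARRAY2D *a)
{
    free(a->data);       /* pair every init with a release */
    a->data = NULL;
    a->rows = a->cols = 0;
}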
This question already has answers here:
disable the randomness in malloc
(6 answers)
I'm experimenting with Pin, an instrumentation tool, which I use to compute some statistics based on the memory addresses of my variables. I want to re-run my program with the information gathered by my instrumentation tool, but for that it's crucial that virtual memory addresses remain the same across different runs.
In general, I should let the OS handle memory allocation, but in this case I need some way to force it to always allocate at the same virtual address. In particular, I'm interested in a very long array, which I'm currently allocating with numa_alloc_onnode(), though I could use something else.
What would be the correct way to proceed?
Thanks
You could try mmap(2).
The instrumented version of your program will use a different memory layout than the original program, because Pin needs memory for the dynamic translation etc. and will change the memory layout (if I recall correctly).
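As a sketch of that direction (the address and size below are arbitrary examples, and MAP_FIXED_NOREPLACE needs Linux 4.17+; older systems can pass a plain hint or, with care, MAP_FIXED):

#define _GNU_SOURCE
#include <sys/mman.h>
#include <stdio.h>
#include <stdlib.h>

#define ARRAY_BYTES   (1UL << 30)                 /* 1 GiB, for example */
#define WANTED_ADDR   ((void *)0x600000000000UL)  /* example address */

int main(void)
{
    /* Ask for the array at a fixed virtual address so it lands in the
       same place on every run. */
    void *p = mmap(WANTED_ADDR, ARRAY_BYTES,
                   PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED_NOREPLACE,
                   -1, 0);
    if (p == MAP_FAILED) {
        perror("mmap");
        return EXIT_FAILURE;
    }
    printf("array mapped at %p\n", p);
    /* ... use p as the backing store for the long array ... */
    munmap(p, ARRAY_BYTES);
    return 0;
}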
With the exception of address space layout randomization, most memory allocators, loaders, and system routines for assigning virtual memory addresses will return the same results given the same calls and data (not by deliberate design, but as a natural consequence of how software works). So you need to:
Disable address space layout randomization.
Ensure your program executes in the same way each time.
Address space layout randomization deliberately changes the address space to foil attackers: if the addresses change on each program execution, it is more difficult for attacks to use various exploits to control the code that is executed. It should be disabled only temporarily and only for debugging purposes. This answer shows one method of doing that and links to more information, but the exact method may depend on the version of Linux you are using.
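For completeness, one way to do it per process on Linux (a sketch only; the target path is a placeholder, and the usual shortcut is the setarch utility) is the personality(2) flag ADDR_NO_RANDOMIZE:

#include <sys/personality.h>
#include <unistd.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    (void)argc;
    int pers = personality(0xffffffff);           /* read current flags */
    if (pers == -1 || personality(pers | ADDR_NO_RANDOMIZE) == -1) {
        perror("personality");
        return 1;
    }
    /* Re-exec the target so it starts with randomization disabled. */
    execv("/path/to/your/program", argv);
    perror("execv");
    return 1;
}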
Your program may execute differently for a variety of reasons, such as using threads or using asynchronous signals or interprocess communication. It will be up to you to control that in your program.
Generally, memory allocation is not guaranteed to be reproducible. The results you get may be on an as-is basis.
Hi, I tried to allocate memory continuously using the calloc function, so obviously the system memory gets filled and the system crashes. The worst part is that even as a standard user I am able to run that program and crash the system. How can we prevent this from happening for a standard user?
The code used is:
#include <stdio.h>
#include <stdlib.h>

int main()
{
    while (1)
    {
        int *abc;
        abc = (int*)calloc(1000, sizeof(int));
    }
}
There must be some way to block this; otherwise, if a user gets SSH access, he can easily crash the system.
You can set up various memory limits.
ulimit springs to mind. It can be set by the shell (see http://unixhelp.ed.ac.uk/CGI/man-cgi?ulimit). There is also a way of setting limits system-wide; they are stored in a configuration file (on Linux this is typically /etc/security/limits.conf), but check the documentation for your system.
You should then set a memory limit for your SSH shells.
There is some discussion here
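If you want to enforce it from code rather than from the shell, a process (or a small wrapper that launches the untrusted program) can lower its own limit with setrlimit(2). A minimal sketch, with an arbitrary 256 MiB cap:

#include <sys/resource.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    /* Cap the total address space at 256 MiB, so runaway calloc() loops
       fail with NULL instead of exhausting the machine. */
    struct rlimit rl = { .rlim_cur = 256UL << 20, .rlim_max = 256UL << 20 };

    if (setrlimit(RLIMIT_AS, &rl) != 0) {
        perror("setrlimit");
        return 1;
    }

    /* From here on, allocations beyond the cap simply fail. */
    void *p = malloc(1UL << 30);       /* 1 GiB: expected to return NULL */
    printf("1 GiB malloc %s\n", p ? "succeeded" : "failed as expected");
    free(p);
    return 0;
}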
First, you are allocating memory without freeing any. The system's memory is finite, which means that at some point, you'll run out of memory, if the OS doesn't crash your system first.
That's the first thing to fix.
As for preventing a standard user from allocating memory: a program cannot automatically identify whether a user is standard or not.
So you might need to define a variable/constant to which the system passes an argument when the program is run. Check that constant/variable to determine whether the user is standard; if not, that snippet is not run.
How to identify standard and privileged users differs from OS to OS, so you might need to figure that out first.
Hope this gives you somewhere to start.
I'm not sure if you are looking at this from the right angle.
Given a user has shell access to your system, and that user has malicious intent, then a simple DoS (denial of service) attack that crashes your system is the least of your worries.
That being said, Unix environments usually avoid this problem by restricting shell access to trusted users, and only in very rare cases is it necessary to directly restrict memory consumption for processes.
In Unix land there is also the concept of 'nice', which controls scheduling priority: as a process becomes more of a resource hog, it can be given a higher and higher nice value (that is, a lower priority), and the pre-emptive scheduler will give it less and less time. Other VM mechanisms could kill it, but it is up to the kernel/scheduler to keep a process from becoming too resource-intensive.
In practice, every system that I have worked on could be affected by the type of code you have written.
I have a single-threaded, embedded application that allocates and deallocates lots and lots of small blocks (32-64 bytes): the perfect scenario for a cache-based allocator. And although I could try to write one, it would likely be a waste of time, and not as well tested and tuned as a solution that has already been on the front lines.
So what would be the best allocator I could use for this scenario?
Note: I'm using a Lua Virtual Machine in the system (which is the culprit of 80+% of the allocations), so I can't trivially refactor my code to use stack allocations to increase allocation performance.
I'm a bit late to the party, but I just want to share a very efficient memory allocator for embedded systems that I've recently found and tested: https://github.com/dimonomid/umm_malloc
This is a memory management library specifically designed to work with the ARM7. Personally I use it on a PIC32 device, but it should work on any 16- and 8-bit device (I have plans to test it on a 16-bit PIC24, but I haven't tested it yet).
I was seriously bitten by fragmentation with the default allocator: my project often allocates blocks of various sizes, from several bytes to several hundred bytes, and sometimes I faced an 'out of memory' error. My PIC32 device has 32K of RAM in total, of which 8192 bytes are used for the heap. At a particular moment there is more than 5K of free memory, but because of fragmentation the default allocator's largest contiguous free block is only about 700 bytes. This is too bad, so I decided to look for a more efficient solution.
I was already aware of some allocators, but all of them had some limitations (such as the block size needing to be a power of 2, starting not from 2 but from, say, 128 bytes), or were just buggy. Every time before, I had to switch back to the default allocator.
But this time I was lucky: I found this one: http://hempeldesigngroup.com/embedded/stories/memorymanager/
When I tried this memory allocator, in exactly the same situation with 5K of free memory, it had a free block of more than 3800 bytes! That was so unbelievable to me (compared to 700 bytes) that I performed a hard test: the device worked heavily for more than 30 hours. No memory leaks; everything works as it should.
I also found this allocator in the FreeRTOS repository: http://svnmios.midibox.org/listing.php?repname=svn.mios32&path=%2Ftrunk%2FFreeRTOS%2FSource%2Fportable%2FMemMang%2F&rev=1041&peg=1041# , and this fact is additional evidence of the stability of umm_malloc.
So I completely switched to umm_malloc, and I'm quite happy with it.
I just had to change it a bit: the configuration was a bit buggy when the macro UMM_TEST_MAIN is not defined, so I've created the GitHub repository (the link is at the top of this post). Now user-dependent configuration is stored in a separate file, umm_malloc_cfg.h.
I haven't yet dug deeply into the algorithms applied in this allocator, but it has a very detailed explanation of the algorithms, so anyone who is interested can look at the top of the file umm_malloc.c. At the very least, the "binning" approach should be a huge benefit in reducing fragmentation: http://g.oswego.edu/dl/html/malloc.html
I believe that anyone who needs an efficient memory allocator for microcontrollers should at least try this one.
In a past C project I worked on, we went down the road of implementing our own memory management routines for a library that ran on a wide range of platforms, including embedded systems. The library also allocated and freed a large number of small buffers. It ran relatively well and didn't take a large amount of code to implement. I can give you a bit of background on that implementation in case you want to develop something yourself.
The basic implementation included a set of routines that managed buffers of a set size. The routines were used as wrappers around malloc() and free(). We used these routines to manage the allocation of structures that we frequently used and also to manage generic buffers of set sizes. A structure was used to describe each type of buffer being managed. When a buffer of a specific type was allocated, we'd malloc() the memory in blocks (if the list of free buffers was empty). That is, if we were managing 10-byte buffers, we might make a single malloc() that contained space for 100 of these buffers, to reduce fragmentation and the number of underlying mallocs needed.
At the front of each buffer was a pointer used to chain the buffers into a free list. When the 100 buffers were allocated, each buffer was chained onto the free list. When a buffer was in use, its pointer was set to null. We also maintained a list of the "blocks" of buffers, so that we could do a simple cleanup by calling free() on each of the actual malloc'd blocks.
For management of dynamic buffer sizes, we also added a size_t variable at the beginning of each buffer giving the size of the buffer. This was then used to identify which buffer block to put the buffer back into when it was freed. We had replacement routines for malloc() and free() that did pointer arithmetic to get the buffer size and then put the buffer into the free list. We also had a limit on how large a buffer we managed; buffers larger than this limit were simply malloc'd and passed to the user. For structures that we managed, we created wrapper routines for allocation and freeing of the specific structures.
Eventually we also evolved the system to include garbage collection when requested by the user to clean up unused memory. Since we had control over the whole system, there were various optimizations we were able to make over time to increase performance of the system. As I mentioned, it did work quite well.
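For illustration, a condensed sketch of that scheme (all names invented, not the original library's code) might look like this; the payload here is only pointer-aligned, so widen the header if you need stricter alignment:

#include <stddef.h>
#include <stdlib.h>

#define BUF_SIZE        32     /* payload size managed by this pool */
#define BUFS_PER_BLOCK  100    /* buffers carved from each malloc() */

typedef struct buf { struct buf *next; char payload[BUF_SIZE]; } buf;
typedef struct blk { struct blk *next; buf bufs[BUFS_PER_BLOCK]; } blk;

static buf *free_list  = NULL;  /* chain of available buffers */
static blk *block_list = NULL;  /* chain of underlying blocks, for cleanup */

static int pool_grow(void)
{
    blk *b = malloc(sizeof *b);
    if (!b) return -1;
    b->next = block_list;
    block_list = b;
    for (int i = 0; i < BUFS_PER_BLOCK; i++) {   /* chain the new buffers */
        b->bufs[i].next = free_list;
        free_list = &b->bufs[i];
    }
    return 0;
}

void *pool_alloc(void)
{
    if (!free_list && pool_grow() != 0) return NULL;
    buf *b = free_list;
    free_list = b->next;
    b->next = NULL;                 /* "in use" marker, as described above */
    return b->payload;
}

void pool_free(void *p)
{
    buf *b = (buf *)((char *)p - offsetof(buf, payload));
    b->next = free_list;
    free_list = b;
}

void pool_destroy(void)             /* release the underlying blocks */
{
    while (block_list) {
        blk *next = block_list->next;
        free(block_list);
        block_list = next;
    }
    free_list = NULL;
}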
I did some research on this very topic recently, as we had an issue with memory fragmentation. In the end we decided to stay with GNU libc's implementation and add some application-level memory pools where necessary. There were other allocators which had better fragmentation behavior, but we weren't comfortable enough with them to replace malloc globally. GNU's has the benefit of a long history behind it.
In your case it seems justified; assuming you can't fix the VM, those tiny allocations are very wasteful. I don't know what your whole environment is, but you might consider wrapping the calls to malloc/realloc/free just in the VM, so that you can pass them off to a handler designed for small pools.
Although it's been some time since I asked this, my final solution was to use Loki's SmallObjectAllocator, and it works great. It got rid of all the OS calls and improved the performance of my Lua engine for embedded devices. Very nice and simple, and just about 5 minutes' worth of work!
Since version 5.1, Lua has allowed a custom allocator to be set when creating new states.
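For example (a minimal sketch assuming the standard Lua 5.1+ C API; the pool hookup itself is left out), the allocator is passed to lua_newstate and every VM allocation flows through it:

#include <stdlib.h>
#include <lua.h>
#include <lualib.h>
#include <lauxlib.h>

/* Lua calls this for every allocation, reallocation, and free in the VM;
   a small-object pool could be plugged in here instead of realloc/free. */
static void *my_alloc(void *ud, void *ptr, size_t osize, size_t nsize)
{
    (void)ud; (void)osize;              /* osize reports the old size/type */
    if (nsize == 0) {                   /* Lua asks us to free the block */
        free(ptr);
        return NULL;
    }
    return realloc(ptr, nsize);         /* grow, shrink, or fresh allocation */
}

int main(void)
{
    lua_State *L = lua_newstate(my_alloc, NULL);
    if (!L) return 1;
    luaL_openlibs(L);
    luaL_dostring(L, "print('hello from a custom allocator')");
    lua_close(L);
    return 0;
}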
I'd just like to add to this even though it's an old thread. In an embedded application, if you can analyze your memory usage and come up with a maximum number of allocations of the various sizes, usually the fastest type of allocator is one using memory pools. In our embedded apps we can determine all the allocation sizes that will ever be needed at run time. If you can do this, you can completely eliminate heap fragmentation and have very fast allocations. Most of these implementations have an overflow pool which will do a regular malloc for the special cases, which will hopefully be few and far between if you did your analysis right.
I have used the 'binary buddy' system to good effect under VxWorks. Basically, you portion out your heap by cutting blocks in half to get the smallest power-of-two-sized block that holds your request, and when blocks are freed you can make a pass up the tree to merge blocks back together to mitigate fragmentation. A web search should turn up all the info you need.
I am writing a C memory allocator called tinymem that is intended to be able to defragment the heap, and re-use memory. Check it out:
https://github.com/vitiral/tinymem
Note: this project has been discontinued in favor of the Rust implementation:
https://github.com/vitiral/defrag-rs
Also, I had not heard of umm_malloc before. Unfortunately, it doesn't seem to be able to deal with fragmentation, but it definitely looks useful. I will have to check it out.