Generational GC source code - c

I am studying GC implementations, and I'm currently looking for references and good open-source GC examples to base my work on.
Is there a good, simple generational GC implementation? The second best thing would be good resources and guidelines!
Thank you!

I wrote the Qish garbage collector (not really maintained any more, but feel free to ask). It is a free copying generational GC for C (with some coding style restrictions).
The GCC MELT [meta-]plugin (free, GPLv3 licensed), providing a high level language, MELT, to extend the GCC compiler, also has a copying generational GC above the existing Ggc garbage collector of GCC. Look into gcc/melt-runtime.c
With generational copying GC, generating the application's code in C is quite useful. See my DSL2011 paper on MELT
Feel free to ask me more, I love talking about my GC-s.
Of course, reading the Garbage Collection Handbook: The Art of Automatic Memory Management (Jones, Hosking, Moss) [ISBN-13: 978-1420082791] is a must
(added in 2017)
Look also into Ravenbrook's Memory Pool System which can be used for generational GC.
Look also into the runtime of Ocaml, which has a good (single-threaded) generational GC.
PS. Debugging a generational copying GC is painful.
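To give a feel for the moving parts mentioned above (a nursery, copying survivors, a write barrier, and why the C code has to follow some style rules), here is a minimal sketch. It is not Qish, MELT, or any real collector: the Obj layout, the fixed-size arrays, and every function name are assumptions made up for illustration, and the old generation is never collected.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical object layout: two child pointers plus a payload. */
typedef struct Obj {
    struct Obj *forward;  /* new address once copied out of the nursery */
    struct Obj *field[2]; /* children, NULL when empty */
    int value;
} Obj;

enum { NURSERY_SIZE = 64, OLD_SIZE = 1024, MAX_ROOTS = 16, MAX_REMEMBERED = 256 };

static Obj nursery[NURSERY_SIZE];       /* young generation, bump-allocated */
static Obj old_space[OLD_SIZE];         /* old generation, never collected in this sketch */
static size_t nursery_top, old_top;
static Obj **roots[MAX_ROOTS];          /* addresses of the mutator's root variables */
static size_t root_count;
static Obj *remembered[MAX_REMEMBERED]; /* old objects that may point to young ones */
static size_t remembered_count;

static int in_nursery(const Obj *o) {
    uintptr_t p = (uintptr_t)o;
    return p >= (uintptr_t)nursery && p < (uintptr_t)(nursery + NURSERY_SIZE);
}

/* Copy a young object to the old space once; later calls return the same copy. */
static Obj *promote(Obj *o) {
    if (o == NULL || !in_nursery(o)) return o;
    if (o->forward != NULL) return o->forward;
    assert(old_top < OLD_SIZE);
    Obj *copy = &old_space[old_top++];
    *copy = *o;
    copy->forward = NULL;
    o->forward = copy;
    return copy;
}

/* Minor collection: evacuate everything reachable from the roots and the
   remembered set, then reuse the whole nursery. */
static void minor_gc(void) {
    size_t scan = old_top;
    for (size_t i = 0; i < root_count; i++) *roots[i] = promote(*roots[i]);
    for (size_t i = 0; i < remembered_count; i++)
        for (int f = 0; f < 2; f++)
            remembered[i]->field[f] = promote(remembered[i]->field[f]);
    while (scan < old_top) {            /* Cheney-style scan of the fresh copies */
        Obj *o = &old_space[scan++];
        for (int f = 0; f < 2; f++) o->field[f] = promote(o->field[f]);
    }
    nursery_top = 0;
    remembered_count = 0;
}

static void add_root(Obj **slot) {
    assert(root_count < MAX_ROOTS);
    roots[root_count++] = slot;
}

/* Allocation may move objects, so callers must keep live pointers in roots. */
static Obj *gc_alloc(int value) {
    if (nursery_top == NURSERY_SIZE) minor_gc();
    Obj *o = &nursery[nursery_top++];
    o->forward = NULL;
    o->field[0] = o->field[1] = NULL;
    o->value = value;
    return o;
}

/* Write barrier: record old-to-young stores so minor_gc can find them. */
static void write_field(Obj *obj, int f, Obj *val) {
    if (!in_nursery(obj) && in_nursery(val)) {
        assert(remembered_count < MAX_REMEMBERED);
        remembered[remembered_count++] = obj;
    }
    obj->field[f] = val;
}

/* Build a rooted list of n cells while also creating unrooted garbage, then sum it. */
static long demo_sum(int n) {
    static Obj *list;                   /* static so the registered root outlives the call */
    list = NULL;
    add_root(&list);
    for (int i = 0; i < n; i++) {
        gc_alloc(-1);                   /* garbage: never stored anywhere */
        Obj *cell = gc_alloc(i);        /* may run a minor GC before returning */
        write_field(cell, 0, list);
        list = cell;
    }
    long sum = 0;
    for (Obj *p = list; p != NULL; p = p->field[0]) sum += p->value;
    return sum;
}
```

Calling demo_sum(200) returns 19900 (the sum of 0..199) after several minor collections, and the garbage cells never reach the old space. The rule that every live pointer must sit in a registered root across a gc_alloc call is exactly the kind of coding style restriction a copying GC imposes on C code.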

Java's HotSpot GC
You can look at the various GC implementations provided by the JVM here.
The Memory Management white paper gives an overview of the different garbage collectors implemented in the JVM. It's from 2006, so it's missing details on the newer G1 collector, but it's a good starting point.
Mono's SGen GC
Mono's new SGen is on github too. Check out the sgen files.

The Ovm framework is open source and lets you choose among several garbage collection features for real-time systems.
According to the website
Includes Minuteman RTGC framework which allows to select from newly
supported RTGC features: time-based scheduling (periodic, slack, and
hybrid - a combination of both), incremental stack scanning,
replication or Brooks barrier, incremental object copy, arraylets,
memory usage, and GC pause profiling and tracing.
Although domain specific, it may be a good starting point for your study.
I hope this helps.

The V8 project (Javascript engine used in Chrome and Android) is open source and has a simple generational garbage collector.
You can browse the source code online. In particular, look at heap.cc (implementation of the heap and scavenge algorithm), spaces.cc (lower level heap stuff), and mark-compact.cc (full garbage collector).

The Parrot VM also uses a generational garbage collector.

Although it's not written in C, the JikesRVM JVM contains several GC implementations, including a couple of generational ones, and I think it's rather simple to understand.

The Boehm Garbage Collector is commonly used for C and C++ projects.

Related

How Kotlin/Native garbage collector works in C?

I found some explanation of the Kotlin/Native memory management model in the JetBrains FAQ.
A: Kotlin/Native provides an automated memory management scheme,
similar to what Java or Swift provides. The current implementation
includes an automated reference counter with a cycle collector to
collect cyclical garbage.
I understand more or less how it works in Java or Kotlin (JVM). Can anyone describe in detail how memory is managed in Kotlin/Native in projects with C?
Also, if there is a garbage collector, why do we need the Kotlin/Native function memScoped { }?
Also, I found here :
Kotlin/Native is a technology for compiling Kotlin to native binaries that run without any VM.
Broadly speaking, Native code is any code whose memory is not managed by the underlying framework but has to be managed by the programmer themselves. i.e. there is no Garbage collection.
e.g. C++’ delete and C’s free
which in my opinion contradicts what is written in JetBrains FAQ
Memory management in K/N is provided by the runtime. It consists of two main parts: automatic reference counting and a cycle collector. This lets you write code just as you would in Kotlin/JVM. Some details on this topic can be found by digging inside this file, but all you need to know is that it is automatic by default.
About memScoped etc.: when you use interoperability with C, you have to manage native memory as a resource. Native memory is the memory provided to the application process by the operating system. As it has nothing to do with the Kotlin code, this resource can't be managed by the K/N runtime. But all C structs and variables you are going to use must be allocated there. You can do it directly by calling the nativeHeap.alloc() function. When the memory is no longer needed, it can be freed with nativeHeap.free().
But to make your experience more comfortable, K/N also has the Arena class, which implements region-based memory management. It reduces memory management to a series of alloc() calls wherever you need them, and one deallocation by .clear() for the whole region.
There are also memScoped { } blocks, which hide the Arena from you and let you forget even about freeing the native memory. So in code that involves some elements from C, you can just write memScoped { ... } and put the operations inside it. You can see a lot of examples of this approach in the samples from the K/N repository.
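Since the question is about projects with C, it may help to see the region-based idea in C itself. This is only a sketch of the pattern described above (many alloc calls, one clear for the whole region); it is not Kotlin/Native's implementation, and the Arena type and the arena_alloc and arena_clear names are assumptions made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Header placed in front of every allocation so the arena can find it later.
   The max_align_t member keeps the payload that follows suitably aligned. */
typedef union ArenaBlock {
    union ArenaBlock *next;
    max_align_t align;
} ArenaBlock;

typedef struct {
    ArenaBlock *head;  /* most recent allocation */
    size_t count;      /* number of live allocations in the region */
} Arena;

/* Allocate size bytes that belong to the arena; returns NULL on failure. */
static void *arena_alloc(Arena *a, size_t size) {
    assert(a != NULL);
    ArenaBlock *b = malloc(sizeof(ArenaBlock) + size);
    if (b == NULL) return NULL;
    b->next = a->head;
    a->head = b;
    a->count++;
    return b + 1;      /* payload starts right after the header */
}

/* Free every allocation made from the arena; returns how many were freed. */
static size_t arena_clear(Arena *a) {
    assert(a != NULL);
    size_t freed = 0;
    while (a->head != NULL) {
        ArenaBlock *next = a->head->next;
        free(a->head);
        a->head = next;
        freed++;
    }
    a->count = 0;
    return freed;
}
```

A scoped block then amounts to declaring an Arena at the top of a function, allocating C structs from it freely, and calling arena_clear once on every exit path.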

observing memory allocation in Xcode Ansi C

I want to observe the dynamically allocated memory in a C program while it is running, and to detect memory leaks. My program allocates memory according to user input. I've been looking for hours for tutorials that might help, but everything I've found is not based on user input! I want to enter input and run Instruments at the same time. Any suggestions?
I'd suggest you watch the WWDC 2012 video iOS App Performance: Memory. It gives an excellent primer on types of memory, issues that can arise, coding conventions to watch out for, how to use Instruments to identify issues, etc. It's a good place to start.
Lots of leaks cannot be identified by the "Leaks" tool in Instruments. Check out the "Allocations" tool and some of the great features hidden in there such as heapshots (discussed in that video) or option-dragging in the Allocations tool graph. Also, make sure you avail yourself of the static analyzer ("Analyze" on the "Product" menu in Xcode, or command+shift+B), which can identify a remarkable number of issues just by analyzing your code.

Should a C library offer ability to use custom memory allocators?

I see that some C libraries have ability to specify custom memory allocators (malloc/free replacements).
In what systems/environments/conditions is that useful? Isn't this feature just a leftover from MSDOS era or similar no-longer-relevant problems?
Background story:
I'm planning to make pngquant a library that can be embedded in various software (from iOS apps to Apache modules). I'm using malloc()/free() and my own memory pools for small allocations. I use 2MB-50MB of memory in total. I use threads, but only need to alloc on the main thread.
In any application where control over memory allocation is critical (for example my field, game development, or other real or near real time systems) the inability to control memory allocations in a library immediately disqualifies it from use.
Many malloc/free algorithms exist. The system malloc is sometimes not optimized for the task that the library is handling, so the caller might want to try a few different ones to optimize performance.
A few that come to mind are:
dlmalloc
jemalloc
TCMalloc
There are also Garbage Collection libraries such as the Boehm Garbage Collector which are usable in C by calling the provided malloc/free replacements (even though free is then a dummy function call, kept for compatibility).
There are many other possible uses as well. For example, one may write debug malloc/free functions that trace memory allocations and frees in the library, such as one I wrote that uses SQLite to record statistics about how the memory is used (admittedly at the cost of performance, but it is a debugging situation).
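As a rough sketch of what offering this ability can look like, the library below takes a small table of callbacks instead of calling malloc/free directly, and the caller plugs in a counting allocator for debugging. Everything here is hypothetical: lib_allocator, lib_copy_string, lib_free_string and the counting callbacks are names invented for the example, not the API of pngquant or any existing library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical library-side API: the caller may supply its own allocator. */
typedef struct {
    void *(*alloc)(size_t size, void *user);
    void (*release)(void *ptr, void *user);
    void *user;                        /* passed back to both callbacks */
} lib_allocator;

static void *default_alloc(size_t size, void *user) { (void)user; return malloc(size); }
static void default_release(void *ptr, void *user) { (void)user; free(ptr); }
static const lib_allocator lib_default_allocator = { default_alloc, default_release, NULL };

/* A library function that allocates only through the supplied hooks.
   Passing NULL selects the default malloc/free pair. */
static char *lib_copy_string(const lib_allocator *a, const char *s) {
    if (a == NULL) a = &lib_default_allocator;
    assert(a->alloc != NULL && a->release != NULL);
    size_t n = strlen(s) + 1;
    char *copy = a->alloc(n, a->user);
    if (copy != NULL) memcpy(copy, s, n);
    return copy;
}

static void lib_free_string(const lib_allocator *a, char *s) {
    if (a == NULL) a = &lib_default_allocator;
    a->release(s, a->user);
}

/* Caller-side debug allocator: counts calls, then defers to malloc/free. */
typedef struct { size_t allocs, frees; } alloc_stats;

static void *counting_alloc(size_t size, void *user) {
    ((alloc_stats *)user)->allocs++;
    return malloc(size);
}

static void counting_release(void *ptr, void *user) {
    if (ptr != NULL) ((alloc_stats *)user)->frees++;
    free(ptr);
}
```

The user pointer is what makes the hooks usable with pools, per-thread heaps, or statistics objects without resorting to global state; a mismatch between allocs and frees at shutdown points to a leak inside the library.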

Library for task distribution in MPI (or other)?

I'm looking to implement 'branch and bound' over a cluster (like Amazon's say), as I want it to be horizontally scalable, not limited to a single CPU. There's a paper "Task Pool Teams: A Hybrid Programming Environment for Irregular Algorithms on SMP Clusters" by Judith Hippold and Gudula Runger. It's basically a bottom-up, task-stealing framework like Intel's TBB, except for ad-hoc networks instead of shared memory. If this library was available I'd use it (replacing the local, threaded part with TBB). Unfortunately they don't seem to have made it available for download anywhere that I could find, so I wonder are there other implementations, or similar libraries out there?
It doesn't look like Microsoft's Task Parallel Library has the equivalent, either, to steal from.
(I tried to make a tag 'taskpool' after 'threadpool', the most-used variant (before 'thread-pool') but, didn't have enough points. Anyone heavy enough think it's worth adding?)
edit:
I haven't tried it yet, but PEBBL (under here: software.sandia.gov/trac/acro/wiki/Packages) claims to scale really high. I found it in the paper the answerer mentions from the Wiley book: 'Parallel Branch-and-Bound Algorithms' by Crainic, Le Cun and Roucairol, in "Parallel Combinatorial Optimization", 2006, edited by El-Ghazali Talbi. There are other libraries listed there; some may be better, and I reserve the right to update this :). Funny that Google didn't find these libs; either my Googling was weak or Google itself fails to be magic sometimes.
When you say "over a cluster" it sounds like you mean distributed memory, and parallelizing branch and bound is a notoriously difficult problem for distributed memory - at least in a way that guarantees scalability. The seminal paper on this topic is available here, and there's an excerpt from a Wiley book on the topic here.
Shared memory branch and bound is an easier problem because you can implement a global task queue. A good high level description of how to do both shared memory and message passing implementations is available here. If nothing else, the references section is worth perusing for ideas and existing implementations.
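To make the task-queue shape concrete, here is a small single-threaded sketch of branch and bound driven by a task pool, using a toy 0/1 knapsack. The item data, the Task layout, and all function names are assumptions for illustration. In a shared-memory version the pool and the best value would be shared by the worker threads and protected by a lock, which is precisely the part that does not carry over easily to distributed memory.

```c
#include <assert.h>
#include <stddef.h>

enum { N_ITEMS = 4, POOL_CAP = 64 };

/* Hypothetical problem instance. */
static const int item_weight[N_ITEMS] = {2, 3, 4, 5};
static const int item_value[N_ITEMS]  = {3, 4, 5, 6};

/* A task is a partial solution: items before next_item are already decided. */
typedef struct { int next_item, weight, value; } Task;

static Task pool[POOL_CAP];   /* the global task pool (a simple stack here) */
static size_t pool_size;

static void pool_push(Task t) {
    assert(pool_size < POOL_CAP);
    pool[pool_size++] = t;
}

static int pool_pop(Task *t) {
    if (pool_size == 0) return 0;
    *t = pool[--pool_size];
    return 1;
}

/* Optimistic bound: current value plus the value of every undecided item. */
static int upper_bound(const Task *t) {
    int bound = t->value;
    for (int i = t->next_item; i < N_ITEMS; i++) bound += item_value[i];
    return bound;
}

/* Returns the best total value that fits within capacity. */
static int knapsack(int capacity) {
    int best = 0;
    Task t;
    pool_size = 0;
    pool_push((Task){0, 0, 0});
    while (pool_pop(&t)) {
        if (t.value > best) best = t.value;          /* every task is feasible */
        if (t.next_item == N_ITEMS) continue;        /* leaf: nothing to branch on */
        if (upper_bound(&t) <= best) continue;       /* bound: subtree cannot improve */
        int i = t.next_item;
        pool_push((Task){i + 1, t.weight, t.value});                  /* branch: skip item i */
        if (t.weight + item_weight[i] <= capacity)                    /* branch: take item i */
            pool_push((Task){i + 1, t.weight + item_weight[i], t.value + item_value[i]});
    }
    return best;
}
```

With this data, knapsack(5) returns 7 (the first two items) and knapsack(14) returns 18 (all items). The better the shared best value is kept up to date, the more tasks get pruned, which is why stale bounds across cluster nodes hurt scalability.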
One thing you might consider is investigating shared message queues like RabbitMQ. It is an AMQP server (AMQP being a messaging protocol developed so distributed applications can send messages to each other).
You basically need some kind of distributed synchronization/queue.
I suggest looking into ARMCI as a low-level distributed memory interface with synchronization, and building on top of that.
An alternative is to designate one MPI process as a master that distributes the work.
http://www.cs.utk.edu/~dongarra/ccgsc2008/talks/Talk10-Lusk.pdf

Memory leak detectors for C?

What memory leak detectors have people had a good experience with?
Here is a summary of the answers so far:
Valgrind - Instrumentation framework for building dynamic analysis tools.
Electric Fence - A tool that works with GDB
Splint - Annotation-Assisted Lightweight Static Checking
Glow Code - This is a complete real-time performance and memory profiler for Windows and .NET programmers who develop applications with C++, C#, or any .NET Framework
Also see this stackoverflow post.
second the valgrind... and I'll add electric fence.
Valgrind under linux is fairly good; I have no experience under Windows with this.
If you have the money: IBM Rational Purify is an extremely powerful industry-strength memory leak and memory corruption detector for C/C++. Exists for Windows, Solaris and Linux. If you're linux-only and want a cheap solution, go for Valgrind.
Mudflap for gcc! It actually compiles the checks into the executable. Just add
-fmudflap -lmudflap
to your gcc flags.
I had quite some hits with cppcheck, which does static analysis only. It is open source and has a command line interface (I did not use it in any other way).
lint (very similar open-source tool called splint)
Also worth using if you're on Linux using glibc is the built-in debug heap code. To use it, link with -lmcheck or define (and export) the MALLOC_CHECK_ environment variable with the value 1, 2, or 3. The glibc manual provides more information.
This mode is most useful for detecting double-frees, and it often finds writes outside the allocated memory area when doing a free. I don't think it reports leaked memory.
Painful but if you had to use one..
I'd recommend the DevPartner BoundsChecker suite.. that's what people at my workplace use for this purpose. Paid n proprietary.. not freeware.
I've had minimal love for any memory leak detectors. Typically there are far too many false positives for them to be of any use. I would recommend these two as being the least intrusive:
GlowCode
Debug heap
For Win32 debugging of memory leaks I have had very good experiences with the plain old CRT Debug Heap, that comes as a lib with Visual C.
In a Debug build malloc (et al) get redefined as _malloc_dbg (et al) and there are other calls to retrieve results, which are all undefined if _DEBUG is not set. It sets up all sorts of boundary guards on the heap, and allows you to display the results at any time.
I had a few false positives when I was writing some time routines that messed with the library runtime allocations, until I discovered _CRT_BLOCK.
I had to produce first DOS, then Win32 console programs and services that would run forever. As far as I know there are no memory leaks, and in at least one place the code ran for two years unattended before the monitor on the PC failed (though the PC was fine!).
On Windows, I have used Visual Leak Detector. Integrates with VC++, easy to use (just include a header and set LIB to find the lib), open source, free to use FTW.
At university when I was doing most things under Unix Solaris I used gdb.
However I would go with valgrind under Linux.
The granddaddy of these tools is the commercial, closed-source Purify tool, which was sold to IBM and then to UNICOM
Parasoft's Insure++ (source code instrumentation) and valgrind (open source) are the two other real competitors.
Trivia: the original author of Purify, Reed Hastings, went on to found Netflix.
No one mentioned clang's MSan, which is quite powerful. It is officially supported on Linux only, though.
This question may be old, but I'll answer it anyway; maybe my answer will help someone find their memory leaks.
This is my own project - I've put it as open source code:
https://sourceforge.net/projects/diagnostic/
Windows 32 & 64-bit platforms are supported, native and mixed mode callstacks are supported.
.NET garbage collection is not supported. (C++ cli's gcnew or C#'s new)
It is a high-performance tool and does not require any integration (unless you really want to integrate it).
Complete manual can be found here:
http://diagnostic.sourceforge.net/index.html
Don't be put off by how many leaks it detects in your process. It catches memory leaks from the whole process. Analyze only the biggest leaks, not all of them.
I'll second the valgrind as an external tool for memory leaks.
But, for most of the problems I've had to solve I've always used internally built tools. Sometimes the external tools have too much overhead or are too complicated to set up.
Why use already written code when you can write your own :)
I joke, but sometimes you need something simple and it's faster to write it yourself.
Usually I just replace calls to malloc() and free() with functions that keep better track of who allocates what. Most of my problems seem to be that someone forgot to free, and this helps to solve that problem.
It really depends on where the leak is, and if you knew that, then you would not need any tools. But if you have some insight into where you think it's leaking, then put in your own instrumentation and see if it helps you.
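A home-grown tracker of the kind described above can be very small. The sketch below is one possible shape, not the answerer's actual tool: the TRACK_MALLOC and TRACK_FREE macros, the fixed-size table, and the report_leaks function are all assumptions made up for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

enum { MAX_TRACKED = 1024 };

/* One record per live allocation: who allocated what, and where. */
typedef struct {
    void *ptr;
    size_t size;
    const char *file;
    int line;
} alloc_record;

static alloc_record live[MAX_TRACKED];

static void *tracked_malloc(size_t size, const char *file, int line) {
    void *p = malloc(size);
    if (p == NULL) return NULL;
    for (size_t i = 0; i < MAX_TRACKED; i++) {
        if (live[i].ptr == NULL) {
            live[i] = (alloc_record){p, size, file, line};
            break;
        }
    }
    /* If the table is full, the allocation still succeeds but goes untracked. */
    return p;
}

static void tracked_free(void *p) {
    if (p == NULL) return;
    for (size_t i = 0; i < MAX_TRACKED; i++) {
        if (live[i].ptr == p) {
            live[i].ptr = NULL;
            break;
        }
    }
    free(p);
}

/* Print every allocation that was never freed; returns how many there are. */
static size_t report_leaks(FILE *out) {
    assert(out != NULL);
    size_t leaks = 0;
    for (size_t i = 0; i < MAX_TRACKED; i++) {
        if (live[i].ptr != NULL) {
            fprintf(out, "%s:%d: %zu bytes not freed\n",
                    live[i].file, live[i].line, live[i].size);
            leaks++;
        }
    }
    return leaks;
}

/* Use these in place of malloc()/free() in the code under suspicion. */
#define TRACK_MALLOC(size) tracked_malloc((size), __FILE__, __LINE__)
#define TRACK_FREE(p) tracked_free(p)
```

Calling report_leaks(stderr) just before exit lists the file and line of every allocation that was forgotten, which is usually enough to find a missing free. The linear scans keep the sketch short; they are fine for debugging but too slow for allocation-heavy code.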
Our CheckPointer tool can do this for GNU C 3/4, MS dialects of C, and GreenHills C. It can find memory management problems that Valgrind cannot.
If your code simply leaks, on exit CheckPointer will tell you where all the unfreed memory was allocated.

Resources