Msxml3 - Memory leak - msxml

I plan to write an application (Win32 platform) for parsing XML documents. To parse the XML, I plan to use msxml3.dll (Microsoft's latest service pack library), but many forum posts claim that it has a huge memory leak issue.
Is it really true that msxml3.dll has a huge memory leak?

MSXML3 has its own garbage collection mechanism. If you are not aware of this mechanism, MSXML3 only "appears" to be leaking memory until the garbage collector kicks in and recycles resources. Please check "Understanding the MSXML garbage collection mechanism" for more details.

Related

How does the Kotlin/Native garbage collector work in C?

I found an explanation of the Kotlin/Native memory management model in the JetBrains FAQ.
A: Kotlin/Native provides an automated memory management scheme,
similar to what Java or Swift provides. The current implementation
includes an automated reference counter with a cycle collector to
collect cyclical garbage.
I understand more or less how it works in Java or Kotlin (JVM). Can anyone describe in detail how memory is managed in Kotlin/Native in projects with C?
Also, if there is the garbage collector, why do we need a Kotlin/Native function memScoped { }?
Also, I found this here:
Kotlin/Native is a technology for compiling Kotlin to native binaries that run without any VM.
Broadly speaking, Native code is any code whose memory is not managed by the underlying framework but has to be managed by the programmer themselves. i.e. there is no Garbage collection.
e.g. C++'s delete and C's free
which in my opinion contradicts what is written in the JetBrains FAQ.
Memory management in K/N is provided by the runtime. It consists of two main parts: automatic reference counting and a cycle collector. This lets you write code just as you would in Kotlin/JVM. Some details on this topic can be found by digging inside this file, but all you need to know is that it is automatic by default.
About memScoped etc.: when you use interoperability with C, you have to manage a resource called native memory. Native memory is the memory provided to the application process by the operating system. As it has nothing to do with the Kotlin code, this resource can't be managed by the K/N runtime. But all C structs and variables you are going to use must be allocated there. You can do that directly by calling the nativeHeap.alloc() function. When this memory is no longer needed, it can be freed with nativeHeap.free().
But to make your experience more comfortable, K/N also has the Arena class, which implements region-based memory management. It reduces memory management to a series of alloc() calls wherever you need them, and one deallocation by .clear() for the whole region.
There are also memScoped { } blocks, which hide the Arena from you and let you forget about freeing the native memory altogether. So in code that includes some elements from C, you can just write memScoped { ... } and put the operations inside it. You can see a lot of examples of this approach in the samples from the K/N repository.
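To make the region-based idea concrete, here is a minimal sketch in plain C. It is not how the K/N Arena is implemented; the names Arena, arena_alloc and arena_clear are made up for this illustration. It only shows the pattern described above: many allocations, one release for the whole region.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Header stored in front of every allocation. The max_align_t member pads
   the header so the payload that follows is suitably aligned for any type. */
typedef union ArenaHeader {
    union ArenaHeader *next;
    max_align_t align;
} ArenaHeader;

/* Hypothetical arena: a linked list of every block handed out so far. */
typedef struct {
    ArenaHeader *head;
    size_t live;            /* number of blocks not yet released */
} Arena;

/* Allocate `size` bytes owned by the arena; returns NULL on failure. */
void *arena_alloc(Arena *arena, size_t size)
{
    assert(arena != NULL);
    if (size > SIZE_MAX - sizeof(ArenaHeader))
        return NULL;        /* header + payload would overflow size_t */
    ArenaHeader *block = malloc(sizeof *block + size);
    if (block == NULL)
        return NULL;
    block->next = arena->head;
    arena->head = block;
    arena->live++;
    return block + 1;       /* payload starts right after the header */
}

/* Release every block at once; returns how many blocks were freed. */
size_t arena_clear(Arena *arena)
{
    assert(arena != NULL);
    size_t freed = 0;
    while (arena->head != NULL) {
        ArenaHeader *next = arena->head->next;
        free(arena->head);
        arena->head = next;
        freed++;
    }
    arena->live = 0;
    return freed;
}
```

A caller would start with Arena a = {0}, call arena_alloc(&a, ...) as often as needed, and finish with a single arena_clear(&a). A memScoped-style block is roughly that sequence with the final clear done for you when the block ends.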

Hunting for memory leaks on embedded system without valgrind (or using minimal valgrind-like application)

I'm working with embedded linux development, and we're currently having some trouble with some memory page allocation faults, which led me to believe we have a leak somewhere.
Currently, I am trying to cross compile valgrind to use on our system, but I'm losing faith in this solution because of the sheer amount of memory valgrind will use up (we have serious memory restrictions).
This has made me wonder: is there any way of hunting for a memory leak without valgrind, or with a valgrind-like application with minimal memory usage? Creating wrappers for malloc() and free() is out of the question.
Also, the test that caused the allocation failures was a simple stress test of copying a file n times and checking its md5sum, in case anyone is curious.
I'm using the Linaro toolchain for cross compiling, glibc 2.15, and the system is set up without a swap partition. The system has around 64MB of RAM, making valgrind, or any other memory intensive application a tad difficult to use.
Regards,
Guilherme
Since you are using glibc, you should have its built-in memory-tracing support available to you. Your program would enable this by calling mtrace(3) at or near startup. mtrace() installs hook functions into the memory allocator to log allocations and deallocations, under runtime control via environment variable MALLOC_TRACE.
You probably also want to be aware of mtrace(1), a Perl script for interpreting log files produced by the mtrace facility.
This facility traces only allocations and deallocations, which is much less than Valgrind does. Nevertheless, those are the main items of interest when you are looking for a memory leak.
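A minimal sketch of how that could look in code, assuming a made-up workload function run_traced_workload() that deliberately leaks one buffer so the trace has something to report. Only mtrace() and muntrace() from <mcheck.h> are the real glibc calls. This matches the glibc 2.15 in the question; on much newer glibc releases (2.34 and later, as far as I know) the tracing code additionally has to be preloaded from libc_malloc_debug.so.

```c
#include <assert.h>
#include <mcheck.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical workload: allocates `rounds` buffers and frees all but the
   last one. Returns the number of buffers that were deliberately leaked. */
int run_traced_workload(int rounds)
{
    assert(rounds >= 0);
    int leaked = 0;

    mtrace();                       /* start logging if MALLOC_TRACE is set */
    for (int i = 0; i < rounds; i++) {
        char *buf = malloc(64);
        if (buf == NULL)
            break;
        memset(buf, 0, 64);
        if (i == rounds - 1)
            leaked++;               /* intentional leak for the demo */
        else
            free(buf);
    }
    muntrace();                     /* stop logging */

    return leaked;
}
```

Run the program with the log location in the environment, for example MALLOC_TRACE=/tmp/app.trace ./app, and then feed the log to the script with mtrace ./app /tmp/app.trace (the paths here are placeholders). Without MALLOC_TRACE set, the mtrace() call does nothing, so the instrumentation can stay in the binary.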

Is memory leak unavoidable in C

There are many memory leak bugs in our company's code, and our usual solution is "reading the code", even though we have tools to find where the memory leaks. So I am wondering: are memory leaks unavoidable in C, or is garbage collection simply not worth the cost in system performance?
It is always possible to avoid memory leaks; it's just that it can be difficult to do so with manual memory management. As programs grow complex, it becomes harder to do memory management correctly. That is why you see many larger projects implement some kind of automatic or semi-automatic memory management. For instance, GCC has a garbage collector, as do open source web browsers like Firefox and Chrome (I'm sure the closed source web browsers have one as well, but it's not so easy to tell).
It is important to note that automatic memory management does not remove all memory leaks. Data can still be retained unnecessarily. But automatic memory management makes things easier and helps avoid errors like freeing memory twice or referencing already freed memory.
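For the manual case, one common discipline is to give every resource exactly one owner, one create function and one destroy function, and to route all failure paths through the same cleanup code. The sketch below illustrates that idea only; Record, record_create and record_destroy are hypothetical names, not taken from any particular code base.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record that owns two heap allocations. */
typedef struct {
    char *name;
    int *values;
    size_t count;
} Record;

/* Single release point: safe on NULL and on partially built records. */
void record_destroy(Record *r)
{
    if (r == NULL)
        return;
    free(r->values);
    free(r->name);
    free(r);
}

/* Builds a record, or returns NULL without leaking anything on failure. */
Record *record_create(const char *name, size_t count)
{
    assert(name != NULL);
    Record *r = calloc(1, sizeof *r);   /* zeroed, so unset fields are NULL */
    if (r == NULL)
        return NULL;

    r->name = malloc(strlen(name) + 1);
    if (r->name == NULL)
        goto fail;
    strcpy(r->name, name);

    /* Allocate at least one element so a zero count is not mistaken
       for an allocation failure. */
    r->values = calloc(count ? count : 1, sizeof *r->values);
    if (r->values == NULL)
        goto fail;
    r->count = count;
    return r;

fail:
    record_destroy(r);                  /* frees whatever was allocated */
    return NULL;
}
```

Every successful record_create() is then paired with exactly one record_destroy(), which is much easier to check in a code review or with a leak tool than frees scattered across several error branches.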

Generational GC source code

I am studying GC implementations, and I'm currently looking for references and good open-source GC examples to build on.
Is there any good and simple generational GC implementation? The second best thing would be good resources and guidelines!
Thank you!
I wrote the Qish garbage collector (not really maintained any more, but feel free to ask). It is a free copying generational GC for C (with some coding style restrictions).
The GCC MELT [meta-]plugin (free, GPLv3 licensed), providing a high level language, MELT, to extend the GCC compiler, also has a copying generational GC above the existing Ggc garbage collector of GCC. Look into gcc/melt-runtime.c
With generational copying GC, generating the application's code in C is quite useful. See my DSL2011 paper on MELT
Feel free to ask me more, I love talking about my GC-s.
Of course, reading the Garbage Collection Handbook: The Art of Automatic Memory Management (Jones, Hosking, Moss) [ISBN-13: 978-1420082791] is a must
(added in 2017)
Look also into Ravenbrook's Memory Pool System which can be used for generational GC.
Look also into the runtime of Ocaml, which has a good (single-threaded) generational GC.
PS. Debugging a generational copying GC is painful.
Java's HotSpot GC
You can look at the various GC implementations provided by the JVM here.
The Memory Management white paper gives an overview of the different garbage collectors implemented in the JVM. It's from 2006, so it's missing the details of the new G1 collector, but it's a good starting point.
Mono's SGen GC
Mono's new SGen is on github too. Check out the sgen files.
The Ovm framework is open source and lets you select among several garbage collection features for real-time systems.
According to the website
Includes Minuteman RTGC framework which allows to select from newly
supported RTGC features: time-based scheduling (periodic, slack, and
hybrid - a combination of both), incremental stack scanning,
replication or Brooks barrier, incremental object copy, arraylets,
memory usage, and GC pause profiling and tracing.
Although domain specific, it may be a good starting point for your study.
I hope this helps.
The V8 project (Javascript engine used in Chrome and Android) is open source and has a simple generational garbage collector.
You can browse the source code online. In particular, look at heap.cc (implementation of the heap and scavenge algorithm), spaces.cc (lower level heap stuff), and mark-compact.cc (full garbage collector).
The Parrot VM also uses a generational garbage collector.
Although it's not written in C, the JikesRVM JVM contains several GC implementations, including a couple of generational ones, and I think it's rather simple to understand.
The Boehm Garbage Collector is commonly used for C and C++ projects.

memory leaks during development

So, I've recently noticed that our development server has a steady ~300MB out of 4GB RAM left after development of a certain project finished. Assuming this was due to memory leaks during the development phase, will that memory eventually free itself up, or will it require a server restart? Are there any tools that can be used to prevent this in the future (aside from the obvious, 'don't write code that produces memory leaks')? Sometimes leaks go unseen for a little while, and over time I guess they add up as you continue testing your app.
What operating system are you running? Most operating systems these days will clean up leaked memory for a process when the process exits. It is possible that the memory you are seeing in use is actually being used for the filesystem cache. This is nothing to worry about -- the OS will reclaim this memory if necessary.
From: http://learnlinux.tsf.org.za/courses/build/internals/ch05.html
The amount of free memory indicated by
the free command includes the current
size of the buffer cache in its
calculation. This is misleading, as
the amount of free memory indicated
will often be very low, as the buffer
cache soon fills most of user memory.
Don't panic. Applications are
probably not crowding your RAM; it is
merely the buffer cache that is taking
up all available space. The buffer
cache counts as memory space available
for application use (remembering that
it will be shrunk as required), so
subtract the size of the buffer cache
to see the real amount of free memory
available for application use
It's best to fight them during development, because then it's easier to identify the revision that introduces the leak. As you probably see now, doing it after the fact is very, very hard. Expect a lot of reports when running the tools I recommend below:
http://valgrind.org/
http://www.ibm.com/software/awdtools/purify/
http://directory.fsf.org/project/ElectricFence/
I'd suggest running these tools, suppressing most of the leak warnings at first, and then fixing the leaks one by one, removing the suppressions as you go.
And then, make sure you regularly run these tools and quickly fix any regressions!
Of course the obvious answer is "Don't write code that produces memory leaks" and it's a valid one, because they can be extremely hard to fix if you have reference counting issues, or complex code in which it's hard to track the lifetime of memory.
To address your current situation you might consider using a tool such as DevPartner for Windows, or Valgrind for Linux/Unix, both of which I've found to be very effective for tracking down memory leaks (as well as other issues such as performance bottlenecks).
Another thing you may wish to consider is to look at your use of pointers and slowly replace them with smart pointers if you can, which should help manage your pointer lifetimes.
And no, I doubt that memory is going to be recovered without restarting the process in which your code is running.
Run the program using the exceptional valgrind on Linux x86 boxes.
A commercial equivalent, Purify, is available on Windows.
These runtime analysis tools report memory leaks and other errors such as buffer overflows and uninitialised variables.
Static code analysis - Lint and Coverity for example - can also uncover memory leaks and more serious errors.
Let's be specific about what memory leaks cause and how they harm your program:
If you 'leak' memory during operation of your program, there is a risk that your application will eventually exhaust RAM and swap, or the address space available to your program (which can be less than physical RAM), and cause the next allocation to fail. The vast majority of programs will fail to catch this error, as error checking is harder than it seems. The majority of programs will either fail by dereferencing a null pointer or will exit.
If this is on Linux, check the output of 'free' and specifically check the amount of 'cached' ram. If your development work includes a lot of disk I/O, it'll use it for caching files, and you'll see very little 'available' but it's still there if it's needed. For all practical purposes, consider free+cached as available.
The 'free' output is distilled from /proc/meminfo, and you can get more detailed information on the running process in /proc/$pid/{maps,smaps}
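As a rough sketch of the free+cached arithmetic, the helper below pulls fields out of /proc/meminfo-style text and adds them up. The function names meminfo_kb and reclaimable_estimate_kb are made up for this example, and including Buffers in the sum is my assumption of what "cached" should cover here; the field names themselves (MemFree, Buffers, Cached) are the ones /proc/meminfo uses.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Returns the value in kB of field `key` (e.g. "MemFree") from
   /proc/meminfo-style text, or -1 if the field is missing or malformed. */
long meminfo_kb(const char *text, const char *key)
{
    assert(text != NULL && key != NULL);
    size_t klen = strlen(key);
    const char *line = text;

    while (line != NULL && *line != '\0') {
        /* Match "key:" at the start of a line only, so that "Cached"
           does not accidentally match "SwapCached". */
        if (strncmp(line, key, klen) == 0 && line[klen] == ':') {
            long value;
            if (sscanf(line + klen + 1, "%ld", &value) == 1)
                return value;
            return -1;
        }
        line = strchr(line, '\n');
        if (line != NULL)
            line++;                 /* step past the newline */
    }
    return -1;
}

/* Estimate of memory that is free or can be reclaimed from caches, in kB. */
long reclaimable_estimate_kb(const char *text)
{
    long mem_free = meminfo_kb(text, "MemFree");
    long buffers = meminfo_kb(text, "Buffers");
    long cached = meminfo_kb(text, "Cached");

    if (mem_free < 0 || buffers < 0 || cached < 0)
        return -1;
    return mem_free + buffers + cached;
}
```

To use it on a live system, read /proc/meminfo into a NUL-terminated buffer (fopen plus fread is enough) and pass that buffer to reclaimable_estimate_kb(). If the result stays roughly constant while plain MemFree shrinks, the "missing" memory is most likely cache rather than a leak.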
In theory when your process exits, any memory it had is released. Is your process exiting?
Don't assume anything, run a memory profiler over it and see what it's doing.
When I was at college we used the Borland C++ Builder 6 IDE.
It included CodeGuard, which checks for memory leaks and other memory related issues.
I am not sure if this option is still available in newer versions, but it would be weird for a new version to have fewer features.
On Linux, as mentioned before, valgrind is a good memory leak debugger.

Resources