Why call `free(malloc(8))`? - c

The Objective-C runtime's hashtable2.mm file contains the following code:
static void bootstrap (void) {
    free(malloc(8));
    prototypes = ALLOCTABLE (DEFAULT_ZONE);
    prototypes->prototype = &protoPrototype;
    prototypes->count = 1;
    prototypes->nbBuckets = 1; /* has to be 1 so that the right bucket is 0 */
    prototypes->buckets = ALLOCBUCKETS(DEFAULT_ZONE, 1);
    prototypes->info = NULL;
    ((HashBucket *) prototypes->buckets)[0].count = 1;
    ((HashBucket *) prototypes->buckets)[0].elements.one = &protoPrototype;
};
Why does it allocate and immediately free the 8-byte block?
Another source of confusion is this method from objc-os.h:
static __inline void *malloc_zone_malloc(malloc_zone_t z, size_t size) { return malloc(size); }
Why does the signature declare two parameters when only one of them is used?

For the first question I can only guess. My bet is that it was done to avoid or reduce memory churn, or to segment the memory for some other reason. You can find it briefly discussed in the ChangeLog of bmalloc (which is not quite relevant, but I could not find a better reference):
2017-06-02 Geoffrey Garen <ggaren@apple.com>
...
Updated for new APIs. Note that we cache one free chunk per page
class. This avoids churn in the large allocator when you
free(malloc(X))
It is unclear, however, whether this technique causes the memory churn or was meant to address it.
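If you want to see what your own allocator does with this pattern, here is a tiny stand-alone illustration (not taken from the runtime; whether the warm-up block gets reused later is entirely allocator-dependent):
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    /* Warm-up in the style of the bootstrap code above: allocate a small
       block and release it immediately. */
    free(malloc(8));

    /* Whether the next 8-byte request reuses that block (avoiding churn in
       the allocator) is implementation-defined; printing the address only
       lets you observe what your malloc happens to do. */
    void *p = malloc(8);
    printf("next 8-byte block at %p\n", p);
    free(p);
    return 0;
}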
For the second question: the Objective-C runtime used to work with "zones", so that every allocation in a zone could be destroyed at once by destroying that zone. This proved error-prone, and zones were eventually abandoned. The API still takes a zone parameter for historical reasons (backward compatibility, I assume), but the documentation says that zones are ignored:
Zones are ignored on iOS and 64-bit runtime on OS X. You should not use zones in current development.

Related

Why a handle to an object frequently appears as a pointer-to-pointer

What is the intention of declaring the handle to an object as a pointer-to-pointer rather than a pointer? For example, in the following code:
FT_Library library;
FT_Error error = FT_Init_FreeType( &library );
where
typedef struct FT_LibraryRec_ *FT_Library
so &library is of type FT_LibraryRec_**
It's a way to emulate pass-by-reference in C, which otherwise only has pass-by-value.
The C library function FT_Init_FreeType has two outputs: the error code and the library handle (which is a pointer).
In C++ we'd more naturally either:
return an object that encapsulates the success or failure of the call and the library handle, or
return one output - the library handle, and throw an exception on failure.
C APIs are generally not implemented this way.
It is not unusual for a C Library function to return a success code, and to be passed the addresses of in/out variables to be conditionally mutated, as per the case above.
This approach hides the implementation. It speeds up compilation of your code. It allows the library to upgrade the data structures it uses without breaking existing code. Finally, it ensures that the address of the object never changes and that you never copy these objects.
Here’s how the version with a single pointer might be implemented:
struct FT_Struct
{
    // Some fields/properties go here, e.g.
    int field1;
    char* field2;
};

FT_Error Init( FT_Struct* p )
{
    p->field1 = 11;
    p->field2 = (char*)malloc( 100 );
    if( nullptr == p->field2 )
        return E_OUTOFMEMORY;
    return S_OK;
}
Or the C++ equivalent, without any pointers:
class FT_Struct
{
    int field1;
    std::vector<char> field2;

public:
    FT_Struct() :
        field1( 11 )
    {
        field2.resize( 100 );
    }
};
As a user of the library, you have to include the struct/class FT_Struct definition. Libraries can be very complex, so this slows down compilation of your code.
If the library is dynamic (a *.dll on Windows, *.so on Linux or *.dylib on OS X) and a new version changes the memory layout of the struct/class, old applications will crash after you upgrade the library.
Because of the way C++ works, objects are passed by value, i.e. you normally expect them to be movable and copyable, which is not necessarily what the library author wants to support.
Now consider the following function instead:
FT_Error Init( FT_Struct** pp )
{
    try
    {
        *pp = new FT_Struct();
        return S_OK;
    }
    catch( std::exception& ex )
    {
        return E_FAIL;
    }
}
As a user of the library, you no longer need to know what’s inside FT_Struct or even what size it is. You don’t need to #include the implementation details, i.e. compilation will be faster.
This plays nicely with dynamic libraries: the library author can change the memory layout however they please, and as long as the C API is stable, old apps will continue to work.
The API also guarantees you won't copy or move the values: you can't copy a structure of unknown size.
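To make the contrast concrete, here is a sketch of the public header that would accompany the Init(FT_Struct**) version (names are illustrative, not FreeType's actual API; Release is a hypothetical counterpart the library would also have to export):
// ft_struct.h -- everything the user of the library ever sees.
// The struct is only forward-declared, so its size and layout stay private.

typedef int FT_Error;                 // illustrative error type

typedef struct FT_Struct FT_Struct;   // opaque: defined only inside the library

FT_Error Init(FT_Struct** pp);        // allocates the object, hands back a handle
void     Release(FT_Struct* p);       // only the library knows how to destroy it
The caller can only declare an FT_Struct* and pass its address around; it cannot copy, move, or even take sizeof of the object, which is exactly the guarantee described above.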

Shared pointers and queues in FreeRTOS

A C++ wrapper around a FreeRTOS queue can be simplified into something like this:
template<typename T>
class Queue
{
public:
    bool push(const T& item)
    {
        return xQueueSendToBack(handle, &item, 0) == pdTRUE;
    }

    bool pop(T& target)
    {
        return xQueueReceive(handle, &target, 0) == pdTRUE;
    }

private:
    QueueHandle_t handle;
};
The documentation of xQueueSendToBack states:
The item is queued by copy, not by reference.
Unfortunately, it is literally a copy: it all ends in a memcpy, which makes sense since it is a C API. While this works well for plain old data, more complex items such as the following event message cause serious problems.
class ConnectionStatusEvent
{
public:
    ConnectionStatusEvent() = default;

    ConnectionStatusEvent(std::shared_ptr<ISocket> sock)
        : sock(sock)
    {
    }

    const std::shared_ptr<ISocket>& get_socket() const
    {
        return sock;
    }

private:
    const std::shared_ptr<ISocket> sock;
    bool connected;
};
The problem is obviously the std::shared_ptr which doesn't work at all with a memcpy since its copy constructor/assignment operator isn't called when copied onto the queue, resulting in premature deletion of the held object when the event message, and thus the shared_ptr, goes out of scope.
I could solve this by using dynamically allocated T instances and changing the queues to contain only pointers to the instances, but I'd rather not do that since this runs on an embedded system and I very much want to keep the memory static at run time.
My current plan is to change the queue to contain pointers into a locally held memory area in the wrapper class, in which I can implement full C++ object copying. But since I'd also need to protect that memory area against access from multiple threads, this essentially defeats the already thread-safe implementation of the FreeRTOS queues (which surely are more efficient than anything I can write myself), so I might as well skip them entirely.
Finally, the question:
Before I implement my own queue, are there any tricks I can use to make the FreeRTOS queues function with C++ object instances, in particular std::shared_ptr?
The issue is what happens to the original once you put the pointer into the queue.
Copying seems trivial, but it is not optimal.
To get around this issue I use a mailbox instead of a queue:
T* data = (T*) osMailAlloc(m_mail, osWaitForever);
...
osMailPut (m_mail, data);
You allocate the block explicitly to begin with and just add the pointer to the mailbox.
And to retrieve:
osEvent ev = osMailGet(m_mail, osWaitForever);
...
osStatus freeStatus = osMailFree(m_mail, p);
All of this can be neatly wrapped into C++ template methods, for example as sketched below.
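A minimal sketch of such a wrapper, assuming the CMSIS-RTOS v1 mail API used above (osMailAlloc/osMailPut/osMailGet/osMailFree from cmsis_os.h) and a mail queue handle created elsewhere with osMailCreate. Placement new and an explicit destructor call give the stored T real C++ copy and destruction semantics, so a std::shared_ptr member keeps its reference count intact:
#include <new>        // placement new
#include <utility>    // std::move
#include "cmsis_os.h" // osMailAlloc, osMailPut, osMailGet, osMailFree

template<typename T>
class MailBox
{
public:
    explicit MailBox(osMailQId mail) : m_mail(mail) {}

    bool push(const T& item)
    {
        void* slot = osMailAlloc(m_mail, osWaitForever);
        if (slot == nullptr)
            return false;
        new (slot) T(item);                      // copy-construct into the pool slot
        return osMailPut(m_mail, slot) == osOK;
    }

    bool pop(T& target)
    {
        osEvent ev = osMailGet(m_mail, osWaitForever);
        if (ev.status != osEventMail)
            return false;
        T* stored = static_cast<T*>(ev.value.p);
        target = std::move(*stored);             // move the payload out
        stored->~T();                            // run the destructor explicitly
        return osMailFree(m_mail, stored) == osOK;
    }

private:
    osMailQId m_mail;
};
Note that the payload type needs to be assignable for pop() to move it out (the const shared_ptr member in the question's event class would have to be made non-const), and a production version would need more careful error handling, e.g. destroying the object if osMailPut fails.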

Finding Memory leaks in C with Visual Studio 2013

I am using Visual Studio 2013 and taking a course in C programming, and we started talking about detecting memory bugs, in particular memory leaks, and how to find them using Valgrind on Linux.
I want to know if there is a way to detect such memory leaks using VS 2013. Searching online just leads to lots of articles, blogs and MSDN posts full of words like dump files and analyzing them, which is weird because:
1) It sounds so complex, convoluted and unintuitive that it is actually hard to comprehend.
2) The main reason this is weird is that VS is the most advanced IDE around and Microsoft spends so much money on it, yet from what I have read there seems to be no simple way to use VS to detect memory leaks - certainly nothing as simple as Valgrind, where I only have to compile the program and run
valgrind --leak-check=yes ProgramName
and it simply prints every location where it thinks there is a memory leak and describes the error (like not freeing memory after allocating it with malloc, leaving "dead" memory after the program finishes, or accessing memory outside array bounds).
So my question is how to use VS 2013 to achieve the same results: first, to find out at a high level whether there are memory leaks in the program, and second, to detect in a simple manner where those leaks are - preferably without involving dump files (not that I know how to use them in VS anyway).
The answer I offer cannot be exhaustive, as explaining ALL the features VS2013 offers for finding C/C++ heap hiccups could fill a book that would sell well - and not a very thin one, for that matter.
First, things I will NOT cover:
sal.h - Semantic annotations (which are also used by MS tools, if provided).
Not EVERY _Crt* function available. Only a few to give the idea.
Additional steps an application designer might add to their design for a "production quality" code base.
Having said that, here is a piece of annotated code containing various bugs. Each bug is annotated with how it can be detected by means of what VS2013 has to offer, either at run time, at compile time, or with static code analysis.
The compile-time errors are shown in comments, by the way.
#include "stdafx.h"
#include <crtdbg.h>
#include <malloc.h>
#include <cstdint>
#include <cstdlib>
// This small helper class shows the _Crt functions (some of them)
// and keeps the other code less cluttered.
// Note: This is NOT what you should do in production code,
// as all those _Crt functions disappear in a release build,
// but this class does not disappear...
class HeapNeutralSection
{
_CrtMemState m_start;
public:
HeapNeutralSection()
{
_CrtMemCheckpoint(&m_start);
}
~HeapNeutralSection()
{
_CrtMemState now;
_CrtMemState delta;
_CrtMemCheckpoint(&now);
if (_CrtMemDifference(&delta, &m_start, &now))
{
// This code section is not heap neutral!
_CrtMemDumpStatistics(&delta);
_CrtDumpMemoryLeaks();
_ASSERTE(false);
}
}
};
static void DoubleAllocationBug()
{
    {
        HeapNeutralSection hns;
        uint32_t *rogue = (uint32_t*)malloc(sizeof(uint32_t));
        if (NULL != rogue)
        {
            *rogue = 42; // Static Analysis without the NULL check: Dereferencing null pointer.
        }
        rogue = (uint32_t*)malloc(sizeof(uint32_t));
        free(rogue);
        // detected in destructor of HeapNeutralSection.
    }
}

static void UninitializedPointerAccessBug()
{
    // uint32_t *rogue;
    // *rogue = 42; // C4700: uninitialized local variable 'rogue' used.
}

static void LeakBug()
{
    {
        HeapNeutralSection hns;
        uint32_t *rogue = (uint32_t*)malloc(sizeof(uint32_t));
        if (NULL != rogue)
        {
            *rogue = 42; // Static Analysis without the NULL check: Dereferencing null pointer.
        }
    }
}

static void InvalidPointerBug()
{
    // Not sure if the C compiler also finds this... one more reason to program C with a C++ compiler ;)
    // uint32_t *rogue = 234; // C2440: 'initializing': cannot convert from 'int' to 'uint32_t *'
}
static void WriteOutOfRangeBug()
{
    {
        HeapNeutralSection hns;
        uint32_t * rogue = (uint32_t*)malloc(sizeof(uint32_t));
        if (NULL != rogue)
        {
            rogue[1] = 42; // Static analysis: warning C6200: Index '1' is out of valid index range '0' to '0' for non-stack buffer 'rogue'.
            rogue[2] = 43; // Static analysis: warning C6200: Index '2' is out of valid index range '0' to '0' for non-stack buffer 'rogue'.
                           // warning C6386: Buffer overrun while writing to 'rogue': the writable size is '4' bytes, but '8' bytes might be written.
        }
        free(rogue); // We corrupted heap before. Next malloc/free should trigger a report.
        /*
        HEAP[ListAppend.exe]: Heap block at 0076CF20 modified at 0076CF50 past requested size of 28
        ListAppend.exe has triggered a breakpoint.
        */
    }
}

static void AccessAlreadyFreedMemoryBug()
{
    {
        HeapNeutralSection hns;
        uint32_t * rogue = (uint32_t*)malloc(sizeof(uint32_t));
        free(rogue);
        *rogue = 42; // Static analysis: warning C6001: Using uninitialized memory 'rogue'.
    }
}
int _tmain(int argc, _TCHAR* argv[])
{
    // checks heap integrity each time a heap allocation is caused. Slow but useful.
    _CrtSetDbgFlag(_CRTDBG_CHECK_ALWAYS_DF);

    AccessAlreadyFreedMemoryBug();
    WriteOutOfRangeBug();
    UninitializedPointerAccessBug();
    DoubleAllocationBug();

    // _CrtDumpMemoryLeaks();
    return 0;
}
Last but not least, on Windows CE and in the DDKs (maybe also in the platform SDKs) there used to be two static code checkers, "prefast" and "prefix", but they may be outdated by now.
Additionally, there are MS research projects going a step beyond sal.h, such as VCC, which helps find concurrency bugs, etc.
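For the narrow "give me a Valgrind-style leak report" part of the question, the smallest setup I know of is the debug CRT's leak check. A minimal sketch; in a Debug build run under the debugger, the report appears in the Output window when the process exits:
// Map the CRT heap functions (malloc, calloc, free, ...) to their debug
// versions so the leak report can show the file and line of each allocation.
#define _CRTDBG_MAP_ALLOC
#include <stdlib.h>
#include <crtdbg.h>

int main(void)
{
    // Ask the debug heap to dump all still-allocated blocks at program exit.
    _CrtSetDbgFlag(_CrtSetDbgFlag(_CRTDBG_REPORT_FLAG) | _CRTDBG_LEAK_CHECK_DF);

    void* leaked = malloc(64);   // never freed: shows up in the exit report
    (void)leaked;
    return 0;
}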

Using a custom memory allocation function in R

I would like to be able to use my own memory allocation function for certain data structures (real-valued vectors and arrays) in R. The reason is that I need my data to be 64-bit aligned and I would like to use the numa library to control which memory node is used (I'm working on compute nodes with four 12-core AMD Opteron 6174 CPUs).
I now have two functions for allocating and freeing memory: numa_alloc_onnode and numa_free (courtesy of this thread). I'm using R version 3.1.1, so I have access to the function allocVector3 (src/main/memory.c), which seems to me to be the intended way of adding a custom memory allocator. I also found the struct R_allocator in src/include/R_ext.
However it is not clear to me how to put these pieces together. Let's say, in R, I want the result res of an evaluation such as
res <- Y - mean(Y)
to be saved in a memory area allocated with my own function, how would I do this? Can I integrate allocVector3 directly at the R level? I assume I have to go through the R-C interface. As far as I know, I cannot just return a pointer to the allocated area, but have to pass the result as an argument. So in R I call something like
n <- length(Y)
res <- numeric(length=1)
.Call("R_allocate_using_myalloc", n, res)
res <- Y - mean(Y)
and in C
#include <R.h>
#include <Rinternals.h>
#include <numa.h>

SEXP R_allocate_using_myalloc(SEXP R_n, SEXP R_res){
    PROTECT(R_n = coerceVector(R_n, INTSXP));
    PROTECT(R_res = coerceVector(R_res, REALSXP));
    int *restrict n = INTEGER(R_n);

    R_allocator_t myAllocator;
    myAllocator.mem_alloc = numa_alloc_onnode;
    myAllocator.mem_free = numa_free;
    myAllocator.res = NULL;
    myAllocator.data = ???;

    R_res = allocVector3(REALSXP, n, myAllocator);
    UNPROTECT(2);
}
Unfortunately I cannot get past a "variable has incomplete type 'R_allocator_t'" compilation error (I had to remove the .data line since I have no clue what to put there). Does any of the above code make sense? Is there an easier way of achieving what I want? It seems a bit odd to have to allocate a small vector in R and then change its location in C just to be able to both control the memory allocation and have the vector available in R...
I'm trying to avoid using Rcpp, as I'm modifying a fairly large package and do not want to convert all the C calls, and I thought that mixing different C interfaces could perform sub-optimally.
Any help is greatly appreciated.
I made some progress in solving my problem and would like to share it in case anyone else encounters a similar situation. Thanks to Kevin for his comment; I was missing the include statement he mentions. Unfortunately, that was only one among many problems. Below is the R test script, followed by the C code it calls:
dyn.load("myAlloc.so")
size <- 3e9
myBigmat <- .Call("myAllocC", size)
print(object.size(myBigmat), units = "auto")
rm(myBigmat)
#include <R.h>
#include <Rinternals.h>
#include <R_ext/Rallocators.h>
#include <numa.h>

typedef struct allocator_data {
    size_t size;
} allocator_data;

void* my_alloc(R_allocator_t *allocator, size_t size) {
    ((allocator_data*)allocator->data)->size = size;
    return (void*) numa_alloc_local(size);
}

void my_free(R_allocator_t *allocator, void * addr) {
    size_t size = ((allocator_data*)allocator->data)->size;
    numa_free(addr, size);
}

SEXP myAllocC(SEXP a) {
    allocator_data* my_allocator_data = malloc(sizeof(allocator_data));
    my_allocator_data->size = 0;

    R_allocator_t* my_allocator = malloc(sizeof(R_allocator_t));
    my_allocator->mem_alloc = &my_alloc;
    my_allocator->mem_free = &my_free;
    my_allocator->res = NULL;
    my_allocator->data = my_allocator_data;

    R_xlen_t n = asReal(a);
    SEXP result = PROTECT(allocVector3(REALSXP, n, my_allocator));
    UNPROTECT(1);
    return result;
}
For compiling the C code, I use R CMD SHLIB -std=c99 -L/usr/lib64 -lnuma myAlloc.c. As far as I can tell, this works fine. If anyone has improvements or corrections to offer, I'd be happy to include them.
One requirement from the original question remains unresolved: the alignment issue. The block of memory returned by numa_alloc_local is correctly aligned, but other fields of the new VECTOR_SEXPREC (e.g. the sxpinfo_struct header) push back the start of the data array. Is it somehow possible to align that starting point (the address returned by REAL())?
R has, in memory.c:
main/memory.c
84:#include <R_ext/Rallocators.h> /* for R_allocator_t structure */
so I think you need to include that header as well to get the custom allocator (Rinternals.h merely declares the type, without defining the struct or including that header).
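For reference, the complete type that header supplies looks roughly like this (reconstructed from the member names and callback signatures used in the code above, so check the Rallocators.h shipped with your R version):
/* R_ext/Rallocators.h, roughly: the complete type the compiler was missing. */
typedef struct R_allocator R_allocator_t;

typedef void *(*custom_alloc_t)(R_allocator_t *allocator, size_t size);
typedef void  (*custom_free_t) (R_allocator_t *allocator, void *addr);

struct R_allocator {
    custom_alloc_t mem_alloc;  /* malloc-like callback */
    custom_free_t  mem_free;   /* free-like callback   */
    void *res;                 /* reserved, set to NULL */
    void *data;                /* passed back to the callbacks */
};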

How to properly free an erlang term

In the example Erlang port program:
tuplep = erl_decode(buf);
fnp = erl_element(1, tuplep);
argp = erl_element(2, tuplep);
...
erl_free_compound(tuplep);
erl_free_term(fnp);
erl_free_term(argp);
Both erl_free_compound and erl_free_term are used to free terms (and their sub-terms) of the same ETERM* separately. The documentation of erl_free_compound() says:
erl_free_compound() will recursively free all of the sub-terms associated with a given Erlang term
So my question is: does erl_element() make a copy of the element, which will leak memory if not freed separately, or might the code above lead to a double free that is detected and handled by erl_free_term()?
The erl_interface library indeed uses a sort of reference-counting system to keep track of allocated ETERM structs. So if you write:
ETERM *t_arr[2];
ETERM *t1;
t_arr[0] = erl_mk_atom("hello");
t_arr[1] = erl_mk_atom("world");
t1 = erl_mk_tuple(&t_arr[0],2);
you have created three (3) Erlang terms (ETERM). Now, if you call erl_free_term(t1) you only free up the tuple, not the other two ETERMs. To free all the allocated memory, you would have to call:
erl_free_term(t_arr[0]);
erl_free_term(t_arr[1]);
erl_free_term(t1);
To avoid all these calls to erl_free_term() you can use erl_free_compound() instead. It does a "deep" free of all the ETERMs, so the above can be accomplished with:
erl_free_compound(t1);
Thus, this routine lets you write more compact code where you don't have to keep references to all the sub-component ETERMs.
Example:
ETERM *list;
list = erl_cons(erl_mk_int(36),
                erl_cons(erl_mk_atom("tobbe"),
                         erl_mk_empty_list()));
... /* do some work */
erl_free_compound(list);
Update: to check whether you really free all the created terms, you can use this piece of code (from the original manual entry):
long allocated, freed;
erl_eterm_statistics(&allocated,&freed);
printf("currently allocated blocks: %ld\n",allocated);
printf("length of freelist: %ld\n",freed);
/* really free the freelist */
erl_eterm_release();
(answer adapted from here)
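To settle the original erl_element() question empirically, the calls above can be combined into a small check (a sketch that assumes erl_init() has already been called, as in a normal port program; it makes no claim about what the counts will be):
#include <stdio.h>
#include <erl_interface.h>   /* legacy erl_interface API used above */

static void check_free_balance(unsigned char *buf)
{
    ETERM *tuplep, *fnp, *argp;
    long allocated, freed;

    tuplep = erl_decode(buf);
    fnp    = erl_element(1, tuplep);
    argp   = erl_element(2, tuplep);
    (void)fnp;
    (void)argp;              /* deliberately not freed: that is the question */

    /* Variant under test: free only the compound term. */
    erl_free_compound(tuplep);

    erl_eterm_statistics(&allocated, &freed);
    printf("blocks still allocated: %ld, freelist length: %ld\n",
           allocated, freed);

    erl_eterm_release();     /* really free the freelist */
}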

Resources