bin_at in dlmalloc


In glibc's malloc.c (or dlmalloc), a comment mentions "repositioning tricks", as shown below, and the bin_at macro uses this trick.
bins is an array whose space is allocated when av (a struct malloc_state) is allocated, isn't it? Is sizeof(bin[i]) less than sizeof(struct malloc_chunk*)?
When bin_at(M, 1) (which is used as unsorted_chunks) is called, the result is bin[0] - offsetof (struct malloc_chunk, fd), i.e. bin[0] - 8. Is that right?
Can someone explain this trick to me? I can't understand the bin_at macro. Why do they get the bin's address this way, and how does it work?
Thanks very much, and sorry for my poor English.
/*
To simplify use in double-linked lists, each bin header acts
as a malloc_chunk. This avoids special-casing for headers.
But to conserve space and improve locality, we allocate
only the fd/bk pointers of bins, and then use repositioning tricks
to treat these as the fields of a malloc_chunk*.
*/
typedef struct malloc_chunk* mbinptr;
/* addressing -- note that bin_at(0) does not exist */
#define bin_at(m, i) \
(mbinptr) (((char *) &((m)->bins[((i) - 1) * 2])) \
- offsetof (struct malloc_chunk, fd))
The malloc_chunk struct looks like this:
struct malloc_chunk {
INTERNAL_SIZE_T prev_size; /* Size of previous chunk (if free). */
INTERNAL_SIZE_T size; /* Size in bytes, including overhead. */
struct malloc_chunk* fd; /* double links -- used only if free. */
struct malloc_chunk* bk;
/* Only used for large blocks: pointer to next larger size. */
struct malloc_chunk* fd_nextsize; /* double links -- used only if free. */
struct malloc_chunk* bk_nextsize;
};
And the bins are declared inside struct malloc_state like this:
typedef struct malloc_chunk* mbinptr;
struct malloc_state {
/* Serialize access. */
mutex_t mutex;
/* Flags (formerly in max_fast). */
int flags;
#if THREAD_STATS
/* Statistics for locking. Only used if THREAD_STATS is defined. */
long stat_lock_direct, stat_lock_loop, stat_lock_wait;
#endif
/* Fastbins */
mfastbinptr fastbinsY[NFASTBINS];
/* Base of the topmost chunk -- not otherwise kept in a bin */
mchunkptr top;
/* The remainder from the most recent split of a small request */
mchunkptr last_remainder;
/* Normal bins packed as described above */
mchunkptr bins[NBINS * 2 - 2];
/* Bitmap of bins */
unsigned int binmap[BINMAPSIZE];
/* Linked list */
struct malloc_state *next;
#ifdef PER_THREAD
/* Linked list for free arenas. */
struct malloc_state *next_free;
#endif
/* Memory allocated from the system in this arena. */
INTERNAL_SIZE_T system_mem;
INTERNAL_SIZE_T max_system_mem;
};

Presumably struct malloc_chunk looks something like:
struct malloc_chunk {
/* ... fields here ... */
struct malloc_chunk *fd;
struct malloc_chunk *bk;
/* ... more fields here ... */
};
...and the ->bins type looks like:
struct {
struct malloc_chunk *fd;
struct malloc_chunk *bk;
};
The bin_at macro turns a pointer to the latter structure into a fake pointer to the former structure, for the purpose of accessing the fd and bk members only (since they're the only ones that exist in the smaller one). That is, bin_at(m, i)->fd and bin_at(m, i)->bk are the same as m->bins[(i - 1) * 2].fd and m->bins[(i - 1) * 2].bk, but bin_at can be used in places that expect a struct malloc_chunk * (as long as they only use the fd and bk members).
It's a bit of a hack. I wouldn't do this in your own code - remember Kernighan's advice about writing code as cleverly as possible:
"Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it." – Brian W. Kernighan
OK, so ->bins isn't an array of structs at all - it's an array of struct malloc_chunk *.
Notice that ->bins[(i - 1) * 2] refers to the i-th pair of struct malloc_chunk * pointers in the ->bins array. This pair is equivalent to the fd and bk pair of pointers in a struct malloc_chunk, with the first (->bins[(i - 1) * 2]) being equivalent to fd (they could have instead made ->bins an array of the smaller struct I suggested above; it would be functionally equivalent and probably clearer).
The bin_at macro lets the code insert one of those pairs of pointers that are in the ->bins array into a linked list of struct malloc_chunk structs - without allocating an entire struct malloc_chunk. This is the space saving they are talking about.
The bin_at macro takes an index into the bins array, then does "if this pointer were actually the fd value in a struct malloc_chunk, calculate a pointer to where that struct malloc_chunk would be". It does this by subtracting the offset of the fd member within a struct malloc_chunk from the address of the item in the bins array.
It doesn't really "locate the bins[i]" - that's straightforward (&bins[i]). It actually locates the imaginary struct malloc_chunk that bins[i] is the fd member of.
Sorry, it's complicated to explain because it's a complicated concept.

Related

Understand alignment sentence in inotify example

I'm reading the inotify man page here and I struggle to understand the following comment in the example:
Some systems cannot read integer variables if they are not
properly aligned. On other systems, incorrect alignment may
decrease performance. Hence, the buffer used for reading from
the inotify file descriptor should have the same alignment as
struct inotify_event.
And this is the buffer declaration and definition:
char buf[4096]
__attribute__ ((aligned(__alignof__(struct inotify_event))));
I read the following pdf document that gives a basic understanding of memory alignment access issues.
I would like to understand the inotify comment, and I would also appreciate hints and links for understanding the alignment problem further. For example, I always hear that there's no alignment problem for variables allocated on the stack, and that the problem arises only for buffers whose data gets reinterpreted.

The struct inotify_event is declared like this:
struct inotify_event {
int wd; /* Watch descriptor */
uint32_t mask; /* Mask describing event */
uint32_t cookie; /* Unique cookie associating related
events (for rename(2)) */
uint32_t len; /* Size of name field */
char name[]; /* Optional null-terminated name */
};
The problem is with the flexible array member name. name has no number inside the brackets, which means it takes no space of its own: conceptually, &((struct inotify_event*)0)->name[0] equals offsetof(struct inotify_event, name), i.e. the memory for the elements of name starts RIGHT AFTER the fixed part of the structure.
Now if we were to store one inotify event, we would need additional space for the name after the structure. Dynamic allocation may look like this:
char name[] = "this is the name of this event";
/* room for the fixed part plus the name bytes (sum, not product) */
struct inotify_event *obj = malloc(sizeof(*obj) + strlen(name) + 1);
obj->wd = smth;    /* smth, smth2: placeholder values */
obj->mask = smth2;
obj->len = strlen(name) + 1;
// fun fact: obj->name starts at the same address as &obj[1]. So if you were to place these in a plain array of inotify_events, the name would overwrite the second element.
memcpy(obj->name, name, obj->len);
The memory for the name structure member comes right after struct inotify_event and is allocated with the same malloc. So if we want to have an array of inotify_events and copy them including the name, the next struct inotify_event may be unaligned.
Let's assume alignof(struct inotify_event) = 8 and sizeof(struct inotify_event) = 16, and char name[] = "A", so strlen(name) = 1 (strlen does not count the terminating zero byte), and that we want to store an array of inotify_events inside a buffer. First we copy the struct: 16 bytes. Then we copy the name: 2 bytes (including the zero byte). If we copied the next struct immediately, it would be unaligned, because it would start at offset 18 (sizeof(struct inotify_event) + strlen(name) + 1) in the buffer, which is not divisible by alignof(struct inotify_event). We need to insert 6 bytes of padding manually after the first array member, so that the next struct inotify_event is copied to offset 24.
However, we also need to tell the user / application code how far to advance the pointer to get to the next array member. So we increase obj->len by the number of padding bytes. obj->len is therefore equal to strlen(name) + 1 + the number of padding bytes inserted to make the next array member aligned, or equal to 0 if there is no name.
Inspect the example code in the manual page. In the loop where we loop through struct inotify_events there is the line:
ptr += sizeof(struct inotify_event) + event->len
ptr is a char* pointer to the current / next array member. We need to advance it not only by sizeof(struct inotify_event) but also by the strlen(name) + 1 bytes of the name plus the padding inserted before the next array member, and event->len is exactly that sum. That way every array member stays at its required alignment, and the next position holds the next struct inotify_event.
For more information, read up on pointer arithmetic in C and on flexible array members.

This is the old, non-standard GCC way of specifying that the array buf must start at an address that would be a suitable starting address for a struct inotify_event.
In C11 and C17 this can be written as
char _Alignas(struct inotify_event) buf[4096];

cpumask array defined with size 0

I'm investigating the Linux kernel (specifically the load balancing area).
In the kernel (sched.h) there is a declaration of struct sched_group, which looks like this:
struct sched_group {
struct sched_group *next; /* Must be a circular list */
atomic_t ref;
unsigned int group_weight;
struct sched_group_power *sgp;
/*
* The CPUs this group covers.
*
* NOTE: this field is variable length. (Allocated dynamically
* by attaching extra space to the end of the structure,
* depending on how many CPUs the kernel has booted up with)
*/
unsigned long cpumask[0];
};
What I don't understand is the use of a cpumask array with size 0.
Any explanation would be much appreciated :)

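The size of cpumask varies between platforms with different numbers of CPUs, which is why it cannot be declared as a fixed-length array. GNU C allows a zero-length array as the last member of a structure, where it acts as the header of a variable-length object; C99 standardized the same idea as the flexible array member (written unsigned long cpumask[];).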

Allocating a struct dirent without malloc()

I need to use readdir_r() to read the contents of a directory in a multithreaded program. Since the size of struct dirent is filesystem dependent, man readdir_r recommends
name_max = pathconf(dirpath, _PC_NAME_MAX);
if (name_max == -1) /* Limit not defined, or error */
name_max = 255; /* Take a guess */
len = offsetof(struct dirent, d_name) + name_max + 1;
to find the size of the allocation needed. To allocate it
entryp = malloc(len);
is called, and finally readdir_r() uses it like this:
struct dirent *returned;
readdir_r(DIR*, entryp, &returned);
However, I'd like to avoid calling malloc() (or any other manual memory management function).
One way I've thought of is
_Alignas(struct dirent) char direntbuf[len];
struct dirent *entryp = (struct dirent*) direntbuf;
This should give a correctly aligned allocation, but it violates strict aliasing. However, the buffer is never accessed via a char* so the most likely problem, the compiler reordering accesses to the buffer via different types, cannot occur.
Another way could be by alloca(), which returns a void*, avoiding strict aliasing problems. However, alloca() does not seem to guarantee alignment the way malloc() and friends do. To always get an aligned buffer, something like
void *alloc = alloca(len + _Alignof(struct dirent));
struct dirent *direntbuf = (struct dirent*)((uintptr_t)&((char*)alloc)[_Alignof(struct dirent)]&-_Alignof(struct dirent));
would be needed. In particular, the cast to char * is needed to perform arithmetic on a pointer, and the cast to uintptr_t is needed to do the binary &. This doesn't look more well-defined than allocating a char[].
Is there a way to avoid manual memory management when allocating a struct dirent?

What about defining this:
#include <stddef.h> /* For offsetof */
#include <dirent.h>
union U
{
struct dirent de;
char c[offsetof(struct dirent, d_name) + NAME_MAX + 1]; /* NAME_MAX is POSIX. */
};

The readdir_r function signature is:
int readdir_r(DIR *dirp, struct dirent *entry, struct dirent **result);
And dirent is a struct like this:
struct dirent {
ino_t d_ino; /* inode number */
off_t d_off; /* offset to the next dirent */
unsigned short d_reclen; /* length of this record */
unsigned char d_type; /* type of file; not supported
by all file system types */
char d_name[256]; /* filename */
};
You have to pass a pointer to readdir_r but how you allocate memory for the dirent structure is entirely up to you.
You could do it like this and use a stack variable.
struct dirent entry = {0};
...
readdir_r(DIR*, &entry, &returned);

Document C in Doxygen despite usage of macros?

I have macros in C source files that generate function declarations as well as for structures.
I made the decision to use doxygen in order to document them, but as long as my source file does not explicitly contain the declaration, doxygen is not generating the appropriate documentation.
Here is a little example; I have a macro that is a kind of idiom for a class declaration:
#define CLASS(x) \
typedef struct _##x x; \
typedef struct _##x *P##x; \
typedef struct _##x **PP##x; \
typedef struct _##x
So instead of writing:
/**
* \struct _Vector
* \brief the vector structure handles stuff to store data accessible by an index.
*/
typedef struct _Vector {
/*#{*/
Container container; /**< inherits from container */
size_t allocated_size; /**< the total allocated size of a vector (private usage) */
void* elements; /**< pointer to the allocated memory space (private usage) */
/*#}*/
} Vector, *PVector;
I may write instead:
/**
* \struct _Vector
* \brief the vector structure handles stuff to store data accessible by an index.
*/
CLASS(Vector) {
/*#{*/
Container container; /**< inherits from container */
size_t allocated_size; /**< the total allocated size of a vector (private usage) */
void* elements; /**< pointer to the allocated memory space (private usage) */
/*#}*/
};
But for the second case, doxygen does not generate the documentation regarding my struct members.
How could I find a compliant solution to such an issue?

What's the correct (modern) way to pad a struct?

edit: a better way of phrasing this: What's the correct [modern] way to ensure that a struct is a specific size in bytes?
Just spending a relaxing Saturday afternoon debugging a legacy codebase, and having a bit of trouble figuring this out. The compiler error I get is this:
INC/flx.h:33: error: dereferencing pointer to incomplete type
the code at line 33 looks like this
typedef struct flx_head {
FHEAD_COMMON;
LONG frames_in_table; /* size of index */
LONG index_oset; /* offset to index */
LONG path_oset; /* offset to flipath record chunk */
/* this will insure that a Flx_head is the same size as a fli_head but won't
* work if there is < 2 bytes left (value <= 0) */
PADTO(sizeof(Fli_head),flx_head,flxpad); /* line 33 is this one */
} Flx_head;
Well, okay, so I can see that the struct is referring to itself to pad itself out somehow. But I don't know an alternative way of doing what PADTO does without the self-reference.
Here's what PADTO is defined as:
#define MEMBER(struc,field) \
((struc*)NULL)->field
/* returns offset of field within a given struct name,
* and field name ie: OFFSET(struct sname,fieldname) */
#define OFFSET(struc,field) \
(USHORT)((ULONG)((PTR)&MEMBER(struc,field)-(PTR)NULL))
/* offset to first byte after a field */
#define POSTOSET(struc,field) \
(OFFSET(struc,field)+sizeof(MEMBER(struc,field)))
/* macro for defining pad sizes in structures can not define a pad of
* less than two bytes one may use pname for the offset to it but
* sizeof(struc->pname) will not be valid
*
* struct sname {
* char fld1[64];
* PADTO(68,sname,pname);
* };
* will make:
*
* struct sname {
* char fld1[64];
* UBYTE pname[1];
* UBYTE __pname[3];
* };
*/
#define PADTO(sz,struc,padfld) \
UBYTE padfld[1];UBYTE __##padfld[(sz)-OFFSET(struct struc,padfld)-1]
here is FHEAD_COMMON
#define FHEAD_COMMON \
CHUNKID_FIELDS;\
USHORT frame_count;\
USHORT width;\
USHORT height;\
USHORT bits_a_pixel;\
SHORT flags;\
LONG speed;\
USHORT unused;\
Fli_id id;\
USHORT aspect_dx;\
USHORT aspect_dy;\
UBYTE commonpad[38] /* should be total of 80 bytes (48 for unique) */
and flihead
typedef struct fli_head {
FHEAD_COMMON;
LONG frame1_oset;
LONG frame2_oset;
UBYTE padfill[40];
} Fli_head;
This is Autodesk Animator Pro. What I am working on is the "reference" implementation for the FLI file format, which you can see a spec for here:
http://www.compuphase.com/flic.htm
Incidentally, I'm pretty sure that what the /source code/ there refers to as "flx" is actually what that webpage calls "flc", not what it calls "flx".
Update:
A better source for format info: http://drdobbs.com/architecture-and-design/184408954

It isn't pretty, but one possibility is to define another identical structure and use its size to determine the padding for the one you actually want to use:
#define FLX_HEAD \
FHEAD_COMMON;\
LONG frames_in_table; /* size of index */ \
LONG index_oset; /* offset to index */ \
LONG path_oset /* offset to flipath record chunk */
struct flx_head_unpadded {
FLX_HEAD;
};
typedef struct flx_head {
FLX_HEAD;
char __flxpad[sizeof(Fli_head)-sizeof(struct flx_head_unpadded)];
} Flx_head;

I suppose the answer depends on what you're trying to achieve. In most cases, the correct, modern way to pad a struct is not to. The only situation I can think of where it's legitimate to pad a struct is when you have a library interface where the caller creates objects of a structure type and passes pointers to them to the library, and where you want to leave room to add additional fields to the structure without breaking the ABI. In this case, I would start out with something like char pad[256]; and change it to char pad[256-3*sizeof(long)]; or similar as you add fields (making sure to avoid internal padding when you add fields).

Define it in a union with a byte/char array of the desired size?
I can think quickly of some scenarios where this is needed:
1) Compatibility with old software that uses flat binary files to store data, (as in OP).
2) Interaction with drivers and/or hardware
3) Forcing structs to be an exact multiple of the cache line size to prevent false sharing in inter-thread comms.

If you only want to achieve a specific size, you can use the following (this definitely works in GCC):
typedef union {
struct {
FHEAD_COMMON;
LONG frames_in_table; /* size of index */
LONG index_oset; /* offset to index */
LONG path_oset; /* offset to flipath record chunk */
};
uint8_t __padding[128];
} Flx_head;
void test() {
Flx_head boo;
boo.frames_in_table= 0;
}
I am not sure if this is modern enough. If your compiler does not support anonymous structs (the one inside the union), it will get "messy".
Also, remember that the struct is now padded to a specific size, but not aligned to any specific boundary.

Thanks everyone. Not sure who to award the green checkmark to, since I found this solution as a result of everyone kind of hinting and pointing in the right direction. After looking at the problem, it struck me that the struct just needs to be exactly 128 bytes. I "hand parsed" the macro, cross-referencing with the spec, and ended up with this:
typedef struct flx_head {
FHEAD_COMMON;
LONG frames_in_table; /* size of index */
LONG index_oset; /* offset to index */
LONG path_oset; /* offset to flipath record chunk */
UBYTE flxpad[36];
} Flx_head;
which is 128-(80+4+4+4) = 36
