Allocating a struct dirent without malloc() - c

I need to use readdir_r() to read the contents of a directory in a multithreaded program. Since the size of struct dirent is filesystem dependent, man readdir_r recommends
name_max = pathconf(dirpath, _PC_NAME_MAX);
if (name_max == -1) /* Limit not defined, or error */
name_max = 255; /* Take a guess */
len = offsetof(struct dirent, d_name) + name_max + 1;
to find the size of the allocation needed. To allocate it
entryp = malloc(len);
is called, and finally readdir_r() uses it like this:
struct dirent *returned;
readdir_r(DIR*, entryp, &returned);
However, I'd like to avoid calling malloc() (or any other manual memory management function).
One way I've thought of is
_Alignas(struct dirent) char direntbuf[len];
struct dirent *entryp = (struct dirent*) direntbuf;
This should give a correctly aligned allocation, but it violates strict aliasing. However, the buffer is never accessed via a char* so the most likely problem, the compiler reordering accesses to the buffer via different types, cannot occur.
Another way could be by alloca(), which returns a void*, avoiding strict aliasing problems. However, alloca() does not seem to guarantee alignment the way malloc() and friends do. To always get an aligned buffer, something like
void *alloc = alloca(len + _Alignof(struct dirent));
struct dirent *direntbuf = (struct dirent*)((uintptr_t)&((char*)alloc)[_Alignof(struct dirent)]&-_Alignof(struct dirent));
would be needed. In particular, the cast to char * is needed to perform arithmetic on a pointer, and the cast to uintptr_t is needed to do the binary &. This doesn't look more well-defined than allocating a char[].
Is there a way to avoid manual memory management when allocating a struct dirent?
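The rounding step in the alloca() variant can be looked at in isolation. The following is only a sketch: align_up is a hypothetical helper name, it assumes the alignment is a power of two, and converting a uintptr_t back to a pointer is implementation-defined (though it behaves as expected on common platforms).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: round p up to the next multiple of align.
   align must be a power of two, e.g. _Alignof(struct dirent). */
void *align_up(void *p, size_t align)
{
    uintptr_t addr = (uintptr_t)p;
    uintptr_t mask = (uintptr_t)align - 1;

    /* Add align - 1, then clear the low bits. */
    return (void *)((addr + mask) & ~mask);
}
```

With this form the raw buffer only needs len + align - 1 bytes, since the result moves forward by at most align - 1 bytes.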

What about defining this:
#include <stddef.h> /* For offsetof */
#include <limits.h> /* For NAME_MAX */
#include <dirent.h>
union U
{
struct dirent de;
char c[offsetof(struct dirent, d_name) + NAME_MAX + 1]; /* NAME_MAX is POSIX. */
};
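As a rough sketch of how such a union might be used with readdir_r(): the names dirent_buf and count_entries are made up for illustration, and the pragma only silences the deprecation warning that newer glibc versions attach to readdir_r(). It assumes a POSIX system where NAME_MAX is defined.

```c
#define _POSIX_C_SOURCE 200809L
#include <dirent.h>
#include <limits.h>  /* NAME_MAX */
#include <stddef.h>  /* offsetof */

/* Same idea as the union above: big enough for the longest name,
   and aligned like a struct dirent. */
union dirent_buf {
    struct dirent de;
    char bytes[offsetof(struct dirent, d_name) + NAME_MAX + 1];
};

/* Newer glibc marks readdir_r as deprecated; silence that here. */
#pragma GCC diagnostic ignored "-Wdeprecated-declarations"

/* Hypothetical helper: count the entries in path, or return -1
   if the directory cannot be opened. */
int count_entries(const char *path)
{
    union dirent_buf buf;            /* automatic storage, no malloc() */
    struct dirent *result = NULL;
    int count = 0;
    DIR *dirp = opendir(path);

    if (dirp == NULL)
        return -1;
    while (readdir_r(dirp, &buf.de, &result) == 0 && result != NULL)
        count++;
    closedir(dirp);
    return count;
}
```

Because the struct dirent member is accessed through the union, no pointer cast from a char array is involved.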

The readdir_r function signature is:
int readdir_r(DIR *dirp, struct dirent *entry, struct dirent **result);
And dirent is a struct like this:
struct dirent {
ino_t d_ino; /* inode number */
off_t d_off; /* offset to the next dirent */
unsigned short d_reclen; /* length of this record */
unsigned char d_type; /* type of file; not supported
by all file system types */
char d_name[256]; /* filename */
};
You have to pass a pointer to readdir_r but how you allocate memory for the dirent structure is entirely up to you.
You could do it like this and use a stack variable.
struct dirent entry = {0};
...
readdir_r(dirp, &entry, &returned);

Related

what's the correct way to malloc struct pointer using sizeof?

Imagine I've the following struct
struct Memory {
int type;
int prot;
};
typedef struct Memory *Memory;
How would I initialise it using malloc()?
Memory mem = malloc(sizeof(Memory));
or
Memory mem = malloc(sizeof(struct Memory));
What is the correct way to allocate that?
Your struct declaration is a bit muddled up, and the typedef is wrong on many levels. Here's what I'd suggest:
//typedef + decl in one
typedef struct _memory {
int type;
int prot;
} Memory;
Then allocate like so:
Memory *mem = malloc(sizeof *mem);
Read the malloc call like so: "Allocate the amount of memory required to store whatever type mem is pointing to". If you change Memory *mem to Memory **mem, it'll allocate 4 or 8 bytes (the size of a pointer, depending on the platform). As it now stands, it'll probably allocate 8 bytes, depending on the size of int and how the compiler pads the struct. Check the wiki for more details and examples.
Using sizeof *<the-pointer> is generally considered to be the better way of allocating memory, but if you want, you can write:
Memory *mem = malloc(sizeof(Memory));
Memory *mem = malloc(sizeof(struct _memory));
They all do the same thing. Mind you, if you typedef a struct, that's probably because you want to abstract the inner workings of something, and want to write an API of sorts. In that case, you should discourage the use of struct _memory as much as possible, in favour of Memory or *<the-pointer> anyway.
If you want to typedef a pointer, then you can write this:
typedef struct _memory {
int type;
int prot;
} *Memory_p;
In which case this:
Memory_p mem = malloc(sizeof *mem);
might seem counter intuitive, but is correct, as is:
Memory_p mem = malloc(sizeof(struct _memory));
But this:
Memory_p mem = malloc(sizeof(Memory_p));
is wrong (it won't allocate the memory required for the struct, but memory to store a pointer to it).
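A small sketch of the difference, with a few assumptions: the tag is spelled memory here instead of _memory, and memory_new is a hypothetical constructor that is not part of the question.

```c
#include <stdlib.h>

typedef struct memory {
    int type;
    int prot;
} Memory;

typedef struct memory *Memory_p;

/* Hypothetical constructor: sizeof *mem is the size of the struct,
   whatever the pointer type happens to be called. */
Memory_p memory_new(int type, int prot)
{
    Memory_p mem = malloc(sizeof *mem);

    if (mem == NULL)
        return NULL;
    mem->type = type;
    mem->prot = prot;
    return mem;
}
```

Here sizeof(Memory_p) would be the size of a pointer, while sizeof *mem and sizeof(Memory) are both the size of the struct. The caller releases the object with free().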
It's a matter of personal preference, perhaps, but I personally find typedefs obscure certain things. In many cases this is for the better (e.g. FILE*), but once an API starts hiding the fact you're working with pointers, I start to worry a bit. It tends to make code harder to read, debug and document...
Just think about it like this:
int *pointer, stack;
The * modifies only the variable it is attached to (pointer is an int *, stack is a plain int), whereas a pointer typedef bakes the * into the type itself. That's just my opinion, I'm sure there are many programmers that are far more skilled than me who do use pointer typedefs.
Most of the time, though, a pointer typedef is accompanied by custom allocator functions or macros, so you don't have to write odd-looking statements like Memory_p mem = malloc(sizeof *mem);, but instead you can write ALLOC_MEM_P(mem, 1); which could be defined as:
#define ALLOC_MEM_P(var_name, count) Memory_p var_name = malloc((count) * sizeof *var_name)
or something similar.
Both
typedef struct Memory * Memory;
and
Memory mem = malloc (sizeof (Memory));
are wrong. The correct way to do it is:
typedef struct memory
{
int type;
int prot;
} *MEMPTR;
or
struct memory
{
int type;
int prot;
};
typedef struct memory *MEMPTR;
The name of the structure should be different than the name of a pointer to it.
This construction
struct {
int type;
int prot;
} Memory;
defines an object named Memory whose type is an unnamed structure.
Thus the next construction
typedef struct Memory *Memory;
defines 1) a new type struct Memory, which has nothing in common with the definition above or with the object named Memory, and 2) another new type name, Memory, which is a pointer to struct Memory.
If both constructions are present in the same compilation unit, the compiler will issue an error, because the name Memory (the pointer type) in the typedef declaration tries to redeclare the object of unnamed structure type that has the same name Memory.
I think you mean the following
typedef struct Memory {
int type;
int prot;
} Memory;
In this case you can write the malloc call in any of these ways:
Memory *mem = malloc( sizeof( Memory ) );
and
struct Memory *mem = malloc( sizeof( struct Memory ) );
or
Memory *mem = malloc( sizeof( struct Memory ) );
or
struct Memory *mem = malloc( sizeof( Memory ) );
because now the two identifiers Memory are in two different name spaces. The first one is used with the tag struct and the second is used without it.
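A short sketch of that point: the tag and the typedef name share a spelling but refer to the same type, so the size is identical either way. memory_alloc is a hypothetical helper, not something from the question.

```c
#include <stdlib.h>

/* The tag "Memory" and the typedef name "Memory" live in
   different name spaces and name the same type. */
typedef struct Memory {
    int type;
    int prot;
} Memory;

_Static_assert(sizeof(Memory) == sizeof(struct Memory),
               "tag and typedef name denote the same type");

/* Hypothetical helper: either spelling works on both sides. */
struct Memory *memory_alloc(void)
{
    Memory *mem = calloc(1, sizeof *mem);  /* zero-initialised */
    return mem;
}
```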

Issue with devm_kzalloc

I am trying to understand the implementation of the devm_kzalloc() function. It allocates more than the requested memory (sizeof(struct devres) + size) in order to manage resources.
struct devres is defined as follows; its second member is an incomplete array.
struct devres {
struct devres_node node;
/* -- 3 pointers */
unsigned long long data[]; /* guarantee ull alignment */
};
The following is the source that allocates the memory.
size_t tot_size = sizeof(struct devres) + size;
struct devres *dr;
dr = kmalloc_track_caller(tot_size, gfp);
if (unlikely(!dr))
return NULL;
memset(dr, 0, tot_size);
INIT_LIST_HEAD(&dr->node.entry);
dr->node.release = release;
return dr;
I have the following doubts.
. It calculates tot_size, but in struct devres the array is incomplete.
. The devm_kzalloc() function (shown below) returns dr->data as the start of the requested memory. If we take the array name to hold the starting address of that array, then we are allocating more than the requested memory, i.e. the size of unsigned long long + size.
void * devm_kzalloc(struct device *dev, size_t size, gfp_t gfp)
{
struct devres *dr;
/* use raw alloc_dr for kmalloc caller tracing */
dr = alloc_dr(devm_kzalloc_release, size, gfp);
if (unlikely(!dr))
return NULL;
set_node_dbginfo(&dr->node, "devm_kzalloc_release", size);
devres_add(dev, dr->data);
return dr->data;
}
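Could you please help me understand this?
Thanks
I understand now.
It's the flexible array member feature of C99, used in the definition of struct devres. A flexible array member adds nothing to sizeof(struct devres) beyond possible padding, so the extra size bytes in tot_size are exactly the storage that dr->data refers to.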
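The same pattern can be sketched in user space. Everything here is a stand-in: struct hdr_node, struct tracked and the tracked_* functions are hypothetical names that mimic the shape of the kernel code with malloc(), not the kernel API itself.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct devres_node. */
struct hdr_node {
    void (*release)(void *data);
};

struct tracked {
    struct hdr_node node;
    unsigned long long data[];  /* flexible array member */
};

/* Allocate header plus size zeroed bytes; hand out only the payload. */
void *tracked_zalloc(size_t size, void (*release)(void *data))
{
    size_t tot_size = sizeof(struct tracked) + size;
    struct tracked *t = malloc(tot_size);

    if (t == NULL)
        return NULL;
    memset(t, 0, tot_size);
    t->node.release = release;
    return t->data;
}

/* Step back from the payload pointer to its header. */
struct tracked *tracked_from_data(void *data)
{
    return (struct tracked *)((char *)data - offsetof(struct tracked, data));
}

/* Run the release callback, if any, then free the whole block. */
void tracked_free(void *data)
{
    struct tracked *t = tracked_from_data(data);

    if (t->node.release != NULL)
        t->node.release(data);
    free(t);
}
```

The caller only ever sees the payload pointer; the bookkeeping header sits immediately in front of it.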

write your own malloc

I am writing my own malloc() and I have already worked out the following:
struct myblock
{
struct myblock *next;
struct myblock *prev;
int isFree;
unsigned availablesize;
char *buffer;
};
and the space #define MEM_BUFFER (1024), which will be "my RAM".
If I am not wrong, I would then have
char *array[MEM_BUFFER];
to get an array of 1024 bytes (kindly correct me if I am wrong).
As we know, MEM_BUFFER will also contain the metadata of the occupied space. I am a bit confused about how I should start.
This is my main question:
Should I assign the struct to the array on each allocation request (if yes, then from the struct's char array?)
Should I handle the doubly linked list on the heap and skip sizeof(myblock) bytes from the array?
I have been thinking about this for the last 2 days and I am still confused.
No,
char *array[MEM_BUFFER];
is not an array of 1024 bytes; it's an array of MEM_BUFFER character pointers (so it only occupies 1024 bytes if MEM_BUFFER is set to 1024 / sizeof (char *)).
You need just:
char array[MEM_BUFFER];
although a better name might be heap_space.
To make it consist of blocks, you'd need an additional pointer that is the first block:
struct myblock *heap = (struct myblock *) heap_space;
Then you can initialize that:
heap->next = NULL;
heap->prev = NULL;
heap->isFree = 1;
heap->availablesize = sizeof heap_space - sizeof *heap;
I'm not sure what struct myblock.buffer is meant to do. I put the blocks inside the heap itself, so the user memory for a block starts at (void *) (block + 1).
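Put together, the initial state could look like the sketch below. Two things are assumptions on my part: the buffer member is dropped (the payload is located by pointer arithmetic instead), and the array is given max_align_t alignment so that the cast to struct myblock * is suitably aligned. heap_init and block_payload are hypothetical names.

```c
#include <stddef.h>

#define MEM_BUFFER 1024

struct myblock {
    struct myblock *next;
    struct myblock *prev;
    int isFree;
    unsigned availablesize;
};

/* The "RAM": aligned so a struct myblock can live at its start. */
static _Alignas(max_align_t) char heap_space[MEM_BUFFER];

/* Turn the whole buffer into one big free block. */
struct myblock *heap_init(void)
{
    struct myblock *heap = (struct myblock *)heap_space;

    heap->next = NULL;
    heap->prev = NULL;
    heap->isFree = 1;
    heap->availablesize = (unsigned)(sizeof heap_space - sizeof *heap);
    return heap;
}

/* User memory starts right after the block header. */
void *block_payload(struct myblock *block)
{
    return (void *)(block + 1);
}
```

An allocation request would then walk the list for a free block that is large enough, and split it by writing a new header inside heap_space after the requested payload.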

If array sizes can only be a constant value then what does char d_name[...] mean?

If array sizes can only be a constant value then what does
char d_name[...]
mean?
Actually, there is a struct dirent declared in the dirent.h file. Its declaration is as follows:
struct dirent{
....
ino_t d_ino;
char d_name[...];
...
};
It is used to read directory contents one at a time i.e. inode numbers and filenames etc...
I mean what is the max size of such an array and how much space is statically allocated in the memory once such an array is defined? Is such a definition portable?
Assuming it's from struct linux_dirent, it's actually char d_name[] :
struct linux_dirent {
unsigned long d_ino; /* Inode number */
unsigned long d_off; /* Offset to next linux_dirent */
unsigned short d_reclen; /* Length of this linux_dirent */
char d_name[]; /* Filename (null-terminated) */
};
It's called a flexible array member. Using malloc, you can allocate extra memory for the struct, which gives d_name a variable size.
EDIT
The text the OP is quoting:
Directory entries are represented by a struct dirent
struct dirent {
...
ino_t d_ino; /* XSI extension --- see text */
char d_name[...]; /* See text on the size of this array */
...
};
With the ..., the authors signal that the size isn't fixed by the standard. Each implementation must choose its own size; for example, Linux chooses 256. But as written it's not valid code.
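One practical consequence, sketched here for a POSIX system: portable code should measure a name with strlen() rather than rely on sizeof(d_name). copy_entry_name is a hypothetical helper written only to illustrate this.

```c
#include <dirent.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: duplicate an entry's name without assuming
   anything about the declared size of d_name. */
char *copy_entry_name(const struct dirent *entry)
{
    size_t len = strlen(entry->d_name);  /* d_name is null-terminated */
    char *copy = malloc(len + 1);

    if (copy != NULL)
        memcpy(copy, entry->d_name, len + 1);
    return copy;
}
```

The caller owns the returned string and releases it with free().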

bin_at in dlmalloc

In glibc's malloc.c (or dlmalloc), the comment below mentions "repositioning tricks", and the trick is used in bin_at.
bins is an array whose space is allocated when av (a struct malloc_state) is allocated, isn't it? Is sizeof(bin[i]) less than sizeof(struct malloc_chunk*)?
When bin_at(M, 1) (which is used as unsorted_chunks) is called, the result is bin[0] - offsetof (struct malloc_chunk, fd), i.e. bin[0] - 8. Is that right?
Can anyone describe this trick for me? I can't understand the bin_at macro. Why do they get the bin's address this way, and how does it work?
Thanks very much, and sorry for my poor English.
/*
To simplify use in double-linked lists, each bin header acts
as a malloc_chunk. This avoids special-casing for headers.
But to conserve space and improve locality, we allocate
only the fd/bk pointers of bins, and then use repositioning tricks
to treat these as the fields of a malloc_chunk*.
*/
typedef struct malloc_chunk* mbinptr;
/* addressing -- note that bin_at(0) does not exist */
#define bin_at(m, i) \
(mbinptr) (((char *) &((m)->bins[((i) - 1) * 2])) \
- offsetof (struct malloc_chunk, fd))
The malloc_chunk struct looks like this:
struct malloc_chunk {
INTERNAL_SIZE_T prev_size; /* Size of previous chunk (if free). */
INTERNAL_SIZE_T size; /* Size in bytes, including overhead. */
struct malloc_chunk* fd; /* double links -- used only if free. */
struct malloc_chunk* bk;
/* Only used for large blocks: pointer to next larger size. */
struct malloc_chunk* fd_nextsize; /* double links -- used only if free. */
struct malloc_chunk* bk_nextsize;
};
And the bin type looks like this:
typedef struct malloc_chunk* mbinptr;
struct malloc_state {
/* Serialize access. */
mutex_t mutex;
/* Flags (formerly in max_fast). */
int flags;
#if THREAD_STATS
/* Statistics for locking. Only used if THREAD_STATS is defined. */
long stat_lock_direct, stat_lock_loop, stat_lock_wait;
#endif
/* Fastbins */
mfastbinptr fastbinsY[NFASTBINS];
/* Base of the topmost chunk -- not otherwise kept in a bin */
mchunkptr top;
/* The remainder from the most recent split of a small request */
mchunkptr last_remainder;
/* Normal bins packed as described above */
mchunkptr bins[NBINS * 2 - 2];
/* Bitmap of bins */
unsigned int binmap[BINMAPSIZE];
/* Linked list */
struct malloc_state *next;
#ifdef PER_THREAD
/* Linked list for free arenas. */
struct malloc_state *next_free;
#endif
/* Memory allocated from the system in this arena. */
INTERNAL_SIZE_T system_mem;
INTERNAL_SIZE_T max_system_mem;
};
Presumably struct malloc_chunk looks something like:
struct malloc_chunk {
/* ... fields here ... */
struct malloc_chunk *fd;
struct malloc_chunk *bk;
/* ... more fields here ... */
};
...and the ->bins type looks like:
struct {
struct malloc_chunk *fd;
struct malloc_chunk *bk;
};
The bin_at macro makes a pointer to the latter structure into a fake pointer to the former structure, for the purpose of accessing the fd and bk members only (since they're the only ones that exist in the smaller one). That is, bin_at(m, i)->fd and bin_at(m, i)->bk are the same as m->bins[(i - 1) * 2].fd and m->bins[(i - 1) * 2].bk, but bin_at can be used in places that expect a struct malloc_chunk * (as long as they only use the fd and bk members).
It's a bit of a hack. I wouldn't do this in your own code - remember Kernighan's advice about writing code as cleverly as possible:
"Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it." – Brian W. Kernighan
OK, so ->bins isn't an array of structs at all - it's an array of struct malloc_chunk *.
Notice that ->bins[(i - 1) * 2] refers to the i-th pair of struct malloc_chunk * pointers in the ->bins array. This pair is equivalent to the fd and bk pair of pointers in a struct malloc_chunk, with the first (->bins[(i - 1) * 2]) being equivalent to fd (they could have instead made ->bins an array of the smaller struct I suggested above; it would be functionally equivalent and probably clearer).
The bin_at macro lets the code insert one of those pairs of pointers that are in the ->bins array into a linked list of struct malloc_chunk structs - without allocating an entire struct malloc_chunk. This is the space saving they are talking about.
The bin_at macro takes an index into the bins array, then does "if this pointer was actually the fd value in a struct malloc_chunk, then calculate a pointer to where that struct malloc_chunk would be". It does this by subtracting the offset of the fd member within a struct malloc_chunk from the address of the item in the bins array.
It doesn't really "locate the bins[i]" - that's straightforward (&bins[i]). It actually locates the imaginary struct malloc_chunk that bins[i] is the fd member of.
Sorry, it's complicated to explain because it's a complicated concept.
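A stripped-down sketch may make the arithmetic easier to follow. struct chunk and struct arena are simplified stand-ins for malloc_chunk and malloc_state, NBINS is shrunk to 4, and bins_init is a hypothetical helper; like the original, this leans on a layout trick that the C standard does not really bless.

```c
#include <stddef.h>

/* Simplified stand-in for struct malloc_chunk. */
struct chunk {
    size_t prev_size;
    size_t size;
    struct chunk *fd;
    struct chunk *bk;
};

#define NBINS 4

/* Simplified stand-in for struct malloc_state. */
struct arena {
    int flags;
    struct chunk *top;
    struct chunk *last_remainder;
    struct chunk *bins[NBINS * 2 - 2];  /* fd/bk pairs only */
};

/* Pretend bins[(i-1)*2] is the fd field of a chunk and compute
   where that imaginary chunk would start. bin_at(m, 0) is invalid. */
#define bin_at(m, i) \
    ((struct chunk *)((char *)&(m)->bins[((i) - 1) * 2] \
                      - offsetof(struct chunk, fd)))

/* Make every bin an empty circular list pointing at itself. */
void bins_init(struct arena *a)
{
    for (int i = 1; i < NBINS; i++) {
        struct chunk *bin = bin_at(a, i);
        bin->fd = bin;  /* really writes a->bins[(i - 1) * 2]     */
        bin->bk = bin;  /* really writes a->bins[(i - 1) * 2 + 1] */
    }
}
```

Only fd and bk of the fake chunk may be touched; its prev_size and size would overlap whatever happens to precede the pair in the arena.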

Resources