I'm reading the inotify man page and I struggle to understand the following comment in the example:
Some systems cannot read integer variables if they are not
properly aligned. On other systems, incorrect alignment may
decrease performance. Hence, the buffer used for reading from
the inotify file descriptor should have the same alignment as
struct inotify_event.
And this is the buffer declaration+definition
char buf[4096]
__attribute__ ((aligned(__alignof__(struct inotify_event))));
I read the following pdf document that gives a basic understanding of memory alignment access issues.
I would like to understand the inotify comment, and I would also appreciate hints and links for understanding the alignment problem further. For example, I always hear that there is no alignment problem for variables allocated on the stack, and that the problem arises only for buffers whose data gets reinterpreted.
The struct inotify_event is declared like this:
struct inotify_event {
int wd; /* Watch descriptor */
uint32_t mask; /* Mask describing event */
uint32_t cookie; /* Unique cookie associating related
events (for rename(2)) */
uint32_t len; /* Size of name field */
char name[]; /* Optional null-terminated name */
};
The problem is with the flexible array member name. name has no number inside the brackets, which means that (char *)obj->name == (char *)obj + offsetof(struct inotify_event, name), i.e. the memory for the elements of the name member starts RIGHT AFTER the fixed members of the structure.
Now if we were to store one inotify event, we need additional space for the name after the structure. Dynamic allocation may look like this:
char name[] = "this is the name of this event";
struct inotify_event *obj = malloc(sizeof(*obj) + (strlen(name) + 1));
obj->wd = smth;
obj->mask = smth2;
obj->len = strlen(name) + 1;
// fun fact: obj->name starts at the same address as &obj[1]. So if you were to place an array of inotify_events here, the name would overwrite the second element.
memcpy(obj->name, name, obj->len);
The memory for the name structure member comes right after struct inotify_event and is allocated with the same malloc. So if we want to have an array of inotify_events and copy them including the name, the next struct inotify_event may be unaligned.
Let's assume alignof(struct inotify_event) = 8 and sizeof(struct inotify_event) = 16, that char name[] = "A" so strlen(name) = 1 (strlen does not count the terminating zero byte), and that we want to store an array of inotify_events inside a buffer. First we copy the struct: 16 bytes. Then we copy the name: 2 bytes (including the zero byte). If we copied the next struct now, it would be unaligned, because it would start at byte offset 18 (sizeof(struct inotify_event) + strlen(name) + 1) in the buffer, which is not divisible by alignof(struct inotify_event). We need to insert 6 bytes of extra padding manually after the first array member, so that the next struct inotify_event is copied to byte offset 24.
However, we need also to notify the user / application code of how much it needs to increment the pointer to get to the next struct array member. So we increment obj->len with the number of padding bytes. So obj->len is equal to strlen(name) + 1 + number of padding bytes inserted to make the next array member aligned or is equal to 0, in case of no name.
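The padding arithmetic from this example can be sketched as a small helper. The names round_up and padded_len are made up for this illustration, and the sizes are the assumed ones from the example above (sizeof 16, alignment 8). This only illustrates the arithmetic; the kernel may choose a different amount of padding.

```c
#include <assert.h>
#include <stddef.h>

/* Round n up to the next multiple of align (align must be non-zero). */
static size_t round_up(size_t n, size_t align)
{
    return (n + align - 1) / align * align;
}

/* Hypothetical helper: the value to store in len so that the next record,
 * placed at header_size + len, starts on an aligned offset.
 * name_bytes counts the terminating zero byte; 0 means "no name". */
static size_t padded_len(size_t header_size, size_t name_bytes, size_t align)
{
    if (name_bytes == 0)
        return 0;
    return round_up(header_size + name_bytes, align) - header_size;
}
```

With the assumed numbers, padded_len(16, 2, 8) gives 8: the 2 name bytes plus the padding needed to move the next record from offset 18 to offset 24.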
Inspect the example code in the manual page. In the loop where we loop through struct inotify_events there is the line:
ptr += sizeof(struct inotify_event) + event->len
The ptr is a char* pointer to the current / next array member. We need to increment the pointer not only by sizeof(struct inotify_event) but also by the number of strlen(name) + 1 bytes + inserted padding to the next array member. That way we can keep the array member aligned to their needed alignment. In the next position is the next struct inotify_event.
For more information, read about pointer arithmetic in C and flexible array members.
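To make the stepping concrete, here is a minimal sketch that packs and walks variable-length records the same way. It does not use inotify itself: struct event, put_event, count_events and demo are hypothetical names invented for this illustration, and the storage comes from malloc, which returns memory aligned for any type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct inotify_event. */
struct event {
    int      wd;
    uint32_t len;     /* bytes in name[], including padding */
    char     name[];  /* flexible array member */
};

/* Write one record at offset off; returns the offset of the next record,
 * or 0 if the record does not fit into cap bytes. */
static size_t put_event(char *buf, size_t cap, size_t off, int wd, const char *name)
{
    size_t align = _Alignof(struct event);
    size_t raw = strlen(name) + 1;              /* name plus zero byte */
    size_t end = (off + sizeof(struct event) + raw + align - 1) / align * align;
    if (end > cap)
        return 0;
    struct event *ev = (struct event *)(buf + off);
    ev->wd = wd;
    ev->len = (uint32_t)(end - off - sizeof(struct event));
    memset(ev->name, 0, ev->len);               /* zero the padding too */
    memcpy(ev->name, name, raw);
    return end;
}

/* Step through the records in buf[0..used) the way the man-page loop does
 * and return how many were seen. */
static size_t count_events(const char *buf, size_t used)
{
    size_t n = 0;
    const char *ptr = buf;
    while (ptr < buf + used) {
        const struct event *ev = (const struct event *)ptr;
        assert((uintptr_t)ptr % _Alignof(struct event) == 0);
        ptr += sizeof(struct event) + ev->len;
        n++;
    }
    return n;
}

/* Usage sketch: pack two events and walk them; returns the count found. */
static size_t demo(void)
{
    char *buf = malloc(64);
    if (buf == NULL)
        return 0;
    size_t off = put_event(buf, 64, 0, 1, "A");
    off = put_event(buf, 64, off, 2, "hello");
    size_t n = count_events(buf, off);
    free(buf);
    return n;
}
```

Because len already includes the padding, the reader never has to know the alignment: adding sizeof(struct event) + ev->len always lands on the next record.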
This is the old GCC non-standard way of specifying that the array buf needs to start at an address that would be a suitable starting address for struct inotify_event.
In C11 and C17 this can be written as
char _Alignas(struct inotify_event) buf[4096];
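A quick way to see what the declaration buys you is to check the buffer's address against the required alignment. The sketch below uses a made-up struct record in place of struct inotify_event so that it does not depend on the inotify headers; aligned_for_record is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record type standing in for struct inotify_event. */
struct record {
    int      wd;
    uint32_t len;
    char     name[];
};

/* Returns 1 if p is a suitable start address for a struct record. */
static int aligned_for_record(const void *p)
{
    return (uintptr_t)p % _Alignof(struct record) == 0;
}

/* Buffer declared with the C11 spelling: it starts at an address that is
 * suitable for struct record. */
static _Alignas(struct record) char buf[4096];
```

Without the alignment specifier, a plain char array is only guaranteed the alignment of char, so the first check could fail depending on where the compiler places the array.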
Related
Consider the following code fragment:
struct data_t {
int data1;
int data2;
struct data_t *next;
size_t size;
int data3;
int data4;
};
int *ptr;
struct data_t data;
...
ptr = &data.data4;
Now using pointer, which is set to point to the last element in the structure, how can one use that pointer to access the first element in the structure (data1)?
Normally, what I would do in this case is back up the pointer by so many words to point to that element, but there is a problem. The pointer variable next in the middle of the structure has a varying size depending on the platform. If this is running on a 32-bit platform, then the pointer is 4 bytes while on a 64-bit platform, the pointer takes up 8 bytes. A similar issue happens with the size_t datatype as well.
Although not clear in the example, the structure is the header to a block of memory that is variable in size and is part of a linked list. AKA a free list in a memory allocator. Other than using some kind of an initialization that calculates the size of the pointer itself, is there a portable way of getting the address of the first element of the structure?
You can use offsetof to know how far a member is from the start of the structure. In this case:
struct data_t *p = (struct data_t *)( (char *)ptr - offsetof(struct data_t, data4) );
Obviously this requires you to know that the pointer is pointing at a data4 already; there's no way to autodetect that. And, of course, it would be preferable to use a code design where you pass around the struct data_t * in the first place.
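Wrapped in a function, the offsetof approach might look like the sketch below. The helper name data_from_data4 is made up for this illustration; the struct is the one from the question.

```c
#include <assert.h>
#include <stddef.h>

struct data_t {
    int data1;
    int data2;
    struct data_t *next;
    size_t size;
    int data3;
    int data4;
};

/* Recover the enclosing struct from a pointer that is known to point at
 * its data4 member. The compiler supplies the offset, so this works
 * regardless of pointer or size_t width and of any padding. */
static struct data_t *data_from_data4(int *ptr)
{
    return (struct data_t *)((char *)ptr - offsetof(struct data_t, data4));
}
```

Once the struct pointer is recovered, data1 is reached normally as data_from_data4(ptr)->data1.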
Given the simple code below, where process_payload is given a pointer to the payload portion of the packet, how do you access the header portion? Ideally the caller would simply pass a pointer to the full packet from the beginning, but there are cases where you don't have the beginning of the message and need to work backwards to get to the header info. I guess this question comes down to understanding how to walk through the memory layout of a struct.
The header computes to 8 bytes with sizeof operation. I assume Visual C++ compiler added 3 bytes padding to header.
The difference between pptr and pptr->payload is decimal 80 (not sure why this value??) when doing ptr arith (pptr->payload - pptr). Setting ptr = (struct Packet*)(payload - 80) works but seems more a hack. I don't quite understand why subtracting sizeof(struct header) doesn't work.
Thanks for any help you can give.
struct Header
{
unsigned char id;
unsigned int size;
};
struct Packet
{
struct Header header;
unsigned char* payload;
};
void process_payload(unsigned char* payload);
int main()
{
struct Packet* pptr = (struct Packet*)malloc(sizeof(struct Packet));
pptr->payload = (unsigned char*)malloc(sizeof(unsigned char)*10);
process_payload(pptr->payload);
return 1;
}
// Function needs to work backwards to get to header info.
void process_payload(unsigned char* payload)
{
// If ptr is correctly setup, it will be able to access all the fields
// visible in struct Packet and not simply payload part.
struct Packet* ptr;
// This does not work when intuitively it should?
ptr = (struct Packet*)(payload - sizeof(struct Header));
}
It's because in main you allocate two pointers, and pass the second pointer to the process_payload function. The two pointers are not related.
There are two ways of solving this problem, where both include a single allocation.
The first solution is to use a so-called flexible array member, where the last member of the structure is an array without any size:
struct Packet
{
struct Header header;
unsigned char payload[];
};
To use it you make one allocation, with the size of the structure plus the size of the payload:
struct Packet *pptr = malloc(sizeof(struct Packet) + 10);
Now pptr->payload is handled like a normal pointer pointing to 10 unsigned characters.
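With the flexible array member in place, stepping back from the payload to the packet becomes well defined. Here is a minimal sketch; packet_new and packet_from_payload are hypothetical helper names, and offsetof is used rather than sizeof(struct Header) so that any padding the compiler adds is accounted for.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct Header {
    unsigned char id;
    unsigned int size;
};

struct Packet {
    struct Header header;
    unsigned char payload[];   /* flexible array member */
};

/* One allocation holds the header and n payload bytes. */
static struct Packet *packet_new(unsigned char id, unsigned int n)
{
    struct Packet *p = malloc(sizeof *p + n);
    if (p != NULL) {
        p->header.id = id;
        p->header.size = n;
    }
    return p;
}

/* Step back from the payload to the packet that contains it. Only valid
 * for payload pointers that really come from a struct Packet. */
static struct Packet *packet_from_payload(unsigned char *payload)
{
    return (struct Packet *)(payload - offsetof(struct Packet, payload));
}
```

process_payload could then call packet_from_payload(payload) and read header.id and header.size from the result.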
Another solution, which is a mix of your current solution and the solution with flexible arrays, is to make one allocation and make the payload pointer to point to the correct place in the single allocated memory block:
struct Packet
{
struct Header header;
unsigned char *payload;
};
// ...
struct Packet *pptr = malloc(sizeof(struct Packet) + 10);
pptr->payload = (unsigned char *) ((char *) pptr + sizeof(struct Packet));
Note that in this case, to get the Packet structure from the payload pointer, you have to use sizeof(struct Packet) instead of only sizeof(struct Header).
Two things to note about the code above:
I don't cast the result of malloc
sizeof(char) (and also the size of unsigned char) is specified to always be one, so no need for sizeof
We use Coverity to detect vulnerabilities in our code. Basically this is the code snippet:
static int vendor_request(
const struct OFPHDR *oh,
size_t length,
const struct OFPUTIL_MT **typep
)
{
const struct OFPSM *osr;
ovs_be32 vendor;
osr = (const struct OFPSM *) oh;
memcpy(&vendor, ((char*)osr + sizeof(struct OFPSM)), sizeof( vendor ));
if (vendor == htonl(VENDOR_A))
return (functionA(oh, typep));
if (vendor == htonl(VENDOR_B))
return (functionB(oh, length, typep));
else
return 0;
}
Here,
sizeof(struct OFPSM) = 12 bytes.
sizeof(struct OFPHDR) = 8 bytes.
Coverity says:
CID xxxxx (#1 of 2): Out-of-bounds access (OVERRUN)
1. overrun-buffer-val: Overrunning struct type OFPHDR of 8 bytes by passing it to a function which accesses it at byte offset 12. Pointer osr indexed by constant 12U through dereference in call to memcpy.
Basically struct OFPHDR is a PDU on top of the TCP layer; its size is 8 bytes, but it can vary depending on the type of OFP message. Coverity says that I'm dereferencing *oh at byte offset 12, which is an out-of-bounds access.
But I don't understand the problem since I'm typecasting OFPHDR to proper structure which is of 12 bytes and then dereferencing it. So, how could this error be avoided?
This cast:
osr = (const struct OFPSM *) oh;
is breaking the strict aliasing rules since it is casting to an incompatible type.
It's clear they are incompatible since you say:
sizeof(struct OFPSM) = 12 bytes.
sizeof(struct OFPHDR) = 8 bytes.
But I don't understand the problem since I'm typecasting OFPHDR to proper structure which is of 12 bytes and then dereferencing it.
Coverity is trying to save you from a path where perhaps you only allocated/read in sizeof OFPHDR bytes and yet you attempt to access beyond that allocation. You can see two reasonable possibilities taking you there: your vendor == htonl(VENDOR_A) logic could be implemented incorrectly or the values that you read from the network were maliciously crafted/in error.
Your cast supposes information about the implementation of the caller that coverity thinks you can't be certain about in vendor_request.
So, how could this error be avoided?
You could avoid it by changing vendor_request like so:
typedef union {
struct OFPHDR oh;
struct OFPSM osm;
} son_of_OFPHDR;
static int vendor_request(
const son_of_OFPHDR *oh,
size_t length,
const struct OFPUTIL_MT **typep
)
This explicitly tells compilers, static checkers, and humans that the oh input may be an OFPHDR or could be an OFPSM.
Everyone who agrees to take a son_of_OFPHDR * has an implicit pledge from callers that memory for the entire structure has been allocated. And everywhere son_of_OFPHDRs show up with automatic storage duration, sufficient memory will be allocated there.
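A separate way to address the concern about reading past what was actually received is to check the length argument before touching the field. This is a different technique from the union above, not a replacement for it, and whether it satisfies Coverity is not something this sketch can promise. The helper read_vendor and the two constants are made up for this illustration; the offset of 12 and the 4-byte field width are the sizes quoted in the question. Note that it decodes the field into host byte order, unlike the original code, which compares against htonl(...).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sizes quoted in the question: the vendor field follows a 12-byte
 * message and is 4 bytes wide. */
enum { STATS_MSG_SIZE = 12, VENDOR_SIZE = 4 };

/* Hypothetical helper: read the big-endian 32-bit vendor ID that follows
 * the 12-byte message. Returns 0 on success, or -1 if the message is too
 * short to contain the field. */
static int read_vendor(const unsigned char *msg, size_t length, uint32_t *vendor)
{
    if (length < STATS_MSG_SIZE + VENDOR_SIZE)
        return -1;
    const unsigned char *p = msg + STATS_MSG_SIZE;
    /* Assemble the value byte by byte: no cast to a larger struct and no
     * alignment requirement on msg. */
    *vendor = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
              ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
    return 0;
}
```

A caller would compare the decoded value directly against the vendor constants and treat -1 as a malformed message.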
Everyone, thanks for the answers.
@PeterSW: Yes, the structs are incompatible, but as I mentioned, OFPHDR is a PDU on top of the TCP layer and its size is variable. The information we need to extract (vendor) from that pointer lies at byte offset 12.
This is solved by typecasting it to correct structure which has size enough to envelop more than 12 bytes and includes that element(vendor):
struct OFPVSM {
struct OFPSM osm;
ovs_be32 vendor; /* Vendor ID */
/* Followed by vendor-defined arbitrary additional data. */
};
Here,
sizeof(struct OFPVSM) = 16 bytes.
Solution in git diff format:
- const struct OFPSM *osr;
+ const struct OFPVSM *osr;
- osr = (const struct OFPSM *) oh;
+ osr = (const struct OFPVSM*) oh;
Sorry for not mentioning a vital info:
struct OFPSM actually comprises of struct OFPHDR
struct OFPSM{
struct OFPHDR header;
ovs_be16 type;
ovs_be16 flags;
};
"vendor" lies at the end of struct OFPSM.
See the struct I used below. I wish to solve this problem in a portable way.
The code I used for finding the absolute address of the struct was: (char*)data - sizeof(struct block); (where data is the address to the data in the struct block). It did not work on this struct.
I made a test program, shown below, where the last assert fails.
If I change unsigned int free:1; to unsigned int free; both prints will print 12 and thus sizeof has given me the expected result.
Thanks in advance.
#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
struct block {
size_t size;
struct block* next;
unsigned int free:1;
char data[];
};
int main(void)
{
struct block* avail;
struct block* b;
avail = malloc(sizeof(struct block) + 10);
printf("%zu \n", sizeof(struct block)); // prints 12
printf("%td\n", avail->data - (char*)&avail->size); //prints 9
b = (struct block*)((char*)avail->data - 9);
assert(b == avail);
b = (struct block*)((char*)avail->data - sizeof(struct block));
assert(b == avail);
return 0;
}
EDIT: seems like I found the answer here on stack overflow:
how to get struct's start address from its member's address
It gives me correct absolute address.
The only guarantees you have regarding the layout (and size) of
struct block {
size_t size;
struct block* next;
unsigned int free:1;
char data[];
};
are that the addresses of the members (resp. the unit containing the bit-field) are increasing in the order of their listing, the members are suitably aligned for their types, and there's no padding at the start of the struct, so a pointer to the struct, suitably converted, yields a pointer to its first member. The compiler is free to insert more padding between the members than needed for alignment.
However, usually, the padding inserted is only what is needed for alignment. Also the size and alignment requirements of size_t and struct block* are in most implementations the same, both 4 bytes on a 32 bit system and 8 bytes on a 64 bit system. Then the size of struct block is a multiple of k = sizeof(size_t), and the first k bytes are occupied by the size member, the next k bytes by the next pointer.
After that comes an unsigned bit-field of width 1. Such a small bit-field fits into any unit of storage, thus the implementation is free to choose a unit of storage of any size for it. Natural choices would be
one byte, since it's the smallest possible unit,
sizeof(int) bytes, since "A ‘plain’ int object has the natural size suggested by the architecture of the execution environment".
Now, if the unit to contain the bit-field is chosen to have the size of one byte, as was the case for your implementation (and mine), the data member is typically placed directly after that, at an offset of 2*k+1 bytes, since the alignment of char is 1. If the unit for the bit-field is chosen to be int-sized, the offset of data will most likely be 2*k + sizeof(int), which on 32-bit systems is probably equal to sizeof(struct block), but not on 64-bit systems.
You can with very high probability bring the implementation to make
offsetof(struct block, data) == sizeof(struct block)
by inserting an unnamed bit-field of appropriate width (CHAR_BIT * sizeof(size_t) - 1) between free and data, but the only way that is portable and guaranteed to work is
struct block *b_addr = (struct block*)((char*)(avail->data) - offsetof(struct block, data));
as stated in Greg Hewgill's answer to the linked question.
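As a self-contained check of the offsetof approach, the sketch below wraps it in two helpers. The names block_new and block_from_data are made up for this illustration; the struct is the one from the question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct block {
    size_t size;
    struct block *next;
    unsigned int free:1;
    char data[];
};

/* Allocate a block header followed by n data bytes. */
static struct block *block_new(size_t n)
{
    struct block *b = malloc(sizeof *b + n);
    if (b != NULL) {
        b->size = n;
        b->next = NULL;
        b->free = 1;
    }
    return b;
}

/* Recover the block from a pointer to its data member. offsetof yields
 * the real position of data, whatever storage unit the compiler picked
 * for the bit-field. */
static struct block *block_from_data(char *data)
{
    return (struct block *)(data - offsetof(struct block, data));
}
```

Unlike subtracting sizeof(struct block), this does not depend on offsetof(struct block, data) and sizeof(struct block) happening to be equal.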
sizeof(struct block) - sizeof(char*) should give you the size of the struct block, not including the data field. So, if you have a pointer to data, you should reach the beginning of the structure.
b = (struct block*)((char*)avail->data - (sizeof(struct block) - sizeof(char*)));
assert(b == avail);
I have not tested it, though.
Let's say you have-
struct Person {
char *name;
int age;
int height;
int weight;
};
If you do-
struct Person *who = malloc(sizeof(struct Person));
How would C know how much memory to allocate for name variable as this can hold a large number of data/string? I am new to C and getting confused with memory allocation.
It won't know. You will have to allocate memory for it separately.
struct Person *who = malloc(sizeof(struct Person));
Allocates enough memory to store an object of type struct Person.
Inside a Person object, the member name occupies only the space of a pointer to char.
The above malloc allocates just that much space; to do anything meaningful with the member pointer, you have to allocate memory for it separately.
#define MAX_NAME 124
who->name = malloc(sizeof(char) * MAX_NAME);
Now the member name points to 124 bytes of dynamic memory on the heap, and it can be used further.
Also, once you are done with it, remember to free it explicitly, or you will end up with a memory leak.
free(who->name);
free(who);
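Putting the allocation and the cleanup together, a minimal sketch might look like this. The helper names person_new and person_free are made up for this illustration, and the name buffer is sized from the string itself rather than from a fixed maximum.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct Person {
    char *name;
    int age;
    int height;
    int weight;
};

/* Allocate the struct and, separately, storage for a copy of name.
 * Returns NULL if either allocation fails. */
static struct Person *person_new(const char *name, int age)
{
    struct Person *who = malloc(sizeof *who);
    if (who == NULL)
        return NULL;
    who->name = malloc(strlen(name) + 1);   /* +1 for the zero byte */
    if (who->name == NULL) {
        free(who);
        return NULL;
    }
    strcpy(who->name, name);
    who->age = age;
    who->height = 0;
    who->weight = 0;
    return who;
}

/* Free in the reverse order of allocation: the string first, then the struct. */
static void person_free(struct Person *who)
{
    if (who != NULL) {
        free(who->name);
        free(who);
    }
}
```

Every successful person_new should be matched by exactly one person_free.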
Don't assume that the memory storing the pointer for name is the same as the memory storing the data for name. Assuming a 4 byte word size, you have the following:
char * (4 bytes)
int (4 bytes)
int (4 bytes)
int (4 bytes)
================
total: 16 bytes
which is: sizeof(char*) + sizeof(int) + sizeof(int) + sizeof(int). C knows the size because you've told it the size of the elements in the struct definition.
I think what you are confused about is the following:
The contents at the char * will be a memory location (e.g. 0x00ffbe532) which is where the actual string will be stored. Don't assume that the struct contents are contiguous (because of the pointer). In fact, you can be pretty sure that they won't be.
So, to reiterate, for an example struct Person (this is just an example, the locations won't be the same in a real program.)
location : [contents]
0x0000 : [0x00ffbe532]
0x0004 : [10]
0x0008 : [3]
0x000C : [25]
0x00ffbe532 : [I am a string\0]
The name member is just a pointer. The size of a pointer varies with the underlying architecture, but is usually 4 or 8 bytes nowadays.
The data that name can point to (if assigned later) should be laid out in an area that does not coincide with the struct at all.
At the language level, the struct doesn't know anything about the memory that the name member is pointing to; you have to manage that manually.
It allocates memory for the just the pointer to a char. You need to do a separate allocation for the contents.
There are other options, though:
If you are OK with having a fixed sized maximum length, you can do:
struct Person {
char name[PERSON_NAME_MAX_LENGTH+1];
int age;
int height;
int weight;
};
And allocate it as in your example.
Or you can declare a variable sized struct, but I wouldn't recommend this as it is tricky and you cannot have more than one variable size array per struct:
struct Person {
int age;
int height;
int weight;
char name[]; /*this must go at the end*/
};
and then allocate it like:
struct Person *who = malloc(sizeof(struct Person) + sizeof(char)*(name_length+1));
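A fuller sketch of the variable-sized variant is shown below, with a hypothetical helper person_new that sizes the single allocation from the name it is given.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct Person {
    int age;
    int height;
    int weight;
    char name[];   /* this must go at the end */
};

/* One allocation holds both the fixed members and the name. */
static struct Person *person_new(const char *name, int age)
{
    size_t name_length = strlen(name);
    struct Person *who = malloc(sizeof *who + name_length + 1);
    if (who == NULL)
        return NULL;
    who->age = age;
    who->height = 0;
    who->weight = 0;
    memcpy(who->name, name, name_length + 1);   /* copies the zero byte too */
    return who;
}
```

Since the name lives inside the same block, a single free(who) releases everything; the trade-off is that the name cannot grow later without reallocating the whole struct.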
It will allocate 4 bytes for the name pointer (on a 32-bit system), but no space for the "real" string. If you try to write through it, you will likely get a segmentation fault; you need to malloc (and free) the string separately.
Using string (in C++) can save you some headache
For pointers the struct is allocated enough memory just for the pointer. You have to create the memory for the char* and assign the value to the struct. Assuming you had char* name somewhere:
struct Person *who = malloc(sizeof(struct Person));
who->name = malloc((strlen(name)+1) * sizeof(char));
strcpy(who->name, name);
It doesn't. You have to do that too.
struct Person *who = malloc(sizeof(struct Person));
who->name = malloc(sizeof(char) * 16); /* for a 15+1 character array */
You have a pointer to a name character array in your struct, i.e. it takes up as many bytes as are needed to represent a memory address.
If you want to actually use the field, you must allocate additional memory for it.
Pointer members occupy one word in my experience (the address at which the data actually resides), so the size will probably be 16 bytes (assuming one word is 4 bytes).
As was said above, you need to separately allocate memory for *name, which will reserve memory somewhere else of the size you desire.
I think you should also need to allocate memory for the name attribute