Offsetof used in Linux

I was going through how the offset of a particular member is found in a given structure.
I tried the following program.
#include <stdio.h>
struct info{
char a;
int b;
char c;
int d;
};
struct info myinfo;
int main(int argc, char **argv)
{
struct info *ptr = &myinfo;
unsigned int offset;
offset = (unsigned int) &((struct info *) 0)->d;
printf("Offset = %u\n",offset);
return 0;
}
I just wanted to know how the line offset = (unsigned int) &((struct info *) 0)->d works.
I am confused because of dereferencing of 0.

It does not really dereference 0, although it looks like it. The expression only computes the address that the member would have if a struct info were located at address 0. Since the struct would start at 0, that address is numerically the member's offset.
This is a bit of a dirty hack, and strictly speaking the C standard does not bless it, but it gets you what you're interested in (the offset of the member in the struct). It is also how the offsetof macro was traditionally implemented.
A more "correct" way of doing the same thing would be to create a valid object, take its address, take the address of the member, and subtract the two. Doing the same with a null pointer is not pretty, but it works without creating an object or subtracting anything.

You're not actually dereferencing 0. You're adding zero and the offset of the member, since you're taking the address of the expression. That is, if off is the offset of the member, you're doing
0 + off
not
*(0 + off)
so you never actually do a memory access.

Related

Accessing value from address in C

Here is my code
struct ukai { int val[1]; };
struct kai { struct ukai daddr; struct ukai saddr; };
struct kai *k, uk;
uk.saddr.val[0] = 5;
k = &uk;
k->saddr.val[0] = 6;
unsigned int *p = (unsigned int *)malloc(sizeof(unsigned int));
p[0] = k;
int *vp;
vp = ((uint8_t *)p[0] + 4);
printf("%d\n", *vp);
This produces a segmentation fault. However, if we replace the last line with printf("%u\n", vp), it prints the address, i.e. &(k->saddr.val[0]). So I am unable to print the value at that address through p[0], although I can print it using k->saddr.val[0].
I have to use the pointer p in some way to access the value at val[0]; I can't use the pointer k. Is this even possible? Please let me know.

The code makes no sense:
p[0] = k; converts the value of the pointer k to an unsigned int, since p is a pointer to unsigned int. The result is implementation-defined and loses information if pointers are larger than unsigned int.
vp = ((uint8_t *)p[0] + 4); converts the unsigned int stored in p[0] back to a pointer to unsigned char and makes vp point to the location 4 bytes beyond it. If pointers are larger than unsigned int, this pointer is bogus. Just printing the value of this bogus pointer might be OK, but dereferencing it has undefined behavior.
printf("%u\n", vp) uses an incorrect format for the pointer vp; again this is undefined behavior, although it is unlikely to crash.
The problem is most likely related to the size of pointers and integers: if you compile this code as 64 bits, pointers are larger than ints, so converting one to the other loses information.
Here is a corrected version:
struct ukai { int val[1]; };
struct kai { struct ukai daddr; struct ukai saddr; };
struct kai *k, uk;
uk.saddr.val[0] = 5;
k = &uk;
k->saddr.val[0] = 6;
int **p = malloc(sizeof *p);
p[0] = (int *)k; // explicit cast: k is a struct kai *, not an int *
int *vp = (int *)((uint8_t *)p[0] + sizeof(int));
printf("%d\n", *vp); // should print 6

There is a lot of "dirty" mess with the addresses done here.
Some of this stuff is not recommended or even forbidden from the standard C point of view.
However such pointer/addresses tweaks are commonly used in low level programming (embedded, firmware, etc.) when some compiler implementation details are known to the user. Of course such code is not portable.
Anyway, the issue here (after getting more details in the comments section) is that the machine on which this code runs is 64-bit. Thus pointers are 64 bits wide, while int and unsigned int are 32 bits wide.
So when storing address of k in p[0]
p[0] = k;
Since p[0] is of type unsigned int and k is of type pointer to struct kai, the upper 32 bits of the value of k are cut off.
To resolve this issue, the best way is to use uintptr_t, as this type will always have the proper width to hold the full address value.
uintptr_t *p = malloc(sizeof(uintptr_t));
Note: uintptr_t is optional, yet common. It is sufficient for a void*, but maybe not a function pointer. For compatible code, proper usage of uintptr_t includes object pointer --> void * --> uintptr_t --> void * --> object pointer.

Understanding pointer casting on struct type in C

I'm trying to understand the pointer casting in this case.
# https://github.com/udp/json-parser/blob/master/json.c#L408
#define json_char char
typedef struct _json_object_entry
{
json_char * name;
unsigned int name_length;
struct _json_value * value;
} json_object_entry;
typedef struct _json_value
{
struct
{
unsigned int length;
json_object_entry * values;
#if defined(__cplusplus) && __cplusplus >= 201103L
decltype(values) begin () const
{ return values;
}
decltype(values) end () const
{ return values + length;
}
#endif
} object;
}
(*(json_char **) &top->u.object.values) += string_length + 1;
From what I can see, top->u.object.values holds the address of the first element of values (type: json_object_entry), and then we take the address of values and cast it to a pointer to pointer to char... And from here I'm lost. I don't really understand the purpose of this.
// Note: this is a two-pass parser, for those who wonder what this is.
Thanks

_json_value::values is a pointer to the beginning of (or into) an array of json_object_entry elements. The code adjusts its value by a few bytes, e.g. in order to skip a header or such before the actual data. Because the pointer is typed, without a cast one can only change its value in multiples of sizeof(json_object_entry), but apparently the offset can have any value, depending on some string_length. So the address of the pointer is taken, cast to the address of a char pointer (a char pointer can be changed in increments of 1), and dereferenced, so the result is a pointer to char residing at the same place as the real u.object.values; that pointer is then added to.
One should add that such code may break at run time if the architecture demands a minimal alignment for structures (possibly depending on their first element, here a pointer) and the string length can have a value which is not a multiple of that alignment. That would make the code UB. I'm not exactly sure whether the code is nominally UB if the alignment is preserved.

Author here (guilty as charged...)
In the first pass, values hasn't yet been allocated, so the parser cheats by using the same field to store the amount of memory (length) that's going to be required when it's actually allocated in the second pass.
if (state.first_pass)
(*(json_char **) &top->u.object.values) += string_length + 1;
The cast to json_char is so that we add multiples of char to the length, rather than multiples of json_object_entry.
It is a bit (...OK, more than a bit...) of a dirty hack re-using the field like that, but it was to save adding another field to json_value or using a union (C89 unions can't be anonymous, so it would have made the structure of json_value a bit weird).
There's no UB here, because we're not actually using values as an array of structs at this point, just subverting the type system and using it as an integer.

json_object_entry * values;
...
}
(*(json_char **) &top->u.object.values) += string_length + 1;
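Forgetting type correctness for a moment, you can mentally collapse the & and *:
((json_char *) top->u.object.values) += string_length + 1;
That line is not valid C as written, because the result of a cast is not an lvalue; this is why the real code takes the address, casts it to json_char **, and dereferences. top->u.object.values is indeed the pointer to the first element of the values array. Through the cast it is treated as a pointer to json_char and then advanced by string_length + 1 characters. The net result is that top->u.object.values now points (string_length + 1) json_chars ahead of where it used to.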

How do I correctly use a void pointer in C?

Can someone explain why I do not get the value of the variable, but its memory instead?
I need to use void* to point to "unsigned short" values.
As I understand void pointers, their size is unknown and their type is unknown.
Once I initialize them, however, they are known, right?
Why does my printf statement print the wrong value?
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
void func(int a, void *res){
res = &a;
printf("res = %d\n", *(int*)res);
int b;
b = * (int *) res;
printf("b =%d\n", b);
}
int main (int argc, char* argv[])
{
//trial 1
int a = 30;
void *res = (int *)a;
func(a, res);
printf("result = %d\n", (int)res);
//trial 2
unsigned short i = 90;
res = &i;
func(i, res);
printf("result = %d\n", (unsigned short)res);
return 0;
}
The output I get:
res = 30
b =30
result = 30
res = 90
b =90
result = 44974

One thing to keep in mind: C does not guarantee that int will be big enough to hold a pointer (including void*). That cast is not a portable thing/good idea. Use %p to printf a pointer.
Likewise, you're doing a "bad cast" here: void* res = (int*) a is telling the compiler: "I am sure that the value of a is a valid int*, so you should treat it as such." Unless you actually know for a fact that there is an int stored at memory address 30, this is wrong.
Fortunately, you immediately overwrite res with the address of the other a. (You have two vars named a and two named res, the ones in main and the ones in func. The ones in func are copies of the value of the one in main, when you call it there.) Generally speaking, overwriting the value of a parameter to a function is "bad form," but it is technically legal. Personally, I recommend declaring all of your functions' parameters as const 99% of the time (e.g. void func (const int a, const void* res))
Then, you cast res to an unsigned short. I don't think anybody's still running on a 16-bit address-space CPU (well, your Apple II, maybe), so that will definitely corrupt the value of res by truncating it.
In general, in C, typecasts are dangerous. You're overruling the compiler's type system, and saying: "look here, Mr Compiler, I'm the programmer, and I know better than you what I have here. So, you just be quiet and make this happen." Casting from a pointer to a non-pointer type is almost universally wrong. Casting between pointer types is more often wrong than not.
I'd suggest checking out some of the "Related" links further down this page to find a good overview of how C types and pointers work in general. Sometimes it takes reading over a few to really get a grasp of how this stuff fits together.

(unsigned short)res
is a cast on a pointer. res is a memory address; by casting it to unsigned short, you get the address itself, truncated to fit an unsigned short, rather than the value stored there. To get the stored value, print
*(unsigned short*)res
The cast (unsigned short*)res converts the void* pointer to a pointer to unsigned short. You can then read the value at the address res points to by dereferencing it with *.

If you have a void pointer ptr that you know points to an int, in order to access to that int write:
int i = *(int*)ptr;
That is, first cast it to a pointer-to-int with cast operator (int*) and then dereference it to get the pointed-to value.
You are casting the pointer directly to a value type, and although the compiler will happily do it, that's probably not what you want.

A void pointer is used in C as a kind of generic pointer. A void pointer variable can be used to contain the address of any variable type. The problem with a void pointer is once you have assigned an address to the pointer, the information about the type of variable is no longer available for the compiler to check against.
In general, void pointers should be avoided since the type of the variable whose address is in the void pointer is no longer available to the compiler. On the other hand, there are cases where a void pointer is very handy. However it is up to the programmer to know the type of variable whose address is in the void pointer variable and to use it properly.
Much of older C source has C style casts between type pointers and void pointers. This is not necessary with modern compilers and should be avoided.
The size of a void pointer variable is known. What is not known is the size of the variable whose pointer is in the void pointer variable. For instance here are some source examples.
// create several different kinds of variables
int iValue;
char aszString[6];
float fValue;
int *pIvalue = &iValue;
void *pVoid = 0;
int iSize = sizeof(*pIvalue); // get size of what int pointer points to, an int
int vSize = sizeof(*pVoid); // invalid in standard C: size of what a void pointer points to is unknown (GCC accepts it as an extension)
int vSizeVar = sizeof(pVoid); // compiles fine size of void pointer is known
pVoid = &iValue; // put the address of iValue into the void pointer variable
pVoid = &aszString[0]; // put the address of char string into the void pointer variable
pVoid = &fValue; // put the address of float into the void pointer variable
pIvalue = &fValue; // incompatible pointer types: the compiler diagnoses putting the address of a float into an int pointer
One way that void pointers have been used is by having several different types of structs which are provided as an argument for a function, typically some kind of a dispatching function. Since the interface for the function allows for different pointer types, a void pointer must be used in the argument list. Then the type of variable pointed to is determined by either an additional argument or inspecting the variable pointed to. An example of that type of use of a function would be something like the following. In this case we include an indicator as to the type of the struct in the first member of the various permutations of the struct. As long as all structs that are used with this function have as their first member an int indicating the type of struct, this will work.
struct struct_1 {
int iClass; // struct type indicator. must always be first member of struct
int iValue;
};
struct struct_2 {
int iClass; // struct type indicator. must always be first member of struct
float fValue;
};
void func2 (void *pStruct)
{
struct struct_1 *pStruct_1 = pStruct;
struct struct_2 *pStruct_2 = pStruct;
switch (pStruct_1->iClass) // this works because a struct is a kind of template or pattern for a memory location
{
case 1:
// do things with pStruct_1
break;
case 2:
// do things with pStruct_2
break;
default:
break;
}
}
void xfunc (void)
{
struct struct_1 myStruct_1 = {1, 37};
struct struct_2 myStruct_2 = {2, 755.37f};
func2 (&myStruct_1);
func2 (&myStruct_2);
}
Something like the above has a number of software design problems with the coupling and cohesion so unless you have good reasons for using this approach, it is better to rethink your design. However the C programming language allows you to do this.
There are some cases where the void pointer is necessary. For instance the malloc() function which allocates memory returns a void pointer containing the address of the area that has been allocated (or NULL if the allocation failed). The void pointer in this case allows for a single malloc() function that can return the address of memory for any type of variable. The following shows use of malloc() with various variable types.
void yfunc (void)
{
int *pIvalue = malloc(sizeof(int));
char *paszStr = malloc(sizeof(char)*32);
struct struct_1 *pStruct_1 = malloc (sizeof(*pStruct_1));
struct struct_2 *pStruct_2Array = malloc (sizeof(*pStruct_2Array)*21);
pStruct_1->iClass = 1; pStruct_1->iValue = 23;
func2(pStruct_1); // pStruct_1 is already a pointer so address of is not used
{
int i;
for (i = 0; i < 21; i++) {
pStruct_2Array[i].iClass = 2;
pStruct_2Array[i].fValue = 123.33f;
func2 (&pStruct_2Array[i]); // address of particular array element. could also use func2 (pStruct_2Array + i)
}
}
free(pStruct_1);
free(pStruct_2Array); // free the entire array which was allocated with single malloc()
free(pIvalue);
free(paszStr);
}

If what you want to do is pass the variable a by name and use it, try something like:
void func(int* src)
{
printf( "%d\n", *src );
}
If you get a void* from a library function, and you know its actual type, you should immediately store it in a variable of the right type:
int *ap = calloc( 1, sizeof(int) );
There are a few situations in which you must receive a parameter by reference as a void* and then cast it. The one I’ve run into most often in the real world is a thread procedure. So, you might write something like:
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
void* thread_proc( void* arg )
{
const int a = *(int*)arg;
/** Alternatively, with no explicit casts:
* const int* const p = arg;
* const int a = *p;
*/
printf( "Daughter thread: %d\n", a );
fflush(stdout); /* If more than one thread outputs, should be atomic. */
return NULL;
}
int main(void)
{
int a = 1;
pthread_t tid;
/* pthread_create returns 0 on success and stores the thread ID in tid. */
if (pthread_create( &tid, NULL, thread_proc, &a ) != 0)
return EXIT_FAILURE;
pthread_join(tid, NULL);
return EXIT_SUCCESS;
}
If you want to live dangerously, you could pass a uintptr_t value cast to void* and cast it back, but beware of trap representations.

printf("result = %d\n", (int)res); is printing the value of res (a pointer) as a number.
Remember that a pointer is an address in memory, so this will usually print some random-looking number. (In trial 1 it prints 30 only because res was created from the integer 30 in the first place.)
If you wanted to print the value stored at that address then you need *(int *)res; a void * cannot be dereferenced directly, so it has to be cast to the right pointer type first.
edit: if you want to print the value (i.e. the address) of a pointer then you should use %p; it's essentially the same, but it formats the output better and copes with int and pointer having different sizes on your platform.

void *res = (int *)a;
a is an int, not a pointer, so maybe it should be:
void *res = &a;

The size of a void pointer is known; it's the size of an address, which on most platforms is the same size as any other object pointer. You are freely converting between an integer and a pointer, and that's dangerous. If you mean to take the address of the variable a, use &a; in C it converts to void * implicitly, so void *res = &a; needs no cast.

Understanding C: Pointers and Structs

I'm trying to better understand C, and I'm having a hard time understanding where to use the * and & characters, and structs in general. Here's a bit of code:
void word_not(lc3_word_t *R, lc3_word_t A) {
int *ptr;
*ptr = &R;
&ptr[0] = 1;
printf("this is R at spot 0: %d", ptr[0]);
}
lc3_word_t is a struct defined like this:
struct lc3_word_t__ {
BIT b15;
BIT b14;
BIT b13;
BIT b12;
BIT b11;
BIT b10;
BIT b9;
BIT b8;
BIT b7;
BIT b6;
BIT b5;
BIT b4;
BIT b3;
BIT b2;
BIT b1;
BIT b0;
};
This code doesn't do anything; it compiles, but once I run it I get a "Segmentation fault" error. I'm just trying to understand how to read and write to a struct using pointers. Thanks :)
New Code:
void word_not(lc3_word_t *R, lc3_word_t A) {
int* ptr;
ptr = &R;
ptr->b0 = 1;
printf("this is: %d", ptr->b0);
}

Here's a quick rundown of pointers (as I use them, at least):
int i;
int* p; //I declare pointers with the asterisk next to the type, not the name;
//it's not conventional, but int* seems like the full data type to me.
i = 17; //i now holds the value 17 (obviously)
p = &i; //p now holds the address of i (&x gives you the address of x)
*p = 3; //the thing pointed to by p (in our case, i) now holds the value 3
//the *x operator is sort of the reverse of the &x operator
printf("%i\n", i); //this will print 3, cause we changed the value of i (via *p)
And paired with structs:
typedef struct
{
unsigned char a;
unsigned char r;
unsigned char g;
unsigned char b;
} Color;
Color c;
Color* p;
p = &c; //just like the last code
p->g = 255; //set the 'g' member of the struct to 255
//this works because the compiler knows that Color* p points to a Color
//note that we don't use p[x] to get at the members - that's for arrays
And finally, with arrays:
int a[] = {1, 2, 7, 4};
int* p;
p = a; //note the lack of the & (address of) operator
//we don't need it, as an array decays to a pointer to its first element here
//alternatively, "p = &a[0];" would have given the same result
p[2] = 3; //set that seven back to what it should be
//note the lack of the * (dereference) operator
//we don't need it, as the [] operator dereferences for us
//alternatively, we could have used "*(p+2) = 3;"
Hope this clears some things up - and don't hesitate to ask for more details if there's anything I've left out. Cheers!

I think you are looking for a general tutorial on C (of which there are many). Just check google. The following site has good info that will explain your questions better.
http://www.cplusplus.com/doc/tutorial/pointers/
http://www.cplusplus.com/doc/tutorial/structures/
They will help you with basic syntax and understanding what the operators are and how they work. Note that the site is C++ but the basics are the same in C.

First of all, your second line should be giving you some sort of warning about converting a pointer into an int. The third line I'm surprised compiles at all. Compile at your highest warning level, and heed the warnings.
The * does different things depending on whether it is in a declaration or an expression. In a declaration (like int *ptr or lc3_word_t *R) it just means "this is a pointer."
In an expression (like *ptr = &R) it means to dereference the pointer, which is basically to use the pointed-to value like a regular variable.
The & means "take the address of this." Applied to something that is not a pointer, it gives you a pointer to it. If something is already a pointer (like R or ptr in your function), you usually don't need to take the address of it again.

int *ptr;
*ptr = &R;
Here ptr is not initialized. It can point to whatever. Then you dereference it with * and assign it the address of R. That should not compile since &R is of type lc3_word_t** (pointer to pointer), while *ptr is of type int.
&ptr[0] = 1; is not legal either. Here you take the address of ptr[0] and try to assign 1 to it. That is illegal because &ptr[0] is an rvalue, not something that can be assigned to; you can think of it as trying to change the location of ptr[0], which is not possible.

Let's step through the code.
First you declare a pointer to int: int *ptr. By the way I like to write it like this int* ptr (with * next to int instead of ptr) to remind myself that pointer is part of the type, i.e. the type of ptr is pointer to int.
Next you assign the value pointed to by ptr to the address of R. * dereferences the pointer (gets the value pointed to) and & gives the address. This is your problem. You've mixed up the types. Assigning the address of R (lc3_word_t**) to *ptr (int) won't work.
Next is &ptr[0] = 1;. This doesn't make a whole lot of sense either. &ptr[0] is the address of the first element of ptr (as an array). I'm guessing you want just the value at the first address, that is ptr[0] or *ptr.

How can I get/set a struct member by offset

Ignoring padding/alignment issues and given the following struct, what is the best way to get and set the value of member_b without using the member name?
struct mystruct {
int member_a;
int member_b;
};
struct mystruct *s = malloc(sizeof(struct mystruct));
Put another way; How would you express the following in terms of pointers/offsets:
s->member_b = 3;
printf("%i",s->member_b);
My guess is to
calculate the offset by finding the sizeof the member_a (int)
cast the struct to a single word pointer type (char?)
create an int pointer and set the address (to *charpointer + offset?)
use my int pointer to set the memory contents
but I get a bit confused about casting to a char type, or whether something like memset is more appropriate, or whether I'm approaching this totally wrong.
Cheers for any help

The approach you've outlined is roughly correct, although you should use offsetof instead of attempting to figure out the offset on your own. I'm not sure why you mention memset -- it sets the contents of a block to a specified value, which seems quite unrelated to the question at hand.
Here's some code to demonstrate how it works:
#include <stdio.h>
#include <stdlib.h>
#include <stddef.h>
typedef struct x {
int member_a;
int member_b;
} x;
int main() {
x *s = malloc(sizeof(x));
char *base;
size_t offset;
int *b;
// initialize both members to known values
s->member_a = 1;
s->member_b = 2;
// get base address
base = (char *)s;
// and the offset to member_b
offset = offsetof(x, member_b);
// Compute address of member_b
b = (int *)(base+offset);
// write to member_b via our pointer
*b = 10;
// print out via name, to show it was changed to new value.
printf("%d\n", s->member_b);
return 0;
}

The full technique:
Get the offset using offsetof:
b_offset = offsetof(struct mystruct, member_b);
Get the address of your structure as a char * pointer.
char *sc = (char *)s;
Add the offset to the structure address, cast the result to a pointer to the appropriate type, and dereference:
*(int *)(sc + b_offset)

Ignoring padding and alignment, as you said...
If the elements you're pointing to are entirely of a single type, as in your example, you can just cast the structure pointer to a pointer to that type and treat it as an array (s is already a pointer, so cast it directly rather than taking its address):
printf("%i", ((int *)s)[1]);

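It's possible to calculate the offset using the struct type and a null pointer as the reference address,
e.g. "&(((type *)0)->field)"
Example:
#include <stdio.h>
#include <stddef.h>
struct my_struct {
int x;
int y;
int *m;
int *h;
};
int main(void)
{
// cast to size_t rather than int: a pointer value may not fit in an int
printf("offset %zu\n", (size_t) &(((struct my_struct*)0)->h));
return 0;
}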

In this particular example, you can address it by *((int *) ((char *) s + sizeof(int))). I'm not sure why you want that, so I'm assuming didactic purposes; the explanation follows.
The bit of code translates as: take the address s and treat it as a pointer to char. To that address, add sizeof(int) char-sized chunks; you get a new address. Treat that address as a pointer to int and take the value it points to.
Note that writing *(s + sizeof(int)) would instead refer to the struct located sizeof(int) whole struct mystruct-sized chunks past s, because pointer arithmetic on s moves in units of sizeof(struct mystruct).
Edit: as per Andrey's comment, using offsetof:
*((int *) ((char *) s + offsetof(struct mystruct, member_b)))
Edit 2: I replaced all bytes with chars as sizeof(char) is guaranteed to be 1.

It sounds from your comments that what you're really doing is packing and unpacking a bunch of disparate data types into a single block of memory. While you can get away with doing that with direct pointer casts, as most of the other answers have suggested:
void set_int(void *block, size_t offset, int val)
{
char *p = block;
*(int *)(p + offset) = val;
}
int get_int(void *block, size_t offset)
{
char *p = block;
return *(int *)(p + offset);
}
The problem is that this is non-portable. There's no general way to ensure that the types are stored within your block with the correct alignment, and some architectures simply cannot do loads or stores to unaligned addresses. In the special case where the layout of your block is defined by a declared structure, it will be OK, because the struct layout will include the necessary padding to ensure the right alignment. However since you can't access the members by name, it sounds like this isn't actually what you're doing.
To do this portably, you need to use memcpy:
void set_int(void *block, size_t offset, int val)
{
char *p = block;
memcpy(p + offset, &val, sizeof val);
}
int get_int(void *block, size_t offset)
{
char *p = block;
int val;
memcpy(&val, p + offset, sizeof val);
return val;
}
(similar for the other types).
