Confused about the definition of pte_t and __pte(x) in C

typedef struct { unsigned long pte; } pte_t;
#define __pte(x) ((pte_t) { (x) } )
Why not use 'typedef unsigned long pte_t' directly?
Why is '{ }' used here? It looks weird.
I know that without it gcc reports an error, but how does it work?

I don't know the library, but I would bet that the intention is to prevent implicit conversion between pte_t and integral types.
I mean, a typedef is just an alias for a type, so:
typedef unsigned long pte_t;
pte_t x = 3; //ok
char y = x; //ok
But a struct is a new type, so:
typedef struct { unsigned long pte; } pte_t;
pte_t x = 3; //error!
char y = x; //error!
Then provide a few functions, macros or whatever to get/set the pte_t internal field, and done.
UPDATE: Ok, I've found it. The library is the Linux kernel, and just next to it there is:
#define pte_val(x) ((x).pte)
to access the value.
But looking over it a bit more, under conditionally compiled code, there is this alternative definition:
typedef struct { unsigned long pte_low, pte_high; } pte_t;
#define pte_val(x) ((x).pte_low | ((unsigned long long)(x).pte_high << 32))
See? Depending on the configuration, there may be just one field or several. It would be a mess to switch between a plain integer type and a struct depending on the configuration, so it is always defined as a struct.
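To see the whole idea in one place, here is a minimal self-contained sketch (CONFIG_PAE_DEMO is an illustrative switch, not the kernel's real config option):
#include <stdio.h>

#ifdef CONFIG_PAE_DEMO
typedef struct { unsigned long pte_low, pte_high; } pte_t;
#define pte_val(x) ((x).pte_low | ((unsigned long long)(x).pte_high << 32))
#else
typedef struct { unsigned long pte; } pte_t;
#define pte_val(x) ((unsigned long long)(x).pte)
#endif
/* Compound literal: the first member gets x in either layout. */
#define __pte(x) ((pte_t) { (x) })

int main(void)
{
    pte_t p = __pte(0x1000UL);
    /* pte_t q = 0x1000UL;  -- would not compile: no implicit conversion */
    printf("%llx\n", pte_val(p));
    return 0;
}
Either way, pte_val() and __pte() keep working unchanged.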


Casting pointer type based on integer size (C99)

How do you (if possible) define a type by an integer size? For example, if I wanted to make a type that was 3 bytes long, how could I accomplish something like this? (I am aware this is incorrect.)
typedef int24_t 3;
I am trying to write a generalized function which takes a character string parameter containing multiple numbers, and stores the numbers in a pointer, passed as another parameter.
However, I want to make it so you can pass a numerical parameter which determines how big the variable type storing the numbers will be: i.e. if it were 1 the type could be char, if it were 4 the type could be int, etc.
I am aware that it is possible to store the number in a temporary fixed-size variable and then copy only the relevant bytes to the pointer depending on the requested size, but I want the code to be portable, and I don't want to be messing around with endianness, as I've had trouble with that in the past.
Thanks
You can use a struct; it's not elegant, but it sounds like what you're looking for.
Note that you must set the struct packing/alignment to 1 byte, and you're limited to 64 bits.
typedef struct Int24 {
int value : 24;
} Int24;
typedef struct UInt24 {
unsigned value : 24;
} UInt24;
typedef struct Int48 {
long long value : 48;
} Int48;
With C++ templates:
template<int bytes> struct Int {
long long value : bytes * 8;
};
typedef Int<1> Int8;
typedef Int<6> Int48;
With a macro:
#define DECL_INT(n) \
typedef struct _Int##n { \
long long value : n; \
} Int##n
// declaration of type
DECL_INT(48); // produces Int48
// usage
Int48 i48;
struct smallerint
{
unsigned int integer:24; //(24=24bits=3bytes)
};
typedef struct smallerint int24_t;
If I understand what you're trying to do, and you want a nicely generalized function, I would use a linked list of bytes. Maybe you should have a look at a bigint implementation.
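If fixed-width slots are enough, here is a small sketch of the generalized store the question describes; store_le and parse_numbers are my own illustrative names, and size must not exceed sizeof(unsigned long long):
#include <stddef.h>
#include <stdlib.h>

/* Store `value` into `size` bytes at `dst`, least significant byte first.
   Using shifts (rather than memcpy) keeps the stored layout independent
   of the host's endianness. */
static void store_le(unsigned char *dst, unsigned long long value, size_t size)
{
    size_t i;
    for (i = 0; i < size; i++)
        dst[i] = (unsigned char)(value >> (8 * i));
}

/* Parse whitespace-separated numbers from `str` and pack each one into a
   `size`-byte slot at `out`; returns how many numbers were stored. */
static size_t parse_numbers(const char *str, unsigned char *out, size_t size)
{
    size_t count = 0;
    char *end;
    for (;;) {
        unsigned long long v = strtoull(str, &end, 10);
        if (end == str)
            break;                      /* no more digits */
        store_le(out + count * size, v, size);
        count++;
        str = end;
    }
    return count;
}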

Using memcpy for structs

I have a problem when using memcpy on a struct.
Consider the following struct
struct HEADER
{
unsigned int preamble;
unsigned char length;
unsigned char control;
unsigned int destination;
unsigned int source;
unsigned int crc;
};
If I use memcpy to copy data from a receive buffer to this struct, the copy is OK, but if I redeclare the struct as follows:
struct HEADER
{
unsigned int preamble;
unsigned char length;
struct CONTROL control;
unsigned int destination;
unsigned int source;
unsigned int crc;
};
struct CONTROL
{
unsigned dir : 1;
unsigned prm : 1;
unsigned fcb : 1;
unsigned fcb : 1;
unsigned function_code : 4;
};
Now if I use the same memcpy code as before, the first two variables (preamble and length) are copied OK. The control is totally messed up, and the last three variables are shifted one up, i.e. crc = 0, source = crc, destination = source...
Anyone got any good suggestions for me?
Do you know that the format in the receive buffer is correct when you add the control in the middle?
Anyway, your problem is that bitfields are the wrong tool here: you can't depend on the layout in memory being anything in particular, least of all the exact same one you've chosen for the serialized form.
It's almost never a good idea to try to directly copy structures to/from external storage; you need proper serialization. The compiler can add padding and alignment between the fields of a structure, and using bitfields makes it even worse. Don't do this.
Implement proper serialization/deserialization functions:
unsigned char * header_serialize(unsigned char *put, const struct HEADER *h);
unsigned char * header_deserialize(unsigned char *get, struct HEADER *h);
These go through the structure and read/write as many bytes as needed (possibly for each field):
static unsigned char * uint32_serialize(unsigned char *put, uint32_t x)
{
*put++ = (x >> 24) & 255;
*put++ = (x >> 16) & 255;
*put++ = (x >> 8) & 255;
*put++ = x & 255;
return put;
}
unsigned char * header_serialize(unsigned char *put, const struct HEADER *h)
{
const uint8_t ctrl_serialized = (h->control.dir << 7) |
(h->control.prm << 6) |
(h->control.fcb << 5) |
(h->control.function_code);
put = uint32_serialize(put, h->preamble);
*put++ = h->length;
*put++ = ctrl_serialized;
put = uint32_serialize(put, h->destination);
put = uint32_serialize(put, h->source);
put = uint32_serialize(put, h->crc);
return put;
}
Note how this needs to be explicit about the endianness of the serialized data, which is something you always should care about (I used big-endian). It also explicitly builds a single uint8_t version of the control fields, assuming the struct version was used.
Also note that there's a typo in your CONTROL declaration; fcb occurs twice.
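For symmetry, here is a sketch of the matching deserializer (same big-endian layout; it assumes the struct CONTROL bit positions used above, with the duplicate fcb dropped, and needs <stdint.h> just like the serializer):
#include <stdint.h>

static uint32_t uint32_deserialize(unsigned char **get)
{
    uint32_t x = ((uint32_t)(*get)[0] << 24) |
                 ((uint32_t)(*get)[1] << 16) |
                 ((uint32_t)(*get)[2] << 8)  |
                  (uint32_t)(*get)[3];
    *get += 4;
    return x;
}

unsigned char * header_deserialize(unsigned char *get, struct HEADER *h)
{
    unsigned char ctrl;

    h->preamble = uint32_deserialize(&get);
    h->length = *get++;
    ctrl = *get++;
    /* Unpack the control byte into the bitfield members. */
    h->control.dir = (ctrl >> 7) & 1;
    h->control.prm = (ctrl >> 6) & 1;
    h->control.fcb = (ctrl >> 5) & 1;
    h->control.function_code = ctrl & 0x0f;
    h->destination = uint32_deserialize(&get);
    h->source = uint32_deserialize(&get);
    h->crc = uint32_deserialize(&get);
    return get;
}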
Using struct CONTROL control; instead of unsigned char control; leads to a different alignment inside the struct and so filling it with memcpy() produces a different result.
Memcpy copies the values of bytes from the location pointed to by source directly to the memory block pointed to by destination.
The underlying types of the objects pointed to by the source and destination pointers are irrelevant to this function; the result is a binary copy of the data.
So if there is any structure padding, you will get messed-up results.
Check sizeof(struct CONTROL) -- I think it would be 2 or 4 depending on the machine. Since you are using unsigned bitfields (and unsigned is shorthand for unsigned int), the whole structure (struct CONTROL) takes at least the size of an unsigned int -- i.e. 2 or 4 bytes.
Using unsigned char control, on the other hand, takes 1 byte for this field. So there will definitely be a mismatch starting with the control variable.
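A quick sketch to see the size for yourself (the exact value is implementation-defined; the duplicate fcb from the question is renamed fcv here so the snippet compiles):
#include <stdio.h>

struct CONTROL {
    unsigned dir : 1;
    unsigned prm : 1;
    unsigned fcb : 1;
    unsigned fcv : 1;
    unsigned function_code : 4;
};

int main(void)
{
    /* Typically prints 4 where unsigned int is 4 bytes -- not the 1 byte
       the wire format uses. */
    printf("sizeof(struct CONTROL) = %zu\n", sizeof(struct CONTROL));
    return 0;
}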
Try rewriting struct CONTROL as below:
struct CONTROL
{
unsigned char dir : 1;
unsigned char prm : 1;
unsigned char fcb : 1;
unsigned char fcv : 1; /* the question's duplicate fcb, presumably meant to be a separate bit */
unsigned char function_code : 4;
}
The clean way would be to use a union, as in:
struct HEADER
{
unsigned int preamble;
unsigned char length;
union {
unsigned char all;
struct CONTROL control;
} uni;
unsigned int destination;
unsigned int source;
unsigned int crc;
};
The user of the struct can then choose how to access the thing.
struct HEADER thing = {... };
if (thing.uni.control.dir) { ...}
or
#if ( !FULL_MOON ) /* Update: stacking of bits within a word appears to depend on the phase of the moon */
if (thing.uni.all & 1) { ... }
#else
if (thing.uni.all & 0x80) { ... }
#endif
Note: this construct does not solve endianness issues; those will need explicit conversions.
Note 2: you'll have to check the bit-endianness (bitfield ordering) of your compiler, too.
Also note that bitfields are not very useful here, especially if the data goes over the wire and the code is expected to run on different platforms with different alignment and/or endianness. Plain unsigned char or uint8_t plus some bitmasking yields much cleaner code. For example, check the IP stack in the BSD or Linux kernels.
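For illustration, a minimal sketch of that mask-based style (bit positions chosen to match the example above, dir in the top bit; the macro names are my own):
#include <stdint.h>
#include <stdio.h>

#define CTRL_DIR 0x80u /* bit 7 */
#define CTRL_PRM 0x40u /* bit 6 */
#define CTRL_FCB 0x20u /* bit 5 */
#define CTRL_FC  0x0fu /* low four bits: function code */

/* Decode a raw control byte exactly as it came off the wire. */
static void print_control(uint8_t ctrl)
{
    printf("dir=%d prm=%d fcb=%d fc=%u\n",
           (ctrl & CTRL_DIR) != 0,
           (ctrl & CTRL_PRM) != 0,
           (ctrl & CTRL_FCB) != 0,
           ctrl & CTRL_FC);
}
No padding, no bitfield-order surprises: the layout is exactly what the masks say it is.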

Is this code declaring a type?

#ifdef _CPU_8BIT_INT_
// unsigned 8 bit
typedef unsigned _CPU_8BIT_INT_ u8 ;
What is the code above doing? Is it trying to declare a type (type as in integer, char, etc.)?
Yes, typedef is used to declare a type. From now on
u8 x;
/* Equivalent to. */
unsigned _CPU_8BIT_INT_ x;
Are you sure you're not better off using uint8_t from stdint.h?
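If it is available, a minimal sketch using the fixed-width typedefs instead:
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint8_t x = 200; /* exactly 8 bits on any platform that provides uint8_t */
    printf("%u\n", (unsigned)x);
    return 0;
}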

Union to unsigned long long int cast

I have a union as follows:
typedef unsigned long GT_U32;
typedef unsigned short GT_U16;
typedef unsigned char GT_U8;
typedef union
{
GT_U8 c[8];
GT_U16 s[4];
GT_U32 l[2];
} GT_U64;
I want to cast this union into the following:
typedef unsigned long long int UINT64;
The casting function I wrote is as follows:
UINT64 gtu64_to_uint64_cast(GT_U64 number_u)
{
UINT64 casted_number = 0;
casted_number = number_u.l[0];
casted_number = casted_number << 32;
casted_number = casted_number | number_u.l[1];
return casted_number;
}
This function uses the l member to perform the shifting and bitwise OR. What will happen if the s or c members of the union are used to set its values?
I am not sure this function will always cast the values correctly. I suspect it has something to do with the byte ordering of long and short. Can anybody help?
Full example program is listed below.
#include <stdio.h>
typedef unsigned long GT_U32;
typedef unsigned short GT_U16;
typedef unsigned char GT_U8;
typedef union
{
GT_U8 c[8];
GT_U16 s[4];
GT_U32 l[2];
} GT_U64;
typedef unsigned long long int UINT64;
UINT64 gtu64_to_uint64_cast(GT_U64 number_u)
{
UINT64 casted_number = 0;
casted_number = number_u.l[0];
casted_number = casted_number << 32;
casted_number = casted_number | number_u.l[1];
return casted_number;
}
int main()
{
UINT64 left;
GT_U64 right;
right.s[0] = 0x00;
right.s[1] = 0x00;
right.s[2] = 0x00;
right.s[3] = 0x01;
left = gtu64_to_uint64_cast(right);
printf ("%llu\n", left);
return 0;
}
That's really ugly and implementation-dependent; just use memcpy, e.g.:
#include <assert.h>
#include <string.h>

UINT64 gtu64_to_uint64_cast(GT_U64 number_u)
{
UINT64 casted_number;
assert(sizeof(casted_number) == sizeof(number_u));
memcpy(&casted_number, &number_u, sizeof(number_u));
return casted_number;
}
First of all, please use the typedefs from "stdint.h" for this purpose. You are making plenty of assumptions about the widths of the integer types; don't do that.
What will happen if the s or c members of the union are used to set its values?
Reading a member of a union other than the one last written to may cause undefined behavior if there are padding bytes or padding bits. The only exception is unsigned char, which may always be used to access the individual bytes. So access through c is fine. Access through s may (in very unlikely circumstances) cause undefined behavior.
And there is no such thing as a "correct" cast in your case. It simply depends on how you want to interpret an array of small numbers as one big number. One possible interpretation for the task is the one you gave.
This code should work independently of padding, endianness, union access rules and implicit integer promotions.
uint64_t gtu64_to_uint64_cast (const GT_U64* number_u)
{
uint64_t casted_number = 0;
uint8_t i;
for(i=0; i<8; i++)
{
casted_number |= (uint64_t) number_u->c[i] << i*8U;
}
return casted_number;
}
If you can't change the declaration of the union to include an explicit 64-bit field, perhaps you can just wrap it? Like this:
UINT64 convert(const GT_U64 *value)
{
union {
GT_U64 in;
UINT64 out;
} tmp;
tmp.in = *value;
return tmp.out;
}
This does violate the rule that says you may only read from the union member last written to, so maybe it'll set your hair on fire. I think it is quite safe, though; I don't see a case where a union like this would include padding, but of course I could be wrong.
I mainly wanted to include this because not being able to change the declaration of the "input" union doesn't mean you can't do almost the same thing by wrapping it.
Probably an easier way to cast is to use a union with a long long member:
typedef unsigned long long int UINT64;
typedef unsigned long GT_U32;
typedef unsigned short GT_U16;
typedef unsigned char GT_U8;
typedef union
{
GT_U8 c[8];
GT_U16 s[4];
GT_U32 l[2];
UINT64 ll;
} GT_U64;
Then, simply accessing ll will get the 64-bit value without an explicit cast. Do verify that all the members really are eight bytes on your platform (unsigned long, for example, is 64 bits on some platforms), or the members will not overlap the way you expect.
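A usage sketch, reusing the GT_U64 defined just above (which byte lands in c[0] depends on the host's endianness):
#include <stdio.h>

int main(void)
{
    GT_U64 v;                      /* the union defined above, with ll added */
    v.ll = 0x0123456789abcdefULL;
    /* A little-endian host prints "ef"; a big-endian host prints "01". */
    printf("%02x\n", v.c[0]);
    return 0;
}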
You don't specify what "cast the values correctly" means.
This code casts in the simplest possible way, but it will give different results depending on your system's endianness (and, strictly speaking, reading through a cast pointer like this runs afoul of the strict-aliasing rules).
UINT64 gtu64_to_uint64_cast(GT_U64 number_u) {
assert(sizeof(UINT64) == sizeof(GT_U64));
return *(UINT64 *) &number_u;
}

Using unions to simplify casts

I realize that what I am trying to do isn't safe, but I am just doing some testing and image processing, so my focus here is on speed.
Right now this code gives me the corresponding bytes for a 32-bit pixel value type.
struct Pixel {
unsigned char b,g,r,a;
};
I wanted to check if I have a pixel that is under a certain value (e.g. r, g, b <= 0x10). I figured I would just test the bitwise AND of the pixel's bits with 0x00E0E0E0 (I could have the wrong endianness here) to get the dark pixels.
Rather than using this ugly mess (*((uint32_t*)&pixel)) to get the 32-bit unsigned int value, I figured there should be a way to set things up so I can just use pixel.i, while keeping the ability to reference the green byte as pixel.g.
Can I do this? This won't work:
struct Pixel {
unsigned char b,g,r,a;
};
union Pixel_u {
Pixel p;
uint32_t bits;
};
I would need to edit my existing code to say pixel.p.g to get the green color byte. The same happens if I do this:
union Pixel {
unsigned char c[4];
uint32_t bits;
};
This would work too, but I'd still need to change everything to index into c, which is a bit ugly, though I could make it work with a macro if I really needed to.
Both gcc and MSVC allow 'anonymous' structs/unions, which might solve your problem. For example:
union Pixel {
struct {unsigned char b,g,r,a;};
uint32_t bits; // use 'unsigned' for MSVC
} foo;
foo.b = 1;
foo.g = 2;
foo.r = 3;
foo.a = 4;
printf ("%08x\n", foo.bits);
gives (on Intel):
04030201
This requires changing all your declarations of struct Pixel to union Pixel in your original code. But that can be fixed via:
struct Pixel {
union {
struct {unsigned char b,g,r,a;};
uint32_t bits;
};
} foo;
foo.b = 1;
foo.g = 2;
foo.r = 3;
foo.a = 4;
printf ("%08x\n", foo.bits);
This also works with VC9, though with 'warning C4201: nonstandard extension used: nameless struct/union'. Microsoft uses this trick, for example, in:
typedef union {
struct {
DWORD LowPart;
LONG HighPart;
}; // <-- nameless member!
struct {
DWORD LowPart;
LONG HighPart;
} u;
LONGLONG QuadPart;
} LARGE_INTEGER;
but they 'cheat' by suppressing the unwanted warning.
While the above examples are OK, if you use this technique too often you'll quickly end up with unmaintainable code. Five suggestions to make things clearer:
(1) Change the name bits to something uglier like union_bits, to clearly indicate something out-of-the-ordinary.
(2) Go back to the ugly cast the OP rejected, but hide its ugliness in a macro or in an inline function, as in:
#define BITS(x) (*(uint32_t*)&(x))
But this would break the strict aliasing rules. (See, for example, AndreyT's answer: C99 strict aliasing rules in C++ (GCC).)
(3) Keep the original definition of Pixel, but do a better cast:
struct Pixel {unsigned char b,g,r,a;} foo;
// ...
printf("%08x\n", ((union {struct Pixel dummy; uint32_t bits;})foo).bits);
(4) But that is even uglier. You can fix this by a typedef:
struct Pixel {unsigned char b,g,r,a;} foo;
typedef union {struct Pixel dummy; uint32_t bits;} CastPixelToBits;
// ...
printf("%08x\n", ((CastPixelToBits)foo).bits); // not VC9
With VC9, or with gcc using -pedantic, you'll need (don't use this with gcc--see note at end):
printf("%08x\n", ((CastPixelToBits*)&foo)->bits); // VC9 (not gcc)
(5) A macro may perhaps be preferred. In gcc, you can define a union cast to any given type very neatly:
#define CAST(type, x) (((union {typeof(x) src; type dst;})(x)).dst) // gcc
// ...
printf("%08x\n", CAST(uint32_t, foo));
With VC9 and other compilers, there is no typeof, and pointers may be needed (don't use this with gcc--see note at end):
#define CAST(typeof_x, type, x) (((union {typeof_x src; type dst;}*)&(x))->dst)
Self-documenting, and safer. And not too ugly. All these suggestions are likely to compile to identical code, so efficiency is not an issue. See also my related answer: How to format a function pointer?
Warning about gcc: The GCC Manual version 4.3.4 (but not version 4.3.0) states that this last example, with &(x), is undefined behaviour. See http://davmac.wordpress.com/2010/01/08/gcc-strict-aliasing-c99/ and http://gcc.gnu.org/ml/gcc/2010-01/msg00013.html.
The problem with a structure inside a union is that the compiler is allowed to add padding bytes between the members of a structure (or class), except between bit fields.
Given:
struct Pixel
{
unsigned char red;
unsigned char green;
unsigned char blue;
unsigned char alpha;
};
This could be laid out as:
Offset Field
------ -----
0x00 red
0x04 green
0x08 blue
0x0C alpha
So the size of the structure would be 16 bytes.
When put in a union, the compiler takes the largest member to determine the space. Also, as you can see, a 32-bit integer would not line up with the individual fields.
I suggest creating functions to combine and extract pixels from a 32-bit quantity. You can declare them inline too:
void Int_To_Pixel(const unsigned int word,
Pixel& p)
{
p.red = (word & 0xff000000) >> 24;
p.green = (word & 0x00ff0000) >> 16;
p.blue = (word & 0x0000ff00) >> 8;
p.alpha = (word & 0x000000ff);
return;
}
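For symmetry, a sketch of the inverse, packing the fields back into a word with the same red-is-MSB convention (Pixel_To_Int is my own name, not from the question):
/* Combine the four channel bytes into one 32-bit word, red in the MSB. */
unsigned int Pixel_To_Int(const struct Pixel *p)
{
    return ((unsigned int)p->red   << 24) |
           ((unsigned int)p->green << 16) |
           ((unsigned int)p->blue  << 8)  |
            (unsigned int)p->alpha;
}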
This is a lot more reliable than a struct inside a union, including one with bit fields:
struct Pixel_Bit_Fields
{
unsigned int red : 8;
unsigned int green : 8;
unsigned int blue : 8;
unsigned int alpha : 8;
};
There is still some mystery when reading this as to whether red or alpha is the MSB. With bit manipulation, there is no question when reading the code.
Just my suggestions, YMMV.
Why not turn the ugly mess into an inline routine? Something like:
inline uint32_t pixel32(const Pixel& p)
{
return *reinterpret_cast<const uint32_t*>(&p);
}
You could also provide this routine as a member function of Pixel, called i(), which would let you access the value as pixel.i() if you preferred. (I'd lean toward separating the functionality from the data structure when invariants need not be enforced.)
