I need to cast a pointer to a long long, and would prefer to do it in a way that gcc doesn't complain, on either 32- or 64-bit architectures, about converting a pointer to an integer of a different size. And before anyone asks: yes, I know what I'm doing, and I know what I'm casting to. My specific use case is wanting to send a stack trace (the pointers themselves being the subject here) over the network when an application error occurs, so there is no guarantee the sender and receiver will have the same word size.

I've therefore built a struct holding the message data with, among other entries, an array of "unsigned long long" values (guaranteed to be at least 64 bits) to hold the pointers. And yes, I know "long long" is not guaranteed to be exactly 64 bits, but all the compilers I'm using for both source and destination implement it as 64 bits. Because the header (and source) with the struct will be used on both architectures, "uintptr_t" doesn't seem like a workable solution for the struct member itself (because, according to its definition in stdint.h, its size is architecture-dependent).
I thought about getting tricky with anonymous unions, but this feels a little too hackish to me...I'm hoping there's a way with some double-cast magic or something to do this in C99 (since anonymous unions weren't standard until C11).
EDIT:
typedef struct error_msg_t {
    int msgid;
    int len;
    pid_t pid;
    int si_code;
    int signum;
    int err_no;   /* "errno" collides with the standard errno macro */
    unsigned long long stack[20];
    char err_msg[];
} error_msg_t;
...
void **stack;
...
msg.msgid = ERROR_MSG;
msg.len = sizeof(error_msg_t) + strlen(err_msg) + 1;
msg.pid = getpid();
...
for (i=0; i<stack_depth; i++)
    msg.stack[i] = (unsigned long long)stack[i];
Warning (on a 32-bit compile) about casting to integer of different size occurs on the last line.
Probably your best bet is to double cast to spell it out to the compiler what you want to do (as suggested by Max).
I would recommend wrapping it up into a macro so that the code intention is clear from the macro name.
#define PTR_TO_UINT64(x) (uint64_t)(uintptr_t)(x)
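With <stdint.h> included, the loop from the question then becomes warning-free on both word sizes (a sketch using the question's own variable names):

for (i = 0; i < stack_depth; i++)
    msg.stack[i] = PTR_TO_UINT64(stack[i]);  /* inner cast matches pointer width, outer cast only widens */

The inner cast to uintptr_t always matches the pointer size exactly, so the outer cast to uint64_t can only widen, never truncate.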
I've inherited some old code that assumes that an int can store values from -2^31 to 2^31-1, that overflow just wraps around, and that the sign bit is the high-order bit. In other words, that code should have used uint32_t, except that it didn't. I would like to fix this code to use uint32_t.
The difficulty is that the code is distributed as source code and I'm not allowed to change the external interface. I have a function that works on an array of int. What it does internally is its own business, but int is exposed in the interface. In a nutshell, the interface is:
struct data {
    int a[10];
};
void frobnicate(struct data *param);
I'd like to change int a[10] to uint32_t a[10], but I'm not allowed to modify the definition of struct data.
I can make the code work on uint32_t or unsigned internally:
struct internal_data {
    unsigned a[10];
};

void frobnicate(struct data *param) {
    struct internal_data *internal = (struct internal_data *)param;
    // ... work with internal ...
}
However this is not actually correct C since it's casting between pointers to different types.
Is there a way I can add compile-time guards so that, for the rare people for whom int isn't “old-school” 32-bit, the code doesn't build? If int is less than 32 bits, the code has never worked anyway. For the vast majority of users, the code should build, and in a way that tells the compiler not to do “weird” things with overflowing int calculations.
I distribute the source code and people may use it with whatever compiler they choose, so compiler-specific tricks are not relevant.
I'm at least going to add
#if INT_MIN + 1 != -0x7fffffff
#error "This code only works with 32-bit two's complement int"
#endif
With this guard, what can go wrong with the cast above? Is there a reliable way of manipulating the int array as if its elements were unsigned, without copying the array?
In summary:
I can't change the function prototype. It references an array of int.
The code should manipulate the array (not a copy of the array) as an array of unsigned.
The code should build on platforms where it worked before (at least with sufficiently friendly compilers) and should not build on platforms where it can't work.
I have no control over which compiler is used and with which settings.
However this is not actually correct C since it's casting between pointers to different types.
Indeed, you cannot do such casts, because the two structure types are not compatible. You could however use a work-around such as this:
typedef union
{
    struct data data;
    uint32_t array[10];
} internal_t;
...
void frobnicate(struct data *param) {
    internal_t *internal = (internal_t *)param;
    ...
Another option if you can change the original struct declaration but not its member names, is to use C11 anonymous union:
struct data {
    union {
        int a[10];
        uint32_t u32[10];
    };
};
This means that user code accessing foo.a won't break. But you'd need C11 or newer.
Alternatively, you could use a uint32_t* to access the int[10] directly. This is also well-defined, provided uint32_t is a typedef for unsigned int (which it is on virtually every implementation where int is 32 bits), because the unsigned equivalent of the effective type int is allowed to alias it.
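A minimal sketch of that approach (it assumes uint32_t is a typedef for unsigned int on the target, which the 32-bit guards discussed below enforce in practice):

#include <stdint.h>

void frobnicate(struct data *param) {
    /* Unsigned view of the same array: reading an int object through the
       corresponding unsigned type is permitted by the aliasing rules (6.5p7). */
    uint32_t *a = (uint32_t *)param->a;
    a[0] += 0x80000000u;  /* wraps around; no undefined behaviour */
}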
Is there a way I can add compile-time guards so that, for the rare people for whom int isn't “old-school” 32-bit, the code doesn't build?
The obvious choice is static_assert(sizeof(int) == 4, "int is not 32 bits"); but again this requires C11. If backwards compatibility with older C is needed, you can invent some dirty "poor man's static assert":
#define stat_assert(expr) typedef int dummy_t[(expr) ? 1 : -1];
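A hypothetical usage (if you need several asserts in the same scope, work __LINE__ into the typedef name to avoid redefinitions):

stat_assert(sizeof(int) == 4)  /* array size becomes -1 and the build fails when the condition is false */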
#if INT_MIN != -0x80000000
#error "This code requires 32-bit two's complement int"
#endif
Depending on how picky you are, this isn't 100% portable. int could in theory be 64 bits, but probably portability to such fictional systems isn't desired either.
If you don't want to drag limits.h around, you could also write the check as

stat_assert((unsigned int)-1 == 0xFFFFFFFF)

It's a better check regardless, since it doesn't have any hidden implicit promotion gotchas (and casts aren't allowed inside #if directives anyway, so this form has to live in regular code rather than the preprocessor) - note that -0x80000000 is always 100% equivalent to 0x80000000 on a 32 bit system.
So I have a structure with mixed data types like the one below, and I want to make sure that sizeof(struct a) is a multiple of the word size on both 32-bit and 64-bit x86. How can I do that? Thank you.
struct a {
    vaddr_t v1;
    size_t v2;
    unsigned short v3;
    struct b *v4;
    struct a *v5;
    int v6;
    pthread_mutex_t lock;
};
With basic types, like ints or shorts, you could achieve this by explicitly using int32_t or int16_t instead of int or short. For other types like size_t or pointers, it gets more complicated. Your best bet is to use type attributes (http://gcc.gnu.org/onlinedocs/gcc-3.2/gcc/Type-Attributes.html).
If all that matters is the structure alignment in memory, align the structure itself, not its members.
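For example, with the gcc type attributes linked above (a sketch; the member list is trimmed, and the 8-byte figure is an assumption - pick the word size you care about):

/* gcc/clang: raise the whole struct's alignment to 8 bytes. */
struct a {
    size_t v2;
    unsigned short v3;
    int v6;
} __attribute__((aligned(8)));

Since sizeof is always a multiple of a type's alignment, raising the struct's alignment also rounds its size up accordingly.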
I may be stepping a bit outside of my comfort zone here; however, there seems to be a variation on malloc called memalign:
void *memalign(size_t alignment, size_t size);
The memalign() function allocates size bytes on a specified
alignment boundary and returns a pointer to the allocated
block. The value of the returned address is guaranteed to be
an even multiple of alignment. The value of alignment must
be a power of two and must be greater than or equal to the
size of a word.
That may or may not exist on all platforms, but this one seems to be very common:
int posix_memalign(void **memptr, size_t alignment, size_t size);
Seen at:
http://pubs.opengroup.org/onlinepubs/009695399/functions/posix_memalign.html
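Usage would look something like this (a minimal sketch for the question's struct; the alignment argument must be a power of two and a multiple of sizeof(void *)):

#include <stdlib.h>

void *buf = NULL;
/* Request a 64-byte-aligned block big enough for one struct a. */
if (posix_memalign(&buf, 64, sizeof(struct a)) != 0) {
    /* allocation failed; posix_memalign returns an error number, not -1 */
}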
Now, I would think that the fixed-width datatypes proposed by the ISO/JTC1/SC22/WG14 C committee's working draft for the revision of the then-current ISO C standard, ISO/IEC 9899:1990 Programming Languages - C (I read that in a manpage), would be stable across platforms and architectures.
So if you look into the lower levels of your struct members, hopefully they are based on things like int32_t or uint32_t for an integer. There are also POSIX types such as:
/*
* POSIX Extensions
*/
typedef unsigned char uchar_t;
typedef unsigned short ushort_t;
typedef unsigned int uint_t;
typedef unsigned long ulong_t;
So I am thinking that it may be possible to construct your structs using only such well-defined datatypes, with the end result that the structs are always the same size regardless of where or how you compile your code.
Please bear in mind, I am making a stretch here and hoping that someone else may clarify and perhaps correct my thinking.
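To make the idea concrete, here is a hypothetical struct built only from <stdint.h> fixed-width types (note that member padding can still vary between ABIs unless you also control alignment):

#include <stdint.h>

struct wire_record {
    uint32_t id;       /* exactly 32 bits on every conforming platform */
    uint32_t length;
    uint64_t address;  /* exactly 64 bits on every conforming platform */
};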
What's the logic behind calls like getpid() returning a value of type pid_t instead of an unsigned int? Or int? How does this help?
I'm guessing this has to do with portability? Guaranteeing that pid_t is the same size across different platforms that may have different sizes of ints etc.?
I think it's the opposite: making the program portable across platforms, regardless of whether, e.g., a PID is 16 or 32 bits (or even longer).
The reason is to allow nasty historical implementations to still be conformant. Suppose your historical implementation had (rather common):
short getpid(void);
Of course modern systems want pids to be at least 32-bit, but if the standard mandated:
int getpid(void);
then all historical implementations that had used short would become non-conformant. This was deemed unacceptable, so pid_t was created and the implementation was allowed to define pid_t whichever way it prefers.
Note that you are by no means obligated to use pid_t in your own code as long as you use a type that's large enough to store any pid (intmax_t for example would work just fine). The only reason pid_t needs to exist is for the standard to define getpid, waitpid, etc. in terms of it.
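For example, a sketch of handling a pid in your own code with a standard wide type:

#include <inttypes.h>
#include <stdio.h>
#include <unistd.h>

intmax_t pid = (intmax_t)getpid();  /* intmax_t can hold any pid_t value */
printf("pid = %jd\n", pid);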
On different platforms and operating systems, different types (pid_t for example) might be 32 bits (unsigned int) on a 32-bit machine or 64 bits (unsigned long) on a 64-bit machine. Or, for some other reason, an operating system might choose to have a different size. Additionally, it makes it clear when reading the code that this variable represents an "object", rather than just an arbitrary number.
The purpose of it is to make pid_t, or any other type of the sort, platform-independent, such that it works properly regardless of how it's actually implemented. This practice is used for any type that needs to be platform-independent, such as:
pid_t: Has to be large enough to store a PID on the system you're coding for. Maps to int as far as I'm aware, although I'm not the most familiar with the GNU C library.
size_t: An unsigned type able to store the result of the sizeof operator. Generally equal in size to the word size of the system you're coding for.
int16_t (intX_t): Has to be exactly 16 bits, regardless of platform. It won't be defined on platforms that don't use 2^n-bit bytes (typically 8- or 16-bit) and don't provide a means of accessing exactly 16 bits out of a larger type (e.g., the PDP-10's "bytes", which could be any number of contiguous bits out of a 36-bit word, and thus could be exactly 16 bits) - in other words, on platforms without a 16-bit two's complement integer type (such as a 36-bit system). Generally maps to short on modern computers, although it may be an int on older ones.
int_least32_t (int_leastX_t): Has to be the smallest size possible that can store at least 32 bits, such as 36 bits on a 36-bit or 72-bit system. Generally maps to int on modern computers, although it may be a long on older ones.
int_fastX_t: Has to be the fastest type possible that can store at least X bits. Generally, it's the system's word size if X <= word_size (or sometimes char for int_fast8_t), and it acts like int_leastX_t if X > word_size.
intmax_t: Has to be the maximum integer width supported by the system. Generally, it'll be at least 64 bits on modern systems, although some systems may support extended types larger than long long (and if so, intmax_t is required to be the largest of those types).
And more...
Mechanically, it allows the compiler's installer to typedef the appropriate type to the identifier (whether a standard type or an awkwardly-named internal type) behind the scenes, whether by creating appropriate header files, coding it into the compiler's executable, or some other method. For example, on a 32-bit system, Microsoft Visual Studio will implement the intX_t and similar types as follows (note: comments added by me):
// Signed ints of exactly X bits.
typedef signed char int8_t;
typedef short int16_t;
typedef int int32_t;
// Unsigned ints of exactly X bits.
typedef unsigned char uint8_t;
typedef unsigned short uint16_t;
typedef unsigned int uint32_t;
// Signed ints of at least X bits.
typedef signed char int_least8_t;
typedef short int_least16_t;
typedef int int_least32_t;
// Unsigned ints of at least X bits.
typedef unsigned char uint_least8_t;
typedef unsigned short uint_least16_t;
typedef unsigned int uint_least32_t;
// Speed-optimised signed ints of at least X bits.
// Note that int_fast16_t and int_fast32_t are both 32 bits, as a 32-bit processor will generally operate on a full word faster than a half-word.
typedef char int_fast8_t;
typedef int int_fast16_t;
typedef int int_fast32_t;
// Speed-optimised unsigned ints of at least X bits.
typedef unsigned char uint_fast8_t;
typedef unsigned int uint_fast16_t;
typedef unsigned int uint_fast32_t;
// 64-bit types, defined via MSVC's internal _Longlong/_ULonglong.
typedef _Longlong int64_t;
typedef _ULonglong uint64_t;
typedef _Longlong int_least64_t;
typedef _ULonglong uint_least64_t;
typedef _Longlong int_fast64_t;
typedef _ULonglong uint_fast64_t;
On a 64-bit system, however, they may not necessarily be implemented the same way, and I can guarantee that they won't be implemented the same way on an archaic 16-bit system, assuming you can find a version of MSVS compatible with one.
Overall, it allows code to work properly regardless of the specifics of your implementation, and to meet the same requirements on any standards-compatible system (e.g. pid_t can be guaranteed to be large enough to hold any valid PID on the system in question, no matter what system you're coding for). It also prevents you from having to know the nitty-gritty, and from having to look up internal names you may not be familiar with. In short, it makes sure your code works the same regardless of whether pid_t (or any other similar typedef) is implemented as an int, a short, a long, a long long, or even a __Did_you_really_just_dare_me_to_eat_my_left_shoe__, so you don't have to.
Additionally, it serves as a form of documentation, allowing you to tell what a given variable is for at a glance. Consider the following:
int a, b;
....
if (a > b) {
// Nothing wrong here, right? They're both ints.
}
Now, let's try that again:
size_t a;
pid_t b;
...
if (a > b) {
// Why are we comparing sizes to PIDs? We probably messed up somewhere.
}
If used as such, it can help you locate potentially problematic segments of code before anything breaks, and can make troubleshooting much easier than it would otherwise be.
Each process has a unique process ID. Calling getpid() returns the ID of the current process. Knowing the pid is especially important when using fork(), because fork() returns 0 in the child copy and a non-zero value (the child's pid) in the parent copy, respectively.
An example: suppose we have the following C program:
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>
int main(int argc, char *argv[])
{
    printf("I am %d\n", (int) getpid());
    pid_t pid = fork();
    printf("fork returned: %d\n", (int) pid);
    if (pid < 0) {
        perror("fork failed");
    }
    if (pid == 0) {
        printf("This is a child with pid %d\n", (int) getpid());
    } else if (pid > 0) {
        printf("This is a parent with pid %d\n", (int) getpid());
    }
    return 0;
}
If you run it, fork returns 0 in the child and a value greater than zero (the child's pid) in the parent.
One thing to point out, in most answers I saw something along the lines of
"using pid_t makes the code work on different systems", which is not necessarily true.
I believe the precise wording should be: it makes the code 'compile' on different systems.
For instance, compiling the code on a system that uses a 32-bit pid_t will produce a binary that will probably break if run on another system that uses a 64-bit pid_t.
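So whenever a pid value crosses a process or machine boundary, it should be converted to a fixed-width type first - a minimal sketch:

#include <stdint.h>
#include <unistd.h>

int64_t wire_pid = (int64_t)getpid();  /* same width no matter how pid_t is defined */
/* ... transmit or store wire_pid; the receiver must check it fits its own pid_t ... */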
First off, this is not a dupe of:
Is it safe to cast an int to void pointer and back to int again?
The difference in the questions is this: I'm only using the void* to store the int, but I never actually use it as a void*.
So the question really comes down to this:
Is a void * guaranteed to be at least as wide as an int?
I can't use intptr_t because I'm using C89 / ANSI C.
EDIT
In stdint.h from C99 (the gcc version) I see the following:
/* Types for `void *' pointers. */
#if __WORDSIZE == 64
# ifndef __intptr_t_defined
typedef long int intptr_t;
# define __intptr_t_defined
# endif
typedef unsigned long int uintptr_t;
#else
# ifndef __intptr_t_defined
typedef int intptr_t;
# define __intptr_t_defined
# endif
typedef unsigned int uintptr_t;
#endif
Could I possibly just jerry rig something similar and expect it to work? It would seem that the casting should work as all intptr_t is is a typedef to an integral type...
No, this is not guaranteed to be safe.
The C99 standard has this to say (section 6.3.2.3):
An integer may be converted to any pointer type. Except as previously specified, the
result is implementation-defined, might not be correctly aligned, might not point to an
entity of the referenced type, and might be a trap representation.
Any pointer type may be converted to an integer type. Except as previously specified, the
result is implementation-defined. If the result cannot be represented in the integer type,
the behavior is undefined. The result need not be in the range of values of any integer
type.
I'm pretty confident that pre-C99 won't be any different.
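To make that concrete, here is the round-trip in question (compiles as C89, but the comments mark where the standard stops making promises):

int n = 42;
void *p = (void *)n;  /* implementation-defined result */
int m = (int)p;       /* implementation-defined; m == 42 on typical flat-memory platforms, but not guaranteed */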
FreeRTOS stores timer IDs in Timer_t as void* pvTimerID. So when using this as a storage space, and NOT a pointer to something, it is necessary to cast it to something that can be used as an array index, for instance.
So, to read the id stored as a void*:
void *pvId = pxTimer->pvTimerID;
int index = (int)(size_t)pvId;  /* implementation-defined, but fine for values stored as small integers */
There is a C FAQ: Can I temporarily stuff an integer into a pointer, or vice versa?
The cleanest answer is: no, this is not safe, avoid it and get on with it. But POSIX requires this to be possible. So it is safe on POSIX-compliant systems.
Here's a portable alternative.
static const char dummy[MAX_VALUE_NEEDED];
void *p = (void *)(dummy + i);    /* int -> void *; cast removes the const qualifier */
int j = (const char *)p - dummy;  /* void * -> int; j == i */
Of course it can waste prohibitively large amounts of virtual address space if you need large values, but if you just want to pass small integers, it's a 100% portable and clean way to store integer values in void *.
How can I disable structure padding in C without using pragma?
There is no standard way of doing this. The standard states that padding may be done at the discretion of the implementation. From C99 6.7.2.1 Structure and union specifiers, paragraph 12:
Each non-bit-field member of a structure or union object is aligned in an implementation-defined manner appropriate to its type.
Having said that, there's a couple of things you can try.
The first you've already discounted: using #pragma to try to convince the compiler not to pad. In any case, this is not portable. Nor are any of the other implementation-specific ways, but you should check into them, as it may be necessary if you really need this capability.
The second is to order your fields in largest to smallest order such as all the long long types followed by the long ones, then all the int, short and finally char types. This will usually work since it's most often the larger types that have the more stringent alignment requirements. Again, not portable.
Thirdly, you can define your types as char arrays and cast the addresses to ensure there's no padding. But keep in mind that some architectures will slow down if the variables aren't aligned properly and still others will fail miserably (such as raising a BUS error and terminating your process, for example).
That last one bears some further explanation. Say you have a structure with the fields in the following order:
char C; // one byte
int I; // two bytes
long L; // four bytes
With padding, you may end up with the following bytes:
CxxxIIxxLLLL
where x is the padding.
However, if you define your structure as:
typedef struct { char c[7]; } myType;
myType n;
you get:
CCCCCCC
You can then do something like:
int *pInt = (int *) &n.c[1];
long *pLng = (long *) &n.c[3];
int myInt = *pInt;
long myLong = *pLng;
to give you:
CIILLLL
Again, unfortunately, not portable.
All these "solutions" rely on you having intimate knowledge of your compiler and the underlying data types.
Other than compiler-specific options like #pragma pack, you cannot: the C Standard leaves padding up to the implementation.
You can always attempt to reduce padding by declaring the smallest types last in the structure as in:
struct _foo {
    int a;    /* No padding between a & b */
    short b;
} foo;

struct _bar {
    short b;  /* 2 bytes of padding between b & a */
    int a;
} bar;
(Note: this assumes an implementation where int requires 4-byte alignment.)
On some architectures, the CPU itself will object if asked to work on misaligned data. To work around this, the compiler could generate multiple aligned read or write instructions, shift and split or merge the various bits. You could reasonably expect it to be 5 or 10 times slower than aligned data handling. But, the Standard doesn't require compilers to be prepared to do that... given the performance cost, it's just not in enough demand. The compilers that support explicit control over padding provide their own pragmas precisely because pragmas are reserved for non-Standard functionality.
If you must work with unpadded data, consider writing your own access routines. You might want to experiment with types that require less alignment (e.g. use char/int8_t), but it's still possible that, say, struct sizes will be rounded up to multiples of 4, which would frustrate packing structures tightly; in that case you'll need to implement your own access for the entire memory region.
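A sketch of such an access routine; memcpy lets the compiler emit whatever instruction sequence the target needs for a misaligned load:

#include <stdint.h>
#include <string.h>

/* Read a 32-bit value from an arbitrarily aligned byte position. */
static uint32_t read_u32(const unsigned char *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);  /* well-defined for any alignment */
    return v;
}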
Either you let the compiler do the padding, or you tell it not to using #pragma, or you just use a bunch of bytes, like a char array, and build all your data by yourself (shifting and assembling bytes). This is really inefficient, but you control the layout of the bytes exactly. I've done that sometimes when preparing network packets by hand, but in most cases it's a bad idea, even though it's standard-conforming.
If you really want structs without padding: Define replacement datatypes for short, int, long, etc., using structs or classes that are composed only of 8 bit bytes. Then compose your higher level structs using the replacement datatypes.
C++'s operator overloading is very convenient, but you could achieve the same effect in C using structs instead of classes. The cast and assignment implementations below assume the CPU can handle misaligned 32-bit integers, but other implementations could accommodate stricter CPUs.
Here is sample code:
#include <stdint.h>
#include <stdio.h>
class packable_int { public:
int8_t b[4];
operator int32_t () const { return *(int32_t*) b; }
void operator = ( int32_t n ) { *(int32_t*) b = n; }
};
struct SA {
int8_t c;
int32_t n;
} sa;
struct SB {
int8_t c;
packable_int n;
} sb;
int main () {
printf ( "sizeof sa %d\n", sizeof sa ); // sizeof sa 8
printf ( "sizeof sb %d\n", sizeof sb ); // sizeof sb 5
return 0;
}
We can disable structure padding in a C program using either of the following methods:
-> use __attribute__((packed)) at the end of the structure definition, e.g.:
struct node {
    char x;
    short y;
    int z;
} __attribute__((packed));
-> use the -fpack-struct flag when compiling the C code, e.g.:
$ gcc -fpack-struct -o tmp tmp.c
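A quick check that the packing took effect (the sizes are an assumption for a typical 32-bit ABI, where the unpacked struct pads to 8 bytes):

#include <stdio.h>

struct node {
    char x;
    short y;
    int z;
} __attribute__((packed));

int main(void)
{
    printf("%zu\n", sizeof(struct node));  /* prints 7 (1+2+4) instead of the padded 8 */
    return 0;
}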