I am working on a software implementation of OpenGL, and OpenGL seems to require that I return 32-bit pointers. To save time, on 64-bit systems I am putting these into a C equivalent of a map so that I can retrieve the 64-bit pointers from 32-bit pseudo-addresses. On 32-bit systems, however, this would be a needless hassle, so there I should just use the pointer verbatim.
Here is basically what I want to do in my shared header:
#if <64-bit>
#include <search.h>
#include <stdint.h>
#include <stdlib.h>
extern void *pointerTable;
typedef struct {
    uint32_t key;
    void *value;
} intPtrMap;
static inline int compar(const void *l, const void *r) {
    const intPtrMap *lm = l;
    const intPtrMap *rm = r;
    /* Compare instead of subtracting so the result cannot overflow */
    return (lm->key > rm->key) - (lm->key < rm->key);
}
static inline uint32_t allocate(size_t size) {
    void *result = malloc(size);
    intPtrMap *a = malloc(sizeof(intPtrMap));
    /* The low 32 bits of the address serve as the pseudo-address */
    a->key = (uint32_t)(uintptr_t) result;
    a->value = result;
    tsearch(a, &pointerTable, compar);
    return a->key;
}
static inline void *getPtr(uint32_t ptr) {
    intPtrMap find_a = { .key = ptr };
    void *r = tfind(&find_a, &pointerTable, compar);
    return r ? (*(intPtrMap **)r)->value : NULL;
}
#else
#include <stdint.h>
#include <stdlib.h>
static inline uint32_t allocate(size_t size) {
    return (uint32_t)(uintptr_t) malloc(size);
}
static inline void *getPtr(uint32_t ptr) {
    return (void *)(uintptr_t) ptr;
}
#endif
Any suggestions on how to do the first if?
How to determine pointer size preprocessor C (?)
To determine pointer size in a portable fashion is tricky.
Various pointers sizes
It is not uncommon to have a pointer to a function wider than a pointer to an object or void*.
Pointers to int, char, or struct can be of different sizes, although that is rare.
So let us reduce the task to determining the size of a void * pointer.
Pre-processor math
PP math is limited, so code needs to be careful. Let us stay with integer math.
(u)intptr_t
The optional types (u)intptr_t, which are very commonly available, are useful here. They allow conversion of a void * to an integer and then to an equivalent void*.
Although the size of the integer type may differ from that of the pointer type, I assert that this is rare and detectable with _Static_assert from C11.
The following will handle many C11 platforms and offers useful ideas toward a general solution.
#include <stdint.h>
// C11
_Static_assert(sizeof (void*) == sizeof (uintptr_t),
"TBD code needed to determine pointer size");
// C99 or later
#if UINTPTR_MAX == 0xFFFF
#define PTR16
#elif UINTPTR_MAX == 0xFFFFFFFF
#define PTR32
#elif UINTPTR_MAX == 0xFFFFFFFFFFFFFFFFu
#define PTR64
#else
#error TBD pointer size
#endif
[Edit 2021]
Using the technique from "Is there any way to compute the width of an integer type at compile-time?", code could use the below, at compile time, to find the width of uintptr_t.
/* Number of bits in inttype_MAX, or in any (1<<k)-1 where 0 <= k < 2040 */
#define IMAX_BITS(m) ((m)/((m)%255+1) / 255%255*8 + 7-86/((m)%255+12))
#define UINTPTR_MAX_BITWIDTH IMAX_BITS(UINTPTR_MAX)
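As a minimal sketch of how that width could drive conditional compilation, the macro can be used directly in an #if. The names PTR_BITS_LABEL and pointer_bits are hypothetical, and the static assertion assumes uintptr_t has no padding bits.

```c
#include <limits.h>
#include <stdint.h>

/* Number of bits in inttype_MAX, or in any (1<<k)-1 where 0 <= k < 2040 */
#define IMAX_BITS(m) ((m)/((m)%255+1) / 255%255*8 + 7-86/((m)%255+12))
#define UINTPTR_MAX_BITWIDTH IMAX_BITS(UINTPTR_MAX)

/* The width is an integer constant expression, so #if can test it. */
#if UINTPTR_MAX_BITWIDTH == 64
#define PTR_BITS_LABEL "64-bit"   /* hypothetical name */
#elif UINTPTR_MAX_BITWIDTH == 32
#define PTR_BITS_LABEL "32-bit"
#else
#define PTR_BITS_LABEL "other"
#endif

/* Assumption: uintptr_t has no padding bits, so width == size in bits. */
_Static_assert(UINTPTR_MAX_BITWIDTH == sizeof(uintptr_t) * CHAR_BIT,
               "uintptr_t width does not match its size");

/* Hypothetical helper exposing the computed width at run time. */
static unsigned pointer_bits(void) {
    return (unsigned)UINTPTR_MAX_BITWIDTH;
}
```

Unlike a chain of equality tests against fixed constants, this also yields a usable number on platforms with unusual widths.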
Using other questions on StackOverflow and also a solution from somebody on Discord, I have cobbled together this solution:
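#include <limits.h>
#include <stdint.h>
#if defined(_WIN32) || defined(_WIN64)
#if defined(_WIN64)
#define PTR64
#else
#define PTR32
#endif
#elif defined(__GNUC__)
#if defined(__LP64__) || defined(__x86_64__) || defined(__ppc64__) || defined(__aarch64__)
#define PTR64
#else
#define PTR32
#endif
#elif UINTPTR_MAX > UINT_MAX
#define PTR64
#else
#define PTR32
#endif
This should determine 64-bit or 32-bit pointer usage in the preprocessor for common targets. The GCC branch checks __LP64__ as well as specific architecture macros, since testing only for __x86_64__ and __ppc64__ would classify other 64-bit targets such as AArch64 as 32-bit. The final fallback assumes that int is 32 bits wide.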
uint32_t fail_count = 0;
...
if(is_failed)
if(fail_count < UINT32_MAX - 1 )
++fail_count;
It works fine, but this code is fragile. Tomorrow, I may change the type of fail_count from uint32_t to int32_t and forget to update UINT32_MAX.
Is there any way to assert fail_count is a uint32_t at the function where I have written my ifs?
P.S. 1- I know it is easy in C++ but I'm looking for a C way.
P.S. 2- I would rather use two asserts than rely on compiler warnings. Checking the size via sizeof should work, but is there any way to tell whether the type is unsigned?
As of C11, you can use a generic selection macro to produce a result based on the type of an expression. You can use the result in a static assertion:
#include <stdint.h>
#define IS_UINT32(N) _Generic((N), \
uint32_t: 1, \
default: 0 \
)
int main(void) {
uint32_t fail_count = 0;
_Static_assert(IS_UINT32(fail_count), "wrong type for fail_count");
}
You could of course use the result in a regular assert(), but _Static_assert will fail at compile time.
A better approach could be dispatching the comparison based on type, again using generic selection:
#include <limits.h>
#include <stdint.h>
#define UNDER_LIMIT(N) ((N) < _Generic((N), \
int32_t: INT32_MAX, \
uint32_t: UINT32_MAX \
) -1)
int main(void) {
int32_t fail_count = 0;
if (UNDER_LIMIT(fail_count)) {
++fail_count;
}
}
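For the signedness part of the question (P.S. 2), one possible sketch that needs no generic selection is to compare a converted -1 against zero. It takes a type rather than a variable, so it assumes the counter is declared through a typedef; fail_count_t and IS_UNSIGNED_TYPE are hypothetical names.

```c
#include <stdint.h>

/* (T)-1 wraps to the maximum value when T is unsigned and stays
   negative when T is signed, so the comparison tells them apart. */
#define IS_UNSIGNED_TYPE(T) ((T)-1 > (T)0)

/* Hypothetical typedef for the counter from the question. */
typedef uint32_t fail_count_t;

/* The two asserts asked for: one for signedness, one for size. */
_Static_assert(IS_UNSIGNED_TYPE(fail_count_t),
               "fail_count_t must be unsigned");
_Static_assert(sizeof(fail_count_t) == sizeof(uint32_t),
               "fail_count_t must be as wide as uint32_t");
```

Changing the typedef to int32_t would make the first assertion fail at compile time.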
As you mentioned GCC, you can use a compiler extension to accomplish this in case you are not using C11:
First write a macro that emulates the C++ is_same. And then call it with the types you want to compare.
A minimal example for your particular case:
#include<assert.h>
#define is_same(a, b) \
static_assert(__builtin_types_compatible_p(typeof(a), typeof(b)), #a " is not unsigned int")
int main()
{
int fail_count = 0;
is_same(fail_count, unsigned int);
}
The compiler asserts:
<source>: In function 'main':
<source>:4:3: error: static assertion failed: "fail_count is not unsigned int"
static_assert(__builtin_types_compatible_p(typeof(a), typeof(b)), #a " is not unsigned int")
^~~~~~~~~~~~~
<source>:9:5: note: in expansion of macro 'is_same'
is_same(fail_count, unsigned int);
^~~~~~~
See Demo
What about a low-tech solution that works even with K&R C and any compiler past and present?
Place the right comment in the right place:
/*
* If this type is changed, don't forget to change the macro in
* if (fail_count < UINT32_MAX - 1) below (or file foobar.c)
*/
uint32_t fail_count = 0;
With a proper encapsulation this should refer to exactly one place in the code.
Don't tell me you increment the fail count in many places. And if you do, what
about a
#define FAIL_COUNT_MAX UINT32_MAX
right next to the declaration? That's more proper and clean code anyway.
No need for all the assertion magic and rocket sciencery :-)
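A minimal sketch of that encapsulation, assuming a single increment site, could look like this. The names fail_count_t and record_failure are hypothetical.

```c
#include <stdint.h>

/* Hypothetical names: the type and its limit sit side by side, so it is
   hard to change one without noticing the other. */
typedef uint32_t fail_count_t;
#define FAIL_COUNT_MAX UINT32_MAX

static fail_count_t fail_count = 0;

/* The only place that increments the counter; it stops counting at
   FAIL_COUNT_MAX - 1, as in the original condition. */
static void record_failure(void) {
    if (fail_count < FAIL_COUNT_MAX - 1)
        ++fail_count;
}
```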
After some work on the generic vector I asked about in this question, I would like to know whether there is any way of checking that the library is instantiated only once per type.
Here is what the current header file looks like:
#ifndef VECTOR_GENERIC_MACROS
#define VECTOR_GENERIC_MACROS
#ifndef TOKENPASTE
#define TOKENPASTE(a, b) a ## b
#endif
#define vector_t(T) TOKENPASTE(vector_t_, T)
#define vector_at(T) TOKENPASTE(*vector_at_, T)
#define vector_init(T) TOKENPASTE(vector_init_, T)
#define vector_destroy(T) TOKENPASTE(vector_destroy_, T)
#define vector_new(T) TOKENPASTE(vector_new_, T)
#define vector_delete(T) TOKENPASTE(vector_delete_, T)
#define vector_push_back(T) TOKENPASTE(vector_push_back_, T)
#define vector_pop_back(T) TOKENPASTE(vector_pop_back_, T)
#define vector_resize(T) TOKENPASTE(vector_resize_, T)
#define vector_reserve(T) TOKENPASTE(vector_reserve_, T)
#endif
typedef struct {
size_t size;
size_t capacity;
TYPE *data;
} vector_t(TYPE);
inline TYPE vector_at(TYPE)(vector_t(TYPE) *vector, size_t pos);
void vector_init(TYPE)(vector_t(TYPE) *vector, size_t size);
void vector_destroy(TYPE)(vector_t(TYPE) *vector);
inline TYPE *vector_new(TYPE)(size_t size);
inline void vector_delete(TYPE)(vector_t(TYPE) *vector);
void vector_push_back(TYPE)(vector_t(TYPE) *vector, TYPE value);
inline TYPE vector_pop_back(TYPE)(vector_t(TYPE) *vector);
inline void vector_resize(TYPE)(vector_t(TYPE) *vector, size_t size);
void vector_reserve(TYPE)(vector_t(TYPE) *vector, size_t size);
The header can then be included along with the source definitions:
#include <stdio.h>
#define TYPE int
#include "vector.h"
#include "vector.def"
#undef TYPE
int main()
{
vector_t(int) myVectorInt;
vector_init(int)(&myVectorInt, 0);
for (int i = 0; i < 10; ++i)
vector_push_back(int)(&myVectorInt, i);
for (int i = 0; i < myVectorInt.size; ++i)
printf("%d ", ++vector_at(int)(&myVectorInt, i));
vector_destroy(int)(&myVectorInt);
return 0;
}
I would like to make sure that the content below that last endif is only included once per TYPE.
Obviously, #ifdef VECTOR_INSTANCE(TYPE) does not work, so I'm really out of ideas...
It's a tough question; I was interested in the same matter when I asked a question similar to yours some time ago.
My conclusion is that if you are going to use vectors (or, to name them more accurately, dynamic arrays) of many different types, then it is wasteful to have all those functions vector_##TYPE##_reserve(), vector_##TYPE##_resize(), etc. defined multiple times.
Instead, it is more efficient and cleaner to define those functions only once in a separate .c file, taking the size of your type as an extra argument, and to prototype them in a separate .h file. The same .h file would then provide macros that wrap those functions for your own types, so that you never have to pass the size explicitly.
For example, your vector.h header would contain the following :
/* Declare functions operating on a generic vector type */
void vector_generic_resize(void *vector, size_t size, size_t data_size);
void vector_generic_push_back(void *vector, void *value, size_t data_size);
void *vector_generic_pop_back(void *vector, size_t data_size);
void vector_generic_init(void *vector, size_t size, size_t data_size);
void vector_generic_destroy(void *vector) ; // I don't think data_size is needed here
/* Taken from the example in the question */
#define VECTOR_DEFINITION(type)\
typedef struct {\
    size_t size;\
    size_t capacity;\
    type *data;\
} vector_ ## type ## _t;
/* Declare wrapper macros to make the above functions usable.
   'vector' is a pointer to the vector, as in the usage example. */
/* First the easy ones */
#define vector_resize(vector, size) vector_generic_resize(vector, size, sizeof((vector)->data[0]))
#define vector_init(vector, size) vector_generic_init(vector, size, sizeof((vector)->data[0]))
/* Type has to be given as an argument for the cast operator */
#define vector_pop_back(vector, type) (*(type*)(vector_generic_pop_back(vector, sizeof((vector)->data[0]))))
/* This one is tricky: if 'value' is a constant, its address cannot be taken.
   I don't know if any better workaround is possible. */
#define vector_push_const(vector, type, value) \
{ \
    type temp = value; \
    vector_generic_push_back(vector, &temp, sizeof((vector)->data[0]));\
}
/* Equivalent macro, but for pushing variables instead of constants */
#define vector_push_var(vector, value) vector_generic_push_back(vector, &(value), sizeof((vector)->data[0]))
/* Super-macro redirecting to the constant or variable version of push_back depending on the argument count */
#define GET_MACRO(_1,_2,_3,NAME,...) NAME
#define vector_push_back(...) GET_MACRO(__VA_ARGS__, vector_push_const, vector_push_var)(__VA_ARGS__)
/* This macro isn't really needed, but is provided for homogeneity */
#define vector_destroy(vector) vector_generic_destroy(vector)
The functions can then be used as you said in the example you linked, with the significant exception of vector_generic_push_back where unfortunately the type has to be specified each time as an extra macro argument.
So with this solution
You only have to do VECTOR_DEFINITION() within the .c file, avoiding the risk of declaring it with the same type twice
The vector library exists only once in the binary
The macros can be used elegantly without using the type in their names, except for the pop back macro and the push literal macro.
If this is a problem, you could make the push literal macro always use long long; it will work but potentially lose efficiency.
Similarly, you could make the pop_back() macro and the vector_generic_pop_back() function return nothing, as pop_back() does in C++, so that with both of those tricks you never need to use the type name explicitly in the macros.
As a reference, the main function you posted in the example that is linked in your question has to be adapted like that :
#include <stdio.h>
#include <stdlib.h>
#include "vector.h"
typedef unsigned int uint;
typedef char* str;
VECTOR_DEFINITION(uint)
VECTOR_DEFINITION(str)
int main()
{
vector_uint_t vector;
vector_init(&vector, 10);
for (unsigned int i = 0; i < vector.size; ++i)
vector.data[i] = i;
for (unsigned int i = 0; i < 10; ++i)
vector_push_back(&vector, i);
/* When pushing back a constant, we *have* to specify the type */
/* It is OK to use C keywords as they are suppressed by the preprocessor */
vector_push_back(&vector, unsigned int, 12);
for (unsigned int i = 0; i < vector.size; ++i)
printf("%u ", vector.data[i]);
printf("\n");
vector_destroy(&vector);
vector_str_t sentence;
vector_init(&sentence, 0);
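/* A string literal is an array, so &"Hello" would be the address of its
   characters rather than of a char * variable; pass the type so that a
   temporary pointer is pushed */
vector_push_back(&sentence, str, "Hello");
vector_push_back(&sentence, str, "World!");
vector_push_back(&sentence, str, "How");
vector_push_back(&sentence, str, "are");
vector_push_back(&sentence, str, "you?");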
for (unsigned int i = 0; i < sentence.size; ++i)
printf("%s ", sentence.data[i]);
printf("\n");
vector_destroy(&sentence);
return 0;
}
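The answer declares the size-based generic functions but does not show their bodies. As a rough sketch of the idea, and not the answer's own implementation, the variant below stores the element size in the vector itself instead of passing it on each call. All names (gvec, gvec_push_back, GVEC_INIT, GVEC_AT) are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical untyped vector: one struct and one set of functions
   serve every element type. */
typedef struct {
    size_t size;       /* number of elements in use */
    size_t capacity;   /* number of elements allocated */
    size_t elem_size;  /* sizeof one element */
    void *data;
} gvec;

#define GVEC_INIT(type) { 0, 0, sizeof(type), NULL }

/* Typed access: the caller names the element type, as with pop_back above. */
#define GVEC_AT(v, type, i) (((type *)(v)->data)[i])

/* Copies elem_size bytes from *value to the end of the vector,
   doubling the capacity when full. Returns 0 on success, -1 on failure. */
static int gvec_push_back(gvec *v, const void *value) {
    if (v->size == v->capacity) {
        size_t new_cap = v->capacity ? v->capacity * 2 : 4;
        void *p = realloc(v->data, new_cap * v->elem_size);
        if (p == NULL)
            return -1;
        v->data = p;
        v->capacity = new_cap;
    }
    memcpy((char *)v->data + v->size * v->elem_size, value, v->elem_size);
    v->size++;
    return 0;
}
```

Because the function copies bytes from the address it is given, the caller must pass the address of an object of the element type, which is why constants need a temporary.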
suggest:
remove the prototypes from the vector.h file.
place the prototypes at the top of the vector.def file.
remove the typedef struct from the vector.h file
place the typedef struct before the prototypes in the vector.def file.
then multiple #include statements for the vector.h file will have no bad effects.
Then use the following in each source file that is to use these vector types:
#include "vector.h"
#define TYPE int
#include "vector.def"
#undef TYPE
#define TYPE char
#include "vector.def"
#undef TYPE
#undef TYPE
... etc
BTW:
There is no library involved, so I'm a bit confused by the reference
to 'library' in the question
It may be worthwhile to also prefix the 'static' modifier
to each of the function definitions so the definitions are
not visible across source files
It may be worthwhile to use parens around the parameters to TOKENPASTE
so modifiers like 'static' and/or 'const'
can be prefixed to the function names.
I have a question that came up while reading the KVM-QEMU source code.
ram_size = sz;
if (ram_size != sz) {
fprintf(stderr, "qemu: ram size too large\n");
exit(1);
}
sz is uint64_t and ram_size is ram_addr_t, which is also defined as uint64_t.
What is the above code used for (checking for integer overflow)? How does it work?
Thanks.
If you look closer at the definition of ram_addr_t, you'll see something like:
/* address in the RAM (different from a physical address) */
#if defined(CONFIG_XEN_BACKEND)
typedef uint64_t ram_addr_t;
# define RAM_ADDR_MAX UINT64_MAX
# define RAM_ADDR_FMT "%" PRIx64
#else
typedef uintptr_t ram_addr_t;
# define RAM_ADDR_MAX UINTPTR_MAX
# define RAM_ADDR_FMT "%" PRIxPTR
#endif
Note that it might also be a uintptr_t, which might not be a 64-bit type. In that case, there'd be a problem with that assignment if sz were greater than UINTPTR_MAX.
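The round-trip check can be sketched in isolation. Here narrow_addr_t is a hypothetical stand-in for a 32-bit uintptr_t, so that the truncation is visible on any host.

```c
#include <stdint.h>

/* Hypothetical stand-in for ram_addr_t on a host where uintptr_t is 32 bits. */
typedef uint32_t narrow_addr_t;

/* Assign, then compare: if the assignment dropped high bits,
   the stored value no longer equals the original. */
static int fits_in_narrow(uint64_t sz) {
    narrow_addr_t ram_size = (narrow_addr_t)sz;
    return ram_size == sz;
}
```

For example, fits_in_narrow(0x1000) is true, while a size of 1 << 32 is truncated to 0 and the comparison fails.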
I'm trying to initialize a global-scoped const variable with a value that is byte-swapped appropriately.
#include <stdio.h>
#include <stdint.h>
#include <arpa/inet.h>
const uint32_t a = ntohl(0x11223344);
int main(int argc, char const *argv[])
{
printf("%08x\n", a);
return 0;
}
Using gcc this fails with "error: initializer element is not constant". Yeah, okay, so the gcc header has ntohl() defined as a function or as "do {...} while (0)" or something similar that can't be evaluated at compile time. Bummer.
Is there anything I can do which will achieve the same end? I need to initialize the value for the appropriate endianness, and I want it to be a globally-scoped const. Is there any way to convince gcc to do this, short of rolling my own ntohl-like macro?
(BTW, I note that clang has ntohl() defined such that it can be evaluated at compile time. The above code sample works perfectly with clang. Unfortunately I don't get my pick of compilers.)
Section 6.7.8/4 of the standard reads
All the expressions in an initializer for an object that has static storage duration shall be constant expressions or string literals.
A call to ntohl is neither a constant expression nor a string literal. You can’t get there from here.
But global variables are bad anyway, and I suspect this may be a premature optimization. The easy fix is to use the expression directly in your code, which will have no effect at all on big-endian platforms, e.g.,
void foo(void)
{
const uint32_t a = ntohl(0x11223344);
/* ... */
}
Even better, use a preprocessor macro, as in
#define POTRZEBIE ntohl(0x11223344)
void bar(void)
{
const uint32_t a = POTRZEBIE;
/* ... */
}
On variables with automatic storage, the const qualifier means single assignment, so there is no problem with the above usage.
Initialize it in main() or use something like (assuming Linux):
#include <endian.h>
#if __BYTE_ORDER == __LITTLE_ENDIAN
const uint32_t a = 0x44332211;
#else
const uint32_t a = 0x11223344;
#endif
or perhaps
#include <endian.h>
#define A_ADDR 0x11223344
#if __BYTE_ORDER == __LITTLE_ENDIAN
const uint32_t a = __bswap_constant_32(A_ADDR);
#else
const uint32_t a = A_ADDR;
#endif
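If the glibc-specific header is not available, a hand-rolled constant macro is one more option. The sketch below assumes a compiler that predefines __BYTE_ORDER__ (GCC and Clang do); BSWAP32_CONST and NTOHL_CONST are hypothetical names.

```c
#include <stdint.h>

/* Byte swap built only from casts, masks and shifts, so the result is an
   integer constant expression usable in a static initializer. */
#define BSWAP32_CONST(x) \
    ((uint32_t)((((uint32_t)(x) & 0x000000FFu) << 24) | \
                (((uint32_t)(x) & 0x0000FF00u) <<  8) | \
                (((uint32_t)(x) & 0x00FF0000u) >>  8) | \
                (((uint32_t)(x) & 0xFF000000u) >> 24)))

/* Assumption: the compiler predefines __BYTE_ORDER__ (GCC, Clang). */
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
#define NTOHL_CONST(x) BSWAP32_CONST(x)
#elif defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
#define NTOHL_CONST(x) ((uint32_t)(x))
#else
#error "Unknown byte order; define NTOHL_CONST for this platform"
#endif

/* Accepted at file scope because the initializer is a constant expression. */
const uint32_t a = NTOHL_CONST(0x11223344);
```

On either byte order the bytes of a in memory are 11 22 33 44, which is what ntohl() would have produced.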
I'm trying to find a way to make an enum "unsigned".
enum{
x1 = 0,
x2,
x3
};
uint8_t x = x2; /* <--- PC-LINT MISRA-C 2004 will complain about mixing signed and unsigned here */
Of course, I can add a typecast to get rid of the error, but that is time-consuming and error-prone.
uint8_t x = (uint8_t)x2; /* This works, but is a lot of extra work over the course of 1000s of lines of code */
So, is there a way to make a specific enum unsigned that MISRA-C 2004 will like?
There is no standard C way to control the type chosen for an enum. You can do it in implementation specific ways sometimes, like by adding a value to the enumeration that forces the type to be unsigned:
enum {
x1,
x2,
x3,
giant_one_for_forcing_unsigned = 0x80000000
};
But that's not even standard C, either (since the value provided won't fit in an int). Unfortunately, you're pretty much out of luck. Here's the relevant bit from the standard:
6.7.2.2 Enumeration specifiers, paragraph 4
Each enumerated type shall be compatible with char, a signed integer type, or an unsigned integer type. The choice of type is implementation-defined, but shall be capable of representing the values of all the members of the enumeration. The enumerated type is incomplete until immediately after the } that terminates the list of enumerator declarations, and complete thereafter.
You might be better off using #define rather than enum to make your constants:
#define x1 0U
#define x2 1U
#define x3 2U
uint8_t x = x2;
There are several concerns here, where there is a slight potential for conversion bugs, which MISRA is trying to make you avoid:
Enum constants, that is x1 etc. in your example, are guaranteed to be of type int (1). But enum variables and the enumerated type itself are not guaranteed to be of that same type (2); if you are unlucky, the enumerated type is defined as a small integer type and is thereby subject to the integer promotion rules.
MISRA bans implicit conversions from large integer types to smaller ones, mainly to dodge unintentional truncation of values, but also to dodge various implicit promotion rules.
Your specific MISRA-compliance error actually comes from the latter concern above, violation of rule 10.3 (3).
You can either solve this by adding an explicit cast to the "underlying type" (intended type), in this case a cast to uint8_t. Or you can solve it by never using enums at all, replace them with #defines. That might sound very radical, but keep in mind that C has no type safety whatsoever, so there is no apparent benefit of using enums apart from perhaps readability.
It is somewhat common to replace enums in this manner:
#define FALSE 0
#define TRUE 1
typedef uint8_t BOOL;
(Though the purpose in this example is mainly to make the BOOL type portable, with a guarantee to be 8 bits and never 16 bits, as might happen in case it was an enum.)
References:
(1) C11 6.7.2.2/2:
"The expression that defines the value of an enumeration constant
shall be an integer constant expression that has a value representable
as an int."
(2) C11 6.7.2.2/4:
"Each enumerated type shall be compatible with char, a signed integer
type, or an unsigned integer type. The choice of type is
implementation-defined, but shall be capable of representing the
values of all the members of the enumeration."
(3) MISRA-c:2004 rule 10.3:
"The value of a complex expression of integer type may only be cast to
a type that is narrower and of the same signedness as the underlying
type of the expression."
Not only is there not a way in C90 to specify that an enum take on an unsigned type, but in C90:
An identifier declared as an enumeration constant has type int
This also applies to C99 (6.4.4.3). If you want an unsigned type, you're looking at a language extension.
The enumeration type may be something other than int, but the constants themselves must have int type.
You can force it to be unsigned by including a value large enough that it cannot fit in an int. Strictly speaking, this relies on the compiler accepting enumerator values outside the range of int, which C99 and C11 do not require. This is pretty simple for types at least as wide as int, but unsigned char/short is more complicated and requires compiler-specific packing. Of course, implementations could technically still represent UINT_MAX as an unsigned long long... not that I've ever seen that, though.
#include <stdio.h> //only included for printf example
#include <limits.h>
#include <stdint.h>
/** set up some helper macros **/
#ifdef _MSC_VER
#define PACK( ... ) __pragma( pack(push, 1) ) __VA_ARGS__ __pragma( pack(pop) )
#else /* for gcc, clang, icc and others */
#define PACK( ... ) __VA_ARGS__ __attribute__((__packed__))
#endif
#define _PASTE(x,y) x ## y
#define PASTE(x,y) _PASTE(x,y)
/* __LINE__ added for semi-unique names */
#define U_ENUM(n, ... ) \
enum n { __VA_ARGS__ , PASTE( U_DUMMY , __LINE__ ) = UINT_MAX }
#define UL_ENUM(n, ... ) \
enum n { __VA_ARGS__ , PASTE( UL_DUMMY , __LINE__ ) = ULONG_MAX }
#define SZ_ENUM(n, ... ) /* useful for array indices */ \
enum n { __VA_ARGS__ , PASTE( SZ_DUMMY , __LINE__ ) = SIZE_MAX }
#define ULL_ENUM(n, ... ) \
enum n { __VA_ARGS__ , PASTE( ULL_DUMMY , __LINE__ ) = ULLONG_MAX }
#define UC_ENUM(n,...) \
PACK(enum n { __VA_ARGS__ , PASTE( UC_DUMMY , __LINE__ ) = UCHAR_MAX })
#define US_ENUM(n,...) \
PACK(enum n { __VA_ARGS__ , PASTE( US_DUMMY , __LINE__ ) = USHRT_MAX })
Here is a check to see that it works as expected:
typedef UC_ENUM(,a) A_t;
typedef US_ENUM(,b) B_t;
typedef U_ENUM(,c) C_t;
typedef UL_ENUM(,d) D_t;
typedef ULL_ENUM(,e) E_t;
typedef SZ_ENUM(,f) F_t;
int main(void) {
printf("UC %zu,\nUS %zu,\nU %zu,\nUL %zu,\nULL %zu,\nSZ %zu,\n",sizeof(A_t),
sizeof(B_t),sizeof(C_t),sizeof(D_t),sizeof(E_t),sizeof(F_t));
return 0;
}
To look more like a standard enum statement, this differs slightly from the simpler version I use, which takes an additional named parameter for the last enumerator instead of relying on the __LINE__ hack. (That named enumerator is also useful for functions that return -1 on error, because -1 converts to U*_MAX.)
Here is how that version looks:
#define U_ENUM( n, err, ...) enum n { __VA_ARGS__ , err = UINT_MAX }
#define UL_ENUM(n, err, ...) enum n { __VA_ARGS__ , err = ULONG_MAX }
#define ULL_ENUM(n,err, ...) enum n { __VA_ARGS__ , err = ULLONG_MAX}
#define SZ_ENUM(n, err, ...) enum n { __VA_ARGS__ , err = SIZE_MAX }
#define UC_ENUM(n, err, ...) PACK(enum n { __VA_ARGS__ , err = UCHAR_MAX })
#define US_ENUM(n, err, ...) PACK(enum n { __VA_ARGS__ , err = USHRT_MAX })
Apart from packing enums in char or short for compactness, size_t enums are the most interesting, because they can be used as array indices without an extra MOV instruction.
typedef SZ_ENUM(message_t,MSG_LAST,MSG_HELLO,MSG_GOODBYE,MSG_BAD) message_t;
static const char *messages[]={"hello","goodbye","bad message"};
void printmsg(message_t msg){
if (msg > MSG_BAD) msg = MSG_BAD;
(void) puts(messages[msg]);
}
Note that if you use C++11 rather than C, you can write enum Foo : char { A, B, C }; or enum class Bar : size_t { X, Y, Z };
In addition to @Carl's answer, to get some of the benefits of an enum declaration while ending up with an unsigned type, code could use the below.
// Form values 0, 5, 6
enum {
x1,
x2 = 5,
x3
};
// Form values 0u, 5u, 6u
#define ux1 (1u * x1)
#define ux2 (1u * x2)
#define ux3 (1u * x3)
This may not help with enumeration constants outside the int range.
Of course, code could do the conversion instead, as OP knows.
// uint8_t x = x2;
uint8_t x = x2 * 1u;
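Whether the multiplication really yields an unsigned type can be checked at compile time. This is a small sketch assuming a C11 compiler, with the enumerators and macros repeated so that it stands alone.

```c
#include <stdint.h>

/* Form values 0, 5, 6 */
enum { x1, x2 = 5, x3 };

/* Form values 0u, 5u, 6u */
#define ux1 (1u * x1)
#define ux2 (1u * x2)
#define ux3 (1u * x3)

/* The bare enumeration constant has type int ... */
_Static_assert(_Generic(x2, int: 1, default: 0),
               "enumeration constants have type int");
/* ... while the usual arithmetic conversions make 1u * x2 unsigned int. */
_Static_assert(_Generic(ux2, unsigned int: 1, default: 0),
               "1u * x2 has type unsigned int");

/* The value is preserved by the conversion. */
static const uint8_t x = ux2;
```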