Functions that deal with endianness only for unsigned values - C

Maybe I just haven't found other functions, but all the functions that deal with endianness that I can find accept only unsigned variables. My question is: why? (Or are there functions that deal with endianness and accept signed variables?)
List of functions I found: here and here.
Maybe the solution is to use these macros? What is the difference between the macros and the functions above?

Since endianness is implementation-defined, it is safe to assume that you are talking about a particular implementation and not the C standard. Looking at the links you have posted, I think you are referring to Linux and the GNU C compiler.
Then, under this implementation, it is safe to first type-pun the signed integer to an unsigned integer, change the endianness, and type-pun it back.
The following is one way of doing it:
union signed_unsigned {
    signed long a;
    unsigned long b;
} converter;

signed long to_convert = ...; // whatever value

converter.a = to_convert;
converter.b = htonl(converter.b);
to_convert = converter.a;
You can make this into a macro or a function as you see fit.
As suggested by @underscore_d, the other way to type-pun a signed long to an unsigned long (and back) is a pointer cast. That is valid in both C and C++ (although in C++ you should use reinterpret_cast rather than C-style pointer casts).
You can use the following way to achieve the same.
signed long to_convert = ...; // whatever value
unsigned long temp = *(unsigned long *)&to_convert;
temp = htonl(temp);
to_convert = *(signed long *)&temp;
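Wrapped into a function, the union approach might look like the following minimal sketch (hton32_signed is a hypothetical name; fixed-width types from <stdint.h> are used because htonl operates on 32-bit values):

#include <stdint.h>
#include <arpa/inet.h> /* htonl */

int32_t hton32_signed(int32_t value)
{
    union {
        int32_t s;
        uint32_t u;
    } conv;

    conv.s = value;         /* store as signed */
    conv.u = htonl(conv.u); /* byte-swap the object representation */
    return conv.s;          /* read back as signed */
}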


Work on an array of signed int as if it contained unsigned values

I've inherited some old code that assumes that an int can store values from -2^31 to 2^31-1, that overflow just wraps around, and that the sign bit is the high-order bit. In other words, the code should have been written in terms of uint32_t, but wasn't. I would like to fix this code to use uint32_t.
The difficulty is that the code is distributed as source code and I'm not allowed to change the external interface. I have a function that works on an array of int. What it does internally is its own business, but int is exposed in the interface. In a nutshell, the interface is:
struct data {
int a[10];
};
void frobnicate(struct data *param);
I'd like to change int a[10] to uint32_t a[10], but I'm not allowed to modify the definition of struct data.
I can make the code work on uint32_t or unsigned internally:
struct internal_data {
unsigned a[10];
};
void frobnicate(struct data *param) {
struct internal_data *internal = (struct internal_data *)param;
// ... work with internal ...
}
However, this is not actually correct C, since it casts between pointers to different types.
Is there a way I can add compile-time guards so that, for the rare people for whom int isn't “old-school” 32-bit, the code doesn't build? If int is less than 32 bits, the code has never worked anyway. For the vast majority of users, the code should build, and in a way that tells the compiler not to do “weird” things with overflowing int calculations.
I distribute the source code and people may use it with whatever compiler they choose, so compiler-specific tricks are not relevant.
I'm at least going to add
#if INT_MIN + 1 != -0x7fffffff
#error "This code only works with 32-bit two's complement int"
#endif
With this guard, what can go wrong with the cast above? Is there a reliable way of manipulating the int array as if its elements were unsigned, without copying the array?
In summary:
I can't change the function prototype. It references an array of int.
The code should manipulate the array (not a copy of the array) as an array of unsigned.
The code should build on platforms where it worked before (at least with sufficiently friendly compilers) and should not build on platforms where it can't work.
I have no control over which compiler is used and with which settings.
However, this is not actually correct C, since it casts between pointers to different types.
Indeed, you cannot do such casts, because the two structure types are not compatible. You could however use a work-around such as this:
typedef union
{
    struct data data;
    uint32_t array[10];
} internal_t;
...
void frobnicate(struct data *param) {
internal_t* internal = (internal_t*)param;
...
Another option if you can change the original struct declaration but not its member names, is to use C11 anonymous union:
struct data {
    union {
        int a[10];
        uint32_t u32[10];
    };
};
This means that user code accessing foo.a won't break. But you'd need C11 or newer.
Alternatively, you could use a uint32_t * to access the int[10] directly. This is also well-defined, since uint32_t in this case is the unsigned equivalent of the effective type int (given that int is 32 bits, which the compile-time guard below ensures).
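A minimal sketch of that approach, reusing struct data from the question (the increment is just an illustration):

void frobnicate(struct data *param) {
    uint32_t *a = (uint32_t *)param->a;
    a[0] += 0x80000000u; /* well-defined unsigned wraparound */
}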
Is there a way I can add compile-time guards so that, for the rare people for whom int isn't “old-school” 32-bit, the code doesn't build?
The obvious is static_assert(sizeof(int) == 4, "int is not 32 bits"); but again this requires C11. If backwards compatibility with older C is needed, you can invent some dirty "poor man's static assert":
#define stat_assert(expr) typedef int dummy_t[(expr) ? 1 : -1];
The ?1:-1 form makes a failed assertion a negative array size, which compilers reject even when they accept zero-length arrays as an extension.
#if INT_MIN != -0x80000000
Depending on how picky you are, this isn't 100% portable. int could in theory be 64 bits, but probably portability to such fictional systems isn't desired either.
If you don't want to drag limits.h around, you could instead express the check with the static assert macro above, since casts are not allowed in #if expressions:
stat_assert((unsigned int)-1 == 0xFFFFFFFF)
It's a better check regardless, since it doesn't have any hidden implicit promotion surprises - note that -0x80000000 is always 100% equivalent to 0x80000000 on a 32-bit system, because the constant 0x80000000 has unsigned type there.

When to use uint16_t vs int and when to cast types [duplicate]

I have 2 questions about C programming:
Regarding int vs uint16_t, long vs uint32_t, and so on: when should I use the u*_t types instead of int, long, and so on? I find it confusing to choose which one is best for my program.
When do I need to cast type?
I have the following statement in my program:
long * src;
long * dst;
...
memcpy(dst, src, len);
My friend changes this to
memcpy((char *)dst, (char *)src, len).
This is just one example I encountered. Generally, I am confused about when a cast is required.
Use the plain types (int etc.) except when you need a precisely-sized type. You might need a precisely sized type if you are working with a wire protocol which defines that the size field shall be a 2-byte unsigned integer (hence uint16_t), but for most work, most of the time, use the plain types. (There are some caveats to this, but most of the time, most people can work with the plain types for simple numeric work. If you are working to a set of interfaces, use the types dictated by the interfaces. If you're using multiple interfaces and the types clash, you'll have to consider using casts some of the time, or change one or both interfaces. Etc.)
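For instance, a minimal sketch of the wire-protocol case (the header layout is illustrative):

#include <stdint.h>

struct msg_header {
    uint16_t length; /* the protocol dictates a 2-byte unsigned size field */
    uint16_t type;
};

For ordinary local arithmetic, the plain types remain the simplest choice.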
The casts added by your friend are pointless. The actual prototype of memcpy() is:
void *memcpy(void * restrict s1, const void * restrict s2, size_t n);
The compiler converts the long * values to void * (nominally via char * because of the cast), all of which is almost always a no-op.
More generally, you use a cast when you need to change the type of something. One place you might need it is in bitwise operations, where you want a 64-bit result but the operands are 32-bit, and leaving the conversion until after the bitwise operations gives a different result from the one you wanted. For example, assuming a system where int is 32 bits and long is 64 bits:
unsigned int x = 0x12345678;
unsigned long y = (~x << 22) | 0x1111;
This would calculate ~x as a 32-bit quantity, and the shift would be performed on a 32-bit quantity, losing a number of bits. By contrast:
unsigned long z = (~(unsigned long)x << 22) | 0x1111;
ensures that the calculation is done in 64-bit arithmetic and doesn't lose any bits from the original value.
The size of "classical" types like int and long int can vary between systems. This can cause problems when, for example, accessing files with fixed-width data structures. For example, long int is currently a 64-bit integer on newer systems, but only 32 bits on older systems.
The intN_t and uintN_t types were introduced in C99 and are defined in <stdint.h> (and also made available by <inttypes.h>). Since they explicitly specify the number of bits, they eliminate any ambiguity. As a rule, you should prefer these types if you are at all concerned about making your code portable.
Wikipedia has more information.
If you do not want to rely on your compiler's choice of sizes, use the predefined types provided by the standard library headers. Every C library you compile against guarantees that these types are at least large enough to store the range of values their names declare.
In your friend's specific case, one can assume that he made this cast just to signal to other readers that the two pointers are being treated as pointers to raw characters. Or maybe he is the kind of old-fashioned guy who remembers the times when there was no void type and the "lowest common denominator" was a pointer to char. In my developer life, if I want to emphasize some action, I'll write an explicit cast even if it is, in fact, redundant.
For your 1st question, look at: https://stackoverflow.com/questions/11786113/difference-between-different-integer-types
Basically, the name with the _t suffix is the real standard type name; the version without it is a define for the same type.
The u is for unsigned, which doesn't allow negative numbers.
As for your second question, you often need to cast when the function called needs arguments of another type than what you're passing. You can look here for casting tips, or here...

How do I use a return value straight from a function as a bitstring instead of a formatted number?

How do I get the effect of intVariable = *(int*) &floatVariable straight from a function's return value, instead of having to save the output of the function to a dummy variable first?
i.e.
float functionWithFloatTypedReturn(int input) {
[enter code here]
return serialStringStruct.floatRetVal;
}
...
intVariable = [desired code] functionWithFloatTypedReturn(commandCode);
Background: I have a function that reads from a serial response given by someone else's hardware. Legacy code and curiosity about the language's capabilities motivate continued use of a single function. If the function returns the numerical value '1', it could be either 0x3f800000 or 0x1, depending on the input to the function (which determines which command is given to the hardware). The serial string coming from the hardware gets broken up, independent of type, into a struct by a call to memcpy.
intVariable = (union { float f; int i; }) { functionWithFloatTypedReturn(commandCode) } .i;
This uses a compound literal containing a union to reinterpret the bytes of the value (which is supported in C99 and later).
Note that it is usually advisable to use an unsigned integer rather than an int to access the bits of floating-point encodings, because unsigned integers are less prone to certain issues such as sign issues with bit shifts.
You must also ensure that the two types you use have the same number of bytes in your C implementation.
This code accomplishes the reinterpretation without an intermediate C variable. However, that does not mean it will be any more efficient in the machine language implementation.
You can't, in part because many compilers return floating point values in a floating point register (or on the FPU stack) and therefore they have to be stored to "regular" memory before you can access them in bitwise fashion.
You can use an inline function to do the conversion. Note that Standard C does not define the effect of treating a floating point bit pattern as an integer bit pattern (or vice versa), but it's OK to depend on your implementation here since, well, you have to. Just put this in some sort of "depends on implementation" file (header, module, whatever).
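For example, a minimal sketch of such a function (float_bits_as_int is a hypothetical name; as noted, the behaviour is implementation-dependent, and it assumes int and float have the same size):

static inline int float_bits_as_int(float f)
{
    union { float f; int i; } conv;
    conv.f = f;    /* store the float's bytes */
    return conv.i; /* read them back as an int */
}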
(I'd suggest looking at a way to refactor this a bit to hide the hardware variability from the rest of the code.)
If you have a compiler that supports the C99 standard, you can use a compound literal:
intVariable = *(int *)&(float){ functionWithFloatTypedReturn(commandCode) };
As Eric Postpischil notes in comments, a union does not have the same aliasing issues that pointer type punning does:
intVariable = (union { float f; int i; }){ .f = functionWithFloatTypedReturn(commandCode) }.i;
This suggests a better approach, however - change the floatRetVal member of the struct to a union containing a float and an int, and directly return that union type from the function:
union retval {
float f;
int i;
};
union retval functionWithUnionTypedReturn(int input) {
[enter code here]
return serialStringStruct.floatOrIntRetVal;
}
Then code that knows it will get a float return type uses:
floatVariable = functionWithUnionTypedReturn(commandCode).f;
and code that knows it will get an int return type uses:
intVariable = functionWithUnionTypedReturn(commandCode).i;

How to convert or cast a float into its bit sequence such as a long

Good day,
I am working in a 16-bit C environment, and I want to convert a float value into its bit sequence, viewed as an integer value.
There are multiple ways I know how to achieve this; one is with a union, such as:
union ConvertFloatToInt
{
float input;
unsigned long output;
};
This will "convert" the floating value into a long value by reading the same memory area and just interpreting it differently.
union ConvertFloatToInt x;
x.input = 20.00;
result
x.output = 0x41A00000;
Other methods are void pointer casts...
float input = 40.00;
unsigned long output;
void* ptr;
ptr = &input;
output = *(unsigned long*) ptr;
result
output = 0x42200000;
This is the idea of what I am trying to do; however, I want the compiler to do the conversion for me at build time, not at run time.
I need to insert the converted floating-point data into a constant (const) unsigned long.
I was thinking of trying to convert the float value into a void, and then the void into the unsigned long.
Something like this (and yes, this is incorrect; you cannot cast to a void):
const unsigned long FloatValue = (unsigned long) ((void) ((float) 20.654));
Is there some way to do this? I was thinking maybe something with void pointers, but all void pointer uses I know of need a variable, and variables may not appear in the initializer of a const object at file scope.
Edit
I am using a C90 compiler.
The question is intended in the file scope.
Conclusion
The conclusion was that there is no real solution to this question except when working in block scope, for which multiple answers were given, and I thank all of you.
My Solution
This is not a good solution; however, it solves my problem, though I do not think it will help many people either.
I created a small program for demonstration purposes. This is not my project's code, and also not the compiler used in my project (before someone says that this is not a C90 compiler).
The compiler used in the demonstration: gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3
#include <stdio.h>

typedef union
{
    float myfloat;
    unsigned long mylong;
} custom_type;

typedef struct
{
    int a;
    int b;
    custom_type typeA;
    custom_type typeB;
} my_struct;

const my_struct myobj =
{
    1, 2, 3.84F, 4
};

int main(void)
{
    printf(":: %f\n", myobj.typeA.myfloat);
    printf(":: %lu\n", myobj.typeA.mylong);
    return 0;
}
Output
:: 3.840000
:: 1081459343
This is a little bit crude; however, it works at file scope (but generates warnings).
You can do this by type-punning through an anonymous union:
unsigned int i = ((union { float f; unsigned int i; }){5.0}).i;
Note that this initialiser is not a constant expression and so cannot be used at file scope.
Type-punning through a union is specified to be allowed by the standard in a footnote:
C11, 6.5.2.3 Structure and union members, footnote 95:
"If the member used to read the contents of a union object is not the same as the member last used to store a value in the object, the appropriate part of the object representation of the value is reinterpreted as an object representation in the new type as described in 6.2.6 (a process sometimes called "type punning"). This might be a trap representation."
From a practical point of view, although you cannot use this method to initialise a file-scope constant, you could write an initialisation function that loads the values into file-scope variables at program or module initialisation time.
You're not going to find a portable method that allows you to calculate the values as a compile-time constant expression, because the object representations covered by section 6.2.6 of the standard only apply at run time. Otherwise, a cross-compiler would be required to simulate and not just parametrise the execution environment of its target.
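For example, a minimal sketch of that initialisation approach (float_value_bits and init_constants are hypothetical names; it assumes float and unsigned long are the same size):

static unsigned long float_value_bits;

static void init_constants(void)
{
    union { float f; unsigned long u; } conv;
    conv.f = 20.654f;
    float_value_bits = conv.u; /* filled in at module initialisation */
}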
Addendum: this is valid C++, with the condition that the union type must be named:
union u { float f; unsigned int i; };
unsigned int i = u{5.0}.i;
So if you're willing to write in hybrid C/C++ and compile with a C++ compiler, then you can perform the cast at compile time.
You can use a C99 compound literal:
const unsigned long FloatValue =
*(unsigned long *) &(float) {20.654f};
Note that the initializer is not a constant expression so FloatValue can only be declared at block scope and not at file scope.
I am assuming that these floats are constants, and therefore you could just write a small program to do it as a one-off exercise: generate the output as required, then cut and paste it into the other code.
If you have a lot of them, why not just write a script to create the appropriate C file?
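A minimal sketch of such a generator (the value list is illustrative; it assumes float and unsigned long are both 32 bits, as on the asker's target):

#include <stdio.h>

int main(void)
{
    static const float values[] = { 20.654f, 3.84f };
    unsigned int i;

    for (i = 0; i < sizeof values / sizeof values[0]; i++) {
        union { float f; unsigned long u; } conv;
        conv.f = values[i];
        /* emit a ready-to-paste C constant definition */
        printf("const unsigned long FloatBits%u = 0x%08lXUL;\n",
               i, conv.u);
    }
    return 0;
}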
You should know some things about the IEEE floating-point standard:
http://en.wikipedia.org/wiki/IEEE_floating-point_standard
Extract the sign, exponent, and fraction bits and process them into a long.
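For single precision, a short sketch of that decomposition (reusing the 20.0f pattern from the question; it assumes unsigned long holds at least 32 bits):

#include <stdio.h>

int main(void)
{
    unsigned long bits = 0x41A00000UL; /* the bit pattern of 20.0f from above */
    unsigned long sign     = (bits >> 31) & 0x1UL;
    unsigned long exponent = (bits >> 23) & 0xFFUL;
    unsigned long fraction = bits & 0x7FFFFFUL;

    /* value = (-1)^sign * 1.fraction * 2^(exponent - 127) for normal numbers */
    printf("sign=%lu exponent=%lu fraction=0x%06lX\n",
           sign, exponent, fraction);
    return 0;
}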
You could achieve your purpose by defining a float constant and then a macro:
const float FloatValue_f = 20.654f; /* avoid a leading underscore: such names are reserved at file scope */
#define FloatValueL (*(unsigned long *) &FloatValue_f)

Is the sizeof(enum) == sizeof(int), always?

Is sizeof(enum) == sizeof(int), always?
Or is it compiler-dependent?
Is it wrong to say that, since compilers are optimized for word length (memory alignment), int is the word size on a particular compiler? Does that mean there is no processing penalty if I use enums, as they would be word-aligned?
Is it not better to put all the return codes in an enum, as I clearly do not care about the values they get, only the names, while checking the return types? If that is the case, wouldn't #define be better, as it would save memory?
What is the usual practice?
If I have to transport these return types over a network and some processing has to be done at the other end, which would you prefer: enums, #defines, or const ints?
EDIT - Just checking on the net: since compilers don't keep symbolic information for macros, how do people debug then, by comparing the integer value with the header file?
From the answers - I am adding this line below, as I need clarification:
"So it is implementation-defined, and sizeof(enum) might be equal to sizeof(char), i.e. 1."
Does that not mean that the compiler checks the range of values in the enum and then assigns memory? I don't think so, but of course I don't know. Can someone please explain to me what "might be" means?
It is compiler-dependent and may differ between enums. The following are the semantics:
enum X { A, B };
// A has type int
assert(sizeof(A) == sizeof(int));
// some integer type. Maybe even int. This is
// implementation defined.
assert(sizeof(enum X) == sizeof(some_integer_type));
Note that "some integer type" in C99 may also include extended integer types (which the implementation, however, has to document, if it provides them). The type of the enumeration is some type that can store the value of any enumerator (A and B in this case).
I don't think there are any penalties in using enumerations. Enumerators are integral constant expressions too (so you may use them to initialize static or file-scope variables, for example), and I prefer them to macros whenever possible.
Enumerators don't need any runtime memory. Only when you create a variable of the enumeration type may you use runtime memory. Just think of enumerators as compile-time constants.
I would just use a type that can store the enumerator values (I should know the rough range of values beforehand), cast to it, and send it over the network. Preferably the type should be some fixed-width one, like int32_t, so it doesn't come into conflict when different machines are involved. Or I would print the number and scan it on the other side, which gets rid of some of these problems.
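For illustration, a minimal sketch of that approach (the names are illustrative; htonl additionally fixes the byte order):

#include <stdint.h>
#include <string.h>
#include <arpa/inet.h> /* htonl */

enum status { STATUS_OK = 0, STATUS_ERR = 1 };

/* Serialize the enumerator as a fixed-width value in network byte
   order, so both machines agree on size and endianness. */
void put_status(unsigned char *buf, enum status s)
{
    uint32_t wire = htonl((uint32_t)s);
    memcpy(buf, &wire, sizeof wire);
}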
Response to Edit
Well, the compiler is not required to use any particular size. An easy thing to see is that the sign of the values matters: unsigned types can give a significant performance boost in some calculations. The following is the behavior of GCC 4.4.0 on my box:
int main(void) {
enum X { A = 0 };
enum X a; // X compatible with "unsigned int"
unsigned int *p = &a;
}
But if you assign a -1, then GCC chooses to use int as the type that X is compatible with:
int main(void) {
enum X { A = -1 };
enum X a; // X compatible with "int"
int *p = &a;
}
Using GCC's --short-enums option makes it use the smallest type that still fits all the values:
int main() {
enum X { A = 0 };
enum X a; // X compatible with "unsigned char"
unsigned char *p = &a;
}
In recent versions of GCC, the compiler flag has changed to -fshort-enums. On some targets, the default type is unsigned int. You can check the answer here.
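If in doubt, a throwaway test program shows what your compiler and flags actually do:

#include <stdio.h>

enum X { A = 0 };

int main(void)
{
    /* typically prints 1 with -fshort-enums and 4 without */
    printf("sizeof(enum X) = %u\n", (unsigned)sizeof(enum X));
    return 0;
}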
C99, 6.7.2.2p4 says:
"Each enumerated type shall be compatible with char, a signed integer type, or an unsigned integer type. The choice of type is implementation-defined,108) but shall be capable of representing the values of all the members of the enumeration. [...]"
Footnote 108 adds:
"An implementation may delay the choice of which integer type until all enumeration constants have been seen."
So it is implementation-defined, and sizeof(enum) might be equal to sizeof(char), i.e. 1.
In choosing the size for some small range of integers, there is always a penalty. If you make it small in memory, there is probably a processing penalty; if you make it larger, there is a space penalty. It's a time-space trade-off.
Error codes are typically #defines, because they need to be extensible: different libraries may add new error codes. You cannot do that with enums.
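For example (illustrative codes), a base header might define:

#define ERR_OK 0
#define ERR_IO 1

and a different library can later extend the code space in its own header without touching the original:

#define ERR_NETWORK 100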
Is the sizeof(enum) == sizeof(int), always
The ANSI C standard says:
Each enumerated type shall be compatible with char, a signed integer type, or an unsigned integer type. The choice of type is implementation-defined. (6.7.2.2 Enumeration specifiers)
So I would take that to mean no.
If this is the case, wouldn't #define be better, as it would save memory?
In what way would using defines save memory over using an enum? An enum is just a type that allows you to provide more information to the compiler. In the actual resulting executable, it's just turned into an integer, just as the preprocessor converts a macro created with #define into its value.
What is the usual practice if I have to transport these return types over a network and some processing has to be done at the other end?
If you plan to transport values over a network and process them on the other end, you should define a protocol. Decide on the size in bits of each type and the endianness (the order of the bytes), and make sure you adhere to that in both the client and the server code. Also, don't just assume that because it happens to work, you've got it right. It may be, for example, that the endianness on your chosen client and server platforms matches, but that might not always be the case.
No.
Example: The CodeSourcery compiler
When you define an enum like this:
enum MyEnum1 {
A=1,
B=2,
C=3
};
// will have the sizeof 1 (fits in a char)
enum MyEnum2 {
A=1,
B=2,
C=3,
D=400
};
// will have the sizeof 2 (doesn't fit in a char)
Details from their mailing list
On some compilers the size of an enum depends on the values in the enum (values that all fit in a byte => byte; larger values => int).
But this depends on the compiler and the compiler settings.
enum fruits {apple,orange,strawberry,grapefruit};
char fruit = apple;
fruit = orange;
if (fruit < strawberry)
...
all of this works perfectly
If you want a specific underlying type for an enum instance, just don't use the enum type itself; store the value in an integer type of your choosing, as above.
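For instance, a sketch using a fixed-width type from <stdint.h>:

#include <stdint.h>

enum fruits { apple, orange, strawberry, grapefruit };

/* store the enumerator in an explicitly sized integer, so the
   storage width is under your control */
uint8_t fruit = apple;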
