Converting Char array to Long in C error

I am trying to convert an unsigned long int to char bytes and back,
but I get an unexpected result.
#include <stdio.h>

int main(void)
{
    unsigned char pdest[4];
    unsigned long l = 0xFFFFFFFF;

    pdest[0] = l & 0xFF;
    pdest[1] = (l >> 8) & 0xFF;
    pdest[2] = (l >> 16) & 0xFF;
    pdest[3] = (l >> 24) & 0xFF;

    unsigned long int l1 = 0;
    l1 |= (pdest[0]);
    l1 |= (pdest[1] << 8);
    l1 |= (pdest[2] << 16);
    l1 |= (pdest[3] << 24);

    printf("%lu", l1);
}
and the output is
18446744073709551615
not 4294967295 as I expected.
How do I do this correctly?

Read this:
https://en.wikipedia.org/wiki/C_data_types
...Long unsigned integer type. Capable of containing at least the [0,
4,294,967,295] range;

You should write the last 4 lines as:
l1 |= ((unsigned long) pdest[0]);
l1 |= (((unsigned long) pdest[1]) << 8);
l1 |= (((unsigned long) pdest[2]) << 16);
l1 |= (((unsigned long) pdest[3]) << 24);
You need to cast each byte to unsigned long before shifting it.

pdest[3] << 24 is of type signed int.
Change your code to
unsigned long int l1=0;
l1 |= (pdest[0]);
l1 |= (pdest[1] << 8);
l1 |= (pdest[2] << 16);
l1 |= ((unsigned int)pdest[3] << 24);

The problem is the shifting of the char values. You must force a conversion to unsigned long before shifting beyond the 8 bits of a char; otherwise the operand is promoted to signed int, and the sign of that promoted value further alters the result.
Try
unsigned long int l1=0;
l1 |= ((unsigned long)pdest[0]);
l1 |= ((unsigned long)pdest[1] << 8);
l1 |= ((unsigned long)pdest[2] << 16);
l1 |= ((unsigned long)pdest[3] << 24);
Note the use of a cast to force the compiler to convert the char to an unsigned long before the shift takes place.

Your unsigned long does not have to be 4 bytes long.
#include <stdio.h>
#include <stdint.h>

int main(void) {
    int index;
    unsigned char pdest[sizeof(unsigned long)];
    unsigned long l = 0xFFFFFFFFUL;

    for (index = 0; index < sizeof(unsigned long); index++)
    {
        pdest[index] = l & 0xff;
        l >>= 8;
    }

    unsigned long l1 = 0;
    for (index = 0; index < sizeof(unsigned long); index++)
    {
        l1 |= (unsigned long)pdest[index] << (8 * index);
    }

    printf("%lx\n", l1);
}

First of all, to name a type that is exactly 32 bits wide, use uint32_t, not unsigned long int. unsigned long int is generally 64 bits wide on 64-bit *nixes (so-called LP64), whereas it is 32 bits wide even on 64-bit Windows (LLP64).
Anyway, the problem is with integer promotions. An operand to an arithmetic operation with conversion rank less than int or unsigned int will be converted to int or unsigned int, whichever its range fits into. Since all unsigned char values are representable as signed int, pdest[3] is converted to signed int, and the result of pdest[3] << 24 is also of type signed int! Now, if that has the most significant bit set, the bit is shifted into the sign bit of the integer, and the behaviour is, according to the C standard, undefined.
However, GCC has defined behaviour for this case; there, the result is just a negative integer in 2's complement representation, so the result of (unsigned char)0xFF << 24 is (int)-16777216. Now, for the | operation this then needs to be converted to the type of the other operand, which is unsigned long. The unsigned conversion happens as if by repeatedly adding or subtracting one more than the maximum value (i.e. repeatedly adding or subtracting 2⁶⁴) until the value is in the range of the new type. Since unsigned long is 64 bits on your platform, the result of this conversion is 2⁶⁴ − 16777216, or 18446744073692774400, which is then ORed with the bits from the previous steps.
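To see those numbers concretely, here is a minimal sketch; it assumes GCC's 2's-complement shift behaviour and a 64-bit unsigned long (LP64), as described above:
#include <stdio.h>

int main(void) {
    unsigned char b = 0xFF;
    int promoted = b << 24;           /* GCC's defined behaviour: -16777216 */
    unsigned long widened = promoted; /* converted modulo 2^64 on LP64: 18446744073692774400 */

    printf("%d\n%lu\n", promoted, widened);
    return 0;
}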
How to fix it? Easy: just cast each byte to uint32_t prior to shifting, and print with the help of the PRIu32 macro:
#include <inttypes.h>
...
uint32_t l1=0;
l1 |= (uint32_t)pdest[0];
l1 |= (uint32_t)pdest[1] << 8;
l1 |= (uint32_t)pdest[2] << 16;
l1 |= (uint32_t)pdest[3] << 24;
printf ("%" PRIu32, l1);

The problem with your code is the implicit conversion of unsigned char to int, and then to unsigned long with sign extension, for the bitwise OR operation. The corresponding values for each line are commented below:
l1 |= (pdest[0]); //dec = 255 hex = 0xFF
l1 |= (pdest[1] << 8); //dec = 65535 hex = 0xFFFF
l1 |= (pdest[2] << 16); //dec = 16777215 hex =0xFFFFFF
l1 |= (pdest[3] << 24); //here is the problem
In the last line, pdest[3] << 24 is 0xFF000000, which is equivalent to -16777216 due to the implicit conversion to int. It is then converted to unsigned long for the bitwise OR, where sign extension happens, so l1 |= (pdest[3] << 24) is equivalent to 0x0000000000FFFFFF | 0xFFFFFFFFFF000000.
As many people have suggested, you can use an explicit cast, or you can use the snippet below:
l1 = (l1 << 0) | pdest[3];
l1 = (l1 << 8) | pdest[2];
l1 = (l1 << 8) | pdest[1];
l1 = (l1 << 8) | pdest[0];
I hope this solves your problem and explains the reason for such a huge output.

Related

Same-line vs. multi-lined bitwise operation discrepancy

While writing a program for uni, I noticed that
unsigned char byte_to_write_1 = (0xFF << 2) >> 2; ==> 0xFF (wrong)
unsigned char byte_to_write_2 = (0xFF << 2);
byte_to_write_2 = byte_to_write_2 >> 2; ==> 0x3F (correct)
I don't understand what's causing the discrepancy... my best guess is that while modifying a byte with multiple operations on the same line, C "holds onto" the extra bits in a slightly larger data type until the line is terminated, so 0xFF << 2 is held as 11[1111 1100] instead of 1111 1100, and on the same-line shift back the result is 1111 1111 instead of 1111 1100.
What causes the difference in results? Thanks in advance...
I first noticed the issue in a larger project, but I have been able to recreate it with a much simpler program.
unsigned char byte_to_write_1 = (0xFF << 2) >> 2; // 0xFF
0xFF is an int. (Obtaining 0xFF from an unsigned char wouldn't change anything since it would get promoted to an int.) 0xFF << 2 is 0x3FC. 0x3FC >> 2 is 0xFF.
unsigned char byte_to_write_2 = 0xFF << 2; // 0xFC
byte_to_write_2 >>= 2; // 0x3F
We've already established that 0xFF << 2 is 0x3FC. But you assign that to an unsigned char which is presumably only 8 bits in size. So you end up assigning 0xFC instead. (gcc warns about this if you enable warnings as you should.)
And of course, you get the desired value when you right-shift that.
Solutions:
(unsigned char)( 0xFF << 2 ) >> 2
( ( 0xFF << 2 ) & 0xFF ) >> 2
0xFF & 0x3F
Demo on Compiler Explorer.
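For reference, a minimal program exercising all three fixes (this sketch assumes an ordinary 8-bit unsigned char):
#include <stdio.h>

int main(void) {
    printf("%X\n", (unsigned char)(0xFF << 2) >> 2); /* 3F: truncate to a byte, then shift back */
    printf("%X\n", ((0xFF << 2) & 0xFF) >> 2);       /* 3F: mask down to a byte, then shift back */
    printf("%X\n", 0xFF & 0x3F);                     /* 3F: the precomputed result */
    return 0;
}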
The difference between these two code snippets
unsigned char byte_to_write_1 = (0xFF << 2) >> 2; ==> 0xFF (wrong)
and
unsigned char byte_to_write_2 = (0xFF << 2);
byte_to_write_2 = byte_to_write_2 >> 2; ==> 0x3F (correct)
is that in the second code snippet the variable byte_to_write_2 is used to store the intermediate result of the expression (0xFF << 2). The variable cannot hold the full integer result, so the result is converted to the type unsigned char, which can store only one byte.
The maximum value that can be stored in an object of type unsigned char is 0xFF:
— maximum value for an object of type unsigned char
UCHAR_MAX 255 // 2⁸ − 1
that is, 255. The expression 0xFF << 2, which is equivalent to 255 * 4, cannot fit in an object of type unsigned char.
In the first code snippet the intermediate result of the expression (0xFF << 2) has the type int due to the integer promotion and can be used without a change in the full expression (0xFF << 2) >> 2.
Consider the outputs of these two calls of printf
printf( "0xFF << 2 = %d\n", 0xFF << 2 );
printf( "( unsigned char )( 0xFF << 2 ) = %d\n", ( unsigned char )( 0xFF << 2 ) );
They are
0xFF << 2 = 1020
( unsigned char )( 0xFF << 2 ) = 252

Using bitwise operators to change all bits of the most significant byte to 1

Let's suppose I have an unsigned int x = 0x87654321. How can I use bitwise operators to change the most significant byte (the leftmost 8 bits) of this number to 1?
So, instead of 0x87654321, I would have 0xFF654321?
As an unsigned int in C may be 32 bits, 16 bits, or some other size, it is best to write the code without assuming the width.
The value UINT_MAX has all value bits set.
A "byte" in C is CHAR_BIT wide - usually 8.
UINT_MAX ^ (UINT_MAX >> CHAR_BIT) or ~(UINT_MAX >> CHAR_BIT) is the desired mask.
#include <limits.h>
#include <stdio.h>

#define UPPER_BYTE_MASK (UINT_MAX ^ (UINT_MAX >> CHAR_BIT))
// or
// #define UPPER_BYTE_MASK (~(UINT_MAX >> CHAR_BIT))

int main() {
    unsigned value = 0x87654321;
    printf("%X\n", value | UPPER_BYTE_MASK);
}
#include <stdio.h>
#include <limits.h>

/* Set every bit of the most significant byte of x, whatever its type. */
#define MSB1(x) ((x) | (((1ULL << CHAR_BIT) - 1) << ((sizeof(x) - 1) * CHAR_BIT)))

int main(void)
{
    char x = 0;
    short y = 0;
    int z = 0;
    long q = 0;
    long long l = 0;

    printf("0x%llx\n", (unsigned long long)MSB1(x));
    printf("0x%llx\n", (unsigned long long)MSB1(y));
    printf("0x%llx\n", (unsigned long long)MSB1(z));
    printf("0x%llx\n", (unsigned long long)MSB1(q));
    printf("0x%llx\n", (unsigned long long)MSB1(l));

    l = MSB1(l);
}
If you know the size of the integer, you can simply use something like
x |= 0xFF000000;
If not, you'll need to calculate the mask. One way:
x |= UINT_MAX - ( UINT_MAX >> 8 );
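Putting that together, a minimal sketch (the FF654321 output assumes a 32-bit unsigned int; the mask expression itself makes no width assumption):
#include <limits.h>
#include <stdio.h>

int main(void) {
    unsigned int x = 0x87654321;

    x |= UINT_MAX - (UINT_MAX >> CHAR_BIT); /* set every bit of the most significant byte */
    printf("%X\n", x);                      /* FF654321 on a 32-bit unsigned int */
    return 0;
}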

Bitmask for exactly one byte in C

My goal is to save a long in four bytes like this:
unsigned char bytes[4];
unsigned long n = 123;
bytes[0] = (n >> 24) & 0xFF;
bytes[1] = (n >> 16) & 0xFF;
bytes[2] = (n >> 8) & 0xFF;
bytes[3] = n & 0xFF;
But I want the code to be portable, so I use CHAR_BIT from <limits.h>:
unsigned char bytes[4];
unsigned long n = 123;
bytes[0] = (n >> (CHAR_BIT * 3)) & 0xFF;
bytes[1] = (n >> (CHAR_BIT * 2)) & 0xFF;
bytes[2] = (n >> CHAR_BIT) & 0xFF;
bytes[3] = n & 0xFF;
The problem is that the bitmask 0xFF only accounts for eight bits, which is not necessarily equivalent to one byte. Is there a way to make the above code completely portable across all platforms?
How about something like:
unsigned long mask = 1;
mask<<=CHAR_BIT;
mask-=1;
and then using this as the mask instead of 0xFF?
Test program:
#include <stdio.h>

int main() {
#define MY_CHAR_BIT_8 8
#define MY_CHAR_BIT_9 9
#define MY_CHAR_BIT_10 10
#define MY_CHAR_BIT_11 11
#define MY_CHAR_BIT_12 12
    {
        unsigned long mask = 1;
        mask <<= MY_CHAR_BIT_8;
        mask -= 1;
        printf("%lx\n", mask);
    }
    {
        unsigned long mask = 1;
        mask <<= MY_CHAR_BIT_9;
        mask -= 1;
        printf("%lx\n", mask);
    }
    {
        unsigned long mask = 1;
        mask <<= MY_CHAR_BIT_10;
        mask -= 1;
        printf("%lx\n", mask);
    }
    {
        unsigned long mask = 1;
        mask <<= MY_CHAR_BIT_11;
        mask -= 1;
        printf("%lx\n", mask);
    }
    {
        unsigned long mask = 1;
        mask <<= MY_CHAR_BIT_12;
        mask -= 1;
        printf("%lx\n", mask);
    }
}
Output:
ff
1ff
3ff
7ff
fff
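Applied to the question's serialization, the computed mask might be used like this (a sketch; the pack name is illustrative, and it still assumes unsigned long is wide enough for the CHAR_BIT * 3 shift):
#include <limits.h>
#include <stdio.h>

static void pack(unsigned char bytes[4], unsigned long n) {
    unsigned long mask = (1UL << CHAR_BIT) - 1; /* one "byte" worth of bits */

    bytes[0] = (n >> (CHAR_BIT * 3)) & mask;
    bytes[1] = (n >> (CHAR_BIT * 2)) & mask;
    bytes[2] = (n >> CHAR_BIT) & mask;
    bytes[3] = n & mask;
}

int main(void) {
    unsigned char bytes[4];

    pack(bytes, 123);
    printf("%x %x %x %x\n", bytes[0], bytes[1], bytes[2], bytes[3]); /* 0 0 0 7b */
    return 0;
}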
I work almost exclusively with embedded systems where I rather often have to provide portable code between all manner of more or less exotic systems, like writing code which will work both on some tiny 8-bit MCU and on x86_64.
But even for me, bothering with portability to exotic obsolete DSP systems and the like is a huge waste of time. These systems barely exist in the real world - why exactly do you need portability to them? Is there any other reason than "showing off" mostly useless language lawyer knowledge of C? In my experience, 99% of all such useless portability concerns boil down to programmers "showing off", rather than an actual requirement specification.
And even if you for some strange reason do need such portability, this task doesn't make any sense to begin with since neither char nor long are portable! If char is not 8 bits then what makes you think long is 4 bytes? It could be 2 bytes, it could be 8 bytes, or it could be something else.
If portability is an actual concern, then you must use stdint.h. Then, if you truly must support exotic systems, you have to decide which ones. The only real-world computers I know of that actually use different byte sizes are various obsolete exotic TI DSPs from the 1990s, which use 16-bit bytes/char. Let's assume this is your intended target, which you have decided is important to support.
Let's also assume that a standard C compiler (ISO 9899) exists for that exotic target, which is highly unlikely. (More likely you'll get a poorly conforming, mostly broken legacy C90 thing... or, even more likely, those who use the target write everything in assembler.) In the case of a standard C compiler, it will not implement uint8_t, since it is not a mandatory type if the target doesn't support it. Only uint_least8_t and uint_fast8_t are mandatory.
Then you'd go about it like this:
#include <stdint.h>
#include <limits.h>

#if CHAR_BIT == 8
static void uint32_to_uint8 (uint8_t dst[4], uint32_t u32)
{
    dst[0] = (u32 >> 24) & 0xFF;
    dst[1] = (u32 >> 16) & 0xFF;
    dst[2] = (u32 >>  8) & 0xFF;
    dst[3] = (u32 >>  0) & 0xFF;
}
#endif
// whatever other conversion functions you need:
static void uint32_to_uint16 (uint16_t dst[2], uint32_t u32){ ... }
static void uint64_to_uint16 (uint16_t dst[4], uint64_t u64){ ... }
The exotic DSP will then use the uint32_to_uint16 function. You could use the same compiler #if CHAR_BIT checks to do #define byte_to_word uint32_to_uint16 etc.
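A sketch of that dispatch; byte_to_word follows the naming above, and which widths you cover depends on your target list:
#include <limits.h>

#if CHAR_BIT == 8
  #define byte_to_word uint32_to_uint8
#elif CHAR_BIT == 16
  #define byte_to_word uint32_to_uint16
#else
  #error "No conversion function for this byte width"
#endif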
And then you should also immediately notice that endianness will be the next major portability concern. I have no idea what endianness obsolete DSPs typically use, but that's another question.
What about:
unsigned long mask = (unsigned char)-1;
This will work because the C standard says in 6.3.1.3p2
1 When a value with integer type is converted to another integer type
other than _Bool, if the value can be represented by the new type, it
is unchanged.
2 Otherwise, if the new type is unsigned, the value is converted by
repeatedly adding or subtracting one more than the maximum value that
can be represented in the new type until the value is in the range of
the new type.
And that unsigned long can represent all values of unsigned char.
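A quick check of that conversion (nothing here assumes a particular char width):
#include <stdio.h>

int main(void) {
    unsigned long mask = (unsigned char)-1; /* -1 wraps to UCHAR_MAX, e.g. 0xff for an 8-bit char */

    printf("%lx\n", mask); /* prints ff on the usual platforms */
    return 0;
}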
#include <stdio.h>
#include <limits.h>

#define CHARMASK ((1UL << CHAR_BIT) - 1)

int main(void)
{
    printf("0x%lx\n", CHARMASK);
}
And the mask will always have the width of a char. It is calculated at compile time; no additional variables are needed.
Or
#define CHARMASK ((unsigned char)(~0))
You can do it without the masks as well
#include <stdio.h>
#include <limits.h>

void foo(unsigned int n, unsigned char *bytes)
{
    bytes[0] = ((n << (CHAR_BIT * 0)) >> (CHAR_BIT * 3));
    bytes[1] = ((n << (CHAR_BIT * 1)) >> (CHAR_BIT * 3));
    bytes[2] = ((n << (CHAR_BIT * 2)) >> (CHAR_BIT * 3));
    bytes[3] = ((n << (CHAR_BIT * 3)) >> (CHAR_BIT * 3));
}

int main(void)
{
    unsigned int z = 0xaabbccdd;
    unsigned char bytes[4];

    foo(z, bytes);
    printf("0x%02x 0x%02x 0x%02x 0x%02x\n", bytes[0], bytes[1], bytes[2], bytes[3]);
}

How to combine 4 bytes and do math calculation in C using AVR

I have an atMega1281 microcontroller using C. I have a routine to pull 4 bytes off of a CAN bus to read an SPN. I am able to get the 4 bytes, but I cannot print the 4-byte number because it is truncating the first 2 bytes, making it a 16-bit number. I have tried using unsigned long as the declaration with no success. What is the trick when using 32-bit numbers with an AVR?
unsigned long engine_hours_raw;
float engine_hours_uint;
engine_hours_raw = (OneMessage.Msg.Data[3] << 24) | (OneMessage.Msg.Data[2] << 16) | (OneMessage.Msg.Data[1] << 8) | OneMessage.Msg.Data[0];
engine_hours_uint = engine_hours_raw * 0.05;
ftoa(engine_hours_uint, engHours, 1);
UART1_Printf("Engine Hours: %s ", engHours);
(OneMessage.Msg.Data[3] << 24) will be 0, because the shift operand is only promoted to int, which is 16 bits wide on the AVR, unless it is cast to a wider type.
Instead, load the data into the long int and then perform the shift.
engine_hours_raw = OneMessage.Msg.Data[3];
engine_hours_raw <<= 8;
engine_hours_raw |= OneMessage.Msg.Data[2];
engine_hours_raw <<= 8;
engine_hours_raw |= OneMessage.Msg.Data[1];
engine_hours_raw <<= 8;
engine_hours_raw |= OneMessage.Msg.Data[0];
You could also cast the intermediate expressions to unsigned long or uint32_t, but it is just as messy and may take longer.
engine_hours_raw = ((uint32_t)(OneMessage.Msg.Data[3]) << 24) | ((uint32_t)(OneMessage.Msg.Data[2]) << 16) | ((uint32_t)(OneMessage.Msg.Data[1]) << 8) | (uint32_t)(OneMessage.Msg.Data[0]);
There is a much easier and more straightforward way :)
uint32_t *engine_hours_raw = (uint32_t *)OneMessage.Msg.Data;
float engine_hours_uint = *engine_hours_raw * 0.05;
(This works here because the AVR is little-endian and the payload stores the least significant byte first in Data[0]; it also assumes Data is suitably aligned for a uint32_t.)

Changing endianness on 3 byte integer

I am receiving a 3-byte integer, which I'm storing in an array. For now, assume the array is unsigned char myarray[3]
Normally, I would convert this into a standard int using:
int mynum = ((myarray[2] << 16) | (myarray[1] << 8) | (myarray[0]));
However, before I can do this, I need to convert the data from network to host byte ordering.
So, I change the above to (it comes in 0-1-2, but it's n to h, so 0-2-1 is what I want):
int mynum = ((myarray[1] << 16) | (myarray[2] << 8) | (myarray[0]));
However, this does not seem to work. For the life of me, I can't figure this out. I've looked at it so much that at this point I think I'm fried and just confusing myself. Is what I am doing correct? Is there a better way? Would the following work?
int mynum = ((myarray[2] << 16) | (myarray[1] << 8) | (myarray[0]));
int correctnum = ntohl(mynum);
Here's an alternate idea: why not just make it structured and make it explicit what you're doing? Some of the confusion you're having may be rooted in the "I'm storing in an array" premise. If, instead, you defined
typedef struct {
    u8 highByte;
    u8 midByte;
    u8 lowByte;
} ThreeByteInt;
To turn it into an int, you just do
u32 ThreeByteTo32(ThreeByteInt *bytes) {
    return (bytes->highByte << 16) + (bytes->midByte << 8) + (bytes->lowByte);
}
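u8 and u32 are not defined in the snippet; a self-contained version, assuming they stand for the fixed-width types from <stdint.h>, could look like this:
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

typedef uint8_t  u8;  /* assumed typedefs; the answer leaves them implicit */
typedef uint32_t u32;

typedef struct {
    u8 highByte;
    u8 midByte;
    u8 lowByte;
} ThreeByteInt;

static u32 ThreeByteTo32(const ThreeByteInt *bytes) {
    return ((u32)bytes->highByte << 16) + ((u32)bytes->midByte << 8) + bytes->lowByte;
}

int main(void) {
    ThreeByteInt in = { 0x12, 0x34, 0x56 }; /* network order: most significant byte first */

    printf("%" PRIx32 "\n", ThreeByteTo32(&in)); /* prints 123456 */
    return 0;
}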
If you receive the value in network ordering (that is, big-endian), you have this situation:
myarray[0] = most significant byte
myarray[1] = middle byte
myarray[2] = least significant byte
so this should work:
int result = (((int) myarray[0]) << 16) | (((int) myarray[1]) << 8) | ((int) myarray[2]);
Besides using structures / unions with byte-sized members, you have two other options:
Using ntoh / hton and masking out the high byte of the 4-byte integer, before or after the conversion, with a bitwise AND.
Doing the bit-shift operations shown in the other answers.
At any rate, you should not rely on implicit promotions when shifting data beyond the width of its type. A shift by 16 is beyond the width of unsigned char and only works because the operand is promoted to int first, so the result can vary with compiler, flags, and platform. So always do the proper cast before the bitwise operations to make it work on any compiler / platform:
int result = (((int) myarray[0]) << 16) | (((int) myarray[1]) << 8) | ((int) myarray[2]);
Why not just receive into the top 3 bytes of a 4-byte buffer? After that you can use ntohl, which is just a byte-swap instruction on most architectures. At some optimization levels it will be faster than plain bit shifts and ORs.
union
{
    int32_t val;
    unsigned char myarray[4];
} data;
memcpy(&data, buffer, 3);
data.myarray[3] = 0;
data.val = ntohl(data.val);
or in case you have copied it to the bottom 3 bytes then another shift is enough
memcpy(&data.myarray[1], buffer, 3);
data.myarray[0] = 0;
data.val = ntohl(data.val) >> 8; // or data.val = ntohl(data.val << 8);
unsigned char myarray[3] = { 1, 2, 3 };
# if LITTLE_ENDIAN // you figure out a way to express this on your platform
int mynum = (myarray[0] << 0) | (myarray[1] << 8) | (myarray[2] << 16);
# else
int mynum = (myarray[0] << 16) | (myarray[1] << 8) | (myarray[2] << 0);
# endif
printf("%x\n", mynum);
That prints 30201 which I think is what you want. The key is to realize that you have to shift the bytes differently per-platform: you can't easily use ntohl() because you don't know where to put the extra zero byte.
