Converting unsigned chars to signed integer - c

I have an unsigned char array with 2 elements that represents a signed integer. How can I convert these 2 bytes into a signed integer?
Edit: The unsigned char array is in little-endian order.

For maximum safety, use
int i = *(signed char *)(&c[0]);
i *= 1 << CHAR_BIT;
i |= c[1];
for big endian. Swap c[0] and c[1] for little endian.
(Explanation: we interpret the byte at c[0] as a signed char, then shift it left by one byte portably, multiplying by 1 << CHAR_BIT to avoid the pitfalls of left-shifting a negative value, and finally OR in c[1]. CHAR_BIT comes from <limits.h>.)
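A minimal, self-contained sketch of the little-endian variant, assuming 8-bit bytes and a two's complement int (the function name and test value are illustrative, not from the question):
#include <limits.h>
#include <stdio.h>

int int16_from_le_bytes(unsigned char c[2])
{
    int i = *(signed char *)(&c[1]);   /* high byte carries the sign */
    i *= 1 << CHAR_BIT;                /* portable "shift left by one byte" */
    i |= c[0];                         /* OR in the low byte */
    return i;
}

int main(void)
{
    unsigned char c[2] = { 0x18, 0xFC };   /* 0xFC18 == -1000, little endian */
    printf("%d\n", int16_from_le_bytes(c));
    return 0;
}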

Wrap them up in a union:
union {
    unsigned char a[2];
    int16_t smt;
} number;
Now, after filling the array, you can use this as number.smt (int16_t comes from <stdint.h>).
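A short usage sketch; note that the union punning assumes the bytes are already in the machine's native byte order, so the result differs between little- and big-endian hosts:
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    union {
        unsigned char a[2];
        int16_t smt;
    } number;

    number.a[0] = 0x18;   /* low byte on a little-endian machine */
    number.a[1] = 0xFC;   /* high byte: the pattern 0xFC18 == -1000 */
    printf("%d\n", (int)number.smt);   /* prints -1000 on little-endian hosts */
    return 0;
}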

It depends on endianness.
Something for big endian:
unsigned char x[2];
short y = (x[0] << 8) | x[1];
Something for little endian:
unsigned char x[2];
short y = (x[1] << 8) | x[0];
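A quick test of the big-endian variant (a sketch). Note that when the high bit of the first byte is set, the intermediate value exceeds SHRT_MAX and the conversion to short is implementation-defined; the portable answer below avoids that:
#include <stdio.h>

int main(void)
{
    unsigned char x[2] = { 0x03, 0xE8 };   /* 0x03E8 == 1000, big endian */
    short y = (x[0] << 8) | x[1];
    printf("%d\n", y);                     /* prints 1000 */
    return 0;
}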

The portable solution:
unsigned char c[2];
long tmp;
int result;
tmp = (long)c[0] << 8 | c[1];
if (tmp < 32768)
    result = tmp;
else
    result = tmp - 65536;
This assumes that the bytes in the array represent a 16 bit, two's complement, big endian signed integer. If they are a little endian integer, just swap c[1] and c[0].
(In the highly unlikely case that it is ones' complement, use 65535 instead of 65536 as the value to subtract. Sign-magnitude is left as an exercise for the reader ;)
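Wrapped up as a function, the same idea might look like this (the function name is illustrative; the bytes are assumed to be a big-endian, 16-bit two's complement value as above):
#include <stdio.h>

int int16_from_be_bytes(const unsigned char c[2])
{
    long tmp = (long)c[0] << 8 | c[1];
    return tmp < 32768 ? (int)tmp : (int)(tmp - 65536);
}

int main(void)
{
    unsigned char c[2] = { 0xFC, 0x18 };   /* 0xFC18 == -1000, big endian */
    printf("%d\n", int16_from_be_bytes(c));
    return 0;
}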

Related

Convert a 4 byte data to int in C

I need to convert 4-byte data, which is in the format below, to the original int value. I cannot change the assignment of the int to the 4 bytes below.
#include <stdio.h>

int main() {
    //code
    int num = 1000;
    char a[4];
    a[0] = ( char )(num>>24) ;
    a[1] = ( char )(num>>16) ;
    a[2] = ( char )(num>>8) ;
    a[3] = ( char )num ;
    printf("Original number is:%d\n", (a[0] << 24 | a[1] << 16 | a[2] << 8 | a[3] ) );
    return 0;
}
I was expecting the output to be 1000, but the output is 768. How do we restore the original number from the above byte array? Is this an endianness issue?
a[0] = ( char )(num>>24) ;
That works “okay” in this example. However, in situations where num is negative, the result is implementation-defined (C 2018 6.5.7 5).
In the remaining assignments to a[1], a[2], and a[3], values that may exceed the range of char will be converted to char automatically. If char is signed, the results of these conversions are implementation-defined or a signal is raised (6.3.1.3 3). So that is a problem we will have to fix, below.
First, for num = 1000, let’s suppose that −24 is stored in a[3]. This is the result we would get by taking the low eight bits of 1000 and putting them in an eight-bit two’s complement char, which is likely what your implementation uses. Then, we have a[0] = 0, a[1] = 0, a[2] = 3, and a[3] = −24.
Now let’s consider a[0] << 24 | a[1] << 16 | a[2] << 8 | a[3].
a[0] << 24 and a[1] << 16 both yield 0. a[2] << 8 is 3 << 8, which produces 768, or 300 in hexadecimal. a[3] is −24. While a[3] is a char, it is promoted to an int when used in an expression (6.3.1.1 2). Still assuming your C implementation uses two’s complement, the binary for −24 is 11111111111111111111111111101000, or ffffffe8 in hexadecimal.
When we bitwise OR 300 and ffffffe8 (both hexadecimal), the result is ffffffe8, which, in a 32-bit two's complement int, is −24.
The easiest way to fix this is to change char a[4]; to unsigned char a[4];. That avoids any negative char values.
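For example, for num = 1000 the program already prints 1000 with just that change (a sketch; the casts are dropped because conversion to unsigned char is fully defined):
#include <stdio.h>

int main(void) {
    int num = 1000;
    unsigned char a[4];                    /* was: char a[4]; */
    a[0] = num >> 24;
    a[1] = num >> 16;
    a[2] = num >> 8;
    a[3] = num;
    printf("Original number is:%d\n", a[0] << 24 | a[1] << 16 | a[2] << 8 | a[3]);
    return 0;
}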
However, to make your code completely work for any value of int (assuming it is four bytes and two’s complement), we need to make some other changes:
unsigned char a[4];
/* Convert the signed num to unsigned before shifting.
Shifts of unsigned values are better defined than shifts
of signed values.
*/
a[0] = (unsigned) num >> 24;
a[1] = (unsigned) num >> 16;
a[2] = (unsigned) num >> 8;
a[3] = (unsigned) num;
/* The cast in the last assignment is not really needed since
we are assigning to an unsigned char, and it will be converted
as desired, but we keep it for uniformity.
*/
// Reconstruct the value using all unsigned values.
unsigned u = (unsigned) a[0] << 24 | (unsigned) a[1] << 16 | (unsigned) a[2] << 8 | a[3];
/* Copy the bits into an int. (Include <string.h> to get memcpy.)
Note: It is easy to go from signed to unsigned because the C standard
completely defines that conversion. For unsigned to signed, the
conversion is not completely defined, so we have to use some indirect
method to get the bits into an int.
*/
int i;
memcpy(&i, &u, sizeof i);
printf("Original number: %d.\n", i);
We need to use an unsigned value to reconstruct the bits because C’s shift operators are not well defined for signed values, especially when we want to shift a bit into the sign bit. Once we have the bits in the unsigned object, we can copy them into an int.
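Put together as a complete program (a sketch; as above, this assumes a four-byte, two's complement int and 8-bit bytes):
#include <stdio.h>
#include <string.h>

int main(void)
{
    int num = -1000;
    unsigned char a[4];

    a[0] = (unsigned) num >> 24;
    a[1] = (unsigned) num >> 16;
    a[2] = (unsigned) num >> 8;
    a[3] = (unsigned) num;

    unsigned u = (unsigned) a[0] << 24 | (unsigned) a[1] << 16 |
                 (unsigned) a[2] << 8 | a[3];

    int i;
    memcpy(&i, &u, sizeof i);              /* copy the bits into a signed int */
    printf("Original number: %d.\n", i);   /* prints -1000 */
    return 0;
}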

Algorithm to write two's complement integer in memory portably

Say I have the following:
int32 a = ...; // value of variable irrelevant; can be negative
unsigned char *buf = malloc(4); /* assuming octet bytes, this is just big
enough to hold an int32 */
Is there an efficient and portable algorithm to write the two's complement big-endian representation of a to the 4-byte buffer buf in a portable way? That is, regardless of how the machine we're running represents integers internally, how can I efficiently write the two's complement representation of a to the buffer?
This is a C question so you can rely on the C standard to determine if your answer meets the portability requirement.
Yes, you can certainly do it portably:
int32_t a = ...;
uint32_t b = a;
unsigned char *buf = malloc(sizeof a);
uint32_t mask = (1U << CHAR_BIT) - 1;             // one-byte mask
for (int i = 0; i < sizeof a; i++)
{
    int shift = CHAR_BIT * (sizeof a - i - 1);    // downshift amount to put next
                                                  // byte in low bits
    buf[i] = (b >> shift) & mask;                 // save current byte to buffer
}
At least, I think that's right. I'll make a quick test.
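A quick test of that loop might look like this (a sketch; like the question, it assumes octet bytes, and CHAR_BIT comes from <limits.h>):
#include <limits.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int32_t a = -1000;
    uint32_t b = a;                                /* well-defined wrap to unsigned */
    unsigned char *buf = malloc(sizeof a);
    uint32_t mask = (1U << CHAR_BIT) - 1;

    for (size_t i = 0; i < sizeof a; i++)
    {
        int shift = CHAR_BIT * (sizeof a - i - 1);
        buf[i] = (b >> shift) & mask;
    }

    for (size_t i = 0; i < sizeof a; i++)
        printf("%02x ", buf[i]);                   /* expect: ff ff fc 18 */
    printf("\n");
    free(buf);
    return 0;
}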
unsigned long tmp = a; // Converts to the "two's complement" bit pattern
unsigned char *buf = malloc(4);
buf[0] = tmp>>24 & 255;
buf[1] = tmp>>16 & 255;
buf[2] = tmp>>8 & 255;
buf[3] = tmp & 255;
You can drop the & 255 parts if you're assuming CHAR_BIT == 8.
If I understand correctly, you want to store the 4 bytes of an int32 inside a char buffer, in a specific order (e.g. lowest byte first), regardless of how the int32 is represented.
Let's first be clear about the assumptions: CHAR_BIT == 8, two's complement, and sizeof(int32) == 4.
No, there is NO portable way in your code, because you are trying to convert it to char instead of unsigned char. Storing a byte value in a plain char is implementation-defined.
But if you store it in an unsigned char array, there are portable ways. You can right-shift the value by 8 bits each time to form a byte of the resulting array, combining it with the bitwise AND operator & (a runnable sketch follows the list):
// a is unsigned
1st byte = a & 0xFF
2nd byte = a>>8 & 0xFF
3rd byte = a>>16 & 0xFF
4th byte = a>>24 & 0xFF
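As a runnable sketch (lowest byte first; the function name store_le32 is illustrative, not from the question):
#include <stdint.h>
#include <stdio.h>

/* Store a 32-bit value into buf, lowest byte first, regardless of how
   the machine represents integers internally. */
static void store_le32(unsigned char buf[4], uint32_t a)
{
    buf[0] = a & 0xFF;
    buf[1] = a >> 8 & 0xFF;
    buf[2] = a >> 16 & 0xFF;
    buf[3] = a >> 24 & 0xFF;
}

int main(void)
{
    unsigned char buf[4];
    store_le32(buf, (uint32_t)-1000);      /* two's complement pattern 0xFFFFFC18 */
    for (int i = 0; i < 4; i++)
        printf("%02x ", buf[i]);           /* expect: 18 fc ff ff */
    printf("\n");
    return 0;
}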

14-bit left-justified two's complement to a signed short

I have two bytes containing a 14-bit left-justified two's complement value, and I need to convert it to a signed short value (ranging from -8192 to +8191, I guess?)
What would be the fastest way to do that?
Simply divide by 4.
(Note: right-shifting a negative value is implementation-defined, so prefer division.)
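A sketch of that idea, assuming the two bytes arrive high byte first and that int is wider than 16 bits (the function name and test values are illustrative):
#include <stdio.h>

static short from_left_justified(unsigned char hi, unsigned char lo)
{
    int v = (hi << 8) | lo;     /* 16-bit left-justified pattern, 0..65535 */
    if (v >= 32768)
        v -= 65536;             /* reinterpret as a signed 16-bit value */
    return (short)(v / 4);      /* dividing by 4 drops the two padding bits */
}

int main(void)
{
    printf("%d\n", from_left_justified(0x80, 0x00));   /* prints -8192 */
    printf("%d\n", from_left_justified(0x7F, 0xFC));   /* prints  8191 */
    return 0;
}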
A portable solution:
short convert(unsigned char hi, unsigned char lo)
{
    int s = (hi << 6) | (lo >> 2);
    if (s >= 8192)
        s -= 16384;
    return s;
}

How do I convert and break a 2 byte integer into 2 different chars in C?

I want to convert an unsigned int and break it into 2 chars. For example, if the integer is 1, its binary representation would be 0000 0001. I want the 0000 part in one char variable and the 0001 part in another char variable. How do I achieve this in C?
If you insist that you have sizeof(int) == 2, then:
unsigned int x = (unsigned int)2; //or any other value it happens to be
unsigned char high = (unsigned char)(x>>8);
unsigned char low = x & 0xff;
If you have eight bits total (one byte) and you are breaking it into two 4-bit values:
unsigned char x=2;// or whatever
unsigned char high = (x>>4);
unsigned char low = x & 0xf;
Shift and mask off the part of the number you want. Unsigned ints are probably four bytes, and if you wanted all four bytes, you'd just shift by 16 and 24 for the higher order bytes.
unsigned char low = myuint & 0xff;
unsigned char high = (myuint >> 8) & 0xff;
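For example, extracting all four bytes of a (presumably 32-bit) unsigned int as described above (a sketch; the variable names are illustrative):
#include <stdio.h>

int main(void)
{
    unsigned int myuint = 0xDEADBEEF;         /* assumes a 32-bit unsigned int */
    unsigned char b0 = myuint & 0xff;         /* lowest byte:  0xEF */
    unsigned char b1 = (myuint >> 8) & 0xff;  /*               0xBE */
    unsigned char b2 = (myuint >> 16) & 0xff; /*               0xAD */
    unsigned char b3 = (myuint >> 24) & 0xff; /* highest byte: 0xDE */
    printf("%02x %02x %02x %02x\n", b3, b2, b1, b0);
    return 0;
}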
This assumes 16-bit ints (check with sizeof!). On my platform ints are 32-bit, so I will use a short in this code example. Mine wins the award for most disgusting in terms of pulling apart the pointer, but it is also the clearest for me to understand.
unsigned short number = 1;
unsigned char a;
a = *((unsigned char*)(&number)); // Grab char from first byte of the pointer to the int
unsigned char b;
b = *((unsigned char*)(&number) + 1); // Offset one byte from the pointer and grab second char
One method that works is as follows:
typedef union
{
    unsigned char c[sizeof(int)];
    int i;
} intchar__t;

intchar__t x;
x.i = 2;
Now x.c[] (an array) will reference the integer as a series of characters, although you will have byte endian issues. Those can be addressed with appropriate #define values for the platform you are programming on. This is similar to the answer that Justin Meiners provided, but a bit cleaner.
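A usage sketch that makes the endianness issue visible (which array index holds the low-order byte depends on the platform):
#include <stdio.h>

typedef union
{
    unsigned char c[sizeof(int)];
    int i;
} intchar__t;

int main(void)
{
    intchar__t x;
    x.i = 2;
    /* On a little-endian machine x.c[0] == 2; on a big-endian machine
       the 2 appears in the last element instead. */
    for (size_t k = 0; k < sizeof(int); k++)
        printf("%02x ", x.c[k]);
    printf("\n");
    return 0;
}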
unsigned short s = 0xFFEE;
unsigned char b1 = (s >> 8)&0xFF;
unsigned char b2 = (((s << 8)>> 8) & 0xFF);
Simplest I could think of.
int i = 1; // 2-byte integer value 0x0001
unsigned char byteLow = (i & 0x00FF);
unsigned char byteHigh = ((i & 0xFF00) >> 8);
The value in byteLow is 0x01 and the value in byteHigh is 0x00.

How can I cast a char to an unsigned int?

I have a char array that is really used as a byte array and not for storing text. In the array, there are two specific bytes that represent a numeric value that I need to store into an unsigned int value. The code below explains the setup.
char bytes[2];
bytes[0] = 0x0C; // For the sake of this example, I'm
bytes[1] = 0x88; // assigning random values to the char array.
unsigned int val = ???; // This needs to be the actual numeric
// value of the two bytes in the char array.
// In other words, the value should equal 0x0C88;
I can not figure out how to do this. I would assume it would involve some casting and recasting of the pointers, but I can not get this to work. How can I accomplish my end goal?
UPDATE
Thank you Martin B for the quick response; however, this doesn't work. Specifically, in my case the two bytes are 0x00 and 0xbc. Obviously what I want is 0x000000bc, but what I'm getting in my unsigned int is 0xffffffbc.
The code that was posted by Martin was my actual, original code and works fine as long as all of the bytes are less than 128 (i.e., positive signed char values).
unsigned int val = (unsigned char)bytes[0] << CHAR_BIT | (unsigned char)bytes[1];
This works if sizeof(unsigned int) >= 2 * sizeof(unsigned char) (not something guaranteed by the C standard).
Now... The interesting thing here is the order of the operators (after many years I can still remember only +, -, * and /... shame on me :-), so I always put in as many brackets as I can). [] binds tightest. Second is the (cast). Third is << and fourth is | (if you use + instead of |, remember that + binds more tightly than <<, so you'll need brackets).
We don't need to upcast the two (unsigned char) values to (unsigned int): integral promotion takes care of one of them, and the other gets there through the usual arithmetic conversions.
I'll add that if you want fewer headaches:
unsigned int val = (unsigned char)bytes[0] << CHAR_BIT;
val |= (unsigned char)bytes[1];
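A self-contained version of that (a sketch; CHAR_BIT comes from <limits.h>, and the byte values are the ones from the question's update):
#include <limits.h>
#include <stdio.h>

int main(void)
{
    char bytes[2];
    bytes[0] = 0x00;
    bytes[1] = (char)0xBC;   /* implementation-defined conversion, but typical
                                platforms store the bit pattern 0xBC */

    unsigned int val = (unsigned char)bytes[0] << CHAR_BIT;
    val |= (unsigned char)bytes[1];
    printf("0x%X\n", val);   /* prints 0xBC, not 0xFFFFFFBC */
    return 0;
}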
unsigned int val = (unsigned char) bytes[0]<<8 | (unsigned char) bytes[1];
The byte ordering depends on the endianness of your processor. You can do this, which will work on big- or little-endian machines (without ntohs it only works on big-endian):
unsigned int val = ntohs(*(uint16_t*)bytes);
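Note that ntohs is declared in <arpa/inet.h> on POSIX systems, and the pointer cast assumes bytes is suitably aligned for uint16_t; a memcpy-based variant sidesteps the alignment and aliasing questions (a sketch, with an illustrative function name):
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

unsigned int val_from_be(const char *bytes)
{
    uint16_t tmp;
    memcpy(&tmp, bytes, sizeof tmp);   /* copy the two bytes as-is */
    return ntohs(tmp);                 /* big-endian wire order -> host order */
}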
unsigned int val = (bytes[0] << 8) + bytes[1];
I think this is a better way to go about it than relying on pointer aliasing:
union {unsigned asInt; char asChars[2];} conversion;
conversion.asInt = 0;
conversion.asChars[0] = 0x0C;
conversion.asChars[1] = 0x88;
unsigned val = conversion.asInt;
