store 2 signed shorts in one unsigned int - c

This is given:
signed short a, b;
a = -16;
b = 340;
Now I want to store these 2 signed shorts in one unsigned int and later retrieve these 2 signed shorts again. I tried this but the resulting shorts are not the same:
unsigned int c = a << 16 | b;
signed short ar, br;
ar = c >> 16;
br = c & 0xFFFF;

OP almost had it right
#include <assert.h>
#include <limits.h>
unsigned ab_to_c(signed short a, signed short b) {
assert(SHRT_MAX == 32767);
assert(UINT_MAX == 4294967295);
// unsigned int c = a << 16 | b; fails as `b` gets sign extended before the `|`.
// *1u ensures the shift of `a` is done as `unsigned` to avoid UB
// of shifting into the sign bit.
unsigned c = (a*1u << 16) | (b & 0xFFFF);
return c;
}
void c_to_ab(unsigned c, signed short *a, signed short *b) {
*a = c >> 16;
*b = c & 0xFFFF;
}
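For comparison, here is a minimal sketch of the same round trip using the fixed-width types from <stdint.h>, which makes the width assertions unnecessary. The names pack_i16_pair, u16_to_i16, unpack_hi, unpack_lo and check_roundtrip are made up for this illustration; the helper converts back to a signed value without relying on an implementation-defined narrowing conversion.

```c
#include <assert.h>
#include <stdint.h>

/* Pack two 16-bit signed values into one 32-bit word:
   a goes into the high half, b into the low half. */
uint32_t pack_i16_pair(int16_t a, int16_t b)
{
    /* Converting to uint16_t first keeps only the low 16 bits,
       so a negative b cannot spill into the high half. */
    return ((uint32_t)(uint16_t)a << 16) | (uint16_t)b;
}

/* Map a 16-bit pattern back to its two's complement value. */
int16_t u16_to_i16(uint16_t u)
{
    /* Patterns below 0x8000 are non-negative; the rest mean u - 65536. */
    return (u < 0x8000u) ? (int16_t)u : (int16_t)((int32_t)u - 0x10000);
}

int16_t unpack_hi(uint32_t c) { return u16_to_i16((uint16_t)(c >> 16)); }
int16_t unpack_lo(uint32_t c) { return u16_to_i16((uint16_t)(c & 0xFFFFu)); }

/* Self-check with the values from the question. */
void check_roundtrip(void)
{
    uint32_t c = pack_i16_pair(-16, 340);
    assert(unpack_hi(c) == -16);
    assert(unpack_lo(c) == 340);
}
```

Calling check_roundtrip() from main should pass silently on any platform that provides int16_t and uint32_t.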

Since a has a negative value,
unsigned int c = a << 16 | b;
results in undefined behavior.
From the C99 standard (emphasis mine):
6.5.7 Bitwise shift operators
4 The result of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are filled with zeros. If E1 has an unsigned type, the value of the result is E1 × 2^E2, reduced modulo one more than the maximum value representable in the result type. If E1 has a signed type and nonnegative value, and E1 × 2^E2 is representable in the result type, then that is the resulting value; otherwise, the behavior is undefined.
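You can explicitly cast each signed short to unsigned short, and then to unsigned int before shifting, to get predictable behavior. The second cast matters: an unsigned short is promoted to int, so shifting it left by 16 could still overflow a 32-bit int.
#include <stdio.h>
int main()
{
signed short a, b;
a = -16;
b = 340;
unsigned int c = (unsigned int)(unsigned short)a << 16 | (unsigned short)b;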
signed short ar, br;
ar = c >> 16;
br = c & 0xFFFF;
printf("ar: %hd, br: %hd\n", ar, br);
}
Output:
ar: -16, br: 340
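The rule quoted from 6.5.7 can be turned into a small guard. The sketch below is only an illustration: shl_int_checked is a hypothetical helper, and it assumes int has no padding bits, so that sizeof(int) * CHAR_BIT is its width.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Performs v << n only when the standard defines the result.
   Returns false (and leaves *out untouched) otherwise. */
bool shl_int_checked(int v, unsigned n, int *out)
{
    assert(out != NULL);
    if (v < 0)
        return false;               /* negative left operand: undefined */
    if (n >= sizeof(int) * CHAR_BIT)
        return false;               /* shift count >= width: undefined */
    if (v > (INT_MAX >> n))
        return false;               /* v * 2^n is not representable in int */
    *out = v << n;
    return true;
}
```

With this guard, shl_int_checked(-16, 16, &r) is rejected, which is exactly the case a << 16 in the question.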

This is odd: I compiled your code and it works for me.
It may be undefined behavior, though, since left-shifting a negative value is not defined by the standard, so the result can differ between compilers.
If I were you, I'd add explicit casts so that the shift is done on an unsigned value and no bits are lost to two's complement sign extension or implicit conversions.
I'd also mask b, so that a negative b cannot overwrite the upper 16 bits.
Try this:
unsigned int c = ((unsigned int) a) << 16 | (b & 0xFFFFu);

This is because you are using an unsigned int, which is usually 32 bits, and a negative signed short, which is usually 16 bits.
When a short with a negative value is converted to unsigned int, the value wraps around modulo 2^32, which sets all of the upper bits.
And so you get a vastly different number in the unsigned int.
Storing two non-negative numbers would avoid this problem, but since you need to store a negative one, the value has to be masked to 16 bits first.
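To make the effect concrete, here is a small sketch contrasting an unmasked combine with a masked one. The function names are hypothetical, and the fixed-width types from <stdint.h> are used so that the bit patterns are the same on every platform.

```c
#include <assert.h>
#include <stdint.h>

/* Unmasked combine: b is widened to 32 bits as-is, so a negative b
   arrives with its upper 16 bits set and overwrites a's half. */
uint32_t pack_unmasked(int16_t a, int16_t b)
{
    return ((uint32_t)a << 16) | (uint32_t)b;
}

/* Masked combine: only the low 16 bits of b reach the result. */
uint32_t pack_masked(int16_t a, int16_t b)
{
    return ((uint32_t)a << 16) | ((uint32_t)b & 0xFFFFu);
}

void demo_sign_extension(void)
{
    assert(pack_unmasked(1, -2) == 0xFFFFFFFEu); /* the 1 in the high half is lost */
    assert(pack_masked(1, -2)   == 0x0001FFFEu); /* both halves survive */
}
```

The question's values (a = -16, b = 340) do not show the difference between the two functions, because b is positive there; the problem in the original code comes from shifting the negative a as a signed int.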

I'm not sure whether this approach is good for portability or anything else, but I use...
#ifndef STDIO_H
#define STDIO_H
#include <stdio.h>
#endif
#ifndef STDINT_H
#define STDINT_H
#include <stdint.h>
#endif
#ifndef BOOLEAN_TE
#define BOOLEAN_TE
typedef enum {false, true} bool;
#endif
#ifndef UINT32_WIDTH
#define UINT32_WIDTH 32 // defined in stdint.h, inttypes.h even in libc.h... undefined ??
#endif
typedef struct{
struct{ // anonymous struct
uint32_t x;
uint32_t y;
};}ts_point;
typedef struct{
struct{ // anonymous struct
uint32_t line;
uint32_t column;
};}ts_position;
bool is_little_endian()
{
uint16_t n = 1; // must be wider than one byte, otherwise the test is always true
return *(unsigned char *)&n == 1;
}
int main(void)
{
uint32_t x, y;
uint64_t packed;
ts_point *point;
ts_position *position;
x = -12;
y = 3457254;
printf("at start: x = %i | y = %i\n", x, y);
if (is_little_endian()){
packed = (uint64_t)y << UINT32_WIDTH | (uint64_t)x;
}else{
packed = (uint64_t)x << UINT32_WIDTH | (uint64_t)y;
}
printf("packed: position = %llu\n", packed);
point = (ts_point*)&packed;
printf("unpacked: x = %i | y = %i\n", point->x, point->y); // access via pointer
position = (ts_position*)&packed;
printf("unpacked: line = %i | column = %i\n", position->line, position->column);
return 0;
}
I like this approach because it is quite readable and can be applied in many ways, e.g. 02x32, 04x16, 08x08, etc. I'm new to C, so feel free to critique my code and my way of doing things. Thanks.
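As an alternative to reading the packed value through a struct pointer, the halves can be taken apart with shifts, which gives the same result regardless of byte order. This is only a sketch under that design choice; pack_u32_pair, high_u32, low_u32 and demo_pack_u32 are names invented here.

```c
#include <assert.h>
#include <stdint.h>

/* Combine two 32-bit values into one 64-bit value: hi in bits 32..63. */
uint64_t pack_u32_pair(uint32_t hi, uint32_t lo)
{
    /* Widen before shifting so that no bits of hi are lost. */
    return ((uint64_t)hi << 32) | lo;
}

uint32_t high_u32(uint64_t packed) { return (uint32_t)(packed >> 32); }
uint32_t low_u32(uint64_t packed)  { return (uint32_t)(packed & 0xFFFFFFFFu); }

void demo_pack_u32(void)
{
    uint32_t x = (uint32_t)(-12);   /* same starting values as the code above */
    uint32_t y = 3457254u;
    uint64_t packed = pack_u32_pair(x, y);
    assert(high_u32(packed) == x);
    assert(low_u32(packed) == y);
}
```

Because the layout is defined by arithmetic rather than by memory order, no is_little_endian() test is needed with this variant.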

Related

Why left shift 24 bits changed the value of unsigned long in C?

I expect 0b11010010 << 24 should be the same value as 0b11010010000000000000000000000000.
When I tested this in C, 0b11010010 << 24 did not give the expected value once it was stored in an unsigned long.
Does anyone know why unsigned long behaves like this in C?
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
int main(){
unsigned long a = 0b11010010000000000000000000000000;
unsigned long b = 0b11010010 << 24;
bool isTheSame1 = a == b;
printf("isTheSame1 %d \n",isTheSame1);
bool isTheSame2 = 0b11010010000000000000000000000000 == (0b11010010 << 24);
printf("isTheSame2 %d",isTheSame2);
}
isTheSame1 should be 1, but it prints 0, as follows:
isTheSame1 0
isTheSame2 1
Compiled and executed by gcc main.c && ./a.out
gcc --version
Apple clang version 14.0.0 (clang-1400.0.29.202)
Target: x86_64-apple-darwin22.2.0
Thread model: posix
Updated
As Allan Wind pointed out, I added UL suffix and now it works as expected.
unsigned long a = 0b11010010000000000000000000000000UL;
unsigned long b = 0b11010010UL << 24;
bool isTheSame1 = a == b;
printf("isTheSame1 %d \n",isTheSame1);
bool isTheSame2 = 0b11010010000000000000000000000000UL == (0b11010010UL << 24);
printf("isTheSame2 %d",isTheSame2);
The constant 0b11010010 has type int which is signed. Assuming an int is 32 bits, the expression 0b11010010 << 24 will shift a "1" bit into the sign bit. Doing so triggers undefined behavior which is why you're getting strange results.
Add the UL suffix to the constant to give it type unsigned long, then the shift will work as expected.
unsigned long b = 0b11010010UL << 24;
You are doing a left shift of a signed value (see the good answer by @dbush).
In the absence of a suffix, an integer constant that fits in an int has type int, and a floating constant has type double:
b = 0b11010010 ; /* type int */
b = 1.0; /* type double */
If you want b in your example to be an unsigned long, use a suffix:
b = 0b11010010UL; /* type unsigned long */
or a cast:
b = (unsigned long)0b11010010; /* type unsigned long */
With 32-bit (or smaller) int, 0b11010010 << 24 is undefined behavior (UB). It attempts to shift into the sign bit.
When int is 32-bit (common), this often results in a negative value corresponding to the bit pattern 11010010-00000000-00000000-00000000.
When a negative value is saved as an unsigned long, ULONG_MAX + 1 is added to it. With a 64-bit unsigned long the value has the bit pattern:
11111111-11111111-11111111-11111111-11010010-00000000-00000000-00000000
This large unsigned long is not equal to 0b11010010000000000000000000000000UL, hence the output "isTheSame1 0".
Had OP's long been 32-bit, it "might" have worked as OP had intended, yet unfortunately it would still be relying on UB.
Appending an L
32-bit long: 0b11010010L << 24 suffers the same UB problem as above, yet might have "worked".
64-bit long: 0b11010010L has type long, and 0b11010010L << 24 becomes the value 0b11010010000000000000000000000000, the same value as a.
Appending an U
32-bit unsigned: 0b11010010U << 24 becomes the value 0b11010010000000000000000000000000, the same value as a.
16-bit unsigned: 0b11010010U << 24 is undefined behavior as the shift is too great. Often the UB results in the same as 0b11010010U << (24-16), yet this is not reliably done.
Appending an UL
32 or 64-bit unsigned long: 0b11010010UL << 24 becomes the value 0b11010010000000000000000000000000, the same value as a.
Since the left-hand side of the = below is unsigned long, it is better for the constant on the right-hand side to be unsigned long as well.
unsigned long b = 0b11010010 << 24; // Original
unsigned long b = 0b11010010UL << 24; // Better
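The two effects described above (doing the shift in an unsigned type, and what happens when a negative value is converted to unsigned long) can be sketched as follows. The helper names are hypothetical, and hexadecimal constants are used instead of 0b literals because binary constants are a compiler extension before C23.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Place an 8-bit value in bits 24..31; the shift is done on an
   unsigned 32-bit operand, so no sign bit is involved. */
uint32_t to_top_byte(uint8_t v)
{
    return (uint32_t)v << 24;
}

/* Converting a negative value to unsigned long adds ULONG_MAX + 1 to it. */
unsigned long as_unsigned_long(long v)
{
    return (unsigned long)v;
}

void demo_unsigned_shift(void)
{
    assert(to_top_byte(0xD2) == 0xD2000000u);     /* 0xD2 is 0b11010010 */
    assert(as_unsigned_long(-1L) == ULONG_MAX);   /* all bits set, whatever the width */
}
```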

Convert signed int of variable bit size

I have a number of bits (the number of bits can change) in an unsigned int (uint32_t). For example (12 bits in the example):
uint32_t a = 0xF9C;
The bits represent a signed int of that length.
In this case the number in decimal should be -100.
I want to store the variable in a signed variable and get its actual value.
If I just use:
int32_t b = (int32_t)a;
it will be just the value 3996, since it gets casted to (0x00000F9C) but it actually needs to be (0xFFFFFF9C)
I know one way to do it:
union test
{
signed temp :12;
};
union test x;
x.temp = a;
int32_t result = (int32_t) x.temp;
Now I get the correct value, -100.
But is there a better way to do it?
My solution is not very flexible; as I mentioned, the number of bits can vary (anything between 1 and 64 bits).
But is there a better way to do it?
Well, depends on what you mean by "better". The example below shows a more flexible way of doing it as the size of the bit field isn't fixed. If your use case requires different bit sizes, you could consider it a "better" way.
unsigned sign_extend(unsigned x, unsigned num_bits)
{
unsigned f = ~((1u << (num_bits-1)) - 1); // 1u keeps the shift unsigned, so the top bit can be reached without UB
if (x & f) x = x | f;
return x;
}
int main(void)
{
int x = sign_extend(0xf9c, 12);
printf("%d\n", x);
int y = sign_extend(0x79c, 12);
printf("%d\n", y);
}
Output:
-100
1948
A branch free way to sign extend a bitfield (Henry S. Warren Jr., CACM v20 n6 June 1977) is this:
// value i of bit-length len is a bitfield to sign extend
// i is right aligned and zero-filled to the left
sext = 1 << (len - 1);
i = (i ^ sext) - sext;
UPDATE based on @Lundin's comment
Here's tested code (prints -100):
#include <stdio.h>
#include <stdint.h>
int32_t sign_extend (uint32_t x, int32_t len)
{
int32_t i = (x & ((1u << len) - 1)); // or just x if you know there are no extraneous bits
int32_t sext = 1 << (len - 1);
return (i ^ sext) - sext;
}
int main(void)
{
printf("%d\n", sign_extend(0xF9C, 12));
return 0;
}
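Since the question mentions field widths of up to 64 bits, here is a hedged sketch of the same xor-and-subtract idea carried out entirely in uint64_t, so that a width of 64 is handled as well. sign_extend64 is a name chosen for this illustration, and the final conversion is written to avoid an out-of-range unsigned-to-signed cast.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low `len` bits of x (1 <= len <= 64). */
int64_t sign_extend64(uint64_t x, unsigned len)
{
    assert(len >= 1 && len <= 64);
    uint64_t sign = UINT64_C(1) << (len - 1);  /* the field's sign bit */
    uint64_t mask = (sign << 1) - 1;           /* low len bits; wraps to all ones when len == 64 */
    uint64_t r = ((x & mask) ^ sign) - sign;   /* xor-and-subtract, modulo 2^64 */
    /* r holds the two's complement pattern; convert it without overflow. */
    return (r <= (uint64_t)INT64_MAX) ? (int64_t)r : -(int64_t)(~r) - 1;
}

void demo_sign_extend64(void)
{
    assert(sign_extend64(0xF9C, 12) == -100);
    assert(sign_extend64(0x79C, 12) == 1948);
}
```

Unsigned arithmetic wraps by definition, so the subtraction cannot overflow the way the int32_t version could for len == 32.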
This relies on the implementation-defined behavior of sign extension when right-shifting negative signed integers. First you shift your unsigned integer left until the field's sign bit becomes the MSB, then you cast it to a signed integer and shift back:
#include <stdio.h>
#include <stdint.h>
#define NUMBER_OF_BITS 12
int main(void) {
uint32_t x = 0xF9C;
int32_t y = (int32_t)(x << (32-NUMBER_OF_BITS)) >> (32-NUMBER_OF_BITS);
printf("%d\n", y);
return 0;
}
This is a solution to your problem:
int32_t sign_extend(uint32_t x, uint32_t bit_size)
{
// The expression (0xffffffff << bit_size) will fill the upper bits to sign extend the number.
// The expression (-(x >> (bit_size-1))) is a mask that will zero the previous expression in case the number was positive (to avoid having an if statement).
return (0xffffffff << bit_size) & (-(x >> (bit_size-1))) | x;
}
int main()
{
printf("%d\n", sign_extend(0xf9c, 12)); // -100
printf("%d\n", sign_extend(0x7ff, 12)); // 2047
return 0;
}
The sane, portable and effective way to do this is simply to mask out the data part, then fill up everything else with 0xFF... to get a proper 2's complement representation. You need to know how many bits make up the data part.
We can mask out the data with (1u << data_length) - 1.
In this case, with data_length = 8, the data mask becomes 0xFF. Let's call this data_mask.
Thus the data part of the number is a & data_mask.
The rest of the number needs to be filled with ones. That is, everything not part of the data mask. Simply do ~data_mask to achieve that.
C code: a = (a & data_mask) | ~data_mask. Now a is a proper 32-bit 2's complement value. Note that this fill is only correct when the field is negative, i.e. its top bit is set; a non-negative field should be left as a & data_mask.
Example:
#include <stdio.h>
#include <inttypes.h>
int main(void)
{
const uint32_t data_length = 8;
const uint32_t data_mask = (1u << data_length) - 1;
uint32_t a = 0xF9C;
a = (a & data_mask) | ~data_mask;
printf("%"PRIX32 "\t%"PRIi32, a, (int32_t)a);
}
Output:
FFFFFF9C -100
This relies on int being 32 bits 2's complement but is otherwise fully portable.

Bizarre right bitshift inconsistency

I've been working with bits in C (running on ubuntu). In using two different ways to right shift an integer, I got oddly different outputs:
#include <stdio.h>
int main(){
int x = 0xfffffffe;
int a = x >> 16;
int b = 0xfffffffe >> 16;
printf("%X\n%X\n", a, b);
return 0;
}
I would think the output would be the same for each: FFFF, because the right four hex places (16 bits) are being rightshifted away. Instead, the output is:
FFFFFFFF
FFFF
What explains this behaviour?
When you say:
int x = 0xfffffffe;
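That sets x to -2, because the maximum value an int can hold here is 0x7FFFFFFF and the value wraps around during conversion. Right-shifting a negative number is implementation-defined; most compilers perform an arithmetic shift that copies the sign bit, which is where the extra F digits come from.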
If you change those values to unsigned int it all works out.
#include <stdio.h>
int main(){
unsigned int x = 0xfffffffe;
unsigned int a = x >> 16;
unsigned int b = 0xfffffffe >> 16;
printf("%X\n%X\n", a, b);
return 0;
}
The behaviour you see here has to do with shifting on signed or unsigned integers which give different results.
Shifts on unsigned integers are logical. In contrast, shifts on signed integers are usually arithmetic. EDIT: In C, this is implementation-defined for negative values, but it is generally the case.
Consequently,
int x = 0xfffffffe;
int a = x >> 16;
this part performs an arithmetic shift because x is signed. And because x is actually negative (-2 in two's complement), x is sign extended, so '1's are appended which results in 0xFFFFFFFF.
On the contrary,
int b = 0xfffffffe >> 16;
0xfffffffe is a literal that does not fit in a 32-bit int, so it has type unsigned int. Therefore a logical shift of 16 results in 0x0000FFFF, as expected.
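The difference between the two kinds of shift can be sketched with unsigned operations only, which keeps the result independent of what a particular compiler does for negative operands. Both function names are hypothetical; each returns the resulting 32-bit pattern.

```c
#include <assert.h>
#include <stdint.h>

/* Logical right shift of x's bit pattern: zeros come in from the left. */
uint32_t shr_logical32(int32_t x, unsigned n)
{
    assert(n < 32);
    return (uint32_t)x >> n;
}

/* Arithmetic right shift of x's bit pattern: copies of the sign bit
   come in from the left. Built from unsigned shifts only. */
uint32_t shr_arithmetic32_bits(int32_t x, unsigned n)
{
    assert(n < 32);
    uint32_t ux = (uint32_t)x;
    /* For negative x: invert, shift zeros in, invert back (zeros become ones). */
    return (x < 0) ? (uint32_t)~((uint32_t)~ux >> n) : ux >> n;
}

void demo_shift_kinds(void)
{
    assert(shr_logical32(-2, 16) == 0x0000FFFFu);         /* matches b in the question */
    assert(shr_arithmetic32_bits(-2, 16) == 0xFFFFFFFFu); /* matches a in the question */
}
```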

right left shift bits in C

I did a small test on bit shifting in C, and all of the shifts by 0, 8, 16 bits are OK and I understood what's happening.
But the 32-bit right and left shifts are not clear to me. The variable I'm testing with is 32 bits long.
Then I changed the variables that hold the results of the 32-bit shifts to 64 bits, but the 32-bit right and left shifts still give the same result!
Here's my code:
#include <stdlib.h>
#include <string.h>
#include <stdio.h>
#include <inttypes.h>
int main() {
uint32_t code = 0xCDBAFFEE;
uint64_t bit32R = code >> 32;
uint16_t bit16R = code >> 16;
uint8_t bit8R = code >> 8;
uint8_t bit0R = code >> 0;
uint64_t bit32L = code << 32;
uint16_t bit16L = code << 16;
uint8_t bit8L = code << 8;
uint8_t bit0L = code << 0;
printf("Right shift:\nbit32R %.16x\nbit16R %x\nbit8R %x\nbit0R %x\n\n",
bit32R, bit16R, bit8R, bit0R);
printf("Left shift:\nbit32L %.16x\nbit16L %x\nbit8L %x\nbit0L %x\n\n",
bit32L, bit16L, bit8L, bit0L);
}
Here's the result I get:
Right shift:
bit32R 00000000cdbaffee
bit16R 0
bit8R cdba
bit0R ff
Left shift:
bit32L 00000000cdbaffee
bit16L 0
bit8L 0
bit0L 0
Process returned 61 (0x3D) execution time : 0.041 s
Press any key to continue.
Shifting an integer (right or left) by a number of bits equal to or greater than its width is undefined behavior.
C11 6.5.7 Bitwise shift operators
Syntax
shift-expression:
    additive-expression
    shift-expression << additive-expression
    shift-expression >> additive-expression
Constraints
Each of the operands shall have integer type.
Semantics
The integer promotions are performed on each of the operands. The type of the result is that of the promoted left operand. If the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined.
The result of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are filled with zeros. If E1 has an unsigned type, the value of the result is E1 × 2^E2, reduced modulo one more than the maximum value representable in the result type. If E1 has a signed type and nonnegative value, and E1 × 2^E2 is representable in the result type, then that is the resulting value; otherwise, the behavior is undefined.
The result of E1 >> E2 is E1 right-shifted E2 bit positions. If E1 has an unsigned type or if E1 has a signed type and a nonnegative value, the value of the result is the integral part of the quotient of E1 / 2^E2. If E1 has a signed type and a negative value, the resulting value is implementation-defined.
The size of int on your platform seems to be at most 32 bits, so the initializers for bit32R and bit32L have undefined behavior.
The 64-bit expressions should be written:
uint64_t bit32R = (uint64_t)code >> 32;
and
uint64_t bit32L = (uint64_t)code << 32;
Furthermore, the formats used in printf are not correct for the arguments passed (unless int has 64 bits, which would produce different output).
Your compiler does not seem to be fully C99 compliant, you should add a final return 0; statement at the end of the body of function main().
Here is a corrected version:
#include <stdlib.h>
#include <string.h>
#include <stdio.h>
#include <inttypes.h>
int main(void) {
uint32_t code = 0xCDBAFFEE;
uint64_t bit32R = (uint64_t)code >> 32;
uint16_t bit16R = code >> 16;
uint8_t bit8R = code >> 8;
uint8_t bit0R = code >> 0;
uint64_t bit32L = (uint64_t)code << 32;
uint16_t bit16L = code << 16;
uint8_t bit8L = code << 8;
uint8_t bit0L = code << 0;
printf("Right shift:\n"
"bit32R %.16"PRIx64"\n"
"bit16R %"PRIx16"\n"
"bit8R %"PRIx8"\n"
"bit0R %"PRIx8"\n\n",
bit32R, bit16R, bit8R, bit0R);
printf("Left shift:\n"
"bit32L %.16"PRIx64"\n"
"bit16L %"PRIx16"\n"
"bit8L %"PRIx8"\n"
"bit0L %"PRIx8"\n\n",
bit32L, bit16L, bit8L, bit0L);
return 0;
}
The output is:
Right shift:
bit32R 0000000000000000
bit16R cdba
bit8R ff
bit0R ee
Left shift:
bit32L cdbaffee00000000
bit16L 0
bit8L 0
bit0L ee
This might not be what you expect, because the types of the variables are somewhat inconsistent.
One problem is that you are using %x to print a 64-bit integer. You should use the correct format specifier for each variable. There are macros for this available:
#define __STDC_FORMAT_MACROS
#include <inttypes.h>
// ...
printf("64 bit result: %" PRIx64 "\n", bit32R);
printf("16 bit result: %" PRIx16 "\n", bit16R);
printf("8 bit result: %" PRIx8 "\n", bit8R);
More information can be found here.
You are not doing a 64-bit shift there: code is a uint32_t, so the compiler performs the shift in 32 bits. Also, you should tell printf to expect a long long (which is what uint64_t is on many platforms, though not all).
#include <stdint.h>
#include <stdio.h>
int main ()
{
uint32_t code = 0xCDBAFFEE;
uint64_t bit32R=((uint64_t)code)>>32;
uint16_t bit16R=code>>16;
uint8_t bit8R=code>>8;
uint8_t bit0R=code>>0;
uint64_t bit32L=((uint64_t)code)<<32;
uint16_t bit16L=code<<16;
uint8_t bit8L=code<<8;
uint8_t bit0L=code<<0;
printf("Right shift:\nbit32R %llx\nbit16R %x\nbit8R %x\nbit0R %x\n\n", bit32R,bit16R,bit8R,bit0R);
printf("Left shift:\nbit32L %llx\nbit16L %x\nbit8L %x\nbit0L %x", bit32L,bit16L,bit8L,bit0L);
}
Result is:
Right shift:
bit32R 0
bit16R cdba
bit8R ff
bit0R ee
Left shift:
bit32L cdbaffee00000000
bit16L 0
bit8L 0
bit0L ee
You should use the macros defined in inttypes.h if you have C99 compliant compiler, sadly some platforms do not have those definitions. Format descriptors for printf are platform-dependent.
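Since a shift count equal to or greater than the operand's width is undefined, code that takes the count from a variable may want a guard. The following is a minimal sketch with made-up helper names; returning 0 for oversized counts is a design choice made here, not something the standard prescribes.

```c
#include <assert.h>
#include <stdint.h>

/* 32-bit shifts that return 0 for counts of 32 or more,
   instead of running into undefined behavior. */
uint32_t shl32_or_zero(uint32_t v, unsigned n) { return (n < 32) ? v << n : 0; }
uint32_t shr32_or_zero(uint32_t v, unsigned n) { return (n < 32) ? v >> n : 0; }

/* Widening first keeps the bits that a 32-bit shift would discard. */
uint64_t shl32_wide(uint32_t v, unsigned n)
{
    assert(n < 64);
    return (uint64_t)v << n;
}

void demo_guarded_shifts(void)
{
    uint32_t code = 0xCDBAFFEEu;
    assert(shr32_or_zero(code, 32) == 0);
    assert(shr32_or_zero(code, 16) == 0xCDBAu);
    assert(shl32_wide(code, 32) == UINT64_C(0xCDBAFFEE00000000));
}
```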

Concatenate two 32bit numbers to get a 64bit result

I need to concatenate two hexadecimal numbers of 32 bits each, to get a final result of 64 bits.
I tried the following code but didn't get a good result:
unsigned long a,b;
unsigned long long c;
c = (unsigned long long) (a << 32 | b);
Can anybody help me please?
Thanks.
Use proper fixed size types and be careful about type promotion and operator precedence, e.g.
#include <stdint.h>
uint32_t a, b;
uint64_t c;
c = ((uint64_t)a << 32) | b;
You need to cast a to unsigned long long before shifting it:
unsigned long long c = ((unsigned long long)a << 32 | b);
Shortest form is:
c = a+0ULL<<32|b;
The third line should be changed to
((unsigned long long)a) << 32 | ((unsigned long long) b)
What your current code does is take the 32-bit variable a and shift it 32 bits to the left, then OR it with the 32-bit variable b. Shifting a 32-bit value by 32 bits is undefined behavior in C; in practice the bits of a are lost (or a comes back unchanged), so the result is never the 64-bit value you want.
What the changed version does is cast the 32-bit variable a to 64 bits, shift it 32 bits to the left, cast the 32-bit variable b to 64 bits, then OR the two 64-bit values together. The result is naturally 64 bits.
I would imagine that this would do the trick:
typedef unsigned long U64 ; // your unsigned 64-bit int typedef here
typedef unsigned int U32 ; // your unsigned 32-bit int typedef here
U64 join( U32 a , U32 b )
{
U64 result = ((U64)a) << 32
| ((U64)b)
;
return result ;
}
I'll leave to you to divine the appropriate typedefs for U64 and U32.
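One more point worth sketching: the question declares a and b as unsigned long, which is 64 bits wide on some platforms, so stray upper bits in either argument could leak into the result. The hypothetical helper below masks both inputs to 32 bits before combining them.

```c
#include <assert.h>
#include <stdint.h>

/* Concatenate the low 32 bits of hi and lo into one 64-bit value. */
uint64_t concat_low32(unsigned long hi, unsigned long lo)
{
    uint64_t h = hi & 0xFFFFFFFFUL;   /* drop anything above bit 31 */
    uint64_t l = lo & 0xFFFFFFFFUL;
    return (h << 32) | l;             /* h is already 64 bits wide here */
}

void demo_concat(void)
{
    assert(concat_low32(0x12345678UL, 0x9ABCDEF0UL)
           == UINT64_C(0x123456789ABCDEF0));
}
```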

Resources