I managed to get an unsigned long int into a big-endian octet representation by reading about IPv4 methods, and I read that signed integers use the MSB as the sign indicator, which makes 00 00 00 00 equal 0 while 7F FF FF FF is 2147483647.
But I can't figure out how to do the same for signed long integers.
#include <stdio.h>

int main(void)
{
    unsigned long int intu32;
    unsigned char octets[4];

    intu32 = 255;
    octets[3] = intu32 & 255;          /* least significant byte */
    octets[2] = (intu32 >> 8) & 255;
    octets[1] = (intu32 >> 16) & 255;
    octets[0] = (intu32 >> 24) & 255;  /* most significant byte */
    printf("(%d)(%d)(%d)(%d)\n", octets[0], octets[1], octets[2], octets[3]);

    /* cast before shifting so octets[0] cannot be sign extended via int promotion */
    intu32 = ((unsigned long)octets[0] << 24) | ((unsigned long)octets[1] << 16) |
             ((unsigned long)octets[2] << 8) | octets[3];
    printf("intu32:%lu\n", intu32);
    return 0;
}
Thanks in advance,
Doori bar
There is no difference. You can always serialize/deserialize signed integers as if they were unsigned; the difference is only in the interpretation of the bits, not in the bits themselves.
Of course, this only holds true if you know that the unsigned and signed integers are the same size, so that no bits get lost.
Also, you need to be careful (as you are) that no intermediate stage does any unplanned sign extension or the like; the use of unsigned char for the individual bytes is a good idea.
You are probably confused by the fact that it is common practice (and what x86 processors do) to encode negative values using two's complement. This means that the hex notation of a 4-byte -1 is 0xFFFFFFFF. The reason this encoding is used is that, thanks to wraparound on overflow, adding 2 (0x00000002) to -1 (0xFFFFFFFF) yields the correct result (0x00000001).
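As an illustration, here is a minimal sketch (assuming a 32-bit payload and a two's complement wire format) showing that a negative value round-trips through exactly the same unsigned octet code; only the final reinterpretation differs:

#include <stdio.h>

int main(void)
{
    long int original = -2;   /* 0xFFFFFFFE as 32-bit two's complement */
    unsigned long int bits = (unsigned long int)original & 0xFFFFFFFFul;
    unsigned char octets[4];

    octets[3] = bits & 255;
    octets[2] = (bits >> 8) & 255;
    octets[1] = (bits >> 16) & 255;
    octets[0] = (bits >> 24) & 255;
    printf("(%02X)(%02X)(%02X)(%02X)\n", octets[0], octets[1], octets[2], octets[3]); /* (FF)(FF)(FF)(FE) */

    /* reassemble as unsigned, then decode the two's complement pattern portably */
    unsigned long int back = ((unsigned long)octets[0] << 24) | ((unsigned long)octets[1] << 16) |
                             ((unsigned long)octets[2] << 8) | octets[3];
    long int restored = (back > 0x7FFFFFFFul) ? -(long int)(0xFFFFFFFFul - back) - 1
                                              : (long int)back;
    printf("restored:%ld\n", restored);  /* restored:-2 */
    return 0;
}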
Do you want something like this? It would be helpful (as Vicki asked) if you could show what you have and what you want to get.
#include <stdio.h>

int main(void)
{
    /* Type punning through a union; this assumes a 4-byte long and, for the
       print order below, a little-endian host. */
    union {
        long int intu32;
        unsigned char octets[4];
    } u;

    u.intu32 = 255;
    printf("(%d)(%d)(%d)(%d)\n", u.octets[3], u.octets[2], u.octets[1], u.octets[0]);
    printf("intu32:%ld\n", u.intu32);
    return 0;
}
Related
I want to compose the number 0xAAEFCDAB from individual bytes. Everything goes well up to the fourth byte, and then for some reason four extra bytes get prepended to the result. What am I doing wrong?
#include <stdio.h>

int main(void) {
    unsigned long int a = 0;
    a = a | ((0xAB) << 0);
    printf("%lX\n", a);
    a = a | ((0xCD) << 8);
    printf("%lX\n", a);
    a = a | ((0xEF) << 16);
    printf("%lX\n", a);
    a = a | ((0xAA) << 24);
    printf("%lX\n", a);
    return 0;
}
Output:
AB
CDAB
EFCDAB
FFFFFFFFAAEFCDAB
Constants in C are actually typed, which might not be obvious at first, and the default type for an integer constant is int, which is a signed 32-bit integer (strictly it depends on the platform, but it probably is in your case).
In signed numbers, the highest bit describes the sign of the number: 1 is negative and 0 is positive (for more details you can read about two's complement).
When you perform the operation 0xAA << 24 it results in a 32-bit signed value of 0xAA000000, which is equal to 10101010 00000000 00000000 00000000 in binary. As you can see, the highest bit is set to 1, which means that the entire 32-bit signed number is actually negative.
In order to perform the | OR operation between a (which is a 64-bit unsigned number) and this 32-bit signed number, some type conversions must be performed. The size promotion is performed first, and the 32-bit signed value 0xAA000000 is promoted to the 64-bit signed value 0xFFFFFFFFAA000000, according to the rules of the two's complement system. This is a 64-bit signed number which has the same numerical value as the 32-bit one before conversion.
Afterwards, a conversion from 64-bit signed to 64-bit unsigned is performed so the value can be ORed with a. This fills the top bits with ones and results in the value you see on the screen.
In order to force your constants to be a different type than 32-bit signed int, you may use suffixes such as u and l, as shown in the website I linked at the beginning of my answer. In your case, a ul suffix should work best, indicating a 64-bit unsigned value. Your lines of code which OR constants into your a variable would then look similar to this:
a = a | ((0xAAul) << 24);
Alternatively, if you want to limit yourself to 4 bytes only, a 32-bit unsigned int is enough to hold them. In that case, I suggest you change your a variable type to unsigned int and use the u suffix for your constants. Do not forget to change the printf formats to reflect the type change. The resulting code looks like this:
#include <stdio.h>

int main(void) {
    unsigned int a = 0;
    a = a | ((0xABu) << 0);
    printf("%X\n", a);
    a = a | ((0xCDu) << 8);
    printf("%X\n", a);
    a = a | ((0xEFu) << 16);
    printf("%X\n", a);
    a = a | ((0xAAu) << 24);
    printf("%X\n", a);
    return 0;
}
My last suggestion is to avoid the plain int and long types when portability and exact size in bits matter to you. These types are not guaranteed to have the same number of bits on all platforms. Instead use the types defined in the <stdint.h> header, in your case probably either uint64_t or uint32_t. These are guaranteed to be unsigned (their signed counterparts omit the 'u': int64_t and int32_t) and 64-bit and 32-bit in size respectively on all platforms. For the pros and cons of using them instead of the traditional int and long types, I refer you to this Stack Overflow answer.
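For illustration, here is a minimal sketch of the same program written with fixed-width types; the UINT32_C macro from <stdint.h> gives each constant an unsigned type at least 32 bits wide, and <inttypes.h> supplies the matching printf format macro:

#include <stdio.h>
#include <inttypes.h>   /* pulls in <stdint.h> plus the PRIX32 print macro */

int main(void) {
    uint32_t a = 0;
    a |= UINT32_C(0xAB) << 0;
    a |= UINT32_C(0xCD) << 8;
    a |= UINT32_C(0xEF) << 16;
    a |= UINT32_C(0xAA) << 24;  /* unsigned constant: zero extension, no 0xFF bytes */
    printf("%" PRIX32 "\n", a); /* prints AAEFCDAB */
    return 0;
}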
a = a | ((0xAA) << 24);
((0xAA) << 24) is a negative number (its type is int); it is then sign extended to the size of unsigned long, which adds those 0xFF bytes at the beginning.
You need to tell the compiler that you want an unsigned number.
a = a | ((0xAAU) << 24);
#include <stdio.h>

int main(void) {
    unsigned long int a = 0;
    a = a | ((0xAB) << 0);
    printf("%lX\n", a);
    a = a | ((0xCD) << 8);
    printf("%lX\n", a);
    a = a | ((0xEF) << 16);
    printf("%lX\n", a);
    a = a | ((0xAAUL) << 24);        /* unsigned constant: no sign extension */
    printf("%lX\n", a);
    printf("%d\n", ((0xAA) << 24));  /* shows the negative int value */
    return 0;
}
https://gcc.godbolt.org/z/fjv19bKGc
0xAA has type int, so 0xAA << 24 is computed as a signed value. Since the result's high bit is 1 (0xAA = 10101010b shifted into the top byte), the value is sign extended to 0xFFFFFFFFAA000000 when it is widened for the OR with a.
You need to cast 0xAA to an unsigned value before bit shifting it, so it gets zero extended instead.
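Applied to the code above, that cast (one possible form; the U/UL suffixes from the other answers are equivalent) looks like this:

a = a | (((unsigned long)0xAA) << 24);  /* unsigned operand: zero extension */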
I have a number of bits (the number of bits can change) in an unsigned int (uint32_t). For example (12 bits in the example):
uint32_t a = 0xF9C;
The bits represent a signed int of that length.
In this case the number in decimal should be -100.
I want to store the value in a signed variable and get its actual value.
If I just use:
int32_t b = (int32_t)a;
it will just be the value 3996, since it gets cast to (0x00000F9C), but it actually needs to be (0xFFFFFF9C)
I know one way to do it:
union test
{
    signed temp : 12;
};

union test x;
x.temp = a;
int32_t result = (int32_t) x.temp;
Now I get the correct value -100.
But is there a better way to do it?
My solution is not very flexible; as I mentioned, the number of bits can vary (anywhere between 1 and 64 bits).
But is there a better way to do it?
Well, depends on what you mean by "better". The example below shows a more flexible way of doing it as the size of the bit field isn't fixed. If your use case requires different bit sizes, you could consider it a "better" way.
#include <stdio.h>

unsigned sign_extend(unsigned x, unsigned num_bits)
{
    unsigned f = ~((1u << (num_bits - 1)) - 1);  /* the sign bit and everything above it */
    if (x & f) x = x | f;                        /* sign bit set: fill the upper bits with ones */
    return x;
}

int main(void)
{
    int x = sign_extend(0xf9c, 12);
    printf("%d\n", x);
    int y = sign_extend(0x79c, 12);
    printf("%d\n", y);
}
Output:
-100
1948
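Since the question mentions field widths anywhere between 1 and 64 bits, here is a hedged 64-bit variant of the same masking idea (sign_extend64 is a name introduced here for illustration; the final cast relies on two's complement, like the other answers):

#include <stdio.h>
#include <inttypes.h>

int64_t sign_extend64(uint64_t x, unsigned num_bits)
{
    uint64_t f = ~((UINT64_C(1) << (num_bits - 1)) - 1);  /* sign bit and everything above */
    if (x & f)
        x |= f;          /* field is negative: fill the upper bits with ones */
    return (int64_t)x;   /* reinterpret the two's complement pattern */
}

int main(void)
{
    printf("%" PRId64 "\n", sign_extend64(0xF9C, 12));  /* -100 */
    printf("%" PRId64 "\n", sign_extend64(0x79C, 12));  /* 1948 */
    return 0;
}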
A branch free way to sign extend a bitfield (Henry S. Warren Jr., CACM v20 n6 June 1977) is this:
// value i of bit-length len is a bitfield to sign extend
// i is right aligned and zero-filled to the left
sext = 1 << (len - 1);
i = (i ^ sext) - sext;
UPDATE based on @Lundin's comment
Here's tested code (prints -100):
#include <stdio.h>
#include <stdint.h>

int32_t sign_extend (uint32_t x, int32_t len)
{
    int32_t i = (x & ((1u << len) - 1));  // or just x if you know there are no extraneous bits
    int32_t sext = 1 << (len - 1);
    return (i ^ sext) - sext;
}

int main(void)
{
    printf("%d\n", sign_extend(0xF9C, 12));
    return 0;
}
This relies on the implementation-defined behavior of sign extension when right-shifting negative signed integers. First you shift your unsigned integer left until the field's sign bit becomes the MSB, then you cast it to signed integer and shift back:
#include <stdio.h>
#include <stdint.h>

#define NUMBER_OF_BITS 12

int main(void) {
    uint32_t x = 0xF9C;
    int32_t y = (int32_t)(x << (32 - NUMBER_OF_BITS)) >> (32 - NUMBER_OF_BITS);
    printf("%d\n", y);
    return 0;
}
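For the variable-width case from the question, the same trick can be parameterized (a sketch for widths 1..32; both the arithmetic right shift and the out-of-range conversion to int32_t are implementation-defined, exactly as noted above):

#include <stdio.h>
#include <stdint.h>

int32_t sign_extend_shift(uint32_t x, unsigned num_bits)
{
    unsigned shift = 32u - num_bits;           /* valid for num_bits 1..32 */
    return (int32_t)(x << shift) >> shift;     /* shift up, arithmetic shift back down */
}

int main(void)
{
    printf("%d\n", sign_extend_shift(0xF9C, 12));  /* -100 */
    printf("%d\n", sign_extend_shift(0x79C, 12));  /* 1948 */
    return 0;
}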
This is a solution to your problem:
#include <stdio.h>
#include <stdint.h>

int32_t sign_extend(uint32_t x, uint32_t bit_size)
{
    // The expression (0xffffffff << bit_size) fills the upper bits to sign extend the number.
    // The expression (-(x >> (bit_size - 1))) is a mask that zeroes the previous expression
    // in case the number was positive (to avoid having an if statement).
    return ((0xffffffff << bit_size) & (-(x >> (bit_size - 1)))) | x;
}

int main(void)
{
    printf("%d\n", sign_extend(0xf9c, 12)); // -100
    printf("%d\n", sign_extend(0x7ff, 12)); // 2047
    return 0;
}
The sane, portable and effective way to do this is simply to mask out the data part, then fill everything above it with 0xFF... to get the proper two's complement representation. All you need to know is how many bits make up the data part.
We can mask out the data with (1u << data_length) - 1.
In this case, with data_length = 8, the data mask becomes 0xFF. Let's call this data_mask.
Thus the data part of the number is a & data_mask.
The rest of the number then needs to be filled with ones (this assumes the value is negative, i.e. its sign bit is set, as it is here). That is everything not part of the data mask, so simply do ~data_mask to achieve that.
C code: a = (a & data_mask) | ~data_mask. Now a is proper 32-bit two's complement.
Example:
#include <stdio.h>
#include <inttypes.h>

int main(void)
{
    const uint32_t data_length = 8;
    const uint32_t data_mask = (1u << data_length) - 1;

    uint32_t a = 0xF9C;
    a = (a & data_mask) | ~data_mask;
    printf("%"PRIX32 "\t%"PRIi32 "\n", a, (int32_t)a);
}
Output:
FFFFFF9C -100
This relies on int being 32 bits 2's complement but is otherwise fully portable.
I have a sensor which gives its output in three bytes. I read it like this:
unsigned char byte0,byte1,byte2;
byte0=readRegister(0x25);
byte1=readRegister(0x26);
byte2=readRegister(0x27);
Now I want these three bytes merged into one number:
int value;
value=byte0 + (byte1 << 8) + (byte2 << 16);
it gives me values from 0 to 16,777,215 but I'm expecting values from -8,388,608 to 8,388,607. I thought that int was already signed by its implementation. Even if I define it as signed int value; it still gives me only positive numbers. So I guess my question is: how do I convert the int to its two's complement value?
Thanks!
What you need to perform is called sign extension. You have 24 significant bits but want 32 (note that you assume int to be 32 bits wide, which is not always true; you'd better use the type int32_t defined in stdint.h). The missing top 8 bits should be either all zeroes for positive values or all ones for negative ones. This is decided by the most significant bit of the 24-bit value.
int32_t value;
uint8_t extension = (byte2 & 0x80) ? 0xFF : 0x00;  /* checks bit 7, the sign bit of the 24-bit value */
value = (int32_t)((uint32_t)byte0 | ((uint32_t)byte1 << 8) |
                  ((uint32_t)byte2 << 16) | ((uint32_t)extension << 24));
EDIT: Note that you cannot shift an 8-bit value by 16 or 24 bits directly and expect the intended result: the operand is first promoted to int, and shifting a 1 into the sign bit of an int is undefined behavior. Cast to a wide enough (preferably unsigned) type first, as above.
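For completeness, here is an alternative sketch that assembles the 24 bits unsigned and then applies the shift-up/shift-down idiom from the other sign extension answers (again implementation-defined, but it avoids the explicit extension byte; byte0..byte2 are the values read in the question):

uint32_t raw = (uint32_t)byte0 | ((uint32_t)byte1 << 8) | ((uint32_t)byte2 << 16);
int32_t value = (int32_t)(raw << 8) >> 8;  /* bit 23 is extended into bits 24..31 */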
#include <stdint.h>

uint8_t byte0, byte1, byte2;
int32_t answer;

// assuming reg 0x25 is the signed MSB of the number,
// but you need to read it unsigned for some reason
byte0 = readRegister(0x25);
byte1 = readRegister(0x26);
byte2 = readRegister(0x27);

// so the trick is you need to get the top byte to sign extend to 32 bits,
// so force it signed then cast it up
answer = (int32_t)((int8_t)byte0);  // this sign extends the number
answer <<= 8;
answer |= (int32_t)byte1;           // this just ORs in an 8-bit field, not extended
answer <<= 8;
answer |= (int32_t)byte2;
This should also work
answer = (((int32_t)((int8_t)byte0))<<16) + (((int32_t)byte1)<< 8) + byte2;
I may be overly aggressive with parentheses but I never trust myself with shift operators :)
I have to do a sign extension for a 16-bit integer and for some reason, it seems not to be working properly. Could anyone please tell me where the bug is in the code? I've been working on it for hours.
int signExtension(int instr) {
    int value = (0x0000FFFF & instr);
    int mask = 0x00008000;
    int sign = (mask & instr) >> 15;
    if (sign == 1)
        value += 0xFFFF0000;
    return value;
}
The instruction (instr) is 32 bits, and inside it I have a 16-bit number.
What is wrong with:
int16_t s = -890;
int32_t i = s; //this does the job, doesn't it?
What's wrong with using the built-in types?
int32_t signExtension(int32_t instr) {
    int16_t value = (int16_t)instr;
    return (int32_t)value;
}

or better yet (this might generate a warning if passed an int32_t)

int32_t signExtension(int16_t instr) {
    return (int32_t)instr;
}
or, for all that matters, replace signExtension(value) with ((int32_t)(int16_t)value)
You obviously need to include <stdint.h> for the int16_t and int32_t data types.
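For instance, a minimal complete sketch of the double cast (0xFC86 is -890 in 16-bit two's complement; the narrowing conversion is implementation-defined but gives the expected result on two's complement machines):

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    int32_t instr = 0x0000FC86;               /* low 16 bits hold -890 */
    int32_t value = (int32_t)(int16_t)instr;  /* truncate, then widen with sign */
    printf("%d\n", value);                    /* prints -890 */
    return 0;
}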
Just bumped into this looking for something else, maybe a bit late, but maybe it'll be useful for someone else. AFAIAC all C programmers should start off programming assembler.
Anyway, sign extending is much easier than the proposals above. Just make sure you are using signed variables and then use two shifts.
long value;                   // 32-bit storage assumed (use int32_t to be explicit)
value = 0xffff;               // 16-bit two's complement -1; value is now 0x0000ffff
value = (value << 16) >> 16;  // value is now 0xffffffff, i.e. -1
If the variable is signed then the C compiler translates >> to an arithmetic shift right, which preserves the sign. Strictly speaking this behaviour is implementation-defined rather than guaranteed by the standard, but it holds on practically every platform.
So, assuming a 9-bit value stored in 16 bits and starting off with 0x1ff: << 7 will SL (Shift Left) the value, so it is now 0xff80; then >> 7 will ASR (Arithmetic Shift Right) the value, so it is now 0xffff.
If you really want to have fun with macros then try something like this (the syntax works in GCC; I haven't tried it in MSVC).
#include <stdio.h>

#define INT8  signed char
#define INT16 signed short
#define INT32 signed long
#define INT64 signed long long

#define SIGN_EXTEND(to, from, value) \
    ((INT##to)((INT##to)(((INT##to)value) << (to - from)) >> (to - from)))

int main(int argc, char *argv[], char *envp[])
{
    INT16 value16 = 0x10f;
    INT32 value32 = 0x10f;

    printf("SIGN_EXTEND(8,3,6)=%i\n", SIGN_EXTEND(8, 3, 6));
    printf("LITERAL SIGN_EXTEND(16,9,0x10f)=%i\n", SIGN_EXTEND(16, 9, 0x10f));
    printf("16 BIT VARIABLE SIGN_EXTEND(16,9,0x10f)=%i\n", SIGN_EXTEND(16, 9, value16));
    printf("32 BIT VARIABLE SIGN_EXTEND(16,9,0x10f)=%i\n", SIGN_EXTEND(16, 9, value32));
    return 0;
}
This produces the following output:
SIGN_EXTEND(8,3,6)=-2
LITERAL SIGN_EXTEND(16,9,0x10f)=-241
16 BIT VARIABLE SIGN_EXTEND(16,9,0x10f)=-241
32 BIT VARIABLE SIGN_EXTEND(16,9,0x10f)=-241
Try:
int signExtension(int instr) {
    int value = (0x0000FFFF & instr);
    int mask = 0x00008000;
    if (mask & instr) {
        value += 0xFFFF0000;
    }
    return value;
}
People pointed out casting and a left shift followed by an arithmetic right shift. Another way that requires no branching:
(0xffff & n ^ 0x8000) - 0x8000
If the upper 16 bits are already zeroes:
(n ^ 0x8000) - 0x8000
• Community wiki as it's an idea from "The Aggregate Magic Algorithms, Sign Extension"
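For instance, a quick check with -100 in the low 16 bits (0xFF9C):

#include <stdio.h>

int main(void)
{
    int n = 0xFF9C;                                    /* 16-bit -100, upper bits arbitrary */
    printf("%d\n", ((0xffff & n) ^ 0x8000) - 0x8000);  /* prints -100 */
    return 0;
}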
If I have int temp = (1 << 31) >> 31, how come temp becomes -1?
How do I get around this problem?
Thanks
Ints are signed by default, which usually means that the high bit is reserved to indicate whether the integer is negative or not. Look up Two's complement for an explanation of how this works.
Here's the upshot:
[steven@sexy:~]% cat test.c
#include <stdint.h>
#include <stdio.h>

int main(int argc, char **argv) {
    uint32_t uint;
    int32_t sint;
    int64_t slong;

    uint = (((uint32_t)1) << 31) >> 31;
    sint = (1 << 31) >> 31;
    slong = (1L << 31) >> 31;
    printf("signed 32 = %d, unsigned 32 = %u, signed 64 = %ld\n", sint, uint, slong);
}
[steven@sexy:~]% ./test
signed 32 = -1, unsigned 32 = 1, signed 64 = 1
Notice how you can avoid this problem either by using an "unsigned" int (allowing the use of all 32 bits), or by going to a larger type which you don't overflow.
In your case, the 1 in your expression is a signed type - so when you upshift it by 31, its sign changes. Then downshifting causes the sign bit to be duplicated, and you end up with a bit pattern of 0xffffffff.
You can fix it like this:
int temp = (1UL << 31) >> 31;
GCC warns about this kind of error if you have -Wall turned on.
int is signed.
What 'problem'? What are you trying to do?
int i = (1 << 31);            // i = -2147483648
i = i >> 31;                  // i = -1

unsigned int u = (1u << 31);  // u = 2147483648
u = u >> 31;                  // u = 1
PS: ch is a nice command-line C interpreter for Windows that lets you try this sort of stuff without compiling; it also gives you a Unix command shell. See http://www.drdobbs.com/184402054
When you do (1 << 31), the MSB, which is the sign bit, gets set (becomes 1). Then when you do the right shift, it is sign extended. Hence you get -1. Solution: (1UL << 31) >> 31.
The bit that indicates the sign is set when you do such a left shift on an int variable; hence the result.