Assign unsigned char to unsigned short with bit operators in ANSI C

I know it is possible to assign an unsigned char to an unsigned short, but I would like to have more control over how the bits are actually assigned to the unsigned short.
unsigned char UC_8;
unsigned short US_16;
UC_8 = 0xff;
US_16 = (unsigned char) UC_8;
The bits from UC_8 are now placed in the lower bits of US_16. I need more control over the conversion, since the application I'm currently working on is safety-related. Is it possible to control the conversion with bit operators, so that I can specify where the 8 bits from the unsigned char are placed in the larger 16-bit unsigned short variable?
My guess is that it would be possible with masking combined with some other bit-operator, maybe left/right shifting.
UC_8 = 0xff;
US_16 = (US_16 & 0x00ff) ?? UC_8; // Maybe masking?
I have tried different combinations but have not come up with a smart solution. I'm using ANSI C and, as said earlier, need more control over how the bits are actually set in the larger variable.
EDIT:
My problem or concern comes from a CRC-generating function. It will and should always return an unsigned short, since it sometimes calculates a 16-bit CRC. But sometimes it should calculate an 8-bit CRC instead and place the 8 bits in the eight LSBs of the 16-bit return variable; the eight MSBs should then contain only zeros.
I would like to say something like:
US_16(7 downto 0) = UC_8;
US_16(15 downto 8) = 0x00;
If I just typecast it, can I guarantee that the bits will always be placed in the lower bits of the larger variable, on all architectures?

What do you mean, "control"?
The C standard unambiguously defines the unsigned binary format in terms of bit positions and significance. Certain bits of a 16-bit variable are "low", by numerical definition, and they will hold the pattern from the 8-bit variable, the other bits being set to zero. There is no ambiguity, no wiggle room, and nothing else to control.
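For illustration, a minimal sketch of what the standard guarantees here (plain assignment zero-extends; no bit operators needed):

#include <assert.h>

int main(void)
{
    unsigned char UC_8 = 0xff;
    unsigned short US_16;

    US_16 = UC_8;              /* value-preserving conversion, by definition */
    assert(US_16 == 0x00ff);   /* the 8 bits land in the low byte; the high byte is 0 */
    return 0;
}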

Maybe shifting and masking the bits will help you:
US_16 = (US_16 & 0x00ff) | ( UC_8 << 8 );
Result in bits will be:
C - UC_8 bits
S - US_16 bits
CCCC CCCC SSSS SSSS, i.e. SSSS SSSS are the low 8 bits of the original US_16
But if UC_8 was 1 and US_16 was 0, then US_16 will be 256. Or do you mean this:
US_16 = (US_16 & 0xff00) | ( UC_8 & 0x00ff );

US_16 = ~-1 | UC_8; /* ~-1 is 0, so this is just US_16 = UC_8 */
Is this what you want?

If it is important to use ANSI C, and not be restricted to a particular implementation, then you should not assume sizeof(short) == 2. And why bother to cast an unsigned char to an unsigned char (the same thing)? Although it's probably safe to assume char is 8 bits nowadays, that's not guaranteed.
uint8_t UC_8;
uint16_t US_16;
int nbits = ...# of bits to shift...;
US_16 = UC_8 << nbits;
Obviously, if you shift more than 15 bits, it may not be what you want. If you need to actually rearrange the bits, rather than just shift them to some position, you'll have to set them individually:
int sourcebit = ...0 to 7...;
int destinationbit = ...0 to 15...;
// clear the destination bit first ...
US_16 &= ~(1u << destinationbit);
// ... then copy the source bit in (shift down, then up, to avoid a negative shift)
US_16 |= (unsigned)((UC_8 >> sourcebit) & 1u) << destinationbit;
note: just wrote, didn't test. probably not optimal. blah blah blah. but something like that will work.
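For example, a small usage sketch of the clear-then-set sequence above (the test values are mine):

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint8_t UC_8 = 0x08;   /* bit 3 set */
    uint16_t US_16 = 0;
    int sourcebit = 3;
    int destinationbit = 11;

    /* copy bit 3 of UC_8 into bit 11 of US_16 */
    US_16 &= ~(1u << destinationbit);
    US_16 |= (unsigned)((UC_8 >> sourcebit) & 1u) << destinationbit;

    printf("0x%04x\n", (unsigned)US_16);   /* prints 0x0800 */
    return 0;
}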

Related

How to combine two hex values (high byte & low byte) at two different array positions?

I received two hex values in an array, with array[1] = lowbyte and array[2] = highbyte; in my example lowbyte = 0xF4 and highbyte = 0x01, so the combined value will be 0x1F4 (500). I want to combine these two values and compare the result, but how do I do that without any library function?
Please help and sorry for my bad English.
I did some research and found this solution, which seems to work fine:
int temp = (short)(((HIGHBYTE) & 0xFF) << 8 | (LOWBYTE) & 0xFF);
Just a basic example showing how to combine values of two different variables into one:
#include <stdio.h>

int main (void)
{
    char highbyte = 0x01;
    unsigned char lowbyte = 0xF4; // edited as per comments from @Fe2O3
    short int val = 0;
    val = (highbyte << 8) | lowbyte; // if lowbyte were declared signed, masking would be required: lowbyte & 0xFF
    printf("0x%hx\n", val);
    return 0;
}
Tested this on Linux PC.
Based on the answer where you converted to short, it seems you may want to combine the two bytes to produce a 16-bit two’s complement integer. This answer shows how to do that in three ways for which the behavior is fully defined by the C standard, as well as a fourth way that requires knowledge of the C implementation being used. Methods 1 and 3 are also defined in C++.
Given two eight-bit unsigned bytes with the more significant byte in highbyte and the less significant byte in lowbyte, four options for constructing the 16-bit two’s complement value they represent are:
1. Assemble the bytes in the desired order and copy them into an int16_t: uint16_t t = (uint16_t) highbyte << 8 | lowbyte; int16_t result; memcpy(&result, &t, sizeof result);.
2. Assemble the bytes in the desired order and use a union to reinterpret them: int16_t result = (union { uint16_t u; int16_t i; }) { (uint16_t) highbyte << 8 | lowbyte } .i;.
3. Construct the result arithmetically: int16_t result = ((highbyte ^ 128) - 128) * 256 + lowbyte;.
4. If it is given that the code will be used only with C implementations that define conversion to a signed integer to wrap, then a conversion may be used: int16_t result = (int16_t) ((uint16_t) highbyte << 8 | lowbyte);.
(In the last, the conversion to int16_t is implicit in the initialization, but a cast is used because, without it, some compilers will produce a warning or error, depending on switches.)
Note: int16_t and uint16_t are defined by including <stdint.h>. Alternatively, if it is given that short is 16 bits, then short and unsigned short may be used in place of int16_t and uint16_t.
Here is more information about the first three of these.
1. Assemble the bytes and copy
(uint16_t) highbyte << 8 | lowbyte converts to a type suitable for shifting without sign-bit issues, moves the more significant byte into the upper 8 bits of 16, and puts the less significant byte into the lower 8 bits.
Then uint16_t t = …; puts those bits into a uint16_t.
memcpy(&result, &t, sizeof result); copies those bits into an int16_t. C 2018 7.20.1.1 1 guarantees that int16_t uses two’s complement. C 2018 6.2.6.2 2 guarantees that the value bits in int16_t have the same position values as their counterparts in uint16_t, so the copy produces the desired arrangement in result.
2. Assemble the bytes and use a union
(type) { initial value } is a compound literal. (union { uint16_t u; int16_t i; }) { (uint16_t) highbyte << 8 | lowbyte } makes a compound literal that is a union and initializes its u member to have the value described above. Then .i reads the i member of the union, which reinterprets the bits using the type int16_t, which is two’s complement as described above. Then int16_t result = …; initializes result to this value.
3. Construct the result arithmetically
Here we start with the more significant byte separately, interpreting the eight bits of highbyte as two’s complement. In eight-bit two’s complement, the sign bit represents 0 if it is off and −128 if it is on. (For example, 11111100₂ as unsigned binary represents 128+64+32+16+8+4 = 252, but, in two’s complement, it is −128+64+32+16+8+4 = −4.)
Consider (highbyte ^ 128) - 128. If the sign bit is off, ^ 128 turns it on, which adds 128 to its unsigned binary meaning. Then - 128 subtracts 128, producing a net effect of zero. If the sign bit is on, ^ 128 turns it off, which cancels its unsigned binary meaning. Then - 128 gives the desired value. Thus (highbyte ^ 128) - 128 reinterprets the sign bit to have a value of 0 if it is off and −128 if it is on.
Then ((highbyte ^ 128) - 128) * 256 moves this to the more significant byte of 16 bits (in an int type at this point), and + lowbyte puts the less significant byte in the less significant position. And of course int16_t result = …; initializes result to this computed value.
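To see that the first three methods agree, here is a minimal self-contained sketch (the example values 0xFE and 0x0C are mine, not from the answer above):

#include <stdint.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    uint8_t highbyte = 0xFE, lowbyte = 0x0C;   /* 0xFE0C is -500 in 16-bit two's complement */

    /* 1. assemble and memcpy */
    uint16_t t = (uint16_t) highbyte << 8 | lowbyte;
    int16_t r1;
    memcpy(&r1, &t, sizeof r1);

    /* 2. assemble and reinterpret through a union */
    int16_t r2 = (union { uint16_t u; int16_t i; }) { (uint16_t) highbyte << 8 | lowbyte } .i;

    /* 3. construct arithmetically */
    int16_t r3 = ((highbyte ^ 128) - 128) * 256 + lowbyte;

    printf("%d %d %d\n", r1, r2, r3);   /* prints -500 -500 -500 */
    return 0;
}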

How to get the most significant bit of an unsigned 8-bit type in C

I'm trying to get the most significant bit of an unsigned 8-bit type in C.
This is what I'm trying to do right now:
uint8_t *var = ...;
...
(*var >> 6) & 1
Is this right? If it's not, what would be?
To get the most significant bit of the memory pointed to by a uint8_t pointer, you need to shift by 7 bits, not 6:
(*var >> 7) & 1
The most standard/correct way of masking bits is to use a readable bit mask of the form 1u << bit. Any C programmer spotting 1u << n in code will know that it is a bit mask - so it is self-documenting code.
So if you want bit number 7, you would write
*var & (1u << 7)
The u suffix is important for rugged code, since you want to avoid accidental implicit promotions to signed types.
Another option is to simply apply a bit mask and check the resulting value:
*var & 0x80u // 1000 0000
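Putting this together, a minimal self-contained sketch (the function name msb_is_set is my own):

#include <stdint.h>
#include <stdio.h>

/* returns 1 if the most significant bit of *var is set, 0 otherwise */
static int msb_is_set(const uint8_t *var)
{
    return (*var >> 7) & 1;
}

int main(void)
{
    uint8_t a = 0x80, b = 0x7F;
    printf("%d %d\n", msb_is_set(&a), msb_is_set(&b));   /* prints 1 0 */
    return 0;
}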

C - three bytes into one signed int

I have a sensor which gives its output in three bytes. I read it like this:
unsigned char byte0,byte1,byte2;
byte0=readRegister(0x25);
byte1=readRegister(0x26);
byte2=readRegister(0x27);
Now I want these three bytes merged into one number:
int value;
value=byte0 + (byte1 << 8) + (byte2 << 16);
it gives me values from 0 to 16,777,215, but I'm expecting values from −8,388,608 to 8,388,607. I thought that int was already signed by its implementation. Even if I define it as signed int value; it still gives me only positive numbers. So I guess my question is: how do I convert the value to its two's complement interpretation?
Thanks!
What you need to perform is called sign extension. You have 24 significant bits but want 32 significant bits. (Note that you are assuming int is 32 bits wide, which is not always true; you'd better use the type int32_t defined in stdint.h.) The missing top 8 bits should be either all zeroes for positive values or all ones for negative ones; which case applies is determined by the most significant bit of the 24-bit value.
int32_t value;
uint8_t extension = (byte2 & 0x80) ? 0xff : 0x00; /* checks bit 7 */
value = (int32_t)((uint32_t)byte0 | ((uint32_t)byte1 << 8) |
                  ((uint32_t)byte2 << 16) | ((uint32_t)extension << 24));
EDIT: Note that the shifts must be done in a type wide enough for the result. An 8-bit value is promoted to int before shifting, so << 8 and << 16 are fine here, but shifting set bits into the sign bit of int (as extension << 24 would when extension is 0xff) is undefined behavior. Cast to a wider unsigned type first, as done above.
#include <stdint.h>
uint8_t byte0,byte1,byte2;
int32_t answer;
// assuming reg 0x25 is the signed MSB of the number
// but you need to read unsigned for some reason
byte0=readRegister(0x25);
byte1=readRegister(0x26);
byte2=readRegister(0x27);
// so the trick is you need to get the byte to sign extend to 32 bits
// so force it signed then cast it up
answer = (int32_t)((int8_t)byte0); // this should sign extend the number
answer <<= 8;
answer |= (int32_t)byte1; // this should just make 8 bit field, not extended
answer <<= 8;
answer |= (int32_t)byte2;
This should also work
answer = (((int32_t)((int8_t)byte0))<<16) + (((int32_t)byte1)<< 8) + byte2;
I may be overly aggressive with parentheses but I never trust myself with shift operators :)
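One caveat with the variants above: left-shifting a negative value (e.g. when byte0 has its top bit set) is technically undefined behavior in C. A branch-free sketch that avoids it, using the same XOR-and-subtract trick as the two's-complement answer earlier on this page (the function name combine24 is mine):

#include <stdint.h>

/* byte0 is the MSB, matching the answers above */
int32_t combine24(uint8_t byte0, uint8_t byte1, uint8_t byte2)
{
    /* assemble 24 bits in unsigned arithmetic; the result fits in int32_t */
    int32_t r = (int32_t)(((uint32_t)byte0 << 16)
                        | ((uint32_t)byte1 << 8)
                        |  (uint32_t)byte2);
    /* reinterpret bit 23 as -2^23 instead of +2^23 */
    return (r ^ 0x800000) - 0x800000;
}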

c Code that reads a 4 byte little endian number from a buffer

I encountered this piece of existing C code that I am struggling to understand.
It supposedly reads a 4-byte unsigned value, passed in a buffer (in little-endian format), into a variable of type "long".
This code runs on a 64-bit-word-size, little-endian x86 machine, where sizeof(long) is 8 bytes.
My guess is that this code is intended to also run on a 32-bit x86 machine, so a variable of type long is used instead of int for the sake of storing a value from four bytes of input data.
I have some doubts and have put comments in the code to express what I understand, or what I don't :-)
Please answer the questions below in that context.
void read_Value_From_Four_Byte_Buff( char *input)
{
    /* use long so on a 32-bit machine it can still accommodate 4 bytes */
    long intValueOfInput;

    /* Bitwise AND of input buffer's byte 0 with 0xFF gives MSB or LSB? */
    /* This code seems to assume that assignment will store in the rightmost byte - is that true on an x86 machine? */
    intValueOfInput = 0xFF & input[0];

    /* left shift byte 1 eight times; bitwise "or" places it in the 2nd byte from the right */
    intValueOfInput |= ((0xFF & input[1]) << 8);

    /* similar left shifts in multiples of 8 and bitwise "or" for the next two bytes */
    intValueOfInput |= ((0xFF & input[2]) << 16);
    intValueOfInput |= ((0xFF & input[3]) << 24);
}
My questions
1) The input buffer is expected to be in little endian. But from the code it looks like the assumption is byte 0 = MSB, then byte 1, byte 2, byte 3 = LSB. I thought so because the code reads bytes starting from byte 0, and subsequent bytes (1 onwards) are placed in the target variable after left shifting. Is that how it is, or am I getting it wrong?
2) I feel this is a convoluted way of doing things - is there a simpler alternative for copying the value from a 4-byte buffer into a long variable?
3) Will the assumption that this code will run on a 64-bit machine have any bearing on how easily I can do this differently? I mean, is all this trouble just to keep it agnostic to word size? (I assume it's agnostic to word size now - not sure though.)
Thanks for your enlightenment :-)
You have it backwards. When you left shift, you're putting bits into more significant positions. So ((0xFF & input[3]) << 24) puts byte 3 into the MSB.
This is the way to do it in standard C. POSIX has the function ntohl() that converts from network byte order to a native 32-bit integer, so this is usually used in Unix/Linux applications.
This will not work exactly the same on a 64-bit machine, unless you use unsigned long instead of long. As currently written, the highest bit of input[3] will be put into the sign bit of the result (assuming a twos-complement machine), so you can get negative results. If long is 64 bits, all the results will be positive.
The code you are using does indeed treat the input buffer as little endian. Look how it takes the first byte of the buffer and just assigns it to the variable without any shifting. If the first byte increases by 1, the value of your result increases by 1, so it is the least-significant byte (LSB). Left-shifting makes a byte more significant, not less. Left-shifting by 8 is generally the same as multiplying by 256.
I don't think you can get much simpler than this unless you use an external function, make assumptions about the machine this code is running on, or invoke undefined behavior. In most instances it would work to just write uint32_t x = *(uint32_t *)input; but this assumes your machine is little-endian, and the unaligned, type-punned access is undefined behavior according to the C standard.
No, running on a 64-bit machine is not a problem. I recommend using types like uint32_t and int32_t to make it easier to reason about whether your code will work on different architectures. You just need to include the stdint.h header from C99 to use those types.
The right-hand side of the last line of this function might exhibit undefined behavior depending on the data in the input:
((0xFF & input[3]) << 24)
The problem is that (0xFF & input[3]) will be a signed int (because of integer promotion). The int will probably be 32-bit, and you are shifting it so far to the left that the resulting value might not be representable in an int. The C standard says this is undefined behavior, and you should really try to avoid that because it gives the compiler a license to do whatever it wants and you won't be able to predict the result.
A solution is to convert it from an int to a uint32_t before shifting it, using a cast.
Finally, the variable intValueOfInput is written to but never used. Shouldn't you return it or store it somewhere?
Taking all this into account, I would rewrite the function like this:
uint32_t read_value_from_four_byte_buff(char *input)
{
    uint32_t x;
    x = 0xFF & input[0];
    x |= (0xFF & input[1]) << 8;
    x |= (0xFF & input[2]) << 16;
    x |= (uint32_t)(0xFF & input[3]) << 24;
    return x;
}
From the code, byte 0 is the LSB and byte 3 is the MSB. But there were some typos (in the original version of the question); the lines should be
intValueOfInput |= ((0xFF & input[2]) << 16);
intValueOfInput |= ((0xFF & input[3]) << 24);
You can make the code shorter by dropping the 0xFF masks and using "unsigned char" as the argument type.
To make the code shorter, you can do:
long intValueOfInput = 0;
for (int i = 0, shift = 0; i < 4; i++, shift += 8)
    intValueOfInput |= (long)((unsigned char)input[i]) << shift;
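As a footnote to the answer above that mentions uint32_t x = *(uint32_t *)input;: a memcpy-based variant avoids the aliasing and alignment problems, but unlike the shift-based code it is only correct for little-endian input when run on a little-endian host. A minimal sketch (the function name is mine):

#include <stdint.h>
#include <string.h>

/* well-defined (no aliasing or alignment issues), but the byte order
   follows the host, so this matches the shift-based version only on
   little-endian machines */
uint32_t read_le32_host_order(const char *input)
{
    uint32_t x;
    memcpy(&x, input, sizeof x);
    return x;
}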

Convert a uint16_t to char[2] to be sent over socket (unix)

I know that there are things out there roughly on this, but my brain is hurting and I can't find anything to make this work...
I am trying to send a 16-bit unsigned integer over a unix socket. To do so I need to convert a uint16_t into two chars, then I need to read them in on the other end of the connection and convert them back into either an unsigned int or a uint16_t; at that point it doesn't matter whether it uses 2 bytes or 4 bytes. (I'm running 64-bit, that's why I can't use unsigned int :)
I'm doing this in C, btw.
Thanks
Why not just break it up into bytes with mask and shift?
uint16_t value = 12345;
char lo = value & 0xFF;
char hi = value >> 8;
(edit)
On the other end, you assemble with the reverse:
uint16_t value = (unsigned char)lo | ((uint16_t)(unsigned char)hi << 8);
Off the top of my head; the casts through unsigned char guard against sign extension, since lo and hi were declared as plain char.
char* pUint16 = (char*)&u16;
i.e. cast the address of the uint16_t.
char c16[2];
uint16_t ui16 = 0xdead;
memcpy( c16, &ui16, 2 );
c16 now contains the 2 bytes of ui16. At the far end you can simply reverse the process.
char* pC16 = /*blah*/
uint16_t ui16;
memcpy( &ui16, pC16, 2 );
Interestingly, though there is a call to memcpy, nearly every compiler will optimise it out because it's of a fixed, small size.
As Steven sudt points out, you may get problems with big-endianness. To get around this you can use the htons (host-to-network short) function:
uint16_t ui16correct = htons( 0xdead );
and at the far end use ntohs (network-to-host short)
uint16_t ui16correct = ntohs( ui16 );
On a little-endian machine this will convert the short to big-endian and then at the far end convert back from big-endian. On a big-endian machine the 2 functions do nothing.
Of course if you know that the architecture of both machines on the network use the same endian-ness then you can avoid this step.
Look up ntohl and htonl for handling 32-bit integers. Some platforms also provide ntohll and htonll for 64 bits, though they are not standard.
Sounds like you need to use the bit mask and shift operators.
To split up a 16-bit number into two 8-bit numbers:
you mask the lower 8 bits using the bitwise AND operator (& in C) so that the upper 8 bits all become 0, and then assign that result to one char.
you shift the upper 8 bits to the right using the right shift operator (>> in C) so that the lower 8 bits are all pushed out of the integer, leaving only the top 8 bits, and assign that to another char.
Then, after sending these two chars over the connection, you do the reverse on the receiving end: shift what used to be the top 8 bits left by 8 bits, and use bitwise OR to combine that with the other 8 bits.
Basically you are sending 2 bytes over the socket; that's all the socket needs to know, regardless of endianness, signedness and so on. Just decompose your uint16 into 2 bytes and send them over the socket.
char byte0 = u16 & 0xFF;
char byte1 = u16 >> 8;
At the other end do the conversion in the opposite way
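For instance, a sketch mirroring the split above (the casts guard against sign extension, since byte0 and byte1 were declared as plain char):

uint16_t u16 = (uint16_t)(((unsigned char)byte1 << 8) | (unsigned char)byte0);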
