I've run into a small issue here. I have an unsigned char array, and I am trying to access bytes 2-3 (0xFF and 0xFF) and get their value as a short.
Code:
unsigned char Temp[512] = {0x00,0xFF,0xFF,0x00};
short val = (short)*((unsigned char*)Temp+1);
While I would expect val to contain 0xFFFF it actually contains 0x00FF. What am I doing wrong?
There's no guarantee that you can access a short when the data is improperly aligned.
On some machines, especially RISC machines, you'd get a bus error and core dump for misaligned access. On other machines, the misaligned access would involve a trap into the kernel to fix up the error — which is only a little quicker than the core dump.
To get the result reliably, you'd be best off using shifts and OR:
val = *(Temp+1) << 8 | *(Temp+2);
or:
val = *(Temp+2) << 8 | *(Temp+1);
Note that this explicitly offers big-endian (first option) or little-endian (second) interpretation of the data.
Also note the careful use of << and |; if you use + instead of |, you have to parenthesize the shift expression or use multiplication instead of shift:
val = (*(Temp+1) << 8) + *(Temp+2);
val = *(Temp+1) * 256 + *(Temp+2);
Be logical and use either logic or arithmetic and not a mixture.
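Dropped into the asker's example, a minimal complete sketch of the big-endian variant (unsigned short used here, since 0xFFFF does not fit in a signed 16-bit short):
#include <stdio.h>

int main(void)
{
    unsigned char Temp[512] = {0x00, 0xFF, 0xFF, 0x00};
    /* Big-endian interpretation of bytes 1 and 2 (0xFF, 0xFF): */
    unsigned short val = (unsigned short)((Temp[1] << 8) | Temp[2]);
    printf("0x%04X\n", (unsigned)val);   /* prints 0xFFFF */
    return 0;
}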
Well, you're dereferencing an unsigned char* when you should be dereferencing a short*.
I think this should work:
short val = *((short*)(Temp+1));
Your problem is that you are only accessing one byte of the array:
*((unsigned char*)Temp+1) will dereference the pointer Temp+1 giving you 0xFF
(short)*((unsigned char*)Temp+1) will cast the result of the dereference to short. Casting unsigned char 0xFF to short obviously gives you 0x00FF
So what you are trying to do is *((short*)(Temp+1))
It should however be noted that what you are doing is a horrible hack. First of all, when the bytes differ, the result will obviously depend on the endianness of the machine.
Second, there is no guarantee that the accessed data is correctly aligned to be accessed as a short.
So it might be a better idea to do something like short val = *(Temp+1)<<8 | *(Temp+2) or short val = *(Temp+2)<<8 | *(Temp+1), depending on the endianness of your architecture.
I do not recommend this approach because it is architecture-specific.
Consider the following definition of Temp:
unsigned char Temp[512] = {0x00,0xFF,0x88,0x00};
Depending on the endianness of the system, you will get different results casting Temp + 1 to a short *: on a little-endian system the result would be the value 0x88FF, but on a big-endian system the result would be 0xFF88.
Also, I believe that this is an undefined cast because of issues with alignment.
What you could use is:
short val = (((short)Temp[1]) << 8) | Temp[2];
Related
I am using C to read a .png image file, and if you're not familiar with the PNG encoding format, useful integer values are encoded in .png files in the form of 4-byte big-endian integers.
My computer is a little-endian machine, so to convert from a big-endian uint32_t that I read from the file with fread() to a little-endian one my computer understands, I've been using this little function I wrote:
#include <stdint.h>

uint32_t convertEndian(uint32_t val)
{
    union {
        uint32_t value;
        char bytes[sizeof(uint32_t)];
    } in, out;
    in.value = val;
    for (int i = 0; i < sizeof(uint32_t); ++i)
        out.bytes[i] = in.bytes[sizeof(uint32_t) - 1 - i];
    return out.value;
}
This works beautifully on my x86_64 UNIX environment, gcc compiles without error or warning even with the -Wall flag, but I feel rather confident that I'm relying on undefined behavior and type-punning that may not work as well on other systems.
Is there a standard function I can call that can reliably convert a big-endian integer to one the native machine understands, or if not, is there an alternative safer way to do this conversion?
I see no real UB in OP's code.
Portability issues: yes.
"type-punning that may not work as well on other systems" is not a problem with OP's C code yet may cause trouble with other languages.
Yet how about a big-endian (PNG) to host conversion instead?
Extract the bytes by address (lowest address which has the MSByte to highest address which has the LSByte - "big" endian) and form the result with the shifted bytes.
Something like:
uint32_t Endian_BigToHost32(uint32_t val) {
    union {
        uint32_t u32;
        uint8_t u8[sizeof(uint32_t)]; // uint8_t ensures a byte is 8 bits.
    } x = { .u32 = val };
    return
        ((uint32_t)x.u8[0] << 24) |
        ((uint32_t)x.u8[1] << 16) |
        ((uint32_t)x.u8[2] << 8) |
        x.u8[3];
}
Tip: many libraries have an implementation-specific function to do this efficiently. Example: be32toh.
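For instance, a minimal sketch using glibc's nonstandard <endian.h> (BSDs put the same function in <sys/endian.h>):
#include <endian.h>   /* glibc-specific; not in the C standard */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    unsigned char file_bytes[4] = {0x00, 0x00, 0x01, 0x00};  /* big-endian 256 */
    uint32_t raw;
    memcpy(&raw, file_bytes, sizeof raw);    /* as if read with fread() */
    printf("%u\n", (unsigned)be32toh(raw));  /* prints 256 on any host */
    return 0;
}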
IMO it'd be better style to read from bytes into the desired format, rather than apparently memcpy'ing a uint32_t and then internally manipulating the uint32_t. The code might look like:
uint32_t read_be32(uint8_t *src) // must be unsigned input
{
    return (src[0] * 0x1000000u) + (src[1] * 0x10000u) + (src[2] * 0x100u) + src[3];
}
It's quite easy to get this sort of code wrong, so make sure you get it from high-rep SO users 😉. You may often see the alternative suggestion return (src[0] << 24) + (src[1] << 16) + (src[2] << 8) + src[3]; however, that causes undefined behaviour if src[0] >= 128 because of signed integer overflow: the unfortunate rule is that the integer promotions take uint8_t to signed int. It also causes undefined behaviour on a system with 16-bit int, due to the large shifts.
Modern compilers should be smart enough to optimize this; e.g., the assembly produced by clang for little-endian is:
read_be32: # #read_be32
mov eax, dword ptr [rdi]
bswap eax
ret
However, I see that gcc 10.1 produces much more complicated code; this seems to be a surprising missed-optimization bug.
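If you do want shifts rather than multiplication, a sketch with explicit casts that sidesteps the promotion problem described above (my rewrite, not code from this answer):
#include <stdint.h>

uint32_t read_be32_shifts(const uint8_t *src)
{
    /* Casting each byte to uint32_t first keeps the arithmetic unsigned
       and at least 32 bits wide, so no signed overflow or oversized shift. */
    return ((uint32_t)src[0] << 24) | ((uint32_t)src[1] << 16) |
           ((uint32_t)src[2] << 8)  |  (uint32_t)src[3];
}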
This solution doesn't rely on accessing inactive members of a union, but instead on unsigned integer bit-shift operations, which can portably and safely convert from big-endian to little-endian or vice versa:
#include <stdint.h>
uint32_t convertEndian32(uint32_t in)
{
    return ((in & 0xffu) << 24) | ((in & 0xff00u) << 8) |
           ((in & 0xff0000u) >> 8) | ((in & 0xff000000u) >> 24);
}
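A quick usage sketch, assuming convertEndian32 from above is in the same file:
#include <inttypes.h>
#include <stdio.h>

int main(void)
{
    printf("0x%08" PRIX32 "\n", convertEndian32(0x12345678u));  /* prints 0x78563412 */
    return 0;
}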
This code reads a uint32_t through a pointer to unsigned char holding big-endian data, independently of the endianness of your architecture. (The code just acts as if it were reading a base-256 number.)
uint32_t read_bigend_int(const unsigned char *p, int sz)
{
    uint32_t result = 0;
    while (sz--) {
        result <<= 8;    /* multiply by base */
        result |= *p++;  /* and add the next digit */
    }
    return result;
}
if you call, for example:
int main()
{
    /* ... */
    unsigned char buff[1024];
    read(fd, buff, sizeof buff);
    uint32_t value = read_bigend_int(buff + offset, sizeof value);
    /* ... */
}
I am programming the 8051 in C using the Si Labs IDE. I currently have three bytes: address_byte3, address_byte2, and address_byte1. I then initialized a variable address_sum to be an unsigned long int then did the following operation on it...
address_sum=(address_byte3<<16)+(address_byte2<<8)+(address_byte1);
This operation would lead me to believe that the value loaded into address_sum, if address_byte3, address_byte2, & address_byte1 were 0x92, 0x56, & 0x78 respectively, would be 0xXX925678. Instead I am getting a value of 0xXX005678. My logic seems sound, but then again I am the one writing the code, so I'm biased and could be blinded by my own ignorance. Does anyone have a solution or an explanation as to why the value of address_byte3 is "lost"?
Thank you.
Variables shorter than int are promoted to int when doing calculations on them. It seems that your int type is 16-bit, so shifting it by 16 bits doesn't work right.
You should explicitly cast the variables to the result type (unsigned long):
address_sum = ((unsigned long)address_byte3<<16) +
((unsigned long)address_byte2<<8) +
(unsigned long)address_byte1;
The last casting is superfluous but doesn't hurt.
A shift of a 16-bit int/unsigned, as well explained by @anatolyg, will only result in a 16-bit answer.
I avoid casting as a general promotion scheme, as a cast may silently narrow the result as code evolves over time and the maintainer switches to wider operands.
Alternatives:
((type_of_target) 1) *: This will ensure each operation is at least the width of the target.
unsigned long address_sum;
...
address_sum = (1UL*address_byte3<<16) + (1UL*address_byte2<<8) + address_byte1;
Assign to the destination and then operate:
address_sum = address_byte3;
address_sum = (address_sum << 8) + address_byte2;
address_sum = (address_sum << 8) + address_byte1;
A sneaky, though not pleasant-looking, one-line alternative. Recall that * and + have higher precedence than shift:
address_sum = (0*address_sum + address_byte3 << 16) +
(0*address_sum + address_byte2 << 8) + address_byte1;
Consider @Eugene Sh.'s concern and use 8-bit unsigned "bytes".
My preference is a variation on chux's:
unsigned long address;   /* something bigger than a byte */
unsigned char a, b, c;
address = a;  address <<= 8;
address |= b; address <<= 8;
address |= c;
Despite being the most verbose, all of the answers thus far should optimize into basically the same code, but you would have to test the specific compiler to see. Can the 8051 shift more than one bit at a time per instruction anyway? Don't remember.
Say I have an unsigned char (or byte) array. I want to take array[1] and array[2] from memory and cast them to a short int (2 bytes). Something similar to how a union works, but not starting from the first byte.
Example:
#include <stdio.h>
void main()
{
    unsigned char a[28];
    unsigned short t;
    a[0] = 12;
    a[1] = 10;
    a[2] = 55;
    t = (short) *(a+1);
    printf("%i", t);
}
What I want is the value 14090 in decimal. Or 370Ah.
Thank you.
EDIT: I forgot to say, but most of you understood from my example: I am working on a little-endian machine, an 8-bit Atmel microcontroller.
It's very simple:
unsigned short t = (a[2] << 8) | a[1];
Note, this assumes unsigned char is 8 bits, which is most likely the case.
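Dropped into the asker's program, a quick check (the printf format is mine):
#include <stdio.h>

int main(void)
{
    unsigned char a[28];
    unsigned short t;
    a[0] = 12; a[1] = 10; a[2] = 55;
    t = (unsigned short)((a[2] << 8) | a[1]);  /* 55 * 256 + 10 */
    printf("%u\n", t);                         /* prints 14090 */
    return 0;
}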
The memory access operation (short)*(a+1) is not safe.
If a+1 is not aligned to short (i.e., a+1 is not a multiple of sizeof short), then the result of this operation depends on the compiler at hand.
Compilers that support unaligned load/store operations can resolve it correctly, while others will "round it down" to the nearest address which is aligned to short.
In general, this operation yields undefined behavior.
On top of all that, even if you know for sure that a+1 is aligned to short, this operation will still give you different results between Big-Endian architecture and Little-Endian architecture.
Here is a safe way to work-around both issues:
short x = 0x1234;
switch (*(char*)&x)
{
case 0x12: // Big-Endian
t = (a[1] << 8) | a[2]; // Simulate t = (short)*(a+1) on BE
break;
case 0x34: // Little-Endian
t = (a[2] << 8) | a[1]; // Simulate t = (short)*(a+1) on LE
break;
}
Please note that the code above assumes the following:
CHAR_BIT == 8
sizeof short == 2
This is not necessarily true on every platform (although it is mostly the case).
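If you'd rather have the build fail than silently misbehave when those assumptions break, a C11 sketch:
#include <assert.h>   /* provides the static_assert macro in C11 */
#include <limits.h>

static_assert(CHAR_BIT == 8, "code assumes 8-bit bytes");
static_assert(sizeof(short) == 2, "code assumes 16-bit short");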
t= *(short *)(a+1);
You cast the pointer to the first element to a pointer-to-short, and then dereference it.
Note that this is not very portable, and can go wrong if the machine is big endian or aligns data somehow. A better way would be:
t = (a[2] << CHAR_BIT) | a[1];
For full portability, you should check your endianness and see which byte to shift, and which one not to. See here how to check a machine's endianness
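One common runtime check looks like this (a sketch; C itself offers no standard query for endianness):
#include <stdint.h>

int is_little_endian(void)
{
    uint16_t probe = 1;
    /* On a little-endian machine the least significant byte is stored first. */
    return *(const unsigned char *)&probe == 1;
}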
I have a byte array containing 16- and 32-bit data samples, and to cast them to Int16 and Int32 I currently just do a memcpy of 2 (or 4) bytes.
Because memcpy probably isn't optimized for lengths of just two bytes, I was wondering if it would be more efficient to convert the bytes to an Int32 using integer arithmetic (or a union).
I would like to know what the efficiency of calling memcpy vs. bit shifting is, because the code runs on an embedded platform.
I would say that memcpy is not the way to do this. However, finding the best way depends heavily on how your data is stored in memory.
To start with, you don't want to take the address of your destination variable. If it is a local variable, you will force it to the stack rather than giving the compiler the option to place it in a processor register. This alone could be very expensive.
The most general solution is to read the data byte by byte and arithmetically combine the result. For example:
uint16_t res = ( (((uint16_t)char_array[high]) << 8)
| char_array[low]);
The expression in the 32-bit case is a bit more complex, as you have more alternatives. You might want to check the assembler output to see which is best.
Alt 1: Build pairs, and combine them:
uint16_t low16 = ... as example above ...;
uint16_t high16 = ... as example above ...;
uint32_t res = ( (((uint32_t)high16) << 16)
| low16);
Alt 2: Shift in 8 bits at a time:
uint32_t res = char_array[i0];
res = (res << 8) | char_array[i1];
res = (res << 8) | char_array[i2];
res = (res << 8) | char_array[i3];
All examples above are neutral to the endianness of the processor used, as the index values decide which part to read.
The next kind of solution is possible if 1) the endianness (byte order) of the device matches the order in which the bytes are stored in the array, and 2) the array is known to be placed at an aligned memory address. The latter depends on the machine, but you are safe if the char array representing a 16-bit value starts on an even address, and in the 32-bit case it starts on an address divisible by four. In this case you could simply read the address, after some pointer tricks:
uint16_t res = *(uint16_t *)&char_array[xxx];
Where xxx is the array index corresponding to the first byte in memory. Note that this might not be the same as the index of the lowest value.
I would strongly suggest the first class of solutions, as it is endianness-neutral.
Anyway, both of them are way faster than your memcpy solution.
memcpy is not valid for "shifting" (moving data by an offset shorter than its length within the same array); attempting to use it for such invokes very dangerous undefined behavior. See http://lwn.net/Articles/414467/
You must either use memmove or your own shifting loop. For sizes above about 64 bytes, I would expect memmove to be a lot faster. For extremely short shifts, your own loop may win. Note that memmove has more overhead than memcpy because it has to determine which direction of copying is safe. Your own loop already knows (presumably) which direction is safe, so it can avoid an extra runtime check.
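As a sketch of such a loop (names and signature are mine), shifting a buffer down by off bytes:
#include <stddef.h>

void shift_down(unsigned char *buf, size_t len, size_t off)
{
    /* Copying from low to high addresses is safe when moving data
       toward lower addresses, so no direction check is needed. */
    for (size_t i = 0; i + off < len; ++i)
        buf[i] = buf[i + off];
}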
I was wondering why both utf-16le and utf-16be exist? Is it considered "inefficient" for a big-endian environment to process little-endian data?
Currently, this is what I use while storing a 2-byte var locally:
unsigned char octets[2];
short int shortint = 12345; /* (assuming short int = 2 bytes) */
octets[0] = shortint & 255;
octets[1] = (shortint >> 8) & 255;
I know that while storing and reading with a fixed endianness locally, there is no endian risk. I was wondering if it's considered "inefficient"? What would be the most "efficient" way to store a 2-byte var? (While restricting the data to the environment's endianness, local use only.)
Thanks, Doori Bar
This allows code to write large amounts of Unicode data to a file without conversion. During loading, you must always check the endianness. If you're lucky, you need no conversion. So in 66% of the cases you need no conversion, and only in 33% must you convert.
In memory, you can then access the data using the native datatypes of your CPU, which allows for efficient processing.
That way, everyone can be as happy as possible.
So in your case, you need to check the encoding when loading the data but in RAM, you can use an array of short int to process it.
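The load-time check is usually done via the byte order mark (a sketch, assuming the stream starts with U+FEFF):
enum utf16_order { UTF16_BE, UTF16_LE, UTF16_UNKNOWN };

enum utf16_order detect_byte_order(const unsigned char *buf)
{
    if (buf[0] == 0xFE && buf[1] == 0xFF) return UTF16_BE;  /* U+FEFF stored big-endian */
    if (buf[0] == 0xFF && buf[1] == 0xFE) return UTF16_LE;  /* U+FEFF stored little-endian */
    return UTF16_UNKNOWN;
}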
[EDIT] The fastest way to convert a 16-bit value to two octets is:
char octet[2];
short *ptr = (short*)&octet[0];
*ptr = 12345;
Now you don't know whether octet[0] is the low or the upper 8 bits. To find out, write a known value and then examine it.
This will give you one of the encodings; the native one of your CPU.
If you need the other encoding, you can either swap the octets as you write them to a file (i.e. write them octet[1], octet[0]) or in your code.
If you have several octets, you can use 32-bit integers to swap two 16-bit values at once:
char octet[4];
short *ptr = (short*)&octet[0];
*ptr++ = 12345;
*ptr++ = 23456;
int *ptr32 = (int*)&octet[0];
int val = ((*ptr32 << 8) & 0xff00ff00) | ((*ptr32 >> 8) & 0x00ff00ff);