I know how to reverse the byte order (convert big endian to little endian in C [without using provided func]) - in this case I'd like to use __builtin_bswap64
I also know how to copy a 64-bit uint to a char array - ideally with memcpy. (How do I convert a 64bit integer to a char array and back?)
My problem is combining the two. At the root of it, I'm trying to find a faster alternative to this code:
carr[33] = ((some64bitvalue >> 56) & 0xFF) ;
carr[34] = ((some64bitvalue >> 48) & 0xFF) ;
carr[35] = ((some64bitvalue >> 40) & 0xFF) ;
carr[36] = ((some64bitvalue >> 32) & 0xFF) ;
carr[37] = ((some64bitvalue >> 24) & 0xFF) ;
carr[38] = ((some64bitvalue >> 16) & 0xFF) ;
carr[39] = ((some64bitvalue >> 8) & 0xFF) ;
carr[40] = (some64bitvalue & 0xFF);
As memcpy doesn't take the result of __builtin_bswap64 as its source argument (or does it?), I tried this:
*(uint64_t *)upub+33 = __builtin_bswap64(some64bitvalue);
but I end up with the
error: lvalue required as left operand of assignment
Is there a faster alternative to the original code I'm trying to replace at all?
This:
*(uint64_t *)upub+33 = __builtin_bswap64(PplusQ[di][3]);
parses as
(*(uint64_t *) upub) + 33 = __builtin_bswap64(PplusQ[di][3]);
so the left-hand side is a uint64_t value, not an lvalue.
So would this work?
*(uint64_t *) (upub+33) = __builtin_bswap64(PplusQ[di][3]);
or did you mean to cast upub to uint64_t * first, as Aconcagua commented?
*((uint64_t *) upub + 33) = __builtin_bswap64(PplusQ[di][3]);
I didn't see the type of upub mentioned, so I can't tell.
Also, if upub originally points to another type, there may be an issue with the strict aliasing rules, so you may want to use something like gcc's -fno-strict-aliasing, make the assignment through a union, or copy one byte at a time as in your first code snippet.
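For illustration, a minimal sketch of the union route, assuming upub points at plain unsigned char bytes and that __builtin_bswap64 is available (GCC/Clang); the helper name is a placeholder:

#include <stdint.h>

union u64_bytes {
    uint64_t v;
    unsigned char b[8];
};

static void put_swapped64(unsigned char *dst, uint64_t value)
{
    union u64_bytes u;
    u.v = __builtin_bswap64(value);
    for (int i = 0; i < 8; i++)     /* byte-at-a-time copy, no aliasing issue */
        dst[i] = u.b[i];
}

/* usage: put_swapped64(upub + 33, some64bitvalue); */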
You can do the copy like this:
uint64_t tmp = __builtin_bswap64(some64bitvalue);
memcpy(upub + 33, &tmp, sizeof(tmp));
assuming upub is a pointer variable.
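As a quick sanity check (the test value and buffer names are mine), this compares the bswap+memcpy route against the original shift-and-mask code; the two agree on a little-endian host:

#include <stdint.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    uint64_t some64bitvalue = 0x0123456789ABCDEFULL;
    unsigned char a[8], b[8];

    for (int i = 0; i < 8; i++)                        /* shift-and-mask version */
        a[i] = (some64bitvalue >> (56 - i * 8)) & 0xFF;

    uint64_t tmp = __builtin_bswap64(some64bitvalue);  /* bswap + memcpy version */
    memcpy(b, &tmp, sizeof tmp);                       /* matches only on little-endian hosts */

    printf("%s\n", memcmp(a, b, 8) == 0 ? "match" : "differ");
    return 0;
}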
When writing endian-independent code, there is no alternative to bit shifts. Your code is likely already close to ideal.
What you could play around with is using a loop instead of hard-coded numbers. Something along these lines:
for (uint_fast8_t i = 0; i < 8; i++)
{
    carr[i + offset] = (some64bitvalue >> (56 - i * 8)) & 0xFF;
}
This may turn out slower, faster, or the same compared to what you already have, depending on the system. Overall, it doesn't make much sense to discuss manual optimization like this without a specific system in mind.
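If it helps, here is that loop wrapped into a self-contained helper (store_be64 and its parameters are placeholder names, not from the question):

#include <stddef.h>
#include <stdint.h>

/* Store a 64-bit value most-significant byte first, starting at carr[offset]. */
static void store_be64(uint8_t *carr, size_t offset, uint64_t some64bitvalue)
{
    for (uint_fast8_t i = 0; i < 8; i++)
    {
        carr[offset + i] = (some64bitvalue >> (56 - i * 8)) & 0xFF;
    }
}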
Related
I am programming an Atmel SAMD20 in C. I came upon an error that I have now fixed, but I'm not quite sure why it happened in the first place. Can someone point it out to me? (It's probably far too obvious, and I'm going to facepalm later.)
An array of sensors is generating uint16_t data, which I converted to uint8_t to send over I2C. So, this is how I originally wrote it:
for (i = 0; i < SENSBUS1_COUNT; ++i)
{
write_buffer[ (i*2) ] = (uint8_t) sample_sensbus1[i] & 0xff;
write_buffer[(i*2)+1] = (uint8_t) sample_sensbus1[i] >> 8;
}
Here, write_buffer is uint8_t and sample_sensbus1 is uint16_t.
This, for some reason, ends up messing up the most significant byte (in most cases, the most significant byte is just 1 (i.e. 0x100)). This, on the other hand, works fine, and is exactly what it should be:
for (i = 0; i < SENSBUS1_COUNT; ++i)
{
write_buffer[ (i*2) ] = sample_sensbus1[i] & 0xff;
write_buffer[(i*2)+1] = sample_sensbus1[i] >> 8;
}
Clearly, the implicit cast is smarter than I am.
What is going on?
write_buffer[(i*2)+1] = (uint8_t) sample_sensbus1[i] >> 8;
This is equivalent to:
write_buffer[(i*2)+1] = ((uint8_t) sample_sensbus1[i]) >> 8;
As you see, it does the cast before it does the shift. Your most significant byte is now gone.
This should work, though:
write_buffer[(i*2)+1] = (uint8_t) (sample_sensbus1[i] >> 8);
Your cast converts the uint16_t to uint8_t before it does the shift or mask. It is treated as though you wrote:
write_buffer[ (i*2) ] = ((uint8_t)sample_sensbus1[i]) & 0xff;
write_buffer[(i*2)+1] = ((uint8_t)sample_sensbus1[i]) >> 8;
You might need:
write_buffer[ (i*2) ] = (uint8_t)(sample_sensbus1[i] & 0xff);
write_buffer[(i*2)+1] = (uint8_t)(sample_sensbus1[i] >> 8);
In practice, the uncast version is OK too. Remember, a cast tells the compiler "I know more about this than you do; do as I say". That's dangerous if you don't know more than the compiler. Avoid casts whenever you can.
You might also note that shifting (left or right) by the width of the type in bits (or more) is undefined behaviour. However, ((uint8_t)sample_sensbus1[i]) >> 8 is not undefined behaviour: the 'usual arithmetic conversions' mean that the result of (uint8_t)sample_sensbus1[i] is converted to int before the shift occurs, and an int cannot be 8 bits wide (it must be at least 16 bits to satisfy the standard), so the shift is not too big.
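A tiny standalone example (the values are mine, not from the question) showing both the truncation and why the promotion keeps the shift well defined:

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint16_t sample = 0x1234;

    /* Cast first: the value is truncated to 0x34, promoted to int,
     * then shifted right by 8, leaving 0x00. Well defined, but wrong. */
    uint8_t cast_first = (uint8_t)sample >> 8;

    /* Shift first, cast second: yields the intended high byte 0x12. */
    uint8_t shift_first = (uint8_t)(sample >> 8);

    printf("cast first: 0x%02X, shift first: 0x%02X\n", cast_first, shift_first);
    return 0;
}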
This is a question of operator precedence. In the first example, you are first converting to uint8_t and are applying the & and >> operators second. In the second example, those are applied before the implicit conversion takes place.
Casting is a unary prefix operator and as such has very high precedence.
(uint8_t) sample_sensbus1[i] & 0xff
parses as
((uint8_t)sample_sensbus1[i]) & 0xff
In this case & 0xff is redundant. But:
(uint8_t) sample_sensbus1[i] >> 8
parses as
((uint8_t)sample_sensbus1[i]) >> 8
Here the cast truncates the number to 8 bits, then >> 8 shifts everything out.
The problem is in this expression:
(uint8_t) sample_sensbus1[i] >> 8;
It is doing the following sequence:
1. Converting sample_sensbus1[i] to uint8_t, effectively truncating it to the 8 least significant bits. This is where you are losing your data.
2. Converting the above to int as part of the usual arithmetic conversions, making an int with only the 8 lower bits possibly set.
3. Shifting the above int right by 8 bits, effectively making the whole expression zero.
I have been programming the 8051 for about two months now and am somewhat of a newbie to the C language. I am currently working with flash memory in order to read, write, erase, and analyze it. I am working on the write phase at the moment and one of the tasks that I need to do is specify an address location and fill that location with data then increment to the next location and fill it with complementary data. So on and so forth until I reach the end.
My dilemma is that I have 18 address bits to play with and currently have three bytes allocated for those 18 bits. Is there any way that I could combine those 18 bits into an int or unsigned int and increment like that? Or is my only option to increment the first byte, then when that byte rolls over to 0x00 increment the next byte, and when that one rolls over, increment the next?
I currently have:
void inc_address(void)
{
    P6 = address_byte1;
    P7 = address_byte2;
    P2 = address_byte3;
    P5 = data_byte;
    while (1)
    {
        P6++;
        if (P6 == 0x00) { P7++; }
        else if (P7 == 0x00) { P2++; }
        else if (P2 < 0x94) { break; }  // hex 9 is for values dealing with flash chip
        P5 = ~data_byte;
    }
}
Where address is uint32_t:
void inc_address(void)
{
    // Increment address, wrapping at 18 bits
    address = (address + 1) & 0x0003FFFF;

    // Assert address A0 to A15
    P6 = address & 0xFF;
    P7 = (address >> 8) & 0xFF;

    // Set least significant two bits of P2 to A16, A17
    // without modifying other bits in P2
    P2 &= 0xFC;                      // xxxxxx00
    P2 |= (address >> 16) & 0x03;    // xxxxxxAA

    // Set data
    P5 = ~data_byte;
}
However, it is not clear why a function called inc_address also assigns P5 with ~data_byte, which presumably asserts the data bus. It seems to be doing more than incrementing an address, so it is poorly and confusingly named. I also suggest that the function take the address and data as parameters rather than using global data; a sketch of that is below.
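A hedged sketch of that parameterised version (the function name is mine; the port names come from the question, and on a real 8051 their declarations would come from the device header):

#include <stdint.h>

/* Normally provided by the part's device header; repeated here only so the
 * sketch is self-contained. */
extern volatile uint8_t P2, P5, P6, P7;

/* Increment and drive the 18-bit address, drive the data bus,
 * and return the new address. */
static uint32_t write_next(uint32_t address, uint8_t data)
{
    address = (address + 1) & 0x0003FFFFu;          /* wrap at 18 bits */

    P6 = address & 0xFF;                            /* A0..A7   */
    P7 = (address >> 8) & 0xFF;                     /* A8..A15  */
    P2 = (P2 & 0xFC) | ((address >> 16) & 0x03);    /* A16, A17 */

    P5 = data;                                      /* drive the data bus */
    return address;
}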
Is there anyway that I could combine those 18 bits into an int or
unsigned int and increment like that?
Sure. Supposing that int and unsigned int are at least 18 bits wide on your system, you can do this:
unsigned int next_address = (hi_byte << 16) + (mid_byte << 8) + low_byte + 1;
hi_byte = next_address >> 16;
mid_byte = (next_address >> 8) & 0xff;
low_byte = next_address & 0xff;
The << and >> are bitwise shift operators, and the binary & is the bitwise "and" operator.
It would be a bit safer and more portable to not make assumptions about the sizes of your types, however. To avoid that, include stdint.h, and use type uint_least32_t instead of unsigned int:
uint_least32_t next_address = ((uint_least32_t) hi_byte << 16)
+ ((uint_least32_t) mid_byte << 8)
+ (uint_least32_t) low_byte
+ 1;
// ...
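If it helps, the same idea as a complete helper with the 18-bit wrap-around made explicit (the function name and pointer parameters are mine, not from the question):

#include <stdint.h>

/* Combine the three address bytes, increment, wrap at 18 bits, split back out. */
static void increment_address(uint8_t *hi_byte, uint8_t *mid_byte, uint8_t *low_byte)
{
    uint_least32_t next_address = ((uint_least32_t)*hi_byte << 16)
                                + ((uint_least32_t)*mid_byte << 8)
                                +  (uint_least32_t)*low_byte
                                + 1;

    next_address &= 0x3FFFF;                 /* keep only the 18 address bits */

    *hi_byte  = (next_address >> 16) & 0xFF;
    *mid_byte = (next_address >> 8) & 0xFF;
    *low_byte =  next_address & 0xFF;
}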
Say you have an integer and you want to convert it to a byte array. After searching various places I've seen two ways of doing this: one is shift only, and one is shift then mask. I understand the shifting part, but why masking?
For example, scenario 1:
uint8 someByteArray[4];
uint32 someInt;
someByteArray[0] = someInt >> 24;
someByteArray[1] = someInt >> 16;
someByteArray[2] = someInt >> 8;
someByteArray[3] = someInt;
Scenario 2:
uint8 someByteArray[4];
uint32 someInt;
someByteArray[0] = (someInt >> 24) & 0xFF;
someByteArray[1] = (someInt >> 16) & 0xFF;
someByteArray[2] = (someInt >> 8) & 0xFF;
someByteArray[3] = someInt & 0xFF;
Is there a reason for choosing one over the other?
uint8 and uint32 are not standard types in C. I assume they represent 8-bit and 32-bit unsigned integral types, respectively (such as supported by Microsoft compilers as a vendor-specific extension).
Anyways ....
The masking is more general - it ensures the result is between 0 and 0xFF regardless of the actual type of elements someByteArray or of someInt.
In this particular case, it makes no difference, since the conversion of uint32 to uint8 is guaranteed to use modulo arithmetic (modulo 0xFF + 0x01 which is equal to 0x100 or 256 in decimal). However, if your code is changed to use variables or arrays of different types, the masking is necessary to ensure the result is between 0 and 255 (inclusive).
With some compilers the masking stops compiler warnings (it effectively tells the compiler that the expression produces a value between 0 and 0xFF, which can be stored in an 8-bit unsigned type). However, some other compilers complain about the act of converting a larger type to an 8-bit type. Because of that, you will sometimes see a third variant, which truly demonstrates a "belts and suspenders" mindset.
uint8 someByteArray[4];
uint32 someInt;
someByteArray[0] = (uint8)((someInt >> 24) & 0xFF);
someByteArray[1] = (uint8)((someInt >> 16) & 0xFF);
someByteArray[2] = (uint8)((someInt >> 8) & 0xFF);
someByteArray[3] = (uint8)(someInt & 0xFF);
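To make the "different types" point concrete, here is a made-up case where the source is signed and the destination is wider than a byte; on the common implementations where right-shifting a negative value is arithmetic, the mask is what keeps the result in 0..0xFF:

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    int32_t someInt = -2;                          /* bit pattern 0xFFFFFFFE */

    int without_mask = someInt >> 8;               /* typically -1: sign bits shifted in */
    int with_mask    = (someInt >> 8) & 0xFF;      /* 0xFF: one clean byte */

    printf("without mask: %d, with mask: %d\n", without_mask, with_mask);
    return 0;
}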
In an interview, I was asked to implement big_to_little_endian() as a macro. I implemented it using shift operators, but the interviewer wanted me to optimize it further. I could not do it. Later I googled and searched but could not find anything. Can someone help me understand how to further optimize this code?
#define be_to_le(x) (((x) >> 24) | (((x) & 0x00FF0000) >> 8) | (((x) & 0x0000FF00) << 8) | ((x) << 24))
He might have been referring to using a 16-bit operation to swap the two halves, then 8-bit operations to swap the bytes within them. That saves a couple of instructions and is easiest done through a union, though C technically doesn't like it (many compilers will accept it), and it is still compiler-dependent, since you are hoping the compiler optimizes a couple of things away:
union dword {
    unsigned int i;
    struct { unsigned short s0, s1; } s;
    struct { unsigned char c0, c1, c2, c3; } c;
};

union dword in   = { .i = x };
union dword temp = { .s = { in.s.s1, in.s.s0 } };       /* swap the two 16-bit halves */
union dword out  = { .c = { temp.c.c1, temp.c.c0,       /* swap the bytes within      */
                            temp.c.c3, temp.c.c2 } };   /* each half                  */
This is only a sketch of the idea, and I don't think the compiler will even emit what I'm hoping it will.
Or you can save an op but introduce a data dependency, so it probably runs slower:
temp = (x << 16) | (x >> 16);
out  = ((temp & 0xff00ff00) >> 8) | ((temp & 0x00ff00ff) << 8);
The best option is to just use the compiler intrinsic, since it maps to a single byte-swap instruction.
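For example (a hedged sketch: __builtin_bswap32 is a GCC/Clang builtin, MSVC has _byteswap_ulong instead, and the fallback is the same shift-and-mask macro):

#include <stdint.h>

#if defined(__GNUC__) || defined(__clang__)
#define BE_TO_LE(x) __builtin_bswap32(x)   /* typically compiles to a single byte-swap instruction */
#else
#define BE_TO_LE(x) (((uint32_t)(x) >> 24)                | \
                     (((uint32_t)(x) & 0x00FF0000u) >> 8) | \
                     (((uint32_t)(x) & 0x0000FF00u) << 8) | \
                     ((uint32_t)(x) << 24))
#endif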
I'm using AES to encrypt some data that I'm going to send in a packet. I need to store an integer in an array of 8 bit elements. To make this clear, my array is declared as:
uint8_t in[16] = {0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00};
I need to be able to store an integer in this array and then easily retrieve the data in the receiving client. Is there an easy way to accomplish this?
This is usually achieved via bit-shifting:
int i = 42;
in[0] = i & 0xff;
in[1] = (i >> 8) & 0xff;
in[2] = (i >> 16) & 0xff;
in[3] = (i >> 24) & 0xff;
Note that you cannot always be guaranteed that an int is four bytes. However, it's easy enough to turn the above code into a loop, based on sizeof i.
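For instance, a sketch of that loop form (the helper name store_uint is a placeholder, not from the question):

#include <stddef.h>
#include <stdint.h>

/* Store an unsigned int into buf, least significant byte first,
 * using sizeof so the loop adapts to the integer's actual width. */
static void store_uint(uint8_t *buf, unsigned int value)
{
    for (size_t k = 0; k < sizeof value; k++)
        buf[k] = (value >> (8 * k)) & 0xFF;
}

/* usage: store_uint(in, 42); */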
Retrieving the integer works as follows:
int i = in[0] | (in[1] << 8) | (in[2] << 16) | (in[3] << 24);
Of course, if you are about to encrypt this with AES, you need to give some thought to a sensible padding algorithm. Currently you look like you're heading towards zero-padding, which is far from optimal.
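As a hedged illustration of one common choice, PKCS#7 padding appends N bytes each holding the value N; the helper below is mine, not part of the question's code:

#include <stddef.h>
#include <stdint.h>

/* buf must have room for len rounded up to the next multiple of block_size.
 * Returns the padded length. */
static size_t pkcs7_pad(uint8_t *buf, size_t len, size_t block_size)
{
    size_t pad = block_size - (len % block_size);   /* always 1..block_size bytes */
    for (size_t k = 0; k < pad; k++)
        buf[len + k] = (uint8_t)pad;
    return len + pad;
}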