I'm currently learning from the book "The Shellcoder's Handbook". I have a strong understanding of C, but recently I came across a piece of code that I can't grasp.
Here is the piece of code:
char a[4];
unsigned int addr = 0x0806d3b0;
a[0] = addr & 0xff;
a[1] = (addr & 0xff00) >> 8;
a[2] = (addr & 0xff0000) >> 16;
a[3] = (addr) >> 24;
So the question is: what does this code do? What is addr & 0xff (and the three lines below it), and what does >> 8 do to it (I know that it divides by 2 eight times)?
Ps: don't hesitate to tell me if you have ideas for the tags that I should use.
The variable addr is 32 bits of data, while each element in the array a is 8 bits. What the code does is copy the 32 bits of addr into the array a, one byte at a time.
Let's take this line:
a[1] = (addr & 0xff00) >> 8;
And then do it step by step.
addr & 0xff00: this selects bits 8 to 15 of the value in addr; the result of the operation is 0x0000d300.
>> 8: this shifts the bits right by eight positions, so 0x0000d300 becomes 0x000000d3.
Finally, the result of the mask and shift is assigned to a[1].
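The mask-and-shift step generalizes to any byte position. As a minimal sketch (the helper name extract_byte is hypothetical and does not come from the book), shifting first and masking afterwards gives the same result with one constant mask:

```c
#include <stdint.h>

/* Hypothetical helper: returns byte number `index` of `value`,
   where index 0 is the least significant byte. Valid for index 0..3. */
uint8_t extract_byte(uint32_t value, unsigned index)
{
    /* Move the wanted byte down to bits 0..7, then drop everything above. */
    return (uint8_t)((value >> (8u * index)) & 0xFFu);
}
```

With addr = 0x0806d3b0, extract_byte(addr, 1) yields 0xd3, the same value the line discussed above stores in a[1].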
The code is trying to enforce endianness on the data. Specifically, it enforces little-endian order in the array. Here is the explanation:
a[0] = addr & 0xff; /* gets the LSB 0xb0 */
a[1] = (addr & 0xff00) >> 8; /* gets the 2nd LSB 0xd3 */
a[2] = (addr & 0xff0000) >> 16; /* gets 2nd MSB 0x06 */
a[3] = (addr) >> 24; /* gets the MSB 0x08 */
So basically, the code is masking and separating out every byte of data and storing it in the array "a" in the little endian format.
unsigned char a[4]; /* I think using unsigned char is better in this case */
unsigned int addr = 0x0806d3b0;
a[0] = addr & 0xff; /* get the least significant byte 0xb0 */
a[1] = (addr & 0xff00) >> 8; /* get the second least significant byte 0xd3 */
a[2] = (addr & 0xff0000) >> 16; /* get the second most significant byte 0x06 */
a[3] = (addr) >> 24; /* get the most significant byte 0x08 */
Apparently, the code isolates the individual bytes from addr to store them in the array a so they can be indexed. The first line
a[0] = addr & 0xff;
masks out the lowest-order byte by using 0xff as a bit mask; the subsequent lines do the same, but in addition shift the result to the rightmost position. Finally, in the last line
a[3] = (addr) >> 24;
no masking is necessary, as all unnecessary information is discarded by the shift.
The code effectively stores a 32-bit address in an array of 4 chars. As you may know, a char is one byte (8 bits). It first copies the lowest byte of the address, then shifts and copies the second byte, then shifts again, and so on. You get the gist.
It enforces endianness, and stores the integer in little-endian format in a.
See the illustration on Wikipedia.
Also, why not visualize the bit-shifting results:
#include <stdio.h>

int main(void)
{
    char a[4];
    unsigned int addr = 0x0806d3b0;
    a[0] = addr & 0xff;
    a[1] = (addr & 0xff00) >> 8;
    a[2] = (addr & 0xff0000) >> 16;
    a[3] = (addr) >> 24;
    int i = 0;
    for( ; i < 4; i++ )
    {
        /* The cast prevents sign extension when char is signed */
        printf( "a[%d] = %02x\t", i, (unsigned char)a[i] );
    }
    printf("\n" );
    return 0;
}
Output:
a[0] = b0 a[1] = d3 a[2] = 06 a[3] = 08
In addition to the multiple answers given, the code has some flaws that need to be fixed to make it portable. In particular, the char type is very dangerous to use for storing values, because of its implementation-defined signedness. This is a very classic C bug. If the code was taken from a book, then you should read that book sceptically.
While we are at it, we can also tidy up the code, make it overly explicit to avoid potential future maintenance bugs, remove some implicit type promotions of integer literals etc.
#include <stdint.h>
uint8_t a[4];
uint32_t addr = 0x0806d3b0UL;
a[0] = addr & 0xFFu;
a[1] = (addr >> 8) & 0xFFu;
a[2] = (addr >> 16) & 0xFFu;
a[3] = (addr >> 24) & 0xFFu;
The masks & 0xFFu are strictly speaking not needed, but they might save you from some false positive compiler warnings about wrong integer types. Alternatively, each shift result could be cast to uint8_t and that would have been fine too.
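The cast alternative mentioned above can be sketched as follows. The function name split_u32_le is hypothetical; the sketch assumes the same least-significant-byte-first layout as the code above.

```c
#include <stdint.h>

/* Hypothetical helper: writes the four bytes of `value` into `out`,
   least significant byte first (little-endian order). */
void split_u32_le(uint32_t value, uint8_t out[4])
{
    /* Converting to uint8_t keeps only the low 8 bits, so no mask is needed. */
    out[0] = (uint8_t)value;
    out[1] = (uint8_t)(value >> 8);
    out[2] = (uint8_t)(value >> 16);
    out[3] = (uint8_t)(value >> 24);
}
```

The explicit casts document the intended narrowing, which is usually enough to keep conversion warnings quiet.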
I need your help with this problem:
I want to store a 2-byte number in a char array. I have tried the two approaches below, but both have failed.
char buff[10];
char* ptr = buff;
/*
I want to store a 2 byte value say 750
Method 1 */
short a = 750;
*(++ptr)=a; //Did not work got these values in first 2 bytes in buffer: 0xffffffc8 0xffffffef
/* Method 2 */
short *a=750;
memcpy(++ptr,a,2) // Got segmentation fault
I know I can do this by dividing by 256 but I want to use a simpler method
*ptr++=750/256;
*ptr=750%256;
The easiest way is simply:
uint16_t u16 = 12345;
memcpy(&buff[i], &u16, 2);
memcpy will place the data according to your CPU's endianness.
Alternatively, you can bit shift, but since bit shifts themselves are endianness-independent, you need to manually pick the correct indices for buff according to the byte order you want.
Memory layout like Little Endian:
buff[i] = u16 & 0xFFu;
buff[i+1] = (u16 >> 8) & 0xFFu;
Memory layout like Big Endian:
buff[i] = (u16 >> 8) & 0xFFu;
buff[i+1] = u16 & 0xFFu;
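To see which of the two layouts memcpy produces on a given machine, a small probe can help. This is only a sketch; the helper names store_u16_native, load_u16_native and host_is_little_endian are made up for illustration. Note that memcpy is given the address of the variable that holds the value.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copies `value` into `dst` in the CPU's native byte order. */
void store_u16_native(unsigned char *dst, uint16_t value)
{
    memcpy(dst, &value, sizeof value);
}

/* Hypothetical helper: reads back a value stored by store_u16_native. */
uint16_t load_u16_native(const unsigned char *src)
{
    uint16_t value;
    memcpy(&value, src, sizeof value);
    return value;
}

/* Returns 1 if the least significant byte is stored first on this CPU. */
int host_is_little_endian(void)
{
    unsigned char probe[2];
    store_u16_native(probe, 0x0001u);
    return probe[0] == 0x01;
}
```

Storing 750 (0x02EE) this way puts 0xEE first on a little-endian CPU and 0x02 first on a big-endian one; reading it back with load_u16_native returns 750 on either.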
char buff[10];
short a=1023;
//To store in char array
buff[0] = a & 0xff;
buff[1] = (a >> 8) & 0xff;
// To get original value.
short b = ((buff[1] << 8) & 0xff00) | (buff[0] & 0x00ff);
Please note my comments, the question is still unanswered.
I have the following which I can't change:
unsigned long addr=142;
u16 offset_low, offset_middle;
u32 offset_high;
I want to set offset_low to the low 16 bits, offset_middle to the middle 16 bits, and offset_high to the upper 32 bits of addr.
So I wrote:
offset_low = addr & 0xFFFF;
offset_middle = addr & 0xFFFF0000;
offset_high = addr & 0xFFFFFFFF0000;
Is this right? Is there a cleaner way to do it instead of writing so many Fs?
Why do I think it's not right?
I am working with little endian, so when doing addr & 0xFFFF0000; I will get the mid bits but with zeros and it may load the zeros instead of non-zeroes.
For your purpose you must shift the masked values:
unsigned long addr = 142; // 64-bit on the target system
uint16_t offset_low = addr & 0xFFFF;
uint16_t offset_middle = (addr & 0xFFFF0000) >> 16;
uint32_t offset_high = (addr & 0xFFFFFFFF00000000) >> 32;
Note that since you extract exactly 16 and 32 bits to variables with the same size, masking can be omitted:
uint64_t addr = 142;
uint16_t offset_low = addr;
uint16_t offset_middle = addr >> 16;
uint32_t offset_high = addr >> 32;
The order of bytes in memory (little endian vs big endian) is irrelevant for this question. You could read the specific parts from memory using this knowledge, reading the first 2 bytes for offset_low, the next 2 for offset_middle and the next 4 for offset_high, but extracting from the full 64-bit value is performed the same for both architectures.
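As a rough sketch of the point that the extraction works on the value rather than on its memory layout, the split can be paired with its inverse. The struct and function names here are hypothetical, and addr is assumed to be 64 bits wide as in the answer.

```c
#include <stdint.h>

/* Hypothetical container for the three parts named in the question. */
struct offsets {
    uint16_t low;    /* bits 0..15  */
    uint16_t middle; /* bits 16..31 */
    uint32_t high;   /* bits 32..63 */
};

struct offsets split_addr(uint64_t addr)
{
    struct offsets parts;
    /* Conversion to the narrower type discards the upper bits. */
    parts.low = (uint16_t)addr;
    parts.middle = (uint16_t)(addr >> 16);
    parts.high = (uint32_t)(addr >> 32);
    return parts;
}

uint64_t join_addr(struct offsets parts)
{
    /* Widen before shifting so no bits are lost. */
    return ((uint64_t)parts.high << 32)
         | ((uint64_t)parts.middle << 16)
         | (uint64_t)parts.low;
}
```

For example, 0x123456789ABCDEF0 splits into low 0xDEF0, middle 0x9ABC and high 0x12345678 on both little-endian and big-endian machines, and join_addr restores the original value.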
Shifting 1 left by the desired number of bits and then subtracting 1 yields a run of that many 1 bits. This works unless the run has to include the top (most significant) bit of the integer type, because the shift count would then equal the width of the type.
Assuming that unsigned long in the environment has 33 bits or more, it can be written like this:
offset_low = addr & ((1UL << 16) - 1);
offset_middle = (addr >> 16) & ((1UL << 16) - 1);
offset_high = (addr >> 32) & ((1UL << 32) - 1);
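The mask construction can be wrapped in a helper that also covers the full-width case. This is a sketch under the assumption that 64-bit values are used; low_bits_mask and extract_bits are hypothetical names.

```c
#include <stdint.h>

/* Hypothetical helper: returns a mask with the lowest `width` bits set.
   A width of 64 is handled separately because shifting a 64-bit value
   by 64 is undefined behavior in C. */
uint64_t low_bits_mask(unsigned width)
{
    if (width >= 64u) {
        return UINT64_MAX;
    }
    return (UINT64_C(1) << width) - 1u;
}

/* Hypothetical helper: extracts `width` bits of `value` starting at bit `shift`.
   Assumes shift < 64. */
uint64_t extract_bits(uint64_t value, unsigned shift, unsigned width)
{
    return (value >> shift) & low_bits_mask(width);
}
```

With these, offset_middle corresponds to extract_bits(addr, 16, 16) and offset_high to extract_bits(addr, 32, 32).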
Is this right?
Not quite, these would be correct:
offset_low = addr & 0xFFFF;
offset_middle = (addr >> 16) & 0xFFFF;
offset_high = addr >> 32;
You didn't shift your results to the right (and your high was just wrong).
I am working with little endian, so when doing addr & 0xFFFF0000; I will get the mid bits but with zeros and it may load the zeros instead of non-zeroes.
Endianness doesn't matter in this code, because the shifts and masks operate on values and everything runs on the same machine. It only matters during serialization, where you write bytes to a stream on one machine and read them on a machine of different endianness, thus getting garbage.
I have been programming the 8051 for about two months now and am somewhat of a newbie to the C language. I am currently working with flash memory in order to read, write, erase, and analyze it. I am working on the write phase at the moment and one of the tasks that I need to do is specify an address location and fill that location with data then increment to the next location and fill it with complementary data. So on and so forth until I reach the end.
My dilemma is I have 18 address bits to play with and currently have three bytes allocated for those 18 bits. Is there any way that I could combine those 18 bits into an int or unsigned int and increment like that? Or is my only option to increment the first byte, then when that byte rolls over to 0x00 increment the next byte and when that one rolls over, increment the next?
I currently have:
void inc_address(void)
{
P6=address_byte1;
P7=address_byte2;
P2=address_byte3;
P5=data_byte;
while(1)
{
P6++;
if(P6==0x00){P7++;}
else if(P7==0x00){P2++;}
else if(P2 < 0x94){break;} //hex 9 is for values dealing with flash chip
P5=~data_byte;
}
}
Where address is uint32_t:
void inc_address(void)
{
// Increment address
address = (address + 1) & 0x0003ffff ;
// Assert address A0 to A15
P6 = address & 0xff ;
P7 = (address >> 8) & 0xff ;
// Set least significant two bits of P2 to A16,A17
// without modifying other bits in P2
P2 &= 0xFC ; // xxxxxx00
P2 |= (address >> 16) & 0x03 ; // xxxxxxAA
// Set data
P5 = ~data_byte ;
}
However, it is not clear why the function is called inc_address when it also assigns ~data_byte to P5, which presumably drives the data bus. It does more than increment an address, so the name is confusing. I also suggest that the function take the address and data as parameters rather than using global data.
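One way to follow the suggestion about parameters is to separate the address arithmetic from the port writes. The sketch below is an assumption-laden illustration: the struct port_values and both function names are hypothetical, and the struct merely stands in for the P6, P7 and P2 registers so the logic can be checked on a PC.

```c
#include <stdint.h>

/* Hypothetical stand-in for the port registers; on the 8051 these
   values would be written to P6, P7 and P2. */
struct port_values {
    uint8_t p6; /* A0..A7 */
    uint8_t p7; /* A8..A15 */
    uint8_t p2; /* A16, A17 in bits 0..1; other bits preserved */
};

/* Advances an 18-bit address, wrapping to 0 after 0x3FFFF. */
uint32_t next_address(uint32_t address)
{
    return (address + 1u) & 0x0003FFFFu;
}

/* Computes the port values for `address`; `old_p2` is the current P2 content. */
struct port_values address_to_ports(uint32_t address, uint8_t old_p2)
{
    struct port_values ports;
    ports.p6 = (uint8_t)(address & 0xFFu);
    ports.p7 = (uint8_t)((address >> 8) & 0xFFu);
    ports.p2 = (uint8_t)((old_p2 & 0xFCu) | ((address >> 16) & 0x03u));
    return ports;
}
```

On the target, the caller would copy the three fields to the ports and set the data port in a separately named step.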
Is there any way that I could combine those 18 bits into an int or unsigned int and increment like that?
Sure. Supposing that int and unsigned int are at least 18 bits wide on your system, you can do this:
unsigned int next_address = (hi_byte << 16) + (mid_byte << 8) + low_byte + 1;
hi_byte = next_address >> 16;
mid_byte = (next_address >> 8) & 0xff;
low_byte = next_address & 0xff;
The << and >> are bitwise shift operators, and the binary & is the bitwise "and" operator.
It would be a bit safer and more portable to not make assumptions about the sizes of your types, however. To avoid that, include stdint.h, and use type uint_least32_t instead of unsigned int:
uint_least32_t next_address = ((uint_least32_t) hi_byte << 16)
+ ((uint_least32_t) mid_byte << 8)
+ (uint_least32_t) low_byte
+ 1;
// ...
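Put together as one routine, the combine, increment and split steps might look like the sketch below. The function name is hypothetical, and the wrap at 18 bits is an assumption taken from the question, not something the snippet above does.

```c
#include <stdint.h>

/* Hypothetical helper: treats three bytes as one address, adds 1,
   and writes the bytes back. Assumes an 18-bit address, so the sum
   wraps at 0x3FFFF and any upper bits of *hi are cleared. */
void increment_address_bytes(uint8_t *hi, uint8_t *mid, uint8_t *lo)
{
    /* Widen each byte before shifting so the shift cannot overflow. */
    uint_least32_t address = ((uint_least32_t)*hi << 16)
                           | ((uint_least32_t)*mid << 8)
                           | (uint_least32_t)*lo;

    address = (address + 1u) & 0x3FFFFul;

    *hi = (uint8_t)(address >> 16);
    *mid = (uint8_t)((address >> 8) & 0xFFu);
    *lo = (uint8_t)(address & 0xFFu);
}
```

Starting from the bytes 0x01, 0xFF, 0xFF, one call leaves 0x02, 0x00, 0x00, so the carry propagation does not have to be written by hand.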
I am trying to extract two bytes from a 16-bit word, and to make a 16-bit word from two bytes. This is what I have tried (byte = unsigned char, word = unsigned short):
Split grpix word into 2 bytes:
word grpix; // Assume that the value has been initialized
byte grpixl = grpix & 0x00FF;
byte grpixh = grpix & 0xFF00;
Make grpix word from 2 bytes
byte grpixh; // Assume that the value has been initialized
byte grpixl; // Assume that the value has been initialized
word grpix = grpixh;
grpix <<= 8;
grpix += grpixl;
For some reason, my code doesn't work as expected, and now I'm not sure if the "splitting" of the word is wrong, if the "making" of the word is wrong, or both... Could you give me some advice?
You're not shifting when you split the word. So if grpix is 0x1234, then grpixl gets the expected 0x34 but grpixh ends up as 0x1200. You should say
byte grpixh = grpix >> 8;
Of course, you're also ignoring any endianness concerns that may be present. You should probably convert your word to a known endian (with something like htons()) before attempting to split (and do the reverse conversion when joining).
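A minimal sketch of the split and the join as a matching pair, using the byte and word names from the question (the typedefs and function names are assumptions for illustration):

```c
#include <stdint.h>

typedef uint8_t byte;   /* stands in for the question's unsigned char */
typedef uint16_t word;  /* stands in for the question's unsigned short */

/* The high byte has to be shifted down before it fits in a byte. */
byte high_byte(word w)
{
    return (byte)(w >> 8);
}

byte low_byte(word w)
{
    return (byte)(w & 0x00FFu);
}

/* Rebuilds the word; the high byte is widened before it is shifted up. */
word make_word(byte high, byte low)
{
    return (word)(((word)high << 8) | low);
}
```

For grpix = 0x1234 this gives a high byte of 0x12 and a low byte of 0x34, and make_word(0x12, 0x34) returns 0x1234 again.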
Get to know: http://graphics.stanford.edu/~seander/bithacks.html for doing all manner of operations.
right_byte = short_val & 0xFF;
left_byte = ( short_val >> 8 ) & 0xFF;
short_val = ( ( left_byte & 0xFF ) << 8 ) | ( right_byte & 0xFF );
I always do a &0xFF mask to assure I have no sign problems.
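The sign problem that the mask guards against can be shown with a small sketch. The function names are hypothetical; signed char is used explicitly so the behavior does not depend on whether plain char is signed.

```c
/* Without a mask, a negative char keeps its sign when it is widened,
   so the result has bits set above the low byte. */
unsigned int widen_unmasked(signed char c)
{
    return (unsigned int)c;
}

/* Masking with 0xFF keeps only the low 8 bits of the widened value. */
unsigned int widen_masked(signed char c)
{
    return (unsigned int)c & 0xFFu;
}
```

For c = -80 the masked version returns 0xB0, while the unmasked one returns a value far above 0xFF, which is the kind of result that shows up as 0xffffffb0 in a debugger or printout with a 32-bit int.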
The simple code that I use to solve this, is:
word=(msb<<8)+lsb;
The following routines have proved very reliable for me:
unsigned short get16int(const char *a)
{
    unsigned short hi, lo;
    /* The unsigned char casts avoid sign extension when the top bit is set */
    hi = (unsigned short)((unsigned char)*a++ << 8);
    lo = (unsigned char)*a;
    return (hi | lo);
}

void put16int(char *a, int i)
{
    *a++ = (char)(i >> 8); /* most significant byte first */
    *a = (char)i;
}
When you mask out the high byte, you also need to shift down by 8 bits; otherwise you just end up with a 16-bit number with the bottom eight bits cleared.
byte grpixh = (grpix & 0xFF00) >> 8;
Also, your composition can be more efficient by using or-equals instead of plus-equals:
grpix |= grpixh << 8;
word grpix = grpixl+256*grpixh;
I first convert an int32 number to a char[4] array, then convert the array back to int32 by casting to (int *), but the number isn't the same as before:
unsigned int num = 2130706432;
unsigned int x;
unsigned char a[4];
a[0] = (num>>24) & 0xFF;
a[1] = (num>>16) & 0xFF;
a[2] = (num>>8) & 0xFF;
a[3] = num & 0xFF;
x = *(int *)a;
printf("%d\n", x);
The output is 127. And if I set num = 127, the output is 2130706432.
Does anyone have any ideas?
Reverse the order of the a[] indexes, e.g., a[0] -> a[3].
I think you have the endianness in reverse.
Try this:
a[3] = (num>>24) & 0xFF;
a[2] = (num>>16) & 0xFF;
a[1] = (num>>8) & 0xFF;
a[0] = num & 0xFF;
To see what happens use
printf("%x\n", ...);
to print both input and output number.
Endian-independent way:
x = ((unsigned int)a[0] << 24) | ((unsigned int)a[1] << 16) | ((unsigned int)a[2] << 8) | a[3];
This line is never going to work correctly on a little-endian machine:
x = *(int *)a;
You need to unpack the data before you print out the value.
Your code a[0] = (num>>24) & 0xFF; takes the most significant 8 bits from num and sticks them in the first byte of a. On little-endian machines the first byte holds the least significant bits. That means that on little-endian machines, this code takes the most significant 8 bits and stores them in the place where the least significant bits go, changing the value.
2130706432 is 0x7F000000 in hex, and 127 is 0x0000007F.
Also, x = *(int *)a; results in undefined behavior. Consider hardware where reading an int from an improperly aligned address causes a bus error. If a doesn't happen to be aligned properly for an int then the program would crash.
A correct approach to interpreting the bytes as an int would be std::memcpy(&x, a, sizeof x);
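The two approaches from the answers can be compared side by side. This is a sketch in C with hypothetical helper names; it uses the C library memcpy rather than std::memcpy.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: stores `value` most significant byte first,
   like the code in the question. */
void pack_be32(uint32_t value, unsigned char out[4])
{
    out[0] = (unsigned char)((value >> 24) & 0xFFu);
    out[1] = (unsigned char)((value >> 16) & 0xFFu);
    out[2] = (unsigned char)((value >> 8) & 0xFFu);
    out[3] = (unsigned char)(value & 0xFFu);
}

/* Hypothetical helper: rebuilds the value from bytes stored by pack_be32.
   Each byte is widened before shifting, so the result is the same on
   any CPU regardless of its endianness. */
uint32_t unpack_be32(const unsigned char in[4])
{
    return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16)
         | ((uint32_t)in[2] << 8) | (uint32_t)in[3];
}

/* Reinterprets four bytes in the CPU's native order without the
   alignment problems of a pointer cast. */
uint32_t load_native32(const unsigned char in[4])
{
    uint32_t value;
    memcpy(&value, in, sizeof value);
    return value;
}
```

After pack_be32(2130706432, a), unpack_be32(a) returns 2130706432 everywhere, whereas load_native32(a) returns 127 on a little-endian machine, which matches the output reported in the question.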