I'm looking at some C code that contains this statement.
if (
((uint8_t *)row)[byte] & (1 << (8-bit))
)
value |= (value + 1);
What would be the meaning and purpose of putting the AND of a pointer and an integer inside the conditional parentheses?
There are meanings, in other contexts, but that's not what's happening here.
It's casting row (which I assume is a pointer of some sort) to a uint8_t *, and then picking out the byte-th uint8_t in that array. That is then bitwise-anded with the shifted-left stuff.
It's logically the same as:
uint8_t shifted = (1 << (8 - bit));
uint8_t *rowptr = (uint8_t *)row;
uint8_t rowval = rowptr[byte];
uint8_t combined = (rowval & shifted);
if (combined) // or, if (combined != 0)
value |= (value + 1);
That isn't what it's doing.
(uint8_t *)row
cast row to pointer-to-unsigned-byte
((uint8_t *)row)[byte]
... and apply array addressing to retrieve the unsigned byte that is byte bytes forward from there. (Array addressing and pointer math are somewhat interchangeable; pointerval[intval] means the same thing as *(pointerval + intval).)
So that means
((uint8_t *)row)[byte] & (1 << (8-bit))
retrieves the byte-th unsigned byte from the row and masks out everything except the single bit selected by 1 << (8-bit).
Finally, putting it all together,
if ( ((uint8_t *)row)[byte] & (1 << (8-bit)) )
tests whether the result of the expression is true (nonzero).
So this is asking whether a particular bit of a particular byte in the row is nonzero.
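As a minimal sketch, the same test can be wrapped in a small function. The name row_bit_is_set and the parameter types are assumptions for illustration, not part of the original code, and bit is assumed to be in the range 1 to 8 (with bit equal to 0 the mask is 0x100, which can never match a byte).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if the selected bit of byte number `byte`
 * in `row` is set, else 0. Uses the same mask as the original statement,
 * 1 << (8 - bit). Assumes 1 <= bit <= 8. */
static int row_bit_is_set(const void *row, size_t byte, unsigned bit)
{
    const uint8_t *bytes = (const uint8_t *)row;  /* view row as raw bytes */
    unsigned mask = 1u << (8u - bit);             /* bit == 1 selects bit 7 */
    return (bytes[byte] & mask) != 0;             /* nonzero means "set" */
}
```

With this numbering, bit 1 is the most significant bit of the byte and bit 8 is the least significant one.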
I believe in this case it is for checking whether a specific bit is on.
It's testing whether one bit of ((uint8_t *)row)[byte] is set or not: bit number 8-bit, which is bit 7 when bit is 1. The & binary operator is the bitwise AND operator, not the logical AND operator. 1<<(8-bit) is an expression commonly used to generate a bit mask to isolate one bit.
row may be a generic pointer, so (uint8_t *)row is used to cast this pointer to a pointer to an array of bytes.
This isn't an AND of the pointer. You start from the pointer, index [byte] elements past that starting location, and the value stored there is what is being ANDed.
int n_b ( char *addr , int i ) {
char char_in_chain = addr [ i / 8 ] ;
return char_in_chain >> i%8 & 0x1;
}
In particular, what does "i%8 & 0x1" mean?
Edit: Note that 0x1 is the hexadecimal notation for 1. Also note that :
0x1 = 0x01 = 0x000001 = 0x0...01
i%8 means i modulo 8, i.e. the remainder of the Euclidean division of i by 8.
& 0x1 is a bitwise AND, it converts the number before to binary form then computes the bitwise operation. (it's already in binary but it's just so you understand)
Example (in binary): 0b1101 & 0b1001 = 0b1001
Note that any number & 0x1 is either 0 or 1.
Example: 0b11111111 & 0b00000001 is 0b1 and 0b11111110 & 0b00000001 is 0b0
Essentially, it tests the last bit of the number, which is the bit that determines parity.
Final edit:
I got the precedence wrong, thanks to the comments for pointing it out. Here is the real precedence.
First, we compute i%8.
The result could be 0, 1, 2, 3, 4, 5, 6, 7.
Then, we shift the char by the result, which is maximum 7. That means the i % 8 th bit is now the least significant bit.
Then, we check if the original i % 8 bit is set (equals one) or not. If it is, return 1. Else, return 0.
This function returns the value of a specific bit in a char array as the integer 0 or 1.
addr is the pointer to the first char.
i is the index to the bit. 8 bits are commonly stored in a char.
First, the char at the correct offset is fetched:
char char_in_chain = addr [ i / 8 ] ;
i / 8 divides i by 8, ignoring the remainder. For example, any value in the range from 24 to 31 gives 3 as the result.
This result is used as the index to the char in the array.
Next and finally, the bit is obtained and returned:
return char_in_chain >> i%8 & 0x1;
Let's just look at the expression char_in_chain >> i%8 & 0x1.
It is confusing, because it does not show which operation is done in what sequence. Therefore, I duplicate it with appropriate parentheses: (char_in_chain >> (i % 8)) & 0x1. The rules (operation precedence) are given by the C standard.
First, the remainder of the division of i by 8 is calculated. This is used to right-shift the obtained char_in_chain. Now the interesting bit is in the least significant bit. Finally, this bit is "masked" with the binary AND operator and the second operand 0x1. BTW, there is no need to mark this constant as hex.
Example:
The array contains the bytes 0x5A, 0x23, and 0x42. The index of the bit to retrieve is 13.
i, as given as the argument, is 13.
i / 8 gives 13 / 8 = 1, remainder ignored.
addr[1] returns 0x23, which is stored in char_in_chain.
i % 8 gives 5 (13 / 8 = 1, remainder 5).
0x23 is binary 0b00100011, and right-shifted by 5 gives 0b00000001.
0b00000001 ANDed with 0b00000001 gives 0b00000001.
The value returned is 1.
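The worked example above can be checked with a short sketch. The function below follows the same logic as n_b, but the name get_bit and the use of unsigned char (to sidestep the signedness of plain char) are my assumptions, not the original code.

```c
#include <stddef.h>

/* Sketch of the same lookup as n_b, using unsigned char so that the
 * right shift never operates on a negative value. */
static int get_bit(const unsigned char *addr, size_t i)
{
    unsigned char byte = addr[i / 8];  /* which byte holds bit i */
    return (byte >> (i % 8)) & 1;      /* move bit i%8 to the LSB, mask it */
}
```

For an array holding 0x5A, 0x23 and 0x42, calling get_bit with index 13 selects byte 1 (0x23), shifts it right by 5 and returns 1, matching the steps listed above.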
Note: If more is not clear, feel free to comment.
What the various operators do is explained by any C book, so I won't address that here. To instead analyse the code step by step...
The function and types used:
int as return type is an indication of the programmer being inexperienced at writing hardware-related code. We should always avoid signed types for such purposes. An experienced programmer would have used an unsigned type, like for example uint8_t. (Or in this specific case maybe even bool, depending on what the data is supposed to represent.)
n_b is a rubbish name, we should obviously never give an identifier such a nondescript name. get_bit or similar would have been a better name.
char* is, again, an indication of the programmer being inexperienced. char is particularly problematic when dealing with raw data, since we can't even know whether it is signed or unsigned; that depends on the compiler. Had the raw data contained a value of 0x80 or larger and char been signed, we would have gotten a negative value. Right-shifting a negative value is also problematic, since that behavior is compiler-specific too.
char* is proof of the programmer lacking the fundamental knowledge of const correctness. The function does not modify this parameter so it should have been const qualified. Good code would use const uint8_t* addr.
int i is not really incorrect, the signedness doesn't really matter. But good programming practice would have used an unsigned type or even size_t.
With types unsloppified and corrected, the function might look like this:
#include <stddef.h> // size_t
#include <stdint.h>
uint8_t get_bit (const uint8_t* addr, size_t i ) {
uint8_t char_in_chain = addr [ i / 8 ] ;
return char_in_chain >> i%8 & 0x1;
}
This is still somewhat problematic, because the average C programmer might not remember the precedence of >> vs % vs & off the top of their head. It happens to be % over >> over &, but let's make the code a bit more readable still by making precedence explicit: (char_in_chain >> (i%8)) & 0x1.
Then I would question if the local variable really adds anything to readability. Not really, we might as well write:
uint8_t get_bit (const uint8_t* addr, size_t i ) {
return ((addr[i/8]) >> (i%8)) & 0x1;
}
As for what this code actually does: this happens to be a common design pattern for how to access a specific bit in a raw bit-field.
Any bit-field in C may be accessed as an array of bytes.
Bit number n in that bit-field, will be found at byte n/8.
Inside that byte, the bit will be located at n%8.
Bit masking in C is most readably done as data & (1u << bit). Which can be obfuscated as somewhat equivalent but less readable (data >> bit) & 1u, where the masked bit ends up in the LSB.
For example, let's assume we have 64 bits of raw data. Bits are enumerated from 0 to 63 and bytes (just like any C array) from index 0. We want to access bit 33. Integer division gives 33/8 = 4,
so the bit lives in byte[4]. Within that byte it sits at position 33%8 = 1. So we can obtain the value of bit 33 with ordinary bit masking, byte[33/8] & (1u << (33%8)), or similarly with (byte[33/8] >> (33%8)) & 1u.
An alternative, more readable version of it all:
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
bool is_bit_set (const uint8_t* data, size_t bit)
{
uint8_t byte = data [bit / 8u];
size_t mask = 1u << (bit % 8u);
return (byte & mask) != 0u;
}
(Strictly speaking we could as well do return byte & mask; since a boolean type is used, but it doesn't hurt to be explicit.)
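To make the bit 33 example concrete, here is a small sketch comparing the mask form and the shift form. The function names bit_set_mask and bit_set_shift are made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Mask form: isolate the bit where it sits inside the byte. */
static bool bit_set_mask(const uint8_t *data, size_t bit)
{
    return (data[bit / 8u] & (1u << (bit % 8u))) != 0u;
}

/* Shift form: move the bit down to the LSB first, then mask with 1. */
static bool bit_set_shift(const uint8_t *data, size_t bit)
{
    return ((data[bit / 8u] >> (bit % 8u)) & 1u) != 0u;
}
```

With 8 bytes of raw data in which only byte 4 is 0x02, both functions report bit 33 as set and its neighbours 32 and 34 as clear.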
Say I am reading and writing uint32_t values to and from a stream. If I read/write one byte at a time to/from a stream and shift each byte like the below examples, will the results be consistent regardless of machine endianness?
In the examples here the stream is a buffer in memory called p.
static uint32_t s_read_uint32(uint8_t** p)
{
uint32_t value;
value = (*p)[0];
value |= (((uint32_t)((*p)[1])) << 8);
value |= (((uint32_t)((*p)[2])) << 16);
value |= (((uint32_t)((*p)[3])) << 24);
*p += 4;
return value;
}
static void s_write_uint32(uint8_t** p, uint32_t value)
{
(*p)[0] = value & 0xFF;
(*p)[1] = (value >> 8 ) & 0xFF;
(*p)[2] = (value >> 16) & 0xFF;
(*p)[3] = value >> 24;
*p += 4;
}
I don't currently have access to a big-endian machine to test this out, but the idea is if each byte is written one at a time each individual byte can be independently written or read from the stream. Then the CPU can handle endianness by hiding these details behind the shifting operations. Is this true, and if not could anyone please explain why not?
If I read/write one byte at a time to/from a stream and shift each byte like the below examples, will the results be consistent regardless of machine endianness?
Yes. Your s_write_uint32() function stores the bytes of the input value in order from least significant to most significant, regardless of their order in the native representation of that value. Your s_read_uint32() correctly reverses this process, regardless of the underlying representation of uint32_t. These work because
the behavior of the shift operators (<<, >>) is defined in terms of the value of the left operand, not its representation
the & 0xff masks off all bits of the left operand but those of its least-significant byte, regardless of the value's representation (because 0xff has a matching representation), and
the |= operations just put the bytes into the result; the positions are selected, appropriately, by the preceding left shift. This might be more clear if += were used instead, but the result would be no different.
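A minimal sketch of why this holds: the helpers below use the same shifts and masks as the question's functions, but take a plain buffer pointer. The names put_le32 and get_le32 are assumptions for illustration.

```c
#include <stdint.h>

/* Store value least-significant byte first, whatever the host byte order.
 * The shifts act on the value, not on its in-memory representation. */
static void put_le32(uint8_t *out, uint32_t value)
{
    out[0] = (uint8_t)(value & 0xFF);
    out[1] = (uint8_t)((value >> 8) & 0xFF);
    out[2] = (uint8_t)((value >> 16) & 0xFF);
    out[3] = (uint8_t)(value >> 24);
}

/* Rebuild the value from four bytes stored least-significant first. */
static uint32_t get_le32(const uint8_t *in)
{
    return (uint32_t)in[0]
         | ((uint32_t)in[1] << 8)
         | ((uint32_t)in[2] << 16)
         | ((uint32_t)in[3] << 24);
}
```

Writing 0x12345678 yields the bytes 0x78, 0x56, 0x34, 0x12 in the buffer on any host, and reading them back returns 0x12345678.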
Note, however, that to some extent, you are reinventing the wheel. POSIX defines a function pair htonl() and ntohl() -- supported also on many non-POSIX systems -- for dealing with byte-order issues in four-byte numbers. The idea is that when sending, everyone uses htonl() to convert from host byte order (whatever that is) to network byte order (big endian) and sends the resulting four-byte buffer. On receipt, everyone accepts four bytes into one number, then uses ntohl() to convert from network to host byte order.
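A rough sketch of the htonl()/ntohl() approach on a POSIX system follows. The helper names put_be32 and get_be32 are hypothetical. Note that network byte order is big endian, so the stream layout differs from the least-significant-first layout in the question.

```c
#include <arpa/inet.h>  /* htonl, ntohl (POSIX) */
#include <stdint.h>
#include <string.h>

/* Convert to network byte order, then copy the four bytes out. */
static void put_be32(uint8_t *out, uint32_t host_value)
{
    uint32_t net = htonl(host_value);
    memcpy(out, &net, sizeof net);
}

/* Copy four bytes in, then convert back to host byte order. */
static uint32_t get_be32(const uint8_t *in)
{
    uint32_t net;
    memcpy(&net, in, sizeof net);
    return ntohl(net);
}
```

Here 0x12345678 is stored as 0x12, 0x34, 0x56, 0x78, most significant byte first.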
It'll work but a memcpy followed by a conditional byteswap will give you much better codegen for the write function.
#include <stdint.h>
#include <string.h>
#define LE (((char*)&(uint_least32_t){1})[0]) // little endian ?
void byteswap(void*,size_t); // void* so that &value and *p convert without a cast
uint32_t s2_read_uint32(uint8_t** p)
{
uint32_t value;
memcpy(&value,*p,sizeof(value));
if(!LE) byteswap(&value,4);
return *p+=4, value;
}
void s2_write_uint32(uint8_t** p, uint32_t value)
{
memcpy(*p,&value,sizeof(value));
if(!LE) byteswap(*p,4);
*p+=4;
}
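The answer only declares byteswap and does not show its body. One possible definition is sketched below; taking void * rather than char * is my assumption, chosen so that both a uint32_t address and a uint8_t pointer can be passed without casts.

```c
#include <stddef.h>

/* Hypothetical byteswap: reverse the n bytes at buf in place. */
void byteswap(void *buf, size_t n)
{
    unsigned char *b = buf;
    for (size_t i = 0; i < n / 2; ++i) {
        unsigned char tmp = b[i];   /* swap byte i with its mirror */
        b[i] = b[n - 1 - i];
        b[n - 1 - i] = tmp;
    }
}
```

Applied to the four bytes 1, 2, 3, 4 it leaves 4, 3, 2, 1.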
GCC since the 8 series (but not clang) can eliminate these shifts on little-endian platforms, but you should help it by restrict-qualifying the doubly-indirect pointer to the destination; otherwise it might assume that a write to (*p)[0] can invalidate *p (uint8_t is a char type and is therefore permitted to alias anything).
void s_write_uint32(uint8_t** restrict p, uint32_t value)
{
(*p)[0] = value & 0xFF;
(*p)[1] = (value >> 8 ) & 0xFF;
(*p)[2] = (value >> 16) & 0xFF;
(*p)[3] = value >> 24;
*p += 4;
}
I am programming an Atmel SAMD20 in C. I came upon an error, that I have now fixed, but I'm not quite sure why it happened in the first place. Can someone point it out to me? (it's probably far too obvious, and I'm going to facepalm later.)
An array of sensors is generating uint16_t data, which I converted to uint8_t to send over I2C. So, this is how I originally wrote it:
for (i = 0; i < SENSBUS1_COUNT; ++i)
{
write_buffer[ (i*2) ] = (uint8_t) sample_sensbus1[i] & 0xff;
write_buffer[(i*2)+1] = (uint8_t) sample_sensbus1[i] >> 8;
}
Here, write_buffer is uint8_t and sample_sensbus1 is uint16_t.
This, for some reason, ends up messing up the most significant byte (in most cases, the most significant byte is just 1 (i.e. 0x100)). This, on the other hand, works fine, and is exactly what it should be:
for (i = 0; i < SENSBUS1_COUNT; ++i)
{
write_buffer[ (i*2) ] = sample_sensbus1[i] & 0xff;
write_buffer[(i*2)+1] = sample_sensbus1[i] >> 8;
}
Clearly, the implicit cast is smarter than I am.
What is going on?
write_buffer[(i*2)+1] = (uint8_t) sample_sensbus1[i] >> 8;
This is equivalent to:
write_buffer[(i*2)+1] = ((uint8_t) sample_sensbus1[i]) >> 8;
As you see, it does the cast before it does the shift. Your most significant byte is now gone.
This should work, though:
write_buffer[(i*2)+1] = (uint8_t) (sample_sensbus1[i] >> 8);
Your cast converts the uint16_t to uint8_t before it does the shift or mask. It is treated as though you wrote:
write_buffer[ (i*2) ] = ((uint8_t)sample_sensbus1[i]) & 0xff;
write_buffer[(i*2)+1] = ((uint8_t)sample_sensbus1[i]) >> 8;
You might need:
write_buffer[ (i*2) ] = (uint8_t)(sample_sensbus1[i] & 0xff);
write_buffer[(i*2)+1] = (uint8_t)(sample_sensbus1[i] >> 8);
In practice, the uncast version is OK too. Remember, a cast tells the compiler "I know more about this than you do; do as I say". That's dangerous if you don't know more than the compiler. Avoid casts whenever you can.
You might also note that shifting (left or right) by the size of the type in bits (or more) is undefined behaviour. However, ((uint8_t)sample_sensbus1[i]) >> 8 is not undefined behaviour, because the integer promotions convert the result of (uint8_t)sample_sensbus1[i] to int before the shift occurs, and an int cannot be only 8 bits wide (it must be at least 16 bits to satisfy the standard), so the shift count is not too big.
This is a question of operator precedence. In the first example, you are first converting to uint8_t and are applying the & and >> operators second. In the second example, those are applied before the implicit conversion takes place.
Casting is a unary prefix operator and as such has very high precedence.
(uint8_t) sample_sensbus1[i] & 0xff
parses as
((uint8_t)sample_sensbus1[i]) & 0xff
In this case & 0xff is redundant. But:
(uint8_t) sample_sensbus1[i] >> 8
parses as
((uint8_t)sample_sensbus1[i]) >> 8
Here the cast truncates the number to 8 bits, then >> 8 shifts everything out.
The problem is in this expression:
(uint8_t) sample_sensbus1[i] >> 8;
It is doing the following sequence:
Converting the sample_sensbus1[i] to uint8_t, effectively truncating it to the 8 least significant bits. This is where you are losing your data.
Converting the above to int as part of the integer promotions, making an int in which only the 8 lower bits can be set.
Shifting the above int right 8 bits, effectively making the whole expression zero.
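The sequence above can be seen side by side in a small sketch. The function names high_byte_wrong and high_byte_right are made up for illustration.

```c
#include <stdint.h>

/* The cast binds tighter than >>, so this truncates to 8 bits first and
 * then shifts those 8 bits away: the result is always 0. */
static uint8_t high_byte_wrong(uint16_t sample)
{
    return (uint8_t) sample >> 8;   /* parsed as ((uint8_t)sample) >> 8 */
}

/* Shift first, then narrow: this keeps the upper byte. */
static uint8_t high_byte_right(uint16_t sample)
{
    return (uint8_t)(sample >> 8);
}
```

For a sample of 0x1234 the first version yields 0 and the second yields 0x12.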
I'm working on the "buddy-allocation" for a memory management project in C (see page 14 of this .pdf).
I'd like to find the "buddy" of a given address, knowing that the two buddies are only one-bit-different (the size of the chunk tells us which bit changes). For example, if one of the two 32-bits buddy chunks has the binary address 0b110010100, the second one will be located at 0b110110100 (the 6th bit from the right changes, as 32=2^(6-1)).
I'd like to implement that in C, without exponentiation algorithms because I'm trying to make my program as fast-executing as possible. At best I'd use a tool to manipulate bits, if that exists. Any hints?
EDIT: the type of the addresses is void*. With the solutions posted below, gcc won't let me compile.
EDIT2: I've tried the answers posted below with the XOR operator, but I can't compile because of the type of the addresses. Here's what I've tried :
void* ptr1 = mmap(NULL, 640000, PROT_READ | PROT_WRITE, MAP_ANONYMOUS | MAP_FILE | MAP_PRIVATE, -1, 0);
printf("%p\n", ptr1);
void* ptr2 = ptr1+0x15f6d44;
printf("%p\n", ptr2);
void* ptr3 = (void*)(ptr2-ptr1);
printf("%p\n", ptr3);
void* ptr4 = ptr3 ^ (1 << 6);
printf("%p\n", ptr4);
and the gcc error :
invalid operands to binary ^ (have ‘void *’ and ‘int’)
It looks like you just want to toggle a given bit, which is achieved using an XOR operation:
buddy_adr = (unsigned long)adr ^ (1 << bit_location);
The cast to unsigned long is required to avoid errors of undefined XOR operation on type void*.
Depending on your compiler settings, you may also get a warning about creating a pointer (i.e., an address) by casting an integer, which is obviously dangerous in the general case (you could pass an invalid address value). To silence this warning, cast the result back to void* to let the compiler know that you know what you are doing:
buddy_adr = (void *)((unsigned long)adr ^ (1UL << bit_location));
Note that in embedded system programming (where I've used this technique most of the time since many peripherals are memory-mapped) you would usually "simplify" this line of code using macros like TOGGLE_BIT(addr, bit) and INT_TO_ADDR(addr).
You can set one bit with a | bitwise or.
adr = adr | 0x10;
A tool? To manipulate bits? You don't need a "tool", that's about as primitive an operation as you can do.
uint32_t address = 0x0194;
address |= 1 << 5; /* This sets the sixth bit. */
If you really want to toggle the bit, i.e. set it if it's clear, but clear it if it's set, you use the bitwise XOR operator:
address ^= 1 << 5;
This is not "exponentiation", it's just a bitwise XOR.
If the address is held in a pointer, either cast or copy it to an integer (uintptr_t) and then copy the result back.
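Putting the XOR and uintptr_t suggestions together, here is a sketch of a buddy lookup. The helper name buddy_of and its parameters are hypothetical. It assumes that size is a power of two and that block lies inside the arena starting at base, and it toggles the bit in the offset from base rather than in the raw address, as the question's own attempt does.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: buddy of `block` within the arena starting at `base`.
 * `size` is the block size in bytes and is assumed to be a power of two. */
static void *buddy_of(void *base, void *block, size_t size)
{
    uintptr_t offset = (uintptr_t)block - (uintptr_t)base;  /* arena offset */
    uintptr_t buddy_offset = offset ^ (uintptr_t)size;      /* flip one bit */
    return (unsigned char *)base + buddy_offset;
}
```

With a block size of 32, offset 0b110010100 (404) maps to 0b110110100 (436) and back again, which is the pair from the question.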
This is a case of bit manipulation, which is very common in C programming.
If you want to flip bit b in xxbxxxxx, simply XOR it with 00100000: XOR toggles the given bit. If you want to make it 1, just OR (|) it with a mask in which all bits are 0 except the bit you want to turn on.
A more compact way to do this (bits are numbered from 1 here):
#define BIT_ON(x,bit) ((x) |= (1 << ((bit)-1)))
#define BIT_TOGGLE(x,bit) ((x) ^= (1 << ((bit)-1)))
#define BIT_OFF(x,bit) ((x) &= ~(1 << ((bit)-1)))
How can I see the bytes/bits of a variable in C? In terms of binary, just zeros and ones.
My problem is that I want to test to see if any zeros exist in the most significant byte of variable x. Any help would be appreciated.
Use the bitwise AND operator &. For example:
char c = ....
if ( (c & 0xFF) == 0xFF) ... // true when c contains no zero bits
You may want to use shifts and macros to automate it, instead of using numeric constants, because for different types you'll need different values to test the MSB. You can get the value for shifts using sizeof.
// test the most significant byte of an int for zeroes (true when it has none)
int i = ...
if ( ( i & (0xFFu << 8*(sizeof(int)-1))) == (0xFFu << 8*(sizeof(int)-1))) ... // 0xFFu avoids shifting into the sign bit
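For a fixed-width type the same check becomes simpler. The sketch below assumes the variable is a uint32_t (the question does not say which type x has) and uses a made-up function name.

```c
#include <stdint.h>

/* Returns 1 if the most significant byte of x contains at least one
 * zero bit, else 0. Assumes a 32-bit unsigned value. */
static int top_byte_has_zero(uint32_t x)
{
    return (x >> 24) != 0xFFu;   /* top byte is all ones only if it is 0xFF */
}
```

Shifting right by 24 brings the top byte down to the low 8 bits, so a single comparison with 0xFF answers the question.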
You can use the following test
var & (1 << N)
to check whether bit N is set in var. Which N is the most significant bit depends on the data type of var.
Print the memory byte by byte, i.e. from 0 to sizeof(x) (if x happens to be your variable). Then, when printing each byte, print all eight bits individually.
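A minimal sketch of that idea, writing the bits into a string instead of printing them directly so the result is easy to inspect. The name bits_to_string is hypothetical, and the code assumes 8-bit bytes.

```c
#include <stddef.h>

/* Write the n bytes at obj as '0'/'1' characters into out, in memory
 * order, most significant bit of each byte first. out must have room
 * for n * 8 + 1 characters. Assumes 8-bit bytes. */
static void bits_to_string(const void *obj, size_t n, char *out)
{
    const unsigned char *bytes = obj;
    size_t k = 0;
    for (size_t i = 0; i < n; ++i) {
        for (int bit = 7; bit >= 0; --bit) {
            out[k++] = ((bytes[i] >> bit) & 1u) ? '1' : '0';
        }
    }
    out[k] = '\0';
}
```

For a single byte holding 0xA5 the string is "10100101". For multi-byte variables the byte order in the output follows the machine's memory layout, so the most significant byte may come first or last.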
if(x & 0x80) // assuming x is a byte(char type)
{
// msb is set
}