Changing one given bit in a binary address in C

I'm working on the "buddy-allocation" for a memory management project in C (see page 14 of this .pdf).
I'd like to find the "buddy" of a given address, knowing that the two buddies differ by only one bit (the size of the chunk tells us which bit changes). For example, if one of the two size-32 buddy chunks has the binary address 0b110010100, the second one will be located at 0b110110100 (the 6th bit from the right changes, since 32 = 2^(6-1)).
I'd like to implement that in C without exponentiation routines, because I'm trying to make my program execute as fast as possible. Ideally I'd use some built-in way to manipulate bits, if one exists. Any hints?
EDIT: the type of the addresses is void*. With the solutions posted below, gcc won't let me compile.
EDIT2: I've tried the answers posted below with the XOR operator, but I can't compile because of the type of the addresses. Here's what I've tried:
void* ptr1 = mmap(NULL, 640000, PROT_READ | PROT_WRITE, MAP_ANONYMOUS | MAP_FILE | MAP_PRIVATE, -1, 0);
printf("%p\n", ptr1);
void* ptr2 = ptr1+0x15f6d44;
printf("%p\n", ptr2);
void* ptr3 = (void*)(ptr2-ptr1);
printf("%p\n", ptr3);
void* ptr4 = ptr3 ^ (1 << 6);
printf("%p\n", ptr4);
and the gcc error:
invalid operands to binary ^ (have ‘void *’ and ‘int’)

It looks like you just want to toggle a given bit, which is achieved using an XOR operation:
buddy_adr = (unsigned long)adr ^ (1 << bit_location);
The cast to unsigned long is required because the XOR operator is not defined for operands of type void *.
Depending on your compiler settings, you may also get a warning about creating a pointer (i.e., an address) by casting an integer, which is obviously dangerous in the general case (you could produce an invalid address value). To silence this warning, cast the result back to void * to let the compiler know that you know what you are doing:
buddy_adr = (void *)((unsigned long)adr ^ (1 << bit_location));
Note that in embedded system programming (where I've used this technique most of the time since many peripherals are memory-mapped) you would usually "simplify" this line of code using macros like TOGGLE_BIT(addr, bit) and INT_TO_ADDR(addr).
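For illustration, a minimal sketch of what such macros might look like (the names below are hypothetical, not from any particular vendor header, and they assume the address fits in uintptr_t from <stdint.h>):
#include <stdint.h>

/* Hypothetical helpers; bit is the 0-based index of the bit to toggle. */
#define ADDR_TO_INT(a)      ((uintptr_t)(a))
#define INT_TO_ADDR(i)      ((void *)(uintptr_t)(i))
#define TOGGLE_BIT(i, bit)  ((i) ^ ((uintptr_t)1 << (bit)))

/* Usage: void *buddy_adr = INT_TO_ADDR(TOGGLE_BIT(ADDR_TO_INT(adr), bit_location)); */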

You can set one bit with a bitwise OR (|).
adr = adr | 0x10;   /* sets bit 4 (mask 0x10) */

A tool? To manipulate bits? You don't need a "tool", that's about as primitive an operation as you can do.
uint32_t address = 0x0194;
address |= 1 << 5; /* This sets the sixth bit. */
If you really want to toggle the bit, i.e. set it if it's clear, but clear it if it's set, you use the bitwise XOR operator:
address ^= 1 << 5;
This is not "exponentiation", it's just a bitwise XOR.
If the address is held in a pointer variable, either cast or copy it to an integer (uintptr_t), manipulate the bits, and then convert back.
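Applied to the void * addresses from the question, a minimal sketch of that round trip might look like this (assuming bit_location is the 0-based index of the bit selected by the chunk size):
#include <stdint.h>

void *buddy_of(void *adr, unsigned bit_location)
{
    uintptr_t bits = (uintptr_t)adr;        /* copy the address into an integer */
    bits ^= (uintptr_t)1 << bit_location;   /* toggle the bit that differs      */
    return (void *)bits;                    /* convert back to a pointer        */
}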

This is a case of bit manipulation, which is very common in C programming.
If you want to change xxbxxxxx, simply XOR it with xx1xxxxx; XOR toggles the given bit. If you want to force the bit to 1, use OR (|) with a mask that has all bits 0 except the one you want to turn on.

A more compact way to do this (note that bit is 1-based in these macros):
#define BIT_ON(x, bit)     ((x) |=  (1u << ((bit) - 1)))
#define BIT_TOGGLE(x, bit) ((x) ^=  (1u << ((bit) - 1)))
#define BIT_OFF(x, bit)    ((x) &= ~(1u << ((bit) - 1)))
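A quick usage sketch (on an unsigned integer rather than a pointer, since these macros need an integer lvalue):
unsigned x = 0u;
BIT_ON(x, 6);       /* x == 0x20: bit 6 (1-based) is now set    */
BIT_TOGGLE(x, 6);   /* x == 0x00: the same bit toggled back off */
BIT_OFF(x, 1);      /* x == 0x00: clearing an already-clear bit */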

Related

What does this code do? There are so many weird things.

int n_b ( char *addr , int i ) {
char char_in_chain = addr [ i / 8 ] ;
return char_in_chain >> i%8 & 0x1;
}
Like what is that: "i%8 & 0x1"?
Edit: Note that 0x1 is the hexadecimal notation for 1. Also note that:
0x1 = 0x01 = 0x000001 = 0x0...01
i%8 means i modulo 8, i.e. the remainder in the Euclidean division of i by 8.
& 0x1 is a bitwise AND. Conceptually it works on the binary representations of its operands, bit by bit (the numbers are of course already stored in binary; thinking of them that way just makes the operation easier to follow).
Example: 0b1101 & 0b1001 = 0b1001
Note that any number & 0x1 is either 0 or 1.
Example: 0b11111111 & 0b00000001 is 0b1, and 0b11111110 & 0b00000001 is 0b0.
Essentially, it tests the last bit of the number, which is the bit that determines parity.
Final edit:
I got the precedence wrong, thanks to the comments for pointing it out. Here is the real precedence.
First, we compute i%8.
The result could be 0, 1, 2, 3, 4, 5, 6, 7.
Then, we right-shift the char by that result, which is at most 7. That means the (i % 8)th bit is now the least significant bit.
Then, we check whether that original (i % 8)th bit is set (equals one) or not. If it is, return 1. Else, return 0.
This function returns the value of a specific bit in a char array as the integer 0 or 1.
addr is the pointer to the first char.
i is the index to the bit. 8 bits are commonly stored in a char.
First, the char at the correct offset is fetched:
char char_in_chain = addr [ i / 8 ] ;
i / 8 divides i by 8, ignoring the remainder. For example, any value in the range from 24 to 31 gives 3 as the result.
This result is used as the index to the char in the array.
Next and finally, the bit is obtained and returned:
return char_in_chain >> i%8 & 0x1;
Let's just look at the expression char_in_chain >> i%8 & 0x1.
It is confusing, because it does not show which operation is done in what sequence. Therefore, I duplicate it with appropriate parentheses: (char_in_chain >> (i % 8)) & 0x1. The rules (operation precedence) are given by the C standard.
First, the remainder of the division of i by 8 is calculated. This is used to right-shift the obtained char_in_chain. Now the interesting bit is in the least significant bit. Finally, this bit is "masked" with the binary AND operator and the second operand 0x1. BTW, there is no need to mark this constant as hex.
Example:
The array contains the bytes 0x5A, 0x23, and 0x42. The index of the bit to retrieve is 13.
i as given as argument is 13.
i / 8 gives 13 / 8 = 1, remainder ignored.
addr[1] returns 0x23, which is stored in char_in_chain.
i % 8 gives 5 (13 / 8 = 1, remainder 5).
0x23 is binary 0b00100011, and right-shifted by 5 gives 0b00000001.
0b00000001 ANDed with 0b00000001 gives 0b00000001.
The value returned is 1.
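For completeness, here is a tiny self-contained sketch that reproduces this walk-through with the original function (the array contents and bit index are the ones from the example):
#include <stdio.h>

int n_b(char *addr, int i) {
    char char_in_chain = addr[i / 8];
    return char_in_chain >> i % 8 & 0x1;
}

int main(void) {
    char data[] = { 0x5A, 0x23, 0x42 };
    printf("%d\n", n_b(data, 13));   /* prints 1: bit 5 of data[1] == 0x23 is set */
    return 0;
}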
Note: if anything else is unclear, feel free to comment.
What the various operators do is explained by any C book, so I won't address that here. To instead analyse the code step by step...
The function and types used:
int as return type is an indication of the programmer being inexperienced at writing hardware-related code. We should always avoid signed types for such purposes. An experienced programmer would have used an unsigned type, like for example uint8_t. (Or in this specific case maybe even bool, depending on what the data is supposed to represent.)
n_b is a rubbish name, we should obviously never give an identifier such a nondescript name. get_bit or similar would have been a better name.
char* is, again, an indication of the programmer being inexperienced. char is particularly problematic when dealing with raw data, since we can't even know whether it is signed or unsigned; that depends on the compiler used. Had the raw data contained a value of 0x80 or larger and had char been signed, we would have gotten a negative value. Right-shifting a negative value is also problematic, since that behavior too is compiler-specific.
char* is proof of the programmer lacking the fundamental knowledge of const correctness. The function does not modify this parameter so it should have been const qualified. Good code would use const uint8_t* addr.
int i is not really incorrect, the signedness doesn't really matter. But good programming practice would have used an unsigned type or even size_t.
With types unsloppified and corrected, the function might look like this:
#include <stdint.h>
uint8_t get_bit (const uint8_t* addr, size_t i ) {
uint8_t char_in_chain = addr [ i / 8 ] ;
return char_in_chain >> i%8 & 0x1;
}
This is still somewhat problematic, because the average C programmer might not remember the precedence of >> vs % vs & off the top of their head. It happens to be % over >> over &, but let's make the code a bit more readable still by making the precedence explicit: (char_in_chain >> (i%8)) & 0x1.
Then I would question if the local variable really adds anything to readability. Not really, we might as well write:
uint8_t get_bit (const uint8_t* addr, size_t i ) {
return ((addr[i/8]) >> (i%8)) & 0x1;
}
As for what this code actually does: this happens to be a common design pattern for how to access a specific bit in a raw bit-field.
Any bit-field in C may be accessed as an array of bytes.
Bit number n in that bit-field, will be found at byte n/8.
Inside that byte, the bit will be located at n%8.
Bit masking in C is most readably done as data & (1u << bit). The somewhat equivalent but less readable form (data >> bit) & 1u shifts the masked bit into the LSB instead.
For example, let's assume we have 64 bits of raw data. Bits are always enumerated from 0 to 63 and bytes (just like any C array) from index 0. We want to access bit 33. Then 33/8 (integer division) = 4.
So byte[4]. Within that byte, bit 33 will be found at position 33%8 = 1. So we can obtain the value of bit 33 with ordinary bit masking: byte[33/8] & (1u << (33%8)). Or similarly, (byte[33/8] >> (33%8)) & 1u.
An alternative, more readable version of it all:
bool is_bit_set (const uint8_t* data, size_t bit)
{
uint8_t byte = data [bit / 8u];
size_t mask = 1u << (bit % 8u);
return (byte & mask) != 0u;
}
(Strictly speaking we could as well do return byte & mask; since a boolean type is used, but it doesn't hurt to be explicit.)
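A small usage sketch of the is_bit_set function above, reusing the bit-33 example (the data values are made up for illustration):
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

int main(void) {
    uint8_t raw[8] = {0};
    raw[4] = 0x02;                          /* bit 33 lives in byte 4, bit position 1 */
    printf("%d\n", is_bit_set(raw, 33));    /* prints 1 */
    printf("%d\n", is_bit_set(raw, 32));    /* prints 0 */
    return 0;
}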

ARM Cortex M4 - C Programming and Memory Access Optimization

The following three lines of code are optimized ways to modify bits with a single MOV instruction instead of a less interrupt-safe read-modify-write sequence. They are identical to each other, and set the LED_RED bit in the GPIO port's data register:
*((volatile unsigned long*)(0x40025000 + (LED_RED << 2))) = LED_RED;
*(GPIO_PORTF_DATA_BITS_R + LED_RED) = LED_RED;
GPIO_PORTF_DATA_BITS_R[LED_RED] = LED_RED;
LED_RED is simply (volatile unsigned long) 0x02. In the memory map of this microcontroller, the first 2 LSBs of this register (and others) are unused, so the left shift in the first example makes sense.
The definition for GPIO_PORTF_DATA_BITS_R is:
#define GPIO_PORTF_DATA_BITS_R ((volatile unsigned long *)0x40025000)
My question is: How come I do not need to shift left by 2 when using pointer arithmetic or array indexing (2nd method and 3rd method, respectively)? I'm having a hard time understanding. Thank you in advance.
Remember how C pointer arithmetic works: adding an offset to a pointer operates in units of the type pointed to. Since GPIO_PORTF_DATA_BITS_R has type unsigned long *, and sizeof(unsigned long) == 4, GPIO_PORTF_DATA_BITS_R + LED_RED effectively adds 2 * 4 = 8 bytes.
Note that (1) does arithmetic on 0x40025000, which is an integer, not a pointer, so we need to add 8 to get the same result. Left shift by 2 is the same as multiplication by 4, so LED_RED << 2 again equals 8.
(3) is exactly equivalent to (2) by definition of the [] operator.
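Here is a small host-side sketch of that scaling rule, using uint32_t as a stand-in for the 4-byte unsigned long on the Cortex-M (the array is just a fake register bank for illustration):
#include <stdint.h>
#include <stdio.h>

int main(void) {
    uint32_t fake_regs[4] = {0};    /* stand-in for GPIO_PORTF_DATA_BITS_R */
    uint32_t *base = fake_regs;
    unsigned led_red = 0x02;

    /* Pointer arithmetic scales by sizeof(*base): base + 2 is 2 * 4 = 8 bytes ahead. */
    printf("%d\n", (int)((char *)(base + led_red) - (char *)base));   /* prints 8 */

    /* Integer arithmetic does not scale, which is why the raw-address
       form needs the explicit LED_RED << 2 (i.e. multiply by 4). */
    return 0;
}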

Demystifying LOWORD macro pseudocode into valid C

From IDA decompilation of a subroutine into C pseudocode, I encountered this line of code:
LOWORD(v9) = *(BYTE *)v6;
Where v9, v6 are 32-bit integers.
LOWORD is a macro defined in windef.h as follows:
#define LOWORD(l) ((WORD)(((DWORD_PTR)(l)) & 0xffff))
Obviously it's a macro, so I can't assign things to it. I am interested in an implementation which would be valid C, but I am unsure of the decompiler's intent here. From my guess, the intent is to dereference the LOWORD of v9, and assign the byte value pointed by v6, but I would like to make sure.
In the event that it may be necessary, here is the context of the pseudocode. All types are 32-bit integers:
if (*(BYTE *)v6) {
LOWORD(v9) = *(BYTE *)v6;
do {
v7 = v7 & 0xFFFF0000 | (unsigned __int16)(v7 + v9);
v9 = *++v8;
if (*v8) {
v7 = (unsigned __int16)v7 | ((*v8++ + (v7 >> 16)) << 16);
v9 = *v8;
}
} while (v9);
}
The windef.h macro you show is not the one being used by IDA, since as you noticed you cannot use the result of such a macro as an lvalue.
From my guess, the intent is to dereference the LOWORD of v9, and assign the byte value pointed by v6
Not quite. The intent is to replace the low word (that is, the lower 2 bytes) of the variable v9 (which is an int) with the right-hand side.
So in IDA, this:
LOWORD(a) = b;
Can be seen as this:
a = (a & 0xffff0000) | b;
The way to write such a macro in C would be something like:
#define LOWORD(x) (*((uint16_t*)&(x)))
This is a possibly more convoluted, yet more versatile, way of doing the same thing; it can be used both as an lvalue and as an rvalue.
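A small usage sketch of that macro (note that it type-puns through a pointer and assumes a little-endian target, which a Windows/x86 binary like the one being decompiled would be):
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

#define LOWORD(x) (*((uint16_t*)&(x)))

int main(void) {
    uint32_t v9 = 0xAABBCCDD;
    LOWORD(v9) = 0x1122;                    /* overwrite only the lower 2 bytes   */
    printf("0x%08" PRIX32 "\n", v9);        /* prints 0xAABB1122 on little-endian */
    return 0;
}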

If Conditional Containing an AND of pointer and integer

I'm looking at some C code that contains this statement.
if (
((uint8_t *)row)[byte] & (1 << (8-bit))
)
value |= (value + 1);
What would be the meaning and purpose of putting the AND of a pointer and an integer inside the conditional parentheses?
An AND of a pointer and an integer would have a meaning in other contexts, but that's not what's happening here.
It's casting row (which I assume is a pointer of some sort) to a uint8_t *, and then picking out the byte-th uint8_t in that array. That is then bitwise-anded with the shifted-left stuff.
It's logically the same as:
uint8_t shifted = (1 << (8 - bit));
uint8_t *rowptr = (uint8_t *)row;
uint8_t rowval = rowptr[byte];
uint8_t combined = (rowval & shifted);
if (combined) // or, if (combined != 0)
value |= (value + 1);
That isn't what it's doing.
(uint8_t *)row
cast row to pointer-to-unsigned-byte
((uint8_t *)row)[byte]
... and apply array addressing to retrieve the unsigned byte byte bytes forward from there. (Array addressing and pointer math are somewhat interchangeable; pointerval[intval] means the same thing as *(pointerval + intval).)
So that means
((uint8_t *)row)[byte] & (1 << (8-bit))
retrieves the byte-th unsigned byte from the row, and masks out everything but the single bit selected by 1 << (8-bit).
Finally, putting it all together,
if ( ((uint8_t *)row)[byte] & (1 << (8-bit)) )
tests whether the result of the expression is true (nonzero).
So this is asking whether a particular bit of a particular byte in the row is nonzero.
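A tiny concrete sketch of the same test with made-up values, where byte = 1 and bit = 3, so the mask is 1 << (8-3) = 0x20:
#include <stdint.h>
#include <stdio.h>

int main(void) {
    uint8_t raw[2] = { 0x00, 0x24 };   /* 0x24 = 0b00100100: bit 5 is set */
    void *row = raw;
    int byte = 1, bit = 3;

    if (((uint8_t *)row)[byte] & (1 << (8 - bit)))
        puts("bit is set");            /* taken, since 0x24 & 0x20 is nonzero */
    return 0;
}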
I believe in this case it is for checking whether a specific bit is on.
It's testing whether bit 8-bit (for example, bit 7 when bit is 1) of row[byte] is set or not. The & binary operator is the bitwise AND operator, not the logical AND operator. 1<<(8-bit) is an expression commonly used to generate a bit mask that isolates a single bit.
row may be a generic pointer, so (uint8_t *)row is used to cast this pointer to a pointer to an array of bytes.
This isn't an AND of the pointer. You have a pointer, and the byte at offset [byte] from that starting location is what is being ANDed.

Casting troubles when using bit-banding macros with a pre-cast address on Cortex-M3

TL;DR:
Why isn't (unsigned long)(0x400253FC) equivalent to (unsigned long)((*((volatile unsigned long *)0x400253FC)))?
How can I make a macro which works with the former work with the latter?
Background Information
Environment
I'm working with an ARM Cortex-M3 processor, the LM3S6965 by TI, with their StellarisWare (free download, export controlled) definitions. I'm using gcc version 4.6.1 (Sourcery CodeBench Lite 2011.09-69). Stellaris provides definitions for some 5,000 registers and memory addresses in "inc/lm3s6965.h", and I really don't want to redo all of those. However, they seem to be incompatible with a macro I want to write.
Bit Banding
On the ARM Cortex-M3, a portion of memory is aliased with one 32-bit word per bit of the peripheral and RAM memory space. Setting the memory at address 0x42000000 to 0x00000001 will set the first bit of the memory at address 0x40000000 to 1, but not affect the rest of the word. To change bit 2, change the word at 0x42000004 to 1. That's a neat feature, and extremely useful. According to the ARM Technical Reference Manual, the algorithm to compute the address is:
bit_word_offset = (byte_offset × 32) + (bit_number × 4)
bit_word_addr = bit_band_base + bit_word_offset
where:
bit_word_offset is the position of the target bit in the bit-band memory region.
bit_word_addr is the address of the word in the alias memory region that maps to the
targeted bit.
bit_band_base is the starting address of the alias region.
byte_offset is the number of the byte in the bit-band region that contains the targeted bit.
bit_number is the bit position, 0 to 7, of the targeted bit
Implementation of Bit Banding
The "inc/hw_types.h" file includes the following macro which implements this algorithm. To be clear, it implements it for a word-based model which accepts 4-byte-aligned words and 0-31-bit offsets, but the resulting address is equivalent:
#define HWREGBITW(x, b) \
HWREG(((unsigned long)(x) & 0xF0000000) | 0x02000000 | \
(((unsigned long)(x) & 0x000FFFFF) << 5) | ((b) << 2))
This algorithm takes the base (which is either in SRAM at 0x20000000 or in the peripheral memory space at 0x40000000) and ORs it with 0x02000000, adding the bit-band base offset. Then it multiplies the offset from the base by 32 (equivalent to a five-position left shift) and adds four times the bit number.
The referenced HWREG simply performs the requisite cast for writing to a given location in memory:
#define HWREG(x) \
(*((volatile unsigned long *)(x)))
This works quite nicely with assignments like
HWREGBITW(0x400253FC, 0) = 1;
where 0x400253FC is a magic number for a memory-mapped peripheral and I want to set bit 0 of this peripheral to 1. The above code computes (at compile-time, of course) the bit offset and sets that word to 1.
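For illustration, here is how the macro's arithmetic works out for this particular call (the resulting alias address is my own hand calculation, not from the question):
(0x400253FC & 0xF0000000)        = 0x40000000    -> peripheral region
              OR 0x02000000      = 0x42000000    -> bit-band alias base
(0x400253FC & 0x000FFFFF) << 5   = 0x004A7F80    -> byte offset times 32
(0 << 2)                         = 0x00000000    -> bit number times 4
all ORed together                = 0x424A7F80
So the assignment writes 1 to the 32-bit word at 0x424A7F80, which the hardware maps onto bit 0 of the register at 0x400253FC.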
What doesn't work
Unfortunately, the aforementioned definitions in "inc/lm3s6965.h" already perform the cast done by HWREG. I want to avoid magic numbers and instead use the provided definitions like
#define GPIO_PORTF_DATA_R (*((volatile unsigned long *)0x400253FC))
An attempt to paste this into HWREGBITW causes the macro to no longer work, as the cast interferes:
HWREGBITW(GPIO_PORTF_DATA_R, 0) = 1;
The preprocessor generates the following mess (indentation added):
(*((volatile unsigned long *)
((((unsigned long)((*((volatile unsigned long *)0x400253FC)))) & 0xF0000000)
| 0x02000000 |
((((unsigned long)((*((volatile unsigned long *)0x400253FC)))) & 0x000FFFFF) << 5)
| ((0) << 2))
)) = 1;
Note the two instances of
(((unsigned long)((*((volatile unsigned long *)0x400253FC)))))
I believe that these extra casts are what is causing my process to fail. The following result of preprocessing HWREGBITW(0x400253FC, 0) = 1; does work, supporting my assertion:
(*((volatile unsigned long *)
((((unsigned long)(0x400253FC)) & 0xF0000000)
| 0x02000000 |
((((unsigned long)(0x400253FC)) & 0x000FFFFF) << 5)
| ((0) << 2))
)) = 1;
The (type) cast operator has right-to-left precedence, so the last cast should apply and an unsigned long used for the bitwise arithmetic (which should then work correctly). There's nothing implicit anywhere, no float to pointer conversions, no precision/range changes...the left-most cast should simply nullify the casts to the right.
My question (finally...)
Why isn't (unsigned long)(0x400253FC) equivalent to (unsigned long)((*((volatile unsigned long *)0x400253FC)))?
How can I make the existing HWREGBITW macro work? Or, how can a macro be written to do the same task but not fail when given an argument with a pre-existing cast?
1- Why isn't (unsigned long)(0x400253FC) equivalent to (unsigned long)((*((volatile unsigned long *)0x400253FC)))?
The former is an integer literal whose value is 0x400253FCul, while the latter is the unsigned long value stored at the (memory or GPIO) address 0x400253FC.
2- How can I make the existing HWREGBITW macro work? Or, how can a macro be written to do the same task but not fail when given an argument with a pre-existing cast?
Use HWREGBITW(&GPIO_PORTF_DATA_R, 0) = 1; instead.
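A short sketch of why the & fixes it (the expansion shown is my own reading of the macros above):
/* GPIO_PORTF_DATA_R is defined as (*((volatile unsigned long *)0x400253FC)),
 * so taking its address undoes the built-in dereference:
 *   &GPIO_PORTF_DATA_R  ==  &(*((volatile unsigned long *)0x400253FC))
 *                       ==  (volatile unsigned long *)0x400253FC
 * The macro's (unsigned long)(x) cast then operates on the raw address again,
 * instead of on the value read from the register:
 */
HWREGBITW(&GPIO_PORTF_DATA_R, 0) = 1;   /* same expansion as HWREGBITW(0x400253FC, 0) = 1; */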
