It has been a long time since I wrote any C code. Does anyone know how to translate this piece of code to Delphi 2010?
char * pAlignedBuf = (char *) ((int(buf) + 7) & ~7);
where buf is char * buf.
I know that char * is PChar, but I don't know what & and ~7 are.
& is the bitwise and operator.
~ is the bitwise unary not operator.
~7 is a number with the lower 3 bits set to 0 and all other bits set to 1.
& ~7 makes all the lower 3 bits 0 for whatever is on the left side.
The (char *) at the right of the assignment is a hard cast to char *
int(buf) is a hard cast of buf to integer.
That code can be written, in pascal, like this:
var pAlignedBuf: PChar;
pAlignedBuf := PChar((integer(Buf) + 7) and (not 7))
And it's a way to obtain an 8-byte-aligned address from whatever Buf is. It works by adding 7 to Buf and then clearing the lower 3 bits, which rounds the address up to the next multiple of 8.
Edit
To be on the safe side, since Delphi 64 bit is somewhat around the corner, that code can be expressed as:
var pAlignedBuf: PChar;
pAlignedBuf := PChar((NativeUInt(Buf) + 7) and (not 7));
And for those that don't like bitwise logic-fu, it can be again re-written as:
var pAlignedBuf: PChar;
pAlignedBuf := PChar(((NativeUInt(Buf) + 7) div 8) * 8);
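For comparison, here is a minimal C sketch of the same two forms, working on the address as a plain integer. It assumes uintptr_t from <stdint.h> is available; the names align8_mask and align8_div are made up for this illustration and do not come from the question.

```c
#include <stdint.h>

/* Round an address-sized integer up to the next multiple of 8 by masking. */
uintptr_t align8_mask(uintptr_t addr)
{
    return (addr + 7u) & ~(uintptr_t)7;
}

/* The same rounding expressed with integer division and multiplication. */
uintptr_t align8_div(uintptr_t addr)
{
    return ((addr + 7u) / 8u) * 8u;
}
```

Both functions should return the same value for every input that does not overflow when 7 is added.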
& is the binary "bitwise and" operator, which is written "and" in Delphi. ~ is the unary "bitwise not" operator, which is written "not" in Delphi.
The translation is therefore
var
PAlignedBuf: PChar;
begin
pAlignedBuf := PChar((cardinal(buf) + 7) and not 7);
(Well, strictly speaking, the literal translation is integer(buf), not cardinal(buf), but I think cardinal is better. But I am not 100 % sure since I don't know the actual case.)
& is the bitwise-and operation. Example: 0b0011 & 0b0110 == 0b0010.
~ is the bitwise-negation operation. Example: ~0b0111 == 0b1000 (assuming 4-bit numbers).
Assuming all the casts are legal, the statement
char * pAlignedBuf = (char *) ((int(buf) + 7) & ~7);
puts in pAlignedBuf the address held in buf, rounded up to the next multiple of 8 (last 3 bits set to 0).
`buf` `pAlignedBuf`
0x...420 0x...420
0x...421 0x...428
0x...422 0x...428
...
0x...427 0x...428
0x...428 0x...428
...
0x...429 0x...430
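The table can be reproduced on a real pointer with a small sketch like the one below. It assumes uintptr_t is available and that converting a pointer to an integer gives a usable address, which is implementation-defined but holds on common flat-memory platforms. The helper name align_up_8 is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: returns the first 8-byte-aligned address at or after buf. */
char *align_up_8(char *buf)
{
    uintptr_t addr = (uintptr_t)buf;                    /* address as an integer */
    uintptr_t aligned = (addr + 7u) & ~(uintptr_t)7;    /* round up to a multiple of 8 */
    return buf + (aligned - addr);                      /* advance buf by 0..7 bytes */
}
```

The caller has to allocate at least 7 extra bytes, otherwise the aligned pointer may leave too little room at the end of the buffer.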
Related
int n_b ( char *addr , int i ) {
char char_in_chain = addr [ i / 8 ] ;
return char_in_chain >> i%8 & 0x1;
}
Like, what does "i%8 & 0x1" mean?
Edit: Note that 0x1 is the hexadecimal notation for 1. Also note that :
0x1 = 0x01 = 0x000001 = 0x0...01
i%8 means i modulo 8, i.e. the remainder of the Euclidean division of i by 8.
& 0x1 is a bitwise AND: it treats both operands as binary numbers and combines them bit by bit. (The number is already stored in binary; this is just so you understand.)
Example: 0b1101 & 0b1001 = 0b1001
Note that any number & 0x1 is either 0 or 1.
Example: 0b11111111 & 0b00000001 is 0b1 and 0b11111110 & 0b00000001 is 0b0
Essentially, it is testing the last bit of the number, which is the bit that determines parity.
Final edit:
I got the precedence wrong, thanks to the comments for pointing it out. Here is the real precedence.
First, we compute i%8.
The result could be 0, 1, 2, 3, 4, 5, 6, 7.
Then, we shift the char by the result, which is maximum 7. That means the i % 8 th bit is now the least significant bit.
Then, we check if the original i % 8 bit is set (equals one) or not. If it is, return 1. Else, return 0.
This function returns the value of a specific bit in a char array as the integer 0 or 1.
addr is the pointer to the first char.
i is the index to the bit. 8 bits are commonly stored in a char.
First, the char at the correct offset is fetched:
char char_in_chain = addr [ i / 8 ] ;
i / 8 divides i by 8, ignoring the remainder. For example, any value in the range from 24 to 31 gives 3 as the result.
This result is used as the index to the char in the array.
Next and finally, the bit is obtained and returned:
return char_in_chain >> i%8 & 0x1;
Let's just look at the expression char_in_chain >> i%8 & 0x1.
It is confusing, because it does not show which operation is done in what sequence. Therefore, I duplicate it with appropriate parentheses: (char_in_chain >> (i % 8)) & 0x1. The rules (operation precedence) are given by the C standard.
First, the remainder of the division of i by 8 is calculated. This is used to right-shift the obtained char_in_chain. Now the interesting bit is in the least significant bit. Finally, this bit is "masked" with the binary AND operator and the second operand 0x1. BTW, there is no need to mark this constant as hex.
Example:
The array contains the bytes 0x5A, 0x23, and 0x42. The index of the bit to retrieve is 13.
i as given as argument is 13.
i / 8 gives 13 / 8 = 1, remainder ignored.
addr[1] returns 0x23, which is stored in char_in_chain.
i % 8 gives 5 (13 / 8 = 1, remainder 5).
0x23 is binary 0b00100011, and right-shifted by 5 gives 0b00000001.
0b00000001 ANDed with 0b00000001 gives 0b00000001.
The value returned is 1.
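The worked example can be checked with the function from the question, written here with the explicit parentheses discussed above. This is only a sketch for experimenting; it keeps the original signature and types.

```c
/* The function from the question with the precedence made explicit. */
int n_b(char *addr, int i)
{
    char char_in_chain = addr[i / 8];          /* byte that holds bit i */
    return (char_in_chain >> (i % 8)) & 0x1;   /* move the bit to the LSB, then mask it */
}
```

Calling n_b on an array holding 0x5A, 0x23 and 0x42 with i = 13 should give 1, matching the steps listed above.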
Note: If more is not clear, feel free to comment.
What the various operators do is explained by any C book, so I won't address that here. To instead analyse the code step by step...
The function and types used:
int as return type is an indication of the programmer being inexperienced at writing hardware-related code. We should always avoid signed types for such purposes. An experienced programmer would have used an unsigned type, like for example uint8_t. (Or in this specific case maybe even bool, depending on what the data is supposed to represent.)
n_b is a rubbish name, we should obviously never give an identifier such a nondescript name. get_bit or similar would have been a better name.
char* is, again, an indication of the programmer being inexperienced. char is particularly problematic when dealing with raw data, since we can't even know if it is signed or unsigned; it depends on which compiler is used. Had the raw data contained a value of 0x80 or larger and char was signed, we would have gotten a negative value. And then right shifting a negative value is also problematic, since that behavior too is compiler-specific.
char* is proof of the programmer lacking the fundamental knowledge of const correctness. The function does not modify this parameter so it should have been const qualified. Good code would use const uint8_t* addr.
int i is not really incorrect, the signedness doesn't really matter. But good programming practice would have used an unsigned type or even size_t.
With types unsloppified and corrected, the function might look like this:
#include <stddef.h>
#include <stdint.h>
uint8_t get_bit (const uint8_t* addr, size_t i ) {
uint8_t char_in_chain = addr [ i / 8 ] ;
return char_in_chain >> i%8 & 0x1;
}
This is still somewhat problematic, because the average C programmer might not remember the precedence of >> vs % vs & off the top of their head. It happens to be % over >> over &, but let's make the code more readable still by making precedence explicit: (char_in_chain >> (i%8)) & 0x1.
Then I would question if the local variable really adds anything to readability. Not really, we might as well write:
uint8_t get_bit (const uint8_t* addr, size_t i ) {
return ((addr[i/8]) >> (i%8)) & 0x1;
}
As for what this code actually does: this happens to be a common design pattern for how to access a specific bit in a raw bit-field.
Any bit-field in C may be accessed as an array of bytes.
Bit number n in that bit-field, will be found at byte n/8.
Inside that byte, the bit will be located at n%8.
Bit masking in C is most readably done as data & (1u << bit). Which can be obfuscated as somewhat equivalent but less readable (data >> bit) & 1u, where the masked bit ends up in the LSB.
For example, let's assume we have 64 bits of raw data. Bits are always enumerated from 0 to 63 and bytes (just like any C array) from index 0. We want to access bit 33. Then 33/8 with integer division = 4.
So byte[4]. Bit 33 will be found at 33%8 = 1. So we can obtain the value of bit 33 from ordinary bit masking byte[33/8] & (1u << (33%8)). Or similarly, (byte[33/8] >> (33%8)) & 1u
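To see how the two masking forms differ in their result, here is a small sketch. The function names test_bit_mask and test_bit_shift are invented for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Form 1: build a mask for the bit and test it; the result is 0 or the mask value. */
unsigned test_bit_mask(const uint8_t *byte, size_t bit)
{
    return byte[bit / 8u] & (1u << (bit % 8u));
}

/* Form 2: shift the bit down to the LSB; the result is always 0 or 1. */
unsigned test_bit_shift(const uint8_t *byte, size_t bit)
{
    return (byte[bit / 8u] >> (bit % 8u)) & 1u;
}
```

With bit 33 set in 64 bits of data, the first form yields 2 (the mask for bit 1 of byte 4) and the second yields 1. Both are nonzero, so they agree when used as a condition.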
An alternative, more readable version of it all:
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
bool is_bit_set (const uint8_t* data, size_t bit)
{
uint8_t byte = data [bit / 8u];
size_t mask = 1u << (bit % 8u);
return (byte & mask) != 0u;
}
(Strictly speaking we could as well do return byte & mask; since a boolean type is used, but it doesn't hurt to be explicit.)
I recently had to use the sbrk() function in C
and I had to calculate the size that I'll use to allocate space in memory.
After some research I found this line of code:
size_t calc_size = ((size) + ((4096) - 1)) & ~((4096) - 1);
Despite my searches to understand what the operators "~" and "&" mean, I have an average level in C and I could not find clear explanations, especially for the ~ operator. Could you help me understand what operation is being performed?
Despite my searches to understand what the operators "~" and "&" mean,
Those are bitwise NOT and bitwise AND operators, respectively. They differ from logical NOT (!) and logical AND (&&) in that they work on each individual bit (hence the name). There are also bitwise OR (|) and XOR (^) operators. Examples:
int a = 0x5A; // a = 01011010
int b = ~a; // b = 10100101, i.e. bits of a inverted
int c = a & b; // c = 00000000, i.e. bits of a and b ANDed together
int d = a | b; // d = 11111111, i.e. bits of a and b ORed together
int e = a ^ 0x3C; // e = 01100110, i.e. bits of a XORed with 00111100
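The difference from the logical operators can be seen side by side in a sketch like this one. The wrapper function names are made up for illustration.

```c
/* Bitwise AND combines the operands bit by bit. */
int bitwise_and(int a, int b) { return a & b; }

/* Logical AND only asks whether both operands are nonzero; it yields 0 or 1. */
int logical_and(int a, int b) { return a && b; }

/* Bitwise NOT flips every bit of the operand. */
int bitwise_not(int a) { return ~a; }

/* Logical NOT yields 1 for zero and 0 for anything else. */
int logical_not(int a) { return !a; }
```

For 0x5A and 0xA5, which have no set bits in common, the bitwise AND is 0 while the logical AND is 1, because both operands are nonzero.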
Could you help me to understand what operation is being performed ?
The code is converting size into a multiple of 4096. The sbrk() function adjusts the amount of memory allocated to a process, and 4 kilobytes is a typical virtual memory page size, so it makes sense to increase memory in 4K increments. The idea here seems to be to add 4095 to the requested size, so that even a 1-byte request will be promoted to a whole 4096-byte block, and then eliminate the low bits so that you get a multiple of 4K. To understand this better, plug in some different values for size and look at what you get for calc_size.
~ is the bitwise NOT operator. It inverts all bits in its operand.
As for what it means in the context of this expression:
size_t calc_size = ((size) + ((4096) - 1)) & ~((4096) - 1);
This rounds up size to the nearest multiple of 4096.
First lets look at ~((4096) - 1) using binary representation. 4096 is:
0001000000000000
(For simplicity's sake I'll just show the lowest 16 bits. Any higher order bits will be the same as the leftmost). Now subtract 1:
0000111111111111
And apply ~:
1111000000000000
This value is then used as a bitmask which clears the lowest order 12 bits, i.e. the result will be a multiple of 4096.
Before the mask is applied, 4095 is added to size. If size is already a multiple of 4096, the addition only sets the low-order 12 bits, which the mask then removes. If it is not, the addition carries into the 13th bit, rounding the value up, and the mask again removes the lower bits.
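Plugging in a few values is easy with a sketch like the following. PAGE_BYTES and round_up_to_page are names chosen for this example, and the 4096-byte page size is taken from the line in the question rather than queried from the system.

```c
#include <stddef.h>

#define PAGE_BYTES 4096u   /* assumed page size; must be a power of two */

/* Round size up to the next multiple of PAGE_BYTES. */
size_t round_up_to_page(size_t size)
{
    return (size + (PAGE_BYTES - 1u)) & ~(size_t)(PAGE_BYTES - 1u);
}
```

A request of 1 byte becomes 4096, a request of exactly 4096 stays 4096, and 4097 becomes 8192. Sizes within 4095 of the maximum size_t value would wrap around, so a real allocator would need to check for that.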
& is the AND bitwise operator. It works like this:
0101
0011
= 0001
It uses the same logic as &&, but applied bit by bit.
~ is the NOT bitwise operator, the same way:
0111
= 1000
I am not able to understand how the last statement increments the pointer. Can somebody explain it to me with a few examples?
The code, as shown:
aptr = (aptr + 1) & (void *)(BUFFERSIZE - 1);
// |________| incremented here
Since it is a circular buffer AND the buffer size is a power of 2, then the & is an easy and fast way to roll over by simply masking. Assuming that the BUFFERSIZE is 256, then:
num & (256 - 1) == num % 256
num & (0x100 - 1) == num % 0x100
num & (0x0ff) == num % 0x100
When the number is not a power of 2, then you can't use the masking technique:
num & (257 - 1) != num % 257
num & (0x101 - 1) != num % 0x101
num & 0x100 != num % 0x101
The (void *) allows the compiler to choose an appropriate width for the BUFFERSIZE constant based on your pointer width... although it is generally best to know - and use! - the width before a statement like this.
I added the hex notation to make it clearer why the & results in an emulated rollover event. Note that 0xff is binary 0b11111111, so the AND operation is simply masking off the upper bits.
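The equivalence for powers of 2, and the mismatch otherwise, can be tried out with a sketch like this. It works on plain unsigned indices rather than pointers, and the function names are invented for the example.

```c
/* Wrap an index with a mask; only correct when size is a power of two. */
unsigned wrap_mask(unsigned num, unsigned size)
{
    return num & (size - 1u);
}

/* Wrap an index with the remainder operator; correct for any nonzero size. */
unsigned wrap_mod(unsigned num, unsigned size)
{
    return num % size;
}
```

With size 256 both give 44 for num = 300. With size 257 the mask version gives 256 while the remainder is 43.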
2 problems with this approach.
A) Using a pointer with a bit-wise operation is not portable code. @Ilja Everilä
char *aptr;
// error: invalid operands to binary & (have 'char *' and 'void *')
// The following increments the index: (not really)
// aptr = (aptr + 1) & (void *)(BUFFERSIZE-1);
B) With compilers that support the non-standard math on a void * akin to a char *, the math is wrong if aptr points to an object wider than char and BUFFERSIZE is the number of elements in the buffer and not the byte size. Of course this depends on how the non-standard compiler implements some_type * & void *. Why bother writing code that relies on implementation-specific behavior?
Instead use i % BUFFERSIZE. This portable approach works when BUFFERSIZE is a power of 2 as well as when it is not. When a compiler sees i % power-of-2 and i is some unsigned type, an optimizing compiler will typically emit the same code as for i & (power-of-2 - 1).
For compilers that do not recognize this optimization, then one should consider a better compiler.
#include <stddef.h>
#define BUFFERSIZE 256
int main(void) {
char buf[BUFFERSIZE];
// pointer solution
char *aptr = buf;
aptr = &buf[(aptr - buf + 1) % BUFFERSIZE];
// index solution
size_t index = 0;
index = (index + 1) % BUFFERSIZE;
}
I'm looking at some C code that contains this statement.
if (
((uint8_t *)row)[byte] & (1 << (8-bit))
)
value |= (value + 1);
What would be the meaning and purpose of putting the AND of a pointer and an integer inside the conditional parentheses?
There are meanings, in other contexts, but that's not what's happening here.
It's casting row (which I assume is a pointer of some sort) to a uint8_t *, and then picking out the byte-th uint8_t in that array. That is then bitwise-anded with the shifted-left stuff.
It's logically the same as:
uint8_t shifted = (1 << (8 - bit));
uint8_t *rowptr = (uint8_t *)row;
uint8_t rowval = rowptr[byte];
uint8_t combined = (rowval & shifted);
if (combined) // or, if (combined != 0)
value |= (value + 1);
That isn't what it's doing.
(uint8_t *)row
cast row to pointer-to-unsigned-byte
((uint8_t *)row)[byte]
... and apply array addressing to retrieve the unsigned byte that is byte bytes forward from there. (Array addressing and pointer math are somewhat interchangeable; pointerval[intval] means the same thing as *(pointerval + intval).)
So that means
((uint8_t *)row)[byte] & (1 << (8-bit))
retrieves the byte-th unsigned byte from the row, and masks out everything but the single bit selected by 1 << (8-bit).
Finally, putting it all together,
if ( ((uint8_t *)row)[byte] & (1 << (8-bit)) )
tests whether the result of the expression is true (nonzero).
So this is asking whether a particular bit of a particular byte in the row is nonzero.
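Wrapped in a function, the test might look like the sketch below. The helper name row_bit_is_set is hypothetical, and the code assumes bit is in the range 1..8, which the question does not state.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper mirroring the condition in the question.
   Assumes bit is in 1..8, so that 1 << (8 - bit) selects bit 7
   (for bit == 1) down to bit 0 (for bit == 8) of the chosen byte. */
int row_bit_is_set(const void *row, size_t byte, unsigned bit)
{
    const uint8_t *bytes = (const uint8_t *)row;   /* view the row as raw bytes */
    uint8_t mask = (uint8_t)(1u << (8u - bit));    /* single-bit mask */
    return (bytes[byte] & mask) != 0;
}
```

Under that assumption, bit 1 refers to the most significant bit of the byte and bit 8 to the least significant one.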
I believe in this case it is for checking whether a specific bit is on.
It's testing whether bit number (8 - bit) of ((uint8_t *)row)[byte] is set or not, for example bit 7 when bit is 1. The & binary operator is the bitwise AND operator, not the logical AND operator. 1<<(8-bit) is an expression commonly used to generate a bit mask to isolate one bit.
row may be a generic pointer, so (uint8_t *)row is used to cast this pointer to be a pointer to an array of bytes.
This isn't an AND of the pointer. You have a pointer, you step byte elements past that starting location, and the value stored there is what is being ANDed.
In C when you do something like this:
char var = 1;
while(1)
{
var = var << 1;
}
In the 8th iteration the "<<" operator will shift out the 1 and var will be 0. I need a shift that keeps the bit circulating instead. In other words I need this:
initial ----- 00000001
1st shift -- 00000010
2nd shift - 00000100
3rd shift - 00001000
4th shift - 00010000
5th shift -- 00100000
6th shift -- 01000000
7th shift - 10000000
8th shift - 00000001 (At the 8th shift the one automatically start again)
Is there something equivalent to "<<" but to achieve this?
This is known as a circular shift, but C doesn't offer this functionality at the language level.
You will either have to implement this yourself, or resort to inline assembler routines, assuming your platform natively has such an instruction.
For example:
var = (var << 1) | (var >> 7);
(This is not well-defined for negative signed types, though, so you'd have to change your example to unsigned char.)
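As a function, and with the unsigned type suggested above, this could be sketched as follows. The name rotl8_by1 is made up, and the code assumes unsigned char is 8 bits wide.

```c
/* Rotate an 8-bit value left by one position.
   Assumes unsigned char is 8 bits wide (CHAR_BIT == 8). */
unsigned char rotl8_by1(unsigned char var)
{
    /* The operands are promoted to int, so the cast drops the bit shifted past bit 7. */
    return (unsigned char)((var << 1) | (var >> 7));
}
```

Starting from 1 and applying it eight times should walk through 2, 4, ..., 128 and arrive back at 1.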
Yes, you can use a circular shift. (It isn't a built-in C operation, but it is a CPU instruction on x86 CPUs.)
So you want to do a bit rotation, a.k.a. circular shift, then.
#include <limits.h> // Needed for CHAR_BIT
// positive numbits -> right rotate, negative numbits -> left rotate
#define ROTATE(type, var, numbits) ((numbits) >= 0 ? \
(var) >> (numbits) | (var) << (CHAR_BIT * sizeof(type) - (numbits)) : \
(var) << -(numbits) | (var) >> (CHAR_BIT * sizeof(type) + (numbits)))
As sizeof() returns sizes as multiples of the size of char (sizeof(char) == 1), and CHAR_BIT indicates the number of bits in a char (which, while usually 8, won't necessarily be), CHAR_BIT * sizeof(x) will give you the size of x in bits.
This is called a circular shift. There are Intel x86 assembly instructions to do this but unless performance is REALLY REALLY A HUGE ISSUE you're better off using something like this:
// unsigned operands keep both shifts well defined; requires 0 < by < width of unsigned int
unsigned int i = 0x42;
unsigned int by = 13;
unsigned int shifted = i << by | i >> ((sizeof(unsigned int) * 8) - by);
If you find yourself really needing the performance, you can use inline assembly to use the instructions directly (probably. I've never needed it badly enough to try).
It's also important to note that if you're going to be shifting by as many places as the width of your data type, or more, you need additional checks to make sure you're not overshifting. Using by = 48 with a 32-bit int is undefined behavior in C: shifted might receive a value of 0 on one platform and something else on another, because some processors mask the shift count automatically and others do not. The same applies to by = 0, since the right shift count then equals the full width. This is something to avoid like the plague.
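One way to add such a check is to reduce the count before shifting. The following is a minimal sketch for a fixed 32-bit width using uint32_t; the name rotl32 is chosen for this example and is not a standard library function.

```c
#include <stdint.h>

/* Rotate a 32-bit value left by any count without ever shifting by 32 or more. */
uint32_t rotl32(uint32_t value, unsigned by)
{
    by &= 31u;                                              /* reduce the count modulo 32 */
    return (value << by) | (value >> ((32u - by) & 31u));   /* second mask handles by == 0 */
}
```

With this, a count of 48 behaves like a count of 16, and a count of 0 or 32 returns the value unchanged.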