C defines proper usage

A problem with C defines:
#define RANGE1_ms 64
#define FS 16000.0f
#define BS 64
#define COUNTER1_LIMIT ( ( RANGE1_ms/1000.0f)* FS/BS )
This gives 16.0 for COUNTER1_LIMIT.
Debugging the code in Eclipse shows all is OK.
However, when I build the release version from a makefile, it produces a different result. I have narrowed the problem down to this line:
if( counter1 == (uint16_t)COUNTER1_LIMIT )
{
...
}
where counter1 is uint16_t.
What am I doing wrong with those defines?
This solves the problem:
if( counter1 == 16 )
but that's not the way to go.

The reason for the bug is that floating point numbers are not exact; see the millions of other posts on SO about this, for example Why Are Floating Point Numbers Inaccurate?
But in your specific case the problem is that you use floating point where it isn't needed or useful. You just need a compile-time constant. Fix the expression like this:
#define RANGE1_ms 64u
#define COUNTER1_LIMIT (16000u / 64u * RANGE1_ms / 1000u )
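As a quick sanity check that this integer-only expression is exact (a sketch assuming a C11 compiler for _Static_assert; older compilers would need an equivalent trick such as a negative array size):
/* 16000/64 = 250, 250 * 64 = 16000, 16000/1000 = 16 -- every step is exact integer math */
_Static_assert(COUNTER1_LIMIT == 16u, "COUNTER1_LIMIT must be exactly 16");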

Avoid FP math with the pre-processor.
// Avoid the following
#define FS 16000.0f
#define COUNTER1_LIMIT ( ( RANGE1_ms/1000.0f)* FS/BS )
Alternative 1: use integer math that rounds quotients to the nearest integer. Adding half the divisor before dividing works when both the dividend and the divisor are positive.
#define RANGE1_ms 64
#define FS 16000
#define BS 64
#define ms_PER_s 1000
#define COUNTER1_LIMIT_N (1ul * RANGE1_ms * FS)
#define COUNTER1_LIMIT_D (1ul * ms_PER_s * BS )
#define COUNTER1_LIMIT_I ((COUNTER1_LIMIT_N + COUNTER1_LIMIT_D/2)/COUNTER1_LIMIT_D)
#define COUNTER1_LIMIT_F (1.0*COUNTER1_LIMIT_N/COUNTER1_LIMIT_D)
if (counter1 == COUNTER1_LIMIT_I)
Alternative 2:
When constants like FS truly need to be FP, like 16123.4f, use a rounding function rather than truncation with an integer cast like (uint16_t):
#include <math.h>
if (counter1 == lround(COUNTER1_LIMIT))
Alternative 3:
When constants like FS truly need to be FP, like 16123.4f, add 0.5 and then truncate with an integer cast like (uint16_t). The add-0.5 trick works when the value is positive; it fails for a number of values where adding 0.5 is not exact. Yet it has the advantage that, as in OP's case, it can be computed at compile time:
if (counter1 == (uint16_t)(COUNTER1_LIMIT + 0.5))

Floating point numbers are susceptible to loss of precision. The value that should be 16.0 can come out as 15.9999… when the expression is evaluated. Casting it to uint16_t truncates it to 15, which does not compare equal to 16.
You need a function that rounds the floating point value, which you can call on COUNTER1_LIMIT.
Another option would be to promote counter1 to float and check whether the absolute difference between it and COUNTER1_LIMIT is less than a small value like 0.001.
It can be done as
float diff = counter1 - COUNTER1_LIMIT;
if(diff > -0.001 && diff < 0.001) {
...
}
This should work for you.

RANGE1_ms/1000.0f is 64/1000.0f. The problem is, computers cannot represent 0.064 exactly. Ultimately, floating point numbers are represented as a*2^b, where a and b are integers, which is impossible for 0.064. So RANGE1_ms/1000.0f is only approximately 0.064. When you multiply it by 16000.0f and divide by 64, you get approximately 16.0f. It may be 15.999999, or 16.000003, or something like that. When you cast it to uint16_t, it becomes either 15 or 16.
One way is to cast it as (uint16_t)(COUNTER1_LIMIT + 0.5), which rounds it to the closest integer. Alternatively, if your COUNTER1_LIMIT is always an integer (that is, you know the expression should produce an integer number), you may do
#define FS_BY_1000 16
#define COUNTER1_LIMIT (RANGE1_ms * FS_BY_1000/BS)

Related

C: Representing a fraction without floating points

I'm writing some code for an embedded system (MSP430) without hardware floating point support. Unfortunately, I will need to work with fractions in my code as I'm doing ranging, and a short-range sensor with a precision of 1m isn't a very good sensor.
I can do as much of the math as I need in ints, but by the end there are two values that I will definitely need to have fractions on: range and speed. Range will be a value between 2 and 500 (cm), while speed should be within -10 to 10 (m s^-1). I am unsure how to represent them without floating point values, if it is possible. A simple way of rounding the fractions up or down would be best.
Some sample code I have:
voltage_difference_new = ((memval3_new - memval4_new)*3.3/4096);
where memval3_new and memval4_new are ints, but voltage_difference_new is a float.
Please let me know if more information is needed. Or if there is a blindingly easy fix.
You have rather answered your own question with the statement:
Range will be a value between 2-500 (cm),
Work in centimetre (or even millimetre) rather than metre units.
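For example, a minimal sketch (illustrative names, assuming a linear 8-bit sensor covering 0 to 500 mm) that keeps the whole calculation in integer millimetres:
#include <stdint.h>

#define SENSOR_MAX_RAW  255u   /* assumed full-scale sensor reading        */
#define RANGE_MAX_MM    500u   /* assumed real distance at full scale (mm) */

static uint16_t raw_to_mm( uint16_t raw )
{
    /* scale first, then divide, so no precision is lost to integer truncation */
    return (uint16_t)( ( (uint32_t)raw * RANGE_MAX_MM ) / SENSOR_MAX_RAW ) ;
}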
That said you don't need floating-point hardware to do floating point math; the compiler will support "soft" floating point and generate the code to perform floating point operations - it will be slower than hardware floating point or integer operations, but that may not be an issue in your application.
Nonetheless there are many reasons to avoid floating-point even with hardware support and it does not sound like your case for FP is particularly compelling, but it is hard to tell without seeing your code and a specific example. In 32 years of embedded systems development I have seldom resorted to FP even for trig, log, sqrt and digital signal processing.
A general method is to use a fixed point presentation. My earlier suggestion of using centimetres is an example of decimal fixed point, but for greater efficiency you should use binary fixed point. For example you might represent distance in 1/1024 metre units (giving > 1 mm precision). Because the fixed point is binary, all the necessary rescaling can be done with shifts rather than more expensive multiply/divide operations.
For example, say you have an 8 bit sensor generating linear output 0 to 255 corresponding to a real distance 0 to 0.5 metre.
#define Q10_SHIFT 10 // 10 bits fractional (1/1024)
typedef int q10_t ;
#define ONE_METRE (1 << Q10_SHIFT)
#define SENSOR_MAX 255
#define RANGE_MAX (ONE_METRE/2)
q10_t distance = read_sensor() * RANGE_MAX / SENSOR_MAX ;
distance is in Q10 fixed point representation. Addition and subtraction on such values is normal integer arithmetic; multiply and divide require rescaling:
q10_t q10_add( q10_t a, q10_t b )
{
    return a + b ;
}
q10_t q10_sub( q10_t a, q10_t b )
{
    return a - b ;
}
q10_t q10_mul( q10_t a, q10_t b )
{
    // widen the intermediate product so it does not overflow before the shift
    return (q10_t)( ( (long long)a * b ) >> Q10_SHIFT ) ;
}
q10_t q10_div( q10_t a, q10_t b )
{
    // pre-scale the dividend in a wider type before dividing
    return (q10_t)( ( (long long)a << Q10_SHIFT ) / b ) ;
}
Of course you may want to be able to mix types and say multiply a q10_t by an int - providing a comprehensive library for fixed-point can get complex. Personally for that I use C++ where you have classes, function overloading and operator overloading to support more natural code. But unless your code has a great deal of general fixed point math, it may be simpler to code specific fixed point operations ad-hoc.
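For the ad-hoc route, a minimal sketch of mixed q10_t/int helpers might look like this (the names are illustrative, not from any particular library, and the q10_t type and Q10_SHIFT come from above); no rescaling is needed when only one operand is in Q10 format:
static inline q10_t q10_from_int( int n )          { return (q10_t)( n << Q10_SHIFT ) ; }
static inline int   q10_to_int( q10_t a )          { return a >> Q10_SHIFT ; }  // drops the fraction; assumes arithmetic shift for negatives
static inline q10_t q10_mul_int( q10_t a, int n )  { return a * n ; }           // product stays in Q10
static inline q10_t q10_div_int( q10_t a, int n )  { return a / n ; }           // quotient stays in Q10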
To take the one example you have provided:
double voltage_difference_new = ((memval3_new - memval4_new)*3.3/4096);
The floating-point there is trivially removed using millivolts:
int voltage_difference_new_mv = ((memval3_new - memval4_new) * 3300) /4096 ;
The issue then perhaps becomes one of presentation. For example if you have to present or report the value in volts to a user. In that case:
int volt_fract = abs(voltage_difference_new_mv % 1000) ;
int volt_whole = voltage_difference_new_mv / 1000 ;
printf( "%d.%03d", volt_whole, volt_fract ) ;  /* three digits for the millivolt fraction */

How to compute non-integer (fractional) log2 at compile time in C

There are various macro-based solutions out there to compute the integer-valued log2 at compile time, but what if you need a bit more precision than what you get with integers, i.e. a few binary places after the binary point? It doesn't matter whether the value generated is a floating point expression, or a fixed-point scaled integer expression, as long as it evaluates to a compile-time constant value.
What is this useful for, you may ask? The application I had in mind was to compute the number of bits needed to optimally pack a structure whose fields have value sets that don't span a power of 2 range.
I've come up with the following - it's not particularly inventive, but it works, and doesn't slow the compilation totally to a crawl. IOW, it does what I needed it to do - add a couple (here: dozen) binary places' worth of precision after the binary point.
It works by representing the number as a power of 2 and trial-multiplying it by coefficients that happen to have log2 values equal to a binary fraction (2^-n). Multiplying by such coefficients is equivalent to adding together the logarithms, and thus the FRAC_LOG2 macro expands to a sum with elements selected with nested ternary expressions.
#define IROOT2_1 .7071067812 // 2^-(2^-1)
#define IROOT2_2 .8408964153 // 2^-(2^-2)
#define IROOT2_3 .9170040432 // 2^-(2^-3)
#define IROOT2_4 .9576032807 // 2^-(2^-4)
#define IROOT2_5 .9785720621 // 2^-(2^-5)
#define IROOT2_6 .9892280132 // 2^-(2^-6)
#define IROOT2_7 .9945994235 // 2^-(2^-7)
#define IROOT2_8 .9972960561 // 2^-(2^-8)
#define IROOT2_9 .9986471129 // 2^-(2^-9)
#define IROOT2_A .9993233275 // 2^-(2^-10)
#define IROOT2_B .9996616065 // 2^-(2^-11)
#define IROOT2_C .9998307889 // 2^-(2^-12)
#define BIT_SCAN_REV(n) \
(n>>15?15:n>>14?14:n>>13?13:n>>12?12:n>>11?11:n>>10?10:n>>9?9:\
n>>8?8:n>>7?7:n>>6?6:n>>5?5:n>>4?4:n>>3?3:n>>2?2:n>>1?1:0)
#define FRAC_LOG2_1(m,n) (1./4096.)*\
((m<=n*IROOT2_1?2048:0)+FRAC_LOG2_2(m,n*(m<=n*IROOT2_1?IROOT2_1:1)))
#define FRAC_LOG2_2(m,n) ((m<=n*IROOT2_2?1024:0)+FRAC_LOG2_3(m,n*(m<=n*IROOT2_2?IROOT2_2:1)))
#define FRAC_LOG2_3(m,n) ((m<=n*IROOT2_3?512:0)+FRAC_LOG2_4(m,n*(m<=n*IROOT2_3?IROOT2_3:1)))
#define FRAC_LOG2_4(m,n) ((m<=n*IROOT2_4?256:0)+FRAC_LOG2_5(m,n*(m<=n*IROOT2_4?IROOT2_4:1)))
#define FRAC_LOG2_5(m,n) ((m<=n*IROOT2_5?128:0)+FRAC_LOG2_6(m,n*(m<=n*IROOT2_5?IROOT2_5:1)))
#define FRAC_LOG2_6(m,n) ((m<=n*IROOT2_6?64:0)+FRAC_LOG2_7(m,n*(m<=n*IROOT2_6?IROOT2_6:1)))
#define FRAC_LOG2_7(m,n) ((m<=n*IROOT2_7?32:0)+FRAC_LOG2_8(m,n*(m<=n*IROOT2_7?IROOT2_7:1)))
#define FRAC_LOG2_8(m,n) ((m<=n*IROOT2_8?16:0)+FRAC_LOG2_9(m,n*(m<=n*IROOT2_8?IROOT2_8:1)))
#define FRAC_LOG2_9(m,n) ((m<=n*IROOT2_9?8:0)+FRAC_LOG2_A(m,n*(m<=n*IROOT2_9?IROOT2_9:1)))
#define FRAC_LOG2_A(m,n) ((m<=n*IROOT2_A?4:0)+FRAC_LOG2_B(m,n*(m<=n*IROOT2_A?IROOT2_A:1)))
#define FRAC_LOG2_B(m,n) ((m<=n*IROOT2_B?2:0)+FRAC_LOG2_C(m,n*(m<=n*IROOT2_B?IROOT2_B:1)))
#define FRAC_LOG2_C(m,n) (m<=n*IROOT2_C?1:0)
#define FRAC_LOG2(n) (BIT_SCAN_REV(n) + FRAC_LOG2_1(1<<BIT_SCAN_REV(n), n))
It's not exactly cheap, of course - for a 2-digit number, it expands to about 700kb of code that the compiler has to dig through, but it has precision of over 5 fractional decimal digits.
A workaround is to store the integral result of BIT_SCAN_REV in an enum, so that it's just a couple of letters instead of about 170:
enum {
input = 36,
bsr = BIT_SCAN_REV(input),
bsr_ = 1 << bsr,
};
static const float output = bsr + FRAC_LOG2_1(bsr_, input);
Another way of doing this at a much lower memory cost, without recursive macros, would require an include file to be used any time a value is to be computed.

Mapping [-1,+1] floats to Q31 fixed-point

I need to convert float to Q31 fixed-point, Q31 meaning 1 sign bit, 0 bits for integer part, and 31 bits for fractional part. This means that Q31 can only represent numbers in the range [-1,0.9999].
By definition, when converting from float to fixed-point, a multiplication by 2^N is done, where N is the fractional part size, in this case 31.
However, I got confused with this code, it doesn't look right, but works:
#define q31_float_to_int(x) ( (int) ( (float)(x)*(float)0x7FFFFFFF ) )
And it seems to work OK. For example:
int a = q31_float_to_int(0.5f);
gives Hex: 0x40000000, which is OK.
Why is the multiplication here done with 2^31 - 1, and not just 2^31?
The code above is not a good solution for converting from float to fixed point. I am guessing whoever wrote the code used the scale factor of 0x7FFFFFFF to avoid an overflow when the input is 1.0. The correct scaling factor is 2^31, not 2^31 - 1. Note that there are also precision issues when converting a float (with 24 bits of precision) to a Q1.31 (with 31 bits of precision). Consider saturating the input data before the multiplication:
const float Q31_MAX_F = 0x0.FFFFFFp0F;
const float Q31_MIN_F = -1.0F;
float clamped = fmaxf(fminf(input, Q31_MAX_F), Q31_MIN_F);
The code above will clamp input to the range [-1.0, 1.0). The constant Q31_MAX_F is 1 - 2^-24, the largest float value below 1.0 given 24 bits of precision, and Q31_MIN_F is -1. Then you can multiply clamped by 2^31, or even better, use scalbnf or ldexpf:
int result = (int) scalbnf(clamped, 31);
And if you want rounding:
int result = (int) roundf(scalbnf(clamped, 31));
I recently had to use STM32's CORDIC for hardware-accelerated trigonometry, and left unsatisfied with the accepted answer (and everything else I found on the web), I came up with a simpler (but slightly less precise) algorithm for Q31/F32 conversion:
#define Q31_SCALAR (float)M_PI
#define F32_TO_Q31(F) (int32_t)((fmodf((F)+Q31_SCALAR,2.f*Q31_SCALAR) + ((F)<-Q31_SCALAR?Q31_SCALAR:-Q31_SCALAR)) * ((float)(INT32_MAX+1u)/Q31_SCALAR))
#define Q31_TO_F32(Q) ((int32_t)(Q) / (float)(INT32_MAX+1u))
#define CORDIC_COS_SIN(RAD,COS_VAR,SIN_VAR) { hcordic.Instance->WDATA = F32_TO_Q31(RAD); \
(COS_VAR) = Q31_TO_F32(hcordic.Instance->RDATA); (SIN_VAR) = Q31_TO_F32(hcordic.Instance->RDATA); }
This will map floats from [-π, +π] to approximately [INT32_MIN, INT32_MAX[. If the input value is out of range, it will be "wrapped" back into that range (e.g. -5.9π will be treated as 0.1π).
If instead you want to map [-1, +1] as per the original question, simply use the following:
#define Q31_SCALAR 1.f

How to avoid branching in C for this operation

Is there a way to remove the following if-statement to check if the value is below 0?
int a = 100;
int b = 200;
int c = a - b;
if (c < 0)
{
c += 3600;
}
The value of c should lie between 0 and 3600. Both a and b are signed. The value of a should also lie between 0 and 3600 (yes, it is a counting value in 0.1 degrees). The value gets reset to 3600 by an interrupt, but if that interrupt comes too late it underflows, which is not a problem, but the software should still be able to handle it. Which it does.
We do this if (c < 0) check at quite some places where we are calculating positions. (Calculating a new position etc.)
I was used to Python's modulo operator, which uses the sign of the divisor, whereas our compiler (C89) uses the sign of the dividend.
Is there some way to do this calculation differently?
example results:
a - b = c
100 - 200 = 3500
200 - 100 = 100
Good question! How about this?
c += 3600 * (c < 0);
This is one way we preserve branch predictor slots.
What about this (assuming 32-bit ints):
c += 3600 & (c >> 31);
c >> 31 sets all bits to the original MSB, which is 1 for negative numbers and 0 for others in two's complement.
Right-shifting a negative number is formally implementation-defined according to the C standard documents; however, it is almost always implemented as MSB copying (common processors can do it in a single instruction).
This will surely result in no branches, unlike (c < 0) which might be implemented with branch in some cases.
Why are you worried about the branch? [Reason explained in comments to the question.]
The alternative is something like:
((a - b) + 3600) % 3600
This assumes a and b are in the range 0..3600 already; if they're not under control, the more general solution is the one Drew McGowen suggests:
((a - b) % 3600 + 3600) % 3600
The branch miss has to be very expensive to make that much calculation worthwhile.
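For the sample values in the question, a quick self-contained check of the first form:
#include <assert.h>

int main(void)
{
    int a = 100, b = 200;
    assert(((a - b) + 3600) % 3600 == 3500);  /* 100 - 200 wraps to 3500 */
    a = 200; b = 100;
    assert(((a - b) + 3600) % 3600 == 100);
    return 0;
}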
@skjaidev showed how to do it without branching. Here's how to automatically avoid the multiplication as well when ints are two's complement:
#if ((3600 & -0) == 0) && ((3600 & -1) == 3600)
c += 3600 & -(c < 0);
#else
c += 3600 * (c < 0);
#endif
What you want to do is modular arithmetic. Your two's complement machine already does this with integer math. So, by mapping your values into two's complement arithmetic, you can get the modulo operation for free.
The trick is to represent your angle as a fraction of 360 degrees between 0 and 1-epsilon. Of course, your constant angles would then have to be represented similarly, but that shouldn't be hard; it's just a bit of math we can hide in a conversion function (er, macro).
The value in this idea is that if you add or subtract angles, you'll get a value whose fraction part you want, and whose integer part you want to throw away. If we represent the fraction as a 32 bit fixed point number with the binary point at 2^32 (e.g., to the left of what is normally considered to be a sign bit), any overflows of the fraction simply fall off the top of the 32 bit value for free. So, you do all integer math, and "overflow" removal happens for free.
So I'd rewrite your code (preserving the idea of degrees times 10):
#include <stdint.h>
typedef uint32_t angle; // angle*3600/(2^32) represents degrees times 10
#define angle_scale_factor 1193046.47111111 // = 2^32/3600
#define make_angle(degrees) ((angle)(((degrees)%3600)*angle_scale_factor))
#define make_degrees(a) ((a)/(angle_scale_factor*10)) // produces a floating point number of whole degrees
...
angle a = make_angle(100); // compiler presumably does compile-time math to compute 119304647
angle b = make_angle(200); // = 238609294
angle c = a - b; // compiler should generate integer subtract, which computes 4175662649
#if 0 // no need for this at all; other solutions execute real code to do something here
if (c < 0) // this can't happen
{ c += 3600; } // this is the wrong representation for our variant
#endif
// speed doesn't matter here, we're doing output:
printf("final angle = %4.2f\n", make_degrees(c)); // should print 350.00
I have not compiled and run this code.
Changes to make this degrees times 100 or times 1 are pretty easy; modify the angle_scale_factor. If you have a 16 bit machine, switching to 16 bits is similarly easy; if you have 32 bits, and you still want to only do 16 bit math, you will need to mask the value to be printed to 16 bits.
This solution has one other nice property: you've documented which variables are angles (and have funny representations). OP's original code just called them ints, but that's not what they represent; a future maintainer will get surprised by the original code, especially if he finds the subtraction isolated from the variables.

Question about round_up macro

#define ROUND_UP(N, S) ((((N) + (S) - 1) / (S)) * (S))
With the above macro, could someone please help me understand the "(S) - 1" part, why's that?
and also macros like:
#define PAGE_ROUND_DOWN(x) (((ULONG_PTR)(x)) & (~(PAGE_SIZE-1)))
#define PAGE_ROUND_UP(x) ( (((ULONG_PTR)(x)) + PAGE_SIZE-1) & (~(PAGE_SIZE-1)) )
I know the "(~(PAGE_SIZE-1)))" part will zero out the last five bits, but other than that I'm clueless, especially the role '&' operator plays.
Thanks,
The ROUND_UP macro is relying on integer division to get the job done. It will only work if both parameters are integers. I'm assuming that N is the number to be rounded and S is the interval on which it should be rounded. That is, ROUND_UP(12, 5) should return 15, since 15 is the first interval of 5 larger than 12.
Imagine we were rounding down instead of up. In that case, the macro would simply be:
#define ROUND_DOWN(N,S) ((N / S) * S)
ROUND_DOWN(12,5) would return 10, because (12/5) in integer division is 2, and 2*5 is 10. But we're not doing ROUND_DOWN, we're doing ROUND_UP. So before we do the integer division, we want to add as much as we can without losing accuracy. If we added S, it would work in almost every case; ROUND_UP(11,5) would become (((11+5) / 5) * 5), and since 16/5 in integer division is 3, we'd get 15.
The problem comes when we pass a number that's already rounded to the multiple specified. ROUND_UP(10, 5) would return 15, and that's wrong. So instead of adding S, we add S-1. This guarantees that we'll never push something up to the next "bucket" unnecessarily.
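A few worked values, as a self-contained check of the behaviour described above:
#include <assert.h>

#define ROUND_UP(N, S) ((((N) + (S) - 1) / (S)) * (S))

int main(void)
{
    assert(ROUND_UP(10, 5) == 10);  /* already a multiple: unchanged     */
    assert(ROUND_UP(11, 5) == 15);  /* (11 + 4) / 5 = 3, then 3 * 5 = 15 */
    assert(ROUND_UP(12, 5) == 15);
    return 0;
}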
The PAGE_ macros have to do with binary math. We'll pretend we're dealing with 8-bit values for simplicity's sake. Let's assume that PAGE_SIZE is 0b00100000. PAGE_SIZE-1 is thus 0b00011111. ~(PAGE_SIZE-1) is then 0b11100000.
A binary & will line up two binary numbers and leave a 1 anywhere that both numbers had a 1. Thus, if x was 0b01100111, the operation would go like this:
0b01100111 (x)
& 0b11100000 (~(PAGE_SIZE-1))
------------
0b01100000
You'll note that the operation really only zeroed-out the last 5 bits. That's all. But that was exactly the operation needed to round down to the nearest interval of PAGE_SIZE. Note that this only worked because PAGE_SIZE was exactly a power of 2. It's a bit like saying that for any arbitrary decimal number, you can round down to the nearest 100 simply by zeroing-out the last two digits. It works perfectly, and is really easy to do, but wouldn't work at all if you were trying to round to the nearest multiple of 76.
PAGE_ROUND_UP does the same thing, but it adds as much as it can to the page before cutting it off. It's kinda like how I can round up to the nearest multiple of 100 by adding 99 to any number and then zeroing-out the last two digits. (We add PAGE_SIZE-1 for the same reason we added S-1 above.)
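The same can be checked with a small self-contained example (a sketch using uintptr_t in place of ULONG_PTR and an assumed 4 KiB page size):
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE 0x1000u
#define PAGE_ROUND_DOWN(x) (((uintptr_t)(x)) & (~(uintptr_t)(PAGE_SIZE-1)))
#define PAGE_ROUND_UP(x)   ((((uintptr_t)(x)) + PAGE_SIZE-1) & (~(uintptr_t)(PAGE_SIZE-1)))

int main(void)
{
    assert(PAGE_ROUND_DOWN(0x1234) == 0x1000);
    assert(PAGE_ROUND_UP(0x1234)   == 0x2000);
    assert(PAGE_ROUND_UP(0x2000)   == 0x2000);  /* already page-aligned: unchanged */
    return 0;
}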
Good luck with your virtual memory!
Using integer arithmetic, dividing always rounds down. To fix that, you add the largest possible number that won't affect the result if the original number was evenly divisible. For the number S, that largest possible number is S-1.
Rounding to a power of 2 is special, because you can do it with bit operations. A multiple of 2 will always have a zero in the bottom bit, a multiple of 4 will always have zero in the bottom two bits, etc. The binary representation of a power of 2 is a single bit followed by a bunch of zeros; subtracting 1 will clear that bit, and set all the bits to the right. Inverting that value creates a bit mask with zeros in the places that need to be cleared. The & operator will clear those bits in your value, thus rounding the value down. The same trick of adding (PAGE_SIZE-1) to the original value causes it to round up instead of down.
The page rounding macros assume that PAGE_SIZE is a power of two, such as:
0x0400 -- 1 KiB
0x0800 -- 2 KiB
0x1000 -- 4 KiB
The value of PAGE_SIZE - 1, therefore, is all one bits:
0x03FF
0x07FF
0x0FFF
Therefore, if integers were 16 bits (instead of 32 or 64 - it saves me some typing), then the value of ~(PAGE_SIZE-1) is:
0xFC00
0xFE00
0xF000
When the value of x (assuming, implausibly for real life, but sufficient for the purposes of exposition, that ULONG_PTR is an unsigned 16-bit integer) is 0xBFAB, then:
PAGE_SIZE        PAGE_ROUND_DOWN(0xBFAB)    PAGE_ROUND_UP(0xBFAB)
0x0400    -->    0xBC00                     0xC000
0x0800    -->    0xB800                     0xC000
0x1000    -->    0xB000                     0xC000
The macros round down and up to the nearest multiple of a page size. The last five bits would only be zeroed out if PAGE_SIZE == 0x20 (or 32).
Based on the current draft standard (C99), this macro is not entirely correct, however; note that for negative values of N the result will almost certainly be incorrect.
The formula:
#define ROUND_UP(N, S) ((((N) + (S) - 1) / (S)) * (S))
Makes use of the fact that integer division rounds down for non-negative integers and uses the S - 1 part to force it to round up instead.
However, integer division rounds towards zero (C99, Section 6.5.5. Multiplicative operators, item 6). For negative N, the correct way to 'round up' is: 'N / S', nothing more, nothing less.
It gets even more involved if S is also allowed to be a negative value, but let's not even go there... (see: How can I ensure that a division of integers is always rounded up? for a more detailed discussion of various wrong and one or two right solutions)
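A hypothetical sign-aware helper (a sketch, assuming S > 0 and C99 truncating division) would therefore be:
/* Rounds n up (toward +infinity) to the next multiple of s; assumes s > 0. */
long round_up_to_multiple( long n, long s )
{
    if (n >= 0)
        return ((n + s - 1) / s) * s;  /* the usual trick for non-negative n */
    else
        return (n / s) * s;            /* truncation toward zero already rounds "up" here */
}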
The & makes it so... well, OK, let's take some binary numbers.
(with 1000b being the page size)
PAGE_ROUND_UP(01101b)=
01101b+1000b-1b & ~(1000b-1b) =
01101b+111b & ~(111b) =
01101b+111b & ...11000b = (the ... means 1's continuing for size of ULONG)
10100b & 11000b=
10000b
So, as you can see (hopefully), this rounds up by adding PAGE_SIZE - 1 to x and then ANDing with the inverted mask, which clears the bits below the page size.
This is what I use:
#define SIGN(x) ((x)<0?-1:1)
#define ROUND(num, place) ((int)(((float)(num) / (float)(place)) + (SIGN(num)*0.5)) * (place))
float A = 456.456789f;
float B = ROUND(A, 50.0f);   // 450.0
float C = ROUND(A, 0.001f);  // 456.457
