I have been using the BitSet class in Java and I would like to do something similar in C. I suppose I would have to do it manually, as with most things in C. What would be an efficient way to implement this?
byte bitset[]
maybe
bool bitset[]
?
CCAN has a bitset implementation you can use: http://ccan.ozlabs.org/info/jbitset.html
But if you do end up implementing it yourself (for instance, if you don't like the dependencies on that package), you should use an array of unsigned ints and the native word size of the computer architecture:
#define WORD_BITS (8 * sizeof(unsigned int))
unsigned int * bitarray = calloc(size / WORD_BITS + 1, sizeof(unsigned int));
static inline void setIndex(unsigned int * bitarray, size_t idx) {
    bitarray[idx / WORD_BITS] |= (1u << (idx % WORD_BITS));
}
Don't use a specific size (e.g. with uint64_t or uint32_t); let the computer use what it wants to use, and adapt to that using sizeof.
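For completeness, here is a minimal sketch of the matching clear and test accessors in the same style (clearIndex and getIndex are names I've made up to mirror setIndex):

static inline void clearIndex(unsigned int * bitarray, size_t idx) {
    bitarray[idx / WORD_BITS] &= ~(1u << (idx % WORD_BITS));
}
static inline int getIndex(const unsigned int * bitarray, size_t idx) {
    return (bitarray[idx / WORD_BITS] >> (idx % WORD_BITS)) & 1u;
}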
Nobody mentioned what the C FAQ recommends, which is a bunch of good-old-macros:
#include <limits.h> /* for CHAR_BIT */
#define BITMASK(b) (1 << ((b) % CHAR_BIT))
#define BITSLOT(b) ((b) / CHAR_BIT)
#define BITSET(a, b) ((a)[BITSLOT(b)] |= BITMASK(b))
#define BITCLEAR(a, b) ((a)[BITSLOT(b)] &= ~BITMASK(b))
#define BITTEST(a, b) ((a)[BITSLOT(b)] & BITMASK(b))
#define BITNSLOTS(nb) (((nb) + CHAR_BIT - 1) / CHAR_BIT)
(via http://c-faq.com/misc/bitsets.html)
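A minimal usage sketch, mirroring the FAQ's own example of an array with room for 47 bits:

#include <limits.h>
#include <string.h>

char bitarray[BITNSLOTS(47)];           /* storage for 47 bits */
memset(bitarray, 0, sizeof bitarray);   /* start with all bits clear */
BITSET(bitarray, 23);                   /* set bit 23 */
if (BITTEST(bitarray, 23)) {
    BITCLEAR(bitarray, 23);             /* and clear it again */
}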
Well, byte bitset[] seems a little misleading, no?
Use bit fields in a struct and then you can maintain a collection of these types (or use them otherwise as you see fit)
struct packed_struct {
    unsigned int b1:1;
    unsigned int b2:1;
    unsigned int b3:1;
    unsigned int b4:1;
    /* etc. */
} packed;
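If it helps, a quick sketch of how these flags are then read and written (plain member access, no masking):

struct packed_struct flags = {0};   /* all flags start cleared */
flags.b2 = 1;                       /* set a flag */
if (flags.b2 && !flags.b4) {
    flags.b3 = 1;                   /* test one flag, set another */
}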
I recommend my BITSCAN C++ library (version 1.0 has just been released). BITSCAN is specifically oriented toward fast bitscan operations. I have used it to implement NP-hard combinatorial problems involving simple undirected graphs, such as maximum clique (see the BBMC algorithm for a leading exact solver).
A comparison between BITSCAN and standard solutions STL bitset and BOOST dynamic_bitset is available here:
http://blog.biicode.com/bitscan-efficiency-at-glance/
You can give my PackedArray code a try with a bitsPerItem of 1.
It implements a random access container where items are packed at the bit level. In other words, it acts as if you were able to manipulate, e.g., a uint9_t or uint17_t array:
PackedArray principle:
. compact storage of <= 32 bits items
. items are tightly packed into a buffer of uint32_t integers
PackedArray requirements:
. you must know in advance how many bits are needed to hold a single item
. you must know in advance how many items you want to store
. when packing, behavior is undefined if items have more than bitsPerItem bits
PackedArray general in memory representation:
|-------------------------------------------------- - - -
| b0 | b1 | b2 |
|-------------------------------------------------- - - -
| i0 | i1 | i2 | i3 | i4 | i5 | i6 | i7 | i8 | i9 |
|-------------------------------------------------- - - -
. items are tightly packed together
. several items end up inside the same buffer cell, e.g. i0, i1, i2
. some items span two buffer cells, e.g. i3, i6
As usual, you first need to decide what sort of operations you need to perform on your bitset. Perhaps some subset of what Java defines? After that, you can decide how best to implement it. You can certainly look at the source of BitSet.java in OpenJDK for ideas.
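For instance, a minimal struct-based sketch of a small Java-like subset (all names here are mine, not from OpenJDK):

#include <stdlib.h>

#define BS_WORD_BITS (8 * sizeof(unsigned long))

typedef struct {
    unsigned long *words;   /* bit storage, BS_WORD_BITS bits per word */
    size_t nbits;
} bitset_t;

static bitset_t *bitset_new(size_t nbits) {
    bitset_t *bs = malloc(sizeof *bs);
    if (bs == NULL)
        return NULL;
    bs->nbits = nbits;
    bs->words = calloc((nbits + BS_WORD_BITS - 1) / BS_WORD_BITS,
                       sizeof *bs->words);
    if (bs->words == NULL) {
        free(bs);
        return NULL;
    }
    return bs;
}

static void bitset_set(bitset_t *bs, size_t i) {       /* like BitSet.set(i) */
    bs->words[i / BS_WORD_BITS] |= 1ul << (i % BS_WORD_BITS);
}

static int bitset_get(const bitset_t *bs, size_t i) {  /* like BitSet.get(i) */
    return (bs->words[i / BS_WORD_BITS] >> (i % BS_WORD_BITS)) & 1ul;
}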
Make it an array of unsigned 64-bit integers (uint64_t).
I'm trying to implement a rotateRight by n function in C by only using bitwise operators.
So far, I have settled on using this:
y = x >> n
z = x << (32 - n)
g = y | z
So take, for instance, the value 11010011.
If I were to try rotateRight(5):
y becomes 11111110
z becomes 01100000
Then g becomes 11111110
However, the correct answer should be 10011110.
This almost works, but the problem is that the right-shift copies the sign bit when I need it to perform logical shift, and so some of my answers are the negative of what they should be. How can I fix this?
Note
I am unable to use casting or unsigned types
You could shift unsigned values:
y = (int)((unsigned)x >> n);
z = x << (32 - n);
g = y | z;
Or, you could mask appropriately:
y = (x >> n) & ~(-1 << (32 - n));
z = x << (32 - n);
g = y | z;
Though @jlahd's answer is correct, I will try to provide a brief explanation of the difference between a logical shift right and an arithmetic shift right (another nice diagram of the difference can be found here).
Please read the links first and then if you're still confused read below:
Brief explanation of the two different shifts right
Now, if you declare your variable as int x = 8; the C compiler knows that this number is signed and when you use a shift operator like this:
int x = 8;
int y = -8;
int shifted_x, shifted_y;
shifted_x = x >> 2; // After this operation shifted_x == 2
shifted_y = y >> 2; // After this operation shifted_y == -2
The reason for this is that a shift right represents a division by a power of 2.
Now, I'm lazy, so let's make ints on my hypothetical machine 8 bits so I can save myself some writing. In binary, 8 and -8 would look like this:
8 = 00001000
-8 = 11111000 (invert and add 1 for two's complement representation)
But in computing, the binary number 11111000 is 248 in decimal. It can only represent -8 if we remember that the variable has a sign...
If we want to keep the nice property of a shift where the shift represents a division by a power of 2 (this is really useful), and we now want to have signed numbers, we need to make two different types of right shifts, because:
248 >> 1 = 124 = 01111100
-8 >> 1 = -4 = 11111100
// And for comparison
8 >> 1 = 4 = 00000100
We can see that the first shift inserted a 0 at the front while the second shift inserted a 1. This is because of the difference between signed and unsigned numbers in two's complement representation when dividing by a power of 2.
To keep this nicety we have two different right shift operators for signed and unsigned variables. In assembly you can explicitly state which you wish to use while in C the compiler decides for you based on the declared type.
Code generalisation
I would write the code a little differently in an attempt to keep myself at least a little platform agnostic.
#define ROTR(x,n) (((x) >> (n)) | ((x) << ((sizeof(x) * 8) - (n))))
This is a little better, but you still have to remember to keep the variables unsigned when using this macro. I could try adding casts to the macro like this:
#define ROTR(x,n) (((size_t)(x) >> (n)) | ((size_t)(x) << ((sizeof(x) * 8) - (n))))
but now I'm assuming that you're never going to try and rotate an integer larger than size_t...
In order to get rid of the upper bits of the right shift, which may be 1s or 0s depending on the type of shift the compiler chooses, one might try the following (which satisfies your no-casting requirement):
#define ROTR(x,n) ((((x) >> (n)) & (~(0u) >> (n))) | ((x) << ((sizeof(x) * 8) - (n))))
But it would not work as expected for the long type, since ~(0u) is of type unsigned int (the first type in the table that zero fits in), which restricts us to rotations of fewer than sizeof(unsigned int) * 8 bits. In that case we could use ~(0ul), but that makes it of unsigned long type, and this type may be inefficient on your platform; and what do we do if you want to pass in a long long? We would need the mask to be of the same type as x. We could achieve that with more magical expressions like ~((x)^(x)), but we would still need to turn it into an unsigned version, so let's not go there.
@MattMcNabb also points out in the comments two more problems:
our left shift operation could overflow. When operating on signed types, even though in practice the result is most often the same, we need to cast the x in the left shift operation to an unsigned type, because it is undefined behavior when an arithmetic shift operation overflows (see this answer's reference to the standard). But if we cast it, we will once again need to pick a suitable type for the cast, because its size in bytes will act as an upper limit on what we can rotate...
We are assuming that bytes have 8 bits, which is not always the case; we should use CHAR_BIT instead of 8.
In which case, why bother? Why not go back to the previous solution and just use the largest integer type, uintmax_t (C99), instead of size_t? But this now means that we could be penalized in performance, since we might be using integers larger than the processor word, and that could involve more than one assembly instruction per arithmetic operation... Nevertheless, here it is:
#define ROTR(x,n) (((uintmax_t)(x) >> (n)) | ((uintmax_t)(x) << ((sizeof(x) * CHAR_BIT) - (n))))
So really, there is likely no perfect way to do it (at least none that I can think of). You can either have it work for all types, or have it be fast by only dealing with things equal to or smaller than the processor word (eliminating long long and the like). But this version is nice and generic and should adhere to the standard...
If you want fast algorithms there is a point where you need to know what machine/s you're writing code for otherwise you can't optimize.
So in the end, @jlahd's solution will work better, whilst mine might help you make things more generic (at a cost).
I've tried your code on x86 Linux with gcc 4.6.3.
y = x >> n
z = x << (32 - n)
g = y | z
This works correctly. If x equals 11010011, then rotateRight(5) makes y become 00000110; ">>" does not shift in 1s here.
Let's say I have an enum with bitflag options larger than the number of bits in a standard data type:
enum flag_t {
    FLAG_1 = 0x1,
    FLAG_2 = 0x2,
    ...
    FLAG_130 = 0x400000000000000000000000000000000,
};
This is impossible for several reasons: enums have a maximum size of 128 bits (in C with gcc on my system, from experimentation), single variables also have a maximum size of 128 bits, etc.
In C you can't perform bitwise operations on arrays, though in C++ I suppose you could overload bitwise operators to do the job with a loop.
Is there any way in C other than manually remembering which flags go where to have this work for large numbers?
This is exactly what bit-fields are for.
In C, it's possible to define the following data layout:
struct flag_t
{
    unsigned int flag1 : 1;
    unsigned int flag2 : 1;
    unsigned int flag3 : 1;
    (...)
    unsigned int flag130 : 1;
    (...)
    unsigned int flag1204 : 1; // for fun
};
In this example, all flags occupy just one bit. An obvious advantage is the unlimited number of flags. Another great advantage is that you are no longer limited to single-bit flags, you could have some multi-value flags merged in the middle.
But most importantly, testing and assignment become a bit different, and probably simpler, as far as unit operations are concerned: you no longer need to do any masking, just access the flag directly by naming it. And by the way, use the opportunity to give these flags more meaningful names :)
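For illustration, a short sketch of what that direct access looks like (no masks in sight):

struct flag_t flags = {0};     /* every flag starts at 0 */
flags.flag130 = 1;             /* set by name, no shifting or masking */
if (flags.flag130 && !flags.flag2) {
    flags.flag3 = 1;
}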
Instead of trying to assign absurdly large numbers to an enum so you can have a hundreds-of-bits-wide bitfield, let the compiler assign a normal zero-based sequence of numbers to your flag names, and simulate a wide bitfield using an array of unsigned char. You can have a 1024-bit bitfield using unsigned char bits[128], and write get_flag() and set_flag() accessor functions to hide the minor amount of extra work involved.
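A minimal sketch of those accessors, assuming the unsigned char bits[128] layout just described:

#include <limits.h>

static int get_flag(const unsigned char *bits, unsigned n)
{
    return (bits[n / CHAR_BIT] >> (n % CHAR_BIT)) & 1u;
}

static void set_flag(unsigned char *bits, unsigned n, int value)
{
    if (value)
        bits[n / CHAR_BIT] |= (unsigned char)(1u << (n % CHAR_BIT));
    else
        bits[n / CHAR_BIT] &= (unsigned char)~(1u << (n % CHAR_BIT));
}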
However, a far better piece of advice would be to look at your design again, and ask yourself "Why do I need over a hundred different flags?". It seems to me that what you really need is a redesign.
In this answer to a question related to bitflags, Bit Manipulation and Flags, I provided an example of using an unsigned char array, an approach suited to very large sets of bitflags, which I am moving to this posting.
This source example provides the following:
a set of Preprocessor defines for the bitflag values
a set of Preprocessor macros to manipulate bits
a couple of functions to implement bitwise operations on the arrays
The general approach for this is as follows:
create a set of defines for the flags which specify an array offset and a bit pattern
create a typedef for an unsigned char array of the proper size
create a set of functions that implement the bitwise logical operations
The Specifics from the Answer with a Few Improvements and More Exposition
Use a set of C Preprocessor defines to create a set of bitflags to be used with the array. These bitflag defines specify an offset within the unsigned char array along with the bit to manipulate.
The defines in this example are 16-bit values in which the upper byte contains the array offset and the lower byte contains the bit flag(s) for the byte of the unsigned char array at that offset. Using this technique you can have arrays of up to 256 elements, that is 256 * 8 or 2,048 bitflags, or by going from a 16-bit define to a 32-bit long you could have many more. (In the comments below, bit 0 means the least significant bit of a byte and bit 7 means the most significant bit of a byte.)
#define ITEM_FLG_01 0x0001 // array offset 0, bit 0
#define ITEM_FLG_02 0x0002 // array offset 0, bit 1
#define ITEM_FLG_03 0x0101 // array offset 1, bit 0
#define ITEM_FLG_04 0x0102 // array offset 1, bit 1
#define ITEM_FLG_05 0x0201 // array offset 2, bit 0
#define ITEM_FLG_06 0x0202 // array offset 2, bit 1
#define ITEM_FLG_07 0x0301 // array offset 3, bit 0
#define ITEM_FLG_08 0x0302 // array offset 3, bit 1
#define ITEM_FLG_10 0x0908 // array offset 9, bit 3
Next you have a set of macros to set and unset the bits, along with a typedef to make the arrays a bit easier to use. Unfortunately, a typedef in C does not give you better type checking from the compiler; it is only a convenience. These macros do no checking of their arguments, so you might feel safer using regular functions instead.
#define SET_BIT(p,b) (*((p) + (((b) >> 8) & 0xff)) |= ((b) & 0xff))
#define TOG_BIT(p,b) (*((p) + (((b) >> 8) & 0xff)) ^= ((b) & 0xff))
#define CLR_BIT(p,b) (*((p) + (((b) >> 8) & 0xff)) &= ~((b) & 0xff))
#define TST_BIT(p,b) (*((p) + (((b) >> 8) & 0xff)) & ((b) & 0xff))
typedef unsigned char BitSet[10];
An example of using this basic framework is as follows.
BitSet uchR = { 0 };
int bValue;
SET_BIT(uchR, ITEM_FLG_01);
bValue = TST_BIT(uchR, ITEM_FLG_01);
SET_BIT(uchR, ITEM_FLG_03);
TOG_BIT(uchR, ITEM_FLG_03);
TOG_BIT(uchR, ITEM_FLG_04);
CLR_BIT(uchR, ITEM_FLG_05);
CLR_BIT(uchR, ITEM_FLG_01);
Next you can introduce a set of utility functions to do some of the bitwise operations we want to support. These bitwise operations are analogous to the built-in C operators such as bitwise OR (|) or bitwise AND (&). The functions use the built-in C operators to perform the designated operation on all array elements.
These particular utility functions modify one of the sets of bitflags provided. If that is a problem, you can modify the functions to accept three arguments: one for the result of the operation and the other two for the two sets of bitflags to use in the operation.
void AndBits(BitSet s1, const BitSet s2)
{
    size_t nLen = sizeof(BitSet);
    for (; nLen > 0; nLen--) {
        *s1++ &= *s2++;
    }
}

void OrBits(BitSet s1, const BitSet s2)
{
    size_t nLen = sizeof(BitSet);
    for (; nLen > 0; nLen--) {
        *s1++ |= *s2++;
    }
}

void XorBits(BitSet s1, const BitSet s2)
{
    size_t nLen = sizeof(BitSet);
    for (; nLen > 0; nLen--) {
        *s1++ ^= *s2++;
    }
}
If you need more than one size of bitflag set with this approach, the most flexible option is to eliminate the typedef and just use plain unsigned char arrays of various sizes. This change entails modifying the interface of the utility functions: replace BitSet with an unsigned char pointer, use unsigned char arrays where the bitflag variables are defined, and, along with the pointers, also pass a length for the arrays.
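As a sketch, the bitwise OR utility reworked for that pointer-plus-length interface (the other operations change the same way):

void OrBitsN(unsigned char *s1, const unsigned char *s2, size_t nLen)
{
    for (; nLen > 0; nLen--) {
        *s1++ |= *s2++;
    }
}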
You may also consider an approach similar to what is being done for text strings in Is concatenating arbitrary number of strings with nested function calls in C undefined behavior?.
I'm looking for input on the most elegant interface to put around a memory-mapped register interface where the target object is split in the register:
union __attribute__ ((__packed__)) epsr_t {
    uint32_t storage;
    struct {
        unsigned reserved0   : 10;
        unsigned ICI_IT_2to7 : 6;  // TOP HALF
        unsigned reserved1   : 8;
        unsigned T           : 1;
        unsigned ICI_IT_0to1 : 2;  // BOTTOM HALF
        unsigned reserved2   : 5;
    } bits;
};
In this case, accessing the single bit T or any of the reserved fields works fine, but reading or writing ICI_IT requires code more like:
union epsr_t epsr;
// Reading:
uint8_t ici_it = (epsr.bits.ICI_IT_2to7 << 2) | epsr.bits.ICI_IT_0to1;
// Writing:
epsr.bits.ICI_IT_2to7 = ici_it >> 2;
epsr.bits.ICI_IT_0to1 = ici_it & 0x3;
At this point I've lost a chunk of the simplicity / convenience that the bitfield abstraction is trying to provide. I considered the macro solution:
#define GET_ICI_IT(_e) ((_e.bits.ICI_IT_2to7 << 2) | _e.bits.ICI_IT_0to1)
#define SET_ICI_IT(_e, _i) do {\
    _e.bits.ICI_IT_2to7 = _i >> 2;\
    _e.bits.ICI_IT_0to1 = _i & 0x3;\
} while (0)
But I'm not a huge fan of macros like this as a general rule; I hate chasing them down when I'm reading someone else's code, and far be it from me to inflict such misery on others. I was hoping there was a creative trick involving structs/unions/what-have-you to hide the split nature of this object more elegantly (ideally as a simple member of an object).
I don't think there's ever a 'nice' way, and actually I wouldn't rely on bit-fields... Sometimes it's better to just have a bunch of exhaustive macros to do everything you'd want to do, document them well, and then rely on them having encapsulated your problem:
// Assuming the layout implied by the union above (LSB-first allocation):
// bits 2-7 of the 8-bit value live at EPSR bits 10-15, bits 0-1 at bits 25-26.
#define ICI_IT_HI_SHIFT 8
#define ICI_IT_HI_MASK 0xfc
#define ICI_IT_LO_SHIFT 25
#define ICI_IT_LO_MASK 0x03
// Bits containing the ICI_IT value split in the 32-bit EPSR
#define ICI_IT_PACKED_MASK ((ICI_IT_HI_MASK << ICI_IT_HI_SHIFT) | \
                            (ICI_IT_LO_MASK << ICI_IT_LO_SHIFT))
// Packs a single 8-bit ICI_IT value x into a 32-bit EPSR e
#define PACK_ICI_IT(e,x) (((e) & ~ICI_IT_PACKED_MASK) | \
                          (((x) & ICI_IT_HI_MASK) << ICI_IT_HI_SHIFT) | \
                          (((x) & ICI_IT_LO_MASK) << ICI_IT_LO_SHIFT))
// Unpacks a split 8-bit ICI_IT value from a 32-bit EPSR e
#define UNPACK_ICI_IT(e) ((((e) >> ICI_IT_HI_SHIFT) & ICI_IT_HI_MASK) | \
                          (((e) >> ICI_IT_LO_SHIFT) & ICI_IT_LO_MASK))
Note that I haven't put type casting and normal macro stuff in, for the sake of readability. Yes, I get the irony in mentioning readability...
If you dislike macros that much just use an inline function, but the macro solution you have is fine.
Does your compiler support anonymous unions?
I find it an elegant solution which gets rid of your .bits part. It is not C99-compliant, but most compilers do support it, and it became standard in C11.
See also this question: Anonymous union within struct not in c99?.
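For illustration, a C11-style sketch of the question's register type using an anonymous struct member inside the union, so the .bits qualifier disappears:

#include <stdint.h>

union epsr_t {
    uint32_t storage;
    struct {                       /* anonymous: members are accessed directly */
        unsigned reserved0   : 10;
        unsigned ICI_IT_2to7 : 6;
        unsigned reserved1   : 8;
        unsigned T           : 1;
        unsigned ICI_IT_0to1 : 2;
        unsigned reserved2   : 5;
    };
};

/* usage: union epsr_t epsr; epsr.T = 1; -- no .bits needed */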
I am writing a Linux kernel module and I need to come up with a hashing function that takes two integers for input. Because the code runs in kernel space, none of the standard libraries are available to me.
Basically, I need a hashing function where:
hash(a, b) = c
hash(b, a) = c
Where acceptable inputs for a and b are unsigned 32-bit integers. The hashing function should return an unsigned 64-bit integer. Collisions (i.e. hash(a, b) = c and hash(d, f) = c as well) are not desirable, as these values will be used in a binary search tree. The result of the search is a linked list of possible results that is then iterated over, where a and b are actually compared. So some collisions are acceptable, but the fewer collisions, the fewer iterations required, and the faster it will run.
Performance is also of extreme importance; this lookup will be performed for every packet received by the system, as I am writing a firewall application (the integers are actually packet source and destination addresses). This function is used to look up existing network sessions.
Thank you for your time.
Pseudocode of how you can do it:
if a > b
    return (a << 32) | b;
else
    return (b << 32) | a;
This satisfies hash(a,b) == hash(b,a), utilizes the full 64-bit space, and shouldn't have collisions... I think :)
Be careful not to shift the 32-bit variables directly; use intermediate 64-bit variables or inline casts instead:

#include <stdint.h>

uint64_t myhash(uint32_t a, uint32_t b)
{
    uint64_t a64 = (uint64_t) a;
    uint64_t b64 = (uint64_t) b;
    return (a > b) ? ((a64 << 32) | b64) : ((b64 << 32) | a64);
}
#define MYHASH(a,b) ( (((UINT64) max(a,b)) << 32) | ((UINT64) min(a,b)) )
((uint64_t)(a | b) << 32) + (a & b)
is commutative and should lead to a minimum number of collisions.
I have to think more about it though ...
How about ((uint64_t)max(a, b) << 32) | (uint64_t)min(a, b)? This would avoid collisions entirely, as there is no possible overlap between inputs. I can't speak to the distribution, though, as that depends on your input values.
(a ^ b) | ((uint64_t)(a ^ ~b) << 32);
TL;DR:
Why isn't (unsigned long)(0x400253FC) equivalent to (unsigned long)((*((volatile unsigned long *)0x400253FC)))?
How can I make a macro which works with the former work with the latter?
Background Information
Environment
I'm working with an ARM Cortex-M3 processor, the LM3S6965 by TI, with their StellarisWare (free download, export controlled) definitions. I'm using gcc version 4.6.1 (Sourcery CodeBench Lite 2011.09-69). Stellaris provides definitions for some 5,000 registers and memory addresses in "inc/lm3s6965.h", and I really don't want to redo all of those. However, they seem to be incompatible with a macro I want to write.
Bit Banding
On the ARM Cortex-M3, a portion of memory is aliased with one 32-bit word per bit of the peripheral and RAM memory space. Setting the memory at address 0x42000000 to 0x00000001 will set the first bit of the memory at address 0x40000000 to 1, but not affect the rest of the word. To change bit 2, change the word at 0x42000004 to 1. That's a neat feature, and extremely useful. According to the ARM Technical Reference Manual, the algorithm to compute the address is:
bit_word_offset = (byte_offset × 32) + (bit_number × 4)
bit_word_addr = bit_band_base + bit_word_offset
where:
bit_word_offset is the position of the target bit in the bit-band memory region.
bit_word_addr is the address of the word in the alias memory region that maps to the targeted bit.
bit_band_base is the starting address of the alias region.
byte_offset is the number of the byte in the bit-band region that contains the targeted bit.
bit_number is the bit position, 0 to 7, of the targeted bit.
Implementation of Bit Banding
The "inc/hw_types.h" file includes the following macro which implements this algorithm. To be clear, it implements it for a word-based model which accepts 4-byte-aligned words and 0-31-bit offsets, but the resulting address is equivalent:
#define HWREGBITW(x, b) \
        HWREG(((unsigned long)(x) & 0xF0000000) | 0x02000000 | \
              (((unsigned long)(x) & 0x000FFFFF) << 5) | ((b) << 2))
This algorithm takes the base (which is either in SRAM at 0x20000000 or in the peripheral memory space at 0x40000000) and ORs it with 0x02000000, adding the bit-band base offset. Then it multiplies the offset from the base by 32 (equivalent to a five-position left shift) and adds the bit number multiplied by 4 (the two-position left shift).
The referenced HWREG simply performs the requisite cast for writing to a given location in memory:
#define HWREG(x) \
(*((volatile unsigned long *)(x)))
This works quite nicely with assignments like
HWREGBITW(0x400253FC, 0) = 1;
where 0x400253FC is a magic number for a memory-mapped peripheral and I want to set bit 0 of this peripheral to 1. The above code computes (at compile-time, of course) the bit offset and sets that word to 1.
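For concreteness, here is that expansion worked through by hand (my own arithmetic, as a sanity check):

/* HWREGBITW(0x400253FC, 0) resolves to:
 *   (0x400253FC & 0xF0000000)            -> 0x40000000  (peripheral base)
 *   | 0x02000000                         -> bit-band alias offset
 *   | ((0x400253FC & 0x000FFFFF) << 5)   -> 0x253FC * 32 = 0x004A7F80
 *   | ((0) << 2)                         -> bit number * 4 = 0
 * giving *(volatile unsigned long *)0x424A7F80 = 1;
 */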
What doesn't work
Unfortunately, the aforementioned definitions in "inc/lm3s6965.h" already perform the cast done by HWREG. I want to avoid magic numbers and instead use the provided definitions like
#define GPIO_PORTF_DATA_R (*((volatile unsigned long *)0x400253FC))
An attempt to paste this into HWREGBITW causes the macro to no longer work, as the cast interferes:
HWREGBITW(GPIO_PORTF_DATA_R, 0) = 1;
The preprocessor generates the following mess (indentation added):
(*((volatile unsigned long *)
((((unsigned long)((*((volatile unsigned long *)0x400253FC)))) & 0xF0000000)
| 0x02000000 |
((((unsigned long)((*((volatile unsigned long *)0x400253FC)))) & 0x000FFFFF) << 5)
| ((0) << 2))
)) = 1;
Note the two instances of
(((unsigned long)((*((volatile unsigned long *)0x400253FC)))))
I believe that these extra casts are what is causing the macro to fail. The following result of preprocessing HWREGBITW(0x400253FC, 0) = 1; does work, supporting my assertion:
(*((volatile unsigned long *)
((((unsigned long)(0x400253FC)) & 0xF0000000)
| 0x02000000 |
((((unsigned long)(0x400253FC)) & 0x000FFFFF) << 5)
| ((0) << 2))
)) = 1;
The (type) cast operator has right-to-left precedence, so the last cast should apply and an unsigned long should be used for the bitwise arithmetic (which should then work correctly). There's nothing implicit anywhere, no float-to-pointer conversions, no precision/range changes... the left-most cast should simply nullify the casts to the right.
My question (finally...)
Why isn't (unsigned long)(0x400253FC) equivalent to (unsigned long)((*((volatile unsigned long *)0x400253FC)))?
How can I make the existing HWREGBITW macro work? Or, how can a macro be written to do the same task but not fail when given an argument with a pre-existing cast?
1- Why isn't (unsigned long)(0x400253FC) equivalent to (unsigned long)((*((volatile unsigned long *)0x400253FC)))?
The former is an integer literal and its value is 0x400253FCul, while the latter is the unsigned long value stored at the (memory or GPIO) address 0x400253FC.
2- How can I make the existing HWREGBITW macro work? Or, how can a macro be written to do the same task but not fail when given an argument with a pre-existing cast?
Use HWREGBITW(&GPIO_PORTF_DATA_R, 0) = 1; instead. Taking the address of the dereferenced pointer yields the original pointer again, so the macro once more receives the address 0x400253FC rather than the value stored there.
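If you would rather keep the call sites free of the explicit &, one possibility (a hypothetical wrapper, not part of StellarisWare) is a macro that takes the register lvalue and recovers its address itself:

/* HWREGBITW_REG is a made-up name; it leans on the fact that taking the
 * address of *(volatile unsigned long *)0x400253FC yields the pointer back. */
#define HWREGBITW_REG(reg, b) HWREGBITW((unsigned long)&(reg), b)

/* usage: HWREGBITW_REG(GPIO_PORTF_DATA_R, 0) = 1; */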