C bit field/array - c

I have to do a project for school, but I got stuck right at the beginning.
I have to define a type for a bit field. That isn't a problem; it would look like this:
typedef struct {
unsigned flag1 : 1;
unsigned flag2 : 1;
}BitArray;
But the next task is to write a set of macros for working with the bit field. One of them is:
create(field_name,size) /* defines and initializes bitfield */
The question is how can I typedef bitfield so I could change number of its members later?
The second method that came to my mind is to use a bool array. But again, how can I typedef a bool array? At first I tried:
typedef bool BitArray[]; //new identifier BitArray for bool array
BitArray Array[5]; //new BitArray variable Array - this line would be in the macro mentioned above
It didn't take me a long time to realise that it won't work. However I can do:
typedef bool BitArray;
BitArray Array[5];
But that just renames the identifier for bool.
I hope my post makes sense, and thank you for any advice you can give.

The question is how can I typedef bitfield so I could change number of its members later?
That's what the typedef struct ... BitArray does. No need for additional types.
A bool array will most likely not compile into bit-wise code, it will likely compile into an array of bytes, which is not what you want.
In addition, it is a very bad idea to hide arrays or pointers behind typedefs, so that they no longer look like arrays or pointers.
Some recommendations:
I have to define type for bitfield
I would not recommend using bit fields for any purpose whatsoever. You should ask your teacher why they are teaching you to use dangerous and poorly specified parts of the C language.
But next task is to do set of macros for working with bitfield
You should ask your teacher why they teach you to use function-like macros and not proper functions. Using a function-like macro might be the worst thing you can ever do in C: they are dangerous, they are unreadable, they are hard to debug, they are hard to maintain.
Combining function-like macros mixed with bit-fields seems like a really stupid idea, but of course that is just my personal opinion. The safe and 100% portable way is to use bit-wise operators with masks on byte-level variables, such as:
uint8_t my_var=0;
my_var |= 0x80; // set msb, bit 7
my_var &= ~0x80; // clear msb, bit 7
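To connect the mask idiom with the original bit-array question, here is a minimal sketch that applies the same operators to an array of bytes. The names BitArray20, BITS_N, bits_set, bits_clear and bits_test are made up for illustration, and the size of 20 bits is an arbitrary assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed-size bit array: bit i lives in byte i / 8, position i % 8. */
enum { BITS_N = 20 };                    /* number of flags, chosen for illustration */

typedef struct {
    uint8_t bytes[(BITS_N + 7) / 8];     /* 20 bits round up to 3 bytes */
} BitArray20;

static void bits_set(BitArray20 *b, size_t i)
{
    assert(i < BITS_N);
    b->bytes[i / 8] |= (uint8_t)(1u << (i % 8));
}

static void bits_clear(BitArray20 *b, size_t i)
{
    assert(i < BITS_N);
    b->bytes[i / 8] &= (uint8_t)~(1u << (i % 8));
}

static bool bits_test(const BitArray20 *b, size_t i)
{
    assert(i < BITS_N);
    return (b->bytes[i / 8] >> (i % 8)) & 1u;
}
```

A variable is created with BitArray20 flags = {{0}}; and then handled only through the three functions. Wrapping the array in a struct keeps it from decaying to a pointer behind a typedef.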

In C you can declare a typedef only for fixed-size array, like this:
typedef bool bitset8[8]; // 8 is constant expression
bitset8 bs8;
bs8[0] = true;
I don't quite understand how exactly create macro from your post is going to be used, but if you need to dynamically change number of fields you have to use malloc'ed objects ANYWAY, so the declaration of BitArray struct should contain a pointer to let's say unsigned char (that is a pointer to byte array essentially). The content of the array should be managed by separate functions, that may be called from macros (though there is no real need in them).
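A rough sketch of what such a malloc-backed object could look like, assuming the size is only known at run time. DynBits and the dynbits_* functions are hypothetical names, not something from the question, and the caller is assumed to pass a size greater than zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical run-time sized bit array; none of these names come from the post. */
typedef struct {
    size_t nbits;          /* number of valid bits */
    unsigned char *data;   /* heap storage, (nbits + 7) / 8 bytes */
} DynBits;

/* Allocates zeroed storage for nbits (> 0) bits; returns false on failure. */
static bool dynbits_create(DynBits *b, size_t nbits)
{
    b->data = calloc((nbits + 7) / 8, 1);
    b->nbits = (b->data != NULL) ? nbits : 0;
    return b->data != NULL;
}

static void dynbits_destroy(DynBits *b)
{
    free(b->data);
    b->data = NULL;
    b->nbits = 0;
}

static void dynbits_put(DynBits *b, size_t i, bool value)
{
    assert(i < b->nbits);
    unsigned char mask = (unsigned char)(1u << (i % 8));
    if (value)
        b->data[i / 8] |= mask;
    else
        b->data[i / 8] &= (unsigned char)~mask;
}

static bool dynbits_get(const DynBits *b, size_t i)
{
    assert(i < b->nbits);
    return (b->data[i / 8] >> (i % 8)) & 1u;
}
```

A create(field_name, size) macro, if the assignment insists on one, could then simply expand to a DynBits declaration followed by a call to dynbits_create.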

Related

Partial bytewise access to C enum

Setting:
I define an enum in C99:
enum MY_ENUM {TEST_ENUM_ITEM1, TEST_ENUM_ITEM2, TEST_ENUM_ITEM_MAX};
I ensure with compile time asserts that TEST_ENUM_ITEM_MAX does not exceed UINT16_MAX. I assume little endian as byte order.
I have a serialize-into-buffer function with following parameters:
PutIntoBuffer(uint8_t* src, uint32_t count);
I serialize a variable holding a value into a buffer. For this task I access the variable holding the enum like this:
enum MY_ENUM testVar = TEST_ENUM_ITEM1;
PutIntoBuffer((uint8_t*) &testVar, sizeof(uint16_t));
Question: Is it legitimate to access the enum (which is an int) in this way? Does C standard guarantee the intended behaviour?
It is legitimate as in "it will work if int is 16 bits". It does not violate any pointer aliasing rules either, as long as you use a character type like uint8_t. (De-serializing is another story though.)
However, the code is not portable. In case int is 32 bit, the enumeration constants will turn 32 bit too, as may the enum variable itself. Then the code will turn endianness-dependent and you might end up reading garbage. Checking TEST_ENUM_ITEM_MAX against UINT16_MAX doesn't solve this.
The proper way to serialize an enum is to use a pre-generated read-only look-up table which is guaranteed to be 8 bits, like this:
#include <stdint.h>
enum MY_ENUM {TEST_ENUM_ITEM1, TEST_ENUM_ITEM2, TEST_ENUM_ITEM_MAX};
static const uint8_t MY_ENUM8 [] =
{
[TEST_ENUM_ITEM1] = TEST_ENUM_ITEM1,
[TEST_ENUM_ITEM2] = TEST_ENUM_ITEM2,
};
int main (void)
{
_Static_assert(sizeof(MY_ENUM8)==TEST_ENUM_ITEM_MAX, "Something went wrong");
}
The designated initializer syntax improves the integrity of the data, should the enum be updated during maintenance. Similarly, the static assert will ensure that the list contains the right number of items.
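A different technique from the look-up table, shown here only as a sketch: copy the value into a fixed-width integer and emit the bytes with shifts, so that neither the size of the enum nor the host byte order matters. The helper names put_enum_le16 and get_enum_le16 are hypothetical, and the two-byte little-endian layout is an assumption taken from the question.

```c
#include <assert.h>
#include <stdint.h>

enum MY_ENUM {TEST_ENUM_ITEM1, TEST_ENUM_ITEM2, TEST_ENUM_ITEM_MAX};

/* Hypothetical helper: writes the value as two bytes, least significant first. */
static void put_enum_le16(uint8_t out[2], enum MY_ENUM e)
{
    assert((unsigned)e < (unsigned)TEST_ENUM_ITEM_MAX);
    uint16_t v = (uint16_t)e;
    out[0] = (uint8_t)(v & 0xFFu);
    out[1] = (uint8_t)(v >> 8);
}

/* Hypothetical helper: rebuilds the value from two little-endian bytes. */
static enum MY_ENUM get_enum_le16(const uint8_t in[2])
{
    uint16_t v = (uint16_t)(in[0] | ((unsigned)in[1] << 8));
    return (enum MY_ENUM)v;
}
```

The two output bytes can then be handed to PutIntoBuffer without casting the address of the enum variable.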

How to avoid pedantic warnings while using Hexadecimal in Enum?

I have an enum like this
typedef enum {
FIRST,
SECOND,
THIRD = 0X80000001,
FOURTH,
FIFTH,
} STATUS;
I am getting a pedantic warning since I am compiling my files with the option -Wpedantic:
warning: ISO C restricts enumerator values to range of 'int' [-Wpedantic]
I found that it occurs because the hex value 0X80000001, converted to an integer, exceeds the range of a signed integer. My purpose is to have continuous hex values as the status in the enum without this warning.
I cannot use the macros since this will defy the purpose of having the enums in the first place. What code change will avoid this warning?
Enumeration constants are guaranteed to have type (signed) int. Apparently your system uses 32 bit int, so a hex constant larger than 0x7FFFFFFF will not fit.
So the warning is not just "pedantic", it hints of a possibly severe bug. Note that -pedantic in GCC does not mean "be picky and give me unimportant warnings" but rather "ensure that my code actually follows the C standard".
It appears that you want to do a list of bit masks or hardware addresses, or some other hardware-related programming. enum is unsuitable for such tasks, because in hardware-related programming, you rarely ever want to use signed types, but always unsigned ones.
If you must have a safe and portable program, then there is no elegant way to do this. C is a language with a lot of flaws, the way enum is defined by the standard is one of them.
One work-around is to use some sort of "poor man's enum", such as:
typedef uint32_t STATUS;
#define THIRD 0X80000001
If you must also have the increased type safety of an enum, then you could possibly use a struct:
typedef struct
{
uint32_t value;
} STATUS;
Or alternatively, just declare an array of constants and use an enum to define the array index. Probably the cleanest solution but takes a little bit of extra overhead:
typedef enum {
FIRST,
SECOND,
THIRD,
FOURTH,
FIFTH,
STATUS_N
} STATUS;
const uint32_t STATUS_DATA [STATUS_N] =
{
0,
1,
0X80000001,
0X80000002,
0X80000003
};
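As a sketch of how the table might be used, the variant below adds designated initializers, a size check, and a small accessor. The function status_code is a hypothetical name, and _Static_assert assumes a C11 compiler.

```c
#include <assert.h>
#include <stdint.h>

typedef enum { FIRST, SECOND, THIRD, FOURTH, FIFTH, STATUS_N } STATUS;

static const uint32_t STATUS_DATA[] = {
    [FIRST]  = 0,
    [SECOND] = 1,
    [THIRD]  = 0x80000001u,
    [FOURTH] = 0x80000002u,
    [FIFTH]  = 0x80000003u,
};

/* Catches a table whose length no longer matches the number of enumerators. */
_Static_assert(sizeof STATUS_DATA / sizeof STATUS_DATA[0] == STATUS_N,
               "STATUS_DATA must have one entry per STATUS");

/* Hypothetical accessor: maps the small enum index to its 32-bit code. */
static uint32_t status_code(STATUS s)
{
    assert((unsigned)s < (unsigned)STATUS_N);
    return STATUS_DATA[s];
}
```

The enumerators stay small and within the range of int, so the pedantic warning goes away, while the unsigned 32-bit codes live in the table.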

C structure syntax

C header sample.
typedef LPVOID UKWD_USB_DEVICE;
typedef struct _UKWD_USB_DEVICE_INFO {
DWORD dwCount;
unsigned char Bus;
unsigned char Address;
unsigned long SessionId;
USB_DEVICE_DESCRIPTOR Descriptor;
} UKWD_USB_DEVICE_INFO, *PUKWD_USB_DEVICE_INFO, * LPUKWD_USB_DEVICE_INFO;
My Understanding
struct defines a structure (the part between {}). The structure's type is _UKWD_USB_DEVICE_INFO. After the closing } UKWD_USB_DEVICE_INFO is an alias to this structure.
Question
What is the purpose of the declarations after that, *PUKWD_USB_DEVICE_INFO and * LPUKWD_USB_DEVICE_INFO? Do these pointer aliases mean something different if one is touching the name and the other has a space between the * and the lettering?
C typedef declarations are understood by analogy with variable declarations.
int a, *b;
declares the variables a of type int and b of type int*.
typedef int A, *B;
declares the type A equivalent to int and the type B equivalent to int*. So, just think about what the variable type would be if this was a variable declaration.
So yes, PUKWD_USB_DEVICE_INFO becomes equivalent to struct _UKWD_USB_DEVICE_INFO*.
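A small sketch of the same pattern that compiles without the Windows headers. DEVICE_INFO, its tag DEVICE_INFO_TAG, and the functions set_bus and get_bus are stand-ins invented for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the Windows struct so the sketch compiles anywhere. */
typedef struct DEVICE_INFO_TAG {
    unsigned char Bus;
    unsigned char Address;
} DEVICE_INFO, *PDEVICE_INFO, * LPDEVICE_INFO;

/* PDEVICE_INFO is the same type as struct DEVICE_INFO_TAG *. */
static void set_bus(PDEVICE_INFO info, unsigned char bus)
{
    assert(info != NULL);
    info->Bus = bus;
}

/* LPDEVICE_INFO is that same pointer type again; the space after * changes nothing. */
static unsigned char get_bus(LPDEVICE_INFO info)
{
    assert(info != NULL);
    return info->Bus;
}
```

A DEVICE_INFO object can be passed as &obj to either function, and a PDEVICE_INFO value can be assigned to an LPDEVICE_INFO variable without a cast, because both names alias one pointer type.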
EDIT
Also, the space does not matter. C is not whitespace-sensitive. The extra aliases are not necessary; they are just there to fit the conventions of various projects and APIs that like to call pointer types by names that include P or other substrings. Sometimes these projects end up with multiple conventions over time, so there are multiple aliases. They can also be needed for compatibility reasons when APIs get updated, or between different platforms.
Are these pointer aliases?
Yes.
Does it mean anything if one is touching the variable and the other has a space between the * and lettering?
No. In C, spaces between tokens have no meaning to the compiler. They merely change the readability for people looking at the code.
I have seen very few code examples online use more than one name after the close of the curly brackets. Any insight on this?
Typically, and in this case in particular, it's done to allow symbol names that may represent different types, but also may not.
You're seeing that on your architecture, a P "pointer" and a LP "long pointer" happen to be the same type.
On a 16-bit architecture, you would be looking at a different header and those types would be different.
This style of definition is common on the Windows platform. In the days of 16 bit segmented architectures, each structure definition typedef had also 2 pointer typedefs for near and far (aka long pointers):
typedef LPVOID UKWD_USB_DEVICE;
typedef struct _UKWD_USB_DEVICE_INFO {
DWORD dwCount;
unsigned char Bus;
unsigned char Address;
unsigned long SessionId;
USB_DEVICE_DESCRIPTOR Descriptor;
} UKWD_USB_DEVICE_INFO, NEAR * PUKWD_USB_DEVICE_INFO, FAR * LPUKWD_USB_DEVICE_INFO;
NEAR pointers were 16 bit wide and FAR pointers were 32 bit wide. Most Windows APIs took FAR pointers, and their prototypes used the pointer typedefs. Incidentally, LPVOID was defined this way:
typedef void FAR *LPVOID;
32 bit Windows came out in 1995 and made this obsolete. NEAR and FAR keywords were kept for a while, defined as empty, for compatibility reasons.
Compatibility with 16 bit Windows has long been irrelevant. The FAR and NEAR keywords were removed, but the pointer typedefs are still in use, so the convention lingers.
The space between * and PUKWD_USB_DEVICE_INFO is ignored, but I agree with you it is rather confusing to put one there.
Yes, it's a pointer alias; you can then use PUKWD_USB_DEVICE_INFO as UKWD_USB_DEVICE_INFO*. Most Windows structs do this (WNDCLASS, for example).
That L in the third alias stands for long (pointer), and unless I'm much mistaken, it has no meaning in 32/64 bit code - it's likely a leftover from 16 bit stuff, as is the case with that WNDCLASS definition.

What's the benefit of encapsulating only one basic field into a struct in C?

I saw some C code like this:
// A:
typedef uint32_t in_addr_t;
struct in_addr { in_addr_t s_addr; };
And I always prefer like this:
// B:
typedef uint32_t in_addr;
So my question is: what's the difference / benefit of doing it in A from B?
It's a layer to introduce type safety, and it can be helpful 'for future expansion'.
One problem with the former is that it's easy to 'convert' a value of a type represented by a typedefed builtin to any of several other types or typedefed builtins.
consider:
typedef int t_millisecond;
typedef int t_second;
typedef int t_degrees;
versus:
// field notation could vary greatly here:
struct t_millisecond { int ms; };
struct t_second { int s; };
struct t_degrees { int f; };
In some cases, it makes it a little clearer to use a notation, and the compiler will also forbid erroneous conversions. Consider:
int a = millisecond * second - degree;
This is a suspicious program. Using typedefed ints, it is a valid program. Using structs, it is ill-formed: compiler errors will force you to correct it, and you can make your intent explicit.
Using typedefs, arbitrary arithmetic and conversions may be applied, and the values may be assigned to each other without warning, which can become a burden to maintain.
Consider also:
t_second s = millisecond;
That conversion mixes units as well: with typedefs it compiles silently, with structs it is rejected.
It's just another tool in the toolbox -- use at your discretion.
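A minimal sketch of how the struct version makes conversions explicit. The helper names ms_from_s and timeout_ticks are invented for this example.

```c
#include <assert.h>

struct t_millisecond { int ms; };
struct t_second      { int s; };

/* Hypothetical conversion helper: the one explicit route from seconds to ms. */
static struct t_millisecond ms_from_s(struct t_second sec)
{
    struct t_millisecond out = { sec.s * 1000 };
    return out;
}

/* Accepts only milliseconds; passing a struct t_second does not compile. */
static int timeout_ticks(struct t_millisecond timeout, int ms_per_tick)
{
    assert(ms_per_tick > 0);
    return timeout.ms / ms_per_tick;
}
```

Calling timeout_ticks with a struct t_second argument is a type error, so the caller has to go through ms_from_s and the factor of 1000 cannot be forgotten.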
Justin's answer is essentially correct, but I think some expansion is needed:
EDIT: Justin expanded his answer significantly, which makes this one somewhat redundant.
Type safety - you want to provide your users with API functions which manipulate the data, not let them treat it as just an integer. Hiding the field in a structure makes it harder to use it the wrong way, and pushes the user towards the proper API.
For future expansion - perhaps a future implementation would like to change things. Maybe add a field, or break the existing field into 4 chars. With a struct, this can be done without changing APIs.
What's your benefit? That your code won't break if implementation changes.

Enforce strong type checking in C (type strictness for typedefs)

Is there a way to enforce an explicit cast for typedefs of the same type? I have to deal with utf8 and sometimes I get confused between the indices for the character count and the byte count. So it would be nice to have some typedefs:
typedef unsigned int char_idx_t;
typedef unsigned int byte_idx_t;
With the addition that you need an explicit cast between them:
char_idx_t a = 0;
byte_idx_t b;
b = a; // compile warning
b = (byte_idx_t) a; // ok
I know that such a feature doesn't exist in C, but maybe you know a trick or a compiler extension (preferable gcc) that does that.
EDIT
I still don't really like the Hungarian notation in general. I couldn't use it for this problem because of project coding conventions, but I used it now in another similar case, where also the types are the same and the meanings are very similar. And I have to admit: it helps. I never would go and declare every integer with a starting "i", but as in Joel's example for overlapping types, it can be life saving.
For "handle" types (opaque pointers), Microsoft uses the trick of declaring structures and then typedef'ing a pointer to the structure:
#define DECLARE_HANDLE(name) struct name##__ { int unused; }; \
typedef struct name##__ *name
Then instead of
typedef void* FOOHANDLE;
typedef void* BARHANDLE;
They do:
DECLARE_HANDLE(FOOHANDLE);
DECLARE_HANDLE(BARHANDLE);
So now, this works:
FOOHANDLE make_foo();
BARHANDLE make_bar();
void do_bar(BARHANDLE);
FOOHANDLE foo = make_foo(); /* ok */
BARHANDLE bar = foo; /* won't work! */
do_bar(foo); /* won't work! */
You could do something like:
typedef struct {
unsigned int c_idx;
} char_idx;
typedef struct {
unsigned int b_idx;
} byte_idx;
Then you would see when you are using each:
char_idx a;
byte_idx b;
b.b_idx = a.c_idx;
Now it is more clear that they are different types but would still compile.
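With the two wrapper structs in place, the conversion between the index kinds can live in one function. The sketch below assumes a valid, NUL-terminated UTF-8 string; byte_from_char is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { unsigned int c_idx; } char_idx;
typedef struct { unsigned int b_idx; } byte_idx;

/* Hypothetical helper: byte offset of the c-th code point in a valid,
   NUL-terminated UTF-8 string. Continuation bytes look like 10xxxxxx.
   If c is past the end, the offset of the terminating NUL is returned. */
static byte_idx byte_from_char(const char *utf8, char_idx c)
{
    assert(utf8 != NULL);
    byte_idx b = { 0 };
    unsigned int seen = 0;   /* code points fully skipped so far */
    while (utf8[b.b_idx] != '\0' && seen < c.c_idx) {
        b.b_idx++;
        /* skip the continuation bytes of the current code point */
        while (((unsigned char)utf8[b.b_idx] & 0xC0u) == 0x80u)
            b.b_idx++;
        seen++;
    }
    return b;
}
```

Because the parameter and return types differ, passing a byte_idx where a char_idx is expected is a compile error rather than a silent bug.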
What you want is called "strong typedef" or "strict typedef".
Some programming languages [Rust, D, Haskell, Ada, ...] give some support for this at language level, C[++] does not. There was a proposal to include it into the language with the name "opaque typedef", but was not accepted.
The lack of language support is really not a problem though. Just wrap the type to be aliased into a new class having exactly 1 data member, of type T. Much of the repetition can be factored out by templates and macros. This simple technique is just as convenient as in the programming languages with direct support.
Use a lint. See Splint:Types and strong type check.
Strong type checking often reveals programming errors. Splint can check primitive C types more strictly and flexibly than typical compilers (4.1) and provides support for a Boolean type (4.2). In addition, users can define abstract types that provide information hiding (0).
In C, the only distinction between user-defined types that is enforced by the compiler is the distinction between structs. Any typedef involving distinct structs will work. Your major design question is should different struct types use the same member names? If so, you can simulate some polymorphic code using macros and other scurvy tricks. If not, you are really committed to two different representations. E.g., do you want to be able to
#define INCREMENT(s, k) ((s).n += (k))
and use INCREMENT on both byte_idx and char_idx? Then name the fields identically.
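A short sketch of that choice, assuming both structs name their field n. The helpers next_char and skip_bytes are hypothetical.

```c
#include <assert.h>
#include <limits.h>

/* Two distinct struct types that deliberately share the member name n. */
typedef struct { unsigned int n; } char_idx;
typedef struct { unsigned int n; } byte_idx;

#define INCREMENT(s, k) ((s).n += (k))

/* The same macro works on either type because the member name matches. */
static char_idx next_char(char_idx c)
{
    assert(c.n < UINT_MAX);
    INCREMENT(c, 1);
    return c;
}

static byte_idx skip_bytes(byte_idx b, unsigned int k)
{
    assert(k <= UINT_MAX - b.n);
    INCREMENT(b, k);
    return b;
}
```

Assigning a char_idx to a byte_idx is still rejected by the compiler; only the macro treats the two types alike.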
You asked about extensions. Jeff Foster's CQual is very nice, and I think it could do the job you want.
With C++11 you can use an enum class, e.g.
enum class char_idx_t : unsigned int {};
enum class byte_idx_t : unsigned int {};
The compiler will enforce an explicit cast between the two types; it is like a thin wrapper class. Unfortunately you won't have operator overloading, e.g. if you want to add two char_idx_t together you will have to cast them to unsigned int.
If you were writing C++, you could make two identically defined classes with different names that were wrappers around an unsigned int. I don't know of a trick to do what you want in C.
Use strong typedef as defined in BOOST_STRONG_TYPEDEF

Resources