Moving to preprocessor to avoid indirection - c

I have an array of structure:
typedef struct s_values{
int field1;
int field2;
int field3;
}t_values;
t_values values[5];
So, there are 5 types and each types has three fields.
The value for a particular type and field is obtained using values[type].field
I want to move away from this structure and instead use constant macros.
The goal is to have a macro
#define VALUE(type, field)
:- where type is an enum and field is just the field name
How do I go about doing that?
I was thinking something like:
#define VALUE2(type, field) type##field
#define VALUE(type, field) VALUE2(type, field)
#define type1field1 7
#define type2field2 67
....
But type is actually an enum..
Also, I am not sure whether using ## defeats the purpose of avoiding indirection.
Does anyone have a better idea, or suggestions to improve the direction I was going in?

If type is a constant value anytime you use the construct values[type].field (the only case where you can hope to replace that construct with a “constant macro”), then the compiler will access it directly. In addition, if you marked the array values as const, it will know to replace values[type].field by the value of the member field of values[type], as if you had written a constant expression. Any reasonable optimizing compiler will do this for free, you don't need to pollute the source code for this. Any reasonable C compiler should hard-code the value 67 in the code of f here, as GCC does:
typedef struct s_values{
int field1;
int field2;
int field3;
}t_values;
const t_values values[5] = {7, 67};
int f(void) {
return values[0].field2;
}
If type is not a constant value when you use values[type].field then it cannot be replaced by a “constant macro”. The mapping from type and field to a value has to be stored somewhere, and it takes an indirection to access it where it is stored. What the compiler will do for free in this case is to add the offset corresponding to field to the address where the array values is stored, so that the address to access is computed with only one multiplication and one addition (instead of two) at run-time. Again, this is optimal.
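A minimal sketch of the const-table approach described above, assuming int fields. The enumerator names (TYPE_A and so on), the table contents beyond 7 and 67, and the two helper functions are hypothetical and only serve the illustration. VALUE is just a thin wrapper over the array access; with optimization enabled, a compiler such as GCC is expected to fold the constant-index case into a literal, while the run-time case stays a single indexed load.

```c
#include <assert.h>

/* Hypothetical type tags; the question does not name them. */
enum value_type { TYPE_A, TYPE_B, TYPE_C, TYPE_D, TYPE_E, TYPE_COUNT };

typedef struct s_values {
    int field1;
    int field2;
    int field3;
} t_values;

/* const lets the optimizer substitute the stored values directly. */
static const t_values values[TYPE_COUNT] = {
    [TYPE_A] = { 7, 67, 0 },
    [TYPE_B] = { 1, 2, 3 },
};

/* 'field' is used as a member name, 'type' is any valid index. */
#define VALUE(type, field) (values[(type)].field)

/* Constant index: an optimizing compiler can reduce this to "return 67". */
static int value_a_field2(void)
{
    return VALUE(TYPE_A, field2);
}

/* Run-time index: one indexed load, which cannot be avoided. */
static int field3_of(enum value_type type)
{
    assert(type < TYPE_COUNT);
    return VALUE(type, field3);
}
```

The macro adds no indirection of its own, so it costs nothing compared with writing values[type].field by hand.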

Related

Partial bytewise access to C enum

Setting:
I define an enum in C99:
enum MY_ENUM {TEST_ENUM_ITEM1, TEST_ENUM_ITEM2, TEST_ENUM_ITEM_MAX};
I ensure with compile time asserts that TEST_ENUM_ITEM_MAX does not exceed UINT16_MAX. I assume little endian as byte order.
I have a serialize-into-buffer function with following parameters:
PutIntoBuffer(uint8_t* src, uint32_t count);
I serialize a variable holding an enum value into a buffer. For this task I access the variable holding the enum like this:
enum MY_ENUM testVar = TEST_ENUM_ITEM1;
PutIntoBuffer((uint8_t*) &testVar, sizeof(uint16_t));
Question: Is it legitimate to access the enum (which is an int) in this way? Does C standard guarantee the intended behaviour?
It is legitimate as in "it will work if int is 16 bits". It does not violate any pointer aliasing rules either, as long as you use a character type like uint8_t. (De-serializing is another story though.)
However, the code is not portable. If int is 32 bits, the enumeration constants will be 32 bits too, as may the enum variable itself. Then the code becomes endianness-dependent and you might end up reading garbage. Checking TEST_ENUM_ITEM_MAX against UINT16_MAX doesn't solve this.
The proper way to serialize an enum is to use a pre-generated read-only look-up table which is guaranteed to be 8 bits, like this:
#include <stdint.h>
enum MY_ENUM {TEST_ENUM_ITEM1, TEST_ENUM_ITEM2, TEST_ENUM_ITEM_MAX};
static const uint8_t MY_ENUM8 [] =
{
[TEST_ENUM_ITEM1] = TEST_ENUM_ITEM1,
[TEST_ENUM_ITEM2] = TEST_ENUM_ITEM2,
};
int main (void)
{
_Static_assert(sizeof(MY_ENUM8)==TEST_ENUM_ITEM_MAX, "Something went wrong");
}
The designated initializer syntax improves the integrity of the data, should the enum be updated during maintenance. Similarly, the static assert will ensure that the list contains the right number of items.
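As a complement to the lookup table, here is a sketch of a different technique: writing the bytes explicitly with shifts, so the result does not depend on sizeof(int) or on the host byte order. The helper names put_enum_le16 and get_enum_le16 are hypothetical and not part of the question's PutIntoBuffer interface.

```c
#include <assert.h>
#include <stdint.h>

enum MY_ENUM { TEST_ENUM_ITEM1, TEST_ENUM_ITEM2, TEST_ENUM_ITEM_MAX };

_Static_assert(TEST_ENUM_ITEM_MAX <= UINT16_MAX, "enum must fit in 16 bits");

/* Hypothetical helper: stores 'value' as two bytes, least significant
   first, regardless of the width of the enum type or the host byte order. */
static void put_enum_le16(enum MY_ENUM value, uint8_t out[2])
{
    assert(value < TEST_ENUM_ITEM_MAX);
    uint16_t v = (uint16_t)value;
    out[0] = (uint8_t)(v & 0xFFu);
    out[1] = (uint8_t)(v >> 8);
}

/* Inverse of put_enum_le16: rebuilds the value from two bytes. */
static enum MY_ENUM get_enum_le16(const uint8_t in[2])
{
    uint16_t v = (uint16_t)(in[0] | ((uint16_t)in[1] << 8));
    return (enum MY_ENUM)v;
}
```

Because only arithmetic on values is used, no pointer cast to the enum object is needed in either direction.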

where and how are the values in each of the members of enum stored in C?

The way structures allocate memory is that :-
struct mys
{
int a , b, c ;
};
As we can see in structs, when I declare a struct variable, say struct mys var1, var1 takes the sum of the sizes of all the basic data types inside it (12 bytes here, assuming int is 4 bytes).
printf("%zu", sizeof(var1));
The output is 12.
In enum, we have,
enum myvar{ id1 , id2 , id3 }; and whenever I declare an enum variable and print its size, it only prints the size of integer(4 bytes).
And id1, id2, id3 would get the values 0 to 2 consecutively. So I thought it's analogous to generators in Python (actual allocation of memory is just 2 bytes) and it adds 1 to each of the consecutive members of the enum type when we access it.
But, what confused me about enum is that, even though we define an enum like this :-
enum myvar{ id1 = 20 , id2 =42, id3=1 };
If I declare an enum variable enum myvar var1 , then var1 would still take 4 bytes of memory. Where are the values that I have given in the definition getting stored? Since I had given random values for the members of the enum, I thought it would allocate 6 bytes of memory since it's not the usual 0 to 2 default integers anymore after assigning them. So clearly I'm wrong. What is the reason behind this? If the size of enum is just the word length , how does it manage the memory allocation . Clear explanations please ......
An enum type in C is basically a logical grouping of named int constants. An enum variable is only guaranteed to hold a single int value, regardless of how many named constants the type contains. And there isn't a way to "enumerate" all possible enum values in C. If you are using a compiler and architecture where int is 32-bit, all values of this enum (named constants) will have to fit inside these 4 bytes.
Additionally, the variable doesn't even have to hold any of the constants defined in the enum type at runtime, it will behave as a plain int variable. Some compilers won't even throw a warning if you mix different enum types, at least not until you enable additional warnings.
So, actual values won't be stored anywhere after compilation (unless they are used), just as a #define macro won't be stored anywhere unless you reference it somehow.
So, when you write this:
// define the enum
enum my_enum {
id1 = 20,
id2 = 42,
id3 = 1
};
// declare the variable 'my_val' and assign 'id2'
enum my_enum my_val = id2;
It will be almost equivalent to:
#define id1 20
#define id2 42
#define id3 1
int my_val = id2;
And the compiler will behave as if you had simply written:
int my_val = 42;
and throw everything else away.
So, if you are asking where 42 is stored, then the answer is, somewhere between the instructions inside the code section of your executable. If you didn't use id1 and id3 anywhere, they won't exist anywhere.
If you need to store a list of values in C, you'll either need to use an array of integers, an array of structs, or a more elaborate data structure (linked list, hash table, a tree, or whatever suits your use case).
You use enum to define a set of values allowed to be used in a variable of that type; as only one of them at a time can be in there, the size is what would be needed to hold that, regardless of how many possibilities there are.
The identifiers associated with the values of an enumeration only exist at translation time. They are replaced by their numerical values during compilation, so they are not held anywhere as objects in storage space.
Variables later declared as having the enumeration type you defined there will have an appropriate integer type, chosen by the compiler. Thus, you can assign to these objects integer values that were not previously named in the enumeration.
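The points made in these answers can be checked with a short sketch. The array scratch and the three helper functions are hypothetical names introduced for illustration. Enumerators behave as integer constants known at compile time, and an enum object is a single integer that can also hold a value no enumerator names.

```c
#include <stddef.h>

enum my_enum { id1 = 20, id2 = 42, id3 = 1 };

/* Enumerators are integer constant expressions, so they can size arrays
   and label cases; no object holds 20, 42 or 1 on their behalf. */
static int scratch[id2];

static size_t scratch_length(void)
{
    return sizeof scratch / sizeof scratch[0];
}

/* Maps a value back to its enumerator name, if it has one. */
static const char *name_of(enum my_enum v)
{
    switch (v) {
    case id1: return "id1";
    case id2: return "id2";
    case id3: return "id3";
    default:  return "(unnamed)";
    }
}

/* An enum object is one integer; it can store an in-range value
   that none of the enumerators names. */
static int unnamed_value(void)
{
    enum my_enum v = (enum my_enum)7;
    return (int)v;
}
```

Note that the name strings exist only because name_of spells them out; the compiler keeps no such table on its own.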

Forcing enum field to 8 bits in standard C with bitfields

I saw a few questions here about how to force an enum to 8 or 16 bits. The common answer was that it can be done in C++11 or higher. Unfortunately, I don't have that luxury.
So, here is what I'm thinking. I only need the enum to be 8 bits when it's in a struct whose size I want to minimize. So:
Option A:
typedef enum { A, B, C, MAX = 0xFF } my_enum;
struct my_compact_struct
{
my_enum field1 : 8; // Forcing field to be 8 bits
uint8_t something;
uint16_t something_else;
};
I think most or all optimizers should be smart enough to handle the 8 bitfield efficiently.
Option B:
Not use an enum. Use typedef and constants instead.
typedef uint8_t my_type;
static const my_type A = 0;
static const my_type B = 1;
static const my_type C = 2;
struct my_compact_struct
{
my_type field1;
uint8_t something;
uint16_t something_else;
};
Option A is currently implemented and seems to be working, but since I want to do (now and in the future) what's correct now and not just what's working, I was wondering if option B is clearly better.
Thanks,
If your specific values in an enum can fit into a smaller type than an int, then a C implementation is free to choose the underlying type of the enum to be a smaller type than an int (but the type of the enum constants in this case will be int). But there is no way you can force a C compiler to use a type smaller than an int. So with this in mind and the fact that an int is at least 16 bits, you're out of luck.
But enums in C are little more than debugging aids. Just use a uint8_t type if your compiler has it:
static const uint8_t something = /*some value*/;
If not then use a char and hope that CHAR_BIT is 8.
Option B would be best. You'd be defining the type in question to be a known size, and the const values you define will also be the correct size.
While you would lose out on the implicit numbering of an enum, the explicit sizing of the field and its values makes up for it.
I ran into the same problem and solved it by using __attribute__((packed)), a GCC/Clang extension. With packed the data fits in 4 bytes: with packed (int)sizeof(my_compact_struct) = 4, without packed (int)sizeof(my_compact_struct) = 8
typedef enum __attribute__((packed)){
A = 0x01,
B = 0x10,
C = 0xFF // Max value, or any other lower than this
} my_enum;
struct __attribute__((packed)) my_compact_struct {
my_enum field1 : 8; // Forcing field to be 8 bits
uint8_t something;
uint16_t something_else;
};
You can use enum, or even typedef enum, like a grouped #define. Do not actually define a structure member, data stream, or global variable with the enum type. Instead, define all storage members as fixed-width types like uint8_t and, just as you would with #define, set them to the enum constants. If you use any kind of lint tool, this design style will raise some messages which you will need to tailor. Just as with a malformed #define, bad things can happen if the literal value doesn't fit, so either way you need to pay attention. In a debugger or hardware simulator, the enum can provide useful display reference information. Temporary variables are an exception to how global definitions are treated: for function parameters or automatic variables, and only then, define them with the enum type. In this context int is going to be the most efficient word size as well as the standard behavior of enum. There is no error possible nor hyper-optimizing you can do.
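The style recommended in these answers can be sketched as follows, in standard C without bit-fields or compiler extensions. MY_ENUM_COUNT, set_field1 and get_field1 are hypothetical names added for the example; the struct layout is the one from the question. The enum supplies the named constants, storage uses uint8_t, and the enum type appears only in parameters and return values.

```c
#include <assert.h>
#include <stdint.h>

/* Named constants only; the enum type is not used for storage. */
typedef enum { A, B, C, MY_ENUM_COUNT } my_enum;

_Static_assert(MY_ENUM_COUNT <= UINT8_MAX, "values must fit in uint8_t");

struct my_compact_struct {
    uint8_t  field1;          /* holds a my_enum value in exactly one byte */
    uint8_t  something;
    uint16_t something_else;
};

/* Parameters and locals may use the enum type; narrow only when storing. */
static void set_field1(struct my_compact_struct *s, my_enum value)
{
    assert(value < MY_ENUM_COUNT);
    s->field1 = (uint8_t)value;
}

/* Widen back to the enum type when reading. */
static my_enum get_field1(const struct my_compact_struct *s)
{
    return (my_enum)s->field1;
}
```

On common ABIs this struct occupies 4 bytes, though the C standard itself does not promise the absence of padding.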

The importance of c enumeration (typedef enum) [duplicate]

I recently saw this in an answer that was posted for me:
typedef enum
{
NO_OP,
ADDITION,
} operator_t;
int main()
{
operator_t operator = NO_OP;
}
What is typedef enum and why should we use it? I googled and found the following:
http://www.programiz.com/c-programming/c-enumeration
Right now it sounds slightly too technical for me so I don't think I understand what is going on or why anyone would use that.
Bonus (optional): What type of variable is the operator_t?
It's definitely not "too technical".
"typedef" and "enum" are two completely different things.
The basic reason to have "enums" is to avoid "magic numbers":
Let's say you have three "states": STOP, CAUTION and GO. How do you represent them in your program?
One way is to use the string literals "STOP", "CAUTION" and "GO". But that has a lot of problems - including the fact that you can't use them in a C "switch/case" block.
Another way is to "map" them to the integer values "0", "1" and "2". This has a lot of benefits. But seeing "STOP" in your code is a lot more meaningful than seeing a "0". Using "0" in your code like that is an example of a "magic number". Magic numbers are Bad: you want to use a "meaningful name" instead.
Before enums were introduced in the language, C programmers used macros:
#define STOP 0
#define CAUTION 1
#define GO 2
A better, cleaner approach in modern C/C++ is to use an enum instead:
enum traffic_light_states {
STOP,
CAUTION,
GO
};
Using a "typedef" just simplifies declaring a variable of this type:
typedef enum {
STOP,
CAUTION,
GO
} traffic_light_states_t ;
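To illustrate the switch/case point, here is a small sketch using the typedef above. The function next_state and its cycle order are hypothetical; they are only meant to show the enumerators serving as case labels, which string literals cannot do.

```c
typedef enum {
    STOP,
    CAUTION,
    GO
} traffic_light_states_t;

/* Hypothetical helper: advances the light GO -> CAUTION -> STOP -> GO. */
static traffic_light_states_t next_state(traffic_light_states_t s)
{
    switch (s) {
    case GO:      return CAUTION;
    case CAUTION: return STOP;
    case STOP:    return GO;
    }
    return STOP; /* fail safe for values outside the enum */
}
```

Leaving out a default label lets compilers that support it (for example GCC with -Wswitch) warn when a new enumerator is added but not handled.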
typedef is used to define an alternative name for an existing type. The enum could have been declared like this:
enum operator_t
{
NO_OP,
ADDITION,
};
and then you could declare a variable of this type like so:
enum operator_t x = NO_OP;
This is kind of verbose so you would use typedef to define a shorter alias for this type:
typedef enum operator_t operator_t;
This defines operator_t to mean the type enum operator_t allowing you to initialize a variable like so:
operator_t x = NO_OP;
This syntax:
typedef enum
{
NO_OP,
ADDITION,
} operator_t;
does the whole process in one step, so it defines an (untagged or tagless) enum type and gives it the alias operator_t.
Bonus: operator_t is an enum data type; read more about it here: https://en.wikipedia.org/wiki/Enumerated_type
What is typedef enum and why should we use it?
There are two different things going on there: a typedef and an enumerated type (an "enum"). A typedef is a mechanism for declaring an alternative name for a type. An enumerated type is an integer type with an associated set of symbolic constants representing the valid values of that type.
Taking the enum first, the full form of an enum declaration consists of the enum keyword, followed by a tag by which that particular enum will be identified, followed by the symbolic enum constants in curly brackets. By default, the enum constants correspond to consecutive integer values, starting at zero. For example:
enum operator {
NO_OP,
ADDITION
};
As you can see, it has some similarities to a struct declaration, and like a struct declaration, variables of that enumerated type can be declared in the same statement:
enum operator {
NO_OP,
ADDITION
} op1, op2, op3;
or they can be declared later, by referencing the enum's tag:
enum operator op4, op5;
Also like a struct declaration, the tag can be omitted, in which case the enumerated type cannot be referenced elsewhere in the source code (but any declared variables of that type are still fine):
enum {
NO_OP,
ADDITION
} op1, op2, op3;
Now we get to the typedef. As I already wrote, a typedef is a means to declare an alternative name for a type. It works by putting the typedef keyword in front of something that would otherwise be a variable declaration; the symbol that would have been the variable name is then the alternative name for the type. For instance this ...
typedef unsigned long long int ull_t;
declares ull_t to be an alternative name for type unsigned long long int. The two type names can thereafter be used interchangeably (within the scope of the typedef declaration).
In your case, you have
typedef enum
{
NO_OP,
ADDITION,
} operator_t;
which declares operator_t as an alias for the tagless enumerated type given. Declaring a typedef in this way makes the enum usable elsewhere, via the typedef name, even though the enum is tagless. This is a fairly common mechanism for declaring a shorthand name for an enumerated type, and an analogous technique is common for structs, too.
Bonus (optional): What type of variable is the operator_t?
As I explained, the operator_t is not a variable, it is a type. In particular, it is an enumerated type, and the symbols NO_OP and ADDITION represent values of that type.
Typedefs for enums, structs and unions are a complete waste. They hide important information for the questionable benefit of saving a few characters to type.
Don't use them in new code.
Technically, a typedef introduces an alias, i.e. a new name for something that already exists. This means typedefs are not new types. The type system will treat them just like the aliased type.
The downvoters may please educate themselves by, for example, reading the wonderful Peter van der Linden book Expert C Programming where the case against typedefs for enum/struct/union is made.
typedef and enum are two different concepts. You can rewrite the code like this:
enum operator
{
NO_OP,
ADDITION
};
typedef enum operator operator_t;
The first statement declares an enumeration called operator, with two values. The second statement declares that the enumeration operator is now also to be known as the type operator_t.
The syntax allows these two statements to be combined into one:
typedef enum operator
{
NO_OP,
ADDITION,
} operator_t;
And finally to omit a name for the enumeration, as there is a datatype for it anyway:
typedef enum
{
NO_OP,
ADDITION,
} operator_t;
Wikipedia has a good discussion of what a typedef is
typedef is a keyword in the C and C++ programming languages. The purpose of typedef is to form complex types from more-basic machine types and assign simpler names to such combinations. They are most often used when a standard definition or declaration is cumbersome, potentially confusing, or likely to vary from one implementation to another.
See this page for a detailed discussion of Typedef in Wikipedia
Enumerated Types allow us to create our own symbolic names for a list of related ideas.
Given the example you gave I'm guessing you can use enum to select which arithmetic operation to use for a particular set of variables.
The following example code should give you a good idea on what enum is useful for.
enum ARITHMETIC_OPERATION {ADD, SUBTRACT, MULTIPLY};
int do_arithmetic_operation(int a, int b, enum ARITHMETIC_OPERATION operation){
if(operation == ADD)
return a+b;
if(operation == SUBTRACT)
return a-b;
if(operation == MULTIPLY)
return a*b;
return 0; /* unknown operation */
}
If you didn't have enum, you would do something like this instead:
#define ADD 0
#define SUBTRACT 1
#define MULTIPLY 2
int do_arithmetic_operation(int a, int b, int operation);
This alternative is less readable, because operation is not really an integer but a symbolic type that represents an arithmetic operation that is either ADD, MULTIPLY, or SUBTRACT.
The following links provide good discussions and sample code that uses Enum.
http://www.cs.utah.edu/~germain/PPS/Topics/C_Language/enumerated_types.html
http://www.cprogramming.com/tutorial/enum.html
http://cplus.about.com/od/introductiontoprogramming/p/enumeration.htm

How can I access structure fields by name at run time?

The C FAQ explains it in a way; here is the link.
But I can't understand it. Can somebody explain it to me, or give me another way?
Thanks so much!
I think this example makes the answer clear:
#include <stddef.h> /* offsetof, size_t */
struct test
{
int b;
int a;
};
int main(void)
{
struct test t;
struct test *structp = &t;
//Find the byte offset of 'a' within the structure
size_t offsetf = offsetof(struct test, a);
//Set the value of 'a' using pointer arithmetic
*(int *)((char *)structp + offsetf) = 5;
return 0;
}
You can't, not without implementing some kind of name lookup yourself.
C doesn't have any kind of name information left when the program is running.
Supporting this generally for different struct field types is complicated.
If you have your binary compiled with debug information, you can use it to lookup names at runtime. For example gcc (typically) produces debug info in DWARF format, and you can use libdwarf to process it.
In case of DWARF you can find your field in DW_TAG_member node, DW_AT_data_member_location attribute will give you the field's offset, same as you get from offsetof() at compile time.
If a structure is defined using a struct {...} definition, it is unlikely that there will be any information in the executable code related to member names. Some platforms build "debug" information into generated executable files, and there may be some means by which a running program could retrieve that information, but there's no common way to do such things.
What one may be able to do, however, is use macros to define a structure. For example, one could define:
#define MAKE_ACME_STRUCT \
FIELD(id,int,23) \
X FIELD(name,char30,"Untitled") \
X FIELD(info,int,19) \
// LEAVE THIS COMMENT HERE
and then invoke the MAKE_ACME_STRUCT macro various times, with the FIELD and X macros defined different ways, so that it would expand either to a struct statement, or an initialization expression for a "default" instance of that struct, or as an initialization expression for an array of items describing the struct fields [e.g. something like
STRUCT_INFO acme_struct_info[] = {
{"id", STRUCT_INFO_TYPE_int, sizeof(((ACME_STRUCT *)0)->id), offsetof(ACME_STRUCT, id)}
,{"name", STRUCT_INFO_TYPE_char30, sizeof(((ACME_STRUCT *)0)->name), offsetof(ACME_STRUCT, name)}
,{"info", STRUCT_INFO_TYPE_int, sizeof(((ACME_STRUCT *)0)->info), offsetof(ACME_STRUCT, info)}
,{0}};
It would be necessary that all types used within the struct have single-token names, and that for each such name, an identifier STRUCT_INFO_TYPE_nameGoesHere be defined which identifies the type to a run-time library in some form that it understands.
Such macros are hardly beautiful, but they have the advantage of ensuring that all the things they're used to define remain in sync [e.g. ensuring that adding or removing an element of acme_struct will cause it to be added or removed from the list of struct members stored in acme_struct_info].
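A compilable sketch of this macro technique follows, simplified from the description above. All names here (ACME_FIELDS, struct acme, struct field_info, find_field, set_int_field) are hypothetical, the default values are dropped, and the size comparison is only a crude stand-in for the STRUCT_INFO_TYPE_ tags the answer proposes. One field list is expanded twice: once into the struct definition and once into a table that a run-time lookup by name can search.

```c
#include <stddef.h>
#include <string.h>

/* Single source of truth: one FIELD(name, type) entry per member. */
#define ACME_FIELDS \
    FIELD(id,   int)    \
    FIELD(info, int)    \
    FIELD(rate, double)

/* Expansion 1: the struct definition itself. */
#define FIELD(name, type) type name;
struct acme { ACME_FIELDS };
#undef FIELD

struct field_info {
    const char *name;
    size_t      size;
    size_t      offset;
};

/* Expansion 2: a table describing every member, terminated by {0}. */
#define FIELD(name, type) { #name, sizeof(type), offsetof(struct acme, name) },
static const struct field_info acme_fields[] = { ACME_FIELDS { 0 } };
#undef FIELD

/* Linear search by member name; returns NULL when the name is unknown. */
static const struct field_info *find_field(const char *name)
{
    for (const struct field_info *f = acme_fields; f->name != NULL; ++f)
        if (strcmp(f->name, name) == 0)
            return f;
    return NULL;
}

/* Sets an int member by name; returns 0 on success, -1 otherwise.
   Comparing sizes is a weak check, not a real type test. */
static int set_int_field(struct acme *s, const char *name, int value)
{
    const struct field_info *f = find_field(name);
    if (f == NULL || f->size != sizeof(int))
        return -1;
    memcpy((char *)s + f->offset, &value, sizeof value);
    return 0;
}
```

Adding a line to ACME_FIELDS updates both the struct and the table, which is the synchronization benefit the answer describes.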
Keep track of the field offsets as computed using the offsetof() macro. If structp is a pointer to an instance of the structure, and field f is an int having offset offsetf, f's value can be set indirectly with
*(int *)((char *)structp + offsetf) = value;

Resources