efficient switch statement when using enums

I have an enum and a switch statement using some of the enum entries but not all and they are currently out of order too, i.e. I have the following:
enum prot_tun_stat_e {
STAT_A = 0,
STAT_B,
STAT_C,
STAT_D,
STAT_E,
STAT_F, //5
STAT_G,
STAT_H,
STAT_I,
STAT_Y,
STAT_K, //10
STAT_COUNT //must be last
} __attribute__((packed));
and then I have a switch using the following entries:
switch(var) {
case STAT_C:
break;
case STAT_D:
break;
case STAT_F:
break;
case STAT_G:
break;
default:
break;
}
and I was wondering whether I should rearrange the items in the enum so that C=1, D=2, F=3 and G=4. Would that be more efficient?
Thanks,
Ron
Platform: PowerPC, compiler diab

If the compiler can determine that the parameter to the switch statement is limited to a small range of values, then it can create a jump table. This table will take up less room if the values are contiguous, but the difference between the 4 entries and the 10 required is unlikely to matter. (And note that 0-3 is a better range than 1-4, although the compiler can deal with this by jumping to offset n - 1.)
You can check the output of the compiler to see if a jump table is being created (assuming you can read assembly!). And of course the answer to all performance questions: profile it!
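As a rough illustration of what a jump table amounts to, here is a hand-written sketch in C. The action codes and the function names dispatch_switch, dispatch_table and check_dispatch are made up for this example, and the packed attribute from the question is left out. A real compiler emits a table of branch targets rather than a table of values, and it may pick a different strategy entirely.

```c
#include <assert.h>
#include <stddef.h>

enum prot_tun_stat_e {
    STAT_A = 0, STAT_B, STAT_C, STAT_D, STAT_E,
    STAT_F, /* 5 */
    STAT_G, STAT_H, STAT_I, STAT_Y,
    STAT_K, /* 10 */
    STAT_COUNT /* must be last */
};

/* Hypothetical action codes standing in for the bodies of the cases. */
enum action { ACT_DEFAULT = 0, ACT_C, ACT_D, ACT_F, ACT_G };

/* The dispatch written as an ordinary switch. */
enum action dispatch_switch(enum prot_tun_stat_e var)
{
    switch (var) {
    case STAT_C: return ACT_C;
    case STAT_D: return ACT_D;
    case STAT_F: return ACT_F;
    case STAT_G: return ACT_G;
    default:     return ACT_DEFAULT;
    }
}

/* The same dispatch as an explicit table covering STAT_C..STAT_G.
 * STAT_E is a hole in that range and gets the default entry. */
enum action dispatch_table(enum prot_tun_stat_e var)
{
    static const enum action table[] = {
        ACT_C, ACT_D, ACT_DEFAULT, ACT_F, ACT_G
    };
    /* Subtract the lowest case value, then do one unsigned range check. */
    unsigned idx = (unsigned)var - (unsigned)STAT_C;
    if (idx >= sizeof table / sizeof table[0])
        return ACT_DEFAULT;
    return table[idx];
}

/* Self-check: both versions agree for every enumerator. */
void check_dispatch(void)
{
    for (int v = 0; v < STAT_COUNT; v++)
        assert(dispatch_switch((enum prot_tun_stat_e)v) ==
               dispatch_table((enum prot_tun_stat_e)v));
}
```

Reordering the enum would only change the constant subtracted before the range check and the number of hole entries in the table.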

I can't talk about the diab compiler, because I'm not familiar with it, but an optimizing compiler will most likely create a jump table for a switch statement over an enum. So the order wouldn't matter. Having said that, you shouldn't worry about such trivial things. Correct me if I misunderstood your question.

Related

How Switch Statement Works

How does a switch statement immediately drop to the correct location in memory? With nested if-statements, it has to perform comparisons with each one, but with a switch statement it goes directly to the correct case. How is this implemented?

There are many different ways to compile a switch statement into machine code. Here are a few:
The compiler can produce a series of tests, which is not so inefficient as only about log2(N) tests are enough to dispatch a value among N possible cases.
The compiler can produce a table of values and jump addresses, which in turn will be used by generic lookup code (linear or dichotomic, similar to bsearch()) and finally jump to the corresponding location.
If the case values are dense enough, the compiler can generate a table of jump addresses and code that checks if the switch value is within a range encompassing all case values and jump directly to the corresponding address. This is probably the implementation closest to your description: but with a switch statement it goes directly to the correct case.
Depending on the specific abilities of the target CPU, compiler settings, and the number and distribution of case values, the compiler might use one of the above approaches or another, or a combination of them, or even some other methods.
Compiler designers spend a great deal of effort trying to improve heuristics for these choices. Look at the assembly output or use an online tool such as Godbolt's Compiler Explorer to see various code generation possibilities.
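To make the first two strategies concrete, here is a minimal sketch in C of what they look like when written by hand for four sparse case values. The values 10, 200, 3000 and 7000, the result codes, and all function names are assumptions chosen for illustration; a compiler would emit branches to code rather than return codes.

```c
#include <assert.h>
#include <stdlib.h>

/* Strategy 1: a tree of comparisons, so only about log2(N) tests
 * are needed instead of one test per case. */
int dispatch_tree(int v)
{
    if (v < 3000) {
        if (v == 10)  return 1;
        if (v == 200) return 2;
    } else {
        if (v == 3000) return 3;
        if (v == 7000) return 4;
    }
    return 0; /* default */
}

/* Strategy 2: a sorted table of (case value, target) pairs searched
 * by generic code; here the "target" is just a result code. */
struct case_entry { int value; int target; };

static const struct case_entry cases[] = {
    { 10, 1 }, { 200, 2 }, { 3000, 3 }, { 7000, 4 }
};

static int cmp_case(const void *key, const void *elem)
{
    int k = *(const int *)key;
    int e = ((const struct case_entry *)elem)->value;
    return (k > e) - (k < e);
}

int dispatch_bsearch(int v)
{
    const struct case_entry *hit = bsearch(&v, cases,
        sizeof cases / sizeof cases[0], sizeof cases[0], cmp_case);
    return hit ? hit->target : 0; /* 0 plays the role of default */
}

/* Self-check: both strategies give the same answer on a few probes. */
void check_strategies(void)
{
    static const int probes[] = { -1, 10, 11, 200, 3000, 7000, 7001 };
    for (size_t i = 0; i < sizeof probes / sizeof probes[0]; i++)
        assert(dispatch_tree(probes[i]) == dispatch_bsearch(probes[i]));
}
```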

How much code space does a switch statement take?

I am actually surprised I could not find this question already asked. I am wondering how much code space a switch statement takes and if using a const lookup table would be more efficient for my needs.
typedef struct container{
type1 a;
type2 b;
type3 c;
}container;
static container d;
//option A
void foo(int num)
{
void* x;
switch (num)
{
case 1:
x = &d.a;
break;
case 2:
x = &d.b;
break;
case 3:
x = &d.c;
break;
default:
x = NULL;
break;
}
// do something with x
}
// option B
const void* lookup_table[] = {
&d.a,
&d.b,
&d.c,
NULL
};
void foo(int num)
{
const void* x = lookup_table[num];
// do something with x
}
How would the switch statement break down into assembly, and how much larger would it be in code space? Is it worth using the lookup table rather than using the switch statement?

If you can rewrite the switch as a simple lookup into a lookup table, that may be the best solution, particularly if the possible indices are dense, since it is also probably more readable. (If the possible indices are not dense, you could either waste space or use a more sophisticated lookup technique: two-level tables, hash table, binary search into a sorted list. These may be better than the switch statement, but will be less readable.) A good compiler will try hard to match the efficiency, though, and some of them will produce exactly the same code as you did.
But in the usual case that you need to do more than just look up a value, the switch statement is almost certainly better. A good compiler will compile the switch statement into one of the above mentioned strategies, and it may know more than you do about the optimal solution given the details of the target platform.
In particular, turning a switch statement into an indexed lookup of a function pointer and then calling through the function pointer is likely to be significantly slower than the switch statement because of the overhead of calling the function. With the switch statement, the compiler is likely to generate a branch table, in which the lookup code will be very similar to your handbuilt code, but what's done after the lookup is a simple branch rather than a function call.

The question has no precise meaning. An optimizing compiler is (very often) at least compiling an entire function (and often, an entire translation unit) at once.
Read this paper by R.Sayle on compiling switches. You'll learn that there are several competing strategies for that (jump tables, balanced trees, conditional moves, hash jump tables, etc....) and several of them can be combined.
Trust your optimizing compiler to make a good enough choice to compile your switch code. For GCC, compile with gcc -Wall -O2 -march=native perhaps adding -fverbose-asm -S (and/or replacing -O2 with -O3) if you want to look inside the generated assembler. Learn also about gcc -flto -O3 etc...
Of course, for benchmarking purposes and for production code, you should always ask your compiler to optimize.
Notice that as an extension (accepted also by Clang/LLVM...) GCC has labels as values (with indirect gotos). With them, you could force usage of jump tables, or have some threaded code. That won't always make your code faster (e.g. because of branch prediction).
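For readers who have not seen labels as values, here is a small sketch of threaded dispatch using that extension. It is not standard C and only compiles with compilers that accept the extension (GCC and Clang do). The opcode names and the run function are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical opcodes for a tiny accumulator machine. */
enum { OP_INC, OP_DBL, OP_HALT };

/* Runs a program that must contain only valid opcodes and end with
 * OP_HALT. Uses the "labels as values" extension, so this is not ISO C. */
int run(const unsigned char *code)
{
    /* One entry per opcode: the address of the label that handles it. */
    static void *const targets[] = { &&do_inc, &&do_dbl, &&do_halt };
    int acc = 0;

    assert(code != NULL);
    goto *targets[*code++];   /* no range check: opcodes are trusted */

do_inc:
    acc += 1;
    goto *targets[*code++];   /* each handler dispatches the next opcode */
do_dbl:
    acc *= 2;
    goto *targets[*code++];
do_halt:
    return acc;
}
```

Each handler ends with its own indirect jump instead of returning to a central switch, which is what forces the table-based dispatch.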

A different way of looking at the post:
void foo(int num) { void* x; switch (num)... copes well with num outside the range 1,2,3.
void foo(int num) { void* x = lookup_table[num]; has undefined behavior when num is outside the range of 0,1,2,3.
Some might then say num range is not the issue. But that was not stated in the post. And so it is with code maintenance - lots of unstated, implied and sometimes falsely assumed conditions.
Is it worth using the lookup table rather than using the switch statement?
For ease of maintenance, I'd go with the switch().
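If the table is preferred anyway, the out-of-range problem can be closed with an explicit check. The following is only a sketch: the member types are placeholders (the question leaves type1, type2 and type3 unspecified), the function name lookup is made up, and index 0 is deliberately left empty so that 1..3 behave like the switch version.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder member types standing in for type1..type3. */
typedef struct container {
    int    a;
    double b;
    char   c;
} container;

static container d;

/* Index 0 is unused so that num 1..3 matches the switch version. */
static void *const lookup_table[] = { NULL, &d.a, &d.b, &d.c };

void *lookup(int num)
{
    size_t count = sizeof lookup_table / sizeof lookup_table[0];
    /* Reject negative and too-large indices before touching the table. */
    if (num < 0 || (size_t)num >= count)
        return NULL;
    return lookup_table[num];
}
```

With the check in place, any num outside 1..3 yields NULL, the same result as the default branch of the switch.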

As already stated by others, modern optimizing compilers will try themselves to choose a good strategy to compile switches into more efficient code. Hans Wennborg gave a talk at the 2015 LLVM Developers’ Meeting about the recent switch lowering improvements which gives you a short introduction to this topic.
So it is better to let the compiler do its work and to choose the most readable solution rather than the one you think is most efficient.
If you want to see what code Clang produces for your switch file, you can use -S or -S -emit-llvm.

Why can't we use variables inside a case in switch construct?

If I have an integer variable like int a=4, and in the switch I write
int b = something;
...
switch(a)
{
case 4+b: printf("hii");
}
then why is this statement a compile-time error saying that variables cannot be used inside a case label? Why does the compiler not substitute the values in place of the variables?
Basically, what problem would it create, such that the language designers did not allow it as proper syntax?

The initial idea of the switch control-flow statement was that it should determine the appropriate case very quickly, while potentially having a lot of cases.
A traditional implementation would use a jump table, making it an O(1) operation. The jump table is essentially an array of pointers, where each pointer contains the address of the first instruction for each case. Jumping to the appropriate case is as simple as indexing that array with the switch value and then doing a jump instruction to that address.
If the cases were allowed to contain variables, the compiler would have to emit code that first evaluates these expressions and then compares the switch value against more than one other value. If that was the case, a switch statement would be just a syntactically-sugarized version of a chain of if and else if.
switch statements are usually at the heart of any algorithm which implements a finite-state machine (like parsers), so that was a good reason to include it into the language. Most modern compilers would probably generate identical machine code for a chain of if and else if which are only testing a variable against a constant, but that wasn't the case in the early 1970s when C was conceived. Moreover, switch gives you the ability to fall-through which isn't possible in the latter arrangement.

case 2+a: doSomething();
break;
case 4-a: doSomethingElse();
break;
What do you do when a==1?
There are several possible answers, including
Run all applicable cases, in order
Run all applicable cases, in arbitrary order
Run the first applicable case
Run any one applicable case
The behaviour is undefined
Raise a well-defined error
The problem is, none of the resolutions is preferred over the others. Moreover, all run contrary to the original simple rationale of the switch statement, which is providing a high(ish) level abstraction of a fast, precomputed indexed jump table.

Because it is usually superfluous, and on a compiler level you want a jump to a fixed address. Just put the dependency of the variable in the switch expression
switch(a-b)
{
case 4: printf("hii");
}
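A small sketch of the rewrite above, next to the if test that a variable case label would have had to turn into. The function names are made up and return 1 where the original would print "hii". One assumption to keep in mind: 4 + b and a - b can each overflow for extreme int values, so the two forms only agree when neither expression overflows.

```c
#include <assert.h>

/* What "case 4 + b:" would have to mean: a comparison made at run time. */
int matches_if(int a, int b)
{
    if (a == 4 + b)
        return 1;
    return 0;
}

/* The same test with the variable moved into the switch expression,
 * which leaves a compile-time constant in the case label. */
int matches_switch(int a, int b)
{
    switch (a - b) {
    case 4:  return 1;
    default: return 0;
    }
}

/* Self-check over a small grid of values. */
void check_matches(void)
{
    for (int a = -10; a <= 10; a++)
        for (int b = -10; b <= 10; b++)
            assert(matches_if(a, b) == matches_switch(a, b));
}
```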

Techniques for static code analysis in detecting integer overflows

I'm trying to find some effective techniques which I can base my integer-overflow detection tool on. I know there are many ready-made detection tools out there, but I'm trying to implement a simple one on my own, both for my personal interest in this area and also for my knowledge.
I know techniques like Pattern Matching and Type Inference, but I read that more complicated code analysis techniques are required to detect the int overflows. There's also the Taint Analysis which can "flag" un-trusted sources of data.
Is there some other technique, which I might not be aware of, which is capable of detecting integer overflows?

It may be worth trying the cppcheck static analysis tool, which claims to detect signed integer overflow as of version 1.67:
New checks:
- Detect shift by too many bits, signed integer overflow and dangerous sign conversion
Notice that it supports both C and C++ languages.
There is no overflow check for unsigned integers, because by the Standard unsigned types never overflow (their arithmetic wraps around).
Here is some basic example:
#include <stdio.h>
int main(void)
{
int a = 2147483647;
a = a + 1;
printf("%d\n", a);
return 0;
}
With such code it gets:
$ ./cppcheck --platform=unix64 simple.c
Checking simple.c...
[simple.c:6]: (error) Signed integer overflow for expression 'a+1'
However, I wouldn't expect too much from it (at least with the current version), as a slightly different program:
int a = 2147483647;
a++;
passes without noticing overflow.

It seems you are looking for some sort of Value Range Analysis that detects when a range would exceed the set bounds. This is something that on the face of it seems simple, but is actually hard. There will be lots of false positives, and that's even without counting bugs in the implementation.
To ignore the details for a moment, you associate a pair [lower bound, upper bound] with every variable, and do some math to figure out the new bounds for every operator. For example if the code adds two variables, in your analysis you add the upper bounds together to form the new upper bound, and you add the lower bounds together to get the new lower bound.
But of course it's not that simple. Firstly, what if there is non-straight-line code? if's are not too bad, you can just evaluate both sides and then take the union of the ranges after it (which can lose information! if two ranges have a gap in between, their union will span the gap). Loops require tricks, a naive implementation may run billions of iterations of analysis on a loop or never even terminate at all. Even if you use an abstract domain that has no infinite ascending chains, you can still get into trouble. The keywords to solve this are "widening operator" and (optionally, but probably a good idea) "narrowing operator".
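As a minimal sketch of the two operations mentioned here, assuming a plain interval domain: the join used where control flow merges, and a simple widening. The struct and function names are made up, and LONG_MIN and LONG_MAX stand in for minus and plus infinity, which is a simplification.

```c
#include <assert.h>
#include <limits.h>

/* A closed interval [lo, hi] of possible values for one variable. */
struct range { long lo, hi; };

/* Join at a control-flow merge: the smallest interval covering both.
 * Any gap between the two inputs is lost in the result. */
struct range range_join(struct range a, struct range b)
{
    struct range r = { a.lo < b.lo ? a.lo : b.lo,
                       a.hi > b.hi ? a.hi : b.hi };
    return r;
}

/* Widening: a bound that grew since the previous iteration jumps
 * straight to the extreme, so loop analysis stabilises after a few
 * steps instead of following the loop counter one value at a time. */
struct range range_widen(struct range old, struct range next)
{
    struct range r = { next.lo < old.lo ? LONG_MIN : old.lo,
                       next.hi > old.hi ? LONG_MAX : old.hi };
    return r;
}

/* Self-check: a bound that did not grow is kept as-is by widening. */
void check_widen_stable(void)
{
    struct range old = { 0, 10 }, next = { 2, 8 };
    struct range w = range_widen(old, next);
    assert(w.lo == 0 && w.hi == 10);
}
```

For example, joining [0, 1] with [5, 6] gives [0, 6], and widening [0, 0] with [0, 1] gives [0, LONG_MAX]; a narrowing pass would then try to recover a tighter upper bound.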
It's even worse than that, because what's a variable? Your regular local variable of scalar type that never has its address taken isn't too bad. But what about arrays? Now you don't even know for sure which entry is being affected - the index itself may be a range! And then there's aliasing. That's far from a solved problem and causes many real world tools to make really pessimistic assumptions.
Also, function calls. You're going to call functions from some context, hopefully a known one (if not, then it's simple: you know nothing). That makes it hard, not only is there suddenly a lot more state to keep track of at the same time, there may be several places a function could be called from, including itself. The usual response to that is to re-evaluate that function when a range of one of its arguments has been expanded, once again this could take billions of steps if not done carefully. There are also algorithms that analyze a function differently for different contexts, which can give more accurate results, but it's easy to spend a lot of time analyzing contexts that aren't different enough to matter.
Anyway if you've made it this far, you could read Accurate Static Branch Prediction by Value Range Propagation and related papers to get a good idea of how to actually do this.
And that's not all. Considering only the ranges of individual variables without caring about the relationships between them (keyword: non-relational abstract domain) does badly on really simple (for a human reader) things such as subtracting two variables that are always close together in value, for which it will produce a large range, on the assumption that they may be as far apart as their bounds allow. Even for something trivial such as
// assume x in [0 .. 10]
int y = x + 2;
int diff = y - x;
For a human reader, it's pretty obvious that diff = 2. In the analysis described so far, the conclusions would be that y in [2 .. 12] and diff in [-8, 12]. Now suppose the code continues with
int foo = diff + 2;
int bar = foo - diff;
Now we get foo in [-6, 14] and bar in [-18, 22]. Even though bar is obviously 2 again, the width of the range has doubled once more. Now this was a simple example, and you could make up some ad-hoc hacks to detect it, but it's a more general problem. This effect tends to blow up the ranges of variables quickly and generate lots of unnecessary warnings. A partial solution is assigning ranges to differences between variables, then you get what's called a difference-bound matrix (unsurprisingly this is an example of a relational abstract domain). They can get big and slow for interprocedural analysis, or if you want to throw non-scalar variables at them too, and the algorithms start to get more complicated. And they only get you so far - if you throw a multiplication in the mix (that includes x + x and variants), things still go bad very fast.
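The blow-up described above can be reproduced with a few lines of naive interval arithmetic. This is only a sketch of the non-relational analysis, with made-up names (struct interval, iv_add, iv_sub, iv_const) and no overflow handling in the analysis itself.

```c
#include <assert.h>

/* A closed interval [lo, hi] tracked for a single variable. */
struct interval { int lo, hi; };

/* Interval of a constant c: [c, c]. */
struct interval iv_const(int c)
{
    struct interval r = { c, c };
    return r;
}

/* a + b lies in [a.lo + b.lo, a.hi + b.hi]. */
struct interval iv_add(struct interval a, struct interval b)
{
    struct interval r = { a.lo + b.lo, a.hi + b.hi };
    return r;
}

/* a - b lies in [a.lo - b.hi, a.hi - b.lo]; the operands are treated
 * as unrelated, which is exactly where precision is lost. */
struct interval iv_sub(struct interval a, struct interval b)
{
    struct interval r = { a.lo - b.hi, a.hi - b.lo };
    return r;
}

/* Replays the example: returns the interval computed for bar. */
struct interval analyse_example(void)
{
    struct interval x    = { 0, 10 };
    struct interval y    = iv_add(x, iv_const(2));     /* [2, 12]   */
    struct interval diff = iv_sub(y, x);               /* [-8, 12]  */
    struct interval foo  = iv_add(diff, iv_const(2));  /* [-6, 14]  */
    struct interval bar  = iv_sub(foo, diff);          /* [-18, 22] */
    assert(diff.lo == -8 && diff.hi == 12);
    assert(foo.lo == -6 && foo.hi == 14);
    return bar;
}
```

A relational domain would instead record facts such as y - x = 2 directly, so diff and bar would both come out as exactly 2.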
So you can throw something else in the mix that can handle multiplication by a constant, see for example Abstract Domains of Affine Relations - this is very different from ranges, and won't by itself tell you much about the ranges of your variables, but you could use it to get more accurate ranges.
The story doesn't end there, but this answer is getting long. I hope this does not discourage you from researching this topic, it's a topic that lends itself well to starting out simple and adding more and more interesting things to your analysis tool.

Checking integer overflows in C:
When you add two 32-bit numbers and get a 33-bit result, the lower 32 bits are written to the destination, with the highest bit signaled out as a carry flag. Many languages including C don't provide a way to access this 'carry', so you can use the limits from <limits.h> to check before you perform an arithmetic operation. Consider unsigned ints a and b:
if MAX - b < a, we know for sure that a + b would cause an overflow. An example is given in this C FAQ.
Watch out: As chux pointed out, this example is problematic with signed integers, because it won't handle MAX - b or MIN + b if b < 0. The example solution in the second link (below) covers all cases.
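A minimal sketch of both checks using <limits.h>. The function names are made up; the signed version picks its test by the sign of b so that the check itself cannot overflow, which is one common way to cover all cases and not necessarily the one used in the linked example.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Unsigned: a + b wraps exactly when a > UINT_MAX - b. */
bool uadd_overflows(unsigned a, unsigned b)
{
    return a > UINT_MAX - b;
}

/* Signed: choose the test by the sign of b, so that neither
 * INT_MAX - b nor INT_MIN - b can itself overflow. */
bool sadd_overflows(int a, int b)
{
    if (b > 0)
        return a > INT_MAX - b;   /* result would exceed INT_MAX */
    if (b < 0)
        return a < INT_MIN - b;   /* result would go below INT_MIN */
    return false;                 /* adding 0 never overflows */
}

/* Adds only when it is safe; returns false and leaves *out alone otherwise. */
bool checked_add(int a, int b, int *out)
{
    assert(out != 0);
    if (sadd_overflows(a, b))
        return false;
    *out = a + b;
    return true;
}
```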
Multiplying numbers can cause an overflow, too. A solution is to double the length of the first number, then do the multiplication. Something like:
(typecast)a*b
Watch out: (typecast)(a*b) would be incorrect because it truncates first then typecasts.
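To show the difference between the two cast placements, here is a short sketch with 32-bit operands widened to 64 bits. The function names are made up, and the comments assume a platform where unsigned int is 32 bits wide.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Widen one operand first, so the multiplication is done in 64 bits. */
uint64_t mul_wide(uint32_t a, uint32_t b)
{
    return (uint64_t)a * b;
}

/* Wrong order: the product is computed and truncated in 32 bits
 * (assuming 32-bit unsigned int), and only then widened. */
uint64_t mul_truncated(uint32_t a, uint32_t b)
{
    return (uint64_t)(a * b);
}

/* The 32-bit product overflows when the exact product needs more
 * than 32 bits. */
bool mul_overflows_u32(uint32_t a, uint32_t b)
{
    return mul_wide(a, b) > UINT32_MAX;
}
```

For a = b = 65536 the exact product is 2^32, which mul_wide returns in full while mul_truncated returns 0.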
A detailed technique for C can be found HERE. Using macros seems to be an easy and elegant solution.

I'd expect Frama-C to provide such a capability. Frama-C is focused on C source code, but I don't know if it is dialect-sensitive or specific. I believe it uses abstract interpretation to model values. I don't know if it specifically checks for overflows.
Our DMS Software Reengineering Toolkit has a variety of language front ends, including most major dialects of C. It provides control and data flow analysis, and also abstract interpretation for computing ranges, as foundations on which you can build an answer. My Google Tech Talk on DMS at about 0:28:30 specifically talks about how one can use DMS's abstract interpretation on value ranges to detect overflow (of an index on a buffer). A variation on checking the upper bound on array sizes is simply to check for values not exceeding 2^N. However, off the shelf DMS does not provide any specific overflow analysis for C code. There's room for the OP to do interesting work :=}

#defined bitflags and enums - peaceful coexistence in "c"

I have just discovered the joy of bitflags. I have several questions related to "best-practices" regarding the use of bitflags in C. I learned everything from various examples I found on the web but still have questions.
In order to save space, I am using a single 32bit integer field in a struct (A->flag) to represent several different sets of boolean properties. In all, 20 different bits are #defined. Some of these are truly presence/absence flags (STORAGE-INTERNAL vs. STORAGE-EXTERNAL). Others have more than two values (e.g. mutually exclusive set of formats: FORMAT-A, FORMAT-B, FORMAT-C). I have defined macros for setting specific bits (and simultaneously turning off mutually exclusive bits). I have also defined macros for testing if specific combination of bits are set in the flag.
However, what is lost in the above approach is the specific grouping of flags that is best captured by enums. For writing functions, I would like to use enums (e.g., STORAGE-TYPE and FORMAT-TYPE), so that function definitions look nice. I expect to use enums only for passing parameters and #defined macros for setting and testing flags.
(a) How do I define flag (A->flag) as a 32 bit integer in a portable fashion (across 32 bit / 64 bit platforms)?
(b) Should I worry about potential size differences in how A->flag vs. #defined constants vs. enums are stored?
(c) Am I making things unnecessarily complicated, meaning should I just stick to using #defined constants for passing parameters as ordinary ints? What else should I worry about in all this?
I apologize for the poorly articulated question. It reflects my ignorance about potential issues.

There is a C99 header that was intended to solve that exact problem (a) but for some reason Microsoft doesn't implement it. Fortunately, you can get <stdint.h> for Microsoft Windows here. Every other platform will already have it. The 32-bit int types are uint32_t and int32_t. These also come in 8, 16, and 64- bit flavors.
So, that takes care of (a).
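A minimal sketch of what that looks like in practice. The struct name, the helper functions and the two bit positions are hypothetical, since the question does not list the actual flag layout.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag bits; UINT32_C keeps the constants 32 bits wide. */
#define STORAGE_INTERNAL (UINT32_C(1) << 0)
#define STORAGE_EXTERNAL (UINT32_C(1) << 1)

struct item {
    uint32_t flag;   /* exactly 32 bits wherever uint32_t exists */
};

/* Turn the given bits on. */
void item_set(struct item *it, uint32_t bits)
{
    it->flag |= bits;
}

/* Turn the given bits off. */
void item_clear(struct item *it, uint32_t bits)
{
    it->flag &= ~bits;
}

/* Non-zero when all of the given bits are set. */
int item_test(const struct item *it, uint32_t bits)
{
    return (it->flag & bits) == bits;
}
```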
(b) and (c) are kind of the same question. We do make assumptions whenever we develop something. You assume that C will be available. You assume that <stdint.h> can be found somewhere. You could always assume that int was at least 16 bits and now a >= 32 bit assumption is fairly reasonable.
In general, you should try to write conforming programs that don't depend on layout, but they will make assumptions about word length. You should worry about performance at the algorithm level, that is, am I writing something that is quadratic, polynomial, exponential?
You should not worry about performance at the operation level until (a) you notice a performance lag, and (b) you have profiled your program. You need to get your job done without bogging down worrying about individual operations. :-)
Oh, I should add that you particularly don't need to worry about low level operation performance when you are writing the program in C in the first place. C is the close-to-the-metal go-as-fast-as-possible language. We routinely write stuff in php, python, ruby, or lisp because we want a powerful language and the CPU's are so fast these days that we can get away with an entire interpreter, never mind a not-perfect choice of bit-twiddle-word-length ops. :-)

You can use bit-fields and let the compiler do the bit twiddling. For example:
struct PropertySet {
unsigned internal_storage : 1;
unsigned format : 4;
};
int main(void) {
struct PropertySet x = {0};
struct PropertySet y[10] = {{0}}; /* array of structures containing bit-fields */
if (x.internal_storage) x.format |= 2;
if (y[2].internal_storage) y[2].format |= 2;
return 0;
}
Edited to add array of structures

As others have said, your problem (a) is resolvable by using <stdint.h> and either uint32_t or uint_least32_t (if you want to worry about Burroughs mainframes which have 36-bit words). Note that MSVC does not support C99, but @DigitalRoss shows where you can obtain a suitable header to use with MSVC.
Your problem (b) is not an issue; C will type convert safely for you if it is necessary, but it probably isn't even necessary.
The area of most concern is (c) and in particular the format sub-field. There, 3 values are valid. You can handle this by allocating 3 bits and requiring that the 3-bit field is one of the values 1, 2, or 4 (any other value is invalid because of too many or too few bits set). Or you could allocate a 2-bit number, and specify that either 0 or 3 (or, if you really want to, one of 1 or 2) is invalid. The first approach uses one more bit (not currently a problem since you're only using 20 of 32 bits) but is a pure bitflag approach.
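The second approach, a 2-bit number, could look like the sketch below. The bit position (bits 4 and 5), the choice of 0 as the invalid value, and the function names are all assumptions made for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: the format occupies bits 4..5 of the flags word. */
#define FORMAT_SHIFT 4
#define FORMAT_FIELD (UINT32_C(3) << FORMAT_SHIFT)

/* Encoded values; 0 is reserved as "invalid / not set". */
enum format_type {
    FORMAT_INVALID = 0,
    FORMAT_A = 1,
    FORMAT_B = 2,
    FORMAT_C = 3
};

/* Returns flags with the format field replaced; other bits are kept. */
uint32_t flags_set_format(uint32_t flags, enum format_type f)
{
    assert(f >= FORMAT_A && f <= FORMAT_C);
    return (flags & ~FORMAT_FIELD) | ((uint32_t)f << FORMAT_SHIFT);
}

/* Extracts the format field; FORMAT_INVALID means none was stored. */
enum format_type flags_get_format(uint32_t flags)
{
    return (enum format_type)((flags & FORMAT_FIELD) >> FORMAT_SHIFT);
}
```

Because the field holds a number rather than independent bits, two formats can never be set at once, at the cost of no longer being able to test it with a single bit mask.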
When writing function calls, there is no particular problem writing:
some_function(FORMAT_A | STORAGE_INTERNAL, ...);
This will work whether FORMAT_A is a #define or an enum (as long as you specify the enum value correctly). The called code should check whether the caller had a lapse in concentration and wrote:
some_function(FORMAT_A | FORMAT_B, ...);
But that is an internal check for the module to worry about, not a check for users of the module to worry about.
If people are going to be switching bits in the flags member around a lot, the macros for setting and unsetting the format field might be beneficial. Some might argue that any pure boolean fields barely need it, though (and I'd sympathize). It might be best to treat the flags member as opaque and provide 'functions' (or macros) to get or set all the fields. The less people can get wrong, the less will go wrong.
Consider whether using bit-fields works for you. My experience is that they lead to big code and not necessarily very efficient code; YMMV.
Hmmm...nothing very definitive here, so far.
I would use enums for everything because those are guaranteed to be visible in a debugger where #define values are not.
I would probably not provide macros to get or set bits, but I'm a cruel person at times.
I would provide guidance on how to set the format part of the flags field, and might provide a macro to do that.
Like this, perhaps:
enum { ..., FORMAT_A = 0x0010, FORMAT_B = 0x0020, FORMAT_C = 0x0040, ... };
enum { FORMAT_MASK = FORMAT_A | FORMAT_B | FORMAT_C };
#define SET_FORMAT(flag, newval) (((flag) & ~FORMAT_MASK) | (newval))
#define GET_FORMAT(flag) ((flag) & FORMAT_MASK)
SET_FORMAT is safe if used accurately but horrid if abused. One advantage of the macros is that you could replace them with a function that validated things thoroughly if necessary; this works well if people use the macros consistently.
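As a sketch of the validating replacement mentioned above, here is a function form of SET_FORMAT for the one-bit-per-format layout. The function names are made up, and only the three format bits from the enum above are reproduced; the elided neighbours are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum { FORMAT_A = 0x0010, FORMAT_B = 0x0020, FORMAT_C = 0x0040 };
enum { FORMAT_MASK = FORMAT_A | FORMAT_B | FORMAT_C };

/* True when v is exactly one of FORMAT_A, FORMAT_B or FORMAT_C:
 * non-zero, nothing outside the mask, and only a single bit set. */
bool format_is_valid(uint32_t v)
{
    return v != 0
        && (v & ~(uint32_t)FORMAT_MASK) == 0
        && (v & (v - 1)) == 0;
}

/* Function form of SET_FORMAT: refuses bad values and leaves the
 * flags unchanged in that case. */
bool set_format(uint32_t *flag, uint32_t newval)
{
    assert(flag != 0);
    if (!format_is_valid(newval))
        return false;
    *flag = (*flag & ~(uint32_t)FORMAT_MASK) | newval;
    return true;
}

/* Function form of GET_FORMAT. */
uint32_t get_format(uint32_t flag)
{
    return flag & (uint32_t)FORMAT_MASK;
}
```

A call such as set_format(&flags, FORMAT_A | FORMAT_B) is rejected here, whereas the macro would silently store both bits.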

For question a, if you are using C99 (you probably are using it), you can use the uint32_t predefined type (or, if it is not predefined, it can be found in the stdint.h header file).

Regarding (c): if your enumerations are defined correctly you should be able to pass them as arguments without a problem. A few things to consider:
Enumeration storage is often compiler specific, so depending on what kind of development you are doing (you don't mention if it's Windows vs. Linux vs. embedded vs. embedded Linux :) ) you may want to visit compiler options for enum storage to make sure there are no issues there. I generally agree with the above consensus that the compiler should cast your enumerations appropriately - but it's something to be aware of.
In the case that you are doing embedded work, many static quality checking programs such as PC Lint will "bark" if you start getting too clever with enums, #defines, and bitfields. If you are doing development that will need to pass through any quality gates, this might be something to keep in mind. In fact, some automotive standards (such as MISRA-C) get downright irritable if you try to get tricky with bitfields.
"I have just discovered the joy of bitflags." I agree with you - I find them very useful.

I added comments to each answer above. I think I have some clarity. It seems enums are cleaner, as they show up in the debugger and keep fields separate. Macros can be used for setting and getting values.
I have also read that enums are stored as small integers - which, as I understand it, is not a problem with the boolean tests as these would be performed starting at the rightmost bits. But can enums be used to store large integers (1 << 21)?
thanks again to you all. I have already learned more than I did two days ago!!
~Russ
