Legal uses of setjmp and GCC - c

Using GCC (4.0 for me), is this legal:
if(__builtin_expect(setjmp(buf) != 0, 1))
{
// handle error
}
else
{
// do action
}
I found a discussion saying it caused a problem for GCC back in 2003, but I would imagine that they would have fixed it by now. The C standard says that it's illegal to use setjmp unless it's one of four conditions, the relevant one being this:
one operand of a relational or equality operator with the other operand an integer constant expression, with the resulting expression being the entire controlling expression of a selection or iteration statement;
But if this is a GCC extension, can I guarantee that it will work under GCC, since it's already nonstandard functionality? I tested it and it seemed to work, though I don't know how much testing I'd have to do to actually break it. (I'm hiding the call to __builtin_expect behind a macro, which is defined as a no-op for non-GCC, so it would be perfectly legal for other compilers.)

I think that what the standard was talking about was to account for doing something like this:
int x = printf("howdy");
if (setjmp(buf) != x ) {
function_that_might_call_longjmp_with_x(buf, x);
} else {
do_something_about_them_errors();
}
In this case you could not rely on x having the value that it was assigned in the previous line anymore. The compiler may have moved the place where x had been (reusing the register it had been in, or something), so the code that did the comparison would be looking in the wrong spot. (you could save x to another variable, and then reassign x to something else before calling the function, which might make the problem more obvious)
In your code you could have written it as:
int conditional;
conditional = setjmp(buf) != 0;
if(__builtin_expect( conditional, 1)) {
// handle error
} else {
// do action
}
And I think that we can satisfy ourselves that the line of code that assigns the variable conditional meets that requirement.

But if this is a GCC extension, can I guarantee that it will work under GCC, since it's already nonstandard functionality? I tested it and it seemed to work, though I don't know how much testing I'd have to do to actually break it. (I'm hiding the call to __builtin_expect behind a macro, which is defined as a no-op for non-GCC, so it would be perfectly legal for other compilers.)
You are correct, __builtin_expect should be a macro no-op for other compilers so the result is still defined.
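For comparison, here is a minimal sketch of the strictly conforming form, where the comparison of setjmp against a constant is the entire controlling expression of the if. The names run_action and fail_with are hypothetical and only stand in for the "do action" and "handle error" parts of the question.

```c
#include <setjmp.h>

static jmp_buf buf;

/* Hypothetical helper: abandons the current action by jumping back to setjmp. */
static void fail_with(int code)
{
    longjmp(buf, code);
}

/* Hypothetical wrapper: returns 0 if the action finished, -1 if it bailed out. */
int run_action(int should_fail)
{
    /* The comparison against a constant is the whole controlling expression,
       which is one of the contexts the standard permits for setjmp. */
    if (setjmp(buf) != 0) {
        return -1;          /* handle error: reached through longjmp */
    }

    /* do action */
    if (should_fail) {
        fail_with(1);
    }
    return 0;
}
```

Calling run_action(0) takes the normal path, while run_action(1) returns through the longjmp branch. Wrapping the comparison in __builtin_expect moves it out of this permitted shape, which is what the question is asking about.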

Related

Why are statements with no effect considered legal in C?

Pardon if this question is naive. Consider the following program:
#include <stdio.h>
int main() {
int i = 1;
i = i + 2;
5;
i;
printf("i: %d\n", i);
}
In the above example, the statements 5; and i; seem totally superfluous, yet the code compiles without warnings or errors by default (however, gcc does emit a "statement with no effect [-Wunused-value]" warning when run with -Wall). They have no effect on the rest of the program, so why are they considered valid statements in the first place? Does the compiler simply ignore them? Are there any benefits to allowing such statements?
One benefit to allowing such statements is from code that's created by macros or other programs, rather than being written by humans.
As an example, imagine a function int do_stuff(void) that is supposed to return 0 on success or -1 on failure. It could be that support for "stuff" is optional, and so you could have a header file that does
#if STUFF_SUPPORTED
#define do_stuff() really_do_stuff()
#else
#define do_stuff() (-1)
#endif
Now imagine some code that wants to do stuff if possible, but may or may not really care whether it succeeds or fails:
void func1(void) {
if (do_stuff() == -1) {
printf("stuff did not work\n");
}
}
void func2(void) {
do_stuff(); // don't care if it works or not
more_stuff();
}
When STUFF_SUPPORTED is 0, the preprocessor will expand the call in func2 to a statement that just reads
(-1);
and so the compiler pass will see just the sort of "superfluous" statement that seems to bother you. Yet what else can one do? If you #define do_stuff() // nothing, then the code in func1 will break. (And you'll still have an empty statement in func2 that just reads ;, which is perhaps even more superfluous.) On the other hand, if you have to actually define a do_stuff() function that returns -1, you may incur the cost of a function call for no good reason.
Simple statements in C are terminated by a semicolon.
Simple statements in C are expressions. An expression is a combination of variables, constants and operators. Every non-void expression results in a value of a certain type that can be assigned to a variable.
Having said that, some "smart compilers" might discard the 5; and i; statements.
Statements with no effect are permitted because it would be more difficult to ban them than to permit them. This was more relevant when C was first designed and compilers were smaller and simpler.
An expression statement consists of an expression followed by a semicolon. Its behavior is to evaluate the expression and discard the result (if any). Normally the purpose is that the evaluation of the expression has side effects, but it's not always easy or even possible to determine whether a given expression has side effects.
For example, a function call is an expression, so a function call followed by a semicolon is a statement. Does this statement have any side effects?
some_function();
It's impossible to tell without seeing the implementation of some_function.
How about this?
obj;
Probably not -- but if obj is defined as volatile, then it does.
Permitting any expression to be made into an expression-statement by adding a semicolon makes the language definition simpler. Requiring the expression to have side effects would add complexity to the language definition and to the compiler. C is built on a consistent set of rules (function calls are expressions, assignments are expressions, an expression followed by a semicolon is a statement) and lets programmers do what they want without preventing them from doing things that may or may not make sense.
The statements you listed with no effect are examples of an expression statement, whose syntax is given in section 6.8.3p1 of the C standard as follows:
expression-statement:
expression_opt ;
All of section 6.5 is dedicated to the definition of an expression, but loosely speaking an expression consists of constants and identifiers linked with operators. Notably, an expression may or may not contain an assignment operator and it may or may not contain a function call.
So any expression followed by a semicolon qualifies as an expression statement. In fact, each of these lines from your code is an example of an expression statement:
i = i + 2;
5;
i;
printf("i: %d\n", i);
Some operators contain side effects such as the set of assignment operators and the pre/post increment/decrement operators, and the function call operator () may have a side effect depending on what the function in question does. There is no requirement however that one of the operators must have a side effect.
Here's another example:
atoi("1");
This calls a function and discards the result, just like the call to printf in your example, but unlike printf, the function call itself does not have a side effect.
Sometimes such statements are very handy:
int foo(int x, int y, int z)
{
(void)y; //prevents warning
(void)z;
return x*x;
}
Or when the reference manual tells us to just read a register to achieve something, for example to clear or set some flag (a very common situation in the microcontroller (uC) world):
#include <stdint.h>
#define SREG ((volatile uint32_t *)0x4000000)
#define DREG ((volatile uint32_t *)0x4004000)
void readSREG(void)
{
*SREG; //we read it here
*DREG; // and here
}
https://godbolt.org/z/6wjh_5

Saw a strange if statement in some legacy C code

What I saw in an if statement was like this.
if((var = someFunc()) == 0){
...
}
Will the statement
(var = someFunc())
always return the final value of var no matter what environment we are in?
That is just a one-line way of assigning to a variable and comparing the returned value at the same time.
You need the parentheses around the assignment because the comparison operators have higher precedence than the assignment operator, otherwise var would be assigned the value of someFunc() == 0.
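The effect of the parentheses can be seen in a small sketch. Here some_func is a hypothetical stand-in for someFunc() that always returns 7, so the two forms leave different values in var.

```c
/* Hypothetical stand-in for someFunc(); it always returns 7 here. */
static int some_func(void)
{
    return 7;
}

/* Parenthesized: var receives the function's return value (7). */
int assign_then_compare(void)
{
    int var;
    if ((var = some_func()) == 0) {
        return -1;          /* not taken, because 7 != 0 */
    }
    return var;
}

/* Unparenthesized: == binds tighter than =, so var receives
   the result of (some_func() == 0), which is 0 here. */
int compare_then_assign(void)
{
    int var;
    var = some_func() == 0;
    return var;
}
```

With this stand-in, assign_then_compare() yields 7 and compare_then_assign() yields 0.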
This is simply wrong. var is assigned, and then its value is overwritten by a constant 0. The return value of the function is therefore lost, and the if always fails. Most compilers would probably issue a warning about that nowadays, both because of the assignment within an if and because of the impossible if that results. The right way to do what was probably intended is
if((var = someFunc()) == 0) {
(Mind you, this might also be malicious code trying to introduce a vulnerability under the guise of a common newbie mistake. There was a case recently where someone tried to smuggle a check into the Linux kernel where they assigned the UID to 0 (i.e., root) while pretending to check for being root. Didn't work, though.)
This is correct, I use it all the time
if ((f=fopen(s,"r"))==NULL)
return(fprintf(stderr,"fopen(%s,r) failed, errno=%d, %s\n",s,errno,strerror(errno)));
/* successfully opened file s, read from FILE *f as you like */
I also use it when I calloc() memory.
You're assigning the return value of someFunc (fopen or calloc in my cases) to a variable AND also testing that return value, it's a semantic shortcut assuming you'll never want to debug the assignment and the test separately.

When do we use goto *expr; in C?

| GOTO '*' expr ';'
I've never seen such a statement. Can anyone give an example?
This is the so-called Labels as Values feature, which is one of the GCC extensions.
As an example, I've applied this extension to give an answer to Printing 1 to 1000 without loop or conditionals question:
void printMe ()
{
int i = 1;
startPrintMe:
printf ("%d\n", i);
void *labelPtr = &&startPrintMe + (&&exitPrintMe - &&startPrintMe) * (i++ / 1000);
goto *labelPtr;
exitPrintMe: ; // a label must be followed by a statement
}
IIRC that's a GNU-ism for tail calls. Normally you'd leave that optimization to the compiler, but it can be useful when writing kernels or embedded device drivers.
That is GCC specific. It is not standard C (either C89 or C99). (It would come in handy sometimes though, to be able to do computed gotos.)
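One place where computed gotos do come in handy is a dispatch table, for example in a small interpreter loop. The following is only a sketch under stated assumptions: the opcodes and the functions run_program and demo_program are hypothetical, the program is assumed to contain only valid opcodes and to end with OP_HALT, and the code needs a compiler that supports the GNU C extension (GCC or Clang).

```c
#include <stddef.h>

/* Hypothetical opcodes for a toy accumulator machine. */
enum { OP_INC, OP_DOUBLE, OP_HALT };

/* Requires GNU C: uses && label addresses and goto *expr. */
int run_program(const unsigned char *code)
{
    /* One label address per opcode, indexed by the opcode value. */
    static void *const dispatch[] = { &&op_inc, &&op_double, &&op_halt };
    int acc = 0;
    size_t pc = 0;

    goto *dispatch[code[pc++]];

op_inc:
    acc += 1;
    goto *dispatch[code[pc++]];
op_double:
    acc *= 2;
    goto *dispatch[code[pc++]];
op_halt:
    return acc;
}

/* Runs INC, INC, DOUBLE, HALT: (0 + 1 + 1) * 2 = 4. */
int demo_program(void)
{
    static const unsigned char code[] = { OP_INC, OP_INC, OP_DOUBLE, OP_HALT };
    return run_program(code);
}
```

Each handler jumps straight to the next handler through the table instead of returning to a central switch. The same logic can be written portably with a switch inside a loop.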
Similar to printMe() already given, here is my solution using a "jump table" to solve the same problem, except that it can do an arbitrary number of operations, in this case printf(). Note that the referenced labels must be local to the function.
int print_iterate( int count )
{
int i=0;
void * jump_table[2] = { &&start_label , &&stop_label };
start_label:
printf( "%d\n", ++i );
// using integer division: i/count will be 0 until count is reached (then it is 1)
goto *jump_table[ i/count ];
stop_label:
return 0;
}
Like others have stated, it's a GNU C extension (https://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html).
In addition to the above uses, there is a use in bypassing the return system and manually handling a function's epilogue.
While the use cases for this are few and far between, it would be useful in writing a fully C-based exception ABI. The exception ABI I wrote for a very legacy platform uses these to perform long jumps without a buffer. (Yes, I do reinstate the stack frame beforehand, and I make sure that the jump is safe.)
Additionally, it could be used for a "JSR" finally block, like in Java prior to Java 7, where prior to the return, an explicit return label is stored, and then the finally block is executed. The same applies prior to any exception being thrown or rethrown (the documentation does not say anything about it not being valid in GNU C++, but I would probably not use it in C++ at all).
In general, the syntax should not be used. If you need local jumps, use explicit gotos or actual control blocks; if you need non-local jumps, use longjmp if you have to, and exceptions in C++ where possible.
Never. It's not C. It is possible in "GNU C", but it's, as Paul commented, "one of the worst features of FORTRAN", "ported...into C", and thus should be Considered Harmful.

Why is some of my code being skipped?

My program is exhibiting some odd behavior when I step through it in the debugger. In the following excerpt, it checks pktNum != ~invPktNum and then proceeds directly to the second return 1; statement.
The debugger shows that pktNum is an unsigned char that is 0x01 and invPktNum is an unsigned char that is 0xFE.
/* Verify message integrity. */
if (pktNum != ~invPktNum) {
return 1;
}
ccrc = crc16_ccitt(msg, XModem_Block_Size);
if ( (((ccrc>>8) & 0xFF) != crcBuf[0])
|| ((ccrc & 0xFF) != crcBuf[1]) ) {
return 1;
}
The compiler has folded the two return 1 cases into the exact same code. Both if tests branch to the same assembly instruction. Each instruction can only be tagged with a single line number for the debugger, so you see that strange behavior. If you compile with -g and without -O (or even more explicitly use -O0) it will make distinct cases and things will be more clear.
Unary ! is logical-NOT. If the operand is 0 the result is 1, otherwise the result is 0. This means that !invPktNum is 0, so the if expression is true.
You are probably looking for unary ~, which is bitwise-NOT.
By the way, it may appear in a debugger as if the second return 1; is being executed rather than the first, because the compiler may have reordered the code and combined those two return 1; statements together (particularly if optimisation is enabled).
!(0xFE) is 0. Maybe what you wanted was ~(0xFE)?
Check compiler optimizations are definitely disabled for debug mode. (Just to be different to everybody else)
You're comparing an int to a bool. That's bad style to begin with, and some compilers will complain.
Maybe you mixed up ! and ~? !invPktNum will return false if invPktNum is non-false, and true if it's false. I'm pretty sure you meant ~invPktNum.
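A small sketch of how the two operators behave with the values from the question (0x01 and 0xFE). The function names are hypothetical, and the last case assumes an 8-bit unsigned char with a wider int. Note that ~ is applied after the operand is promoted to int, so the uncast complement also differs from pktNum.

```c
/* pkt != !inv : logical NOT yields 0 or 1. */
int mismatch_logical_not(unsigned char pkt, unsigned char inv)
{
    return pkt != !inv;
}

/* pkt != ~inv : inv is promoted to int before ~, so the high bits get set. */
int mismatch_bitwise_not(unsigned char pkt, unsigned char inv)
{
    return pkt != ~inv;
}

/* Converting the complement back to unsigned char drops the high bits. */
int mismatch_bitwise_not_cast(unsigned char pkt, unsigned char inv)
{
    return pkt != (unsigned char)~inv;
}
```

With pkt = 0x01 and inv = 0xFE, the first two functions report a mismatch (1) and only the cast version reports a match (0).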

Logical value of an assignment in C

while (curr_data[1] != (unsigned int)NULL &&
((curr_ptr = (void*)curr_data[1]) || 1))
Two part question.
What will (curr_ptr = (void*)curr_data[1]) evaluate to, logically. TRUE?
Also, I know it's rather hack-ish, but is the while statement legal C? I would have to go through great contortions to put the assignment elsewhere in the code, so it'd be really nice if I could leave it there, but if it's so egregious that it makes everyone's eyeballs burst into flames, I'll change it.
(curr_ptr = (void*)curr_data[1]) will evaluate to TRUE unless it is a null pointer.
Assuming that curr_data is an array of pointers, and what you want to do is to run the loop while the second of these pointers is not null, while assigning its value to curr_ptr, I would do:
while ((curr_ptr = (void*)curr_data[1]) != NULL) { ... }
To answer your questions:
It will evaluate to true if curr_ptr isn't set to NULL (i.e. curr_data[1] isn't 0).
I believe it's legal, but there are bigger problems with this line of code.
Anyway, I'm assuming you didn't write this code, because you're debating about leaving it in vs. taking it out. So I want you to find out who wrote this line of code and introduce them to a heavy blunt object.
(unsigned int)NULL is ridiculous. Why would you do this? This will probably be the same as just writing 0 (not sure if that's guaranteed by the standard).
What kind of data is in curr_data[1] if it's being cast to a pointer (and pointers are being cast to it)? If it's supposed to be holding a pointer as an integral type, you should use the type intptr_t or uintptr_t provided in <stdint.h> for that purpose (if your compiler doesn't support C99, ptrdiff_t may be an acceptable substitute).
The || 1 at the end seems to be redundant. If curr_ptr = (void*)curr_data[1] would have evaluated to false, we would have caught that in the first condition.
It may be a pain in the ass, but seriously reconsider rewriting this line. It looks like an entry in the IOCCC.
Assignments are expressions in C, so what you have works. Changing the ; to {} means the exact same thing and is much clearer, do that change at the very least. Assignments in conditions should be avoided when you have a clearer alternative (which is usually true), but if this is clearest in this place, then use it.
The result of an assignment is the value of the assigned-to object after the assignment. a = value will do the assignment and then evaluate to the new value of a. This is used to do things like a = b = 0.
To further clean up the code, there's no need for the void cast, and if this is chars, use '\0' (the null character) instead of NULL (which is supposed to be used with pointers only).
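As an illustration of an assignment used as a value, here is a minimal sketch that walks a linked list with the assignment inside the loop condition and uses a chained assignment for initialization. The struct node type and the functions sum_list and demo_sum are hypothetical and not taken from the question's code.

```c
#include <stddef.h>

/* Hypothetical node type for illustration. */
struct node {
    int value;
    struct node *next;
};

/* Sums the list and stores the number of nodes visited in *count_out. */
int sum_list(struct node *head, int *count_out)
{
    struct node *curr;
    struct node *next = head;
    int sum;

    /* Chained: (*count_out = 0) evaluates to 0, which is then assigned to sum. */
    sum = *count_out = 0;

    /* The condition assigns curr and tests the assigned value in one expression. */
    while ((curr = next) != NULL) {
        sum += curr->value;
        (*count_out)++;
        next = curr->next;
    }
    return sum;
}

/* Builds the list 1 -> 2 -> 3 and returns 1 if the sum is 6 over 3 nodes. */
int demo_sum(void)
{
    struct node c = { 3, NULL };
    struct node b = { 2, &c };
    struct node a = { 1, &b };
    int count;
    int sum = sum_list(&a, &count);

    return sum == 6 && count == 3;
}
```

The loop stops as soon as the assigned pointer is NULL, so no separate test before the assignment is needed.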
You wouldn't have to go through "great contortions", that is completely equivalent to
while (curr_data[1]) {
curr_ptr = (void *)curr_data[1];

Resources