I am not exactly sure what to title this, but this is the third time I have seen this phenomenon in a C macro.
#define sigemptyset(what) (*(what) = 0, 0)
^
eh? why not just ((*what) = 0)
Is there a point to that extra zero? To my understanding (1, 0, 0, 0), for example, would just evaluate to 1 (the first entry).
The comma operator yields the value of its right operand, not the left operand.
UPDATE: as pointed out by @Kninnug in the comments, sigemptyset is a POSIX function specified to return an int (specifications: here). Using (*(what) = 0, 0) guarantees that the macro yields an int even if *(what) has a type other than int (the sigemptyset argument should be of type sigset_t *).
Two things come to my mind with your macro definition:
If it had (void *) 0 as the right , operand, it could mimic a function that returns a null pointer:
#define sigemptyset(what) (*(what) = 0, (void *) 0)
int *p = sigemptyset(q);
It can be useful if all your other sig functions are supposed to return pointers.
The second thing that comes to my mind is to allow debugging by changing the 0 with a printf call when needed:
#define sigemptyset(what) (*(what) = 0, printf("sigemptyset\n"))
The following snippet is from the macOS SDK's signal.h:
#define sigemptyset(set) (*(set) = 0, 0)
I just wonder what the , 0) does.
The macro expands to a comma expression: the left operand, evaluated first, sets the object pointed to by set to 0, then the right operand is evaluated and its value is the value of the whole expression, hence: 0.
In other words, the macro behaves like a function that always succeeds, success indicated by the return value 0.
Except for the type checking, the macro is equivalent to:
#include <signal.h>
int sigemptyset(sigset_t *set) {
*set = 0;
return 0;
}
Note that the <signal.h> header file also contains a prototype for this function:
int sigemptyset(sigset_t *);
In your code, a call sigemptyset(&sigset) invokes the macro, but you can force a reference to the library function by writing (sigemptyset)(&sigset) or by taking a pointer to the function. The macro allows for inline expansion without a change of prototype. clang can perform inline expansion at link time for small functions, but not for functions defined inside dynamic libraries.
The same trick is used for other functions in the header file, for which the , 0) is necessary for the expression to evaluate to 0 with type int:
#define sigaddset(set, signo) (*(set) |= __sigbits(signo), 0)
#define sigdelset(set, signo) (*(set) &= ~__sigbits(signo), 0)
#define sigemptyset(set) (*(set) = 0, 0)
#define sigfillset(set) (*(set) = ~(sigset_t)0, 0)
In this expression
(*(set) = 0, 0)
the comma operator is used.
The result of the expression is 0.
As a side effect the object pointed to by the pointer set is set to 0.
From the C Standard (6.5.17 Comma operator)
2 The left operand of a comma operator is evaluated as a void
expression; there is a sequence point between its evaluation and that
of the right operand. Then the right operand is evaluated; the result
has its type and value.
void qsort (void *a, size_t n, size_t es, int (*compare)(const void *, const void *));
where a is the start address of the array, n is the number of elements in the array, and es is the size of one array element.
I read the C source code of qsort and found a part that I can't understand. The code is as follows.
#define SWAPINT(a,es) swaptype = ((char*)a- (char*)0 % sizeof(long) || \
es % sizeof(long) ? 2: es == sizeof(long)? 0 : 1
I interpret this macro by,
if(((char*)a- (char*)0)% sizeof(long))==1 || es % sizeof(long)==1)
swaptype = 2;
else if(es== sizeof(long))
swaptype = 0;
else
swaptype = 1;
But I don't understand why type conversion is implemented, (char*)a.
And what does this line mean?
(char*)a- (char*)0)% sizeof(long)==1
Wherever you found that code, you probably copied it incorrectly. I found some very similar code in libutil from Canu:
c.swaptype = ((char *)a - (char *)0) % sizeof(long) || \
es % sizeof(long) ? 2 : es == sizeof(long)? 0 : 1;
This code was likely illegitimately (because the terms of the copyright license are violated) copied from FreeBSD's libc:
//__FBSDID("$FreeBSD: src/lib/libc/stdlib/qsort.c,v 1.12 2002/09/10 02:04:49 wollman Exp $");
So I'm guessing you got it from a *BSD libc implementation. Indeed, FreeBSD's quicksort implementation contains the SWAPINIT macro (not SWAPINT):
#define SWAPINIT(TYPE, a, es) swaptype_ ## TYPE = \
((char *)a - (char *)0) % sizeof(TYPE) || \
es % sizeof(TYPE) ? 2 : es == sizeof(TYPE) ? 0 : 1;
After parsing, you should find that the above code is roughly the same as
condition_one = ((char *)a - (char *)0) % sizeof(long);
condition_two = es % sizeof(long);
condition_three = es == sizeof(long);
c.swaptype = (condition_one || condition_two) ? 2 : condition_three ? 0 : 1;
Note that condition_two, as a condition, is not the same as es % sizeof(long) == 1, but rather es % sizeof(long) != 0. Aside from that, your translation was correct.
The intent of these conditions seems to be as follows:
condition_one is true when a is not long-aligned.
condition_two is true when es is not a multiple of sizeof(long).
condition_three is true when es is exactly sizeof(long).
As a result,
swaptype == 2 is when you don't have enough guarantees about the elements to be clever about swapping,
swaptype == 1 is intended for arrays with elements that are aligned along long boundaries (note: but not necessarily aligned as longs!), and
swaptype == 0 is intended for arrays that match the previous description and whose elements are also long-sized.
There is an explicit type conversion in this case because a has type void*, for which pointer arithmetic is undefined. However, also note that ((char *)a - (char *)0) is undefined too:
When two pointers are subtracted, both shall point to elements of the same array object, or one past the last element of the array object; the result is the difference of the subscripts of the two array elements.
(C11 draft N1570, section 6.5.6, clause 9 on pages 93 and 94.)
It's not exactly spelled out in C11, but the null pointer is not part of the same array as the object pointed to by a, so the basic rules for pointer arithmetic are violated, so the behaviour is undefined.
The macro is trying to check for alignment portably in a language, C, which doesn't really allow for such a test. So we subtract the null pointer from our pointer to obtain an integer, then take that integer modulo the size of a long. If the result is zero, the data is long-aligned and we can access it as longs. If it is not, we can try some other scheme.
As remarked in the comments, the macro definition you present does not expand to valid C code because it involves computing (char*)0 % sizeof(long), where the left-hand operand of the % has type char *. That is not an integer type, but both operands of % are required to have integer type.
Additionally, the macro's expansion has unbalanced parentheses. That's not inherently wrong, but it makes that macro tricky to use. Furthermore, even where operator precedence yields a sensible result, usage of parentheses and extra whitespace can aid human interpretation of the code, at no penalty to execution speed, and negligible extra compilation cost.
So, I think the desired macro would be more like this:
#define SWAPINT(a,es) swaptype = ( \
((((char*)a - (char*)0) % sizeof(long)) || (es % sizeof(long))) \
? 2 \
: ((es == sizeof(long)) ? 0 : 1) \
)
I'd consider instead writing the penultimate line as
: (es != sizeof(long))
to reduce the complexity of the expression at a slight cost to its comprehensibility. In any event, the intent appears to be to set swaptype to:
2 if a is not aligned on an n-byte boundary, where n is the number of bytes in a long, or if es is not an integer multiple of the size of a long; otherwise
1 if es is unequal to the size of a long; otherwise
0
That's similar, but not identical, to your interpretation. Note, however, that even this code has undefined behavior because of (char*)a - (char*)0. Evaluating that difference has defined behavior only if both pointers point into, or just past the end of, the same object, and (char *)0 does not point (in)to or just past the end of any object.
You asked specifically:
But I don't understand why type conversion is implemented, (char*)a.
That is performed because pointer arithmetic is defined in terms of the pointed-to type, so (1), a conforming program cannot perform arithmetic with a void *, and (2) the code wants the result of the subtraction to be in the same units as the result of the sizeof operator (bytes).
And what does this line mean?
(char*)a- (char*)0)% sizeof(long)==1
That line does not appear in the macro you presented, and it is not a complete expression because of unbalanced parentheses. It appears to be trying to determine whether a points one past an n-byte boundary, where n is as defined above, but again, evaluating the pointer difference has undefined behavior. Note also that for an integer x, x % sizeof(long) == 1 evaluated in boolean context has different meaning than x % sizeof(long) evaluated in the same context. The latter makes more sense in the context you described.
I saw a program in C that had code like the following:
static void *arr[1] = {&& varOne,&& varTwo,&& varThree};
varOne: printf("One") ;
varTwo: printf("Two") ;
varThree: printf("Three") ;
I am confused about what the && does because there is nothing to the left of it. Does it evaluate as null by default? Or is this a special case?
Edit:
Added some more information to make the question/code more clear for my question.
Thank you all for the help. This was a case of the gcc specific extension.
It's a gcc-specific extension, a unary && operator that can be applied to a label name, yielding its address as a void* value.
As part of the extension, goto *ptr; is allowed where ptr is an expression of type void*.
It's documented here in the gcc manual.
You can get the address of a label defined in the current function (or
a containing function) with the unary operator &&. The value has
type void *. This value is a constant and can be used wherever a
constant of that type is valid. For example:
void *ptr;
/* ... */
ptr = &&foo;
To use these values, you need to be able to jump to one. This is done
with the computed goto statement, goto *exp;. For example,
goto *ptr;
Any expression of type void * is allowed.
As zwol points out in a comment, gcc uses && rather than the more obvious & because a label and an object with the same name can be visible simultaneously, making &foo potentially ambiguous if & means "address of label". Label names occupy their own namespace (not in the C++ sense), and can appear only in specific contexts: defined by a labeled-statement, as the target of a goto statement, or, for gcc, as the operand of unary &&.
This is a gcc extension, known as "Labels as Values". Link to gcc documentation.
In this extension, && is a unary operator that can be applied to a label. The result is a value of type void *. This value may later be dereferenced in a goto statement to cause execution to jump to that label. Also, pointer arithmetic is permitted on this value.
The label must be in the same function; or in an enclosing function in case the code is also using the gcc extension of "nested functions".
Here is a sample program where the feature is used to implement a state machine:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
int main(void)
{
void *tab[] = { &&foo, &&bar, &&qux };
// Alternative method
//ptrdiff_t otab[] = { &&foo - &&foo, &&bar - &&foo, &&qux - &&foo };
int i, state = 0;
srand(time(NULL));
for (i = 0; i < 10; ++i)
{
goto *tab[state];
//goto *(&&foo + otab[state]);
foo:
printf("Foo\n");
state = 2;
continue;
bar:
printf("Bar\n");
state = 0;
continue;
qux:
printf("Qux\n");
state = rand() % 3;
continue;
}
}
Compiling and execution:
$ gcc -o x x.c && ./x
Foo
Qux
Foo
Qux
Bar
Foo
Qux
Qux
Bar
Foo
I'm not aware of any operator that works this way in C.
Depending on the context, the ampersand in C can mean many different things.
Address-Of operator
Right before an lvalue, e.g.
int j;
int* ptr = &j;
In the code above, ptr stores the address of j; & in this context takes the address of an lvalue. The code would have made more sense to me if it had been written like this:
static int varOne;
static int varTwo;
static int varThree;
static void *arr[1][8432] = { { &varOne,&varTwo, &varThree } };
Logical AND
The logical AND operator is simpler. Unlike the operator above, it is a binary operator, meaning it requires a left and a right operand. It yields 1 (true) if and only if both operands are nonzero, and the right operand is evaluated only when the left one is nonzero.
bool flag = true;
bool flag2 = false;
if (flag && flag2) {
// Not evaluated
}
flag2 = true;
if (flag && flag2) {
// Evaluated
}
Bitwise AND
Another use of the ampersand in C is performing a bitwise AND. It is similar to the logical AND operator, except that it uses only one ampersand and performs the AND operation at the bit level.
Let's assume we have a number and that it maps to the binary representation shown below, the AND operation works like so:
0 0 0 0 0 0 1 0
1 0 0 1 0 1 1 0
---------------
0 0 0 0 0 0 1 0
In C++ land, things get more complicated. The ampersand can be placed after a type to denote a reference type (you can think of it as a less powerful but safer kind of pointer). Things get even more complicated with 1) rvalue references, when two ampersands are placed after a type, and 2) universal references, when two ampersands are placed after a template type or an auto-deduced type.
I think your code probably compiles only with your compiler due to an extension of some sort. I was thinking of this https://en.wikipedia.org/wiki/Digraphs_and_trigraphs#C but I doubt that's the case.
I've found C code that prints from 1 to 1000 without loops or conditionals :
But I don't understand how it works. Can anyone go through the code and explain each line?
#include <stdio.h>
#include <stdlib.h>
void main(int j) {
printf("%d\n", j);
(&main + (&exit - &main)*(j/1000))(j+1);
}
Don't ever write code like that.
For j<1000, j/1000 is zero (integer division). So:
(&main + (&exit - &main)*(j/1000))(j+1);
is equivalent to:
(&main + (&exit - &main)*0)(j+1);
Which is:
(&main)(j+1);
Which calls main with j+1.
If j == 1000, then the same line comes out as:
(&main + (&exit - &main)*1)(j+1);
Which boils down to
(&exit)(j+1);
Which is exit(j+1) and leaves the program.
(&exit)(j+1) and exit(j+1) are essentially the same thing - quoting C99 §6.3.2.1/4:
A function designator is an expression that has function type. Except when it is the
operand of the sizeof operator or the unary & operator, a function designator with
type "function returning type" is converted to an expression that has type "pointer to
function returning type".
exit is a function designator. Even without the unary & address-of operator, it is treated as a pointer to function. (The & just makes it explicit.)
And function calls are described in §6.5.2.2/1 and following:
The expression that denotes the called function shall have type pointer to function returning void or returning an object type other than an array type.
So exit(j+1) works because of the automatic conversion of the function type to a pointer-to-function type, and (&exit)(j+1) works as well with an explicit conversion to a pointer-to-function type.
That being said, the above code is not conforming (main takes either two arguments or none at all), and &exit - &main is, I believe, undefined according to §6.5.6/9:
When two pointers are subtracted, both shall point to elements of the same array object, or one past the last element of the array object; ...
The addition (&main + ...) would be valid in itself, and could be used, if the quantity added was zero, since §6.5.6/7 says:
For the purposes of these operators, a pointer to an object that is not an element of an
array behaves the same as a pointer to the first element of an array of length one with the
type of the object as its element type.
So adding zero to &main would be ok (but not much use).
It uses recursion, pointer arithmetic, and exploits the rounding behavior of integer division.
The j/1000 term rounds down to 0 for all j < 1000; once j reaches 1000, it evaluates to 1.
Now if you have a + (b - a) * n, where n is either 0 or 1, you end up with a if n == 0, and b if n == 1. Using &main (the address of main()) and &exit for a and b, the term (&main + (&exit - &main) * (j/1000)) returns &main when j is below 1000, &exit otherwise. The resulting function pointer is then fed the argument j+1.
This whole construct results in recursive behavior: while j is below 1000, main calls itself recursively; when j reaches 1000, it calls exit instead, making the program exit with exit code 1001 (which is kind of dirty, but works).
https://stackoverflow.com/a/7937813/6607497 explains it all, but for the impatient here is the equivalent (readable) code:
#include <stdio.h>
#include <stdlib.h>
void main(int j) {
printf("%d\n", j);
if (j/1000 == 0)
main(j+1);
else
exit(j+1);
}
So I guess it's obvious how it works.
The only real trick being used is the "computed goto" (&main + (&exit - &main)*(j/1000)), evaluating to either main while j/1000 is zero, or exit otherwise (actually if it's 1).
Maybe also note that the program is misusing argc as j, so it will count differently when you pass arguments to the program, and it will most likely crash when you add more than 2000 parameters...
Could someone please explain what this does and how it is legal C code? I found this line in this code: http://code.google.com/p/compression-code/downloads/list, which is a C implementation of the Vitter algorithm for Adaptive Huffman Coding
ArcChar = ArcBit = 0;
From the function:
void arc_put1 (unsigned bit)
{
ArcChar <<= 1;
if( bit )
ArcChar |= 1;
if( ++ArcBit < 8 )
return;
putc (ArcChar, Out);
ArcChar = ArcBit = 0;
}
ArcChar is an int and ArcBit is an unsigned char
The value of the expression (a = b) is the value of b, so you can chain them this way. They are also right-associative, so it all works out.
Essentially
ArcChar = ArcBit = 0;
is (approximately [1]) the same as
ArcBit = 0;
ArcChar = 0;
since the value of the first assignment is the assigned value, thus 0.
Regarding the types, even though ArcBit is an unsigned char the result of the assignment will get widened to int.
[1] It's not exactly the same, though, as R.. points out in a comment below.
ArcChar = ArcBit = 0;
Assignment is right-associative, so it's equivalent to:
ArcChar = (ArcBit = 0);
The result of ArcBit = 0 is the newly assigned value, that is, 0, so it makes sense to assign that 0 to ArcChar.
It sets both variables to zero.
int i, j;
i = j = 0;
The same as writing
int i, j;
j = 0;
i = j;
or writing
int i, j;
i = 0;
j = 0;
That is just chaining of the assignment operator. The standard says in 6.5.16 Assignment operators:
An assignment operator shall have a modifiable lvalue as its left operand.
An assignment operator stores a value in the object designated by the left operand. An
assignment expression has the value of the left operand after the assignment, but is not an
lvalue. The type of an assignment expression is the type of the left operand unless the
left operand has qualified type, in which case it is the unqualified version of the type of
the left operand. The side effect of updating the stored value of the left operand shall
occur between the previous and the next sequence point.
So you may do something like:
a=b=2; // ok
But not this:
a=2=b; // error
An assignment operation (a = b) itself returns an rvalue, which can be further assigned to another lvalue; c = (a = b). In the end, both a and c will have the value of b.
It assigns 0 to ArcBit, then assigns the value of the expression ArcBit = 0 (i.e. 0) to ArcChar.
In some languages, assignments are statements: they cause some action to take place, but they don't have a value themselves. For example, in Python [1] it's forbidden to write
x = (y = 10) + 5
because the assignment y = 10 can't be used where a value is expected.
However, C is one of many languages where assignments are expressions: they produce a value, as well as any other effects they might have. Their value is the value that is being assigned. The above line of code would be legal in C.
The use of two equals signs on one line is interpreted like this:
ArcChar = (ArcBit = 0);
That is: ArcChar is being assigned the value of ArcBit = 0, which is 0, so both variables end up being 0.
[1] x = y = 0 is actually legal in Python, but it's considered a special case of the assignment statement, and trying to do anything more complicated with assignments will fail.
Assignment in C is an expression, not a statement. You can also freely assign values of different sizes (unsigned char to int and vice versa). Welcome to the C programming language :)
You can do this:
http://en.wikibooks.org/wiki/C_Programming/Variables
Moreover,
[an int] = 0; is possible.
[a char] = 0; is possible too.
ArcBit and ArcChar both equal 0.
As Hasturkun said, this is due to operator associativity order
C Operator Precedence and Associativity