I was reading some code today by someone I consider to be a reasonable programmer, and I noticed they used quit = ~0 to set a loop-quit variable.
Is there any compelling reason to do this rather than simply quit = 1;?
I'm mostly just curious, before I go ahead and change it. Thanks!
Example:
while (!quit) {
    ...;
    if (!strcmp(s, "q"))
        quit = ~0;
}
There is no strong reason for that, unless some other code tests quit differently, such as testing some other bit. !0 is one, but ~0 is -1 on two's-complement machines, which is to say all modern architectures.
On some architectures ~0 is faster than !0, though that should be optimized away by any decent compiler.
~0 is usually -1, while !0 is defined to be 1.
Of course, !~0 and !!0 are both 0, so there is no compelling reason to use one or the other, aside from the fact that ~0 is non-idiomatic (meaning that people won't know what the heck you're doing).
In C, ~ is the unary operator for ones' complement, which flips every bit to its opposite state. So technically the code works, because the if() clause only tests for 0 (false) versus anything non-zero (true). Frankly, I think this is overkill for evaluating true or false. I'd only consider using it if I were truly doing evaluations on a bitwise basis. Maybe an argument could be made for performance, but I somehow doubt it.
Related
I've occasionally noticed some C code insisting on using 0 - x to get the additive complement of x, rather than writing -x. Now, I suppose these are not equivalent for types smaller in size than int (edit: Nope, apparently equivalent even then), but otherwise - is there some benefit to the former rather than the latter form?
tl;dr: 0-x is useful for scrubbing the sign of floating-point zero.
(As @Deduplicator points out in a comment:)
Many of us tend to forget that, in floating-point types, we have both a "positive zero" and a "negative zero" value - flipping the sign bit on and off leaves the same mantissa and exponent.
Well, it turns out that the two expressions behave differently on positive-signed zero, and the same on negative-signed zero, as per the following:
value of x | value of 0-x | value of -x
-.0        | 0            | 0
0          | 0            | -.0
So, when x is of a floating-point type,
If you want to "forget the sign of zero", use 0-x.
If you want to "keep the sign of zero", use -x.
For integer types it shouldn't matter.
On the other hand, as @NateEldredge points out, the expressions should be equivalent on small integer types, due to integer promotion: -x translates into a promotion of x to int, followed by applying the minus sign.
There is no technical reason to do this today. At least not with integers. And at least not in a way that a sane (according to some arbitrary definition) coder would use. Sure, it could be the case that it causes a cast. I'm actually not 100% sure, but in that case I would use an explicit cast instead to clearly communicate the intention.
As M.M pointed out, there were reasons in the K&R days, when =- was equivalent to -=. This had the effect that x=-y meant x=x-y rather than x=0-y. That was an undesirable effect, so the feature was removed.
Today, the reason would be readability, especially if you're writing a mathematical formula and want to point out that a parameter is zero. One example would be the distance formula: the distance from (x,y) to the origin is sqrt(pow(0-x, 2) + pow(0-y, 2)).
I recently had an interview where I had to propose a function that checks whether all bits from a uint32_t are set or not.
I wrote the following code :
int checkStatus(uint32_t val) {
return val == UINT32_MAX;
}
I assumed it would return 0 if any bit isn't set, and 1 if the two values are the same.
As I didn't see this solution anywhere else, I assume it's wrong. But I don't understand why. When I did it, I was thinking : "if all bits are set to 1, then the value should be the maximum unsigned integer represented otherwise not." Could you tell me if it was incorrect and why?
The problem with using UINT32_MAX here is that "maximum" is an arithmetic concept, not a bitwise logical concept. Although UINT32_MAX does, in fact, represent the (unsigned) number 0xFFFFFFFF in most scenarios, this is just a consequence of how unsigned numbers are represented in binary.
Using UINT32_MAX here has the potential to mislead the reader, because it has this connotation of arithmetic, rather than of bitwise manipulation.
In C programming, we're all used to representing bit patterns as hex numbers. The number 0xFFFFFFFF immediately expresses the notion that all bits are set, in a way that UINT32_MAX does not. Similarly, it may not be appropriate to use the value 0xFFFFFFFF to represent "minus 1" in an arithmetic expression, even if the values are, in practice, the same.
Because I make most of my living looking for errors in other people's code, I'm acutely aware of the differences in expressive power between different ways of writing the same thing. Sometimes we have to write inexpressive things for reasons of efficiency; but I don't think we should do it just for the sake of it.
Suppose I have an integer that is a power of 2, eg. 1024:
int a = 1 << 10; // 1024; works with any power of 2
Now I want to check whether another integer b is the same as a. Which is faster/better (especially on weak embedded systems):
if (b == a) {}
or
if (b & a) {}
?
Sorry if this is a noob question, but couldn't find an answer using the search.
edit: thanks for many insightful answers. I could select only one of them, but all of them are welcome.
These operations are not even equivalent: a & b is false when both a and b are 0, and it can be true when a and b merely share a set bit without being equal.
So I'd suggest expressing the semantics that you want (i.e. a == b) and letting the compiler do the optimization.
If you then measure that you have performance issues at that point, you can start analyzing/optimizing...
The short answer is this - it depends on what sort of things you're comparing. However, in this case, I'll assume that you're comparing two variables to each other (as opposed to a variable and an immediate, etc.)
This website, although rather old, studied how many clock cycles different instructions took on the x86 platform. The two instructions we're interested in here are the "AND" instruction and the "CMP" instruction (which the compiler uses for & and == respectively). What we can see here is that both of these instructions take about 1/3 of a cycle - that is to say, you can execute 3 of them in 1 cycle on average. Compare this to the "DIV" instruction which (in 1996) took 23 cycles to execute.
However, this omits one important detail. An "AND" instruction is not sufficient to complete the behavior you're looking for. In fact, a brief compilation on x86_64 suggests that you need both an "AND" and a "TEST" instruction for the "&" version, while "==" simply uses the "CMP" instruction. Because all these instructions are otherwise equivalent in IPC, the "==" will in fact be slightly faster...as of 1996.
Nowadays, processors optimize so well at the bare metal layer that you're unlikely to notice a difference. That said, if you wanted to see for sure...simply write a test program and find out for yourself.
As noted above though, even in the case that you have a power of 2, these instructions are still not equivalent, since it doesn't work for 0. Well...I guess technically zero ISN'T a power of 2. :) However you want to spin it though, use "==".
An X86 CPU sets a flag according to how the result of any operation compares to zero.
For the ==, your compiler will either use a dedicated compare instruction or a subtraction, setting this flag in both cases. The if() is then implemented by a jump that is conditional on this bit.
For the &, another instruction is used: the bitwise AND instruction. That too sets the flag appropriately, so again the next instruction will be the conditional branch.
So, the question boils down to: Is there a performance difference between a subtraction and a bitwise and instruction? And the answer is "no" on any sane architecture. Both instructions use the same ALU, both set the same flags, and this ALU is typically designed to perform a subtraction in a single clock cycle.
Bottom line: Write readable code, and don't try to microoptimize what cannot be optimized.
I just found legacy code which tests a flag like this:
if( some_state & SOME_FLAG )
So far, so good!
But further in code, I see an improper negation
if( ! some_state & SOME_FLAG )
My understanding is that it is interpreted as (! some_state) & SOME_FLAG which is probably a bug, and gcc logically barks with -Wlogical-not-parentheses...
Though it could conceivably have worked in the past, if some legacy compiler ever implemented !some_state as ~some_state. Does anyone know whether that was ever the case?
EDIT
some_state is declared as int (presumably 32 bits, two's complement on the target architecture).
SOME_FLAG is a constant with a single bit set, 0x00040000, so (SOME_FLAG & 1) == 0
Logical negation and bitwise negation have never been equivalent. No conforming compiler could have implemented one as the other. For example, the bitwise negation of 1 is not 0, so ~1 != !1.
It is true that the expression ! some_state & SOME_FLAG is equivalent to (! some_state) & SOME_FLAG because logical negation has higher precedence than bitwise and. That is indeed suspicious, but the original code is not necessarily in error. In any case, it is more likely that the program is buggy in this regard than that any C implementation evaluated the original expression differently than the current standard requires, even prior to standardization.
Since the expressions (! some_state) & SOME_FLAG and !(some_state & SOME_FLAG) will sometimes evaluate to the same value -- especially if SOME_FLAG happens to expand to 1 -- it is also possible that even though they are inequivalent, their differences do not manifest during actual execution of the program.
While there was no standard before 1989, and thus compilers could do things as they wished, no compiler to my knowledge has ever done this; changing the meaning of operators wouldn't be a smart call if you want people to use your compiler.
There's very little reason to write an expression like (!foo & FLAG_BAR); the result is just !foo if FLAG_BAR is odd or always zero if it is even. What you've found is almost certainly just a bug.
It would not be possible for a legacy compiler to implement ! as bitwise negation, because such approach would produce incorrect results in situations when the value being negated is outside the {0, 0xFF...FF} set.
The standard requires !x to yield zero for any non-zero value of x. If ! were implemented as ~, then !1 would yield 0xFF..FE, which is non-zero.
The only situation when the legacy code would have worked as intended is when SOME_FLAG is set to 1.
Let's start with the most interesting (and least obvious) part: gcc logically barks with -Wlogical-not-parentheses. What does this mean?
C has two different operators that have similar looking characters (but different behaviour and intended for very different purposes) - the & which is a bitwise AND, and && which is a boolean AND. Unfortunately this led to typos, in the same way that typing = when you meant == can cause problems, so some compilers (GCC) decided to warn people about "& without parenthesis used as a condition" (even though it's perfectly legal) to reduce the risk of typos.
Now...
You're showing code that uses & (and not showing code that uses &&). This implies that some_state is not a boolean but a number. More specifically, it implies that each bit in some_state may be completely independent and unrelated.
For an example of this, let's pretend that we're implementing a Pacman game and need a nice compact way to store the map for each level. We decide that each tile in the map might be a wall or not, might be a collected dot or not, might be power pill or not, and might be a cherry or not. Someone suggests that this can be an array of bytes, like this (assuming the map is 30 tiles wide and 20 tiles high):
#define IS_WALL 0x01
#define HAS_DOT 0x02
#define HAS_POWER_PILL 0x04
#define HAS_CHERRY 0x08
uint8_t level1_map[20][30] = { ..... };
If we want to know if a tile happens to be safe to move into (no wall) we could do this:
if( (level1_map[y][x] & IS_WALL) == 0 ) {
For the opposite, if we want to know if a tile is a wall we could do any of these:
if( (level1_map[y][x] & IS_WALL) != 0 ) {
if( !(level1_map[y][x] & IS_WALL) == 0 ) {
if( (level1_map[y][x] & IS_WALL) == IS_WALL ) {
...because it makes no difference which one it is. (Note the parentheses: & has lower precedence than == and !=, so leaving them out would change the meaning.)
Of course (to avoid the risk of typos) GCC might (or might not) warn about some of these.
Let's take a simple example of two lines supposedly doing the same thing:
if (value >= 128 || value < 0)
...
or
if (value & ~127)
...
Say ifs are costly in a loop of thousands of iterations - is it better to keep the traditional C syntax, or to find a binary-optimized alternative where possible?
I would use the first statement, with the traditional syntax, as it is more readable.
The second statement is hard on the eyes.
Have some care for the programmers who will use the code after you.
In 99 cases out of 100, do the one which is more readable and expresses your intent better.
In theory, compilers will do this sort of optimization for you. In practice, they might not. This particular example is a little bit subtle, because the two are not equivalent unless you make some assumptions about value and about whether signed arithmetic is two's complement on your target platform.
Use whichever you find more readable. If and when you have evidence that the performance of this particular test is critical, use whatever gives you the best performance. Personally, I would probably write:
if ((unsigned int)value >= 128U)
because that's more intuitive to me, and more likely to get handled well by most compilers I've worked with.
It depends on how/where the check is. If the check is done once during program startup to validate a command-line parameter, then the performance issue is completely moot and you should use whatever is more natural.
On the other hand, if the check was inside some inner loop that is happening millions of times a second, then it may matter. But don't assume one will be better; you should create both versions and time them to see if there is any measurable difference between the two.