Why is it that '0 is false, but 'False is true? - hy

I was playing around with symbols and was surprised to see that:
hy 0.18.0 using CPython(default) 3.7.3 on Linux
=> (bool '0)
False
=> (bool 'False)
True
=>
Is that a design decision? What is the best way to represent boolean values in Hy?

'0 isn't a symbol; it's a HyInteger, which inherits from int and behaves like an int in many ways. In particular, it uses int's __bool__ method.
'False is indeed a symbol (HySymbol), but most operations on a symbol, including bool, don't try to evaluate the symbol. Instead, they treat it like a string. At least for the time being, HySymbol inherits from str. So, bool on any nonempty symbol returns True. For the same reason, (+ 'x 'y) returns the string "xy" even if you've set the variables x and y to numbers. If you want to Booleanize the value of a variable represented by a symbol, rather than the symbol itself, say (bool (hy.eval 'False)).
What is the best way to represent boolean values in Hy?
With a plain old bool, as in Python.

Related

How do I validate an enum value read from a file?

If I am reading binary values from a file in C, then an integer that is supposed to be a member of an enum can be checked manually by looping through the enum itself and verifying that the integer is one of those values, but this seems like a somewhat tedious process. If I just cast the read value to the enum, then I assume some kind of runtime error will occur if the value is invalid.
Is there a better method of validating the enum than doing a manual check loop?
Note that in my case, the enum(s) in question do not necessarily have consecutive values, so min/max checking is not a solution.
In C, all enums are actually integral types.
So any value of that integral type is a valid value for your enum.
If you are careful and set up the enum labels so they are consecutive (the default is consecutive from 0), it's a simple case of checking if the value from the file is in that range. Otherwise, yes, it's tedious.
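For the non-consecutive case, one common alternative to hand-checking every value is to let a switch statement enumerate the valid members; here is a minimal sketch (the enum and its values below are invented for illustration):

enum color { COLOR_RED = 3, COLOR_GREEN = 7, COLOR_BLUE = 42 };

/* Returns 1 if value names a member of enum color, 0 otherwise.
   Adding a new member means adding exactly one case here. */
static int is_valid_color(int value)
{
    switch (value) {
    case COLOR_RED:
    case COLOR_GREEN:
    case COLOR_BLUE:
        return 1;
    default:
        return 0;
    }
}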
enum in C works like an integer, and so it can be forced to any value by any kind of read function taking a pointer, or directly casting it from integer types.
If the enum has only sequential values, some programs have a max enum value for their enum. These can either have explicit values, or have the implicit values which will always start from 0 and go up sequentially. This way they can just check the value is in the allowed range (0 to max - 1), rather than checking it for every allowed value.
#include <stdio.h>

typedef enum foo {
    foo_a,
    foo_b,
    foo_c,
    foo_max // last
} foo;

int main(void)
{
    foo x = (foo)88; // from somewhere
    if (x >= 0 && x < foo_max)
        printf("valid\n");
    else
        printf("invalid\n");
    return 0;
}

Why the linux kernel uses double logical negations instead of casts to bools?

Given that x is a variable of type int with the number 5 as its value, consider the following statement:
int y = !!x;
This is what I think happens: x is implicitly converted to a bool, the first negation is applied, and then the second negation, so a conversion and two negations.
My question is: isn't just casting to bool (executing int y = (bool)x; instead of int y = !!x) faster than using the double negation, since you save the two negations?
I might be wrong because I see the double negation a lot in the Linux kernel, but I don't understand where my intuition goes wrong, maybe you can help me out.
There was no bool type when Linux was first written. The C language treated everything that was not zero as true in Boolean expressions. So 7, -2 and 0xFF are all "true". No bool type to cast to. The double negation trick ensures the result is either zero or whatever bit pattern the compiler writers chose to represent true in Boolean expressions. When you're debugging code and looking at memory and register values, it's easier to recognize true values when they all have the same bit patterns.
Addendum: According to the C89 draft standard, section 3.3.3.3:
The result of the logical negation operator ! is 0 if the value of its operand compares unequal to 0, 1 if the value of its operand compares equal to 0. The result has type int. The expression !E is equivalent to (0==E).
So while there was no Boolean type in the early days of the Linux OS, the double negation would have yielded either a 0 or a 1 (thanks to Gox for pointing this out), depending on the truthiness of the expression. In other words, any value in the ranges INT_MIN..-1 and 1..INT_MAX would have yielded 1, and the zero bit pattern is self-explanatory.
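To make that normalization concrete, here is a small sketch (the values are chosen to echo the examples above):

#include <stdio.h>

int main(void)
{
    int values[] = { 7, -2, 0xFF, 0 };
    int i;

    /* !! maps every non-zero value to 1 and zero to 0 */
    for (i = 0; i < 4; i++)
        printf("!!(%d) == %d\n", values[i], !!values[i]);
    return 0;
}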
C, unlike many other languages, does not have a built-in bool type. In C, bool is defined in stdbool.h, which is not included in many C projects. The Linux kernel is one such project, and it would be a pain to go through the Linux code now and update everything to use bool. That is the reason why the Linux kernel does not use bool.
Why !!x? This is done to ensure that the value of y is either 1 or 0. As an example, if you have this code:
int x = 5;
int y = !!x;
We know that any non-zero value in C means true. So the code above breaks down to y = !!(5), then y = !(0), and then y = 1.
EDIT:
One more thing: I just saw that the OP mentioned casting to bool. In C, bool is not a built-in base type; it is a type defined in stdbool.h, so compilers do not convert integers to a Boolean type unless that definition is available.
EDIT 2:
To explain further: in C++, Java, and other languages, when you write bool a = false you do not have to include headers, compile other libraries, or define a bool type yourself; it is already built into the compiler, whereas in C you have to.
EDIT 3:
bool is not the same as _Bool.
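For reference, here is a minimal sketch of what the C99 header provides; bool, true, and false are macros from stdbool.h, and without the include only the _Bool keyword is available:

#include <stdbool.h>   /* defines bool (as _Bool), true (1), false (0) */
#include <stdio.h>

int main(void)
{
    bool a = false;
    a = 42;             /* conversion to _Bool normalizes to 0 or 1 */
    printf("%d\n", a);  /* prints 1 */
    return 0;
}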
The only reason I can imagine is that this saves some typing (7 chars vs 2 chars).
As @jwdonahue and @Gox have already mentioned, this is not the correct reason. C did not have bool when the Linux kernel was written, so casting to bool was not an option.
As far as efficiency goes, both are equivalent because compilers can easily figure this out. See https://godbolt.org/g/ySo6K1
#include <stdbool.h>

bool cast_to_bool_1(int x) {
    return !!x;
}

bool cast_to_bool_2(int x) {
    return (bool) x;
}
Both the functions compile to the same assembly which uses the test instruction to check if the argument is zero or not.
test edi, edi // checks if the passed argument is 0 or not
setne al // set al to 0 or 1 based on the previous comparison
ret // returns the result

How to insert booleans into a bitfield in C89

As far as I understand, in C89 all boolean expressions are of type integer. This also means that function parameters that represent bool usually get represented by an int parameter.
Now my question is how I can most ideally take such an int and put it into a bitfield so that it only occupies one bit (let's ignore padding for now).
The first thing here is which type to use. Using int or any other signed type doesn't work, because when there is only one bit, only -1 and 0 can be represented (at least with two's complement).
While -1 technically evaluates as true, this is not ideal because actually assigning it without undefined behavior can be quite tricky from what I understand.
So an unsigned type should be chosen for the bitfield:
typedef struct bitfield_with_boolean {
    unsigned int boolean : 1;
} bitfield_with_boolean;
The next question is then how to assign to that bitfield. Just taking an int or similar won't work, because the conversion truncates the value: if the lowest bit wasn't set, a value that would previously evaluate to true would now suddenly evaluate to false.
As far as I understand, the boolean operators are guaranteed to always return either 0 or 1. So my idea to solve this problem would be something like this:
#define to_boolean(expression) (!!(expression))
So in order to assign the value I would do:
bitfield_with_boolean to_bitfield(int boolean) {
    bitfield_with_boolean bitfield = {to_boolean(boolean)};
    return bitfield;
}
Is that correct, and/or is there a better way?
NOTE:
I know the problem is completely solved starting with C99, because converting to _Bool is guaranteed to always result in either 0 or 1, where 0 is the result only if the input had the value 0.
Yes, your solution is correct. However, I wouldn't hide it behind a macro, and I wouldn't name a macro using all_lowercase letters.
!!var is sufficiently idiomatic that I'd say it's fine in code.
Alternatives include var != 0 and, of course, var ? 1 : 0.
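Putting the question's pieces together, here is a minimal sketch of how the macro, the struct, and the helper might be used (the member is assigned rather than brace-initialized so the helper stays within C89's initializer rules; the main function is added only for illustration):

#include <stdio.h>

#define to_boolean(expression) (!!(expression))

typedef struct bitfield_with_boolean {
    unsigned int boolean : 1;
} bitfield_with_boolean;

static bitfield_with_boolean to_bitfield(int boolean)
{
    bitfield_with_boolean bitfield;
    bitfield.boolean = to_boolean(boolean);
    return bitfield;
}

int main(void)
{
    bitfield_with_boolean a = to_bitfield(256); /* lowest bit clear, but still truthy */
    bitfield_with_boolean b = to_bitfield(0);

    printf("%d %d\n", (int)a.boolean, (int)b.boolean); /* prints "1 0" */
    return 0;
}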

Why #define TRUE (1==1) in a C boolean macro instead of simply as 1?

I've seen definitions in C
#define TRUE (1==1)
#define FALSE (!TRUE)
Is this necessary? What's the benefit over simply defining TRUE as 1, and FALSE as 0?
This approach will use an actual boolean type (and resolve to true and false) if the compiler supports one (specifically, C++).
However, it would be better to check whether C++ is in use (via the __cplusplus macro) and actually use true and false there.
In a C compiler, these definitions are equivalent to 1 and 0.
(Note that removing the parentheses would break that because of operator precedence.)
The answer is portability. The numeric values of TRUE and FALSE aren't important. What is important is that a statement like if (1 < 2) evaluates to if (TRUE) and a statement like if (1 > 2) evaluates to if (FALSE).
Granted, in C, (1 < 2) evaluates to 1 and (1 > 2) evaluates to 0, so as others have said, there's no practical difference as far as the compiler is concerned. But by letting the compiler define TRUE and FALSE according to its own rules, you're making their meanings explicit to programmers, and you're guaranteeing consistency within your program and any other library (assuming the other library follows C standards ... you'd be amazed).
Some History
Some BASICs defined FALSE as 0 and TRUE as -1. Like many modern languages, they interpreted any non-zero value as TRUE, but they evaluated boolean expressions that were true as -1. Their NOT operation was implemented by adding 1 and flipping the sign, because it was efficient to do it that way. So 'NOT x' became -(x+1). A side effect of this is that a value like 5 evaluates to TRUE, but NOT 5 evaluates to -6, which is also TRUE! Finding this sort of bug is not fun.
Best Practices
Given the de facto rules that zero is interpreted as FALSE and any non-zero value is interpreted as TRUE, you should never compare boolean-looking expressions to TRUE or FALSE. Examples:
if (thisValue == FALSE) // Don't do this!
if (thatValue == TRUE) // Or this!
if (otherValue != TRUE) // Whatever you do, don't do this!
Why? Because many programmers use the shortcut of treating ints as bools. They aren't the same, but compilers generally allow it. So, for example, it's perfectly legal to write
if (strcmp(yourString, myString) == TRUE) // Wrong!!!
That looks legitimate, and the compiler will happily accept it, but it probably doesn't do what you'd want. That's because the return value of strcmp() is
0 if yourString == myString
<0 if yourString < myString
>0 if yourString > myString
So the condition above can hold only when yourString > myString, and even then only if strcmp() happens to return exactly 1; it certainly does not test for equality.
The right way to do this is either
// Valid, but still treats int as bool.
if (strcmp(yourString, myString))
or
// Better: linguistically clear, compiler will optimize.
if (strcmp(yourString, myString) != 0)
Similarly:
if (someBoolValue == FALSE) // Redundant.
if (!someBoolValue) // Better.
return (x > 0) ? TRUE : FALSE; // You're fired.
return (x > 0); // Simpler, clearer, correct.
if (ptr == NULL) // Perfect: compares pointers.
if (!ptr) // Sleazy, but short and valid.
if (ptr == FALSE) // Whatisthisidonteven.
You'll often find some of these "bad examples" in production code, and many experienced programmers swear by them: they work, some are shorter than their (pedantically?) correct alternatives, and the idioms are almost universally recognized. But consider: the "right" versions are no less efficient, they're guaranteed to be portable, they'll pass even the strictest linters, and even new programmers will understand them.
Isn't that worth it?
The (1 == 1) trick is useful for defining TRUE in a way that is transparent to C, yet provides better typing in C++. The same code can be interpreted as C or C++ if you are writing in a dialect called "Clean C" (which compiles either as C or C++) or if you are writing API header files that can be used by C or C++ programmers.
In C translation units, 1 == 1 has exactly the same meaning as 1; and 1 == 0 has the same meaning as 0. However, in the C++ translation units, 1 == 1 has type bool. So the TRUE macro defined that way integrates better into C++.
An example of how it integrates better: if, for instance, a function foo has overloads for int and for bool, then foo(TRUE) will choose the bool overload. If TRUE is just defined as 1, it won't work as nicely in C++: foo(TRUE) will pick the int overload.
Of course, C99 introduced bool, true, and false and these can be used in header files that work with C99 and with C.
However:
this practice of defining TRUE and FALSE as (0==0) and (1==0) predates C99.
there are still good reasons to stay away from C99 and work with C90.
If you're working in a mixed C and C++ project, and don't want C99, define the lower-case true, false and bool instead.
#ifndef __cplusplus
typedef int bool;
#define true (0==0)
#define false (!true)
#endif
That being said, the 0==0 trick was (is?) used by some programmers even in code that was never intended to interoperate with C++ in any way. That doesn't buy anything and suggests that the programmer has a misunderstanding of how booleans work in C.
In case the C++ explanation wasn't clear, here is a test program:
#include <cstdio>

void foo(bool x)
{
    std::puts("bool");
}

void foo(int x)
{
    std::puts("int");
}

int main()
{
    foo(1 == 1);
    foo(1);
    return 0;
}
The output:
bool
int
As to the question from the comments about how overloaded C++ functions are relevant to mixed C and C++ programming: they just illustrate a type difference. A valid reason for wanting a true constant to be bool when compiled as C++ is clean diagnostics. At its highest warning levels, a C++ compiler might warn us about a conversion if we pass an integer as a bool parameter. One reason for writing in Clean C is not only that our code is more portable (since it is understood by C++ compilers, not only C compilers), but also that we can benefit from the diagnostic opinions of C++ compilers.
#define TRUE (1==1)
#define FALSE (!TRUE)
is equivalent to
#define TRUE 1
#define FALSE 0
in C.
The result of the relational operators is 0 or 1. 1==1 is guaranteed to be evaluated to 1 and !(1==1) is guaranteed to be evaluated to 0.
There is absolutely no reason to use the first form. Note that the first form is however not less efficient as on nearly all compilers a constant expression is evaluated at compile time rather than at run-time. This is allowed according to this rule:
(C99, 6.6p2) "A constant expression can be evaluated during translation rather than runtime, and accordingly may be used in any place that a constant may be."
PC-Lint will even issue a message (506, constant value boolean) if you don't use a literal for TRUE and FALSE macros:
For C, TRUE should be defined to be 1. However, other languages use quantities other than 1 so some programmers feel that !0 is playing it safe.
Also in C99, the stdbool.h definitions for boolean macros true and false directly use literals:
#define true 1
#define false 0
Aside from C++ (already mentioned), another benefit is for static analysis tools. The compiler will do away with any inefficiencies, but a static analyser can use its own abstract types to distinguish between comparison results and other integer types, so it knows implicitly that TRUE must be the result of a comparison and should not be assumed to be compatible with an integer.
Obviously C says that they are compatible, but you may choose to prohibit deliberate use of that feature to help highlight bugs -- for example, where somebody might have confused & and &&, or bungled their operator precedence.
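As a small illustration of the kind of bug such a rule highlights, consider comparing a masked flag word against TRUE (the flag name and value here are invented):

#define TRUE  (1==1)
#define FALSE (!TRUE)

#define FLAG_READY 0x02   /* invented flag value */

/* Bug: (flags & FLAG_READY) is either 0 or 2, never 1,
   so the comparison against TRUE is always false. */
int is_ready_wrong(int flags)
{
    return (flags & FLAG_READY) == TRUE;
}

/* Correct: compare against zero instead of TRUE. */
int is_ready_right(int flags)
{
    return (flags & FLAG_READY) != 0;
}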
There is no practical difference: 0 evaluates to false and 1 evaluates to true. Whether you use a boolean expression (1 == 1) or plain 1 to define true makes no difference; both evaluate to an int.
Notice that the C standard library provides a specific header for defining booleans: stdbool.h.
We don't necessarily know what exact value TRUE is equal to, and compilers can have their own definitions. So what you do here is use the compiler's internal definition. This is not always necessary if you have good programming habits, but it can avoid problems with some bad coding styles, for example:
if ((a > b) == TRUE)
This could be a disaster if you manually define TRUE as 1 while the internal value of TRUE is something else.
Typically in the C programming language, 1 is defined as true and 0 is defined as false, which is why you see the following quite often:
#define TRUE 1
#define FALSE 0
However, any number not equal to 0 would also evaluate to true in a conditional statement. Therefore, by using the definitions below:
#define TRUE (1==1)
#define FALSE (!TRUE)
you can explicitly show that you are trying to play it safe by making FALSE equal to whatever isn't TRUE.

C/GL: Using -1 as sentinel on array of unsigned integers

I am passing an array of vertex indices in some GL code... each element is a GLushort
I want to terminate with a sentinel so as to avoid having to laboriously pass the array length each time alongside the array itself.
#define SENTINEL ( (GLushort) -1 ) // edit thanks to answers below
:
GLushort verts[] = {0, 0, 2, 1, 0, 0, SENTINEL};
I cannot use 0 to terminate as some of the elements have value 0
Can I use -1?
To my understanding this would wrap to the maximum integer GLushort can represent, which would be ideal.
But is this behaviour guaranteed in C?
(I cannot find a MAX_INT equivalent constant for this type, otherwise I would be using that)
If GLushort is indeed an unsigned type, then (GLushort)-1 is the maximum value for GLushort. The C standard guarantees that. So, you can safely use -1.
For example, C89 didn't have the SIZE_MAX macro for the maximum value of size_t. It could be portably defined by the user as #define SIZE_MAX ((size_t)-1).
Whether this works as a sentinel value in your code depends on whether (GLushort)-1 is a valid, non-sentinel value in your code.
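For illustration, here is a minimal sketch of consuming a sentinel-terminated index array (the stand-in typedef for GLushort and the counting loop are assumptions made only so the example is self-contained; in real code GLushort comes from the GL headers):

#include <stddef.h>
#include <stdio.h>

typedef unsigned short GLushort;     /* stand-in; normally from the GL headers */

#define SENTINEL ((GLushort)-1)

/* Count indices up to, but not including, the sentinel. */
static size_t count_indices(const GLushort *verts)
{
    size_t n = 0;
    while (verts[n] != SENTINEL)
        n++;
    return n;
}

int main(void)
{
    GLushort verts[] = { 0, 0, 2, 1, 0, 0, SENTINEL };
    printf("%lu\n", (unsigned long)count_indices(verts)); /* prints 6 */
    return 0;
}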
GLushort is an UNSIGNED_SHORT type which is typedef'd to unsigned short and which, although C does not guarantee it, OpenGL assumes to have a 2^16-1 range (Chapter 4.3 of the specification). On practically every mainstream architecture this somewhat dangerous assumption holds true, too (I'm not aware of one where unsigned short has a different size).
As such, you can use -1, but it is awkward because you will have a lot of casts and if you forget a cast for example in an if() statement, you can be lucky and get a compiler warning about "comparison can never be true", or you can be unlucky and the compiler will silently optimize the branch out, after which you spend days searching for the reason why your seemingly perfect code executes wrong. Or worse yet, it all works fine in debug builds and only bombs in release builds.
Therefore, using 0xffff as jv42 has advised is much preferable; it avoids this pitfall altogether.
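Here is a minimal sketch of the pitfall described above; on typical platforms where int is wider than short, the uncast comparison can never be true, so the compiler may warn or quietly drop the branch (the stand-in typedef is again an assumption for the example):

typedef unsigned short GLushort;   /* stand-in; normally from the GL headers */

/* Bug: v promotes to int and is never negative, so v == -1 is always false. */
int is_sentinel_wrong(GLushort v)
{
    return v == -1;
}

/* Correct: (GLushort)-1 is 0xffff, which is what the sentinel actually holds. */
int is_sentinel_right(GLushort v)
{
    return v == (GLushort)-1;
}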
I would create a global constant of value:
const GLushort GLushort_SENTINEL = (GLushort)(-1);
I think this is perfectly elegant as long as signed integers are represented using two's complement.
I don't remember if that's guaranteed by the C standard, but it is virtually guaranteed for most CPUs (in my experience).
Edit: Apparently this is guaranteed by the C standard....
If you want a named constant, you shouldn't use a const qualified variable as proposed in another answer. They are really not the same. Use either a macro (as others have said) or an enumeration type constant:
enum { GLushort_SENTINEL = -1 };
The standard guarantees that this constant is always an int (really just another name for the value -1) and that it will always convert to the maximum value of your unsigned type.
Edit: or you could have it
enum { GLushort_SENTINEL = (GLushort)-1 };
if you fear that on some architectures GLushort could be narrower than unsigned int.
