GCC compile-time floating point optimization


I'm developing for the AVR platform and I have a question. I don't want the floating point library to be linked with my code, but I like the concept of having analog values of the range 0.0 ... 1.0 instead of 0...255 and 0...1023, depending on even whether I'm using a port as an input or as an output.
So I decided to multiply the input/output functions' arguments by 1023.0 and 255.0, respectively. Now, my question is: if I implement the multiplication like this:
#define analog_out(port, bit) _analog_out(port, ((uint8_t)((bit) * 255.0)))
will GCC (with the -O3 flag turned on) optimize the compile-time floating point multiplications, known at compile time and cast to an integral type, into integer operations? (I know that when using these macros with non-constant arguments, the optimization is not possible; I just want to know if it will be done in the other case.)

GCC should always do constant folding if you supply bit as a numeric literal.
If you want the compiler to enforce the constness, you could get away with something like this:
#define force_const(x) (__builtin_choose_expr(__builtin_constant_p(x), (x), (void)0))
#define analog_out(port, bit) _analog_out(port, force_const((uint8_t)((bit) * 255.0)))

Generally, I think gcc -O2 will do all arithmetic on constants at compile time.
It won't convert it to integer arithmetic - just to a constant integer.
It may be dangerous to rely on, especially if other people maintain the code. A situation where passing a non-constant parameter to a macro results in an error isn't good.
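To make the folding concrete, here is a minimal sketch that can be built with a desktop GCC or Clang rather than an AVR toolchain. analog_out_raw is a hypothetical stand-in for the question's _analog_out (it only records the value it receives), and ANALOG_OUT mirrors the shape of the question's macro. With a literal argument, the product and the cast are expected to be folded into a plain integer constant; inspecting the generated assembly (for example with -S) is the way to confirm that for a particular toolchain.

```c
#include <stdint.h>

/* Hypothetical stand-in for the question's _analog_out(): it only
 * records the last duty value so the result can be inspected. */
static uint8_t last_duty;

static void analog_out_raw(uint8_t port, uint8_t duty)
{
    (void)port;
    last_duty = duty;
}

/* Same shape as the macro in the question: scale 0.0 ... 1.0 to 0 ... 255. */
#define ANALOG_OUT(port, level) \
    analog_out_raw((port), (uint8_t)((level) * 255.0))

/* 0.5 * 255.0 is 127.5; the cast truncates toward zero, giving 127. */
uint8_t demo_half_scale(void)
{
    ANALOG_OUT(0, 0.5);
    return last_duty;
}

/* 1.0 * 255.0 is exactly 255.0, so full scale maps to 255. */
uint8_t demo_full_scale(void)
{
    ANALOG_OUT(0, 1.0);
    return last_duty;
}
```

Note that the cast truncates rather than rounds, so 0.5 maps to 127, not 128.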

Related

Why exactly using of a floating-point arithmetic in an integer constant expression is invalid?

In C11 (and later) integer constant expression shall only have operands that are, in particular:
floating constants that are the immediate operands of casts
The following code:
int a[ A > B ? 16 : 32 ];
when A and B are floating constants is invalid in C:
$ echo '#include "t576.h"' | clang -std=c11 -pedantic -Wall -Wextra -DA=1.0 -DB=2.0 -c -xc -
In file included from <stdin>:1:
./t576.h:1:5: warning: size of static array must be an integer constant expression [-Wpedantic]
but valid in C++:
$ echo '#include "t576.h"' | clang++ -std=c++11 -pedantic -Wall -Wextra -DA=1.0 -DB=2.0 -c -xc++ -
<nothing>
What is the origin / rationale of this requirement?
Extra question: In the future C standard revisions will it be useful to remove this requirement?

What is the origin / rationale of this requirement?
It means C compilers are not required to be able to execute floating-point arithmetic within the compiler. When compiling for a target platform different from the compiler host platform, replicating the exact behavior of the target floating-point operations can require a lot of work. (This was especially so prior to widespread adoption of the IEEE 754 standard for floating-point arithmetic.)
To implement floating-point semantics in a C program, the compiler only has to be able to convert the constants in source code to the target floating-point format and take the integer portion of them (in a cast). It does not have to be able to perform general arithmetic operations on them. Without this requirement, the compiler would have to reproduce the floating-point arithmetic operations of the target platform. So, if a program uses floating-point arithmetic, the compiler can implement that just by generating instructions to do the arithmetic; it does not have to do the arithmetic itself. This is also true for arithmetic constant expressions, which can be used as initializers: The compiler is not strictly required by the C standard to compute the value of the initializer. It can generate instructions that compute the value when the program starts running (for initialization of static objects) or when needed (for initialization of automatic objects).
In contrast, integer constant expressions can be used in places where the compiler needs the value, such as the width of a bit-field. So the compiler must be able to compute the value itself. If it were required to be able to do floating-point arithmetic to get the value, this would add considerable burden to writing some compilers.
Extra question: In the future C standard revisions will it be useful to remove this requirement?
Removing it will provide some opportunity for C program writers to use additional constant expressions and will require C compiler writers to do more work. The value of these things is subjective.

The rationale is common sense: they don't want to allow users to declare an array of some 3.1415 items - the array size needs to be an integer, obviously.
For many operators in C, the usual arithmetic conversions would turn the end result into floating point whenever a floating point operand is present. In case of ?: specifically that doesn't happen, since the result is the 2nd or 3rd operand. Also the > operator does always return int so it doesn't really apply there either.
If you don't immediately cast floating point operands to an integer type, as told in the definition of an integer constant expression that you quote, then it will become an arithmetic constant expression instead, which is a broader term.
So you can do this:
int a[ (int)1.0 > (int)2.0 ? 16 : 32 ]; // compliant
But you can't do this:
int a[ 1.0 > 2.0 ? 16 : 32 ]; // not compliant
Consider int a[ (int)1.0 > (int)2.0 ? 16.0 : 32 ]; (not compliant either). Here the condition always evaluates as false. We should get size 32, but because of the special implicit conversion rules of ?: the 2nd and 3rd operands are balanced per the usual arithmetic conversions, so we end up with 32.0 of type double. And if that in turn would lead to a floating point number that cannot be exactly represented, we would get a floating point array size.
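As a small sketch of the compliant form, the following uses floating constants only as immediate operands of casts, so the whole expression stays an integer constant expression. The names BUF_LEN, buf and buf_len are made up for this illustration.

```c
#include <stddef.h>

/* Floating constants that are immediate operands of casts: allowed.
 * (int)2.5 is 2 and (int)1.5 is 1, so the condition is true. */
enum { BUF_LEN = (int)2.5 > (int)1.5 ? 16 : 32 };

/* An array at file scope needs an integer constant expression as its size. */
static int buf[BUF_LEN];

size_t buf_len(void)
{
    return sizeof buf / sizeof buf[0];
}
```

Writing the condition as 2.5 > 1.5 without the casts would turn it into an arithmetic constant expression, which a strictly conforming compiler may reject in this position.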

Is there a gcc function to add two large numbers for C?

I am trying to add two numbers in C. My code is
#define type unsigned
type add(type a, type b) {
return a + b;
}
The assembly generated for this code uses a single add instruction (https://godbolt.org/ & ARM GCC 8.3.1). However, when I changed the type to unsigned long long, the generated code was harder to follow; I believe it uses some ldm instructions and then asks the hardware to add complete vectors (or arrays). That led to my next question: is it possible to add two numbers whose digit counts are on the order of thousands? It isn't hard to design a function that does this, and I found plenty of code on the internet for it. But I think the compiler writes better code than we do,
so are there any gcc built-in functions which can do this job?
In fact, does gcc provide such functions for all the 5 integer arithmetic operations?

No, there is no compiler support in GCC for arbitrary-precision arithmetic. You would need to use a library like GMP. If you can use C++ instead of C, you can get a more "natural" API (with arithmetic operators, etc.) by using a library like Boost Multiprecision.
GCC does support, as an extension, the types __int128 and unsigned __int128, which can be used like any other integral type, but these only provide capacity for 38 decimal digits.
Edit: Also, as an aside, don't use macros (like #define type unsigned) to rename types. Instead this should be written with the typedef keyword like so: typedef unsigned type;
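For illustration only, here is a minimal hand-written sketch of multi-word addition with carry propagation. It is not how GMP is implemented; limb_add and its little-endian limb layout are assumptions made for this example, and a real library will be considerably faster.

```c
#include <stddef.h>
#include <stdint.h>

/* Adds two n-limb numbers stored least significant limb first.
 * Writes n limbs to sum and returns the carry out (0 or 1). */
uint32_t limb_add(uint32_t *sum, const uint32_t *a, const uint32_t *b, size_t n)
{
    uint64_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        /* A 64-bit accumulator holds the 33-bit intermediate result. */
        uint64_t t = (uint64_t)a[i] + b[i] + carry;
        sum[i] = (uint32_t)t;   /* low 32 bits become the result limb */
        carry = t >> 32;        /* bit 32 is the carry into the next limb */
    }
    return (uint32_t)carry;
}
```

A number with a few thousand decimal digits needs on the order of a few hundred 32-bit limbs, so the loop itself is cheap; the harder parts of arbitrary-precision arithmetic are multiplication, division and conversion to and from decimal.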

Is SSE2 signed integer overflow undefined?

Signed integer overflow is undefined in C and C++. But what about signed integer overflow within the individual fields of an __m128i? In other words, is this behavior defined in the Intel standards?
#include <inttypes.h>
#include <stdio.h>
#include <stdint.h>
#include <emmintrin.h>
union SSE2
{
__m128i m_vector;
uint32_t m_dwords[sizeof(__m128i) / sizeof(uint32_t)];
};
int main()
{
union SSE2 reg = {_mm_set_epi32(INT32_MAX, INT32_MAX, INT32_MAX, INT32_MAX)};
reg.m_vector = _mm_add_epi32(reg.m_vector, _mm_set_epi32(1, 1, 1, 1));
printf("%08" PRIX32 "\n", (uint32_t) reg.m_dwords[0]);
return 0;
}
[myria@polaris tests]$ gcc -m64 -msse2 -std=c11 -O3 sse2defined.c -o sse2defined
[myria@polaris tests]$ ./sse2defined
80000000
Note that the 4-byte-sized fields of an SSE2 __m128i are considered signed.

You are asking about a specific implementation issue (using SSE2) and not about the standard. You've answered your own question "signed integer overflow is undefined in C".
When you are dealing with C intrinsics you aren't even programming in C! These insert assembly instructions inline. It is done in a somewhat portable way, but it is no longer true that your data is a signed integer. It is a vector type being passed to an SSE intrinsic. YOU are then casting that to an integer and telling C that you want to see the result of that operation. Whatever bytes happen to be there when you cast is what you will see and has nothing to do with signed arithmetic in the C standard.
Things are a bit different if the compiler inserts SSE instructions (say in a loop). Now the compiler is guaranteeing that the result is the same as a signed 32 bit operation ... UNLESS there is undefined behaviour (e.g. an overflow) in which case it can do whatever it likes.
Note also that undefined doesn't mean unexpected: whatever behaviour you observe for auto-vectorization might be consistent and repeatable (maybe it always wraps on your machine), but that might not hold for all surrounding code or all compilers. And if the compiler selects different instructions depending on the availability of SSSE3, SSE4, or AVX*, it might not even hold on all processors, since the compiler may make different code-gen choices for different instruction sets that do or don't take advantage of signed overflow being UB.
EDIT:
Okay, well now that we are asking about "the Intel standards" (which don't exist, I think you mean the x86 standards), I can add something to my answer. Things are a little bit convoluted.
Firstly, the intrinsic _mm_add_epi32 is defined by Microsoft to match Intel's intrinsics API definition (https://software.intel.com/sites/landingpage/IntrinsicsGuide/ and the intrinsic notes in Intel's x86 assembly manuals). They cleverly define it as doing to a __m128i the same thing the x86 PADDD instruction does to an XMM register, with no more discussion (e.g. is it a compile error on ARM or should it be emulated?).
Secondly, PADDD isn't only a signed addition! It is a 32 bit binary add. x86 uses two's complement for signed integers, and adding them is the same binary operation as unsigned base 2. So yes, paddd is guaranteed to wrap. There is a good reference for all the x86 instructions here.
So what does that mean: again, the assumption in your question is flawed because there isn't even any overflow. So the output you see should be defined behaviour. Note that it is defined by Microsoft and x86 (not by the C Standard).
Other x86 compilers also implement Intel's intrinsics API the same way, so _mm_add_epi32 is portably guaranteed to just wrap.
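The per-lane behaviour can be sketched in portable scalar C, without any intrinsics: unsigned addition is defined to wrap modulo 2^32, and that is the same bit pattern a 32-bit two's complement add produces. wrap_add32 is a name invented for this example, not part of any intrinsics API.

```c
#include <stdint.h>

/* Unsigned addition is defined to wrap modulo 2^32, which yields the
 * same bits as a 32-bit two's complement add of the same operands. */
uint32_t wrap_add32(uint32_t a, uint32_t b)
{
    return (uint32_t)(a + b);
}
```

Passing 0x7FFFFFFF (the bit pattern of INT32_MAX) and 1 gives 0x80000000, matching the value printed by the program in the question.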

This isn't "signed integer overflow within the fields of an __m128i". This is a function call. (Being a compiler intrinsic is just an optimization, much like inlining, and that doesn't interact with the C standard as long as the as-if rule is respected)
Its behavior must follow the contract (preconditions, postconditions) that the function developer documented. Usually intrinsics are documented by the compiler vendor, although they tend to coordinate the naming and contract of intrinsics to aid in porting code.

Any performance difference between sinf(), cosf() and sin(), cos()

I have code that works mainly with single-precision floating point numbers. Calls to transcendental functions occur fairly often. Right now, I'm using sin(), cos(), sqrt(), etc., functions that accept and return double. When it's compiled for x87 only, I know there's no difference between single and double precision. I read in Agner Fog's optimization guide, however, that software versions of these functions utilizing SSE instructions are faster for single-precision floating point numbers.
My question is whether the compiler would automatically use the faster function when it encounters something like:
float x = 1.23;
float y = sin(x);
Or does rounding rule preclude such an optimization?
It'd be easy enough to just do a search-and-replace and see whether there's any performance gain. Trouble is that I also need pointers to these functions. In MSVC, sinf(), cosf(), and friends are inline functions. Using them would therefore require a bit of gymnastics. Before making the effort, I would like to know whether it's worthwhile.
Besides MSVC, I'm also targeting gcc.

There is really no need to have the cast when calling sin. In fact it would be counterproductive if you used the <tgmath.h> header that comes with C99. It provides type-generic macros that choose the right function according to the type of your argument (not the target type, unfortunately). So if you use that header (not sure if it is available for MS)
float x = 1.23;
float y = sin(x);
would automatically use sinf under the hood.
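One way to observe the selection without calling anything is to look at the type of the expression. The sketch below assumes a C99 compiler whose <tgmath.h> follows the standard; the two function names are invented for the example. Because sizeof does not evaluate its operand here, no math library call is made.

```c
#include <stddef.h>
#include <tgmath.h>

/* With a float argument the type-generic sin macro selects sinf,
 * so the expression has type float. */
size_t tg_sin_size_for_float(void)
{
    float x = 1.23f;
    return sizeof(sin(x));
}

/* With a double argument it selects the ordinary sin. */
size_t tg_sin_size_for_double(void)
{
    double x = 1.23;
    return sizeof(sin(x));
}
```

With plain <math.h> instead, sin(x) converts the float argument to double and the expression has type double, which is the behaviour the question starts from.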

Need some clarification regarding casting in C

I was just reading about the bad practice of casting the return value of malloc. If I understood correctly, it is absolutely legal to leave the cast as it is done implicitly (and should be left, because of other problems it could generate). Well my question is, when should I then cast my values ? Is there some general rule or something ? For example, this code compiles without any errors with gcc -W -Wall (except unused bar, but that's not the point):
float foo(void) {
double bar = 4.2;
return bar;
}
int main(void) {
double bar = foo();
return 0;
}
I'm confused now. What are the good practices and the rules about casting ?
Thanks.

There are several situations that require perfectly valid casting in C. Beware of sweeping assertions like "casting is always bad design", since they are obviously and patently bogus.
One huge group of situations that critically relies on casts is arithmetic operations. The casting is required in situations when you need to force the compiler to interpret an arithmetic expression within a type different from the "default" one. As in
unsigned i = ...;
unsigned long s = (unsigned long) i * i;
to avoid overflow. Or in
double d = (double) i / 5;
in order to make the compiler switch to floating-point division. Or in
s = (unsigned) d * 3 + i;
in order to take the whole part of the floating point value. And so on (the examples are endless).
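The effect of the first two casts can be checked with a short sketch. It uses uint32_t and uint64_t in place of unsigned and unsigned long so that the widths are fixed, and it assumes a platform where int is 32 bits; the function names are invented for the example.

```c
#include <stdint.h>

/* Without a cast, the multiplication is done in 32 bits and wraps
 * before the result is widened for the return. */
uint64_t square_narrow(uint32_t i)
{
    return i * i;
}

/* Casting one operand first makes the multiplication itself 64-bit. */
uint64_t square_wide(uint32_t i)
{
    return (uint64_t)i * i;
}

/* Integer division happens first; the quotient is converted afterwards. */
double fifth_truncated(unsigned i)
{
    return i / 5;
}

/* The cast converts i first, so the division is floating-point. */
double fifth_floating(unsigned i)
{
    return (double)i / 5;
}
```

For i = 65536, square_narrow returns 0 while square_wide returns 4294967296; for i = 7, fifth_truncated returns 1.0 while fifth_floating returns approximately 1.4.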
Another group of valid uses is idioms, i.e. well-established coding practices. For example, the classic C idiom when a function takes a const pointer as an input and returns a non-const pointer to the same (potentially constant) data, like the standard strstr for example. Implementing this idiom usually requires a use of a cast in order to cast away the constness of the input. Someone might call it bad design, but in reality there's no better design alternative in C. Otherwise, it wouldn't be a well-established idiom :)
Also it is worth mentioning, as an example, that a pedantically correct use of the standard printf function might require casts on the arguments in the general case. (For instance, the %p format specifier expects a void * pointer as an argument, which means that an int * argument has to be transformed into a void * in one way or another. An explicit cast is the most logical way to perform the transformation.)
Of course, there are other numerous examples of perfectly valid situations when casts are required.
The problems with casts usually arise when people use them thoughtlessly, even where they are not required (like casting the return of malloc, which is bad for more reasons than one). Or when people use casts to force the compiler to accept their bad code. Needless to say, it takes a certain level of expertise to tell a valid cast situation from a bad cast one.
In some cases casts are used to make the compiler stop issuing some annoying and unnecessary warning messages. These casts belong to the gray area between the good and the bad casts. On the one hand, unnecessary casts are bad. On the other hand, the user might not have control over the compilation settings, thus making the casts the only way to deal with the warnings.

If and when you need to cast I always suggest you do this explicitly to show others, or perhaps yourself in the future, that you intended for this behavior.
By the way, the gcc warning for this is -Wconversion. Unfortunately -Wall and -Wextra still leave a lot of good warnings off.
Here are the flags I use when I want gcc to be very lint-like
-pedantic -std=c99 -ggdb3 -O0 -Wall -Wextra -Wformat=2 -Wmissing-include-dirs -Winit-self -Wswitch-default -Wswitch-enum -Wunused-parameter -Wfloat-equal -Wundef -Wshadow -Wlarger-than-1000 -Wunsafe-loop-optimizations -Wbad-function-cast -Wcast-qual -Wcast-align -Wconversion -Wlogical-op -Waggregate-return -Wstrict-prototypes -Wold-style-definition -Wmissing-prototypes -Wmissing-declarations -Wpacked -Wpadded -Wredundant-decls -Wnested-externs -Wunreachable-code -Winline -Winvalid-pch -Wvolatile-register-var -Wstrict-aliasing=2 -Wstrict-overflow=2 -Wtraditional-conversion -Wwrite-strings
I also check my code first with cppcheck which is a free static code analyzer for C and C++. Highly recommended.

My simple guideline - if it needs a cast, it's probably wrong. If you don't need a cast, don't use them.

The reason you don't cast the return value of malloc is that you are always assigning the return value to a pointer type, and the C standard allows a void * to be implicitly converted to any other object pointer type. Explicitly casting it is redundant, and therefore unnecessary.

You can't ask about casting in 'C' w/o understanding that casting covers more than one type of operation. There are essentially two: type conversion and type coercion. C++, because it has more type info, distinguishes 4 types of casts and codifies them with dedicated notation: reinterpret_cast<>, const_cast<>, dynamic_cast<> and static_cast<>. You don't have these in C, since all casts have the syntax (ctype), but the reasons for them remain, and it helps to understand why casting is required even though your question was about 'C' specifically.
The "need" for static cast is what you show in your example. The compiler will do them for you, even if you don't specify it - however, crank the warning level up high enough, and the compiler will warn you if there is a loss of precision as there is going from double to float (your return bar; statement). Adding a cast tells the compiler the loss of precision was intended.
The second least dangerous cast is a const_cast<>. It's used to remove const or volatile from a type. This commonly occurs where structures have internal "caches". So a caller may have a const version of your structure, but an "internal function" needs to update the cache so will have to cast from a pointer to a const struct to a regular struct to update an internal field.
The most dangerous type is a reinterpret cast and why people will go on and on about how bad it is to cast. That's where you're not converting anything, but telling the compiler to reinterpret a value as a totally different type. What is below might have been added by a naive programmer trying to get rid of a compiler error.
char **ptostr = (char **) "this is not a good idea";
Likely the correct fix was to use an '&' and this is how casting gets a bad reputation. Casts like this can be used for good or evil. I used it in the answer to another question about how to find the smallest power of 2 in order to leverage the power of the FPU in a CPU. A better example of being used for good, is when implementing linked lists. If the links live in the objects themselves, you have to cast from the link pointer back out to the enclosed object (a good use for the offsetof macro if the links can't be at the top of the structure).
A dynamic cast has no language support in C but the circumstance still occurs. If you have a heterogeneous list, then you might verify an object was of a given type using a field in the list's link header. Implemented manually, you would verify the type as being compatible and return NULL if it wasn't. This is a special version of the reinterpret cast.
There are many sophisticated programming patterns that require casting so I wouldn't say casting needs to be avoided or indicates something is wrong. The problem with 'C' is how you write the unsafe ones in the same exact way as the safe ones. Keeping it contained and limited is a good practice so you can make sure you have it right (e.g., use library routines, strong typing and asserts if you can).
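The linked-list case mentioned above can be sketched as follows. The struct names and the ITEM_FROM_LINK macro are invented for this illustration; the cast through char * together with offsetof recovers the enclosing object from a pointer to the link embedded in it.

```c
#include <stddef.h>

/* An intrusive list link that lives inside the objects it connects. */
struct link {
    struct link *next;
};

/* The link is deliberately not the first member, so offsetof is needed. */
struct item {
    int value;
    struct link node;
};

/* Step back from the embedded link to the start of the enclosing item. */
#define ITEM_FROM_LINK(lp) \
    ((struct item *)((char *)(lp) - offsetof(struct item, node)))

int item_value_from_link(struct link *lp)
{
    return ITEM_FROM_LINK(lp)->value;
}
```

The macro is only valid when the link really is embedded in a struct item; nothing in the type system checks that, which is why this kind of cast is best kept behind one small, well-tested helper.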

Casting a result should only be done when strictly necessary; if you are using code developed by two different people (such as static libraries, or dynamic libraries), and two functions don't use compatible values, then casting is the only solution (as long as you don't try to cast a string to an integer).
Before using a cast, it would be better to verify that the data types used are correct. In the example code (which has the purpose of providing an example), it doesn't make sense to declare the returned value to be a float value when the function returns a double.

double bar = foo();
What happens here is called promotional conversion, where the value of the converted variable is preserved after the conversion. The reverse is not true, i.e. double -> float. The only answer is to cast only when you really need to. Casting a lot is a sign of bad design.

In your example there's a loss of precision, but the cast is implicit. There are times when casting is absolutely necessary, such as when you're reading data from a byte stream or when all you have is data coming in through a void* pointer, but you know what data it represents. But for most part, casting should be avoided and reserved for these extreme cases.

What you're looking at is implicit type conversion. This is considered safe if you're starting with a type having a more restricted range than the one you're ending up with, i.e. short to int is OK, as is float to double.
I'm quite surprised that gcc isn't generating a warning when converting a double to a float; I believe Microsoft's compiler does.
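The loss in the double to float direction can be observed with a small sketch; narrowing_changes_value is a name made up for this example, and the result assumes float is narrower than double, as it is on common platforms.

```c
/* Returns 1 if storing d in a float and widening it again changes
 * the value, 0 if the round trip is exact. */
int narrowing_changes_value(double d)
{
    float f = (float)d;   /* explicit cast documents the intended narrowing */
    return (double)f != d;
}
```

A value such as 4.2, which is used in the question, does not survive the round trip, whereas 0.5 is exactly representable in both types and does.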

You might find these two SO posts informative:
Specifically, what’s dangerous about casting the result of malloc?
Do Implict Function Declarations in C Actually Generate Object Code?

The admonition against casting the result of malloc() is a special case, due to C implicitly typing the result of previously undeclared functions to int (which IINM is disallowed as of C99).
Generally, you want to limit the use of explicit casts as much as possible; the only time you need to use one is if you're trying to assign a value of one type to a variable of an incompatible type (e.g., assign a pointer value to an int variable or vice versa). Since void * is compatible with every other pointer type, no explicit cast is needed. However, if you're trying to assign a value of type int * to a variable of type struct foo *, an explicit cast is required. If you find yourself assigning values of incompatible types a lot, then you may want to revisit your design.

Basically you need to cast arguments to functions that expect a different parameter than their prototype claims.
For example, isalpha() has a prototype with an int argument, but really expects an unsigned char.
char *p;
if ((*p != EOF) && isalpha((unsigned char)*p)) /* cast needed */
{
/* ... */
}
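A complete, minimal use of the same cast might look like the sketch below; count_alpha is a hypothetical helper written for this example. Without the cast, a char with a negative value (possible where plain char is signed) would be passed to isalpha as a negative int, which is outside the range the function accepts.

```c
#include <ctype.h>
#include <stddef.h>

/* Counts the alphabetic characters in a NUL-terminated string. */
size_t count_alpha(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; s++) {
        /* Convert to unsigned char before the implicit conversion to int. */
        if (isalpha((unsigned char)*s))
            n++;
    }
    return n;
}
```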
And, you need to be extra careful with functions that accept a variable number of arguments, eg:
long big;
printf("%d\n", (int)big);
Edit
The compiler cannot convert the arguments of variadic functions to the proper type, because there is no type information available in the prototype itself. Consider a printf()-like function
int my_printf(const char *fmt, ...);
As far as the compiler is aware, you can pass values of all kinds of types in the "..." argument, and you must make sure the arguments match what the function expects. For example, let's say the my_printf() function accepts a value of type time_t with a corresponding "%t" specifier in the format string.
my_printf("UNIX Epoch: %t.\n", 0); /* here, 0 is an int value */
my_printf("UNIX Epoch: %t.\n", (time_t)0); /* and here it is a time_t */
My compiler does not want to make this fail! Apparently it (and the one at codepad too) passes 8 bytes for each argument in "..."
Using a prototype without "..." ( int my_printf(const char *fmt, time_t data); ), the compiler would automagically convert the first 0 to the right type.
Note: some compilers (gcc included) will validate the arguments against the format string for printf() if the format string is a literal string
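The same point can be shown with a self-contained variadic function. sum_longs is a hypothetical helper for this illustration: it reads every variable argument as a long, so the caller has to make sure each one really is a long, by a suffix or a cast.

```c
#include <stdarg.h>

/* Sums `count` variable arguments, each of which must be passed as long. */
long sum_longs(int count, ...)
{
    va_list ap;
    long total = 0;

    va_start(ap, count);
    for (int i = 0; i < count; i++)
        total += va_arg(ap, long);   /* the compiler cannot check this type */
    va_end(ap);

    return total;
}
```

A call such as sum_longs(3, 1L, (long)small, (long)3), where small is an int variable, is correct; dropping the casts would pass int values where long is read, which is undefined behaviour even if it happens to work on some platforms.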

There are at least two kinds of casts:
Casts of numeric values that cause a change of representation. (This is a term of art you will find in Harbison and Steele's very helpful C Reference Manual.) These casts are mostly innocuous; the only way you can do harm is by, e.g., casting a wider type to a narrower type, in which case you are deliberately throwing away bits. Only you, the programmer, know whether it's safe to throw away those bits.
C also has an insidious feature that it may perform a change of representation on your behalf when you assign, return, or pass a numeric expression whose type doesn't exactly match the type of the associate lvalue, result, or argument. I think this feature is pernicious, but it is firmly embedded in the C way of doing things. The gcc option -Wconversion will let you know where the compiler is doing a change of representation on your behalf, without being asked.
Casts that don't involve a change of representation but simply ask the compiler to view bits a certain way. These include casts between signed and unsigned types of the same size, as well as casts between pointer types that point to different sorts of data. The type void * enjoys a special status in C as the compiler will convert freely between void * and other data pointer types without a cast. Note that a cast between pointers of two different types of data never involves a change of representation. This is one reason the void * convention can work.
One reason C programmers try to avoid gratuitous casts is that when you write a cast, the compiler trusts you completely. If you make a mistake, the compiler is not going to catch it for you. For this reason I advise my students never to cast pointer types unless they know exactly what they are doing.
P.S. I thought it would be possible to write a short, helpful answer to this question. Wrong!
