I'm trying to print an integer into a string with snprintf for display on an OLED display from an ARM micro. However, when I use %d or %u the micro locks up and stops executing.
Using %x or %c works fine, but the output isn't much use.
What could cause this behaviour? Unfortunately I don't have access to a JTAG device to debug. I'm using arm-none-eabi-gcc to compile and it's all running on a maple mini.
UPDATE
Passing values < 10 seems to make it work.
This actually turned out to be a stack size issue with the RTOS that I was using. I guess the added complexity of the snprintf call was pushing it over the limit and crashing.
Thanks to all who took a crack at answering this!
Passing values < 10 seems to make it work.
This sounds to me as if you have a missing or non-working divide routine. printf/sprintf usually prints decimal numbers by successively dividing them by 10. For numbers less than 10 the division is not necessary, which is probably why those are the only values that work.
To check, make a function which divides two variables (dividing by a constant is usually optimized into multiplication by the compiler). E.g.:
int t(void)
{
    volatile int a, b; // volatile prevents the compiler from folding the division
    a = 123;
    b = 10;
    return a / b;
}
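To check how the division is actually implemented, it can help to inspect the disassembly. A hedged example (exact output varies by toolchain): on ARM EABI targets a software division shows up as a call to the runtime helper __aeabi_idiv, whereas with -mcpu=cortex-m3 the compiler may emit the hardware sdiv instruction instead.
arm-none-eabi-gcc -c -O2 -mcpu=cortex-m3 -mthumb t.c -o t.o
arm-none-eabi-objdump -d t.o | grep -E 'aeabi|sdiv'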
Also, check your build log for link warnings.
It can't be a type error since %x and %u both specify the same types. So it has to be a problem in snprintf itself. The only major difference between the two is that %u has to divide integers and compute the remainder, whereas %x can get by with shifts and masks.
It is possible that your C library was compiled for a different variety of ARM processor than you are using, and perhaps it is using an illegal instruction to compute a quotient or remainder.
Make sure you are compiling your library for the Cortex-M3. E.g.,
arm-none-eabi-gcc -mcpu=cortex-m3 -mthumb ...
Do you have a prototype in scope? snprintf() is a varargs function, and calling a varargs function may involve some trickery to place the arguments where the callee expects them.
Also: always pass the proper types to a varargs function. The conversion specifier after the '%' tells snprintf() what type to expect, and even where it looks for the argument may depend on that type. Anything goes... In your case, "%X" expects an unsigned int, so give it one: either cast the argument in the call, or declare the variable as unsigned int sweeplow;. Negative frequencies or counts make no sense anyway.
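A short sketch of the point about types (the value and buffer are mine; sweeplow is the variable named above):
#include <stdio.h>

int main(void) {
    unsigned int sweeplow = 1234;              /* declared unsigned, as suggested */
    char buf[16];
    snprintf(buf, sizeof buf, "%X", sweeplow); /* argument type now matches %X */
    puts(buf);                                 /* prints "4D2" */
    return 0;
}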
I have a C program that involves a floating-point literal defined by a macro. And, as it happens, in a performance-critical part of the code I need the ceiling of that floating-point number as an integer (say, an int). Now, if it were an actual literal, I would just take the ceiling manually and be done with it. But since it's a macro, I can't do that.
Naturally, ceilf(MY_DOUBLE_LITERAL) works, but - it's expensive. (int)(MY_DOUBLE_LITERAL) + 1 will also work... except if the literal is itself an integer, in which case it will be wrong.
So, is there some way - preprocessor macros are fine - to obtain the ceiling of MY_DOUBLE_LITERAL, or alternatively to determine whether it is an integer or not?
Notes:
I cannot perform the ceil() before the performance-critical part of the code, due to constraints I won't go into. Naturally, if one can move computation outside of the performance-critical part or loop, that's always best; thanks go to @SpyrosK for mentioning that.
C99 preferred.
You may assume the literal is a double, i.e. a decimal literal like 123.4 or exponential notation like 1.23e4.
If your literal fits nicely within the int representation range, you will often be able to rely on compiler optimizations. clang and gcc, for example, will go as far as simply optimizing ceilf() away when it is given a literal, so ceilf(MY_DOUBLE_LITERAL) will result in a literal being used and no time wasted at run time.
If somehow that doesn't work with your compiler, you could use:
int poor_mans_ceil(double x)
{
    /* Truncate toward zero, then add 1 if truncation dropped a positive
       fractional part. Truncation already equals the ceiling for negative x. */
    return (int) x + (((double)(int) x < x) ? 1 : 0);
}
which should be "easier" for a compiler to optimize. You can see this happening on GodBolt.
Having said that, some compilers in more exotic settings might fail to optimize both ceil() and the function above; NVIDIA's NVCC compiler for CUDA is one example.
A solution could be to statically store the rounded float macro literal in a const variable outside the critical part of the code and use the precomputed int const value in the critical part.
This will work as long as the float returned by the macro is always the same value.
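A minimal sketch of that idea, assuming MY_DOUBLE_LITERAL is the macro from the question (the placeholder value is for illustration only). The initializer is an arithmetic constant expression, so C99 accepts it at file scope and nothing is computed in the critical path:
#define MY_DOUBLE_LITERAL 123.4   /* hypothetical placeholder */

/* Evaluated at compile time: truncate, then add 1 if truncation
   dropped a positive fractional part. Here MY_CEILING == 124. */
static const int MY_CEILING =
    (int)MY_DOUBLE_LITERAL
    + ((double)(int)MY_DOUBLE_LITERAL < MY_DOUBLE_LITERAL ? 1 : 0);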
Nowadays compilers optimize crazy things. Especially gcc and clang sometimes perform really surprising transformations.
So, I'm wondering why the following piece of code is not optimized:
#include <stdio.h>
#include <string.h>
int main() {
    printf("%d\n", 0);
}
I would expect a heavy-optimizing compiler to generate code equivalent to this:
#include <stdio.h>
#include <string.h>
int main() {
    printf("0\n");
}
But gcc and clang don't apply the format at compile-time (see https://godbolt.org/z/Taa44c1n7). Is there a reason why such an optimization is not applied?
I regularly see format arguments that are known at compile time, so I guess such an optimization could make sense (especially for floating-point values, where the formatting is probably relatively expensive).
Suppose that at various spots within a function one has the following four lines:
printf("%d\n", 100);
printf("%d\n", 120);
printf("%d\n", 157);
printf("%d\n", 192);
When compiling for a typical ARM microcontroller, each of those printf calls would take eight bytes, and all four calls would share four bytes used to hold a copy of the address of the format string, plus four bytes for the string itself. The total for code and data would thus be 40 bytes.
If one were to replace that with:
printf("100\n");
printf("120\n");
printf("157\n");
printf("192\n");
then each printf call would only need six bytes of code, but each would need its own separate five-byte string in memory, along with four bytes to hold the address. The total code-plus-data requirement would thus increase from 40 bytes to 60. Far from being an optimization, this would actually increase code-space requirements by 50%.
Even if there were only one printf call, the savings would be minuscule. Eight bytes for the original version of the call plus eight bytes of data overhead would be 16 bytes. The second version would require six bytes of code plus nine bytes of data, for a total of 15 bytes. An optimization that would save a little storage in the best case, and cost a lot in scenarios that are hardly implausible, isn't really an optimization.
Why are compile-time known format-strings not optimized?
Some potential optimizations are easy for compilers/compiler developers to support and have large impact on the quality/performance of the code the compiler generates; and some potential optimizations are extremely difficult for compilers/compiler developers to support and have a small impact on the quality/performance of the code the compiler generates. Obviously compiler developers are going to spend most of their time on the "easier with higher impact" optimizations, and some of the "harder with less impact" optimizations are going to be postponed (and possibly never implemented).
Something like optimizing printf("%d\n", 0); into a puts() (or better, an fputs()) looks like it'd be relatively easy to implement (but would have a very small performance impact, partly because it'd be rare for that to occur in source code anyway).
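For concreteness, a sketch of what such a rewrite would produce; puts() appends a newline itself, fputs() does not:
#include <stdio.h>

int main(void) {
    /* printf("%d\n", 0); could become either of: */
    puts("0");             /* puts appends the '\n' */
    fputs("0\n", stdout);  /* fputs writes the string verbatim */
    return 0;
}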
The real problem is that it's a "slippery slope".
If compiler developers do the work needed for the compiler to optimize printf("%d\n", 0);, then what about optimizing printf("%d\n", x); too? If you optimize those, then why not also optimize printf("%04d\n", 0); and printf("%04d\n", x);, and floating point, and more complex format strings? Surely printf("Hello %04d\n Foo is %0.6f!\n", x, y); could be broken down into a series of smaller functions (puts(), itoa(), ..)?
This "slippery slope" means that the complexity increases rapidly (as the compiler supports more permutations of format strings) while the performance impact barely improves at all.
If compiler developers are going to spend most of their time on the "easier with higher impact" optimizations, they're not going to be enthusiastic about clawing further up that particular slippery slope.
How could the compiler know what kind of printf() function you will link?
There is a standard meaning of this function, but nobody forces you to use the standard library.
So it's beyond the compiler's competence to perform ahead of time what the function is supposed to do at run time.
My guess is that how printf() converts from the internal C representation to the user-visible text is a matter for the operating system and its locale conventions (often, perhaps always, configurable), so it would be wrong to assume that printing an "integer 0" must produce the ASCII character 48 (an extreme example, I know). In particular, and here I may be wrong, printf("\n") outputs a bare newline on Unix and CR+LF on some other systems. I am not sure where this transformation is done and, to be honest, I think it is a glitch: I have seen, more than once, programs that output the wrong line termination, and I think that is because the C compiler should not hard-code a line-termination character into the program. Assuming that a particular number has to be written in a particular way would be the same kind of mistake.
A little addition: below this question there is a comment saying that it is possible to change the behaviour of format specifiers at run time. If this is true, then we have the definitive answer to the question. Can someone confirm?
How can I print (that is, to stdout) a float in C without having it be promoted to a double when passed to printf?
The issue here is that variadic functions in C promote all float parameters to double, which incurs two unnecessary conversions. For example, if you turn on -Wdouble-promotion in GCC and compile
float f = 0.f;
printf("%f", f);
you will get
warning: implicit conversion from 'float' to 'double' when passing argument to function
I have relatively little processing power to play with (a 72MHz ARM Cortex-M3), and I am definitely bottlenecking on ASCII output of floating point data. As the architecture lacks a hardware FPU to begin with, having to convert between single and double precision does not help matters.
Is there a way to print a float more efficiently in straight C?
Avoiding the promotion will not save you anything, since the internal double (or more likely long double) arithmetic that printf performs is going to consume at least 1000x as much time. Accurately printing floating-point values is not easy.
If you don't care about accuracy though, and just need to print approximate values quickly, you can roll your own loop to do the printing. As long as your values aren't too large to fit in an integer type, first convert and print the non-fractional part as an integer, then subtract that off and loop multiplying by 10 and taking off the integer part to print the fractional part one digit at a time (buffer it in a string for better performance).
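A minimal sketch of that loop (the function name is mine; it assumes x is non-negative and its integer part fits in an int). Buffering the digits into a string, as suggested above, would be faster than per-character output:
#include <stdio.h>

/* Print x with the given number of fractional digits, using only
   float and integer arithmetic. Approximate by design. */
static void print_float_approx(float x, int digits) {
    int whole = (int)x;            /* non-fractional part */
    printf("%d.", whole);
    float frac = x - (float)whole;
    while (digits-- > 0) {
        frac *= 10.0f;             /* shift the next digit into the ones place */
        int d = (int)frac;
        putchar('0' + d);
        frac -= (float)d;
    }
}
For example, print_float_approx(3.14159f, 4) prints 3.1415 (truncated, not rounded).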
Or you could just do something like:
printf("%d.%.6d", (int)x, (int)((x-(int)x)*1000000));
Unfortunately, printf has no support for handling plain floats.
This means that you would have to write your own print function. If you don't need the full expressive power of printf, you could easily convert your floating-point value into an integral part and a part representing a number of decimals, and print out both using integers.
If, on the other hand, you simply want to get rid of the warning, you can explicitly cast the float to a double.
I think that doesn't matter: printf is already such a time-consuming, nasty thing that those conversions should not matter. The time spent converting float to double should be far less than the time spent converting any number to ASCII (you should/could profile your code to get a definitive answer there). The only remaining solution would be to write your own custom output routine that converts float to ASCII and then uses puts (or similar).
First approach: use an ftoa routine instead of printf. Profile.
For increased output flexibility, I would go into the source code of your compiler's stdlib, perhaps some derivative of gcc anyway, locate the printf implementation and copy over the relevant code for double -> ascii conversion. Rewrite it to float -> ascii.
Next, manually change one or two prominent call sites to your new (non-variadic) version and profile it.
If that solves your problem, you could think about writing your own printf, based on the version from the standard library, where instead of a float you pass a float*. That should get rid of the automatic promotion.
I've implemented some sorting algorithms (to sort integers) in C, carefully using uint64_t to store anything that has to do with the data size (thus also counters and such), since the algorithms are meant to be tested with data sets of several giga-integers.
The algorithms should be fine, and there should be no problems with the amount of data allocated: data is stored in files, and we only load small chunks at a time; everything works fine even when we choke the in-memory buffers down to any size.
Tests with datasets up to 4 giga ints (thus 16 GB of data) work fine (sorting 4 Gints took 2228 seconds, about 37 minutes), but when we go above that (i.e. 8 Gints) the algorithm doesn't seem to halt (it's been running for about 16 hours now).
I'm afraid the problem could be due to integer overflow: maybe a counter in a loop is stored in a 32-bit variable, or maybe we're calling some functions that work with 32-bit integers.
What else could it be?
Is there any easy way to check whether an integer overflow occurs at runtime?
This is compiler-specific, but if you're using gcc then you can compile with -ftrapv to issue SIGABRT when signed integral overflow occurs.
For example:
/* compile with: gcc -ftrapv <filename> */
#include <signal.h>
#include <stdio.h>
#include <limits.h>

void signalHandler(int sig) {
    (void)sig;
    /* printf is not async-signal-safe; acceptable for a demo */
    printf("Overflow detected\n");
}

int main(void) {
    signal(SIGABRT, &signalHandler);

    int largeInt = INT_MAX;
    int normalInt = 42;
    int overflowInt = largeInt + normalInt; /* traps with -ftrapv */
    (void)overflowInt;                      /* silence unused-variable warning */

    /* if compiling with -ftrapv, we shouldn't get here */
    return 0;
}
When I run this code locally, the output is
Overflow detected
Aborted
Take a look at -ftrapv and -fwrapv:
-ftrapv
This option generates traps for signed overflow on addition, subtraction, and multiplication operations.
-fwrapv
This option instructs the compiler to assume that signed arithmetic overflow of addition, subtraction, and multiplication wraps around using two's-complement representation. This flag enables some optimizations and disables others. This option is enabled by default for the Java front end, as required by the Java language specification.
See also Integer overflow in C: standards and compilers, and Useful GCC flags for C.
clang now supports dynamic overflow checks for both signed and unsigned integers; see the -fsanitize=integer switch. For now it is the only C++ compiler with fully supported dynamic overflow checking for debugging purposes.
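A small sketch of how that looks in practice (the exact diagnostic text varies by clang version):
/* compile with: clang -fsanitize=signed-integer-overflow demo.c */
#include <limits.h>

int main(void) {
    int a = INT_MAX;
    int b = a + 1;   /* UBSan reports this overflow at run time */
    return b & 1;
}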
If you are using Microsoft's compiler, there are options to generate code that triggers a SEH exception when an integer conversion cuts off non-zero bits. In places where this is actually desired, use a bitwise AND to remove the upper bits before doing the conversion.
The only sure-fire way is to wrap operations on those integers in functions that perform bounds-violation checking. This will of course slow down integer ops, but if your code asserts or halts on a boundary violation with a meaningful error message, that will go a long way towards helping you identify where the problem is.
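A minimal sketch of such a wrapper (names are mine), which checks the bounds before adding so the overflow never actually occurs:
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

/* Add two 64-bit counters, halting with a meaningful message on overflow. */
static long long checked_add(long long a, long long b) {
    if ((b > 0 && a > LLONG_MAX - b) ||
        (b < 0 && a < LLONG_MIN - b)) {
        fprintf(stderr, "overflow in checked_add(%lld, %lld)\n", a, b);
        abort();
    }
    return a + b;
}
A sort's counters would then advance with checked_add(i, 1) instead of i + 1.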
As for your particular issue, keep in mind that general-case sorting is O(n log n), so the reason the algorithm is taking much longer could be that the increase in time is not linear with respect to the data-set size. Since you also didn't mention how much physical memory is in the box and how much of it is used by your algorithm, there could be page faulting to disk with the larger data set, potentially slowing things to a crawl.
1) I've got many constants in my C algo.
2) My code works both in floating-point and in fixed-point.
Right now, these constants are initialized by a function, float2fixed, which does nothing in floating-point mode, while in fixed-point mode it computes their fixed-point representation. For instance, 0.5f stays 0.5f when working in floating-point, whereas it becomes 32768 (via the pow() routine) when working in fixed-point with a Qx.16 representation.
That's easy to maintain, but it actually takes a lot of time to compute these constants in fixed-point (pow is a floating-point function). In C++, I'd use some metaprogramming so the compiler computes these values at compile time and there's no hit at run time. But in C, that's not possible. Or is it? Does anybody know of such a trick? Is any compiler clever enough to do that?
Looking forward to any answers.
Rather than using (unsigned)(x*pow(2,16)) to do your fixed-point conversion, write it as (unsigned)(0.5f * (1 << 16)).
This should be acceptable as a compile-time constant expression, since it involves only built-in operators.
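As a sketch (the macro name is mine), wrapping the idea in a macro keeps the constants maintainable while still folding at compile time:
/* Qx.16 conversion using only built-in operators, so it remains a
   compile-time constant expression. */
#define FLOAT2FIXED_Q16(x) ((unsigned)((x) * (1 << 16)))

static const unsigned HALF_Q16 = FLOAT2FIXED_Q16(0.5f);  /* == 32768 */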
When using fixed-point, you could write a program that takes your floating-point values and converts them into correct constant initializers for the fixed-point type, effectively adding a step to the compilation that generates the fixed-point values.
One advantage of this will be that you can then define and declare your constants with const so that they won't change at run-time - whereas with the initialization functions, of course, the values have to be modifiable because they are calculated once.
I mean write a simple program that can scan for formulaic lines that might read:
const double somename = 3.14159;
it would read that and generate:
const fixedpoint_t somename = { ...whatever is needed... };
You design the operation to make it easy to manage for both notations - so maybe your converter always reads the file and sometimes rewrites it.
datafile.c: datafile.constants converter
        converter datafile.constants > datafile.c
In plain C, there's not much you can do. You need to do the conversion at some point, and the compiler doesn't give you any access to call interesting user-provided functions at compile time. Theoretically, you could try to coax the preprocessor to do it for you, but that's the quick road to total insanity (i.e. you'd have to implement pow() in macros, which is pretty hideous).
Some options I can think of:
Maintain a persistent cache on disk. At least then it'd only be slow once, though you still have to load it, make sure it's not corrupt, etc.
As mentioned in another comment, use template metaprogramming anyway and compile with a C++ compiler. Most C works just fine (arguably better) with a C++ compiler.
Hmm, I guess that's about all I can think of. Good luck.
Recent versions of GCC (around 4.3) added the ability to use GMP and MPFR to do some compile-time optimizations by evaluating more complex functions that are constant. That approach leaves your code simple and portable, trusting the compiler to do the heavy lifting.
Of course, there are limits to what it can do, and it would be hard to know whether it's optimizing a given instance without looking at the generated assembly. But it might be worth checking out. Here's a link to the description in the changelog
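As an illustration of what that buys here (whether the folding actually happens depends on the GCC version and flags, so check the assembly), a call with constant arguments such as pow(2.0, 16.0) is a candidate for compile-time evaluation via MPFR:
#include <math.h>

/* At -O2, GCC may fold pow(2.0, 16.0) to 65536.0 at compile time,
   leaving no run-time pow() call. */
unsigned half_q16(void) {
    return (unsigned)(0.5 * pow(2.0, 16.0));
}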