Test printf implementation - c

I would like to have a portable implementation of my application. However,
I have heard that there are some issues with printf from the stdlib on certain machines
where it does not behave as intended. For instance, when using the conversion specifier
%f then it can happen that on certain architectures the printf implementation
includes a decimal point in the output!
Now I am wondering whether there are any testing routines out there which I could
use to test the semantic correctness of a C standard library implementation, in particular the printf
routine. Are there any good resources that point out issues to watch for when porting programs?
Many thanks,
Heinz

I think Postel's law ("be conservative in what you do, be liberal in what you accept from others") applies here, too. Don't write your tests to require a character-by-character match in order to consider the printf() implementation to be working.
Instead, do it at a higher level; parse the text output by printf() into the expected datatype, and compare against a value of that type.
I.e., if printing "2.25", parse the text (using strtod() or equivalent) and compare against the actual number 2.25, not the literal text string "2.25".
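A minimal sketch of that round-trip idea, assuming a hypothetical helper named printf_roundtrip_ok and using snprintf() so the output lands in a buffer that can be inspected. The tolerance parameter is also an assumption; pick it to match the precision of the format under test.

```c
#include <stdio.h>
#include <stdlib.h>

/* Format `value` with `fmt`, parse the text back with strtod(), and
 * report whether the parsed number is within `tol` of the original.
 * Returns 1 if the round trip succeeds, 0 otherwise.
 * (Hypothetical helper, not part of any library.) */
int printf_roundtrip_ok(const char *fmt, double value, double tol)
{
    char buf[128];
    char *end;
    int n = snprintf(buf, sizeof buf, fmt, value);

    if (n < 0 || (size_t)n >= sizeof buf)
        return 0;               /* encoding error or truncated output */

    double parsed = strtod(buf, &end);
    if (end == buf)
        return 0;               /* no number could be parsed at all */

    double diff = parsed - value;
    if (diff < 0)
        diff = -diff;
    return diff <= tol;
}
```

Because both snprintf() and strtod() follow the current locale's decimal separator, a call such as printf_roundtrip_ok("%f", 2.25, 1e-9) does not depend on the exact characters the implementation emits.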

The wine test suite for the msvcrt dll looks interesting : http://github.com/mirrors/wine/blob/master/dlls/msvcrt/tests/printf.c

You should write your own test suite that covers the issues that concern you. It's very simple to call printf 100 times with varying inputs, and the output is simple text, so it's simple to check against the output you expect.

I would recommend testing it in the following way: use sprintf() to produce some test output and compare it to your known "correct" output.
I did something like this using fprintf (just to avoid the caching in our embedded system).
I think the results would not differ between printf and sprintf: the formatting algorithm is the same.
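The exact-match approach could look roughly like the following sketch. The helper names and the table of cases are illustrative assumptions; snprintf() is used instead of sprintf() only so the buffer cannot overflow.

```c
#include <stdio.h>
#include <string.h>

/* Format one int with `fmt` and compare against the expected text.
 * Returns 1 on an exact match, 0 otherwise. (Hypothetical helper.) */
int check_int_format(const char *fmt, int value, const char *expected)
{
    char buf[64];
    int n = snprintf(buf, sizeof buf, fmt, value);

    return n >= 0 && (size_t)n < sizeof buf && strcmp(buf, expected) == 0;
}

/* Run a small table of cases; returns the number of failures. */
int run_int_format_tests(void)
{
    static const struct {
        const char *fmt;
        int value;
        const char *expected;
    } cases[] = {
        { "%d",    42,  "42"     },
        { "%5d",   42,  "   42"  },
        { "%-5d|", 42,  "42   |" },
        { "%05d",  -42, "-0042"  },
        { "%+d",   7,   "+7"     },
    };
    int failures = 0;

    for (size_t i = 0; i < sizeof cases / sizeof cases[0]; i++) {
        if (!check_int_format(cases[i].fmt, cases[i].value, cases[i].expected)) {
            fprintf(stderr, "FAIL: \"%s\" with %d\n", cases[i].fmt, cases[i].value);
            failures++;
        }
    }
    return failures;
}
```

Integer conversions are a good fit for exact comparison because the standard fixes their output; floating-point cases are better checked by parsing the text back.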

Related

Atoi() vulnerability against fault injection

I'm using atoi to convert a string to an integer in an embedded C application. However, I could exploit the vulnerability in atoi() using a clock glitching fault injection attack. I mean that when I inject a single glitch or multiple glitches, the processor misses some characters and returns a faulty integer. Is there any alternative to the atoi function that is more robust against fault injection? Can I use its complement (the itoa function) to regenerate the string and compare the two strings?
I saw the strtol function suggested as an alternative to atoi() for validation. Could that help with my problem, or does it only report software errors?
This is a typical case of a CPU controlled by a Schrödinger cat. With her quantic paws, she can decide which instructions to execute or skip...
It is difficult to imagine code that would be resilient in such an environment.
As a matter of fact, any attempt at testing output consistency could be defeated by skipping the corresponding instructions.
As commented by Barmar, you could just call atoi() twice and compare the values, hoping for a moment of distraction of the clock glitcher.
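A rough sketch of the "convert twice and compare" idea, using strtol() so that malformed or out-of-range input is also rejected. The function names are hypothetical, and as noted above, nothing here can guarantee resilience: a glitch that skips the comparison itself defeats the check.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* One strict strtol() pass: the whole string must be a decimal number
 * that fits in an int. Returns 0 on success, -1 otherwise. */
static int parse_once(const char *s, int *out)
{
    char *end;

    errno = 0;
    long v = strtol(s, &end, 10);

    if (end == s || *end != '\0')
        return -1;              /* empty input or trailing junk */
    if (errno == ERANGE || v < INT_MIN || v > INT_MAX)
        return -1;              /* does not fit in an int */
    *out = (int)v;
    return 0;
}

/* Parse twice and accept the value only if both passes agree.
 * volatile discourages the compiler from folding the two results. */
int parse_int_redundant(const char *s, int *out)
{
    volatile int first = 0, second = 0;
    int tmp;

    if (parse_once(s, &tmp) != 0)
        return -1;
    first = tmp;
    if (parse_once(s, &tmp) != 0)
        return -1;
    second = tmp;
    if (first != second)
        return -1;              /* the two conversions disagree */
    *out = first;
    return 0;
}
```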

Unsafe C functions in HP-UX Environment

We are developing a scheduler application in the C programming language. We use the HP-UX environment to compile and deploy the code. During the yearly external audit of the application, we received a report that contains the following observations.
Dangerous functions: strcpy, strlen, strcat etc.
Buffer overflow: memcpy
Buffer overflow format string: sprintf, snprintf etc.
Format string: printf, sprintf etc.
They also gave a general recommendation to use some safe functions, namely:
strncpy_s
strnlen_s
strncat_s
memcpy_s etc..
Now, the problem is that no such library is available for the HP-UX environment. The functions given above are supported only in the Windows environment.
Is there any alternative available for dangerous functions in Linux environment?
How can we mitigate the buffer-overflow format string and format string categories?
See Do you use the TR 24731 'safe' functions? for a discussion of the demerits of the _s functions.
Functions such as strcpy() are safe if (and only if) you know how big the source string and the target strings are. If you don't know, you're playing with fire.
Buffer overflows with memcpy() are outright bugs in your program; you can't use it reliably if you don't know the sizes, or that the buffers do not overlap (memmove() is safer; it handles overlaps). There's an argument to say "you don't need strcpy() or strcat() etc because if you have enough data to use them safely, you can use memmove() or memcpy() instead". On the whole, strlen() is pretty safe — as long as you pass it a string. If you don't know whether you're dealing with strings, then you've got lots of problems; you must know that you're dealing with strings to call the string manipulation functions.
Note that the strncpy() and strncat() functions are not safe. The problem with strncpy() is that it does not null terminate the string if the source is too long. The problem with strncat() is that passing sizeof(dst) as the size of the destination is wrong, even if the string is empty; it has one of the weirdest, most bug-prone interfaces of any extant C function — gets() is no longer counted as extant. If you know the sizes of everything, you don't need them. If you don't know the sizes, using them won't make you safe.
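As an illustration of the "know the sizes, then use memcpy()" argument, here is a minimal sketch of a bounded copy. The name copy_string and its return convention are assumptions made for this example, not an existing API.

```c
#include <string.h>

/* Copy `src` into `dst` only if it fits, terminator included.
 * Returns 0 on success, -1 if `dst` is too small; on failure `dst`
 * is set to the empty string when it has room for one byte.
 * (Hypothetical helper.) */
int copy_string(char *dst, size_t dst_size, const char *src)
{
    size_t len = strlen(src);   /* src must be a valid string */

    if (len >= dst_size) {
        if (dst_size > 0)
            dst[0] = '\0';
        return -1;              /* caller decides how to handle it */
    }
    memcpy(dst, src, len + 1);  /* +1 copies the terminator too */
    return 0;
}
```

Unlike strncpy(), this never leaves the destination unterminated, and it reports the overflow instead of silently truncating.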
Using sprintf() is unnecessarily dangerous; using snprintf() should be safe as long as you get the size correct and pay attention to data truncation by testing the return value. Check to see whether asprintf() and vasprintf() are available — and consider using them if they are.
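Checking the snprintf() return value might look like the sketch below. The function format_job_label and its "name-id" format are invented for illustration only.

```c
#include <stdio.h>

/* Build "<name>-<id>" in `dst`. Returns 0 on success, -1 on an
 * encoding error or if the result had to be truncated.
 * (Hypothetical helper.) */
int format_job_label(char *dst, size_t dst_size, const char *name, int id)
{
    int n = snprintf(dst, dst_size, "%s-%d", name, id);

    if (n < 0)
        return -1;              /* encoding error */
    if ((size_t)n >= dst_size)
        return -1;              /* output was truncated to fit */
    return 0;
}
```

snprintf() returns the length the full output would have had, so a value greater than or equal to the buffer size means data was lost.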
Format string vulnerabilities arise where you have:
printf(fmtstr, value1, value2);
where the fmtstr argument can be controlled or influenced by the user. If you can determine where the format string comes from and know it is safe, then there isn't a problem, and it can help with the internationalization of your code. If you can't determine that the format string is safe, you are running risks. How serious those risks are depends on the context in which it is used. If the user root will be running the code, which seems likely for a scheduler, then you must be meticulous. You may be able to be a little more blasé if the users running the code will not be root, but it is difficult to ensure that no-one ever runs the code as root.
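When the text itself comes from the user and no formatting is needed, one common mitigation is to pass it as data under a fixed format string. A small sketch, with log_untrusted as a hypothetical name:

```c
#include <stdio.h>

/* Write untrusted text to `out` as data. The format string is a
 * fixed literal, so conversion specifiers inside `untrusted` are
 * printed verbatim and never interpreted.
 * Returns what fprintf() returns: characters written, or negative
 * on error. (Hypothetical helper.) */
int log_untrusted(FILE *out, const char *untrusted)
{
    /* Unsafe variant, shown only for contrast:
     *     fprintf(out, untrusted);
     * Here "%s" or "%n" in the input would be acted upon. */
    return fprintf(out, "%s\n", untrusted);
}
```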
You're right that the _s functions are not available except on Windows. The external auditors have been downright unhelpful — suggesting the use of functions that are not available on the target platform is counter-productive. There is room to debate whether using the _s functions is sufficient, Microsoft notwithstanding. They can be misused, just as any function can. See the N1967 paper referenced in my answer to the TR 24731 question. (There are later papers available from the C standard committee's web site at http://www.open-std.org/jtc1/sc22/wg14/ which don't entirely agree with N1967 — N2336 from the Pre-London 2019 mailing, for example. I'm not sure I entirely agree with N2336.)
Consider whether strlcpy() and strlcat() are available and could/should be used for strcpy(), strncpy(), strcat(), strncat().

Input/ Output alternatives for printf/scanf

It may sound strange that knowing a lot about iOS and having some experience in .net, I am a newcomer to C. Somewhere I got this target to find average of n numbers without using printf and scanf. I don't want the code for the program but I am seeking alternatives to the mentioned functions.
Is code with printf/scanf required here? Also do let me know if my query stands invalid.
No, neither printf nor scanf is really needed for this.
The obvious alternatives would be to read the input with something like getc or fgets and convert from characters to numbers with something like strtol.
On the output side, you'd more or less reverse that, converting from numbers to characters (e.g., with itoa which is quite common, though not actually standard), then printing out the resulting string (e.g., with fputs).
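The building blocks mentioned above could be sketched as follows. This is not the averaging program itself, only the two conversions; read_long and long_to_text are hypothetical names, and long_to_text stands in for the non-standard itoa.

```c
#include <stdio.h>
#include <stdlib.h>

/* Read one line from `in` and convert it to a long with strtol().
 * Returns 1 on success, 0 on EOF or if the line has no leading number. */
int read_long(FILE *in, long *out)
{
    char line[64];
    char *end;

    if (fgets(line, sizeof line, in) == NULL)
        return 0;
    *out = strtol(line, &end, 10);
    return end != line;
}

/* Convert `value` to decimal text without any printf-family call.
 * Digits are written backwards from the end of `buf`; the returned
 * pointer is the start of the text inside `buf`. The caller must
 * supply a buffer large enough for sign, digits, and terminator
 * (32 bytes is ample for a 64-bit long). */
char *long_to_text(long value, char *buf, size_t size)
{
    char *p = buf + size;
    /* Negate in unsigned arithmetic so LONG_MIN is handled too. */
    unsigned long u = value < 0 ? 0UL - (unsigned long)value
                                : (unsigned long)value;

    *--p = '\0';
    do {
        *--p = (char)('0' + u % 10);
        u /= 10;
    } while (u != 0);
    if (value < 0)
        *--p = '-';
    return p;
}
```

The resulting string can then be printed with fputs(long_to_text(v, buf, sizeof buf), stdout).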

Test Cases of atof() function in structure format

Two days ago I attended an interview. I was asked a question and I am still looking for the answer. The question was: tell me the test cases for the atof(const char *str) function in C. I gave him various test cases, such as:
The given string should contain only numeric characters.
The given string contains one decimal point.
It should not overflow after conversion.
The string should not be null.
But the interviewer was not satisfied and asked me to give the answer in a structured format. Now my question is how to present this answer in a structured format, so that I do not make the same mistake in the future.
I'm not sure what the interviewer means by "structured format", but I would do this by writing down the BNF syntax for floating point numbers (the C language specifies them), and then presenting test cases that test for each path through the syntax. Your cases notably do not cover the sign or exponent, and the number need not contain a decimal point.
A structural approach breaks the problem down into subproblems. Syntax is one subproblem, and the syntax chart or BNF provides a natural way to break that down into subproblems. An additional subproblem is boundary conditions ... there should be test cases for the minimum (> 0) and maximum valid values. There should also be test cases for handling of invalid inputs, but as lundin noted in a comment, that's impossible for atof as the behavior for invalid inputs is undefined.
Maybe you can structure your answer by what you are testing: badly formatted strings (null, empty, etc.) and bad arguments such as bad "numbers" (0 prefix/suffix 2.0, 0.4, etc.). You can also test negative float numbers, put more than one dot in the string, or whatever. I hope I have answered your question; if not, I think I haven't understood the question well.
I understand the term "test cases" differently than you.
I think what he wants are various inputs to atof and their expected results. For example:
1. atof("1.5") should return 1.5.
2. atof("-7") should return -7.0.
3. atof("Hello, world") should fail. But following Lundin's comment, there's no defined failure behavior for atof, so you can't really test this.
The test cases should cover all the different things the function needs to test. But you don't need to write down these things - just the example inputs and expected outputs.
Writing this in a structured format is easy.
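One possible structured form is a table of inputs and expected results that a loop checks, as in the sketch below. The struct and function names are assumptions; the cases cover sign, optional decimal point, exponent, leading whitespace, and trailing text, and they assume the default "C" locale. Inputs with undefined behavior for atof are deliberately left out.

```c
#include <stdlib.h>

/* One test case: the input string and the value atof() should return. */
struct atof_case {
    const char *input;
    double expected;
};

/* All expected values are exactly representable as doubles, so a
 * direct == comparison is safe for this table. */
static const struct atof_case atof_cases[] = {
    { "1.5",    1.5    },   /* plain decimal            */
    { "-7",     -7.0   },   /* sign, no decimal point   */
    { "+0.25",  0.25   },   /* explicit plus sign       */
    { ".5",     0.5    },   /* no digits before the dot */
    { "2.",     2.0    },   /* no digits after the dot  */
    { "1e3",    1000.0 },   /* exponent                 */
    { "2.5E-1", 0.25   },   /* negative exponent        */
    { "  42",   42.0   },   /* leading whitespace       */
    { "3.5abc", 3.5    },   /* trailing text is ignored */
};

/* Returns the number of cases where atof() gave a different result. */
int count_atof_failures(void)
{
    int failures = 0;

    for (size_t i = 0; i < sizeof atof_cases / sizeof atof_cases[0]; i++) {
        if (atof(atof_cases[i].input) != atof_cases[i].expected)
            failures++;
    }
    return failures;
}
```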
We used to use atof in our code. Most of the time we needed to handle internationalization/localization; in many languages 10.0 is written as 10,0.
Before calling atof you need to set the locale, and after completing the work you have to reset the locale.
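A sketch of that set-and-restore pattern, assuming a hypothetical wrapper named atof_in_locale. Note that setlocale() changes the locale for the whole process, so this pattern is not safe when other threads format or parse numbers at the same time.

```c
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Parse `text` with atof() under the LC_NUMERIC rules of
 * `locale_name`, then restore the previous setting.
 * Returns 0 on success, -1 if the locale cannot be set or memory
 * runs out. (Hypothetical helper.) */
int atof_in_locale(const char *text, const char *locale_name, double *out)
{
    const char *current = setlocale(LC_NUMERIC, NULL);
    if (current == NULL)
        return -1;

    /* Copy the name: the returned string may be overwritten by the
     * next setlocale() call. */
    size_t len = strlen(current) + 1;
    char *saved = malloc(len);
    if (saved == NULL)
        return -1;
    memcpy(saved, current, len);

    if (setlocale(LC_NUMERIC, locale_name) == NULL) {
        free(saved);
        return -1;              /* requested locale is not available */
    }
    *out = atof(text);

    setlocale(LC_NUMERIC, saved);   /* put the old setting back */
    free(saved);
    return 0;
}
```

Which locale names exist (for example one that uses a decimal comma) depends on the system, so the name passed in should be treated as configuration rather than hard-coded.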

file and formatting alternative libs for c

I've done some searching and have not found anything that would boost the file and formatting functions in Visual Studio VS2010 C (not C++).
I've been able to address the raw i/o issues to some extent by using large buffers and a SSD drive, so the more pressing issue is a replacement for the family of printf functions.
Has anyone found something worthwhile?
As I understand it, part of the glacial speed issue with the printf functions is that they have to handle myriad types of arguments. Does anyone have experience with writing a datatype-specific version of printf; eg, one that only prints ints, or only prints doubles, etc?
First off, you should profile the code before assuming it's printf.
But if you're sure it's printf and similar then you can do a few things to fix the issue.
1) print less. IE, don't call expensive operations as much if you can avoid it. Do you need all the output, for example?
2) manually replace the string concatenation with manually built routines that do all the pieces without having to parse the format specifier.
EG: printf("--%s--", "really cool");
Can become:
write(1, "--", 2);
write(1, "really cool", 11);
write(1, "--", 2);
That may be faster. But again, you won't know until you profile it. Don't spend energy on a solution till you can confirm it's the solution you need and be able to measure the success of your proposed solution.
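A variation on the same idea is to assemble the pieces in a buffer and emit them with one call, which avoids both the format parsing and the three separate writes. This is only a sketch; build_decorated is a hypothetical name, and whether it is actually faster still has to be measured.

```c
#include <stdio.h>
#include <string.h>

/* Build "--<text>--" in `dst` without parsing a format string.
 * Returns the length written (terminator excluded), or 0 if `dst`
 * is too small. (Hypothetical helper.) */
size_t build_decorated(char *dst, size_t dst_size, const char *text)
{
    size_t len = strlen(text);

    if (len + 5 > dst_size)             /* 2 + len + 2 + terminator */
        return 0;
    memcpy(dst, "--", 2);
    memcpy(dst + 2, text, len);
    memcpy(dst + 2 + len, "--", 3);     /* 3 copies the terminator too */
    return len + 4;
}
```

The result can be sent out with a single fwrite(buf, 1, n, stdout), where n is the returned length.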
@Wes is right, never assume you know what you need to fix until you have proof.
Where I differ is on the method of finding out.
I and others use random pausing which works for these reasons, and here's a short slide show demo & C++ code so you can see how it works, if you want.
The thing about printf (or any output) function is it spends A) a certain number of CPU cycles creating a buffer to be output, and then it spends B) a certain amount of time waiting while the system and/or auxiliary hardware actually moves the data out.
That's maybe a bit over-simplified, but if you randomly pause and examine the state, that's what you see.
What you've done by using large buffers and an SSD drive is reduce B, and that's good.
That means of the time remaining, A is a larger fraction.
You know that.
Now of the samples you find in A, you might get a hint of what's happening if you see what subordinate routines inside printf are showing up.
Usually printf calls something like vprintf to get rid of the variable argument list, which then cycles over the format string to figure out what to do, including things like parsing precision specifiers.
If it looks like that's what it's doing, then you know about how much time goes into parsing the format.
On the other hand, if you see it inside a routine that is copying a string, or formatting an integer (along with dealing with leading/trailing characters, etc.) then you know to concentrate on that.
On yet another hand, if you see it inside a routine that looks like it's formatting a floating point number (which is actually quite complicated), you know to concentrate on that.
Given all that, you want to know what I do?
First, I ask who is going to read this anyway?
If nobody really needs to read all this text, why not pump it out in binary? Or failing that, in hex?
If you simply write binary, A shrinks to nothing, and when you read it back in with another program, guess what?
No Lost Bits!

Resources