How do I prevent buffer overflow converting a double to char?

How do I prevent buffer overflow converting a double to char? - c

I'm converting a double to a char string:
char txt[10];
double num;
num = 45.344322345;
sprintf(txt, "%.1f", num);
and using ".1f" to truncate the decimal places, to the tenths digit.
i.e. - txt contains 45.3
I usually use precision in sprintf to ensure the char buffer is not overflowed.
How can I do that here, while also truncating the decimal places, without using snprintf?
(i.e. if num = 345694876345.3 for some reason)
Thanks
EDIT If num is > buffer the result no longer matters, just do not want to crash. Not sure what would make the most sense in that case.
EDIT2 I should have made it more clear than in just the tag, that this is a C program.
I am having issues using snprintf in a C program. I don't want to add any 3rd party libraries.

Use snprintf(), which returns the number of characters the full output would have needed, so you can tell whether it was truncated. In general, you should size your array to be large enough to handle the longest string representation of the target type. If that is not known in advance, use malloc() (or asprintf(), which is non-standard, but present on many platforms).
Edit
snprintf() fails gracefully if the formatted output exceeds the given buffer: it truncates instead of overflowing. If you don't need to handle that case, then simply using it will solve your problem. I can't think of an instance where you would not want to handle that, but then again, I'm not working on whatever you are working on :)
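As a minimal sketch of that advice (the helper name format_tenths is hypothetical, not something from the question), the return value of snprintf() can be compared against the buffer size to detect truncation:

```c
#include <stddef.h>
#include <stdio.h>

/* Formats num with one decimal place into buf (capacity: size bytes).
 * Returns 0 if the whole string fit, -1 if it was truncated or an
 * error occurred. buf is never overrun either way. */
int format_tenths(char *buf, size_t size, double num)
{
    int needed = snprintf(buf, size, "%.1f", num);

    if (needed < 0 || (size_t)needed >= size)
        return -1;
    return 0;
}
```

With the 10-byte txt buffer from the question, 45.344322345 fits and 345694876345.3 is reported as truncated instead of crashing.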

Why not just make your buffer big enough to hold the largest possible string representation of a double?
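Assuming a 64-bit double using the IEEE standard for floating point arithmetic, which uses 52 bits for a mantissa: 2^52 = 4,503,599,627,370,496. So we need 16 characters to hold all the significant digits before and after the decimal point, or 19 considering the decimal point, sign character and null terminator. Note that this counts significant digits only: a fixed-notation format such as "%f" prints every digit of the magnitude, so a very large value still needs far more room (up to 309 digits before the decimal point).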
I would just use a buffer size of at least 20 characters and move on.
If you need to print a double using scientific notation, you will need to add enough space for the exponent. Assuming an 11-bit signed exponent, that's another 4 characters for the exponent plus a sign for the exponent and the letter 'E'. I would just go with 30 characters in that case.
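One way to check a buffer size rather than estimate it is to ask snprintf() how long the output would be for an extreme value. This is only a sketch: it assumes a C99 snprintf() (which accepts a null buffer with size 0) and IEEE-754 doubles, and both function names are made up for illustration.

```c
#include <float.h>
#include <stdio.h>

/* Characters (excluding the terminator) that "%.*f" needs for the most
 * negative finite double at the given precision. */
int worst_case_fixed_len(int precision)
{
    return snprintf(NULL, 0, "%.*f", precision, -DBL_MAX);
}

/* The same measurement for "%.17g", which switches to exponent
 * notation for large magnitudes. */
int worst_case_g17_len(void)
{
    return snprintf(NULL, 0, "%.17g", -DBL_MAX);
}
```

On an IEEE-754 system, worst_case_fixed_len(1) comes to 312 and worst_case_g17_len() to 24, so a 30-character buffer is enough for the exponent form but not for plain "%f"-style output of arbitrary values.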

If you absolutely must do it on your own, count the digits in the number before trying to convert:
int whole = num;   /* assumes num fits in an int */
int wholeDigits = 0;
do {
    ++wholeDigits;
} while (whole /= 10);
double fraction = num - (int) num;
int decimalDigits = 0;
while (fraction > 0) {
    ++decimalDigits;
    fraction *= 10;
    fraction = fraction - (int) fraction;
}
int totalLength = decimalDigits ? wholeDigits + decimalDigits + 1 : wholeDigits;
You should probably verify that this ad-hoc code works as advertised before relying on it to guard against crashes. I recommend that you use snprintf or something similar instead of my code, as others have said.
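If snprintf really is off the table, a plain range check before sprintf() gives the same protection as counting digits, with less code. The sketch below assumes the 10-byte buffer and the "%.1f" format from the question; the function name and the 1e5 cut-off are my own choices, picked so that the longest possible output is "-100000.0" (9 characters plus the terminator).

```c
#include <stdio.h>

/* Formats num as "%.1f" into a 10-byte buffer without snprintf.
 * Values with |num| >= 1e5, and NaN (which fails both comparisons),
 * are rejected up front and leave txt as an empty string.
 * Returns 0 on success, -1 if the value was rejected. */
int format_tenths_checked(char txt[10], double num)
{
    if (!(num > -1e5 && num < 1e5)) {
        txt[0] = '\0';
        return -1;
    }
    sprintf(txt, "%.1f", num);
    return 0;
}
```

The bound is deliberately conservative: a value such as -99999.99 rounds up to "-100000.0", which still fits.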

Why do you want to do it without snprintf? You should be using snprintf regardless of whether your format string contains a double, another string or anything else, really. As far as I can see, there's no reason not to.

Related

Convert Long To Double, Unexpected Results

I am using very basic code to convert a string into a long and into a double. The CAN library I am using requires a double as an input. I am attempting to send the device ID as a double to another device on the CAN network.
If I use an input string that is 6 bytes long, the long and double values are the same. If I add a 7th byte to the string, the values are slightly different.
I do not think I am hitting a max value limit. This code is run with ceedling for an automated test. The same behaviour is seen when sending this data across my CAN communications. In main.c the issue is not observed.
The test is:
void test_can_hal_get_spn_id(void){
struct dbc_id_info ret;
memset(&ret, NULL_TERMINATOR, sizeof(struct dbc_id_info));
char expected_str[8] = "smg123";
char out_str[8];
memset(&out_str, 0, 8);
uint64_t long_val = 0;
double phys = 0.0;
memcpy(&long_val, expected_str, 8);
phys = long_val;
printf("long %ld \n", long_val);
printf("phys %f \n", phys);
uint64_t temp = (uint64_t)phys;
memcpy(&out_str, &temp, 8);
printf("%s\n", expected_str);
printf("%s\n", out_str);
}
With the input = "smg123"
[test_can_hal.c]
- "long 56290670243187 "
- "phys 56290670243187.000000 "
- "smg123"
- "smg123"
With the input "smg1234"
[test_can_hal.c]
- "long 14692989459197299 "
- "phys 14692989459197300.000000 "
- "smg1234"
- "tmg1234"
Is this error just due to how floats are handled and rounded? Is there a way to test for that? Am I doing something fundamentally wrong?
Representing the char array as a double without the intermediate long solved the issue. For clarity I am using DBCPPP. I am using it in C. I should clarify my CAN library comes from NXP, DBCPPP allows my application to read a DBC file and apply the data scales and factors to my raw CAN data. DBCPPP accepts doubles for all data being encoded and returns doubles for all data being decoded.

The CAN library I am using requires a double as an input.
That sounds surprising, but if so, then why are you involving a long as an intermediary between your string and double?
If I use an input string that is 6 bytes long, the long and double values are the same. If I add a 7th byte to the string, the values are slightly different.
double is a floating point data type. To be able to represent values with a wide range of magnitudes, some of its bits are used to represent scale, and the rest to represent significant digits. A typical C implementation uses doubles with 53 bits of significand. It cannot exactly represent numbers with more than 53 significant binary digits. That's enough for 6 bytes, but not enough for 7.
I do not think I am hitting a max value limit.
Not a maximum value limit. A precision limit. A 64-bit long has smaller numeric range but more significant digits than an IEEE-754 double.
So again, what role is the long supposed to be playing in your code? If the objective is to get eight bytes of arbitrary data into a double, then why not go directly there? Example:
char expected_str[8] = "smg1234";
char out_str[8] = {0};
double phys = 0.0;
memcpy(&phys, expected_str, 8);
printf("phys %.14e\n", phys);
memcpy(&out_str, &phys, 8);
printf("%s\n", expected_str);
printf("%s\n", out_str);
Do note, however, that there is some risk when (mis)using a double this way. It is possible for the data you put in to constitute a trap representation (a signaling NaN might be such a representation, for example). Handling such a value might cause a trap, or cause the data to be corrupted, or possibly produce other misbehavior. It is also possible to run into numeric issues similar to the one in your original code.
Possibly your library provides some relevant guarantees in that area. I would certainly hope so if doubles are really its sole data type for communication. Otherwise, you could consider using multiple doubles to convey data payloads larger than 53 bits, each of which you could consider loading via your original technique.
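To answer the "is there a way to test for that?" part, here is a small sketch of a check for whether a 64-bit integer survives conversion to double. It assumes an IEEE-754 double with a 53-bit significand, and the function name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if v converts to an IEEE-754 double without rounding: once the
 * trailing zero bits are stripped (they are absorbed by the exponent),
 * what remains must fit in the 53-bit significand. */
bool fits_double_exactly(uint64_t v)
{
    if (v == 0)
        return true;
    while ((v & 1u) == 0)
        v >>= 1;
    return v < (UINT64_C(1) << 53);
}
```

For the two values in the question, it accepts 56290670243187 ("smg123") and rejects 14692989459197299 ("smg1234").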

If you have a look at the IEEE-754 Wikipedia page, you'll see that the double precision values have a precision of "[a]pproximately 16 decimal digits". And that's roughly where your problem seems to appear.
Specifically, though it's a 64-bit type, it does not have the necessary encoding to provide 2^64 distinct floating point values. There are many bit patterns that map to the same value.
For example, in single precision, NaN is encoded as an exponent field of binary 1111 1111 with a non-zero fraction (23 bits), regardless of the sign (one bit). That's 2 * (2^23 - 1) (over 16 million) distinct bit patterns representing NaN.
So, yes, your "due to how floats are handled and rounded" comment is correct.
In terms of fixing it, you'll either have to limit your strings to values that can be represented by doubles exactly, or find a way to send the strings across the CAN bus.
For example (if you can't send strings), two 32-bit integers could represent an 8-character string value with zero chance of information loss.
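A rough sketch of that idea, with each 32-bit half carried in its own double. Every 32-bit value fits in a double's 53-bit significand, so the conversion is exact. The function names are made up, and the sketch assumes both ends use the same byte order and that the doubles are passed through unscaled.

```c
#include <stdint.h>
#include <string.h>

/* Splits 8 bytes into two 32-bit words and stores each one as a double. */
void bytes_to_doubles(const char in[8], double out[2])
{
    uint32_t words[2];

    memcpy(words, in, sizeof words);
    out[0] = (double)words[0];
    out[1] = (double)words[1];
}

/* Reverses bytes_to_doubles(): converts both doubles back to 32-bit
 * words and copies them into the 8-byte output. */
void doubles_to_bytes(const double in[2], char out[8])
{
    uint32_t words[2];

    words[0] = (uint32_t)in[0];
    words[1] = (uint32_t)in[1];
    memcpy(out, words, sizeof words);
}
```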

Reading from binary file and converting to double?

I am trying to write a C program that reads a binary file and converts it to a data type. I am generating a binary file with a head command head -c 40000 /dev/urandom > data40.bin. The program works for data types int and char but fails for double. Here is the code for the program.
void double_funct(int readFrom, int writeTo){
double buffer[150];
int a = read(readFrom,buffer,sizeof(double));
while(a!=0){
int size = 1;
int c=0;
for(c=0;c<size;c++){
char temp[100];
int x = snprintf(temp,100,"%f ", buffer[c]);
write(writeTo, temp, x);
}
a = read(readFrom,buffer,sizeof(double));
}
}
and this is the char function that works
void char_funct(int readFrom, int writeTo){
char buffer[150];
int a = read(readFrom,buffer,sizeof(char));
while(a!=0){
int size = 1;
int c=0;
for(c=0;c<size;c++){
char temp[100]=" ";
snprintf(temp,100,"%d ", buffer[c]);
write(writeTo, temp, strlen(temp));
}
a = read(readFrom,buffer,sizeof(char));
}
}
With char, wc -w on the output file should report 40000 words (one per byte), and it does. With double, 40000 bytes of data should in theory give 5000 words, but I get a seemingly random count between 4000 and 15000.
I don't know what is wrong; the same code works for int, where I get 10000 words from 40000 bytes of data.

The main problem seems to be that your temp array is not large enough for your printf format and data. IEEE-754 doubles have a decimal exponent range from -308 to +308. You're printing your doubles with format "%f", which produces a plain decimal representation. Since no precision is specified, the default precision of 6 applies. This may require as many as 1 (sign) + 309 (digits) + 1 (decimal point) + 6 (trailing decimal places) + 1 (terminator) chars (a total of 318), but you only have space for 100.
You print to your buffer using snprintf(), and therefore do not overrun the array bounds there, but snprintf() returns the number of bytes that would have been required, less the one required for the terminator. That's the number of bytes you write(), and in many cases that does overrun your buffer. You see the result in your output.
Secondarily, you'll also see a large number of 0.00000 in your output, arising from rounding small numbers to 6-decimal-digit precision.
You would probably have better success if you change the format with which you're printing the numbers. For example, "%.16e " will give you output in exponential format with a total of 17 significant digits (one preceding the decimal point). That will not require excessive space in memory or on disk, and it will accurately convey all numbers, regardless of scale, supposing again that your doubles are represented per IEEE 754. If you wish, you can furthermore eliminate the (pretty safe) assumption of IEEE 754 format by employing the variation suggested by @chux in comments. That would be the safest approach.
One more thing: IEEE floating point supports infinities and multiple not-a-number values. These are very few in number relative to ordinary FP numbers, but it is still possible that you'll occasionally hit on one of these. They'll probably be converted to output just fine, but you may want to consider whether you need to deal specially with them.
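To illustrate the snprintf() return-value point above, here is a sketch of a helper (hypothetical name) that clamps the count to what was actually stored, so that the length later passed to write() can never exceed the buffer:

```c
#include <stddef.h>
#include <stdio.h>

/* Formats v with "%f " into buf and returns the number of bytes that
 * were actually stored (excluding the terminator). snprintf() reports
 * the length the full output would have needed, so that value is
 * clamped to size - 1 when the output was truncated. */
size_t format_clamped(char *buf, size_t size, double v)
{
    int needed = snprintf(buf, size, "%f ", v);

    if (needed < 0 || size == 0)
        return 0;
    if ((size_t)needed >= size)
        return size - 1;
    return (size_t)needed;
}
```

With a 100-byte temp buffer, a value like 1e300 yields 99 rather than the 300-odd bytes the unclamped return value would suggest. Clamping only prevents the overrun; the output is still truncated, which is why switching to an exponent format is the better fix.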

Difference in the values of atoi

I have the following code:
char* input = (char*)malloc(sizeof(char) * BUFFER); // BUFFER is defined as 100
int digit = atoi(input); // convert char into a digit
int digit_check = 0;
digit_check += digit % 10; // get last value of digit
When I run it with the input 1234567896, digit = 1234567896 and digit_check = 6, as expected.
However when I run the input 9999999998, digit = 1410065406 and therefore digit_check = 6 when it should be 8.
For the second example, why is there a difference between input and digit when it should be the same value?

Probably because 9999999998 is bigger than the maximum (signed) integer representation, so you get an overflow.
In fact this is the binary representation of 9999999998 and 1410065406:
10 01010100 00001011 11100011 11111110
01010100 00001011 11100011 11111110
As you can see, 1410065406 is the low 32 bits of 9999999998.
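The same relationship can be checked in code. This is only an illustration of the arithmetic (the function name is made up): strictly speaking, atoi's behaviour on overflow is undefined, and wrapping to the low 32 bits is just what was observed here.

```c
#include <stdint.h>

/* Keeps only the low 32 bits of a 64-bit value; converting to an
 * unsigned 32-bit type is well defined and reduces modulo 2^32. */
uint32_t low_32_bits(uint64_t v)
{
    return (uint32_t)v;
}
```

low_32_bits(9999999998) gives 1410065406, the value the question reports for digit.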

atoi is limited to the size of an int (32 bits on most recent platforms).
If you want to handle larger numbers, you can use atol or scanf("%ld").
Don't forget to declare your variable as long int (or long).
You could also just take the very last character of your input (gathered as a string rather than as an int) and use atoi on it, so it would never overflow.
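The last-character idea might look like the sketch below (the function name is hypothetical). It reads the digit straight from the string, so the size of the number does not matter:

```c
#include <ctype.h>
#include <string.h>

/* Returns the last decimal digit of the string s, or -1 if s is empty
 * or its last character is not a digit. */
int last_digit(const char *s)
{
    size_t len = strlen(s);

    if (len == 0 || !isdigit((unsigned char)s[len - 1]))
        return -1;
    return s[len - 1] - '0';
}
```

Note that input read with fgets() usually ends in a newline, which would need to be stripped first.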

On many platforms int is 4 bytes, which limits digit to the range [-2**31, 2**31 - 1].
Use long (or long long) with strtol (or strtoll) depending on platform you build for. For example, GCC on x86 will have 64-bit long long, and for amd64 it will have 64-bit long and long long types.
So:
long long digit = strtoll(input, NULL, 10);
NOTE: strtoll() is popular in Unix-like systems and became standard in C99 and C++11, but not all VC++ implementations have it. There, use _strtoi64() instead:
__int64 digit = _strtoi64(input, NULL, 10);

You probably want to use the atoll function, which returns a long long int, that is twice as big as int (most likely 64 bits in your case).
It is declared in stdlib.h
http://linux.die.net/man/3/atoll

You should avoid calling atoi on an uninitialized string: if there is no \0 in the buffer, you will read invalid memory and may get a segmentation fault.
You should use strtoimax instead; it's safer.

9999999998 is bigger than the maximum value that an integer can represent. Either use atol() OR atoll()

You should stop using atoi function or any other functions from ato... group. These functions are not officially deprecated, but they are effectively abandoned since 1995 and exist only for legacy code compatibility purposes. Forget about these functions as if they do not exist. These functions provide no usable feedback in case of error or overflow. And overflow is what apparently happens in your example.
In order to convert strings to numbers, C standard library provides strtol function and other functions from strto... group. These are the functions you should use to perform the conversion. And don't forget to check the result for overflow: strto... functions provide this feedback through the return value and errno variable.
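A minimal sketch of such a checked conversion, using strtoll (the wrapper name parse_ll and its 0/-1 return convention are my own):

```c
#include <errno.h>
#include <stdlib.h>

/* Parses a base-10 integer from s into *out.
 * Returns 0 on success, or -1 if s contains no digits, has trailing
 * characters after the number, or is out of range for long long. */
int parse_ll(const char *s, long long *out)
{
    char *end;
    long long v;

    errno = 0;
    v = strtoll(s, &end, 10);
    if (end == s || *end != '\0')
        return -1;              /* nothing parsed, or junk after the number */
    if (errno == ERANGE)
        return -1;              /* value clamped to LLONG_MIN / LLONG_MAX */
    *out = v;
    return 0;
}
```

With this, the input 9999999998 is converted without loss and its last digit comes out as 8, while a value too large even for long long is reported as an error instead of silently wrapping.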

How to print the char, integer, float and double values without using format specifiers in c

I want to print the char, int, float and double values without using format specifiers in my c program.
I am able to print a string using the code below:
char s[] = "Hello\n";
fprintf(stdout, s);
how can I print the other data type values?

To print a char, use:
fputc(c, stream)
(If the stream is stdout, you can use putchar(c).)
To print an int:
If the int is negative, print “-”.
Calculate the individual digits of the integer. This can be done either by calculating the digits from least significant to most significant and saving them in a buffer to be printed in reverse order or by figuring out where the most significant digit is and then calculating the digits from most significant to least significant. You can use a remainder operation, such as x % 10, to calculate the least significant digit of a number, and you can use division, such as x / 10, to remove that digit.
One caveat is that, if the original number is negative, you have to be careful about calculating its digits. The % operator will return negative values. Some people attempt to deal with this by negating the integer if it is negative. However, if the number is the least possible int, this may overflow. E.g., in many C implementations, the least int value is -2,147,483,648, but it cannot be negated because the greatest int is 2,147,483,647.
Any digit in numeric form (0 to 9) can be converted to a character (“0” to “9”) by adding '0', such as int d = x % 10; char c = d + '0';. The C standard guarantees that this produces the appropriate character in c.
After you obtain the characters from the digits, print them.
To print a float or double:
Doing this completely correctly is hard, although it is a solved problem. The classic reference for it is Correctly Rounded Binary-Decimal and Decimal-Binary Conversions by David M. Gay.
If you just want a simple implementation suitable for a learning exercise, then you can format a floating-point value much as you would an integer: Calculate the digits individually. You also need to decide whether to print a fixed-point notation or a scientific notation (or other).
To print a fixed-point notation, print the integer part of the value as above, for integer types. Then print a “.” and some digits for the fractional part of the value.
To print a scientific notation, calculate the value of the exponent part (e.g., to express 12345678.9 as “1.23456789e7”, the exponent is 7, for 10^7). Divide the value by 10 raised to the power of that exponent and print the resulting value as a fixed-point number (so, in this example, you print “1.23456789”), then print “e”, then print the exponent part.
Floating-point rounding errors will occur in the above, making it suitable only for a learning exercise, not for use in a quality product.
The above should suffice to get you started. It is not complete code, obviously.
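The fixed-point recipe above can be sketched roughly as follows. This is a learning exercise only: it truncates instead of rounding, assumes the magnitude is below 2^63, and writes into a caller-supplied buffer (the function name and that interface are my own choices) so the result can be printed afterwards with fputs().

```c
#include <stddef.h>

/* Writes x in fixed-point notation with `decimals` fractional digits
 * into out and returns the string length. out must be large enough
 * (about 22 + decimals bytes). Digits are truncated, not rounded. */
size_t format_fixed(double x, int decimals, char *out)
{
    char tmp[20];               /* integer digits, least significant first */
    size_t n = 0, t = 0;
    unsigned long long ip;
    double frac;

    if (x < 0) {
        out[n++] = '-';
        x = -x;
    }
    ip = (unsigned long long)x; /* integer part */
    frac = x - (double)ip;      /* fractional part, in [0, 1) */

    do {
        tmp[t++] = (char)('0' + ip % 10);
        ip /= 10;
    } while (ip != 0);
    while (t > 0)
        out[n++] = tmp[--t];

    if (decimals > 0) {
        out[n++] = '.';
        for (int i = 0; i < decimals; i++) {
            int d;
            frac *= 10;
            d = (int)frac;      /* next fractional digit */
            out[n++] = (char)('0' + d);
            frac -= d;
        }
    }
    out[n] = '\0';
    return n;
}
```

For example, format_fixed(12.5, 2, buf) produces "12.50" and format_fixed(-3.25, 3, buf) produces "-3.250". Values whose fractions are not exact in binary will show the rounding errors mentioned above.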

just one thought, not very optimal:
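int myvalue = 12345;
char buffer[100];
size_t index = 0;
do { /* do/while so that 0 still produces one digit */
    buffer[index] = '0' + myvalue % 10;
    myvalue = myvalue / 10;
    index++;
} while (myvalue);
buffer[index] = '\0';
reverse(buffer); /* not a standard function: reverses the string in place */
fprintf(stdout, buffer);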
you have to consider the negative sign. And the sizeof buffer (100 is a very bad guess).

To print a char as a character use fputc().
To print an integer (including char) in its decimal form, call either print_unsigned() or print_signed(), depending on if it is a signed integer or an unsigned integer.
The below uses recursion to print the most significant digits first.
For signed integers, it flips positive numbers to negative avoiding the undefined behavior of -INT_MIN.
int print_unsigned(uintmax_t x) {
if (x >= 10) {
if (print_unsigned(x / 10) == EOF) return EOF;
}
return fputc('0' + x % 10, stdout);
}
static int print_signed_helper(intmax_t x) {
if (x <= -10) {
if (print_signed_helper(x / 10) == EOF) return EOF;
}
return fputc('0' - x % 10, stdout);
}
int print_signed(intmax_t x) {
if (x < 0) {
if (fputc('-', stdout) == EOF) return EOF;
} else {
x = -x; // overflow not possible
}
return print_signed_helper(x);
}
The above stops early if the output causes an output error. EOF is something rarely returned from fputc().
To print a float or double: TBD code.

The short answer is that while it may be possible to hack something together that will do something RESEMBLING what you want (along the lines of Peter Miehle's solution posted in another answer), fundamentally C is not designed for this kind of functionality, and there is no support for it in the language. What you want is function overloading, which C++ and many other higher-level languages provide.
Even Peter Miehle's solution cannot be implemented as a function (in C), because what kind of argument would the function take? Either it is passed a type, in which case we KNOW the type and may as well use printf, or it is passed e.g. a void pointer, in which case, how can it implement the arithmetic operators without knowing the underlying data type the pointer points to?

Dynamic allocation in C

I'm writing a program and I have the following problem:
char *tmp;
sprintf (tmp,"%ld",(long)time_stamp_for_file_name);
Could someone explain how much memory to allocate for the string tmp?
How many chars are a long variable?
Thank you,
I would also appreciate a link to an exhaustive resource on this kind of information.
Thank you
UPDATE:
Using your examples I got the following problem:
root#-[/tmp]$cat test.c
#include <stdio.h>
int
main()
{
int len;
long time=12345678;
char *tmp;
len=snprintf(NULL,0,"%ld",time);
printf ("Lunghezza:di %ld %d\n",time,len);
return 0;
}
root#-[/tmp]$gcc test.c
root#-[/tmp]$./a.out
Lunghezza:di 12345678 -1
root#-[/tmp]$
So the len result from snprintf is -1, I compiled on Solaris 9 with the standard compiler.
Please help me!

If your compiler conforms to C99, you should be able to do:
char *tmp;
int req_bytes = snprintf(NULL, 0, "%ld",(long)time_stamp_for_file_name);
tmp = malloc(req_bytes + 1); // +1 for the terminating NUL
if(!tmp) {
die_horrible_death();
}
if(snprintf(tmp, req_bytes+1, "%ld",(long)time_stamp_for_file_name) != req_bytes) {
die_horrible_death();
}
Relevant parts of the standard (from the draft document):
7.19.6.5.2: If n is zero, nothing is written, and s may be a null pointer.
7.19.6.5.3: The snprintf function returns the number of characters that would have been written
had n been sufficiently large, not counting the terminating null character, or a negative
value if an encoding error occurred. Thus, the null-terminated output has been
completely written if and only if the returned value is nonnegative and less than n.
If this is not working, I'm guessing your compiler/libc does not support this part of C99, or you might need to explicitly enable it. When I run your example (with gcc version 4.5.0 20100610 (prerelease), Linux 2.6.34-ARCH), I get
$./example
Lunghezza:di 12345678 8

The number of chars actually used obviously depends on the value: if time_stamp_for_file_name is 0 you only actually need 2 bytes. If there's any doubt, you can use snprintf, which tells you how much space you need:
int len = snprintf(0, 0, "%ld", (long)time_stamp_for_file_name) + 1;
char *tmp = malloc(len);
if (tmp == 0) { /* handle error */ }
snprintf(tmp, len, "%ld", (long)time_stamp_for_file_name);
Beware implementations where snprintf returns -1 for insufficient space, rather than the space required.
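One way to cope with both behaviours is to retry with a bigger buffer until the output fits. The following is a sketch under that assumption; the function name, the starting size of 16 and the 4096-byte cap are arbitrary choices of mine.

```c
#include <stdio.h>
#include <stdlib.h>

/* Returns a malloc'ed string holding value in decimal, or NULL on
 * failure. Works whether snprintf() reports the needed length (C99)
 * or a negative value (some older libraries) when the buffer is too
 * small. The caller must free() the result. */
char *format_long_alloc(long value)
{
    size_t size = 16;
    char *buf = malloc(size);

    while (buf != NULL) {
        int n = snprintf(buf, size, "%ld", value);
        char *bigger;

        if (n >= 0 && (size_t)n < size)
            return buf;                     /* fits, terminator included */

        /* C99: n is the length needed. Pre-C99: just double and retry. */
        size = (n >= 0) ? (size_t)n + 1 : size * 2;
        if (size > 4096) {                  /* give up on persistent errors */
            free(buf);
            return NULL;
        }
        bigger = realloc(buf, size);
        if (bigger == NULL)
            free(buf);
        buf = bigger;
    }
    return NULL;
}
```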
As Paul R says, though, you can figure out a fixed upper bound based on the size of long on your implementation. That way you avoid dynamic allocation entirely. For example:
#define LONG_LEN (((sizeof(long)*CHAR_BIT)/3)+2)
(based on the fact that the base-2 log of 10 is greater than 3). That +2 gives you 1 for the minus sign and 1 for the fact that integer division rounds down. You'd need another 1 for the nul terminator.
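As suggested further down for the other macro, this kind of bound is easy to unit-test. A small sketch of such a check (the function name is hypothetical), formatting the longest possible value into a buffer of LONG_LEN + 1 bytes:

```c
#include <limits.h>
#include <stddef.h>
#include <stdio.h>

#define LONG_LEN (((sizeof(long) * CHAR_BIT) / 3) + 2)

/* Returns nonzero if LONG_MIN, the longest long when printed in
 * decimal, needs no more than LONG_LEN characters (the extra byte in
 * the buffer is for the nul terminator). */
int long_len_is_enough(void)
{
    char tmp[LONG_LEN + 1];
    int n = snprintf(tmp, sizeof tmp, "%ld", LONG_MIN);

    return n > 0 && (size_t)n <= LONG_LEN;
}
```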
Or:
#define STRINGIFY(ARG) #ARG
#define EXPAND_AND_STRINGIFY(ARG) STRINGIFY(ARG)
#define VERBOSE_LONG EXPAND_AND_STRINGIFY(LONG_MIN)
#define LONG_LEN sizeof(VERBOSE_LONG)
char tmp[LONG_LEN];
sprintf(tmp, "%ld", (long)time_stamp_for_file_name);
VERBOSE_LONG might be a slightly bigger string than you actually need. On my compiler it's (-2147483647L-1). I'm not sure whether LONG_MIN can expand to something like a hex literal or a compiler intrinsic, but if so then it could be too short, and this trick won't work. It's easy enough to unit-test, though.
If you want a tight upper bound to cover all possibilities within the standard, up to a certain limit, you could try something like this:
#if LONG_MAX <= 2147483647L
#define LONG_LEN 11
#else
#if LONG_MAX <= 4294967295L
#define LONG_LEN 11
#else
#if LONG_MAX <= 8589934591L
... etc, add more clauses as new architectures are
invented with bigger longs
#endif
#endif
#endif
But I doubt it's worth it: better just to define it in some kind of portability header and configure it manually for new platforms.

It's hard to tell in advance, although I guess you could guesstimate that it'll be at the most 64 bits, and thus "18,446,744,073,709,551,615" should be the largest possible value. That's 2+6*3 = 20 digits, the commas are generally not included. It'd be 21 for a negative number. So, go for 32 bytes as a nice and round size.
Better would be to couple that with using snprintf(), so you don't get a buffer overflow if your estimate is off.

It depends on how big long is on your system. Assuming a worst case of 64 bits then you need 22 characters max - this allows for 20 digits, a preceding - and a terminating \0. Of course if you're feeling extravagant you could always allow a little extra and make it a nice round number like 32.

It takes log2(10) (~3.32) bits to represent a decimal digit; thus, you can compute the number of digits like so:
#include <limits.h>
#include <math.h>
long time;
double bitsPerDigit = log10(10.0) / log10(2.0); /* or log2(10.0) in C99 */
size_t digits = ceil((sizeof time * (double) CHAR_BIT) / bitsPerDigit);
char *tmp = malloc(digits+2); /* or simply "char tmp[digits+2];" in C99 */
The "+2" accounts for sign and the 0 terminator.

Octal requires one character per three bits. You print to base of ten which never gives more digits than octal for same number. Therefore, allocate one character for each three bits.
sizeof(long) gives you amount of bytes when compiling. Multiply that by 8 to get bits. Add two before dividing by three so you get ceiling instead of floor. Remember the C strings want a final zero byte to their end, so add one to the result. (Another one for negative, as described in comments).
char tmp[(sizeof(long)*8+2)/3+2];
sprintf (tmp,"%ld",(long)time_stamp_for_file_name);

3*sizeof(type)+2 is a safe general rule for the number of bytes needed to format an integer type type as a decimal string, the reason being that 3 is an upper bound on log10(256) and an n-byte integer is n digits in base-256 and thus ceil(log10(256^n))==ceil(n*log10(256)) digits in base 10. The +2 is to account for the terminating NUL byte and possible minus sign if type is very small.
If you want to be pedantic and support DSPs and such with CHAR_BIT!=8 then use 3*sizeof(type)*((CHAR_BIT+7)/8)+2. (Note that for POSIX systems this is irrelevant since POSIX requires UCHAR_MAX==255 and CHAR_BIT==8.)
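A quick sketch of the rule in use, assuming CHAR_BIT == 8. The macro and function names are invented for this example, which checks the most negative value of a few types, including signed char, where the bound is exactly tight ("-128" plus the terminator is 5 bytes).

```c
#include <limits.h>
#include <stddef.h>
#include <stdio.h>

/* Buffer size from the 3*sizeof(type)+2 rule (assumes 8-bit bytes). */
#define DEC_BUF_SIZE(type) (3 * sizeof(type) + 2)

/* Returns nonzero if the most negative value of each tested type fits
 * in a buffer sized by DEC_BUF_SIZE, terminator included. */
int smallest_values_fit(void)
{
    char c[DEC_BUF_SIZE(signed char)];
    char i[DEC_BUF_SIZE(int)];
    char ll[DEC_BUF_SIZE(long long)];
    int nc = snprintf(c, sizeof c, "%d", SCHAR_MIN);
    int ni = snprintf(i, sizeof i, "%d", INT_MIN);
    int nl = snprintf(ll, sizeof ll, "%lld", LLONG_MIN);

    return nc > 0 && (size_t)nc < sizeof c
        && ni > 0 && (size_t)ni < sizeof i
        && nl > 0 && (size_t)nl < sizeof ll;
}
```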