Dynamic allocation in C

I'm writing a program and I have the following problem:
char *tmp;
sprintf (tmp,"%ld",(long)time_stamp_for_file_name);
Could someone explain how much memory to allocate for the string tmp?
How many chars does a long variable take?
Thank you,
I would also appreciate a link to an exhaustive resource on this kind of information.
Thank you
UPDATE:
Using your examples I got the following problem:
root#-[/tmp]$cat test.c
#include <stdio.h>
int
main()
{
    int len;
    long time=12345678;
    char *tmp;
    len=snprintf(NULL,0,"%ld",time);
    printf ("Lunghezza:di %ld %d\n",time,len);
    return 0;
}
root#-[/tmp]$gcc test.c
root#-[/tmp]$./a.out
Lunghezza:di 12345678 -1
root#-[/tmp]$
So the len result from snprintf is -1; I compiled on Solaris 9 with the standard compiler.
Please help me!

If your compiler conforms to C99, you should be able to do:
char *tmp;
int req_bytes = snprintf(NULL, 0, "%ld", (long)time_stamp_for_file_name);
tmp = malloc(req_bytes + 1); /* +1 for the terminating null byte */
if (!tmp) {
    die_horrible_death();
}
if (snprintf(tmp, req_bytes + 1, "%ld", (long)time_stamp_for_file_name) != req_bytes) {
    die_horrible_death();
}
Relevant parts of the standard (from the draft document):
7.19.6.5 p2: If n is zero, nothing is written, and s may be a null pointer.
7.19.6.5 p3: The snprintf function returns the number of characters that would have been written had n been sufficiently large, not counting the terminating null character, or a negative value if an encoding error occurred. Thus, the null-terminated output has been completely written if and only if the returned value is nonnegative and less than n.
If this is not working, I'm guessing your compiler/libc does not support this part of C99, or you might need to enable it explicitly. When I run your example (with gcc version 4.5.0 20100610 (prerelease), Linux 2.6.34-ARCH), I get
$./example
Lunghezza:di 12345678 8
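If your libc predates C99 (Solaris 9's apparently does), one workaround is to skip the measuring call entirely: format into a stack buffer that is provably large enough, then copy the exact amount. A minimal sketch under that assumption (long_to_string is a made-up helper name, not from the question):
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

char *long_to_string(long value)
{
    /* 3 chars per byte bounds the decimal digits; +2 for sign and '\0' */
    char buf[3 * sizeof(long) + 2];
    char *copy;

    sprintf(buf, "%ld", value); /* cannot overflow buf */
    copy = malloc(strlen(buf) + 1);
    if (copy != NULL)
        strcpy(copy, buf);
    return copy; /* caller must free() */
}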

The number of chars actually used obviously depends on the value: if time_stamp_for_file_name is 0 you only actually need 2 bytes. If there's any doubt, you can use snprintf, which tells you how much space you need:
int len = snprintf(0, 0, "%ld", (long)time_stamp_for_file_name) + 1;
char *tmp = malloc(len);
if (tmp == 0) { /* handle error */ }
snprintf(tmp, len, "%ld", (long)time_stamp_for_file_name);
Beware implementations where snprintf returns -1 for insufficient space, rather than the space required.
As Paul R says, though, you can figure out a fixed upper bound based on the size of long on your implementation. That way you avoid dynamic allocation entirely. For example:
#define LONG_LEN (((sizeof(long)*CHAR_BIT)/3)+2)
(based on the fact that the base-2 log of 10 is greater than 3). That +2 gives you 1 for the minus sign and 1 for the fact that integer division rounds down. You'd need another 1 for the nul terminator.
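For instance, a minimal usage sketch (note the extra +1 for the terminator):
#include <limits.h> /* CHAR_BIT */
#include <stdio.h>

#define LONG_LEN (((sizeof(long)*CHAR_BIT)/3)+2)

char tmp[LONG_LEN + 1]; /* +1 for the nul terminator */
sprintf(tmp, "%ld", (long)time_stamp_for_file_name);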
Or:
#define STRINGIFY(ARG) #ARG
#define EXPAND_AND_STRINGIFY(ARG) STRINGIFY(ARG)
#define VERBOSE_LONG EXPAND_AND_STRINGIFY(LONG_MIN)
#define LONG_LEN sizeof(VERBOSE_LONG)
char tmp[LONG_LEN];
sprintf(tmp, "%ld", (long)time_stamp_for_file_name);
VERBOSE_LONG might be a slightly bigger string than you actually need. On my compiler it's (-2147483647L-1). I'm not sure whether LONG_MIN can expand to something like a hex literal or a compiler intrinsic, but if so then it could be too short, and this trick won't work. It's easy enough to unit-test, though.
If you want a tight upper bound to cover all possibilities within the standard, up to a certain limit, you could try something like this:
#if LONG_MAX <= 2147483647L
#define LONG_LEN 11
#else
#if LONG_MAX <= 4294967295L
#define LONG_LEN 11
#else
#if LONG_MAX <= 8589934591L
/* ... etc., add more clauses as new architectures
   are invented with bigger longs */
#endif
#endif
#endif
But I doubt it's worth it: better just to define it in some kind of portability header and configure it manually for new platforms.

It's hard to tell in advance, although I guess you could estimate that it'll be at most 64 bits, and thus "18,446,744,073,709,551,615" should be the largest possible value. That's 2 + 6*3 = 20 digits; the commas are generally not included. It'd be 21 for a negative number. So, go for 32 bytes as a nice round size.
Better would be to couple that with using snprintf(), so you don't get a buffer overflow if your estimate is off.
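For example, a minimal sketch of that combination (the 32-byte size is just the estimate above):
char txt[32]; /* generous fixed estimate */
int n = snprintf(txt, sizeof txt, "%ld", (long)time_stamp_for_file_name);
if (n < 0 || (size_t)n >= sizeof txt) {
    /* truncated or encoding error: handle it */
}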

It depends on how big long is on your system. Assuming a worst case of 64 bits then you need 22 characters max - this allows for 20 digits, a preceding - and a terminating \0. Of course if you're feeling extravagant you could always allow a little extra and make it a nice round number like 32.

It takes log2(10) (~3.32) bits to represent a decimal digit; thus, you can compute the number of digits like so:
#include <limits.h>
#include <math.h>
#include <stdlib.h> /* malloc */
long time;
double bitsPerDigit = log10(10.0) / log10(2.0); /* or log2(10.0) in C99 */
size_t digits = ceil((sizeof time * (double) CHAR_BIT) / bitsPerDigit);
char *tmp = malloc(digits + 2); /* or simply "char tmp[digits+2];" in C99 */
The "+2" accounts for sign and the 0 terminator.

Octal requires one character per three bits. You are printing in base ten, which never yields more digits than octal does for the same number. Therefore, allocate one character for each three bits.
sizeof(long) gives you the number of bytes at compile time. Multiply that by 8 to get bits. Add two before dividing by three so you get the ceiling instead of the floor. Remember that C strings want a final zero byte at their end, so add one to the result (and another one for the negative sign, as described in comments).
char tmp[(sizeof(long)*8+2)/3+2];
sprintf (tmp,"%ld",(long)time_stamp_for_file_name);

3*sizeof(type)+2 is a safe general rule for the number of bytes needed to format an integer type type as a decimal string, the reason being that 3 is an upper bound on log10(256) and a n-byte integer is n digits in base-256 and thus ceil(log10(256^n))==ceil(n*log10(256)) digits in base 10. The +2 is to account for the terminating NUL byte and possible minus sign if type is very small.
If you want to be pedantic and support DSPs and such with CHAR_BIT!=8 then use 3*sizeof(type)*((CHAR_BIT+7)/8)+2. (Note that for POSIX systems this is irrelevant since POSIX requires UCHAR_MAX==255 and CHAR_BIT==8.)
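As a sketch, the pedantic variant can be wrapped in a macro (INT_STRLEN is a made-up name, not a standard one):
#include <limits.h>
#include <stdio.h>

/* bytes needed to format any value of the given integer type in decimal,
   per the rule above: 3 chars per 8-bit chunk, +2 for sign and NUL */
#define INT_STRLEN(type) (3 * sizeof(type) * ((CHAR_BIT + 7) / 8) + 2)

char buf[INT_STRLEN(long)];
sprintf(buf, "%ld", (long)time_stamp_for_file_name);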

Related

Difference in the values of atoi

I have the following code:
char* input = (char*)malloc(sizeof(char) * BUFFER); // BUFFER is defined as 100
int digit = atoi(input); // convert the string into a number
int digit_check = 0;
digit_check += digit % 10; // get the last digit
When I run with the input 1234567896, digit = 1234567896 and digit_check = 6, as expected.
However when I run the input 9999999998, digit = 1410065406 and therefore digit_check = 6 when it should be 8.
For the second example, why is there a difference between input and digit when it should be the same value?
Probably because 9999999998 is bigger than the maximum (signed) integer representation, so you get an overflow.
In fact this is the binary representation of 9999999998 and 1410065406:
10 01010100 00001011 11100011 11111110
01010100 00001011 11100011 11111110
As you can see, 1410065406 is just the low 32 bits of 9999999998.
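You can reproduce the truncation directly by masking off the low 32 bits (a sketch; strictly speaking the overflow inside atoi is undefined behavior, this just reproduces the observed result):
#include <stdio.h>

int main(void)
{
    long long big = 9999999998LL;
    printf("%lld\n", big & 0xFFFFFFFFLL); /* prints 1410065406 */
    return 0;
}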
atoi is limited to the size of int (32 bits on most recent platforms).
If you want to handle larger numbers, you can use atol or scanf("%ld").
Don't forget to declare your variable as long int (or long).
You could also just take the very last character of your input (kept as a string rather than converted to an int) and use atoi on that, so it would never overflow.
On many platforms the size of int is 4 bytes, which limits digit to [-2^31, 2^31 - 1].
Use long (or long long) with strtol (or strtoll) depending on the platform you build for. For example, GCC on x86 will have a 64-bit long long, and on amd64 both long and long long are 64-bit types.
So:
long long digit = strtoll(input, NULL, 10);
NOTE: strtoll() is popular in Unix-like systems and became standard in C++11, but not all VC++ implementations have it. Use _strtoi64() instead:
__int64 digit = _strtoi64(input, NULL, 10);
You probably want to use the atoll function, which returns a long long int, most likely 64 bits in your case.
It is declared in stdlib.h
http://linux.die.net/man/3/atoll
You should avoid calling atoi on an uninitialized string: if there is no \0 in the string, you will perform an invalid read and may get a segmentation fault.
You should use strtoimax instead; it's safer.
9999999998 is bigger than the maximum value that an int can represent. Either use atol() or atoll().
You should stop using the atoi function, or any other functions from the ato... group. These functions are not officially deprecated, but they have been effectively abandoned since 1995 and exist only for legacy-code compatibility. Forget about these functions as if they did not exist. They provide no usable feedback in case of error or overflow, and overflow is what apparently happens in your example.
In order to convert strings to numbers, the C standard library provides the strtol function and the other functions from the strto... group. These are the functions you should use to perform the conversion. And don't forget to check the result for overflow: the strto... functions provide this feedback through the return value and the errno variable.
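For example, a minimal sketch of a checked conversion with strtol (input is the buffer from the question):
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

char *end;
errno = 0;
long value = strtol(input, &end, 10);

if (end == input)
    fprintf(stderr, "no digits found\n");
else if (errno == ERANGE)
    fprintf(stderr, "value out of range for long\n");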

Buffer for a 3 digit number

I want to create a buffer for a 3-digit number, status, plus the null character. Currently I have
char buf[4];
sprintf(buf, "%d",status);
My question is whether this is a sufficient size for a 3-digit number, which is returned as the status from waitpid().
Yes, but there's nothing preventing the user (or however status is filled in) from punching in a lot more than three digits and overflowing that little buffer.
The snprintf function should be used instead, it will cut off input that is too long:
snprintf(buf, 4, "%d", status); // the 4 is the length of buf
Yes: 3 characters/digits and the trailing null.
Better be safe than sorry.
If you want it to be safe, make sure it really is a 3-digit number.
For example you can use mod operator:
char buf[4];
sprintf(buf, "%d",status % 1000);
Note that if you choose the snprintf solution and pass 1001, you'll get "100" as the result; if you use this solution you'll get "1".
You can choose whichever is more suitable for you.
You can also combine the both:
snprintf(buf, sizeof(buf), "%d",status % 1000);
It is safe for a strictly three-digit decimal, but waitpid() makes no such guarantee; the type pid_t is a 32-bit signed integer, and three digits seems a somewhat optimistic assumption.
You should range-check the waitpid() return value to guarantee that it will fit in your buffer, but it is generally unnecessary to be quite so sparing with memory; a buffer of 12 characters will allow any 32-bit signed integer to be represented as a decimal string (worst case: 10 digits, a sign character and a nul terminator). Similarly, for 16 bits use 7, and for 64 bits use 23. In general:
decimal_string_buffer_len = ceil(log10(2^(sizeof(type) * CHAR_BIT))) + 3

In C, when would type casting a char to an integer not return a value between 0 and 127?

A common interview question asks to write an algorithm that detects duplicates in a string.
Using a character array of length 128 to keep track of the characters already seen is a good way to solve this problem in linear time.
In C we would type something like
char seen_chars[128];
unsigned char c;
/* set seen_chars to all zeros, assign c */
seen_chars[ c ] = 1;
To mark character c as seen. Of course this relies on
(int) c
returning a value between 0 and 127.
I'm wondering when would this fail? What are the assumptions that make this code work correctly?
The code will fail (and cause undefined behavior) whenever the integer value of the given char c is not between 0 and 127 (inclusive).
C does in no way limit the maximum range of char - you are only guaranteed that it can hold at least 256 distinct values - so in any given C implementation a valid char value can be out of that boundary. On most desktop systems a char can hold values from -128 to 127, or from 0 to 255. However, as an example:
char aFunction(void);

char c = aFunction();
if ((int)c > 1000000000)
    printf("This could be true on some systems\n");
The following would be valid (although it may exhaust your stack on systems with large chars):
#include <limits.h>
_Bool seen[1<<CHAR_BIT] = {0};
seen[(unsigned char)c] = 1;
/* etc. */
On most implementations, an unsigned char has a value going from 0 to 255. Now, ASCII defines values from 0 to 127, but if your string contains characters from an "extended ASCII" character set (Latin-1, for instance), then you might get character values above 127.
So, if your text is plain ASCII, you're safe. Otherwise, you may overflow your buffer.
It works with all 7-bit ASCII characters. Other characters, like German umlauts, would be translated to a negative number, with the appropriate consequences for your algorithm.
You're on the safe side if you use unsigned char and make the array 256 entries wide.
I'm wondering when would this fail?
Whenever you have any characters above 127.
Whenever you have a multi-byte encoding like UTF-8.
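To make the interview answer robust against both failure modes, index by unsigned char and size the table from limits.h, so bytes above 127 (extended ASCII, UTF-8 continuation bytes) stay inside the table. A minimal sketch (has_duplicate is a made-up name):
#include <limits.h>

int has_duplicate(const char *s)
{
    unsigned char seen[UCHAR_MAX + 1] = {0};
    for (; *s; s++) {
        unsigned char c = (unsigned char)*s;
        if (seen[c])
            return 1;
        seen[c] = 1;
    }
    return 0;
}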

How do I prevent buffer overflow converting a double to char?

I'm converting a double to a char string:
char txt[10];
double num;
num = 45.344322345;
sprintf(txt, "%.1f", num);
and using ".1f" to truncate the decimal places, to the tenths digit.
i.e. - txt contains 45.3
I usually use precision in sprintf to ensure the char buffer is not overflowed.
How can I do that here also truncating the decimal, without using snprintf?
(i.e. if num = 345694876345.3 for some reason)
Thanks
EDIT: If num is too big for the buffer, the result no longer matters; I just do not want to crash. Not sure what would make the most sense in that case.
EDIT2: I should have made it clearer than in just the tag that this is a C program.
I am having issues using snprintf in a C program. I don't want to add any 3rd-party libraries.
Use snprintf(), which will tell you how many bytes would have been needed. In general, you should size your array to be large enough to handle the longest string representation of the target type. If that's not known in advance, use malloc() (or asprintf(), which is non-standard, but present on many platforms).
Edit
snprintf() will fail gracefully if the format exceeds the given buffer, it won't overflow. If you don't need to handle that, then simply using it will solve your problem. I can't think of an instance where you would not want to handle that, but then again, I'm not working on whatever you are working on :)
Why not just make your buffer big enough to hold the largest possible string representation of a double?
Assuming a 64-bit double using the IEEE standard for floating point arithmetic, which uses 52 bits for the mantissa: 2^52 = 4,503,599,627,370,496. So we need 16 characters to hold all the digits before and after the decimal point; 19 counting the decimal point, sign character and null terminator.
I would just use a buffer size of at least 20 characters and move on.
If you need to print a double using scientific notation, you will need to add enough space for the exponent. Assuming an 11-bit signed exponent, that's another 4 characters for the exponent, plus a sign for the exponent and the letter 'E'. I would just go with 30 characters in that case.
If you absolutely must do it on your own, count the digits in the number before trying to convert:
int whole = num;
int wholeDigits = 0;
do {
    ++wholeDigits;
} while (whole /= 10);

double fraction = num - (int) num;
int decimalDigits = 0;
while (fraction > 0) {
    ++decimalDigits;
    fraction *= 10;
    fraction = fraction - (int) fraction;
}
int totalLength = decimalDigits ? wholeDigits + decimalDigits + 1 : wholeDigits;
You should probably verify that this ad-hoc code works as advertised before relying on it to guard against crashes. I recommend that you use snprintf or something similar instead of my code, as others have said.
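For comparison, a minimal C99 sketch of the snprintf approach for this case, letting snprintf measure the exact length first (variable names are made up):
#include <stdio.h>
#include <stdlib.h>

double num = 345694876345.3;
int need = snprintf(NULL, 0, "%.1f", num); /* length without writing anything */
if (need > 0) {
    char *txt = malloc((size_t)need + 1);
    if (txt != NULL) {
        snprintf(txt, (size_t)need + 1, "%.1f", num);
        /* use txt, then free(txt) */
    }
}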
Why do you want to do it without snprintf? You should be using snprintf regardless of whether your format string contains a double, another string or anything else, really. As far as I can see, there's no reason not to.

Allocating memory for a char array to concatenate a known piece of text and an integer

I want to concatenate a piece of text, for example "The answer is ", with a signed integer, to give the output "The answer is 42".
I know how long the piece of text is (14 characters) but I don't know how many characters the string representation of the number will be.
I assume the worst-case scenario: the largest signed 16-bit integer has 5 digits, plus one extra character in case it is negative. So is the following code the correct way to do it?
#include <stdio.h>
#include <stdlib.h>
int main()
{
    char *message;
    message = malloc(14*sizeof(char)+(sizeof(int)*5)+1);
    sprintf(message, "The answer is %d", 42);
    puts(message);
    free(message);
}
Use:
malloc(14*sizeof(char)    /* for the 14-char text */
       + (sizeof(char)*5) /* for the magnitude of the max number */
       + 1                /* for the sign of the number */
       + 1                /* for the NUL char */
      );
Since the digits will be represented as chars, you have to use sizeof(char) instead of sizeof(int).
Not quite: you only need a number of characters, so sizeof(int) is not required.
However, for easily maintainable and portable code, you should have something like:
#define TEXT "The answer is "
#undef CHARS_PER_INT
#if INT_MAX == 32767
#define CHARS_PER_INT 6
#endif
#if INT_MAX == 2147483647
#define CHARS_PER_INT 11
#endif
#ifndef CHARS_PER_INT
#error Suspect system, I have no idea how many chars to allocate for an int.
#endif
int main (void) {
    char *message;
    message = malloc(sizeof(TEXT)+CHARS_PER_INT+1);
    sprintf(message, TEXT "%d", 42);
    puts(message);
    free(message);
    return 0;
}
}
This has a number of advantages:
If you change the string, you change one thing and one thing only. The argument to malloc adjusts automatically.
The expression sizeof(TEXT)+CHARS_PER_INT+1 is calculated at compile time. A solution involving strlen would have a runtime cost.
If you try to compile your code on a system where integers may cause overflow, you'll be told about it (go fix the code).
You should actually allocate an extra character for the number since the biggest 16-bit number (in terms of character count) is -32768 (six characters long). You'll notice I still have a +1 on the end - that's because you need space for the string null terminator.
One way of doing it (not necessarily recommended) that gives you the exact size of the number in characters is using the stdio functions themselves.
For example, if you print the number (somewhere, for whatever reason) before you allocate your memory, you can use the %n format identifier with printf. %n doesn't print anything; rather, you supply it with a pointer to int, and printf fills that with how many characters have been written so far.
Another example is snprintf, if you have it available. You pass it the maximum number of characters you want it to write to your string, and it returns the number of characters it should have written, not counting the final nul. (Or -1 on error.) So, using a 1-byte dummy string, snprintf can tell you exactly how many characters your number is.
A big advantage to using these functions is that if you decide to change the format of your number (leading 0's, padding spaces, octal output, long longs, whatever) you will not overrun your memory.
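For illustration, a hedged sketch of both measuring tricks described above (the concrete numbers assume the "The answer is 42" example):
#include <stdio.h>

int chars_printed;
printf("The answer is %d%n\n", 42, &chars_printed);
/* chars_printed is now 16, the length of "The answer is 42" */

char dummy[1];
int needed = snprintf(dummy, sizeof dummy, "%d", 42); /* needed == 2 */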
If you have GNU extensions to stdio, you may want to consider using asprintf. This is exactly like sprintf, except it does the memory allocation for you! No assembly required. (Although you do need to free it yourself.) But you shouldn't rely on it to be portable.
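A minimal usage sketch, assuming glibc (where asprintf requires _GNU_SOURCE):
#define _GNU_SOURCE /* asprintf is a GNU extension */
#include <stdio.h>
#include <stdlib.h>

char *message = NULL;
if (asprintf(&message, "The answer is %d", 42) != -1) {
    puts(message);
    free(message); /* you still free it yourself */
}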
malloc((14 + 6 + 1) * sizeof(char));
14 chars for the string
6 for the digits + sign
1 for the '\0'
Note: sizeof(int) gives you the size of the type in bytes. sizeof(int) == 4 if int is 32 bits, 8 if it's 64 bits.
I think that the correct formula to get the maximum length of the decimal representation of an integer would be (floor(log10(INT_MAX))+1); you could also abuse the preprocessor this way:
#include <limits.h>
#define TOSTRING_(x) #x
#define TOSTRING(x) TOSTRING_(x)
/* ... */
#define YOUR_MESSAGE "The answer is "
char message[] = YOUR_MESSAGE "+" TOSTRING(INT_MAX);
sprintf(message + sizeof(YOUR_MESSAGE) - 1, "%d", 42); /* -1: sizeof counts the '\0' */
This also avoids the heap allocation. You may want to use snprintf for better security, although with this method it shouldn't be necessary.
Another trick like that would be to create a function like this:
size_t GetIntMaxLength(void)
{
    const char dummy[] = TOSTRING(INT_MAX);
    return sizeof(dummy) + 1; /* +1 leaves room for a minus sign */
}
If the compiler is smart enough, it could completely sweep away the dummy var from the compiled code; otherwise it may be wise to declare that var as static to avoid reinitializing it every time the function is called.
A safe approximation for signed int is (number of digits including the potential - sign):
(CHAR_BIT * sizeof(int) + 1) / 3 + 1
The equivalent for unsigned is:
(CHAR_BIT * sizeof(unsigned) + 2) / 3
This calculates the number of digits - add one to both of them to account for the terminator, if allocating space for a null-terminated string.
This will slightly overestimate the space required for very long types (and will also overestimate in the unusual case where int has padding bits), but is a good approximation and has the advantage that it is a compile-time constant. CHAR_BIT is provided by <limits.h>.
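For instance, a minimal sketch using the signed formula as a compile-time buffer size (INT_DEC_CHARS is a made-up name):
#include <limits.h>
#include <stdio.h>

#define INT_DEC_CHARS ((CHAR_BIT * sizeof(int) + 1) / 3 + 1)

char buf[INT_DEC_CHARS + 1]; /* +1 for the terminator, as noted above */
sprintf(buf, "%d", -12345);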
