Buffer size for converting unsigned long to string in C

In reference to the question and the answer
here: can I use this method so that the solution will be platform independent?
char *buff = (char*) malloc(sizeof(unsigned long)*8);
sprintf(buff, "%lu", unsigned_long_variable);
Here I am deriving the buffer length from the size of the unsigned long variable. Is this approach correct?

Don't even try to calculate the buffer size.
Start with snprintf, which will tell you safely how many characters are needed. Then you know how many bytes to allocate to print safely.
Since this is a few lines of code that you don't want to repeat again and again, write a function malloc_printf that does exactly what you want: in that function, call snprintf with a NULL destination to get the required length, then malloc the buffer, sprintf into it, and return it. To make it faster and usually avoid two formatting calls (snprintf then sprintf), write into a local buffer of 256 chars first, which is often enough.
So your final code would be
char* buff = malloc_printf ("%lu", unsigned_long_variable);
Such a function also gives you quick, safe and easy string concatenation, using the format %s%s for example.
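For illustration, here is a minimal sketch of such a malloc_printf function (the exact signature and the 256-char fast path are assumptions on my part, not code from the original answer):
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: format into a freshly malloc'd buffer.
   The caller must free() the result; returns NULL on error. */
char *malloc_printf(const char *fmt, ...)
{
    char local[256];                  /* fast path, often large enough */
    va_list ap;

    va_start(ap, fmt);
    int len = vsnprintf(local, sizeof local, fmt, ap);
    va_end(ap);
    if (len < 0)
        return NULL;

    char *buff = malloc((size_t)len + 1);
    if (buff == NULL)
        return NULL;

    if ((size_t)len < sizeof local) {
        memcpy(buff, local, (size_t)len + 1);   /* result already in local */
    } else {
        va_start(ap, fmt);                      /* too long: format again, now with room */
        vsnprintf(buff, (size_t)len + 1, fmt, ap);
        va_end(ap);
    }
    return buff;
}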

You want to know how many characters are needed to represent the largest possible unsigned long. Correct?
To that end, you are trying to calculate the largest possible unsigned long:
sizeof(unsigned long)*8
That is faulty in several ways. For one, sizeof returns multiples of char, which need not be 8 bits. You should multiply by CHAR_BIT (from <limits.h>) instead. But even that is not necessary, because that very same header already provides the largest possible value -- ULONG_MAX.
Then you're making another mistake: your calculation gives the size of the integer representation of unsigned long in bits. What you want is the size of the string representation in characters. This can be achieved with the log10() function (from <math.h>):
log10( ULONG_MAX )
This will give you a double value that indicates the number of (decimal) digits in ULONG_MAX. It will be a fraction, which you need to round up (1) (ceil() does this for you).
Thus:
#include <math.h>
#include <stdlib.h>
#include <limits.h>
int main()
{
char * buff = malloc( ceil( log10( ULONG_MAX ) ) + 1 );
//...
}
All in all, this is quite dodgy (I made two mistakes while writing this out, shame on me -- if you make mistakes when using this, shame on you). And it requires the use of the math library for something that snprintf( NULL, ... ) can do for you more easily, as indicated by the Q&A you linked to.
(1): log10( 9999 ) gives 3.9999565... for the four-digit number.

The C standard doesn't put an upper limit to the number of bits per char.
If someone constructs a C compiler that uses, for example, 2000 bits per char, the output can overflow the buffer.
Instead of 8 you should use CHAR_BIT from limits.h.
Also, note that you need (slightly less than) 1 char per 3 bits, and you need 1 byte for the string terminator.
So, something like this:
#include <limits.h>
char *buff = malloc(1 + (sizeof(unsigned long) * CHAR_BIT + 2) / 3);
sprintf(buff, "%lu", unsigned_long_variable);

No, this is not the right way to calculate the buffer size.
E.g. for 4 byte unsigned longs you have values up to 2^32-1
which means 10 decimal digits. So your buffer needs 11 chars.
You are allocating 4 * 8 = 32.
The correct formula is
ceil(log10(2^(sizeof(unsigned long) * CHAR_BIT) - 1)) + 1
(log10 denotes the decimal logarithm here)
A good (safe) estimation is:
(sizeof(unsigned long) * CHAR_BIT + 2) / 3 + 1
because log10(2) is less than 0.33.
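As a quick sanity check, here is a minimal sketch (the macro name ULONG_BUF_LEN is just an illustrative assumption) that sizes a stack buffer with this safe estimate and formats the worst case:
#include <limits.h>
#include <stdio.h>

/* Safe over-estimate from above: about one digit per 3 bits, plus '\0'. */
#define ULONG_BUF_LEN ((sizeof(unsigned long) * CHAR_BIT + 2) / 3 + 1)

int main(void)
{
    char buf[ULONG_BUF_LEN];
    int len = snprintf(buf, sizeof buf, "%lu", ULONG_MAX);  /* worst case */
    printf("\"%s\" uses %d of %zu chars\n", buf, len, sizeof buf);
    return 0;
}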

Short answer:
#define INTEGER_STRING_SIZE(t) (sizeof (t) * CHAR_BIT / 3 + 3)
unsigned long x;
char buf[INTEGER_STRING_SIZE(x)];
int len = snprintf(buf, sizeof buf, "%lu", x);
if (len < 0 || len >= sizeof buf) Handle_UnexpectedOutput();
OP's use of sizeof(unsigned long)*8 is weak. On systems where CHAR_BIT (the number of bits per char) is large (it must be at least 8), sizeof(unsigned long) could be 1. 1*8 chars is certainly too small to hold 4294967295 (the minimum value of ULONG_MAX).
Concerning sprintf()/snprintf(): given locale issues, in theory code may print additional characters, like 4,294,967,295, and so exceed the anticipated buffer. Unless very tight memory constraints exist, I recommend a buffer 2x the anticipated size.
char buf[ULONG_STRING_SIZE * 2]; // 2x
int len = snprintf(buf, sizeof buf, "%lu", x);
The expected maximum string width of printing some unsigned integer is ceil(log10(unsigned_MAX)) + 1. In the case of unsigned long, the value of ULONG_MAX certainly does not exceed pow(2, sizeof (unsigned long) * CHAR_BIT) - 1, so code could use:
#define LOG10_2 0.30102999566398119521373889472449
#define ULONG_STRING_SIZE (sizeof (unsigned long) * CHAR_BIT * LOG10_2 + 2)
// For greater portability, should use integer math.
#define ULONG_STRING_SIZE (sizeof (unsigned long) * CHAR_BIT / 3 + 2)
// or more precisely
#define ULONG_STRING_SIZE (sizeof (unsigned long) * CHAR_BIT * 28/93 + 2)
The short answer used +3 in case a signed integer type was specified.

Related

Get size of char buffer for sprintf handling longs C

I am quite new to C and have a question about dealing with longs and char* in C. I want to store a long in a char*, but I am not sure how I should manage the size of my buffer to fit any long given.
That's what I want:
char buffer[LONG_SIZE]; // what should LONG_SIZE be to fit any long, not depending on the OS?
sprintf(buffer, "%ld", some_long);
I need to use C, not C++. Is there any solution to this if I don't want to use magic-numbers?
if I don't want to use magic-numbers
Using snprintf() with a 0-length buffer will return the number of chars needed to hold the result (Minus the trailing 0). You can then allocate enough space to hold the string on demand:
#include <stdio.h>
#include <stdlib.h>
#include <limits.h>
int main(void) {
long some_long = LONG_MAX - 5;
// Real code should include error checking and handling.
int len = snprintf(NULL, 0, "%ld", some_long);
char *buffer = malloc(len + 1);
snprintf(buffer, len + 1, "%ld", some_long);
printf("%s takes %d chars\n", buffer, len);
free(buffer);
}
There's also asprintf(), available in Linux glibc and some BSDs, that allocates the result string for you, with a more convenient (But less portable) interface than the above.
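For completeness, here is a minimal sketch of asprintf() in use (a GNU/BSD extension, not ISO C; on glibc it needs _GNU_SOURCE defined before the includes):
#define _GNU_SOURCE         /* expose asprintf() on glibc */
#include <stdio.h>
#include <stdlib.h>
#include <limits.h>

int main(void) {
    char *buffer = NULL;
    // asprintf allocates the buffer itself and returns the length, or -1 on error.
    if (asprintf(&buffer, "%ld", LONG_MAX) < 0)
        return 1;
    printf("%s\n", buffer);
    free(buffer);
    return 0;
}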
Allocating the needed space on demand instead of using a fixed size has some benefits; it'll continue to work without further adjustment if you change the format string at some point in the future, for example.
Even if you stick with a fixed length buffer, I recommend using snprintf() over sprintf() to ensure you won't somehow overwrite the buffer.
It is probably more correct to use snprintf to compute the necessary size, but it seems like this should work:
char buf[ sizeof(long) * CHAR_BIT ];
The number of bits in a long is sizeof(long) * CHAR_BIT. (CHAR_BIT is defined in <limits.h>.) A signed long can represent a magnitude of at most 2^(sizeof(long) * CHAR_BIT - 1).
Such a number can have at most floor(log10(2^(sizeof(long) * CHAR_BIT - 1))) + 1 decimal digits, which is floor((sizeof(long) * CHAR_BIT - 1) * log10(2)) + 1. log10(2) is less than .302, so (sizeof(long) * CHAR_BIT - 1) * 302 / 1000 + 1 bytes is enough for the digits.
Add one for a sign and one for a terminating null character, and char[(sizeof(long) * CHAR_BIT - 1) * 302 / 1000 + 3] suffices for the buffer.
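To illustrate, a minimal sketch of my own that uses the bound just derived; the worst case LONG_MIN exercises both the sign and the maximum digit count:
#include <limits.h>
#include <stdio.h>

int main(void)
{
    // Digits bound derived above, plus sign and terminating '\0'.
    char buf[(sizeof(long) * CHAR_BIT - 1) * 302 / 1000 + 3];
    snprintf(buf, sizeof buf, "%ld", LONG_MIN);   /* worst case: sign + max digits */
    printf("%s fits in %zu chars\n", buf, sizeof buf);
    return 0;
}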

C Programming - Size of 2U and 1024U

I know that the U literal in C means that the value is an unsigned integer. An unsigned integer's size is 4 bytes.
But how big are 2U or 1024U? Does this simply mean 2 * 4 bytes = 8 bytes, for example, or does this notation mean that 2 (or 1024) are unsigned integers?
My goal would be to figure out how much memory will be allocated if I call malloc like this
int *allocated_mem = malloc(2U * 1024U);
and verify my answer in a short program. What I tried:
printf("Size of 2U: %ld\n", sizeof(2U));
printf("Size of 1024U: %ld\n", sizeof(1024U));
I would have expected for the first line a size of 2 * 4 bytes = 8 and for the second 1024 * 4 bytes = 4096, but the output is always "4".
I would really appreciate an explanation of what 2U and 1024U mean exactly, and how I can check their size in C.
My goal would be to figure out how much memory will be allocated if I call malloc like this int *allocated_mem = malloc(2U * 1024U);
What is difficult about 2 * 1024 == 2048? The fact that they are unsigned literals does not change their value.
An unsigned integer's size is 4 bytes.
You are correct. So 2U takes up 4-bytes, and 1024U takes up 4-bytes, because they are both unsigned integers.
I would have expected for the first line a size of 2 * 4 bytes = 8 and for the second 1024 * 4 bytes = 4096, but the output is always "4".
Why would the value change the size? The size depends only on the type. 2U is of type unsigned int, so it takes up 4-bytes; same as 50U, same as 1024U. They all take 4-bytes.
You are trying to multiply the value (2) times the size of the type. That makes no sense.
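To see this concretely, here is a small sketch (purely illustrative) showing that sizeof depends only on the type, while malloc receives the product of the values:
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    // Both constants have type unsigned int, so all three sizes are equal.
    printf("sizeof(2U) = %zu, sizeof(1024U) = %zu, sizeof(unsigned int) = %zu\n",
           sizeof(2U), sizeof(1024U), sizeof(unsigned int));

    // malloc receives the *value* 2U * 1024U, i.e. 2048 bytes.
    int *allocated_mem = malloc(2U * 1024U);
    printf("requested %u bytes (room for %zu ints)\n",
           2U * 1024U, 2U * 1024U / sizeof(int));
    free(allocated_mem);
    return 0;
}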
How big?
2U and 1024U are the same size, the size of an unsigned, commonly 32-bits or 4 "bytes". The size of a type is the same throughout a given platform - it does not change because of value.
"I know that the U literal means in c, that the value is a unsigned integer." --> OK, close enough so far.
"An unsigned integers size is 4 bytes.". Reasonable guess yet C requires that unsigned are at least 16-bits. Further, the U makes the constant unsigned, yet that could be unsigned, unsigned long, unsigned long long, depending on the value and platform.
Detail: in C, 2U is not a literal, but a constant. C has string literals and compound literals. The literals can have their address taken, but &2U is not valid C. Other languages call 2U a literal, and have their rules on how it can be used.
My goal would be to figure out how much memory will be allocated if I call malloc like this int *allocated_mem = malloc(2U * 1024U);
Instead, better to use size_t for sizing than unsigned and check the allocation.
size_t sz = 2U * 1024U;
int *allocated_mem = malloc(sz);
if (allocated_mem == NULL) sz = 0;
printf("Allocation size %zu\n", sz);
(Aside) Be careful with computed sizes. Do your size math using size_t types. 4U * 1024U * 1024U * 1024U could overflow unsigned math, yet may compute as desired with size_t.
size_t sz = (size_t)4 * 1024 * 1024 * 1024;
The following attempts to print the size of the constants, which is likely 32 bits or 4 "bytes", and not their values.
printf("Size of 2U: %ld\n", sizeof(2U));
printf("Size of 1024U: %ld\n", sizeof(1024U));

How many chars do I need to print a size_t with sprintf? [duplicate]

I want to do something like this:
char sLength[SIZE_T_LEN];
sprintf(sLength, "%zu", strlen(sSomeString));
That is, I want to print in a buffer a value of type size_t. The question is: what should be the value of SIZE_T_LEN? The point is to know the minimum size required in order to be sure that an overflow will never occur.
sizeof(size_t) * CHAR_BIT (from limits.h) gives the number of bits. For decimal output, use a (rough) lower estimate of 3 (4 for hex) bits per digit to be on the safe side. Don't forget to add 1 for the nul terminator of a string.
So:
#include <limits.h>
#define SIZE_T_LEN ( (sizeof(size_t) * CHAR_BIT + 2) / 3 + 1 )
This yields a value of type size_t itself. On typical 8/16/32/64 bit platforms + 2 is not required (the minimum possible size of size_t is 16 bits). Here the error is already large enough to yield a correctly truncated result.
Note that this gives an upper bound and is fully portable due to the use of CHAR_BIT. To get an exact value, you have to use log10(SIZE_MAX) (see JohnBollinger's answer for this). But that yields a float and might be calculated at run-time, while the version above is evaluated at compile time (and computing the exact value likely costs more than the few extra bytes of the rough estimate). Unless you have a very RAM-constrained system, that is ok. And on such a system, you should refrain from using stdio anyway.
To be absolutely on the safe side, you might want to use snprintf (but that is not necessary).
The exact answer, accounting for space for a string terminator, would be
floor(log10(SIZE_MAX)) + 2
where the SIZE_MAX macro is declared in stdint.h. Unfortunately, that's not a compile-time constant (on account of the use of log10()). If you need a compile-time constant that is computed at compile time then you could use this:
sizeof(size_t) * CHAR_BIT * 3 / 10 + 2
That gives the correct answer for size_t up to at least 256 bits. It's based on the fact that 2^10 is pretty close to 1000. At some large number of bits it will be too small by one, and for much larger numbers of bits it will fall further behind. If you're worried about such a large size_t then you could add one or even two to the result.
If you can use snprintf, use:
int len = snprintf (NULL, 0, "%zu", strlen (sSomeString));
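Combining the two, here is a minimal sketch (the macro name SIZE_T_BUF is an assumption of mine) that sizes the buffer at compile time and lets snprintf report the actual length:
#include <limits.h>
#include <stdint.h>
#include <stdio.h>

// Compile-time upper bound from above: digits + terminating '\0'.
#define SIZE_T_BUF (sizeof(size_t) * CHAR_BIT * 3 / 10 + 2)

int main(void)
{
    char sLength[SIZE_T_BUF];
    int len = snprintf(sLength, sizeof sLength, "%zu", SIZE_MAX); /* worst case */
    printf("%s needs %d chars, buffer has %zu\n", sLength, len, sizeof sLength);
    return 0;
}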
On systems where a char is represented using 8 bits,
If sizeof(size_t) is 2, then the maximum value is: 65535
If sizeof(size_t) is 4, then the maximum value is: 4294967295
If sizeof(size_t) is 8, then the maximum value is: 18446744073709551615
If sizeof(size_t) is 16, then the maximum value is: about 3.4028237e+38
You can use that information to determine the maximum string size at compile time.
Sample program:
#include <stdio.h>
#ifdef MAX_STRING_SIZE
#undef MAX_STRING_SIZE
#endif
// MAX_STRING_SIZE is 6 when sizeof(size_t) is 2
// MAX_STRING_SIZE is 11 when sizeof(size_t) is 4
// MAX_STRING_SIZE is 21 when sizeof(size_t) is 8
// MAX_STRING_SIZE is 40 when sizeof(size_t) is 16
// MAX_STRING_SIZE is -1 for all else. It will be an error to use it
// as the size of an array.
#define MAX_STRING_SIZE (sizeof(size_t) == 2 ? 6 : sizeof(size_t) == 4 ? 11 : sizeof(size_t) == 8 ? 21 : sizeof(size_t) == 16 ? 40 : -1)
int main()
{
char str[MAX_STRING_SIZE];
size_t a = 0xFFFFFFFF;
sprintf(str, "%zu", a);
printf("%s\n", str);
a = 0xFFFFFFFFFFFFFFFF;
sprintf(str, "%zu", a);
printf("%s\n", str);
}
Output:
4294967295
18446744073709551615
It will be easy to adapt it to systems where a char is represented using 16 bits.

What's the easiest way to convert a long in C to a char*?

What is the clean way to do that in C?
wchar_t* ltostr(long value) {
int size = string_size_of_long(value);
wchar_t *wchar_copy = malloc((size + 1) * sizeof(wchar_t));
swprintf(wchar_copy, size + 1, L"%li", value);
return wchar_copy;
}
The solutions I have come up with so far are all rather ugly; in particular, string_size_of_long uses double-precision floating-point math.
A long won't have more than 64 digits on any platform (actually less than that, but I'm too lazy to figure out what the actual minimum is now). So just print to a fixed-size buffer, then use wcsdup rather than trying to calculate the length ahead of time.
wchar_t* ltostr(long value) {
wchar_t buffer[ 64 ] = { 0 };
swprintf(buffer, sizeof(buffer), L"%li", value);
return wcsdup(buffer);
}
If you want a char*, it's trivial to translate the above:
char* ltostr(long value) {
char buffer[ 64 ] = { 0 };
snprintf(buffer, sizeof(buffer), "%li", value);
return strdup(buffer);
}
This will be faster and less error-prone than calling snprintf twice, at the cost of a trivial amount of stack space.
int charsRequired = snprintf(NULL, 0, "%ld", value) + 1;
char *long_str_buffer = malloc(charsRequired);
snprintf(long_str_buffer, charsRequired, "%ld", value);
The maximum number of digits is given by ceil(log10(LONG_MAX)). You can precompute this value for the most common ranges of long using the preprocessor:
#include <limits.h>
#if LONG_MAX < 1u << 31
#define LONG_MAX_DIGITS 10
#elif LONG_MAX < 1u << 63
#define LONG_MAX_DIGITS 19
#elif LONG_MAX < 1u << 127
#define LONG_MAX_DIGITS 39
#else
#error "unsupported LONG_MAX"
#endif
Now, you can use
wchar_t buffer[LONG_MAX_DIGITS + 2];
int len = swprintf(buffer, sizeof buffer / sizeof *buffer, L"%li", -42l);
to get a stack-allocated wide-character string. For a heap-allocated string, use wcsdup() if available or a combination of malloc() and memcpy() otherwise.
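If wcsdup() is not available, a minimal fallback sketch along these lines (the helper name wdup is my own) does the same job with malloc() and memcpy():
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical fallback for wcsdup(): copy a wide string, terminator included,
   into a freshly malloc'd buffer. Caller must free() the result. */
static wchar_t *wdup(const wchar_t *src)
{
    size_t n = wcslen(src) + 1;             /* element count including L'\0' */
    wchar_t *copy = malloc(n * sizeof *copy);
    if (copy != NULL)
        memcpy(copy, src, n * sizeof *copy);
    return copy;
}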
Many people would recommend you avoid this approach, because it's not apparent that the user of your function will have to call free at some point. Usual approach is to write into a supplied buffer.
Since you receive a long, you know its range will be –2,147,483,648 to 2,147,483,647 (on platforms where long is 32 bits), and since swprintf() uses the "C" locale by default (you control that part), you only need 11 characters plus the terminator. This saves you from string_size_of_long().
You could either (for locale C):
wchar_t* ltostr(long value) {
wchar_t *wchar_copy = malloc(12 * sizeof(wchar_t));
swprintf(wchar_copy, 12, L"%li", value);
return wchar_copy;
}
Or more general but less portable, you could use _scwprintf to get the length of the string required (but then it's similar to your original solution).
PS: I'd simplify the memory allocation and freeing more than this "tool-box" function.
You can use the preprocessor to calculate an upper bound on the number of chars required to hold the text form of an integer type. The following works for signed and unsigned types (eg MAX_SIZE(int)) and leaves room for the terminating \0 and possible minus sign.
#define MAX_SIZE(type) ((CHAR_BIT * sizeof(type)) / 3 + 2)
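A short usage sketch of that macro (my own illustration; the worst case LONG_MIN exercises both the sign and the digit bound):
#include <limits.h>
#include <stdio.h>

// Macro from the answer above: digit bound plus sign and terminating '\0'.
#define MAX_SIZE(type) ((CHAR_BIT * sizeof(type)) / 3 + 2)

int main(void)
{
    char buf[MAX_SIZE(long)];
    snprintf(buf, sizeof buf, "%ld", LONG_MIN);   /* worst case for long */
    printf("%s (buffer of %zu chars)\n", buf, sizeof buf);
    return 0;
}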

Convert integer into an array

I am working on a C program, and I am coming across a small problem. I don't know how to convert an integer (say 2007) into a char array. Is there a function in the C libraries to do that for me?
To clarify, I'd like to take 2007 and store it in a char array: char array[5] = { '2', '0', '0', '7', '\0' };
I was thinking something like sprintf, but I'm not sure. Anyways, any help/hints would be appreciated.
Thanks,
Michael
You can do that with sprintf, or more safely snprintf.
Here's a contrived example:
#include <stdio.h>
#define MAX_LEN 5
char str[MAX_LEN];
snprintf(str, MAX_LEN, "%d", 2007);
Use snprintf() and be sure to allocate the proper amount of space to hold numbers up to 2^(sizeof(int)*CHAR_BIT). Allowing for a 64-bit int, that is 20 digit characters, plus 1 for the null terminator.
#include <stdio.h>
#define MAX_DIGITS 20
int n = 2007;
char str[MAX_DIGITS+1];
snprintf(str, MAX_DIGITS+1, "%d", n);
As others have said, you should look at sprintf() or snprintf(). Assuming you are trying to convert an integral type T to such an array, the interesting bit is to figure out the size of the buffer for the array.
First, the number of digits in a number n's decimal representation is floor(log10(n)) + 1. The maximum possible value of an unsigned integral type T can be represented in nbits = CHAR_BIT * sizeof(T) binary bits, which will need floor(log10(2^nbits)) + 1 decimal digits.
log10(2^nbits) = nbits * log10(2) = nbits * log(2)/log(10).
28/93 is a very good (1) rational approximation of log(2)/log(10) (0.30107526881720431 vs 0.30102999566398114).
So, using the above, we get our expression for the number of digits:
CHAR_BIT * sizeof(T) * 28 / 93 + 1
For signed numbers, we need to add 1 more for the - sign, and we need to add 1 for the terminating 0. So we get:
#include <limits.h>
/* Figure out the maximum number of characters including the
terminating 0 to represent the numbers in integral type T */
#define SZ(T) (CHAR_BIT * sizeof(T) * 28 / 93 + 3)
So we can do:
char array[SZ(int)];
sprintf(array, "%d", n);
And we are sure that array has enough space. We don't have to worry about snprintf() or malloc()/realloc() and free() combination either.
Here is a complete program using the above:
#include <stdio.h>
#include <limits.h>
/* Figure out the maximum number of characters including the
terminating 0 to represent the numbers in integral type T */
#define SZ(T) (CHAR_BIT * sizeof(T) * 28 / 93 + 3)
#define PRINT(x) do \
{ \
printf("%s: %lu\n", #x, (unsigned long)SZ(x)); \
} while (0)
int main(void)
{
PRINT(int);
PRINT(long);
PRINT(size_t);
return 0;
}
(1): Or good enough for this purpose.
#include <stdio.h>
char array[5];
sprintf(array, "%d", 2007);
...done.
Note that sprintf is not overflow safe, so if you have a number of 5 or more digits you'll have a problem. Note also that the converted number will be followed by a terminating \0.
The classic way is itoa. Or you can use snprintf to get more control.
There is a nonstandard, but well-supported as I understand it, function itoa - the opposite of atoi.
Example:
char *a = malloc(10 * sizeof(char));
itoa(2007, a, 10);
sprintf also works:
char *a = malloc(10 * sizeof(char));
sprintf(a, "%d", 2007);
int n = 2007;
char a[100];
sprintf( a, "%d", n );
Use snprintf:
// value is int;
// buf is char *;
// length is space available in buf
snprintf(buf, length, "%d", value)
snprintf has the advantage of being standard and giving your more flexibility over the formatting as well as being safe.
You could also use itoa but be warned that it is not part of the standard. Most implementations have it though.
Usage:
// value is int;
// buf is char *;
itoa(value, buf, 10);
An interesting question is: how much space do you allocate for buf? We note the following. With sizeof(int) bytes per int and CHAR_BIT bits per byte, the maximum value is approximately 2^(CHAR_BIT * sizeof(int) - 1) (the -1 is for the sign bit). Therefore we need space to hold
floor(log_10(2^(CHAR_BIT * sizeof(int) - 1))) + 1
digits. But don't forget the sign and null terminator! So the maximum length of an integer representable here is
floor(log_10(2^(CHAR_BIT * sizeof(int) - 1))) + 3.
Of course, this could be wasting space if our values are small. To find out how much space a specific value needs:
floor(log_10(abs(value))) + 1 + (value < 0 ? 1 : 0) + 1
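As a quick check of that last formula, a small sketch (purely illustrative, for a nonzero sample value) compares it with what snprintf reports:
#include <math.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int value = -2007;
    // Exact space for this value: digits + optional sign + terminating '\0'.
    int needed = (int)floor(log10(abs(value))) + 1 + (value < 0 ? 1 : 0) + 1;
    // snprintf with a zero-length buffer reports the length without the '\0'.
    int reported = snprintf(NULL, 0, "%d", value) + 1;
    printf("formula: %d, snprintf: %d\n", needed, reported);
    return 0;
}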
