Determining character array size efficiently to use snprintf() - c

We have a small assignment from college that requires us to perform some job X in C.
Part of that problem is to convert an unsigned long number, which is generated in the course of the program and so cannot be predetermined, to a string. Naturally, I made use of snprintf. I initialized an array (str[50]) that was generously sized to avoid any sort of buffer errors.
On submission, however, my professor said that my method of avoiding buffer errors was inefficient.
My question now is, when I create a char array to hold the unsigned long value, what size do I make it? Is there some C macro to help determine the maximum number of characters that an unsigned long can hold?
something maybe like,
char str[MAX_NUMBER_OF_DIGITS_OF_UNSIGNED_LONG_ON_MACHINE];
I've skimmed through limits.h, a few blogs, and this forum, but with no success. Any help would be appreciated!

Go with @BLUEPIXY for brevity.
A deeper answer.
C allows various "locales" such that, in theory, snprintf(..., "%lu",...) could print a longer string than expected. Instead of "1234567", the output could be "1,234,567".
Recommend:
1. Determine the size in bits, n, of the maximum integer.
2. Compute n * log10(2), rounded up, + 1 for the null character, to get the char count.
3. Set up a buffer that is 2x the maximum need.
4. Check the snprintf result.
5. Niche concern: using the double call with snprintf() needs to ensure the "locale" and number do not change between calls. It is not used here, as snprintf() is a functionally expensive call.
#include <limits.h>
#include <stdio.h>
char *ulong_to_buf(char *buf, size_t size, unsigned long x) {
    int n = snprintf(buf, size, "%lu", x);
    if (n < 0 || (size_t)n >= size) return NULL; // error or truncated output
    return buf;
}
// Usage example
// 1/3 --> ~log10(2)
#define ULONG_PRT_LEN (sizeof(unsigned long)*CHAR_BIT/3 + 2)
void foo(unsigned long x) {
    char buf[ULONG_PRT_LEN*2 + 1]; // 2x for unexpected locales
    if (ulong_to_buf(buf, sizeof buf, x)) {
        puts(buf);
    }
}
If code is really concerned, simply write your own:
#include <stdlib.h>
#include <limits.h>
#include <string.h>
#define PRT_ULONG_SIZE (sizeof(unsigned long) * CHAR_BIT * 10 / 33 + 3)
char *ulong_strnull(unsigned long x, char *dest, size_t dest_size) {
char buf[PRT_ULONG_SIZE];
char *p = &buf[sizeof buf - 1];
// Form string
*p = '\0';
do {
*--p = x % 10 + '0';
x /= 10;
} while (x);
size_t src_size = &buf[sizeof buf] - p;
if (src_size > dest_size) {
// Not enough room
return NULL;
}
return memcpy(dest, p, src_size); // Copy string
}

#if ULONG_MAX == 4294967295UL
# define SIZE (10 + 1)
#elif ULONG_MAX <= 18446744073709551615ULL
# define SIZE (20 + 1)
#endif

From the documentation for snprintf:
Concerning the return value of snprintf(), SUSv2 and C99 contradict each other: when snprintf() is called with size=0 then SUSv2 stipulates an unspecified return value less than 1, while C99 allows str to be NULL in this case, and gives the return value (as always) as the number of characters that would have been written in case the output string has been large enough.
If you are using C99 you can determine the size using snprintf (as BLUEPIXY commented); add 1 to the returned character count for the terminating \0 character:
int size = snprintf(NULL, 0, "%lu", ULONG_MAX) + 1;
However, if you can't use C99, you can determine the string size by computing how many digits you require and adding an additional character for the terminating \0 character:
int size = (int) log10((double) ULONG_MAX) + 2; // digits + 1 for '\0'; needs <math.h>
In order to allocate your array with size bytes you can simply use
char str[size];
However, this only works if your compiler/version supports VLAs; if your compiler doesn't support this, you can dynamically allocate the array with
char *str = malloc(size); //< Allocate the memory dynamically
// TODO: Use the str as you would the character array
free(str); //< Free the array when you are finished
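Putting the two steps together, a minimal sketch of the measure-then-allocate approach might look like this. It assumes a C99 snprintf (so the NULL/0 length query is well defined), and the helper name ulong_to_new_string is made up for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: returns a malloc'd decimal string for x,
   or NULL on failure. The caller must free() the result. */
char *ulong_to_new_string(unsigned long x) {
    /* First pass: with a zero size, C99 snprintf only reports the length. */
    int len = snprintf(NULL, 0, "%lu", x);
    if (len < 0) return NULL;

    size_t size = (size_t)len + 1;  /* +1 for the terminating '\0' */
    char *str = malloc(size);
    if (str == NULL) return NULL;

    /* Second pass: actually write the digits. */
    int written = snprintf(str, size, "%lu", x);
    assert(written == len);  /* same format and value, so same length */
    (void)written;
    return str;
}
```

A caller would use the returned string like any other char array and free() it when finished.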

Related

Function is returning a different value every time?

I'm trying to convert a hexadecimal INT to a char so I could convert it into a binary to count the number of ones in it. Here's my function to convert it into char:
#include <stdio.h>
#include <stdlib.h>
#define shift(a) a=a<<5
#define parity_even(a) a = a+0x11
#define add_msb(a) a = a + 8000
void count_ones(int hex){
char *s = malloc(2);
sprintf(s, "0x%x", hex);
free(s);
printf("%x", s);
};
int main() {
int a = 0x01B9;
shift(a);
parity_even(a);
count_ones(a);
return 0;
}
Every time I run this, I always get different outputs, but the last three hex digits are always the same. Example of outputs:
8c0ba2a0
fc3b92a0
4500a2a0
d27e82a0
c15d62a0
What exactly is happening here? I allocated 2 bytes for the char since my hex int is 2 bytes.
It's too long to write a comment so here goes:
I'm trying to convert a hexadecimal INT
An int is stored as a group of value bits, padding bits (possibly none) and a sign bit, so there is no such thing as a hexadecimal INT, but you can represent (print) a given number in hexadecimal format.
convert a ... INT to a char
That would be a lossy conversion, as an int might have 4 bytes of data that you are trying to cram into 1 byte. char specifically may be signed or unsigned. You probably mean a string (generic term) or char [] (the standard way to represent a string in C).
binary to count the number of ones
That's the real issue you are trying to solve and this is a duplicate of:
How to count the number of set bits in a 32-bit integer?
count number of ones in a given integer using only << >> + | & ^ ~ ! =
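For reference, counting the set bits needs no string conversion at all. The sketch below uses the well-known "clear the lowest set bit" loop; it is one of several techniques discussed in those linked questions, and the function name count_set_bits is just an illustrative choice.

```c
/* Counts the 1 bits in x. Each pass clears the lowest set bit,
   so the loop runs once per set bit. */
unsigned count_set_bits(unsigned x) {
    unsigned count = 0;
    while (x != 0) {
        x &= x - 1;  /* x - 1 flips the lowest set bit and everything below it */
        ++count;
    }
    return count;
}
```

For the value in the question, 0x3731 is 0011 0111 0011 0001 in binary, so the function returns 8.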
To address the question you ask:
Need to allocate more than 2 bytes. Specifically floor(log16(hex)) + 1 digits (for hex > 0) + 2 (for 0x) + 1 (for trailing '\0').
One way to get the size is to just ask snprintf(s, 0, ...), then allocate a suitable array via malloc (see first implementation below) or use a stack-allocated variable length array (VLA).
You can use INT_MAX instead of hex to get an upper bound. log16(INT_MAX) <= CHAR_BIT * sizeof(int) / 4, and the latter is a compile-time constant. This means you can allocate your string on the stack (see 2nd implementation below).
It's undefined behavior to use a variable after it's deallocated. Move free() to after the last use.
Here is one of the dynamic versions mentioned above:
#include <stdio.h>
#include <stdlib.h>
void count_ones(unsigned hex) {
    char *s = NULL;
    size_t n = snprintf(s, 0, "0x%x", hex) + 1; // length query, + 1 for '\0'
    s = malloc(n);
    if(!s) return; // memory could not be allocated
    snprintf(s, n, "0x%x", hex);
    printf("%s (size = %zu)", s, n);
    free(s);
}
Note, I initialized s to NULL, which would cause the first call to snprintf() to return an unspecified value on SUSv2 (legacy). It's well defined in C99 and later. The output is:
0x3731 (size = 7)
And the compile-time version using a fixed upper bound:
#include <limits.h>
#include <stdio.h>
// compile-time
void count_ones(unsigned hex) {
    char s[CHAR_BIT * sizeof(int) / 4 + 3]; // hex digits + "0x" + '\0'
    sprintf(s, "0x%x", hex);
    printf("%s (size = %zu)", s, sizeof s);
}
and the output is:
0x3731 (size = 11)
Your biggest problem is that malloc isn't allocating enough. As Barmar said, you need at least 7 bytes to store it, or you could calculate the amount needed. Another problem is that you are freeing it and then using it. The use comes only one line after the free, so it may appear to work, but touching freed memory is undefined behavior; always free only after you know you are done using it.

How much space to allocate for printing long int value in string?

I want to store a long value (LONG_MAX in my test program) in a dynamically allocated string, but I'm confused how much memory I need to allocate for the number to be displayed in the string.
My first attempt:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <limits.h>
int main(void)
{
char *format = "Room %lu somedata\n";
char *description = malloc(sizeof(char) * strlen(format) + 1);
sprintf(description, format, LONG_MAX);
puts(description);
return 0;
}
Compiled with
gcc test.c
And then running it (and piping it into hexdump):
./a.out | hd
Returns
00000000 52 6f 6f 6d 20 39 32 32 33 33 37 32 30 33 36 38 |Room 92233720368|
00000010 35 34 37 37 35 38 30 37 20 62 6c 61 62 6c 61 0a |54775807 blabla.|
00000020 0a |.|
00000021
Looking at the output, it seems my memory allocation of sizeof(char) * strlen(format) + 1 is wrong (too little memory allocated) and it only works by accident?
What is the correct amount to allocate then?
My next idea was (pseudo-code):
sizeof(char) * strlen(format) + strlen(LONG_MAX) + 1
This seems too complicated and pretty non-idiomatic. Or am I doing something totally wrong?
You are doing it totally wrong. LONG_MAX is an integer, so you can't call strlen() on it. And it's not the number that gives the longest result; LONG_MIN is, because it prints a minus character as well.
A nice method is to write a function
char* mallocprintf (...)
which has the same arguments as printf and returns a string allocated using malloc with the exactly right length. How you do this: First figure out what a va_list is and how to use it. Then figure out how to use vsnprintf to find out how long the result of printf would be without actually printing. Then you call malloc, and call vsnprintf again to produce the string.
This has the big advantage that it works when you print strings using %s, or strings using %s with some large field length. Guess how many characters %999999d prints.
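A minimal sketch of such a function, assuming C99 (va_copy, and vsnprintf accepting a NULL buffer with size 0), could look like the following. The name mallocprintf comes from the description above; the body is only one possible way to write it.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Formats like printf into a freshly malloc'd string of exactly the
   right length. Returns NULL on failure; the caller must free() it. */
char *mallocprintf(const char *format, ...) {
    va_list args;
    va_list args_copy;
    va_start(args, format);
    va_copy(args_copy, args);  /* the list is consumed once per vsnprintf call */

    /* First pass: measure only. */
    int len = vsnprintf(NULL, 0, format, args);
    va_end(args);
    if (len < 0) {
        va_end(args_copy);
        return NULL;
    }

    char *result = malloc((size_t)len + 1);  /* +1 for '\0' */
    if (result != NULL) {
        /* Second pass: produce the string. */
        int written = vsnprintf(result, (size_t)len + 1, format, args_copy);
        assert(written == len);
        (void)written;
    }
    va_end(args_copy);
    return result;
}
```

With this in place, the original program reduces to something like char *description = mallocprintf("Room %ld somedata\n", LONG_MAX); followed by a NULL check and an eventual free(description).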
You can use snprintf() to figure out the length without worrying about the size of LONG_MAX.
When you call snprintf with a NULL string and a size of 0, it returns the number of bytes that would have been required had it written into a buffer, so you know exactly how many bytes are required.
char *format = "Room %lu somedata\n";
int len = snprintf(0, 0, format, LONG_MAX); // Returns the number of bytes that would have been required for writing.
char *description = malloc( len+1 );
if(!description)
{
/* error handling */
}
snprintf(description, len+1, format, LONG_MAX);
Convert the predefined constant numeric value to a string, using macro expansion as explained in convert digital to string in macro:
#define STRINGIZER_(exp) #exp
#define STRINGIZER(exp) STRINGIZER_(exp)
(code courtesy of Whozcraig). Then you can use
int max_digit = strlen(STRINGIZER(LONG_MAX))+1;
or
int max_digit = strlen(STRINGIZER(LONG_MIN));
for signed values, and
int max_digit = strlen(STRINGIZER(ULONG_MAX));
for unsigned values.
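Since the value of LONG_MAX is a compile-time, not a run-time value, this writes the constant for your compiler into the executable. Be aware that the text a limits macro expands to is implementation-defined: it may carry a suffix such as L or be a parenthesized expression, in which case the stringized length does not match the digit count.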
To allocate enough room, consider worst case
// Over approximate log10(pow(2,bit_width))
#define MAX_STR_INT(type) (sizeof(type)*CHAR_BIT/3 + 3)
char *format = "Room %lu somedata\n";
size_t n = strlen(format) + MAX_STR_INT(unsigned long) + 1;
char *description = malloc(n);
sprintf(description, format, LONG_MAX);
Pedantic code would consider potential other locales
snprintf(description, n, format, LONG_MAX);
Yet in the end, recommend a 2x buffer
char *description = malloc(n*2);
sprintf(description, format, LONG_MAX);
Note: printing with the specifier "%lu", meant for unsigned long, and passing a long LONG_MAX is undefined behavior. Suggest ULONG_MAX
sprintf(description, format, ULONG_MAX);
With credit to the answer by @Jongware, I believe the ultimate way to do this is the following:
#define STRINGIZER_(exp) #exp
#define STRINGIZER(exp) STRINGIZER_(exp)
const size_t LENGTH = sizeof(STRINGIZER(LONG_MAX)) - 1;
The string conversion turns it into a string literal and therefore appends a null terminator, hence the -1.
And note that since everything is a compile-time constant, you could just as well declare the string as
const char *format = "Room " STRINGIZER(LONG_MAX) " somedata\n";
You cannot use the length of the format. You need to observe (assuming a 32-bit long):
LONG_MAX = 2147483647 = 10 characters
"Room somedata\n" = 15 characters
Add the Null = 26 characters
so use
malloc(26)
should suffice.
You have to allocate a number of chars equal to the digits of the number LONG_MAX, that is 2147483647, so you have to allocate 10 more chars for the digits.
In your format string you have
Room = 4 chars
somedata\n = 9
spaces = 2
null termination = 1
Then you have to malloc 26 chars.
If you want to determine at runtime how many digits your number has, you have to write a function that tests the number digit by digit:
while(n!=0)
{
n/=10; /* n=n/10 */
++count;
}
Another way is to temporarily store the sprintf result in a local buffer and then malloc strlen(tempStr)+1 chars.
Usually this is done by formatting into a "known" large enough buffer on the stack and then dynamically allocating whatever is needed to fit the formatted string, i.e.:
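As a complete function, the digit-counting loop might be sketched as below. Note that the bare loop above yields 0 for n == 0, although "0" still needs one character, so this version starts the count at 1. The name count_decimal_digits is an illustrative choice.

```c
#include <stddef.h>

/* Returns how many decimal digits are needed to print n.
   Every value, including 0, needs at least one digit. */
size_t count_decimal_digits(unsigned long n) {
    size_t count = 1;
    while (n >= 10) {
        n /= 10;   /* drop the least significant digit */
        ++count;
    }
    return count;
}
```

The buffer for the number alone would then need count_decimal_digits(n) + 1 bytes, the extra byte being the null terminator.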
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <limits.h>
int main(void)
{
char buffer[1024];
sprintf(buffer, "Room %lu somedata\n", LONG_MAX);
char *description = malloc( strlen( buffer ) + 1 );
strcpy( description, buffer );
puts(description);
return 0;
}
Here, using strlen(format) for the memory allocation is a bit problematic. It allocates memory for the characters %lu themselves, not for the printed width of the value that %lu can produce.
You should consider the max possible value for unsigned long,
ULONG_MAX 4294967295
which is 10 chars when printed (with a 32-bit unsigned long).
So, you have to allocate space for
The actual string (containing chars), plus
10 chars (at max), for the printed value of %lu, plus
1 char to represent the - sign, in case the value is negative, plus
1 null terminator.
Well, if long is a 32-bit on your machine, then LONG_MAX should be 2147483647, which is 10 characters long. You need to account for that, the rest of your string, and the null character.
Keep in mind that long is a signed value with maximum value of LONG_MAX, and you are using %lu (which is supposed to print an unsigned long value). If you can pass a signed value to this function, then add an additional character for the minus sign, otherwise you might use ULONG_MAX to make it clearer what your limits are.
If you are unsure which architecture you are running on, you might use something like:
// this should work for signed 32 or 64 bit values
#define NUM_CHARS ((sizeof(long) == 8) ? 21 : 11)
Or, play safe and simply use 21. :)
Use the following code to calculate the number of characters necessary to hold the decimal representation of any positive integer:
#include <math.h>
...
size_t characters_needed_decimal(long long unsigned int llu)
{
size_t s = 1;
if (0LL != llu)
{
s += log10(llu);
}
return s;
}
Remember to add 1 when using a C "string" to store the number, as C "strings" are 0-terminated.
Use it like this:
#include <limits.h>
#include <stdlib.h>
#include <stdio.h>
size_t characters_needed_decimal(long long unsigned int);
int main(void)
{
size_t s = characters_needed_decimal(LONG_MAX);
++s; /* 1+ as we want a sign */
char * p = malloc(s + 1); /* add one for the 0-termination */
if (NULL == p)
{
perror("malloc() failed");
exit(EXIT_FAILURE);
}
sprintf(p, "%ld", LONG_MAX);
printf("LONG_MAX = %s\n", p);
sprintf(p, "%ld", LONG_MIN);
printf("LONG_MIN = %s\n", p);
free(p);
return EXIT_SUCCESS;
}
Safest:
Rather than predict the allocation needed, use asprintf(). This function allocates memory as needed.
char *description = NULL;
asprintf(&description, "Room %lu somedata\n", LONG_MAX);
asprintf() is not standard C, but is common in *nix and its source code is available to accommodate other systems.

Write a recursive function in C that converts a number into a string

I'm studying software engineering, and came across this exercise: it asks to write a recursive function in C language that receives a positive integer and an empty string, and "translates" the number into a string. Meaning that after calling the function, the string we sent would contain the number but as a string of its digits.
I wrote this function, but when I tried printing the string, it did print the number I sent, but in reverse.
This is the function:
void strnum(int n, char *str)
{
if(n)
{
strnum(n/10, str+1);
*str = n%10 + '0';
}
}
For example, I sent the number 123 on function call, and the output was 321 instead of 123.
I also tried exchanging the two lines within the if statement, and it still does the same. I can't figure out what I did wrong. Can someone help please?
NOTE: Use of while and for loop statements is not allowed for the exercise.
Note: your current implementation design is somewhat dangerous since you have no way of knowing if you are really writing in valid memory; consider implementing the function with a passed in len to know when you shouldn't try to write anymore; ie. to prevent buffer overruns.
Introduction
The problem is that you are shaving off the least significant digit, but assigning it to the most significant position in the buffer pointed to by str.
You will need to have the "off shaving" and the "assigning" synchronized, so that the least significant digit is stored at the end - and not the beginning.
Hints
Easiest solution would be to do what you currently are doing, and then later reverse the buffer, but this will require far more assignments than what is actually required.
The recommended way is to calculate the number of digits in your string, by doing this you'll know at what offset the end will be, and start assigning the least significant digit at that position.
How do I determine the number of digits of an integer in C?
The hack
Another alternative is having the recursive call modify the current value of our pointer, effectively making it assign the right value - at the right offset.
This example is mostly included because it's "fun", there are (as mentioned) other paths to walk.
#include <stdio.h>
void strnum_impl (int n, char ** str) {
if (n) {
strnum_impl (n/10, str);
**str = n%10 + '0';
(*str)++;
}
}
void strnum (int n, char * str) {
if (n == 0) { *str++ = '0'; }
else { strnum_impl (n, &str); }
*str = '\0'; /* null-terminate */
}
int main () {
char buf[256];
strnum (10240123, buf);
printf (">%s<\n", buf);
return 0;
}
>10240123<
As @Robert Harvey commented, as well as others, the code is determining the least rather than the most significant digit and placing it in str[0].
It did look like fun to implement, so the below copes well with the entire range of int, including INT_MIN, and with arbitrarily sized int.
#include <limits.h>
#include <string.h>
static char *strnum_helper(int n, char *str) {
str[0] = '0' - n%10;
if (n < -9) {
return strnum_helper(n/10, str - 1);
}
return str;
}
void strnum(int n, char *str) {
char buf[(sizeof n * CHAR_BIT)/3 + 3]; // Sized to any size int
if (n < 0) {
*str++ = '-';
}
else {
n = -n; // By using negative numbers, do not attempt -INT_MIN
}
buf[sizeof buf - 1] = '\0';
strcpy(str, strnum_helper(n, &buf[sizeof buf - 2]));
}
@Filip Roséen - refp pointed out the value of passing in a size. The above strnum() could be adjusted per a size limitation.
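One way that adjustment might look is sketched below. The size parameter and the name strnum_bounded are assumptions for illustration; the helper is the same negative-number recursion as above, repeated so the block stands alone.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* n must be <= 0. Writes digits right to left and returns the
   address of the most significant digit. */
static char *strnum_helper(int n, char *end) {
    end[0] = (char)('0' - n % 10);
    if (n < -9) {
        return strnum_helper(n / 10, end - 1);
    }
    return end;
}

/* Returns str on success, or NULL if the text (including '\0')
   does not fit in size bytes; str is left untouched in that case. */
char *strnum_bounded(int n, char *str, size_t size) {
    char buf[(sizeof n * CHAR_BIT) / 3 + 3];  /* sized for any int */
    int negative = n < 0;
    if (!negative) {
        n = -n;  /* work with negatives so INT_MIN is never negated */
    }
    buf[sizeof buf - 1] = '\0';
    char *digits = strnum_helper(n, &buf[sizeof buf - 2]);
    if (negative) {
        *--digits = '-';
    }
    size_t needed = (size_t)(&buf[sizeof buf] - digits);  /* counts the '\0' */
    if (needed > size) {
        return NULL;
    }
    return memcpy(str, digits, needed);
}
```

Formatting into the local buffer first means the destination is only written once the length is known to fit.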

String conversion to int

I have a pointer lpBegin pointing to a string "1234". Now I want to compare this string to a uint. How can I convert this string to an unsigned integer without using scanf? I know the string number is 4 characters long.
You will have to use the atoi function. This takes a pointer to a char and returns an int.
const char *str = "1234";
int res = atoi(str); //do something with res
As said by others and something I didn't know, is that atoi is not recommended because it is undefined what happens when a formatting error occurs. So better use strtoul as others have suggested.
Definitely atoi() which is easy to use.
Don't forget to include stdlib.h.
You can use the strtoul() function. strtoul stands for "String to unsigned long":
#include <stdio.h>
#include <stdlib.h>
int main()
{
char lpBegin[] = "1234";
unsigned long val = strtoul(lpBegin, NULL, 10);
printf("The integer is %lu.", val);
return 0;
}
You can find more information here: http://www.cplusplus.com/reference/clibrary/cstdlib/strtoul/
You could use strtoul(lpBegin, NULL, 10), but this only works with zero-terminated strings.
If you don't want to use stdlib for whatever reason and you're absolutely sure about the target system(s), you could do the number conversion manually.
This one should work on most systems as long as they are using single byte encoding (e.g. Latin, ISO-8859-1, EBCDIC). To make it work with UTF-16, just replace the 'char' with 'wchar_t' (or whatever you need).
unsigned long sum = (lpBegin[0] - '0') * 1000 +
                    (lpBegin[1] - '0') * 100 +
                    (lpBegin[2] - '0') * 10 +
                    (lpBegin[3] - '0');
or for numbers with unknown length:
char* c = lpBegin;
unsigned long sum = 0;
while (*c >= '0' && *c <= '9') {
sum *= 10;
sum += *c - '0';
++c;
}
I think you look for atoi()
http://www.elook.org/programming/c/atoi.html
strtol is better than atoi with better error handling.
You should use the strtoul function, "string to unsigned long". It is found in stdlib.h and has the following prototype:
unsigned long int strtoul (const char * restrict nptr,
char ** restrict endptr,
int base);
nptr is the character string.
endptr is an optional parameter giving the location of where the function stopped reading valid numbers. If you aren't interested in this, pass NULL in this parameter.
base is the number format you expect the string to be in. In other words, 10 for decimal, 16 for hex, 2 for binary and so on.
Example:
#include <stdlib.h>
#include <stdio.h>
int main()
{
const char str[] = "1234random rubbish";
unsigned int i;
const char* endptr;
i = strtoul(str,
(char**)&endptr,
10);
printf("Integer: %u\n", i);
printf("Function stopped when it found: %s\n", endptr);
getchar();
return 0;
}
Regarding atoi().
atoi() behaves like strtol() with base 10, except for its error handling. atoi() is however not recommended, since the C standard leaves the behavior undefined when the converted value cannot be represented, and atoi() gives you no way to detect a format error. It is therefore better practice to always use strtoul() (and the other similar strto... functions).
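To make the error-handling difference concrete, here is a small sketch of a checked conversion built on strtoul(). The wrapper name parse_ulong and its return convention are made up for illustration; the checks on endptr and errno are the standard way to detect failures.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: returns 1 and stores the value in *out on
   success, 0 if text is not a plain, in-range decimal number. */
int parse_ulong(const char *text, unsigned long *out) {
    /* strtoul accepts a leading '-' and wraps the value, so reject it here. */
    const char *p = text;
    while (isspace((unsigned char)*p)) {
        ++p;
    }
    if (*p == '-') return 0;

    char *end;
    errno = 0;
    unsigned long value = strtoul(text, &end, 10);
    if (end == text) return 0;      /* no digits were found */
    if (*end != '\0') return 0;     /* trailing characters after the number */
    if (errno == ERANGE) return 0;  /* value does not fit in unsigned long */

    *out = value;
    return 1;
}
```

With this, "1234" converts to 1234, while "1234random rubbish" and an empty string are both reported as errors instead of silently producing a number.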
If you're really certain the string is 4 digits long, and don't want to use any library function (for whatever reason), you can hardcode it:
const char *lpBegin = "1234";
const unsigned int lpInt = 1000 * (lpBegin[0] - '0') +
100 * (lpBegin[1] - '0') +
10 * (lpBegin[2] - '0') +
1 * (lpBegin[3] - '0');
Of course, using e.g. strtoul() is vastly superior so if you have the library available, use it.

Minimum buffer length to read a float

I'm writing a small command-line program that reads two floats, an int, and a small string (4 chars max) from stdin. I'm trying to figure out the buffer size I should create and pass to fgets. I figured I could calculate this based on how many digits should be included in the maximum values of float and int respectively, like so:
#include <float.h>
#include <limits.h>
...
int fmax = log10(FLT_MAX) + 2; // Digits plus - and .
int imax = log10(INT_MAX) + 1; // Digits plus -
int buflen = 4 + 2*fmax + imax + 4; // 4 chars, 2 floats, 1 int, 3 spaces and \n
...
fgets(inbuf, buflen + 1, stdin);
But it's occurred to me that this might not actually be correct. imax ends up being 10 on my system, which seems a bit low, while fmax is 40. (Which I'm thinking is a bit high, given that longer values may be represented with e notation.)
So my question is: is this the best way to work this out? Is this even necessary? It just feels more elegant than assigning a buffer of 256 and assuming it'll be enough. Call it a matter of pride ;P.
This type of thing is a place where I would actually use fscanf rather than reading into a fixed-size buffer first. If you need to make sure you don't skip a newline or other meaningful whitespace, you can use fgetc to process character-by-character until you get to the beginning of the number, then ungetc before calling fscanf.
If you want to be lazy though, just pick a big number like 1000...
This is defined for base 10 floating point numbers (#include <float.h> or the equivalent member of std::numeric_limits<float_type>):
FLT_MAX_10_EXP // for float
DBL_MAX_10_EXP // for double
LDBL_MAX_10_EXP // for long double
As is the maximum precision for decimals in base 10:
FLT_DIG // for float
DBL_DIG // for double
LDBL_DIG // for long double
Although it really depends on what you define to be a valid floating point number. You could imagine someone expecting:
00000000000000000000000000000000000000000000000000.00000000000000000000
to be read in as zero.
I'm sure there's a good way to determine the maximum length of a float string algorithmically, but what fun is that? Let's figure it out by brute force!
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int main(void)
{
float f;
unsigned int i = -1;
if (sizeof(f) != sizeof(i))
{
printf("Oops, wrong size! Change i to a long or whatnot so the sizes match.\n");
return 0;
}
printf("sizeof(float)=%zu\n", sizeof(float));
char maxBuf[256] = "";
int maxChars = 0;
while(i != 0)
{
char buf[256];
memcpy(&f, &i, sizeof(f));
sprintf(buf, "%f", f);
if ((i%1000000)==0) printf("Calculating # %u: buf=[%s] maxChars=%i (maxBuf=[%s])\n", i, buf, maxChars, maxBuf);
int numChars = strlen(buf);
if (numChars > maxChars)
{
maxChars = numChars;
strcpy(maxBuf, buf);
}
i--;
}
printf("Max string length was [%s] at %i chars!\n", maxBuf, maxChars);
}
Looks like the answer might be 47 characters per float (at least on my machine), but I'm not going to let it run to completion so it's possibly more.
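The 47-character figure can be cross-checked from <float.h> without waiting for the loop. The sketch below bounds the output of the default "%f" format only (six fraction digits); it says nothing about what a user might type as input. The macro FLOAT_F_BUF_SIZE and the function format_float are names invented for this example.

```c
#include <float.h>
#include <stdio.h>

/* Worst case for "%f": sign, FLT_MAX_10_EXP + 1 integer digits,
   the decimal point, 6 fraction digits, and the terminating '\0'. */
#define FLOAT_F_BUF_SIZE (1 + (FLT_MAX_10_EXP + 1) + 1 + 6 + 1)

/* Returns the number of characters written, or -1 if the output
   failed or did not fit in size bytes. */
int format_float(char *buf, size_t size, float f) {
    int n = snprintf(buf, size, "%f", f);
    return (n >= 0 && (size_t)n < size) ? n : -1;
}
```

With the usual IEEE single-precision float, FLT_MAX_10_EXP is 38, so the macro evaluates to 48: 47 visible characters for -FLT_MAX plus the terminator.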
Following the answer from @MSN, you can't really know your buffer is large enough.
Consider:
const int size = 4096;
char buf[size] = "1.";
buf[size -1 ] = '\0';
for(int i = 2; i != size - 1; ++i)
buf[i] = '0';
double val = atof(buf);
std::cout << buf << std::endl;
std::cout << val << std::endl;
Here atof() handles (as it is supposed to) a representation of 1 that is about four thousand characters long.
So really, you can do one or more of:
Handle the case of not having a large enough buffer
Have better control over the input file
Use fscanf directly, to make the buffer size someone else's problem
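For the first option, a rough sketch of detecting an over-long line with fgets() is shown below. The function name read_line and its return codes are assumptions for illustration; the idea is simply that a full buffer without a trailing newline means more of the line is still waiting in the stream.

```c
#include <stdio.h>
#include <string.h>

/* Reads one line into buf (without the newline).
   Returns 1 for a complete line, 0 if the line was too long
   (the excess is discarded), -1 on end of file or read error. */
int read_line(FILE *in, char *buf, size_t size) {
    if (fgets(buf, (int)size, in) == NULL) {
        return -1;
    }
    size_t len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';  /* whole line fit, strip the newline */
        return 1;
    }
    /* No newline stored: skip whatever remains of this line. */
    int discarded = 0;
    int c;
    while ((c = fgetc(in)) != EOF && c != '\n') {
        discarded = 1;
    }
    return discarded ? 0 : 1;  /* nothing skipped means the line did fit */
}
```

A caller can then reject or re-prompt on a 0 result instead of parsing a truncated line.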
