Converting string to a number - c

I came across this C program:
int main() {
    printf("Enter your address, (e.g. 51 Anzac Road) ");
    gets(address);
    number = 0;
    i = 0;
    while (address[i] != ' ') {
        number = number * 10 + (address[i] - 48);
        i++;
    }
}
I understand number = number * 10 + (address[i] - 48); is to get the number from input, but can anybody explain to me how this works? How does that produce the number from the input?

C requires the digits 0 through 9 to be stored contiguously, in that order, in the execution character set. 48 is the ASCII value of '0', so, for instance:
'3' - 48 == 3
for any digit.
ASCII is not required for C, so better is:
'3' - '0'
because while 48 is right for ASCII, '0' is by definition right for any character set.
If address contains "456 ", then:
when i == 0 and number == 0, number * 10 + (address[0] - 48) equals 0 * 10 + 4, or 4.
when i == 1, number * 10 + (address[1] - 48) is 4 * 10 + 5, or 45.
when i == 2, number * 10 + (address[2] - 48) is 45 * 10 + 6, or 456
and you're done.
Never use gets(), it's dangerous, and isn't even part of C anymore.
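The walkthrough above can be sketched as a small function. This is only a minimal sketch, not the original program: the name leading_number is hypothetical, it subtracts '0' rather than 48, and it stops at the first non-digit instead of at a space, so it cannot run off the end of a string that contains no space.

```c
#include <ctype.h>

/* Hypothetical helper: converts the run of decimal digits at the start of s.
   Returns 0 if s does not start with a digit; overflow is not checked. */
int leading_number(const char *s)
{
    int number = 0;
    for (int i = 0; isdigit((unsigned char)s[i]); i++) {
        /* shift the digits collected so far one decimal place to the left,
           then add the value of the new digit */
        number = number * 10 + (s[i] - '0');
    }
    return number;
}
```

With a line read by fgets() instead of gets(), leading_number("51 Anzac Road") gives 51.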

In ASCII, the digit characters '0' through '9' occupy code points 48 through 57 (in hex, 0x30 through 0x39) so, to turn a digit character into a value, you just subtract 48.
As an aside, you should really subtract '0' since the standard doesn't guarantee ASCII, though it does guarantee that the digit characters are contiguous and ordered. C under z/OS, for example, uses EBCDIC which places the digits at code points 0xf0 through 0xf9.
The loop itself is a simple shift-and-add type, to create a number from multiple digit characters. Say you have the string "123", and number is initially zero.
You multiply number (zero) by ten to get zero then add digit character '1' (49) and subtract 48. This gives you one.
You then multiply number (one) by ten to get ten and add digit character '2' (50), again subtracting 48. This gives you twelve.
Finally, you multiply number (twelve) by ten to get a hundred and twenty then add digit character '3' (51) and subtract 48. This gives you a hundred and twenty three.
There are better ways to do this in the C standard library, atoi or the more robust strtol-type functions, all found in stdlib.h. The latter allow you to better detect if there was "rubbish" at the end of the number, for assistance with validation (atoi cannot tell the difference between 123 and 123xyzzy).
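As a rough illustration of that difference, here is a sketch of a strict wrapper around strtol. The name parse_long_strict and its return convention are made up for this example; only strtol, errno and ERANGE come from the standard library.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical wrapper around strtol(): returns 1 and stores the value in *out
   only if the whole string is a valid base-10 number that fits in a long. */
int parse_long_strict(const char *s, long *out)
{
    char *end;
    errno = 0;
    long value = strtol(s, &end, 10);
    if (end == s)          /* no digits were consumed */
        return 0;
    if (*end != '\0')      /* trailing rubbish such as "xyzzy" */
        return 0;
    if (errno == ERANGE)   /* value out of range for long */
        return 0;
    *out = value;
    return 1;
}
```

Here "123" is accepted and "123xyzzy" is rejected, whereas atoi returns 123 for both.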
And, as yet another aside, you should avoid gets() like the plague. It, like the "naked" scanf("%s"), is not suitable for user input, and opens your code to buffer overflow problems. In fact, unlike scanf(), there is no safe way to use gets(), which is undoubtedly why it was removed in C11. A more robust user input function can be found here.
There's also a large class of addresses for which that code will fail miserably, such as:
3/28 Tivoli Rd
57a Smith Street
Flat 2, 12 Xyzzy Lane

Related

What does [c - '0'] mean in array? [duplicate]

On page 24 of K and R - C programming book,
there is this code from a program to count digits and other input.
while((c = getchar()) != EOF)
    if(c >= '0' && c <= '9')
        ++ndigit[c - '0'];
I am not understanding the last line of the code. What does [c - '0'] mean?
Entire program from the book: https://imgur.com/a/4WhIOsz
Every character has a numeric value. That means that if you input the character "0", you're not actually inputting the number 0; you're inputting the character "0", whose value is 48 in the ASCII table (it may be different in other character sets).
So when you input the character "0", you get the actual number zero as follows (according to the ASCII table):
c - '0' means c - 48
'0' - '0' means 48 - 48 = 0
'1' - '0' means 49 - 48 = 1
'2' - '0' means 50 - 48 = 2
and so on...
Notice that the code is designed to work with different character sets, not just ASCII.
Search "ASCII table" in google to see the chart in order to understand it better.
'0' is the ASCII character 0, with a decimal value of 48. (See https://www.asciitable.com for a full listing.)
Since the numerals 0-9 are all guaranteed by the C standard to have consecutive values (in this case, decimal values from 48 to 57), by subtracting the character 0 from the input character, you arrive at the corresponding integer value, that you can then use in further processing.
'0' - '0' = 0
'1' - '0' = 1
'2' - '0' = 2
...
'9' - '0' = 9
In this case, the further processing is then used to index into the ndigit array.
The C language mandates that the codes for the digit characters be consecutive. That means that even if you do not use ASCII (EBCDIC used to be another common charset) you can be sure that the code for '5' will be the code for '0' + 5.
So the idiom c - '0' is guaranteed to be portable across any conformant system, while c - 48 (or c - 0x30) will only work on ASCII (or ASCII derivatives like Latin-1, CP1252, UTF-8, etc.) systems.
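To see the indexing in isolation, here is a small sketch that applies the same idiom to a string instead of getchar(). The function name count_digits is invented for this example; the ndigit array plays the same role as in the book's program.

```c
#include <stddef.h>

/* Hypothetical string-based variant of the K&R loop: for every digit character
   in s, increments the matching slot of ndigit (which must have 10 elements). */
void count_digits(const char *s, int ndigit[10])
{
    for (size_t i = 0; s[i] != '\0'; i++) {
        char c = s[i];
        if (c >= '0' && c <= '9')
            ++ndigit[c - '0'];   /* '0' -> index 0, '1' -> index 1, ..., '9' -> index 9 */
    }
}
```

Starting from an all-zero array, the string "a1b22c333" leaves ndigit[1] == 1, ndigit[2] == 2 and ndigit[3] == 3.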

can someone please tell why " -'0' "is being done in the fifth line of following function [duplicate]

m=1e9 + 7;
inline ll rem(char S[], ll m)
{
    ll sum = 0, i;
    for (i = 0; S[i] != '\0'; i++)
    {
        if (sum >= m)
            sum %= m;
        sum = (sum * 10 + S[i] - '0');
    }
    return sum % m;
}
Here S is a string of digit characters. My question is: what does - '0' do here? Also, can a character (here S[i]) be automatically converted to integer form in the above
sum=(sum * 10 + S[i] - '0');
expression?
First, you have to remember that characters in C are represented as tiny integers corresponding to the character's value in the machine's character set, which is typically ASCII.
For example, 'A' in ASCII is 65, and '0' is 48.
So if you're converting a string of digits to an integer, you want to do something like
int digit = c - 48;
That converts '0' to 0, '1' to 1, etc.
But that magic number 48 is mystifying, and it's theoretically also wrong on a machine using a character set other than ASCII. So the easier (because you don't have to remember that value 48), self-documenting (as long as your reader understands the idiom), and more portable way is to do
int digit = c - '0';
This works because, as I said, '0' is 48 in ASCII. But, more importantly, even on a non-ASCII machine, '0' is whatever value the character '0' has in that machine's character set, so it's always the right value to subtract, no matter what kind of machine you're using.
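As for the second part of the question: yes, a char is promoted to int automatically in arithmetic, so S[i] - '0' is already an int. A cleaned-up sketch of the function from the question might look like the following. The name decimal_string_mod is hypothetical, and it reduces modulo m on every step rather than only when sum >= m, which is a small change from the original.

```c
/* Hypothetical cleaned-up version of rem(): returns the decimal number in s
   modulo m. Assumes s holds only digit characters and 0 < m <= about 9e17,
   so that sum * 10 + 9 cannot overflow a long long. */
long long decimal_string_mod(const char *s, long long m)
{
    long long sum = 0;
    for (int i = 0; s[i] != '\0'; i++) {
        int digit = s[i] - '0';          /* char promotes to int; '7' - '0' == 7 */
        sum = (sum * 10 + digit) % m;    /* keep sum below m at every step */
    }
    return sum;
}
```

For instance, decimal_string_mod("1000000008", 1000000007) is 1.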

What's the longest string that can be printed with "%1.17g" format for any double precision float?

I'm maintaining a C json library and I need to know what's the maximum numbers of characters sprintf will output with "%1.17g" format string. Currently I'm allocating 1100 bytes (based on What is the maximum length in chars needed to represent any double value?) which seems quite wasteful. If I understand correctly it should never be longer than 22 characters (1 for integer part, 1 for dot, 16 for mantissa, 4 for "e-XX"). However problems with floating point numbers can be quite counterintuitive and I'm not sure if I'm not missing something. Is my reasoning correct?
Continuing from the comment,
The %1. (one before the '.') specifies the minimum field width; it places no limit on the number of digits that can appear. If the number of digits exceeds the field width, the field is expanded.
For g or G conversion specifiers the 17 specifies "the maximum number of significant digits". Further "Style e is used if the exponent from its conversion is less than -4 or greater than or equal to the precision."
e, E The double argument is rounded and converted in the style [-]d.ddde±dd where there is one digit before the decimal-point
character and the number of digits after it is equal to the precision;
if the precision is missing, it is taken as 6; if the precision is
zero, no decimal-point character appears. An E conversion uses the
letter 'E' (rather than 'e') to introduce the exponent. The
exponent always contains at least two digits; if the value is zero,
the exponent is 00.
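Because the 17 significant digits of %g include the one before the decimal point, style e prints 16 digits after it, and the maximum number of characters would then be:
'(+/-)' + 1 + '.' + 16 + 'e' + '(+/-)' + XXX + '\0' = 25-chars
(where XXX is at most three digits, e.g. 308)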
For good measure a buffer of 32-chars should suffice. There is nothing wrong with an 1100-char buffer. I'd rather be 10,000 bytes too long, than 1-byte too short.
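A quick way to check an estimate like this is to format a few extreme values and look at the returned length. The sketch below assumes IEEE 754 binary64 doubles and finite values; format_double is a made-up helper name, and only snprintf is a real library call.

```c
#include <stdio.h>

/* Hypothetical helper: formats value with "%.17g" into buf and returns the
   number of characters written (excluding the '\0'), or -1 if it did not fit. */
int format_double(double value, char *buf, size_t size)
{
    int n = snprintf(buf, size, "%.17g", value);
    if (n < 0 || (size_t)n >= size)
        return -1;
    return n;
}
```

With a 32-byte buffer, the largest-magnitude negative double comes out as -1.7976931348623157e+308, which is 24 characters plus the terminating '\0'.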
What's the longest string that can be printed with “%1.17g” format for any double
Using "%1.17g" prints the double using various styles:
// Large/small values in exponential notation
printf("%1.17g\n", -1.0e200/7);
printf("%1.17g\n", -1.0e-200/7);
printf("%1.17g\n", -1.0e-5/7);
-1.4285714285714286e+199
-1.4285714285714286e-201
-1.4285714285714286e-06
// middle values in fixed notation
printf("%1.17g\n", -1.0e0/7);
printf("%1.17g\n", -1.0e-2/7);
-0.14285714285714285
-0.0014285714285714286
// non-finite values
printf("%1.17g\n", -NAN);
printf("%1.17g\n", -INFINITY);
-nan /* this may be longer */
-inf
The longest apparent string size is 25 char:
sign  digit  point  fraction          e  sign  exponent  null
-     1      .      4285714285714286  e  +     199       \0
1     1      1      17-1              1  1     3         1
Why could this be longer?
C allows not-a-numbers to also include a payload, which may include many characters. (I doubt more than the payload written in decimal: 16 with binary64.)
The exponent may need more than 3 characters (perhaps a 4 or 5 digit exponent).
double may require more than 17 digits to differentiate all double values. (Detectable with DBL_DECIMAL_DIG.)
The present locale may add extra characters for a double (not so likely)
The lead 1 in "%1.17g" is the minimum characters to print. It serves scant purpose here.
Solution: estimate the longest buffer using generous considerations - and then double it.
// needs <float.h> for DBL_DECIMAL_DIG and <stdio.h> for snprintf()
#define G_SIZE (1 + 1 + 1 + DBL_DECIMAL_DIG-1 + 1 + 1 + 5 + 1)
char buf[G_SIZE * 2];
int cnt = snprintf(buf, sizeof buf, "%.*g", DBL_DECIMAL_DIG, value);
if (cnt < 0 || (size_t) cnt >= sizeof buf) {
    unexpected_conversion_handler();
}
or use a variable length array and 2 calls to snprintf()
int cnt = snprintf(NULL, 0, "%.*g", DBL_DECIMAL_DIG, value);
char buf[cnt + 1];
snprintf(buf, sizeof buf , "%.*g", DBL_DECIMAL_DIG, value);

Octal to Decimal multidigit in C

I have a program I am writing that converts Octal to Decimal numbers. Most of it works.
(more code above this, assume all variables are properly declared).
for(i; i > 0; i--)
{
    decimalNumber = (decimalNumber + (number['i'] * pow(8,power)));
    power++;
}
The code correctly shifts over to the right to do other digits but it doesn't change the number it is working with. For example, entering 54 in octal results in an output of 36, 4*(8^0) + 4*(8^1) when it should be outputting 4*(8^0) + 5*(8^1), or 44.
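'i' is a character constant, not the variable i. You probably meant just i. Also, consider << 3 (a left shift by three bits multiplies by 8) instead of pow().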
As Ignacio pointed out, 'i' is a constant and will cause you to access the same out of bounds array element on each iteration of the loop. Since I assume you start with i equal to the number of digits in the array (you didn't show that code), you want to subtract 1 from it when you use it as an array index.
You're traversing the string in the wrong direction.
Or, better, change your logic:
5 -> 5*8^0
54 -> (5*8^0)*8 + 4
543 -> ((5*8^0)*8 + 4)*8 + 3
number[0] is 5
number[1] is 4
decimalNumber is 0
power is 0
for i = 1 downto 0 do
    decimalNumber = (decimalNumber + (number[i:1,0] * pow(8,power:0,1)));
    power++;
end do
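The "change your logic" scheme above (multiply the running value by 8, then add the next digit, reading left to right) could be sketched like this. It assumes the digits are stored as characters in a string, which the question does not show, and octal_to_decimal is a hypothetical name.

```c
/* Hypothetical helper: converts a string of octal digits to its value,
   reading left to right. Returns -1 on any character outside '0'..'7';
   overflow is not checked. */
long octal_to_decimal(const char *s)
{
    long value = 0;
    for (int i = 0; s[i] != '\0'; i++) {
        if (s[i] < '0' || s[i] > '7')
            return -1;
        value = value * 8 + (s[i] - '0');   /* same as (value << 3) + digit */
    }
    return value;
}
```

This needs neither pow() nor a separate power counter, and octal_to_decimal("54") gives 44.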

What does the condition "if (n/10)" with integral n specify?

I am looking at the following piece of code:
void printd(int n)
{
    if (n < 0) {
        putchar('-');
        n = -n;
    }
    if (n / 10)
        printd(n / 10);
    putchar(n % 10 + '0');
}
I understand the first if statement fine, but the second one has me confused on a couple of points.
By itself, since "n" is an integer, I understand that n/10 will shift the decimal point to the left once - effectively removing the last digit of the number; however, I am having a little trouble understanding how this can be a condition by itself without the result being equal to something. Why isn't the condition if ((n/10) >= 0) or something?
Also, why is the '0' passed into the putchar() call?
Can someone tell me how it would read if you were to read it aloud in English?
Thanks!
The n / 10 will evaluate to false if the result is 0, true otherwise. Essentially it's checking whether n >= 10 || n <= -10 (the -10 case doesn't come into play here because of the n = -n code).
The + '0' is for character offset, as characters '0'-'9' are not represented by numbers 0-9, but rather at an offset (48-57 with ASCII).
Can someone tell me how it would read if you were to read it aloud in English?
If you're talking about the conditional, then I would say "if integer n divided by 10 is not zero"
n/10 will not shift the decimal number since n is an integer. The division will produce the result like this: if n = 25, then n/10 would be 2 (without any decimal points), similarly if n = 9, then n/10 would be 0 in which case if condition would not be satisfied.
Regarding the + '0', since n % 10 produces an integer result and putchar prints a character, you need to convert the integer to a char. This is done by adding the value of the character '0' (48 in ASCII) to the integer.
In C, a condition does not need a separate boolean type: an expression like a > b evaluates to 0 if false and 1 if true, and if treats any non-zero value as true. You can take advantage of this when testing an int for zero or non-zero.
As for the '0', that just performs character arithmetic so that the right character is printed. The zero character has an ASCII encoding value which isn't zero, so the n value is used as an offset from that encoding to get the right numeric digit printed out.
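To watch both pieces work together, here is a sketch of the same recursion writing into a buffer instead of calling putchar(). The names put_digits and format_int are invented for this example, and unlike printd() it negates in unsigned arithmetic so that the most negative int is handled too.

```c
/* Recursive helper: writes the digits of n into out, most significant first,
   and returns the position just after the last digit written. */
static char *put_digits(unsigned int n, char *out)
{
    if (n / 10)                         /* non-zero quotient: more digits to the left */
        out = put_digits(n / 10, out);
    *out++ = (char)(n % 10 + '0');      /* 0..9 offset from '0' gives '0'..'9' */
    return out;
}

/* Hypothetical buffer-based variant of printd(); buf needs room for the sign,
   the digits and the terminating '\0'. */
void format_int(int n, char *buf)
{
    unsigned int u = (unsigned int)n;
    if (n < 0) {
        *buf++ = '-';
        u = 0u - u;                     /* magnitude of n, computed without signed overflow */
    }
    *put_digits(u, buf) = '\0';
}
```

For n = 9 the quotient n / 10 is 0, so the recursion is skipped and only the single digit is written; for n = -123 the result is "-123".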

Resources