Decoding printf statements in C (Printf Primer)

I'm working on bringing some old code from 1998 up to the 21st century. One of the first steps in the process is converting the printf statements to QString variables. No matter how many times I look back at printf though, I always end up forgetting one thing or the other. So, for fun, let's decode it together, for old times' sake, and in the process create the first little 'printf primer' for Stack Overflow.
In the code, I came across this little gem,
printf("%4u\t%016.1f\t%04X\t%02X\t%1c\t%1c\t%4s", a, b, c, d, e, f, g);
How will the variables a, b, c, d, e, f, g be formatted?

Danny is mostly right.
a. unsigned decimal, minimum 4 characters, space padded
b. floating point, minimum 16 digits before the decimal (0 padded), 1 digit after the decimal
c. hex, minimum 4 characters, 0 padded, letters are printed in upper case
d. same as above, but minimum 2 characters
e. e is assumed to be an int, converted to an unsigned char and printed
f. same as e
g. This is likely a typo, the 4 has no effect. If it were "%.4s", then a maximum of 4 characters from the string would be printed. It is interesting to note that in this case, the string does not need to be null terminated.
Edit: jj33 points out two errors, in b and g above.

@Jason Day, I think the 4 in the last %4s is significant if there are fewer than 4 characters. If there are more than 4 you are right, %4s and %s would be the same, but with fewer than 4 chars in g %s would be left justified and %4s would be right-justified in a 4 char field.
b is actually minimum 16 chars for the whole field, including the decimal and the single digit after the decimal I think (16 total chars vs 18 total chars)

Here's my printf primer:
http://www.pixelbeat.org/programming/gcc/format_specs.html
I always compile with -Wall with gcc, which will warn about any mismatches between the supplied printf formats and variables.

@jj33, you're absolutely right, on both counts.
#include <stdio.h>
int main(int argc, char *argv[]) {
    char *s = "Hello, World";
    char *s2 = "he";
    printf("4s: '%4s'\n", s);
    printf(".4s: '%.4s'\n", s);
    printf("4s2: '%4s'\n", s2);
    printf(".4s2: '%.4s'\n", s2);
    return 0;
}
$ gcc -o foo foo.c
$ ./foo
4s: 'Hello, World'
.4s: 'Hell'
4s2: '  he'
.4s2: 'he'
Good catch!

a. decimal, four significant digits
b. Not sure
c. hex, minimum 4 characters
d. Also hex, minimum 2 characters
e. 1 character
f. String of characters, minimum 4

What you really need is a tool which takes the format strings in printf() statements and converts them into equivalent QString based function calls.
Does anyone want to spend his Free Software Donation Time on developing such a tool?
Placeholder for URL to a Free Software hosting service holding the source code of such a tool

Related

In C what does this line do:

printf("\n%12.6f%12.6f%12.6f", R[1], LS[1], LAMBDA);
The R is an array of floats, LS is also an array of floats, and LAMBDA is a single float variable.
I'm trying to convert a program over to Java but I cannot figure out what this line is trying to do (I am not experienced in C at all).
The printf() function prints to stdout, which is usually the console, and uses a particular variable-replacement syntax in strings - what you're looking at is thus a format string. Breaking it down:
\n : newline
%12.6f : next variable,
width 12 -- padded on the left with spaces to at least 12 characters; a longer value is printed in full, not truncated
precision 6 -- six digits after the decimal point
in decimal floating point format
repeated twice more, once for each remaining argument
The array lookup syntax is the same as in Java, so it's looking up the second (because zero-based indexing) element in R and LS.
It just prints these values to stdout as floating-point numbers in the specified format.
More details can be found at printf().
Here is a tiny test program, and its output:
#include <stdio.h>
int main(int argc, char **argv) {
float R[2] = {1./7. * 10000, 2./7. * 10000};
float LS[2] = {3./7. *10000, 4/7. * 10000};
float LAMBDA = 5/7. *10000;
printf("BEFORE");
printf("\n%12.6f%12.6f%12.6f", R[1], LS[1], LAMBDA);
printf("AFTER\n");
}
BEFORE
 2857.142822 5714.285645 7142.856934AFTER
So: a newline, then R[1], LS[1], and LAMBDA all on one line, each taking up at least 12 characters and printed with six digits after the decimal point. There is no trailing newline, so whatever is printed next continues on the same line.

keeping leading zeros in C [duplicate]

This question already has answers here:
Printing leading 0's in C
(11 answers)
Print numbers sequentially using printf with filling zeroes
(8 answers)
Closed 9 years ago.
I was trying to print an integer in C, but numbers starting with zeroes are causing me a problem.
For example, if the number is 01234, it prints 1234 instead of 01234. Please tell me how to do this in C.
My problem is that there are 2 integers and I want to know whether the first integer appears at the start of the second or not.
For example,
123 and
12345 gives "yes", because 123 (the first integer) is at the beginning of the second integer (12345),
but in the case of
123 and
012345
it should print "no", because 123 is not at the beginning of 012345. But in C the leading zeroes get dropped and my program prints "yes".
Please tell me what to do. (Note: the number of digits can vary within the range of an integer, and the 2nd integer is either equal to or greater than the 1st integer.)
int i = 1234;
printf("%08d", i); // zero-pad to 8 places.
My suggestion: if you are taking the value from stdin from the user and want to print it back with its leading zeros, store it in a string rather than an integer, because leading zeros have no meaning once the value is stored in an integer.
Printing with %s in printf will then retain the number of zeroes that the user entered.
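#include <stdio.h>
#include <string.h>
#define N_DIGITS 64
int main(void)
{
    char a[N_DIGITS], b[N_DIGITS];
    /* %63s leaves room for the terminating '\0' in a 64-byte buffer */
    if (scanf("%63s%63s", a, b) != 2)
        return 1;
    /* "yes" if a is a prefix of b, compared as text so leading zeros count */
    if (strncmp(a, b, strlen(a)) == 0)
        puts("yes");
    else
        puts("no");
    return 0;
}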
Note that the number of digits is limited by the value of N_DIGITS.
The result is shown below:
$ ./a.out <<<"123 12345"
yes
$ ./a.out <<<"123 012345"
no

figure out 2 strings similar or not

Rules:
2 strings, a and b, both of them consist of ASCII chars and non-ASCII chars (say, Chinese Characters gbk-encoded).
If every non-ASCII char contained in b also shows up in a, at least as many times as it appears in b, then we say b is similar to a.
For example:
a = "ab中ef日jkl中本" //non-ASCII chars:'中'(twice), '日'(once), '本'(once)
b = "bej中中日" //non-ASCII chars:'中'(twice), '日'(once)
c = 'lk日日日' //non-ASCII chars:'日'(3 times, but only once in a)
According to the rule, b is similar to a, but c is not.
Here is my question:
We don't know how many non-ASCII chars are there in a and b, probably many.
So to find out how many times a non-ASCII char appears in a and b, am I supposed to use a Hash-Table to store their appearing-times?
Take string a as an example:
[non-ASCII's hash-value]:[times]
中's hash-val : 2
日's hash-val : 1
本's hash-val : 1
Then check string b: if we encounter a non-ASCII char in b, hash it and look it up in a's hash table; if the char is present, decrement its count by 1.
If the char is missing from a's table, or its count drops below 0 (to -1), then we say b is not similar to a.
Or is there any better way?
PS:
I read string a byte by byte; if the byte is less than 128, then I take it as an ASCII char, otherwise I take it as part of a non-ASCII char (multi-byte).
This is what I am doing to find out the non-ASCII chars.
Is it right?
You have asked two questions:
Can we count the non-ASCII characters using a hashtable? Answer: sure. As you read the characters (not the bytes), examine the codepoints. For any codepoint greater than 127, put it into a counting hashtable. That is for a character c, add (c,1) if c is not in the table, and update (c,x) to (c, x+1) if c is in the table already.
Is there a better way to solve this problem than your approach of incrementing counts in a and decrementing as you run through b? If your hashtable implementation gives nearly O(1) access, then I suspect not. You are looking at each character in the string exactly once, and for each character you are doing either a hashtable insert or lookup, an addition or subtraction, and a check against 0. With unsorted strings, you have to look at all the characters in both strings anyway, so you've given, I think, the best solution.
The interviewer might be looking for you to say things like, "Hmmmmm, if these strings were actually massive files that could not fit in memory, what would I do?" Or for you to ask "Well, are the strings sorted? Because if they are, I can do it faster...".
But now let's say the strings are massive. The only thing you are storing in memory is the hashtable. Unicode has only around 1 million codepoints and you are storing an integer count for each, so even if you are getting data from gigabyte sized files you only need around 4MB or so for your hash table (or a small multiple of this, as there will be overhead).
In the absence of any other conditions, your algorithm is nice. Sorting the strings beforehand isn't good; it takes up more memory and isn't a linear-time operation.
ADDENDUM
Since your original comments mentioned the type char as opposed to wchar_t, I thought I'd show an example of using wide strings. See http://codepad.org/B3MXOgqc
Hope that helps.
ADDENDUM 2
Okay here is a C program that shows exactly how to go through a widestring and work at the character level:
http://codepad.org/QVX3QPat
It is a very short program so I will also paste it here:
#include <stdio.h>
#include <string.h>
#include <wchar.h>
/* Assumes the source file and execution character set are UTF-8. */
char *s1 = "abd中日";
wchar_t *s2 = L"abd中日";
int main() {
    size_t i, n;
    /* strlen and wcslen return size_t, so print with %zu */
    printf("length of s1 is %zu\n", strlen(s1));
    printf("length of s2 using wcslen is %zu\n", wcslen(s2));
    printf("The codepoints of the characters of s2 are\n");
    for (i = 0, n = wcslen(s2); i < n; i++) {
        printf("%02x\n", (unsigned)s2[i]);
    }
    return 0;
}
Output:
length of s1 is 9
length of s2 using wcslen is 5
The codepoints of the characters of s2 are
61
62
64
4e2d
65e5
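What can we learn from this? A few things:
If you use plain old char for CJK characters, strlen counts bytes rather than characters (9 bytes here for 5 characters, assuming UTF-8).
To work with whole characters in C, one option is wchar_t (on platforms where it is 32 bits wide, each element holds one codepoint).
Wide string literals have a leading L.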
In this example I defined a string with CJK characters and used wchar_t and a for-loop with wcslen. Please note here that I am working with real characters, NOT BYTES, so I get the correct count of characters, which is 5. Now I print out each codepoint. In your interview question, you will be looking to see if the codepoint is >= 128. I showed them in Hex, as is the culture, so you can look for > 0x7F. :-)
ADDENDUM 3
A few notes in http://tldp.org/HOWTO/Unicode-HOWTO-6.html are worth reading. There is a lot more to character handling than the simple example above shows. In the comments below J.F. Sebastian gives a number of other important links.
One of the things that needs to be addressed is normalization. For example, does your interviewer care whether two strings, one containing just a Ç and the other a C followed by a COMBINING CEDILLA (U+0327), would count as the same? They represent the same character, but one uses one codepoint and the other uses two.

Printing leading 0's in C

I'm trying to find a good way to print leading 0s, such as 01001 for a ZIP Code. While the number would be stored as 1001, what is a good way to do it?
I thought of using either case statements or if to figure out how many digits the number is and then convert it to a char array with extra 0's for printing, but I can't help but think there may be a way to do this with the printf format syntax that is eluding me.
printf("%05d", zipCode);
The 0 indicates what you are padding with and the 5 shows the width of the integer number.
Example 1: "%02d" (useful for dates) pads single-digit numbers with one zero, e.g., 06 instead of 6.
Example 2: "%03d" pads a one-digit number with two zeros and a two-digit number with one zero, e.g., 7 becomes 007 and 17 becomes 017.
The correct solution is to store the ZIP Code in the database as a STRING. Despite the fact that it may look like a number, it isn't. It's a code, where each part has meaning.
A number is a thing you do arithmetic on. A ZIP Code is not that.
You place a zero before the minimum field width:
printf("%05d", zipcode);
sprintf(mystring, "%05d", myInt);
Here, "05" says "pad with leading zeros to a minimum width of 5 characters".
If you are on a *nix machine:
man 3 printf
This will show a manual page, similar to:
0 The value should be zero padded. For d, i, o, u, x, X, a, A, e,
E, f, F, g, and G conversions, the converted value is padded on
the left with zeros rather than blanks. If the 0 and - flags
both appear, the 0 flag is ignored. If a precision is given
with a numeric conversion (d, i, o, u, x, and X), the 0 flag is
ignored. For other conversions, the behavior is undefined.
Even though the question is for C, this page may be of aid.
ZIP Code is a highly localised field, and many countries have characters in their postcodes, e.g., UK, Canada. Therefore, in this example, you should use a string / varchar field to store it if at any point you would be shipping or getting users, customers, clients, etc. from other countries.
However, in the general case, you should use the recommended answer (printf("%05d", number);).
There are two ways to output your number with leading zeroes:
Using the 0 flag and the width specifier:
int zipcode = 123;
printf("%05d\n", zipcode); // Outputs 00123
Using the precision specifier:
int zipcode = 123;
printf("%.5d\n", zipcode); // Outputs 00123
The difference between these is the handling of negative numbers:
printf("%05d\n", -123); // Outputs -0123 (pad to 5 characters)
printf("%.5d\n", -123); // Outputs -00123 (pad to 5 digits)
ZIP Codes are unlikely to be negative, so it should not matter.
Note however that ZIP Codes may actually contain letters and dashes, so they should be stored as strings. Including the leading zeroes in the string is straightforward so it solves your problem in a much simpler way.
Note that in both examples above, the 5 width or precision values can be specified as an int argument:
int width = 5;
printf("%0*d\n", width, 123); // Outputs 00123
printf("%.*d\n", width, 123); // Outputs 00123
There is one more trick to know: a precision of 0 causes no output for the value 0:
printf("|%0d|%0d|\n", 0, 1); // Outputs |0|1|
printf("|%.0d|%.0d|\n", 0, 1); // Outputs ||1|
printf allows various formatting options.
Example:
printf("leading zeros %05d", 123);
You will save yourself a heap of trouble (long term) if you store a ZIP Code as a character string, which it is, rather than a number, which it is not.
A more flexible approach:
Here's an example (C++, since it uses std::string) printing rows of right-justified numbers with fixed widths and space padding.
//---- Header
#include <cstdio>
#include <string>
std::string getFmt ( int wid, long val )
{
    char buf[64];
    snprintf ( buf, sizeof buf, "% *ld", wid, val );
    return buf;
}
// The macro needs a parameter; each result is a temporary std::string that
// lives until the end of the enclosing printf call.
#define FMT(x) (getFmt(8,(x)).c_str())
//---- Put to use
printf ( "      COUNT     USED     FREE\n" );
printf ( "A: %s %s %s\n", FMT(C[0]), FMT(U[0]), FMT(F[0]) );
printf ( "B: %s %s %s\n", FMT(C[1]), FMT(U[1]), FMT(F[1]) );
printf ( "C: %s %s %s\n", FMT(C[2]), FMT(U[2]), FMT(F[2]) );
//-------- Output
      COUNT     USED     FREE
A:      354   148523     3283
B: 54138259 12392759   200391
C:    91239     3281    61423
The function and macro are designed to make the printf calls more readable.
If you need to store the ZIP Code in a character array, zipcode[], you can use this:
snprintf(zipcode, 6, "%05.5d", atoi(zipcode));
