printf's missing binary format specifiers: `%b` and `%B` - c

I'm using Ubuntu 22.04 for some software development and working with binary data. I recently found this announcement for glibc 2.35; it states that version 2.35 adds the binary conversion format specifiers %b and %B:
printf-family functions now support the %b format for output of
integers in binary, as specified in draft ISO C2X, and the %B variant
of that format recommended by draft ISO C2X.
I have checked my glibc version as follows, but I'm not sure this is the authoritative way to get this information:
$ ldd --version
ldd (Ubuntu GLIBC 2.35-0ubuntu3.1) 2.35
It appears I can use the binary format specifiers %b and %B, but...
I installed the Raspberry Pi C SDK about a month ago on this Ubuntu machine for a micro-controller project. I've used the binary format specifiers %b and %B in printf as follows:
printf("\n binary values stored in rcvdata[0]: %#b \n", rcvdata[0]);
printf(" binary values stored in rcvdata[1]: %#b \n", rcvdata[1]);
The code compiles on the Pico without error or warning, and the output to my serial terminal (minicom 2.8) is as follows:
binary values stored in rcvdata[0]: 0b11111111
binary values stored in rcvdata[1]: 0b11011000
So yeah - I can use the binary format specifiers in my RPi Pico C SDK development environment.
But then I tried to use similar printf statements in a small program I compiled with gcc to run on Ubuntu. It runs, but the compiler gives me 2 warnings for each usage:
The code:
#include <stdio.h>
int main() {
char data_char = 0x41;
unsigned char u_data_char = 8;
int int_val = -39;
printf("\nSome experiments:");
printf("\nvalue of data_char: %#b", data_char);
printf("\nvalue of u_data_char: %#b", u_data_char);
printf("\nvalue of int_val: %#10b", int_val);
printf("\n\n");
return 0;
}
Compilation output (partial):
$ gcc -o printf_binary_3 printf_binary_3.c
printf_binary_3.c: In function ‘main’:
printf_binary_3.c:10:37: warning: unknown conversion type character ‘b’ in format [-Wformat=]
10 | printf("\nvalue of data_char: %#b", data_char);
| ^
printf_binary_3.c:10:12: warning: too many arguments for format [-Wformat-extra-args]
10 | printf("\nvalue of data_char: %#b", data_char);
The execution result:
$ ./printf_binary_3
Some experiments:
value of data_char: 0b1000001
value of u_data_char: 0b1000
value of int_val: 0b11111111111111111111111111011001
I don't know what to make of this:
Why does gcc yield warnings & state unknown conversion type character ‘b’ ??
I'd guess the warning [-Wformat-extra-args] is the result of failure to recognize %b as a legitimate format specifier??
Finally - can anyone explain why I get what appears to be valid results when the compiler says the format specifier I used doesn't exist, and how to fix this?
And this may be useful information: I learned during this Q&A that the documentation for the binary specifiers is not complete as of this writing.


I have difficulties with putwchar() in c

Certainly, my problem is not new, so I apologize if my error is simply too stupid.
I just wanted to become familiar with putwchar and simply wrote the following little piece of code:
#include <stdio.h>
#include <wchar.h>
#include <locale.h>
int main(void)
{
char *locale = setlocale(LC_ALL, "");
printf ("Locale: %s\n", locale);
//setlocale(LC_CTYPE, "de_DE.utf8");
wchar_t hello[]=L"Registered Trademark: ®®\nEuro sign: €€\nBritisch Pound: ££\nYen: ¥¥\nGerman Umlauts: äöüßÄÖÜ\n";
int index = 0;
while (hello[index]!=L'\0'){
//printf("put liefert: %d\n", putwchar(hello[index++]));
putwchar(hello[index++]);
};
}
Now, the output is simply:
Locale: de_DE.UTF-8
Registered Trademark: ��
Euro sign: ��
Britisch Pound: ��
Yen: ��
German Umlauts: �������
\[1\]+ Fertig gedit versuch.c
None of the non-ASCII chars appeared on the screen.
As you can see in the comment (I am well aware that I must not mix putwchar and printf in the same program, hence the line is commented out), putwchar returned the proper Unicode code point for the character I wanted to print. Thus, the call is supposed to work (at least to my understanding).
The C source is encoded in UTF-8:
$ file versuch.c
versuch.c: C source, UTF-8 Unicode text
my system is Ubuntu Linux 20.04.05
compiler: gcc version 9.4.0 (Ubuntu 9.4.0-1ubuntu1~20.04.1)
I would greatly appreciate any advice on this one.
As stated above: I simply expected the trademark sign, yen, € and the umlauts äöüßÄÖÜ to appear.
You shouldn't mix normal and wide output on the same stream.
I get the expected output if I change this early print:
printf ("Locale: %s\n", locale);
into a wide print:
wprintf(L"Locale: %s\n", locale);
Then the subsequent putwchar() calls write the expected characters.
You cannot mix narrow and wide I/O in the same stream (7.21.2). If you want putwchar, you cannot use printf. Start with wprintf instead (with the wide format string):
wprintf (L"Locale: %s\n", locale);
You can simply print those wide characters as shown below:
wprintf(L"Registered Trade Mark: %ls\n", L"®®");
wprintf(L"Euro Sign: %ls\n", L"€€");
wprintf(L"British Pound: %ls\n", L"££");
wprintf(L"Yen: %ls\n", L"¥¥");
wprintf(L"German Umlauts: %ls\n", L"äöüßÄÖÜ");
Please refer to:
https://stackoverflow.com/a/37587933/2805824
https://stackoverflow.com/a/7696033/2805824

[msys2][gcc] Wrong output of wprintf() compiling with MSys2's gcc 10.2

I need to get this stripped-down C program working on Windows 7 using MSys2's gcc toolchain:
#include <stdio.h>
void wmain(int argc, wchar_t *argv[])
{
for (int i = 1; i < argc; i++)
wprintf(L"%s\n", argv[i]);
}
The code compiles with
gcc -Wall -municode -O2 -march=x86-64 -m64 test.c
but gives me the following output
>> ./a.exe kk лл
k (!)
:?:?
I have the following questions:
What am I doing wrong?
How would I downgrade the compiler to version, say, 9.x or 10.1? (I'm under the impression that the very same program compiled about one year ago used to work correctly.)
Edit [1]: Meanwhile I managed to set up a new MSys2 environment using gcc 9.3. The "error" persists, so it's not the compiler.
Edit [2]: "Some programmer dude" (see below) described the "immediate" solution (THX!).
Even for the wide-character wprintf the format %s is for narrow character strings.
You need to use %ls to print wide-character strings:
wprintf(L"%ls\n", argv[i]);
However this might still not be enough, as the actual encoding of the input (including arguments) might not be what's expected. You need to take into account the encoding used by the terminal the program is running in.

Printf not printing to terminal in Apple Clang

We have started C programming in my Uni, and I appear to have fallen at the first hurdle. My very simple program will not print to the terminal. The code:
#include "stdio.h"
int main(){
printf("Memory size for type %s = %lu \n", "double", sizeof(double));
return 0;
}
I have used all my google-fu, and have only found that I apparently should use vprint, but it won't take three arguments, only two. Also, bizarrely, printing to a file works! See screenshot:
[Image: terminal screenshot]
The format specifier for size_t (the type of the value that sizeof yields) is %zu.
printf("Memory size for type %s = %zu \n", "double", sizeof(double));
Alright, apparently, when I simply run the program, Clang prints the output to a.out in the same directory as the code, and there's nothing I can do about it. Whatever, as long as the code works - means I can turn it in, and I'll be checking my work with ./a.out.
According to your screenshot, it appears that there is a misconception here:
gcc 1.c
gcc is used to compile the program (creating the executable, here using the source file 1.c), not to run it. The program might have been compiled under the name 'a.out'.
When you compile your program using:
gcc 1.c -o 1.txt
You are actually compiling the program with gcc and, using the option -o, creating the executable under the name 1.txt (a program can be named almost anything you want).
Then when you type
./1.txt
you are actually running the program (1.txt) and you have the expected output.

How are library functions linked in this case?

I just came across this code, and the blog says it works fine on a 32-bit architecture. I didn't test it; however, I have a doubt about how the libraries are linked in this case. How will the compiler link the string library to main, since it's not aware of which library to link?
So basically if I include <string.h> then it should work fine; however, if I don't include <string.h> then, as per the blog, it runs in 32 bit architecture and fails to run on 64 bit architecture.
#include <errno.h>
#include <stdio.h>
int main(int argc, char *argv[])
{
FILE *fp;
fp = fopen(argv[1], "r");
if (fp == NULL) {
fprintf(stderr, "%s\n", strerror(errno));
return errno;
}
printf("file exist\n");
fclose(fp);
return 0;
}
The code shown will only compile if you allow the compiler to infer that functions that are not declared always return an int. This was valid in C89/C90 but marked obsolescent; C99 and C11 require functions to be declared before they are used. GCC prior to version 5.1.0 assumes C90 mode by default; you had to turn the 'reject this code' warnings on. GCC 5.1.0 and onwards assumes C11 by default. You will at least get warnings from the code even without any compilation options to turn them on.
The code will link fine because the function name is strerror() regardless of whether it was declared or not, and the linker can find the function in the standard C library. In general, all the functions that are in the Standard C library are automatically made available for linking — and, indeed, there are usually a lot of not so standard functions also available. C does not have type-safe linkage as C++ does (but C++ also insists on having every function declared before it is used, so the code would not compile as C++ without the header.)
For historical reasons, the maths library was separate and you needed to specify -lm in order to link it. This was in large part because hardware floating point was not universal, so some machines needed a library using the hardware, and other machines needed software emulation of the floating point arithmetic. Some platforms (Linux, for example) still require a separate -lm option if you use functions declared in <math.h> (and probably <tgmath.h>); other platforms (Mac OS X, for example) do not — there is a -lm to satisfy build systems that link it, but the maths functions are in the main C library.
If the code is compiled on a fairly standard 32-bit platform with ILP32 (int, long, pointer all 32-bit), then for many architectures, assuming that strerror() returns an int assumes that it returns the same amount of data as if it returns a char * (which is what strerror() actually returns). So, when the code pushes the return value from strerror() onto the stack for fprintf(), the correct amount of data is pushed.
Note that some architectures (notably the Motorola M680x0 series) would return addresses in an address register (A0) and numbers in a general register (D0), so there would be problems even on those machines with a 32-bit compilation: the compiler would try to get the returned value from the data register instead of the address register, and that was not set by strerror() — leading to chaos.
With a 64-bit architecture (LP64), assuming strerror() returns a 32-bit int means that the compiler will only collect 32-bits of the 64-bit address returned by strerror() and push that on the stack for fprintf() to work with. When it tried to treat the truncated address as valid, things would go awry, often leading to a crash.
When the missing <string.h> header is added, the compiler knows that the strerror() function returns a char * and all is happiness and delight once more, even when the file the program is told to look for doesn't exist.
If you are wise, you will ensure your compiler is always compiling in fussy mode, rejecting anything which is plausibly erroneous. When I use my default compilation on your code, I get:
$ gcc -std=c11 -O3 -g -Wall -Wextra -Werror -Wmissing-prototypes \
> -Wstrict-prototypes -Wold-style-definition bogus.c -o bogus
bogus.c: In function ‘main’:
bogus.c:10:33: error: implicit declaration of function ‘strerror’ [-Werror=implicit-function-declaration]
fprintf(stderr, "%s\n", strerror(errno));
^
bogus.c:10:25: error: format ‘%s’ expects argument of type ‘char *’, but argument 3 has type ‘int’ [-Werror=format=]
fprintf(stderr, "%s\n", strerror(errno));
^
bogus.c:10:25: error: format ‘%s’ expects argument of type ‘char *’, but argument 3 has type ‘int’ [-Werror=format=]
bogus.c:4:14: error: unused parameter ‘argc’ [-Werror=unused-parameter]
int main(int argc, char *argv[])
^
cc1: all warnings being treated as errors
$
The 'unused argument' error reminds you that you should be checking that there is an argument to pass to fopen() before you try to open the file.
Fixed code:
#include <string.h>
#include <errno.h>
#include <stdio.h>
int main(int argc, char *argv[])
{
FILE *fp;
if (argc != 2)
{
fprintf(stderr, "Usage: %s file\n", argv[0]);
return 1;
}
fp = fopen(argv[1], "r");
if (fp == NULL)
{
fprintf(stderr, "%s: file %s could not be opened for reading: %s\n",
argv[0], argv[1], strerror(errno));
return errno;
}
printf("file %s exists\n", argv[1]);
fclose(fp);
return 0;
}
Build:
$ gcc -std=c11 -O3 -g -Wall -Wextra -Werror -Wmissing-prototypes \
> -Wstrict-prototypes -Wold-style-definition bogus.c -o bogus
$
Run:
$ ./bogus bogus
file bogus exists
$ ./bogus bogus2
./bogus: file bogus2 could not be opened for reading: No such file or directory
$ ./bogus
Usage: ./bogus file
$
Note that the error messages include the program name and report to standard error. When the file is known, the error message includes the file name; it is much easier to debug that error if the program is in a shell script than if the message is just:
No such file or directory
with no indication of which program or which file encountered the problem.
When I remove the #include <string.h> line from the fixed code shown, then I can compile it and run it like this:
$ gcc -o bogus90 bogus.c
bogus.c: In function ‘main’:
bogus.c:18:35: warning: implicit declaration of function ‘strerror’ [-Wimplicit-function-declaration]
argv[0], argv[1], strerror(errno));
^
$ gcc -std=c90 -o bogus90 bogus.c
$ ./bogus90 bogus11
Segmentation fault: 11
$
This was tested with GCC 5.1.0 on Mac OS X 10.10.5 — which is, of course, a 64-bit platform.
I solved it by including the string.h header:
#include <string.h>
I don't think the functionality of this code would be affected by whether it's a 32-bit or 64-bit architecture: it doesn't matter if pointers are 32- or 64-bit, and if long int is 32 or 64 bit. Inclusion of headers, in this case string.h, should not affect linking to libraries, either. Header inclusion matters to the compiler, not the linker. The compiler might warn about the function being implicitly declared, but as long as the linker can find the function in one of the libraries it searches, it will successfully link the binary, and it should run just fine.
I just built and ran this code successfully on a 64-bit CentOS box, using clang 3.6.2. I did get this compiler warning:
junk.c:10:33: warning: implicitly declaring library function 'strerror' with type 'char *(int)'
fprintf(stderr, "%s\n", strerror(errno));
^
junk.c:10:33: note: include the header <string.h> or explicitly provide a declaration for 'strerror'
1 warning generated.
The program was given a non-existent file name, and the error message, "No such file or directory," was meaningful. However, this is because the strerror() function is a well-known standard library function, and its declaration was correctly guessed by the compiler. If it is a user-defined function, the compiler may not be so "lucky" at guessing, and then the architecture can matter, as suggested by other answers.
So, the lesson learned: make sure function declarations are available to the compiler and heed the warnings!

C: Explanation for %No as a format specifier for printf

The output of the following piece of code (considering %No as a string),
#include <stdio.h>
int main(void) {
// your code goes here
printf("%No");
return 0;
}
on a linux machine is: %No
while on a windows machine is: 13
For the following code (considering %No as a format specifier), output
#include <stdio.h>
int main(void) {
// your code goes here
printf(" %c %No %d",65,65,23);
return 0;
}
on a linux machine is: A %No 23
while on a windows machine is: A 101 23
The output on the windows machine keeps on varying with different arguments for %No specifier. Any explanation about this specifier would be very helpful.
Thanks in Advance.
output of gcc -v on my windows machine
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=c:/program files (x86)/codeblocks/mingw/bin/../libexec/gcc/mingw32/4.7.1/lto-wrapper.exe
Target: mingw32
Configured with: ../../src/gcc-4.7.1/configure --build=mingw32 --enable-languages=c,c++,ada,fortran,objc,obj-c++ --enable-threads=win32 --enable-libgomp --enable-lto --enable-fully-dynamic-string --enable-libstdcxx-debug --enable-version-specific-runtime-libs --with-gnu-ld --disable-nls --disable-win32-registry --disable-symvers --disable-build-poststage1-with-cxx --disable-werror --prefix=/mingw32tdm --with-local-prefix=/mingw32tdm --enable-cxx-flags='-fno-function-sections -fno-data-sections' --with-pkgversion=tdm-1 --enable-sjlj-exceptions --with-bugurl=http://tdm-gcc.tdragon.net/bugs
Thread model: win32
gcc version 4.7.1 (tdm-1)
EDIT:
Got my answer: as cremno pointed out, %N does nothing and %o prints in octal, so in %No the N is ignored and %o is used to print the passed argument as octal. The remaining question is why passing no argument is taken as 11 (in decimal), which prints as 13 in octal. FYI, in ASCII 11 represents a vertical tab.
You have mismatched printf format string and arguments; that results in undefined behaviour. Anything could happen. If you want to print a literal %, use %% in the format string.
