In the C language, when printing something on the screen, we usually use printf, puts and so on. Which are all defined in the or other header documents.
Is there any way to print something on screen without using such functions? That is to say, how is printf realised?
Eventually the C function printf will result in a sys_write system call, directly or by going through write (see man 2 write). The actual implementation depends on the compiler and the standard libraries.
Printing to screen requires access to framebuffer (hardware) and userspace programs are not allowed to have a direct access to it. So what they do is make a system call and kernel performs the required function for them. printf -> write system call -> kernel writes the data into framebuffer and then control is given back to user program.
Even if you don't want to use printf or puts (they are implemented in hosted libc) still you have to use write system call to tell the kernel on which device you want to write the buffer.
The standard headers are not, necessarily, a library containing functions written in C code.
They are functions with C "interfase", however it's very probably that they contain explicit machine code, adapted, in each case, to the target system.
The standard headers provide, in this way, ways of doing special process that it would not be possible to achieve in strict C code.
In the specific case of printf(), the situation is even more clear, because if none header is #include-d, then there is not any mechanism through the use of the C syntax only that performs an Input/Output operation.
library ncurses can help you, but if you want to use a low level function use write() and if you want to do kernel programming you have to use printk()
Related
I am looking to understand the printf() statement at the assembly level. However most of the assembly programs do something like call an external print function whose dependency is met by some other object file that the linker adds on. I would like to know what is inside that print function in terms of system calls and very basic assembly code. I want a piece of assembly code where the only external calls are the system calls, for printf. I'm thinking of something like a de assembled object file. Where can I get something like that??
I would suggest instead to stay first at the C level, and study the source code of some existing C standard library free software implementation on Linux. Look into the source code of musl-libc or of GNU libc (a.k.a. glibc). You'll understand that several intermediate (usually internal) functions are useful between printf and the basic system calls (listed in syscalls(2) ...). Use also strace(1) on a sample C program doing printf (e.g. the usual hello-world example).
In particular, musl-libc has a very readable stdio/printf.c implementation, but you'll need to follow several other C functions there before reaching the write(2) syscall. Notice that some buffering is involved. See also setvbuf(3) & fflush(3). Several answers (e.g. this and that one) explain the chain between functions like printf and system calls (up to kernel code).
I want a piece of assembly code where the only external calls are the system calls, for printf
If you want exactly that, you might start from musl-libc's stdio/printf.c, add any additional source file from musl-libc till you have no more external undefined symbols, and compile all of them with gcc -flto -O2 and perhaps also -S, you probably will finish with a significant part of musl-libc in object (or assembly) form (because printf may call malloc and many other functions!)... I'm not sure it is worth the pain.
You could also statically link your libc (e.g. libc.a). Then the linker will link only the static library members needed by printf (and any other function you are calling).
To be picky, system calls are not actually external calls (your libc write function is actually a tiny wrapper around the raw system call). You could make them using SYSENTER machine instructions (but using vdso(7) is preferable: more portable, and perhaps quicker), and you don't even need a valid stack pointer (on x86_64) to make a system call.
You can write Linux user-level programs without even using the libc; the bones implementation of Scheme is such a program (and you'll find others).
The function printf() is in the standard C library, so it is linked into your program and not copied into it. Dynamically linked libraries save memory because you don't have the exact same code copied in resident memory for every program that uses it.
Think about what printf() does. Interpreting the formatted string and generating the correct output is fairly complex. The series of functions that printf() belongs to also buffers the output. You probably don't really want to re-implement all of this in assembly. The standard C library is omnipresent, and probably available for you.
Maybe you're looking for write(2), which is the system call for unbuffered writes of just bytes to a file descriptor. You'd have to generate the string to print beforehand and format it yourself. (See also open(2) for opening files.)
To disassemble a binary, you can use objdump:
objdump -d binary
where binary is some compiled binary. This gives opcodes and human readable instructions. You probably want to redirect to a file and read elsewhere.
You can disassemble the standard C binary on your system and try to interpret it if you want (strongly not recommended). The problem is that it will be far too complex to understand. Things like printf() were written in C, then compiled and assembled. You can't (within a reasonable number of decades) restore the high level structure from the assembly of a compiled (non-trivial) program. If you really want to try this, good luck.
An easier thing to do is to look at the C source code for printf() itself. The real work is actually done in vfprintf() which is in stdio-common/vfprintf.c of the GNU C library source code.
I have read that C language does not include instructions for input and for output and that printf, scanf, getchar, putchar are actually functions.
Which are the primitive C language instructions to obtain the function printf , then?
Thank you.
If you want to use printf, you have to #include <stdio.h>. That file declares the function.
If you where thinking about how printf is implemented: printf might internally call any other functions and probably goes down to putc (also part of the C runtime) to write out the characters one-by-one. Eventually one of the functions needs to really write the character to the console. How this is done depends on the operating system. On Linux for example printf might internally call the Linux write function. On Windows printf might internally call WriteConsole.
The function printf is documented here; in fact, it is not part of the C language itself. The language itself does not provide a means for input and output. The function printf is defined in a library, which can be accessed using the compiler directive #include <stdio.h>.
No programming language provides true "primitives" for I/O. Any I/O "primitives" rely on lower abstraction levels, in this language or another.
I/O, at the lowest level, needs to access hardware. You might be looking at BIOS interrupts, hardware I/O ports, memory-mapped device controlers, or something else entirely, depending on the actual hardware your program is running on.
Because it would be a real burden to cater for all these possibilities in the implementation of the programming language, a hardware abstraction layer is employed. Individual I/O controllers are accessed by hardware drivers, which in turn are controlled by the operating system, which is providing I/O services to the application developer through a defined, abstract API. These may be accessed directly (e.g. by user-space assembly), or wrapped further (e.g. by the implementation of a programming language's interpreter, or standard library functions).
Whether you are looking at "commands" like (bash) echo or (Python) print, or library functions like (Java) System.out.println() or (C) printf() or (C++) std::cout, is just a syntactic detail: Any I/O is going through several layers of abstraction, because it is easier, and because it protects you from all kinds of erroneous or malicious program behaviour.
Those are the "primitives" of the respective language. If you're digging down further, you are leaving the realm of the language, and enter the realm of its implementation.
I once worked on a C library implementation myself. Ownership of the project has passed on since, but basically it worked like this:
printf() was implemented by means of vfprintf() (as was, eventually, every function of the *printf() family).
vfprintf() used a couple of internal helpers to do the fancy formatting, writing into a buffer.
If the buffer needed to be flushed, it was passed to an internal writef() function.
this writef() function needed to be implemented differently for each target system. On POSIX, it would call write(). On Win32, it would call WriteFile(). And so on.
I am quite new to the FILE family of functions that the standard C library provides.
I recently stumbled across fopen() and the similar functions after researching how stdout, stdin and stderr work alongside functions like printf().
I was wondering, what is needed to use fopen() on an embedded system (which doesn't necessarily have operating system support). After reading more about it, is seems like a cool thing to do on more powerful embedded systems to hook into say, a UART/SPI interface, so that calling printf() would print data out of the UART. Simarly, you could read data from a UART buffer by calling scanf().
This would also increase portability! (code written for say, Linux, would be easier to port if printf() was supported). You could also print debug data to a file if it was running in a production environment, and read from it later.
Can you just use fopen() on a bare-bones embedded system? If so who/where/when is the "FILE" then created (as far as I now, fopen() does not malloc() space for the file, nor do you specify how much)? Or do you need a operating system with FAT file support. If so, would something like http://ultra-embedded.com/?fat_filelib work? Would using FreeRTOS help at all?
Check the documentation for your toolchain's C library - it should have something to say about re-targeting the library.
For example if you are using Newlib you must re-implement some or all of the [syscalls stubs][3] to suit your target. The low level open() syscall in this case will allow fopen() to work as necessary. At its simplest, you might implement open() to support higher-level stdio access to serial ports, but if you are expecting standard file-system access, then you will still need an underlying file-system to map it too.
Another example of re-targeting the Keil/ARM standard library can be found here.
Yes, it's often possible to use fopen() and similar routines in code for embedded systems. The way it often works is that the vendor supplies a C compiler and associated libraries
targeted for their system, which implement some supported subset of the language in a way that's appropriate for that system (e.g. an implementation of printf() that outputs via a UART, or fopen() that uses RAM to simulate some sort of filesystem).
On the Keil compiler, the stdio library is designed to allow the user to define the __FILE structure in any desired fashion. A function like fprintf will perform a sequence of calls to fputc, which will receive a copy of the pointer passed to fprintf. One may define something like fopen to "create" a __FILE and populate its members via any desired means (if there will never be more than one file open at a time, one could simply fill in the fields of a static instance and return that). Variables __stdin, __stdout, and __stderror may likewise be defined as desired (stdin is defined to point to __stdin, and likewise with stdout and stderror).
"Can you just use fopen() on a bare-bones embedded system?"
It depends. Depends on the configuration of your embedded system, the types of memories interfaced, on what memory do you want to implement the file system, the file system library code size (ROM & RAM requirements).
FILE manipulation functions can be used independent of any OS. But a proper file system must be used and FAT is not the only file system (JFFS2, YAFS,...some other proprietary file system)
The file system is generally (but not always) implemented on Flash memories (Nand Flash, Nor Flash). USB device is also a flash (Nand flash). The Nand Flash & Nor Flash may have Parallel interface, I2C interface or SPI interface.
I need to build a OS, a very small and basic one, with actually least functionality, coded in C.
Probably a CUI OS which does some memory management and has at least a text editor and a calculator, its just going to be a experimentation about how to make a code that has full and direct control over your hardware.
Still I'll be requiring an interface, that will need input/output functions like printf(&args), scanf(&args). Now my basic question is should I use existing headers or go for coding actually from scratch, and why so ?
I'd be more than very thankful to you guys for and help.
First, you can't link against anything from libc ... you're going to have to code everything from scratch.
Now having worked on a micro-kernel myself, I would not use the actual stdio headers that come with libc since they are going to be cluttered with a lot of extra information that will be either irrelevant for your OS, or will create compiler errors due to missing definitions, etc. What I would do though is keep the function signatures for these standard functions the same ... so in the end you would have a file called stdio.h for your OS, but it would be a very stripped down header file with the basic minimum requirements for your needs, and only having the standard I/O functions you need, with the correct standard signatures.
Keep in mind on the back-end, i.e., in your stdio.c file, you're going to have to point these functions to a custom console-driver or some other type of character drive for your display. Either that, or you could just use them as wrappers for some other kernel-level display printing routine. You are also going to want to make sure that even though you may use a #include <stdio.h> directive in your other OS code modules to access these printing functions, you do not link against libc. This can be done using gcc -ffreestanding.
Just retarget newlib.
printf, scanf, etc relies on implementation specific funcions to get a single char or print a single char. You can then make your stdin and stdout the UART 1 for example.
Kernel itself would not require the printf and scanf functions, if you do not want to keep the kernel in kernel mode and work the apps you have planned for. But for basic printf and scanf features, you can write your own printf and scanf functions, which would provide basic support for printing ans taking input. I do not have much experience on this, but you can try make a console buffer, where the keyboard driver puts the read in ASCII characters (after conversion from scan codes), and then make the printf and scanf work on it. I have one basic implementation were i have wrote a gets instead of scanf and kept things simple. To get integer output you can write an atoi function to convert the string to a number.
To port in other libraries, you need to make the components which the libraries depend on. You need to make the decision if you can code in those support in the kernel so that the libraries could be ported in. If it is more difficult then coding some basic input output functions i think won't be bad at this stage,
I am trying to fully understand the process pro writing code in some language to execution by OS. In my case, the language would be C and the OS would be Windows. So far, I read many different articles, but I am not sure, whether I understand the process right, and I would like to ask you if you know some good articles on some subjects I couldnĀ“t find.
So, what I think I know about C (and basically other languages):
C compiler itself handles only data types, basic math operations, pointers operations, and work with functions. By work with functions I mean how to pass argument to it, and how to get output from function. During compilation, function call is replaced by passing arguments to stack, and than if function is not inline, its call is replaced by some symbol for linker. Linker than find the function definition, and replace the symbol to jump adress to that function (and of course than jump back to program).
If the above is generally true and I get it right, where to final .exe file actually linker saves the functions? After the main() function? And what creates the .exe header? Compiler or Linker?
Now, additional capabilities of C, today known as C standart library is set of functions and the declarations of them, that other programmers wrote to extend and simplify use of C language. But these functions like printf() were (or could be?) written in different language, or assembler. And there comes my next question, can be, for example printf() function be written in pure C without use of assembler?
I know this is quite big question, but I just mostly want to know, wheather I am right or not. And trust me, I read a lots of articles on the web, and I would not ask you, If I could find these infromation together on one place, in one article. Insted I must piece by piece gather informations, so I am not sure if I am right. Thanks.
I think that you're exposed to some information that is less relevant as a beginning C programmer and that might be confusing you - part of the goal of using a higher level language like this is to not have to initially think about how this process works. Over time, however, it is important to understand the process. I think you generally have the right understanding of it.
The C compiler merely takes C code and generates object files that contain machine language. Most of the object file is taken by the content of the functions. A simple function call in C, for example, would be represented in the compiled form as low level operators to push things into the stack, change the instruction pointer, etc.
The C library and any other libraries you would use are already available in this compiled form.
The linker is the thing that combines all the relevant object files, resolves all the dependencies (e.g., one object file calling a function in the standard library), and then creates the executable.
As for the language libraries are written in: Think of every function as a black box. As long as the black box has a standard interface (the C calling convention; that is, it takes arguments in a certain way, returns values in a certain way, etc.), how it is written internally doesn't matter. Most typically, the functions would be written in C or directly in assembly. By the time they make it into an object file (or as a compiled library), it doesn't really matter how they were initially created, what matters is that they are now in the compiled machine form.
The format of an executable depends on the operating system, but much of the body of the executable in windows is very similar to that of the object files. Imagine as if someone merged together all the object files and then added some glue. The glue does loading related stuff and then invokes the main(). When I was a kid, for example, people got a kick out of "changing the glue" to add another function before the main() that would display a splash screen with their name.
One thing to note, though is that regardless of the language you use, eventually you have to make use of operating system services. For example, to display stuff on the screen, to manage processes, etc. Most operating systems have an API that is also callable in a similar way, but its contents are not included in your EXE. For example, when you run your browser, it is an executable, but at some point there is a call to the Windows API to create a window or to load a font. If this was part of your EXE, your EXE would be huge. So even in your executable, there are "missing references". Usually, these are addressed at load time or run time, depending on the operating system.
I am a new user and this system does not allow me to post more than one link. To get around that restriction, I have posted some idea at my blog http://zhinkaas.blogspot.com/2010/04/how-does-c-program-work.html. It took me some time to get all links, but in totality, those should get you started.
The compiler is responsible for translating all your functions written in C into assembly, which it saves in the object file (DLL or EXE, for example). So, if you write a .c file that has a main function and a few other function, the compiler will translate all of those into assembly and save them together in the EXE file. Then, when you run the file, the loader (which is part of the OS) knows to start running the main function first. Otherwise, the main function is just like any other function for the compiler.
The linker is responsible for resolving any references between functions and variables in one object file with the references in other files. For example, if you call printf(), since you do not define the function printf() yourself, the linker is responsible for making sure that the call to printf() goes to the right system library where printf() is defined. This is done at compile-time.
printf() is indeed be written in pure C. What it does is call a system call in the OS which knows how to actually send characters to the standard output (like a window terminal). When you call printf() in your program, at compile time, the linker is responsible for linking your call to the printf() function in the standard C libraries. When the function is passed at run-time, printf() formats the arguments properly and then calls the appropriate OS system call to actually display the characters.