Can stdio be used while coding for a Kernel...? - c

I need to build a OS, a very small and basic one, with actually least functionality, coded in C.
Probably a CUI OS which does some memory management and has at least a text editor and a calculator, its just going to be a experimentation about how to make a code that has full and direct control over your hardware.
Still I'll be requiring an interface, that will need input/output functions like printf(&args), scanf(&args). Now my basic question is should I use existing headers or go for coding actually from scratch, and why so ?
I'd be more than very thankful to you guys for and help.

First, you can't link against anything from libc ... you're going to have to code everything from scratch.
Now having worked on a micro-kernel myself, I would not use the actual stdio headers that come with libc since they are going to be cluttered with a lot of extra information that will be either irrelevant for your OS, or will create compiler errors due to missing definitions, etc. What I would do though is keep the function signatures for these standard functions the same ... so in the end you would have a file called stdio.h for your OS, but it would be a very stripped down header file with the basic minimum requirements for your needs, and only having the standard I/O functions you need, with the correct standard signatures.
Keep in mind on the back-end, i.e., in your stdio.c file, you're going to have to point these functions to a custom console-driver or some other type of character drive for your display. Either that, or you could just use them as wrappers for some other kernel-level display printing routine. You are also going to want to make sure that even though you may use a #include <stdio.h> directive in your other OS code modules to access these printing functions, you do not link against libc. This can be done using gcc -ffreestanding.

Just retarget newlib.
printf, scanf, etc relies on implementation specific funcions to get a single char or print a single char. You can then make your stdin and stdout the UART 1 for example.

Kernel itself would not require the printf and scanf functions, if you do not want to keep the kernel in kernel mode and work the apps you have planned for. But for basic printf and scanf features, you can write your own printf and scanf functions, which would provide basic support for printing ans taking input. I do not have much experience on this, but you can try make a console buffer, where the keyboard driver puts the read in ASCII characters (after conversion from scan codes), and then make the printf and scanf work on it. I have one basic implementation were i have wrote a gets instead of scanf and kept things simple. To get integer output you can write an atoi function to convert the string to a number.
To port in other libraries, you need to make the components which the libraries depend on. You need to make the decision if you can code in those support in the kernel so that the libraries could be ported in. If it is more difficult then coding some basic input output functions i think won't be bad at this stage,

Related

Are functions such as printf() implemented differently for Linux and Windows

Something I still don't fully understand. For example, standard C functions such as printf() and scanf() which deal with sending data to the standard output or getting data from the standard input. Will the source code which implements these functions be different depending on if we are using them for Windows or Linux?
I'm guessing the quick answer would be "yes", but do they really have to be different?
I'm probably wrong , but my guess is that the actual function code be the same, but the lower layer functions of the OS that eventually get called by these functions are different. So could any compiler compile these same C functions, but it is what gets linked after (what these functions depend on to work on lower layers) is what gives us the required behavior?
Will the source code which implements these functions be different
depending on if we are using them for Windows or Linux?
Probably. It may even be different on different Linuxes, and for different Windows programs. There are several distinct implementations of the C standard library available for Linux, and maybe even more than one for Windows. Distinct implementations will have different implementation code, otherwise lawyers get involved.
my guess is that the actual function code be the same, but the lower
layer functions of the OS that eventually get called by these
functions are different. So could any compiler compile these same C
functions, but it is what gets linked after (what these functions
depend on to work on lower layers) is what gives us the required
behavior?
It is conceivable that standard library functions would be written in a way that abstracts the environment dependencies to some lower layer, so that the same source for each of those functions themselves can be used in multiple environments, with some kind of environment-specific compatibility layer underneath. Inasmuch as the GNU C library supports a wide variety of environments, it serves as an example of the general principle, though Windows is not among the environments it supports. Even then, however, the environment distinction would be effective even before the link stage. Different environments have a variety of binary formats.
In practice, however, you are very unlikely to see the situation you describe for Windows and Linux.
Yes, they have different implementations.
Moreover you might be using multiple different implementations on the same OS. For example:
MinGW is shipped with its own implementation of standard library which is different from the one used by MSVC.
There are many different implementations of C library even for Linux: glibc, musl, dietlibc and others.
Obviously, this means there is some code duplication in the community, but there are many good reasons for that:
People have different views on how things should be implemented and tested. This alone is enough to "fork" the project.
License: implementations put some restrictions on how they can be used and might require some actions from the end user (GPL requires you to share your code in some cases). Not everyone can follow those requirements.
People have very different needs. Some environments are multithreaded, some are not. printf might need or might not need to use some thread synchronization mechanisms. Some people need locale support, some don't. All this can bloat the code in the end, not everyone is willing to pay for things they do not use. Even strerror is vastly different on different OSes.
Aforementioned synchronization mechanisms are usually OS-specific and work in specific ways. Same can be said about locale handling, signal handling and other things, including the actual data writing and reading.
Some implementations add non-standard extensions that can make your life easier. Not all of those make sense on other OSes. For example glibc adds 'e' mode specifier to open file with O_CLOEXEC flag. This doesn't make sense for Windows.
Many complex things cannot be implemented in pure C and require some compiler-specific extensions. This can tie implementation to a limited number of compilers.
In the end, it is much simpler to have many C libraries, than trying to create a one-size-fits-all implementation.
As you say the higher level parts of the implementation of something like printf, like the code used to format the string using the arguments, can be written in a cross-platform way and be shared between Linux and Windows. I'm not sure if there's a C library that actually does it though.
But to interact with the hardware or use other operating system facilities (such as when printf writes to the console), the libc implementation has to use the OS's interface: the system calls. And these are very different between Windows and Unix-likes, and different even among Unix-likes (POSIX specifies a lot of them but there are OS specific extensions). For example here you can find system call tables for Linux and Windows.
There are two parts to functions like printf(). The first part parses the format string, and assembles an array of characters ready for output. If this part is written in C, there's no reason preventing it being common across all C libraries, and no reason preventing it being different, so long the standard definition of what printf() does is implemented. As it happens, different library developers have read the standard's definition of printf(), and have come up with different ways of parsing and acting on the format string. Most of them have done so correctly.
The second part, the bit that outputs those characters to stdout, is where the differences come in. It depends on using the kernel system call interface; it's the kernel / OS that looks after input/output, and that is done in a specific way. The source code required to get the Linux kernel to output characters is very different to that required to get Windows to output characters.
On Linux, it's usual to use glibc; this does some elaborate things with printf(), buffering the output characters in a pipe until a newline is output, and only then calling the Linux system call for displaying characters on the screen. This means that printf() calls from separate threads are neatly separated, each being on their own line. But the same program source code, compiled against another C library for Linux, won't necessarily do the same thing, resulting in printf() output from different threads being all jumbled up and unreadable.
There's also no reason why the library that contains printf() should be written in C. So long as the same function calling convention as used by the C compiler is honoured, you could write it in assembler (though that'd be slightly mad!). Or Ada (calling convention might be a bit tricky...).
Will the source code which implements these functions be different
Let us try another point-of-view: competition.
No. Competitors in industry are not required by the C spec to share source code to issue a compliant compiler - nor would various standard C library developers always want to.
C does not require "open source".

How to print something without using std lib functions?

In the C language, when printing something on the screen, we usually use printf, puts and so on. Which are all defined in the or other header documents.
Is there any way to print something on screen without using such functions? That is to say, how is printf realised?
Eventually the C function printf will result in a sys_write system call, directly or by going through write (see man 2 write). The actual implementation depends on the compiler and the standard libraries.
Printing to screen requires access to framebuffer (hardware) and userspace programs are not allowed to have a direct access to it. So what they do is make a system call and kernel performs the required function for them. printf -> write system call -> kernel writes the data into framebuffer and then control is given back to user program.
Even if you don't want to use printf or puts (they are implemented in hosted libc) still you have to use write system call to tell the kernel on which device you want to write the buffer.
The standard headers are not, necessarily, a library containing functions written in C code.
They are functions with C "interfase", however it's very probably that they contain explicit machine code, adapted, in each case, to the target system.
The standard headers provide, in this way, ways of doing special process that it would not be possible to achieve in strict C code.
In the specific case of printf(), the situation is even more clear, because if none header is #include-d, then there is not any mechanism through the use of the C syntax only that performs an Input/Output operation.
library ncurses can help you, but if you want to use a low level function use write() and if you want to do kernel programming you have to use printk()

Stand-alone portable snprintf(), independent of the standard library?

I am writing code for a target platform with NO C-runtime. No stdlib, no stdio. I need a string formatting function like snprintf but that should be able to run without any dependencies, not even the C library.
At most it can depend on memory alloc functions provided by me.
I checked out Trio but it needs stdio.h header. I can't use this.
Edit
Target platform : PowerPC64 home made OS(not by me). However the library shouldn't rely on OS specific stuff.
Edit2
I have tried out some 3rd-party open source libs, such as Trio(http://daniel.haxx.se/projects/trio/), snprintf and miniformat(https://bitbucket.org/jj1/miniformat/src) but all of them rely on headers like string.h, stdio.h, or(even worse) stdlib.h. I don't want to write my own implementation if one already exists, as that would be time-wasting and bug-prone.
Try using the snprintf implementation from uclibc. This is likely to have the fewest dependencies. A bit of digging shows that snprintf is implemented in terms of vsnprintf which is implemented in terms of vfprintf (oddly enough), it uses a fake "stream" to write to string.
This is a pointer to the code: http://git.uclibc.org/uClibc/tree/libc/stdio/_vfprintf.c
Also, a quick google search also turned up this:
http://www.ijs.si/software/snprintf/
http://yallara.cs.rmit.edu.au/~aholkner/psnprintf/psnprintf.html
http://www.jhweiss.de/software/snprintf.html
Hopefully one is suitable for your purposes. This is likely to not be a complete list.
There is a different list here:
http://trac.eggheads.org/browser/trunk/src/compat/README.snprintf?rev=197
You will probably at least need stdarg.h or low level knowledge of the specific compiler/architecture calling convention in order to be able to process the variadic arguments.
I have been using code based on Kustaa Nyholm's implementation It provides printf() (with user supplied character output stub) and sprintf(), but adding snprintf() would be simple enough. I added vprintf() and vsprintf() for example in my implementation.
No dynamic memory application is required, but it does have a dependency on stdarg.h, but as I said, you are unlikely to be able to get away without that for any variadic function - though you could potentially implement your own.
I am guessing you are in a norming enivronment where you need to explicitly document and verify COTS code.
However, I think in the case of stdarg.h this is worthwhile. You could pull in the source for just this and treat it like handwritten code (review, lint, unit-test, etc.). Any self-written replacement will be a lot of work, probably less stable and absolutely not portable.
That said, the actual snprintf implementation should not be too hard, and you could do this yourself, probably. Especially if you might be able to strip a few features away.
Keep in mind that vararg code has no typechecking and is prone to errors. For library snprintf you may find gcc's warnings helpful.

hidden routines linked in c program

Hullo,
When one disasembly some win32 exe prog compiled by c compiler it
shows that some compilers links some 'hidden' routines in it -
i think even if c program is an empty one and has a 5 bytes or so.
I understand that such 5 bytes is enveloped in PE .exe format but
why to put some routines - it seem not necessary for me and even
somewhat annoys me. What is that? Can it be omitted? As i understand
c program (not speaking about c++ right now which i know has some
initial routines) should not need such complementary hidden functions..
Much tnx for answer, maybe even some extended info link, cause this
topic interests me much
//edit
ok here it is some disasembly Ive done way back then
(digital mars and old borland commandline (i have tested also)
both make much more code, (and Im specialli interested in bcc32)
but they do not include readable names/symbols in such dissassembly
so i will not post them here
thesse are somewhat readable - but i am not experienced in understending
what it is ;-)
https://dl.dropbox.com/u/42887985/prog_devcpp.htm
https://dl.dropbox.com/u/42887985/prog_lcc.htm
https://dl.dropbox.com/u/42887985/prog_mingw.htm
https://dl.dropbox.com/u/42887985/prog_pelles.htm
some explanatory comments whats that heere?
(I am afraid maybe there is some c++ sh*t here, I am
interested in pure c addons not c++ though,
but too tired now to assure that it was compiled in c
mode, extension of compiled empty-main prog was c
so I was thinking it will be output in c not c++)
tnx for longer explanations what it is
Since your win32 exe file is a dynamically linked object file, it will contain the necessary data needed by the dynamic linker to do its job, such as names of libraries to link to, and symbols that need resolving.
Even a program with an empty main() will link with the c-runtime and kernel32.dll libraries (and probably others? - a while since I last did Win32 dev).
You should also be aware that main() is only the entry point of your program - quite a bit has already gone on before this point such as retrieving and tokening the command-line, setting up the locale, creating stderr, stdin, and stdout and setting up the other mechanism required by the c-runtime library such a at_exit(). Similarly, when your main() returns, the runtime does some clean-up - and at the very least needs to call the kernel to tell it that you're done.
As to whether it's necessary? Yes, unless you fancy writing your own program prologue and epilogue each time. There are probably are ways of writing minimal, statically linked applications if you're sufficiently masochistic.
As for storage overhead, why are you getting so worked up? It's not enough to worry about.
There are several initialization functions that load whenever you run a program on Windows. These functions, among other things, call the main() function that you write - which is why you need either a main() or WinMain() function for your program to run. I'm not aware of other included functions though. Do you have some disassembly to show?
You don't have much detail to go on but I think most of what you're seeing is probably the routines of the specific C runtime library that your compiler works with.
For instance there will be code enabling it to run from the entry point 'main' which portable executable format understands to call the main(char ** args) that you wrote in your C program.

Ncurses programs in pseudo-terminals

In my continuing attempt to understand how pseudo-terminals work, I have written a small program to try to run bash.
The problem is, my line-breaking seems to be off. (The shell prompt only appears AFTER I press enter.)
Furthermore, I still cannot properly use ncurses programs, like vi. Can anyone tell me how to setup the pseudo-terminal for this?
My badly written program can be found here, I encourage you to compile it. The operating system is GNU/Linux, thanks.
EDIT: Compile like this: gcc program.c -lutil -o program
EDIT AGAIN: It looks like the issue with weird spacing was due to using printf(), still doesn't fix the issue with ncurses programs though.
There are several issues in your program. Some are relatively easy to fix - others not so much:
forkpty() and its friends come from BSD and are not POSIX compatible. They should be avoided for new programs. From the pty(7) manual page:
Historically, two pseudoterminal APIs have evolved: BSD and System V. SUSv1 standardized a pseudoterminal API based on the System V API, and this API should be employed in all new programs that use pseudoterminals.
You should be using posix_openpt() instead. This issue is probably not critical, but you should be aware of it.
You are mixing calls to raw system calls (read(), write()) and file-stream (printf(), fgets()) functions. This is a very good way to confuse yourself. In general you should choose one approach and stick with it. In this case, it's probably best to use the low-level system calls (read(), write()) to avoid any issues that would arise from the presence of the I/O buffers that the C library functions use.
You are assuming a line-based paradigm for your terminals, by using printf() and fgets(). This is not always true, especially when dealing with interactive programs like vim.
You are assuming a C-style single-byte null-terminated string paradigm. Terminals normally deal with characters and bytes - not strings. And while most character set encodings avoid using a zero byte, not all do so.
As a result of (2), (3) and (4) above, you are not using read() and write() correctly. You should be using their return values to determine how many bytes they processed, not string-based functions like strlen().
This is the issue that, in my opinion, will be most difficult to solve: You are implicitly assuming that:
The terminal (or its driver) is stateless: It is not. Period. There are at least two stateful controls that I suspect are the cause of ncurses-based programs not working correctly: The line mode and the local echo control of the terminal. At least these have to match between the parent/master and the slave terminal in order to avoid various strange artifacts.
The control-interface of a terminal can be passed-through just by passing the bytes back and forth: It is not always so. Modern virtual terminals allow for a degree of out-of-band control via ioctl() calls, as described for Linux here.
The simplest way to deal with this issue is probably to set the parent terminal to raw mode and let the slave pseudo-terminal driver deal with the awkward details.
You may want to have a look at this program which seems to work fine. It comes from the book The Linux Programming Interface and the full source code is here. Disclaimer: I have not read the book, nor am I promoting it - I just found the program using Google.

Resources