I am planning to study operating systems, and I have two questions. Why should we not use library functions when creating an operating system?
What is the drawback of using them?
It depends on what you mean by "library functions".
You absolutely should try to use someone else's version of the functions from <string.h>, for example. If you're writing an OS, you've got plenty to do, so why re-invent the wheel with something as simple as strcpy?
You should use whatever open-source code you can, as long as it has no dependencies: simple "leaf" functions like strcpy. If you look at the Linux kernel source code, you will certainly see standard library functions like memcpy and strlen. But you'll also see things like strncpy_from_user, which are adapted to particular uses in the kernel (in this case copying a string from user space to kernel space).
What you shouldn't try to use (if it isn't obvious already) are things like fopen. fopen is a wrapper around code that makes a system call to the kernel, which handles the actual opening of the file. If you are the kernel, there is no kernel underneath you to take that call, so you can't use it.
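As a rough illustration of what "leaf" means here, the following sketch shows string helpers that depend on nothing but <stddef.h>, so they could be compiled into a kernel as they are. The k_ prefix is a hypothetical naming choice made only so the functions do not clash with a hosted libc; it is not taken from any particular kernel.

```c
#include <stddef.h>  /* size_t; this header is available in freestanding builds */

/* Count the bytes before the terminating NUL. */
size_t k_strlen(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}

/* Copy src, including its terminating NUL, into dst; return dst. */
char *k_strcpy(char *dst, const char *src)
{
    char *ret = dst;
    while ((*dst++ = *src++) != '\0')
        ;
    return ret;
}

/* Copy n bytes from src to dst; the regions must not overlap. */
void *k_memcpy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    while (n--)
        *d++ = *s++;
    return dst;
}
```

None of these functions calls anything else, which is exactly why they are safe to borrow. fopen, by contrast, ends in a system call.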
Related
I am interested in programming my own OS from scratch (in C).
However, every tutorial I encounter prints a message on the screen by writing directly to the VDU. Why can't I use standard library functions while writing my OS? I don't have much of a problem with writing directly to the VDU, but it sometimes leaves me utterly confused (especially in large programs).
Are the library functions not converted into the same low-level code as the functions created by us?
This is a kind of chicken-and-egg problem: Standard library functions use OS functions to print to the screen (deep down there, someone actually needs to write directly to the hardware).
Without an OS (because you just start writing one) this will not work. The standard library you want to use will need to be written specifically for and together with your OS.
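To make "someone actually needs to write directly to the hardware" concrete, here is a minimal sketch of a kernel-side text output routine for VGA text mode. On a real PC the buffer would sit at physical address 0xB8000; in this sketch it is an ordinary array (fake_vram, a made-up name) so the code can also run as a normal program. The names term_putc and term_puts are likewise assumptions, not part of any standard.

```c
#include <stddef.h>
#include <stdint.h>

#define VGA_COLS 80
#define VGA_ROWS 25

/* Stand-in for the VGA text buffer. In a real kernel this would be
   (uint16_t *)0xB8000 instead of an array. */
static uint16_t fake_vram[VGA_COLS * VGA_ROWS];
static uint16_t *const vram = fake_vram;
static size_t cursor;  /* index of the next cell to write */

/* Each cell is 16 bits: character in the low byte, colour attribute in the high byte. */
static uint16_t vga_cell(char c, uint8_t attr)
{
    return (uint16_t)((uint16_t)(unsigned char)c | ((uint16_t)attr << 8));
}

void term_putc(char c)
{
    if (c == '\n') {
        /* Move to the start of the next row. */
        cursor = (cursor / VGA_COLS + 1) * VGA_COLS;
    } else {
        vram[cursor++] = vga_cell(c, 0x07);  /* 0x07: light grey on black */
    }
    if (cursor >= VGA_COLS * VGA_ROWS)
        cursor = 0;  /* wrap to the top instead of scrolling, for brevity */
}

void term_puts(const char *s)
{
    while (*s)
        term_putc(*s++);
}
```

A printf written for your OS would eventually bottom out in something like term_putc. That is the part a hosted C library expects the operating system to provide.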
Sometimes I want to see the implementation of a C function. My editor is vim. I have tried ctags, cscope, and man.
man 2|3 only tells me how to use a function.
Both ctags and cscope can find the implementation of only some functions.
None of them can find certain functions, especially system functions (system calls).
If a function can be used by including some header file, is there an easy way to find its implementation?
select(2) is a system call (but I suggest using poll(2) instead; search for the C10K problem to understand why I prefer poll over select). So it is really implemented inside the Linux kernel. The libc contains a small stub function (translating the C argument convention to the syscall convention, then doing the real syscall with e.g. some SYSENTER machine instruction). You could look into the source code of musl libc (I recommend musl because its source is much easier to read) or the real GNU libc to see that wrapper function.
FD_SET is just a macro, defined in /usr/include/x86_64-linux-gnu/sys/select.h and really in /usr/include/bits/select.h
But you are very right to try to find out how software functions of Linux are implemented: take advantage that it is free software.
Actually, the syscall layer is well defined and quite stable (see the syscalls(2) man page, read Advanced Linux Programming for more, and look at the POSIX standards as well). It is much more interesting to study the source code of higher-level libraries using them (e.g. Qt, Gtk, ...).
From an application's point of view, syscalls are elementary "atomic" operations. strace is a handy utility to find which syscalls are done by some process (or running program).
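The relationship between a libc stub and the underlying system call can be seen from user space. The sketch below is Linux-specific and assumes glibc or musl, both of which provide the generic syscall() entry point and the SYS_getpid constant. It uses getpid rather than select only because getpid takes no arguments; the helper name is hypothetical.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>  /* SYS_getpid */
#include <unistd.h>       /* getpid(), syscall() */

/* Returns 1 when the libc wrapper and the raw system call agree. */
int wrapper_matches_raw_syscall(void)
{
    long raw = syscall(SYS_getpid);  /* generic entry point: syscall number plus arguments */
    pid_t wrapped = getpid();        /* libc stub for the same system call */
    return raw == (long)wrapped;
}
```

Both paths should end in the same kernel code; the stub only adapts the calling convention. Running a program that calls this helper under strace is one way to watch the resulting system calls.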
You won't get around fetching the sources of the module that provides the function's implementation.
On Linux most of the modules in use are open source, so getting access to the sources should be possible.
Where to get the sources from depends on library and/or the distribution in use. This includes the kernel.
There are distributions which may include all sources. Gentoo is one of those.
For Debian based distros it is easy to pull a package's sources using the apt-get tool:
$ apt-get source <package-name>
Other distros may use other ways to provide sources. Perhaps fellow SO experts might like to comment/answer regarding those.
So I have a minimal OS that doesn't do much. There's a bootloader, that loads a basic C kernel in 32-bit protected mode. How do I port in a C library so I can use things like printf? I'm looking to use the GNU C Library. Are there any tutorials anywhere?
OK, porting a C library isn't that hard; I'm using Newlib in my kernel. Here is a tutorial to start: http://wiki.osdev.org/Porting_Newlib.
You basically need to:
Compile the library (for example Newlib) using your cross compiler
Provide stub-implementations for a list of system functions (like fork, fstat, etc.) in your kernel
Link the library and your kernel together
If you want to use functions like malloc or printf (which uses malloc internally), you need some kind of memory management and at least a simple working implementation of sbrk.
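A "simplest working implementation of sbrk" can be as small as a pointer bumped through a fixed region. The following is a sketch under stated assumptions: kernel_sbrk, kheap, and the 64 KiB size are invented for illustration, and the exact hook name and signature your C library expects should be taken from its porting guide rather than from here.

```c
#include <stddef.h>  /* size_t, ptrdiff_t */

#define KHEAP_SIZE (64 * 1024)  /* hypothetical: 64 KiB of statically reserved heap */

static unsigned char kheap[KHEAP_SIZE];
static size_t kheap_brk;        /* offset of the current program break within kheap */

/* Move the break by incr bytes and return the old break,
   or (void *)-1 if the request does not fit (the classic sbrk contract). */
void *kernel_sbrk(ptrdiff_t incr)
{
    size_t old = kheap_brk;

    /* Reject shrinking below the start or growing past the end. */
    if (incr < -(ptrdiff_t)old || incr > (ptrdiff_t)(KHEAP_SIZE - old))
        return (void *)-1;

    kheap_brk = (size_t)((ptrdiff_t)old + incr);
    return &kheap[old];
}
```

A real kernel would back this with its page allocator instead of a static array, but a bump pointer like this is often enough to get the library's malloc working for early experiments.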
I strongly recommend against glibc. It is a beast.
Try newlib instead. Porting it to a new kernel is easy. You just need to write a few support functions, as explained here.
Another new kid on the block is musl, which specifically aims to improve the situation in the embedded space.
It's probably not the best choice for a beginner, though, since it's still very much a work in progress.
Better to look for a small libc, like uClibc. The GNU C library is huge. And as the comments say, the first step is to get a C compiler going.
What are you trying to do? Building a full operating system is a job for a group of people lasting a few years... better start with something that already works, and hack on the parts that most interest you.
For a college assignment we have to add a system call to the Linux kernel. I have "Hello, World" done, no problem. In terms of adding a more complicated call, I know (or at least think) I can't use C functions like malloc, but I'm wondering: can I use syscall() to use other system calls?
The kernel has its own specific calls for pretty much everything. You don't have access to system calls or <sys/xxxx.h> header files.
For your example, yes, you can't use malloc(), but you can use kmalloc().
In older versions of the kernel (2.4) you could make system calls via the syscallN() macros. I'm pretty sure those have been removed.
In general, making system calls from inside the kernel is not a good idea. System calls are really just a way for user space to enter the kernel to get something done, so if you're already in the kernel there should be a better way to do what you're trying to do.
There are multiple sections in the manpages. Two of them are:
2 Unix and C system calls
3 C Library routines for C programs
For example there is getmntinfo(3) and getfsstat(2), both look like they do the same thing. When should one use which and what is the difference?
System calls are operating system functions. On UNIX, for example, the malloc() library function is built on top of the sbrk() system call (which resizes the process's memory space).
Libraries are just application code that's not part of the operating system and will often be available on more than one OS. They're basically the same as function calls within your own program.
The line can be a little blurry but just view system calls as kernel-level functionality.
Libraries of common functions are built on top of the system call interface, but applications are free to use both.
System calls are like authentication keys that grant access to kernel resources.
The image above is from Advanced Linux Programming and helps to explain how user applications interact with the kernel.
System calls are the interface between user-level code and the kernel. C Library routines are library calls like any other, they just happen to be really commonly provided (pretty much universally). A lot of standard library routines are wrappers (thin or otherwise) around system calls, which does tend to blur the line a bit.
As to which one to use, as a general rule, use the one that best suits your needs.
The calls described in section 2 of the manual are all relatively thin wrappers around actual calls to system services that trap to the kernel. The C standard library routines described in section 3 of the manual are client-side library functions that may or may not actually use system calls.
This posting has a description of system calls and trapping to the kernel (in a slightly different context) and explains the underlying mechanism behind system calls with some references.
As a general rule, you should always use the C library version. These often have wrappers that handle esoteric things like restarts on a signal (if you have requested that). This is especially true if you have already linked with the library. Every rule has its exceptions, though. Reasons to use the direct calls:
You want to be libc agnostic, for example in an installer. Such code could run on Android (bionic), uClibc, and more traditional glibc/eglibc systems, regardless of the library used. Dynamic loading with wrappers can also provide a run-time glibc/bionic layer, allowing a dual Android/Linux binary.
You need extreme performance, although this is probably rare and most likely misguided. Rethinking the problem will probably give a bigger performance benefit, and not calling into the system at all is often a performance win, which the libc can occasionally achieve for you.
You are writing some initramfs or init code without a library; to create a smaller image or boot faster.
You are testing a new kernel/platform and don't want to complicate life with a full blown file system; very similar to the initramfs.
You wish to do something very quickly on program startup, but eventually want to use the libc routines.
To avoid a known bug in the libc.
The functionality is not available through libc.
Sorry, most of the examples are Linux specific, but the rationales should apply to other Unix variants. The last item is quite common when new features are introduced into a kernel. For example, when kqueue or epoll were first introduced, there was no libc support for them. This may also happen if the system has an older library but a newer kernel and you wish to use the new functionality.
Even if your process doesn't use the libc, something else in the system most likely does. By coding your own variants, you can defeat the cache by providing two paths to the same end goal. Also, Unix will share the code pages between processes. Generally there is no reason not to use the libc version.
Other answers have already done a stellar job on the difference between libc and system calls.