Stand-alone portable snprintf(), independent of the standard library? - c

I am writing code for a target platform with NO C-runtime. No stdlib, no stdio. I need a string formatting function like snprintf but that should be able to run without any dependencies, not even the C library.
At most it can depend on memory alloc functions provided by me.
I checked out Trio but it needs stdio.h header. I can't use this.
Edit
Target platform : PowerPC64 home made OS(not by me). However the library shouldn't rely on OS specific stuff.
Edit2
I have tried out some 3rd-party open source libs, such as Trio(http://daniel.haxx.se/projects/trio/), snprintf and miniformat(https://bitbucket.org/jj1/miniformat/src) but all of them rely on headers like string.h, stdio.h, or(even worse) stdlib.h. I don't want to write my own implementation if one already exists, as that would be time-wasting and bug-prone.

Try using the snprintf implementation from uclibc. This is likely to have the fewest dependencies. A bit of digging shows that snprintf is implemented in terms of vsnprintf which is implemented in terms of vfprintf (oddly enough), it uses a fake "stream" to write to string.
This is a pointer to the code: http://git.uclibc.org/uClibc/tree/libc/stdio/_vfprintf.c
Also, a quick google search also turned up this:
http://www.ijs.si/software/snprintf/
http://yallara.cs.rmit.edu.au/~aholkner/psnprintf/psnprintf.html
http://www.jhweiss.de/software/snprintf.html
Hopefully one is suitable for your purposes. This is likely to not be a complete list.
There is a different list here:
http://trac.eggheads.org/browser/trunk/src/compat/README.snprintf?rev=197

You will probably at least need stdarg.h or low level knowledge of the specific compiler/architecture calling convention in order to be able to process the variadic arguments.
I have been using code based on Kustaa Nyholm's implementation It provides printf() (with user supplied character output stub) and sprintf(), but adding snprintf() would be simple enough. I added vprintf() and vsprintf() for example in my implementation.
No dynamic memory application is required, but it does have a dependency on stdarg.h, but as I said, you are unlikely to be able to get away without that for any variadic function - though you could potentially implement your own.

I am guessing you are in a norming enivronment where you need to explicitly document and verify COTS code.
However, I think in the case of stdarg.h this is worthwhile. You could pull in the source for just this and treat it like handwritten code (review, lint, unit-test, etc.). Any self-written replacement will be a lot of work, probably less stable and absolutely not portable.
That said, the actual snprintf implementation should not be too hard, and you could do this yourself, probably. Especially if you might be able to strip a few features away.
Keep in mind that vararg code has no typechecking and is prone to errors. For library snprintf you may find gcc's warnings helpful.

Related

Is glib usable in an unobtrusive way?

I was looking for a good general-purpose library for C on top of the standard C library, and have seen several suggestions to use glib. How 'obtrusive' is it in your code? To explain what I mean by obtrusiveness, the first thing I noticed in the reference manual is the basic types section, thinking to myself, "what, am I going to start using gint, gchar, and gprefixing geverything gin gmy gcode gnow?"
More generally, can you use it only locally without other functions or files in your code having to be aware of its use? Does it force certain assumptions on your code, or constraints on your compilation/linking process? Does it take up a lot of memory in runtime for global data structures? etc.
The most obtrustive thing about glib is that any program or library using it is non-robust against resource exhaustion. It unconditionally calls abort when malloc fails and there's nothing you can do to fix this, as the entire library is designed around the concept that their internal allocation function g_malloc "can't fail"
As for the ugly "g" types, you definitely don't need any casts. The types are 100% equivalent to the standard types, and are basically just cruft from the early (mis)design of glib. Unfortunately the glib developers lack much understanding of C, as evidenced by this FAQ:
Why use g_print, g_malloc, g_strdup and fellow glib functions?
"Regarding g_malloc(), g_free() and siblings, these functions are much safer than their libc equivalents. For example, g_free() just returns if called with NULL.
(Source: https://developer.gnome.org/gtk-faq/stable/x908.html)
FYI, free(NULL) is perfectly valid C, and does the exact same thing: it just returns.
I have used GLib professionally for over 6 years, and have nothing but praise for it. It is very light-weight, with lots of great utilities like lists, hashtables, rand-functions, io-libraries, threads/mutexes/conditionals and even GObject. All done in a portable way. In fact, we have compiled the same GLib-code on Windows, OSX, Linux, Solaris, iOS, Android and Arm-Linux without any hiccups on the GLib side.
In terms of obtrusiveness, I have definitely "bought into the g", and there is no doubt in my mind that this has been extremely beneficial in producing stable, portable code at great speed. Maybe specially when it comes to writing advanced tests.
And if g_malloc don't suit your purpose, simply use malloc instead, which of course goes for all of it.
Of course you can "forget about it elsewhere", unless of course those other places somehow interact with glib code, then there's a connection (and, arguable, you're not really "elsewhere").
You don't have to use the types that are just regular types with a prepended g (gchar, gint and so on); they're guaranteed to be the same as char, int and so on. You never need to cast to/from gint for instance.
I think the intention is that application code should never use gint, it's just included so that the glib code can be more consistent.

Can stdio be used while coding for a Kernel...?

I need to build a OS, a very small and basic one, with actually least functionality, coded in C.
Probably a CUI OS which does some memory management and has at least a text editor and a calculator, its just going to be a experimentation about how to make a code that has full and direct control over your hardware.
Still I'll be requiring an interface, that will need input/output functions like printf(&args), scanf(&args). Now my basic question is should I use existing headers or go for coding actually from scratch, and why so ?
I'd be more than very thankful to you guys for and help.
First, you can't link against anything from libc ... you're going to have to code everything from scratch.
Now having worked on a micro-kernel myself, I would not use the actual stdio headers that come with libc since they are going to be cluttered with a lot of extra information that will be either irrelevant for your OS, or will create compiler errors due to missing definitions, etc. What I would do though is keep the function signatures for these standard functions the same ... so in the end you would have a file called stdio.h for your OS, but it would be a very stripped down header file with the basic minimum requirements for your needs, and only having the standard I/O functions you need, with the correct standard signatures.
Keep in mind on the back-end, i.e., in your stdio.c file, you're going to have to point these functions to a custom console-driver or some other type of character drive for your display. Either that, or you could just use them as wrappers for some other kernel-level display printing routine. You are also going to want to make sure that even though you may use a #include <stdio.h> directive in your other OS code modules to access these printing functions, you do not link against libc. This can be done using gcc -ffreestanding.
Just retarget newlib.
printf, scanf, etc relies on implementation specific funcions to get a single char or print a single char. You can then make your stdin and stdout the UART 1 for example.
Kernel itself would not require the printf and scanf functions, if you do not want to keep the kernel in kernel mode and work the apps you have planned for. But for basic printf and scanf features, you can write your own printf and scanf functions, which would provide basic support for printing ans taking input. I do not have much experience on this, but you can try make a console buffer, where the keyboard driver puts the read in ASCII characters (after conversion from scan codes), and then make the printf and scanf work on it. I have one basic implementation were i have wrote a gets instead of scanf and kept things simple. To get integer output you can write an atoi function to convert the string to a number.
To port in other libraries, you need to make the components which the libraries depend on. You need to make the decision if you can code in those support in the kernel so that the libraries could be ported in. If it is more difficult then coding some basic input output functions i think won't be bad at this stage,

Should I use secure versions of POSIX functions on MSVC - C

I am writing some C code which is expected to compile on multiple compilers (at least on MSVC and GCC). Since I am beginner in C, I have all warnings turned on and warnings are treated as errors (-Werror in GCC & /WX in MSVC) to prevent me from making silly mistakes.
When I compiled some code that uses strcpy on MSVC, I get warning like,
warning C4996: 'strcpy': This function or variable may be unsafe. Consider using strcpy_s instead. To disable deprecation, use _CRT_SECURE_NO_WARNINGS. See online help for details.
I am bit confused. Lot of common functions are deprecated on MSVC. Should I use this secured version when on Windows? If yes, should I wrap strcpy something like,
my_strcpy()
{
#ifdef WIN32
// use strcpy_s
#ELSE
// use strcpy
}
Any thoughts?
Whenever you move data between non-constant-size buffers, you have to (gasp! omg!) actually think about whether it fits. Using functions (like the MS-specific strcpy_s or the BSD strlcpy) that purport to be "safe" will protect you from some obvious buffer overflow conditions, but won't protect you from the bugs that result from string truncation. It also won't protect you from integer overflows in computing the necessary sizes of buffers.
Unless you're an expert dealing with C strings, I would recommend forgetting about special functions and commenting every line of your code that will perform variable-length/position writes with a justification for how you know, at this point in the program, that the length/offset you're about to use is within the bounds of the size of the buffer. Do this for lines where you perform arithmetic on sizes/offsets too - document how you know that the arithmetic will not overflow, and add tests for overflow if you find you don't know.
Another approach is to completely wrap all your string handling in a string object that stores the length of the buffer along with the string and automatically reallocates when a string needs to be enlarged, and then only use const char * for read-only access to strings when you need to pass them to system functions or other libraries. This will sacrifice a good bit of the performance you'd expect from C, but it will help you ensure that you don't make mistakes. Just don't take it to the extreme. There's no need to duplicate stuff like strchr, strstr, etc. in your string wrapper. Just provide methods to duplicate string objects, concatenate them, and truncate them, and then with the existing library functions that operate on const char * you can do just about anything you'd want to.
There are lots and lots of discussions about this topic here on SO. The usual suspects like strncpy, strlcpy and whatever will pop up here again, I'm sure. Just type "strcpy" in the search box and read some of the longer threads to get an overview.
My advice is: Whatever your final choice will be, it is a good idea to follow the DRY principle and continue to do it as in your example of my_strcpy(). Don't throw the raw calls all over your code, use wrappers and centralize them in your own string handling library. This will reduce overall code (boilerplate), and you have one central location to make modifications, if you change your mind later.
Of course this opens up some other cans of worms, especially for a beginner: Memory handling responsibility and interface design. Both a topic on its own, and 5 people will give you 10 suggestions of how to do it. A central library usually has the nice effect that it enforces a decision, which you will follow throughout your whole codebase, instead of using method a in module A and method b in module B, causing you trouble when you try to connect A with B...
I would tend to use the safer function snprintf † which is available on both platforms rather than having different paths depending on platform. You will need to use the define to prevent the warnings on MSVC.
† though possibly slightly less safer - it will return a string which is not nul-terminated on error, so you must check the return, but it won't cause a buffer overflow.

A pure bytes version of strstr?

Is there a version of strstr that works over a fixed length of memory that may include null characters?
I could phrase my question like this:
strncpy is to memcpy as strstr is to ?
memmem, unfortunately it's GNU-specific rather than standard C. However, it's open-source so you can copy the code (if the license is amenable to you).
Not in the standard library (which is not that large, so take a look). However writing your own is trivial, either directly byte by byte or using memchr() followed by memcmp() iteratively.
In the standard library, no. However, a quick google search for "safe c string library" turns up several potentially useful results. Without knowing more about the task you are trying to perform, I cannot recommend any particular third-party implementation.
If this is the only "safe" function that you need beyond the standard functions, then it may be best to roll your own rather than expend the effort of integrating a third-party library, provided you are confident that you can do so without introducing additional bugs.

Safer Alternatives to the C Standard Library

The C standard library is notoriously poor when it comes to I/O safety. Many functions have buffer overflows (gets, scanf), or can clobber memory if not given proper arguments (scanf), and so on. Every once in a while, I come across an enterprising hacker who has written his own library that lacks these flaws.
What are the best of these libraries you have seen? Have you used them in production code, and if so, which held up as more than hobby projects?
I use GLib library, it has many good standard and non standard functions.
See https://developer.gnome.org/glib/stable/
and maybe you fall in love... :)
For example:
https://developer.gnome.org/glib/stable/glib-String-Utility-Functions.html#g-strdup-printf
explains that g_strdup_printf is:
Similar to the standard C sprintf() function but safer, since it calculates the maximum space required and allocates memory to hold the result.
This isn't really answering your question about the safest libraries to use, but most functions that are vulnerable to buffer overflows that you mentioned have safer versions which take the buffer length as an argument to prevent the security holes that are opened up when the standard methods are used.
Unless you have relaxed the level of warnings, you will usually get compiler warnings when you use the deprecated methods, suggesting you use the safer methods instead.
I believe the Apache Portable Runtime (apr) library is safer than the standard C library. I use it, well, as part of an apache module, but also for independent processes.
For Windows there is a 'safe' C/C++ library.
You're always at liberty to implement any library you like and to use it - the hard part is making sure it is available on the platforms you need your software to work on. You can also use wrappers around the standard functions where appropriate.
Whether it is really a good idea is somewhat debatable, but there is TR24731 published by the C standard committee - for a safer set of C functions. There's definitely some good stuff in there. See this question: Do you use the TR 24731 Safe Functions in your C code?, which includes links to the technical report.
Maybe the first question to ask is if your really need plain C? (maybe a language like .net or java is an option - then e.g. buffer overflows are not really a problem anymore)
Another option is maybe to write parts of your project in C++ if other higher level languages are not an option. You can then have a C interface which encapsulates the C++ code if you really need C.
Because if you add all the advanced functions the C++ standard library has build in - your C code would only be marginally faster most times (and contain a lot more bugs than an existing and tested framework).

Resources