Bearing in mind the answers given to a question about a safer formatting library for C, I'm wondering whether there is a safe C formatting library?
What I mean is:
there's no possibility to mismatch the format string from the arguments
there's no possibility to crash by passing the wrong type
there are no platform-dependent aspects
Please don't answer about the Microsoft Safe String Library, or libraries that are less unsafe but still not totally safe, as I'm aware of these, and they don't satisfy the requirements for total safety.
Thanks in advance
You're writing in C. C is not type-safe. You cannot avoid undefined behaviour if you pass an int* instead of a char*. There's no such thing as "there's no possibility" if your variables are not statically type checked / tagged for runtime checking.
If you have something that produces warnings, that's quite good already...
If you really need or want safety, you may want to have a look at Cyclone (a C dialect), or at a completely different language.
there's no possibility to mismatch the format string from the arguments
If you want a format string, without special compiler support you basically can't do it. That said, you could have a safe formatting library in C if you forgo the format string. I'm not aware of any, but I would not be surprised if they existed.
One could have an interface like:
typedef ... FORMATTER;
FORMATTER create_formatter();
int fmt_add_string_default(FORMATTER f, const char *s);
int fmt_add_string(FORMATTER f, const char *s, int maxlength, const char fill, enum fmt_alignment align);
...
int fmt_add_decimal_default(FORMATTER f, int d);
... // you get the idea
int fmt_write_result(FORMATTER f, char *out, int out_length);
void destroy_formatter(FORMATTER f);
Something like this would be perfectly safe, if a bit verbose.
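A minimal sketch of how part of such an interface might be implemented, to make the idea concrete. It assumes FORMATTER is a pointer to a hypothetical struct holding a fixed-capacity buffer, and it covers only the two "default" functions; the struct layout, FMT_CAPACITY, and the return-value conventions are all assumptions, not part of the interface above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical backing type: a fixed-capacity, always NUL-terminated buffer. */
enum { FMT_CAPACITY = 256 };
struct formatter {
    char   buf[FMT_CAPACITY];
    size_t len;               /* characters stored, excluding the NUL */
};
typedef struct formatter *FORMATTER;

FORMATTER create_formatter(void)
{
    return calloc(1, sizeof(struct formatter));  /* NULL on failure */
}

void destroy_formatter(FORMATTER f)
{
    free(f);
}

/* Appends s; returns 0 on success, -1 if f or s is NULL or s does not fit. */
int fmt_add_string_default(FORMATTER f, const char *s)
{
    size_t n;
    if (f == NULL || s == NULL)
        return -1;
    assert(f->len < FMT_CAPACITY);           /* internal invariant */
    n = strlen(s);
    if (n >= FMT_CAPACITY - f->len)
        return -1;
    memcpy(f->buf + f->len, s, n + 1);       /* copies the NUL as well */
    f->len += n;
    return 0;
}

/* Appends d in decimal, converting digit by digit (no format string). */
int fmt_add_decimal_default(FORMATTER f, int d)
{
    char tmp[32];
    size_t i = sizeof tmp;
    /* Negate in unsigned arithmetic so INT_MIN does not overflow. */
    unsigned int u = (d < 0) ? 0u - (unsigned int)d : (unsigned int)d;

    tmp[--i] = '\0';
    do {
        tmp[--i] = (char)('0' + u % 10u);
        u /= 10u;
    } while (u != 0u);
    if (d < 0)
        tmp[--i] = '-';
    return fmt_add_string_default(f, tmp + i);
}

/* Copies the result into out; returns its length, or -1 if it does not fit. */
int fmt_write_result(FORMATTER f, char *out, int out_length)
{
    if (f == NULL || out == NULL || out_length <= 0 ||
        f->len >= (size_t)out_length)
        return -1;
    memcpy(out, f->buf, f->len + 1);
    return (int)f->len;
}
```

Because every call takes exactly one argument of a declared type, the compiler checks each argument against the prototype, which is what a format string cannot offer. A caller can still defeat this with a cast, so it is "safe" only in the sense the answers here discuss.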
No, because whatever "safety" you introduce can be suborned by the language. It's like building your castle on sand - it doesn't matter how good the castle is, it can still be made to fall if you dig out the sand from underneath it.
There is no mechanism in C to enforce specific parameter types, nor should there be.
If people don't use your tools as they're meant to, that's their own problem, in my opinion. You're not supposed to be providing software to three-year-olds - they're expected to have some modicum of intelligence.
Related
I am new to coding using MISRA C guidelines.
The following are two rules in MISRA C 2004:
Rule 16.1 (required): Functions shall not be defined with a variable number of arguments.
Rule 20.9 (required): The input/output library <stdio.h> shall not be used in production code.
This clearly means that I can't use printf in production code for it to be MISRA C compliant, because printf is a part of <stdio.h> and allows a variable number of arguments. So I set out on a quest to find out how I can write my own printf statement. So far I am unable to find any solution for this predicament. Any help from fellow developers would be appreciated.
so far I am unable to find any solution for this predicament
You have to use functions that each print one thing at a time, with a fixed number of arguments. An interface you might want to implement could look like the following:
print_string("Hello");
print_int(5);
print_char('\n');
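A rough sketch of how these three functions could be built without <stdio.h> or variadic arguments. Everything funnels through a single hypothetical put_char sink; on a real target that function would write a byte to a UART data register or similar, whereas here it appends to a memory buffer (sink, an assumption) so the example can run anywhere. This is not claimed to satisfy every MISRA rule as written.

```c
#include <stddef.h>

/* Hypothetical output sink. On a real target this would write one byte to
   a UART data register or similar; here it appends to a memory buffer. */
static char   sink[128];
static size_t sink_len = 0;

static void put_char(char c)
{
    if (sink_len + 1u < sizeof sink) {   /* keep room for the terminator */
        sink[sink_len++] = c;
        sink[sink_len] = '\0';
    }
}

void print_char(char c)
{
    put_char(c);
}

void print_string(const char *s)
{
    if (s == NULL)
        return;
    while (*s != '\0') {
        put_char(*s);
        ++s;
    }
}

void print_int(int value)
{
    char digits[24];                     /* enough for a 64-bit int */
    size_t n = 0;
    /* Negate in unsigned arithmetic so INT_MIN does not overflow. */
    unsigned int u = (value < 0) ? 0u - (unsigned int)value
                                 : (unsigned int)value;

    if (value < 0)
        put_char('-');
    do {                                 /* digits come out in reverse */
        digits[n++] = (char)('0' + u % 10u);
        u /= 10u;
    } while (u != 0u);
    while (n > 0)
        put_char(digits[--n]);
}
```

Only put_char knows about the hardware, so porting to another output device means replacing that one function.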
so I set out on a quest to find out how I can write my own printf statement
Most MISRA-C systems are embedded systems where printf is just some bloated wrapper around a UART library. The usual solution is to develop your own logging/messaging tool instead. It is not necessarily UART-based; it might as well be some other serial bus, 8 parallel data lines, or an LCD/7-segment display, all depending on what you need to display and whether you intend this to be part of the production code.
So how to do this is highly project-specific and it's typically more of a system design and electronics problem than a programming one.
EDIT
Since you seem to be making some sort of general-purpose library, one solution is to simply provide an API that returns strings to the caller, then let the caller worry about how to present them. That makes your lib MISRA-C compliant, while allowing the caller to print strings in whatever application-specific way they have available. For example:
void lib_getmsg (char* msg, size_t bufsize);
Where "lib" is some prefix for your library. Leave string allocation to the caller. Alternatively, the old-fashioned way:
lib_result_t lib_dosomething (void);
// Returns LIB_OK if went OK, returns LIB_ERR in case of errors.
// To get more information, call lib_get_lastmsg.
const char* lib_get_lastmsg (void);
This returns a pointer to an internal static string allocated by your library. The downside of this is that it won't work well in multi-threaded environments, since every caller shares the same buffer.
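The first variant above, lib_getmsg with a caller-supplied buffer, might look roughly like this. The internal lib_current_msg variable and its contents are hypothetical, and truncating silently is just one possible policy; a real library might prefer to return a status code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical internal state: the most recent message of the library. */
static const char *lib_current_msg = "sensor timeout";

/* Copies the message into msg, truncating if needed; the result is always
   NUL-terminated when bufsize > 0. */
void lib_getmsg(char *msg, size_t bufsize)
{
    size_t n;
    if (msg == NULL || bufsize == 0u)
        return;
    n = strlen(lib_current_msg);
    if (n >= bufsize)
        n = bufsize - 1u;        /* leave room for the terminator */
    memcpy(msg, lib_current_msg, n);
    msg[n] = '\0';
}
```

Since the caller owns the buffer, the library itself never allocates memory or touches an output device.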
You need to understand the rationale for the MISRA C guidelines, understand the context they are used in, and the circumstances of your own code.
You also need to understand that the MISRA Guidelines are not to be blindly followed with a tick-box mentality... you then need to appreciate that those nice folk at MISRA provide several chapters of useful material before the actual guidelines. Part of that is the Deviation procedure.
If you can justify why you feel you need to violate a guideline, then use the deviation procedure that is specified. This requires that you understand the nature of the violation, and what you are going to do to ensure the integrity of your application.
If you genuinely need to use printf() and you can justify that, use it with a deviation.
On Linux, running on a modern x86_64 processor:
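int main(void)
{
    const char *s = "Hello, World!\n";
    long l = 14;
    long fd = 1;       /* stdout */
    long syscall = 1;  /* write */
    long ret = 0;
    /* The syscall instruction overwrites rcx and r11, and the kernel reads
       the buffer, so declare those as clobbers. */
    __asm__ volatile("syscall"
        : "=a"(ret)
        : "a"(syscall),
          "D"(fd),
          "S"(s),
          "d"(l)
        : "rcx", "r11", "memory");
    return 0;
}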
Output:
Hello, World!
I want to create a custom data type String. With that data type I want to use code like below:
#include <stdio.h>
void main(void)
{
String name = "MD A Barik";
printf("%s", name);
}
How can I implement a String data type in C?
A bunch of ways!
1. Study C for many years, well enough to write your own elaborate set of functions and structures (but not classes), such that you create something more or less useful for you, but that nobody else will ever be able to use.
2. Switch to C++.
3. Wish upon a star.
Really, as comments above have noted, C does not support this and there's simply no way to pretend that it does. Efforts such as my #1 above would be a huge amount of work, and entirely counterproductive.
It's a totally fair thing to wish for, but you really need to take "no" for an answer here.
typedef char* String;
or just use char* directly.
char* name = "MD A Barik";
You should be aware that this String behaves very differently from the std::string of C++. For example, in C++ you can concatenate two strings with +, whereas in C you need to use strcat and provide a destination buffer that is large enough.
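As a small illustration of that difference, here is a sketch of concatenation with the typedef. The helper name string_concat and its return convention are hypothetical; the point is only that the caller has to supply the storage and the size check, because String is nothing more than an alias for char *.

```c
#include <stddef.h>
#include <string.h>

typedef char *String;   /* only an alias for char *, not a new type */

/* Hypothetical helper: concatenates a and b into out, whose capacity is
   outsize bytes. Returns 0 on success, -1 if the result would not fit. */
int string_concat(String out, size_t outsize, const char *a, const char *b)
{
    if (strlen(a) + strlen(b) + 1u > outsize)
        return -1;
    strcpy(out, a);     /* safe: the total length was checked above */
    strcat(out, b);
    return 0;
}
```

A call such as string_concat(buf, sizeof buf, "MD A ", "Barik") fills a caller-owned array buf; there is no equivalent of name = a + b.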
Why this distinction? I've landed up with terrible problems, assuming itoa to be in stdlib.h and finally ending up with linking a custom version of itoa with a different prototype and thus producing some crazy errors.
So, why isn't itoa a standard function? What's wrong with it? And why is the standard partial towards its twin brother atoi?
No itoa has ever been standardised so to add it to the standard you would need a compelling reason and a good interface to add it.
Most itoa interfaces that I have seen either use a static buffer which has re-entrancy and lifetime issues, allocate a dynamic buffer that the caller needs to free or require the user to supply a buffer which makes the interface no better than sprintf.
An "itoa" function would have to return a string. Since strings aren't first-class objects, the caller would have to pass a buffer + length and the function would have to have some way to indicate whether it ran out of room or not. By the time you get that far, you've created something similar enough to sprintf that it's not worth duplicating the code/functionality. The "atoi" function exists because it's less complicated (and arguably safer) than a full "scanf" call. An "itoa" function wouldn't be different enough to be worth it.
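To illustrate the point, a buffer-plus-length conversion ends up as a thin wrapper around snprintf (standard since C99). The name int_to_str and its return convention are hypothetical, chosen to avoid clashing with any vendor-provided itoa.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical name, chosen to avoid clashing with any vendor itoa.
   Writes value in decimal into buf. Returns 0 on success, or -1 on error
   or if the buffer was too small (buf then holds a truncated string). */
int int_to_str(int value, char *buf, size_t bufsize)
{
    int n;
    if (buf == NULL || bufsize == 0u)
        return -1;
    /* snprintf returns the length the full result would have had. */
    n = snprintf(buf, bufsize, "%d", value);
    return (n >= 0 && (size_t)n < bufsize) ? 0 : -1;
}
```

Everything the wrapper adds is the out-of-room check, which is the same work any caller of snprintf has to do anyway.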
The itoa function probably isn't standard because there is no consistent definition of it. Different compiler and library vendors have introduced subtly different versions of it, possibly as an invention to serve as a complement to atoi.
If some non-standard function is widely provided by vendors, the standard's job is to codify it: basically add a description of the existing function to the standard. This is possible if the function has more or less consistent argument conventions and behavior.
Because multiple flavors of itoa are already out there, such a function cannot be added into ISO C. Whatever behavior is described would be at odds with some implementations.
itoa has existed in forms such as:
void itoa(int n, char *s); /* Given in _The C Programming Language_, 1st ed. (K&R1) */
void itoa(int input, void (*subr)(char)); /* Ancient Unix library */
void itoa(int n, char *buf, int radix);
char *itoa(int in, char *buf, int radix);
Microsoft provides it in their Visual C Run Time Library under the altered name: _itoa.
Not only have C implementations historically provided it under differing definitions, C programs also define functions named itoa for themselves, which is another source of possible clashes.
Basically, the itoa identifier is "radioactive" with regard to standardization as an external name or macro. If such a function is standardized, it will have to be under a different name.
I am working on a C project (still pretty new to C), and I am trying to remove all of the warnings when it's compiled.
The original coders of this project have made a type called dyn_char (dynamic char arr) and it's an unsigned char * type. Here's a copy of one of the warnings:
warning: argument #1 is incompatible with prototype: prototype:
pointer to char : ".../stdio_iso.h", line 210 argument : pointer to
unsigned char
They also use lots of standard string functions like strlen(); so the way that I have been removing these warnings is like this:
strlen((char *)myDynChar);
I can do this but some of the files have hundreds of these warnings. I could do a Find and Replace to search for strlen( and replace with strlen((char*), but is there a better way?
Is it possible to use a Macro to do something similar? Maybe something like this:
#define strlen(s) strlen((char *)s)
Firstly, would this work? Secondly, if so, is it a bad idea to do this?
Thanks!
This is an annoying problem, but here's my two cents on it.
First, if you can confidently change the type of dyn_char to just be char *, I would do that. Perhaps if you have a robust test program or something you can try it out and see if it still works?
If not, you have two choices as far as I can see: fix what's going into strlen(), or have your compiler ignore those warnings (or ignore them yourself)! I'm not one for ignoring warnings unless I have to, but as far as fixing what goes into strlen...
If your underlying type is unsigned char *, then casting what goes into strlen() is basically telling the compiler to assume that the argument, for the purposes of being passed to strlen(), is a char *. If strlen() is the only place this is causing an issue and you can't safely change the type, then I'd consider a search-and-replace to add in casts to be the preferable option. You could redefine strlen with a #define like you suggested (I just tried it out and it worked for me), but I would strongly recommend not doing this. If anything, I'd search-replace strlen() with USTRLEN() or something (a fake function name), and then use that as your casting macro. Overriding C library functions with your own names transparently is a maintainability nightmare!
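A wrapper macro of the kind described might look like the sketch below. USTRLEN is the placeholder name from the paragraph above; the const qualifier and the parentheses around the argument are additions of this sketch.

```c
#include <string.h>

/* Hypothetical wrapper macro: a distinct, all-caps name keeps the cast
   visible at every call site instead of hiding it behind strlen itself.
   The argument is parenthesised so that expressions expand as intended. */
#define USTRLEN(s) strlen((const char *)(s))
```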
Two points on that: first, you're using a different name. Second, you're using all-caps, as is the convention for defining such a macro.
This may or may not work.
strlen may be a macro in the system header, in which case you will get a warning about redefining a macro, and you won't get the functionality of the existing macro.
(The Visual Studio stdlib does many interesting things with macros in <string.h>. strcpy is defined this way:
__DEFINE_CPP_OVERLOAD_STANDARD_FUNC_0_1(char *, __RETURN_POLICY_DST, __EMPTY_DECLSPEC, strcpy, _Pre_cap_for_(_Source) _Post_z_, char, _Dest, _In_z_ const char *, _Source))
I wouldn't be surprised at all if #defining strcpy broke this.)
Search and replace may be your best option. Don't hide subtle differences like this behind macros - you will just pass your pain on to the next maintainer.
Instead of adding a cast to all the calls, you may want to change all the calls to dyn_strlen, which is a function you create that calls strlen with the appropriate cast.
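A sketch of such a function, assuming dyn_char is declared as the question describes (an unsigned char * typedef). Taking a const pointer is a choice made here, not something the question specifies.

```c
#include <stddef.h>
#include <string.h>

typedef unsigned char *dyn_char;   /* as described in the question */

/* The cast lives in exactly one place. Unlike a macro, the prototype is
   still type-checked, so passing e.g. an int * is diagnosed. */
size_t dyn_strlen(const unsigned char *s)
{
    return strlen((const char *)s);
}
```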
You can define a function:
char *uctoc(unsigned char*p){ return (char*)(p); }
and then search-and-replace strstr(x with strstr(uctoc(x). That way you at least keep some type checking. Later you can convert uctoc to a macro for performance.
I'm writing some C code and use the Windows API. I was wondering if it was in any way good practice to cast the types that are obviously the same, but have a different name? For example, when passing a TCHAR * to strcmp(), which expects a const char *. Should I do, assuming I want to write strict and in every way correct C, strcmp((const char *)my_tchar_string, "foo")?
Don't. And instead of strcmp(), use _tcscmp() (or even the safe alternatives).
_tcs* denotes a whole set of C runtime (string) functions that will behave correctly depending on how TCHAR gets translated by the preprocessor.
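The mechanism can be imitated in portable C, which may make it easier to see why no cast is needed. The names my_tchar, MY_T, my_tcscmp and MY_USE_WIDE below are hypothetical stand-ins; the real TCHAR, _T and _tcscmp come from <tchar.h> and are selected by the _UNICODE macro.

```c
#include <string.h>
#include <wchar.h>

/* Portable imitation of the generic-text mapping. All my_* / MY_* names
   are hypothetical; the real ones live in <tchar.h>. */
#ifdef MY_USE_WIDE
typedef wchar_t my_tchar;
#define MY_T(x)   L##x        /* "foo" becomes L"foo" */
#define my_tcscmp wcscmp
#else
typedef char my_tchar;
#define MY_T(x)   x
#define my_tcscmp strcmp
#endif

/* Compiles unchanged in both configurations, with no casts. */
int is_foo(const my_tchar *s)
{
    return my_tcscmp(s, MY_T("foo")) == 0;
}
```

Casting a wide string to const char * and handing it to strcmp would compile in the wide configuration but compare the wrong bytes, which is the failure the answers here warn about.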
Concerning safe alternatives, look up functions with a trailing _s and otherwise named as the classic string functions from the C runtime. There is another set of functions that returns HRESULT, but it is not as compatible with the C runtime.
No, casting that away is not safe because TCHAR is not always equal to char. Instead of casting, you should pick a function that works with a TCHAR. See http://msdn.microsoft.com/en-us/library/e0z9k731(v=vs.71).aspx
Casting is generally a bad idea. Casting when you don't need to is terrible practice.
Think about what happens if you later change the type of the variable you are casting. Suppose that at some future date you change my_tchar_string to be wchar_t* rather than char*. Your code will still compile but will behave incorrectly.
One of your primary goals when writing C code is to minimise the number of casts in your code.
My advice would be to just avoid TCHAR (and associated functions) completely. Their real intent was to allow a single code base to compile natively either as narrow-character (ANSI) code for the Windows 9x line or as wide-character (Unicode) code for Windows NT -- but the 9x line is long gone, and with it the real reason to write code like this.
If you want/need to support wide characters, do it. If you're fine with only narrow/multibyte characters, do that. At least IME, trying to sit on the fence and do some of both generally means you end up not doing either one well. It also means roughly doubling the amount of testing necessary without even coming close to doubling the functionality you provide to the user.