Is passing empty strings ("") in C bad practice? - c

I'm trying to fix a bug in very old C code that just popped up recently on Windows after we updated Visual Studio to version 2017. The code still runs on several linux platforms. That tells me we're probably relying on some undefined behavior, and we previously got lucky.
We have a bunch of function calls like this:
get_some_data(parent, "ge", "", type);
Running in debug, I noticed that upon entry to this function, that empty string is immediately filled with garbage, before the function has done anything. The function is declared like this:
static void get_some_data(
KEY Parent,
char *Prefix,
char *Suffix,
ENT EntType)
So is it unwise to pass the strings directly ("ge", "")? I know it's trivial to fix this case by declaring char *suffix="" and passing suffix instead of "", but I'm now questioning whether I need to go through this entire suite of code looking for this type of function call.

So is it unwise to pass the strings directly ("ge", "")?
There is nothing inherently wrong with passing string literals to functions in general.
However, it is unwise to pass pointers (in)to string literals specifically to parameters declared as pointers to non-const char, because C specifies that undefined behavior results from attempting to modify a string literal, and in practice, that UB often manifests as abrupt program termination. If a function declares a parameter as const char * then you can reasonably take that as a promise that it will not attempt to modify the target of that pointer -- which is what you need to ensure -- but if it declares a parameter as just char * then no such promise is made, and the function doesn't even have a way to check at runtime whether the argument is writable.
Possibly you can rely on documentation in place of const-qualification, for you're ok in this regard as long as no attempt is made in practice to modify a string literal, but that still leaves you more open to bugs than you otherwise would be.
I know it's trivial to fix this case by declaring char *suffix="" and passing suffix instead of ""
Such a change may disguise what you're doing from the compiler, so that it does not warn about the function call, but it does not fix anything. The same pointer value is passed to the function either way, and the same semantics and constraints apply. Also, if the compiler warned about the function call then it should also warn about the assignment.
This is not an issue in C++, by the way, or at least not the same issue, because in C++, string literals represent arrays of const char in the first place.
, but I'm now questioning whether I need to go through this entire suite of code looking for this type of function call.
Better might be to modify the signatures of the called functions. Where you intend for it to be ok to pass a string literal, ensure that the parameter has type const char *, like so:
static void get_some_data(
KEY Parent,
const char *Prefix,
const char *Suffix,
ENT EntType)
But do note that is highly likely to cause new warnings about violations of const-correctness. To ensure safety, you need to fix these, too, without casting away constness. This could well cascade broadly, but the exercise will definitely help you identify and fix places where your code was mishandling string literals.
On the other hand, a genuine fix that might be less pervasive would be to pass pointers to modifiable arrays instead of (unmodifiable) string literals. Perhaps that's what you had in mind with your proposed fix, but the correct way to do that is this:
char prefix[] = "ge";
char suffix[] = "";
get_some_data(parent, prefix, suffix, type);
Here, prefix and suffix are separate (modifiable) local arrays, initialized with copies of the string literals' contents.
With all that said, I'm inclined to suspect that if you're getting bona fide runtime errors related to these arguments with VS-compiled executables but not GCC-compiled ones, then the source of those is probably something else. My first guess would be that array bounds are being overrun. My second guess would be that you are compiling C code as C++, and running afoul of one or more of the (other) differences between them.
That's not to say that you shouldn't take a good look at the constness / writability concerns involved here, but it would suck to go through the whole exercise just to find out that you were ok to begin with. You could still end up with better code, but that's a little tricky to sell to the boss when they ask why the bug hasn't been fixed yet.

No, there is absolutely nothing wrong with passing a string literal, empty or not. Quite the opposite — if you try to "fix" your code by doing your trivial change, you will hide the bug and make life harder for whoever is going to fix it in future.

I encountered this same problem and the cause was a similar situation in a recently executed function where, crucially, the contents of the string were being changed.
Prior to the call where this strange behaviour was being observed, there was a call to older code with a parameter of PSTR type. Not realizing that the contents of that parameter were going to be changed, a programmer had supplied an empty string. The code was only ever updating the first character so declaring a char type parameter and supplying the address of that was sufficient to solve the problem that was exhibiting in the later call.

Related

Loading graphic files

#include <graphics.h> //importing graphics
#include <conio.h>
int main()
{
int gd=0,gm;
initgraph(&gd,&gm," ");
circle(100,80,20);
getch();
closegraph();
}
I typed the above code in Code::Blocks but it does not execute and rather it says
warning: deprecated conversion from string constant to 'char*' [-Wwrite-strings]
I have been trying for three days. Someone please help me out.
I have installed winbgim.h and other necessary files for graphics but it's not helping. I have searched all possible websites...
Preamble: I deleted and then undeleted this answer several times. There are other questions that address this subject, but most of them seem to be focused on C++, which has some differences from C in this area, or to have been closed as a dupe of such a question. Some have answers I don't care for, or that are incomplete. In a possiuble act of hubris, therefore, I am providing this answer, and I choose to provide it here.
Since the only string constant I see in the code is the " " argument to initgraph(), the warning must arise from there. In that case, the third parameter to that function must be declared as type char *.
In that case, there is nothing inherently wrong with the code you presented. According to the standard, string constants correspond to arrays of char, and such an array is automatically converted to a pointer to its first element in most contexts where it is evaluated, including when it appears as a function argument. That lines up perfectly with the function.
The problem is that the standard also says that if a program attempts to modify the contents of a string literal then it produces undefined behavior. It is not uncommon for the actual manifestation of that to be a program crash. Therefore, although it is technically legal, you take a significant risk when you convert a string constant to a char *, because whoever handles the pointer may not know that it is unsafe to attempt to use it to modify what it points to. This problem does not arise if you convert to const char * instead.
GCC wants to help you detect such problems. Therefore its -Wwrite-strings] option, when enabled, causes string constants to be treated as arrays of const char instead of arrays of char, and therefore to trigger warnings such as you encountered.
You have at least three options:
You could disable the -Wwrite-strings compilation option. This is probably the best approach for legacy code with well-tested behavior, but I do not recommend it otherwise.
Provided that the function in fact does not attempt to modify the array that its parameter points to, directly or indirectly, you can change the function parameter to type const char *. This may require a cascade of other, similar changes to avoid additional, relevant warnings.
Avoid passing a string literal to the function. Instead create and pass a mutable array of char. For example:
int gd = 0, gm;
char s[] = " ";
initgraph(&gd, &gm, s);
// ...
Note in particular that there, although there is a string literal, it is used only to initialize the distinct, non-const array s.

C function parameter optimization: (MyStruct const * const myStruct) vs. (MyStruct const myStruct)

Example available at ideone.com:
int passByConstPointerConst(MyStruct const * const myStruct)
int passByValueConst (MyStruct const myStruct)
Would you expect a compiler to optimize the two functions above such that neither one would actually copy the contents of the passed MyStruct?
I do understand that many optimization questions are specific to individual compilers and optimization settings, but I can't be designing for a single compiler. Instead, I would like to have a general expectation as to whether or not I need to be passing pointers to avoid copying. It just seems like using const and allowing the compiler to handle the optimization (after I configure it) should be a better choice and would result in more legible and less error prone code.
In the case of the example at ideone.com, the compiler clearly is still copying the data to a new location.
In the first case (passing a const pointer to const) no copying occurs.
In the second case, copying does occur and I would not expect that to be optimized out if for no other reason because the address of the object is taken and then passed through an ellipsis into a function and from the point of view of the compiler, who knows what the function does with that pointer?
More generally speaking, I don't think changing call-by-value into call-by-reference is something compilers do. If you want copy by reference, implement it yourself.
Is it theoretically possible that a compiler could detect that it could just convert the function to be pass-by-reference? Yes; nothing in the C standard says it cannot..
Why are you worrying about this? If you are concerned about performance, has profiling shown copy-by-value to be a significant bottleneck in your software?
This topic is addressed by the comp.lang.c FAQ:
http://c-faq.com/struct/passret.html
When large structures are passed by value, this is commonly optimized by actually passing the address of the object rather than a copy. The callee then determines whether a copy needs to be made, or whether it can simply work with the original object.
The const qualifier on the parameter makes no difference. It is not part of the type; it is simply ignored. That is to say, these two function declarations are equivalent:
int foo(int);
int foo(const int);
It's possible for the declaration to omit the const, but for the definition to have it and vice versa. The optimization of the call cannot hinge on this const in the declaration. That const is not what creates the semantics that the object is passed by value and hence the original cannot be modified.
The optimization has to preserve the semantics; it has to look as if a copy of the object was really passed.
There are two ways you can tell that a copy was not passed: one is that a modification to the apparent copy affects the original. The other way is to compare addresses. For instance:
int compare(struct foo *ptr, struct foo copy);
Inside compare we can take the address of copy and see whether it is equal to ptr. If the optimization takes place even though we have done this, then it reveals itself to us.
The second declaration is actually a direct request by the user to receive a copy of the passed struct.
const modifier eliminates the possibility of any modifications made to the local copy, however, it is does not eliminate all the reasons for copying.
Firstly, the copy has to maintain its address identity, meaning that inside the second function the &myStruct expression should produce a value different from the address of any other MyStruct object. A smart compiler can, of course, detect the situations that depend on the address identity of the object.
Secondly, aliasing presents another problem. Imagine that the program has a global pointer MyStruct *global_struct and inside the second function someone modifies the *global_struct. There's a possibility that the *global_struct is the same struct object that was passed to the function as an argument. If no copy was made, the modifications made to *global_struct will be visible through the local parameter, which is a disaster. Aliasing issues are much more difficult (and in general case impossible) to resolve at compilation time, which is why compilers usually won't be able to optimize out the copying.
So, I would expect any compiler to perform the copying, as requested.

Type-casting in C for standard function

I am working on a C project (still pretty new to C), and I am trying to remove all of the warnings when it's compiled.
The original coders of this project have made a type called dyn_char (dynamic char arr) and it's an unsigned char * type. Here's a copy of one of the warnings:
warning: argument #1 is incompatible with prototype: prototype:
pointer to char : ".../stdio_iso.h", line 210 argument : pointer to
unsigned char
They also use lots of standard string functions like strlen(); so the way that I have been removing these warnings is like this:
strlen((char *)myDynChar);
I can do this but some of the files have hundreds of these warnings. I could do a Find and Replace to search for strlen( and replace with strlen((char*), but is there a better way?
Is it possible to use a Macro to do something similar? Maybe something like this:
#define strlen(s) strlen((char *)s)
Firstly, would this work? Secondly, if so, is it a bad idea to do this?
Thanks!
This is an annoying problem, but here's my two cents on it.
First, if you can confidently change the type of dyn_char to just be char *, I would do that. Perhaps if you have a robust test program or something you can try it out and see if it still works?
If not, you have two choices as far as I can see: fix what's going into strlen(), or have your compiler ignore those warnings (or ignore them yourself)! I'm not one for ignoring warnings unless I have to, but as far as fixing what goes into strlen...
If your underlying type is unsigned char *, then casting what goes into strlen() is basically telling the compiler to assume that the argument, for the purposes of being passed to strlen(), is a char *. If strlen() is the only place this is causing an issue and you can't safely change the type, then I'd consider a search-and-replace to add in casts to be the preferable option. You could redefine strlen with a #define like you suggested (I just tried it out and it worked for me), but I would strongly recommend not doing this. If anything, I'd search-replace strlen() with USTRLEN() or something (a fake function name), and then use that as your casting macro. Overriding C library functions with your own names transparently is a maintainability nightmare!
Two points on that: first, you're using a different name. Second, you're using all-caps, as is the convention for defining such a macro.
This may or may not work
strlen may be a macro in the system header, in which case you will get a warning about redefining a macro, and you won't get the functionality of the existing macro.
(The Visual Studio stdlib does many interesting things with macros in <string.h>. strcpy is defined this way:
__DEFINE_CPP_OVERLOAD_STANDARD_FUNC_0_1(char *, __RETURN_POLICY_DST, __EMPTY_DECLSPEC, strcpy, _Pre_cap_for_(_Source) _Post_z_, char, _Dest, _In_z_ const char *, _Source))
I wouldn't be surprised at all if #defining strcpy broke this)`
Search and replace may be your best option. Don't hide subtle differences like this behind macros - you will just pass your pain on to the next maintainer.
Instead of adding a cast to all the calls, you may want to change all the calls to dyn_strlen, which is a function you create that calls strlen with the appropriate cast.
You can define a function:
char *uctoc(unsigned char*p){ return (char*)(p); }
and do the search replace strstr(x with strstr(uctoc(x). At least you can have some type checking. Later you can convert uctoc to a macro for performance.

Is it good practice to ALWAYS cast variables in C?

I'm writing some C code and use the Windows API. I was wondering if it was in any way good practice to cast the types that are obviously the same, but have a different name? For example, when passing a TCHAR * to strcmp(), which expects a const char *. Should I do, assuming I want to write strict and in every way correct C, strcmp((const char *)my_tchar_string, "foo")?
Don't. But also don't use strcmp() but rather _tcscmp() (or even the safe alternatives).
_tcs* denotes a whole set of C runtime (string) functions that will behave correctly depending on how TCHAR gets translated by the preprocessor.
Concerning safe alternatives, look up functions with a trailing _s and otherwise named as the classic string functions from the C runtime. There is another set of functions that returns HRESULT, but it is not as compatible with the C runtime.
No, casting that away is not safe because TCHAR is not always equal to char. Instead of casting, you should pick a function that works with a TCHAR. See http://msdn.microsoft.com/en-us/library/e0z9k731(v=vs.71).aspx
Casting is generally a bad idea. Casting when you don't need to is terrible practice.
Think what happens if you change the type of the variable you are casting? Suppose that at some future date you change my_tchar_string to be wchar_t* rather than char*. Your code will still compile but will behave incorrectly.
One of your primary goals when writing C code is to minimise the number of casts in your code.
My advice would be to just avoid TCHAR (and associated functions) completely. Their real intent was to allow a single code base to compile natively for either 16-bit or 32-bit versions of Windows -- but the 16-bit versions of Windows are long gone, and with them the real reason to write code like this.
If you want/need to support wide characters, do it. If you're fine with only narrow/multibyte characters, do that. At least IME, trying to sit on the fence and do some of both generally means you end up not doing either one well. It also means roughly doubling the amount of testing necessary without even coming close to doubling the functionality you provide to the user.

Function pointer cast to different signature

I use a structure of function pointers to implement an interface for different backends. The signatures are very different, but the return values are almost all void, void * or int.
struct my_interface {
void (*func_a)(int i);
void *(*func_b)(const char *bla);
...
int (*func_z)(char foo);
};
But it is not required that a backends supports functions for every interface function. So I have two possibilities, first option is to check before every call if the pointer is unequal NULL. I don't like that very much, because of the readability and because I fear the performance impacts (I haven't measured it, however). The other option is to have a dummy function, for the rare cases an interface function doesn't exist.
Therefore I'd need a dummy function for every signature, I wonder if it is possible to have only one for the different return values. And cast it to the given signature.
#include <stdio.h>
int nothing(void) {return 0;}
typedef int (*cb_t)(int);
int main(void)
{
cb_t func;
int i;
func = (cb_t) nothing;
i = func(1);
printf("%d\n", i);
return 0;
}
I tested this code with gcc and it works. But is it sane? Or can it corrupt the stack or can it cause other problems?
EDIT: Thanks to all the answers, I learned now much about calling conventions, after a bit of further reading. And have now a much better understanding of what happens under the hood.
By the C specification, casting a function pointer results in undefined behavior. In fact, for a while, GCC 4.3 prereleases would return NULL whenever you casted a function pointer, perfectly valid by the spec, but they backed out that change before release because it broke lots of programs.
Assuming GCC continues doing what it does now, it will work fine with the default x86 calling convention (and most calling conventions on most architectures), but I wouldn't depend on it. Testing the function pointer against NULL at every callsite isn't much more expensive than a function call. If you really want, you may write a macro:
#define CALL_MAYBE(func, args...) do {if (func) (func)(## args);} while (0)
Or you could have a different dummy function for every signature, but I can understand that you'd like to avoid that.
Edit
Charles Bailey called me out on this, so I went and looked up the details (instead of relying on my holey memory). The C specification says
766 A pointer to a function of one type may be converted to a pointer to a function of another type and back again;
767 the result shall compare equal to the original pointer.
768 If a converted pointer is used to call a function whose type is not compatible with the pointed-to type, the behavior is undefined.
and GCC 4.2 prereleases (this was settled way before 4.3) was following these rules: the cast of a function pointer did not result in NULL, as I wrote, but attempting to call a function through a incompatible type, i.e.
func = (cb_t)nothing;
func(1);
from your example, would result in an abort. They changed back to the 4.1 behavior (allow but warn), partly because this change broke OpenSSL, but OpenSSL has been fixed in the meantime, and this is undefined behavior which the compiler is free to change at any time.
OpenSSL was only casting functions pointers to other function types taking and returning the same number of values of the same exact sizes, and this (assuming you're not dealing with floating-point) happens to be safe across all the platforms and calling conventions I know of. However, anything else is potentially unsafe.
I suspect you will get an undefined behaviour.
You can assign (with the proper cast) a pointer to function to another pointer to function with a different signature, but when you call it weird things may happen.
Your nothing() function takes no arguments, to the compiler this may mean that he can optimize the usage of the stack as there will be no arguments there. But here you call it with an argument, that is an unexpected situation and it may crash.
I can't find the proper point in the standard but I remember it says that you can cast function pointers but when you call the resulting function you have to do with the right prototype otherwise the behaviour is undefined.
As a side note, you should not compare a function pointer with a data pointer (like NULL) as thee pointers may belong to separate address spaces. There's an appendix in the C99 standard that allows this specific case but I don't think it's widely implemented. That said, on architecture where there is only one address space casting a function pointer to a data pointer or comparing it with NULL, will usually work.
You do run the risk of causing stack corruption. Having said that, if you declare the functions with extern "C" linkage (and/or __cdecl depending on your compiler), you may be able to get away with this. It would be similar then to the way a function such as printf() can take a variable number of arguments at the caller's discretion.
Whether this works or not in your current situation may also depend on the exact compiler options you are using. If you're using MSVC, then debug vs. release compile options may make a big difference.
It should be fine. Since the caller is responsible for cleaning up the stack after a call, it shouldn't leave anything extra on the stack. The callee (nothing() in this case) is ok since it wont try to use any parameters on the stack.
EDIT: this does assume cdecl calling conventions, which is usually the default for C.
As long as you can guarantee that you're making a call using a method that has the caller balance the stack rather than the callee (__cdecl). If you don't have a calling convention specified the global convention could be set to something else. (__stdcall or __fastcall) Both of which could lead to stack corruption.
This won't work unless you use implementation-specific/platform-specific stuff to force the correct calling convention. For some calling conventions the called function is responsible for cleaning up the stack, so they must know what's been pushed on.
I'd go for the check for NULL then call - I can't imagine it would have any impact on performance.
Computers can check for NULL about as fast as anything they do.
Casting a function pointer to NULL is explicitly not supported by the C standard. You're at the mercy of the compiler writer. It works OK on a lot of compilers.
It is one of the great annoyances of C that there is no equivalent of NULL or void* for function pointers.
If you really want your code to be bulletproof, you can declare your own nulls, but you need one for each function type. For example,
void void_int_NULL(int n) { (void)n; abort(); }
and then you can test
if (my_thing->func_a != void_int_NULL) my_thing->func_a(99);
Ugly, innit?

Resources