#include <graphics.h> //importing graphics
#include <conio.h>
int main()
{
int gd=0,gm;
initgraph(&gd,&gm," ");
circle(100,80,20);
getch();
closegraph();
}
I typed the above code in Code::Blocks, but it does not execute. Instead it says:
warning: deprecated conversion from string constant to 'char*' [-Wwrite-strings]
I have been trying for three days. Someone please help me out.
I have installed winbgim.h and other necessary files for graphics but it's not helping. I have searched all possible websites...
Preamble: I deleted and then undeleted this answer several times. There are other questions that address this subject, but most of them seem to be focused on C++, which has some differences from C in this area, or to have been closed as a dupe of such a question. Some have answers I don't care for, or that are incomplete. In a possible act of hubris, therefore, I am providing this answer, and I choose to provide it here.
Since the only string constant I see in the code is the " " argument to initgraph(), the warning must arise from there. In that case, the third parameter to that function must be declared as type char *.
In that case, there is nothing inherently wrong with the code you presented. According to the standard, string constants correspond to arrays of char, and such an array is automatically converted to a pointer to its first element in most contexts where it is evaluated, including when it appears as a function argument. That lines up perfectly with the function.
The problem is that the standard also says that if a program attempts to modify the contents of a string literal then it produces undefined behavior. It is not uncommon for the actual manifestation of that to be a program crash. Therefore, although it is technically legal, you take a significant risk when you convert a string constant to a char *, because whoever handles the pointer may not know that it is unsafe to attempt to use it to modify what it points to. This problem does not arise if you convert to const char * instead.
GCC wants to help you detect such problems. Therefore its -Wwrite-strings option, when enabled, causes string constants to be treated as arrays of const char instead of arrays of char, and therefore to trigger warnings such as you encountered.
You have at least three options:
You could disable the -Wwrite-strings compilation option. This is probably the best approach for legacy code with well-tested behavior, but I do not recommend it otherwise.
Provided that the function in fact does not attempt to modify the array that its parameter points to, directly or indirectly, you can change the function parameter to type const char *. This may require a cascade of other, similar changes to avoid additional, relevant warnings.
Avoid passing a string literal to the function. Instead create and pass a mutable array of char. For example:
int gd = 0, gm;
char s[] = " ";
initgraph(&gd, &gm, s);
// ...
Note in particular that there, although there is a string literal, it is used only to initialize the distinct, non-const array s.
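For the second option, here is a minimal sketch. The name path_length is hypothetical and stands in for any function that only reads its string argument; it is not part of the graphics library. With a const char * parameter, passing a string literal should no longer trigger the -Wwrite-strings warning.

```c
#include <stddef.h>

/* Hypothetical helper: it only reads its argument, so the parameter
   is const-qualified and a string literal can be passed safely. */
size_t path_length(const char *path)
{
    size_t n = 0;
    while (path[n] != '\0')
        n++;
    return n;
}
```

A call such as path_length(" ") then compiles cleanly, because a literal converts to const char * without discarding any qualifier.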
Related
I'm trying to fix a bug in very old C code that just popped up recently on Windows after we updated Visual Studio to version 2017. The code still runs on several linux platforms. That tells me we're probably relying on some undefined behavior, and we previously got lucky.
We have a bunch of function calls like this:
get_some_data(parent, "ge", "", type);
Running in debug, I noticed that upon entry to this function, that empty string is immediately filled with garbage, before the function has done anything. The function is declared like this:
static void get_some_data(
KEY Parent,
char *Prefix,
char *Suffix,
ENT EntType)
So is it unwise to pass the strings directly ("ge", "")? I know it's trivial to fix this case by declaring char *suffix="" and passing suffix instead of "", but I'm now questioning whether I need to go through this entire suite of code looking for this type of function call.
So is it unwise to pass the strings directly ("ge", "")?
There is nothing inherently wrong with passing string literals to functions in general.
However, it is unwise to pass pointers (in)to string literals specifically to parameters declared as pointers to non-const char, because C specifies that undefined behavior results from attempting to modify a string literal, and in practice, that UB often manifests as abrupt program termination. If a function declares a parameter as const char * then you can reasonably take that as a promise that it will not attempt to modify the target of that pointer -- which is what you need to ensure -- but if it declares a parameter as just char * then no such promise is made, and the function doesn't even have a way to check at runtime whether the argument is writable.
Possibly you can rely on documentation in place of const-qualification, for you're ok in this regard as long as no attempt is made in practice to modify a string literal, but that still leaves you more open to bugs than you otherwise would be.
I know it's trivial to fix this case by declaring char *suffix="" and passing suffix instead of ""
Such a change may disguise what you're doing from the compiler, so that it does not warn about the function call, but it does not fix anything. The same pointer value is passed to the function either way, and the same semantics and constraints apply. Also, if the compiler warned about the function call then it should also warn about the assignment.
This is not an issue in C++, by the way, or at least not the same issue, because in C++, string literals represent arrays of const char in the first place.
, but I'm now questioning whether I need to go through this entire suite of code looking for this type of function call.
Better might be to modify the signatures of the called functions. Where you intend for it to be ok to pass a string literal, ensure that the parameter has type const char *, like so:
static void get_some_data(
KEY Parent,
const char *Prefix,
const char *Suffix,
ENT EntType)
But do note that is highly likely to cause new warnings about violations of const-correctness. To ensure safety, you need to fix these, too, without casting away constness. This could well cascade broadly, but the exercise will definitely help you identify and fix places where your code was mishandling string literals.
On the other hand, a genuine fix that might be less pervasive would be to pass pointers to modifiable arrays instead of (unmodifiable) string literals. Perhaps that's what you had in mind with your proposed fix, but the correct way to do that is this:
char prefix[] = "ge";
char suffix[] = "";
get_some_data(parent, prefix, suffix, type);
Here, prefix and suffix are separate (modifiable) local arrays, initialized with copies of the string literals' contents.
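A small sketch of why the array form is safe. The function capitalize_first is hypothetical; it stands in for a legacy function that writes through its char * parameter, since the body of get_some_data is not shown in the question.

```c
#include <ctype.h>

/* Hypothetical stand-in for a legacy function that modifies its argument. */
void capitalize_first(char *text)
{
    if (text[0] != '\0')
        text[0] = (char)toupper((unsigned char)text[0]);
}

/* Safe caller: the literal only initializes a local, writable array,
   so the write inside capitalize_first() touches the copy. */
char first_after_capitalize(void)
{
    char prefix[] = "ge";
    capitalize_first(prefix);
    return prefix[0];
}
```

Calling capitalize_first("ge") directly, or through char *p = "ge", would instead attempt to modify the literal itself, which is undefined behavior.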
With all that said, I'm inclined to suspect that if you're getting bona fide runtime errors related to these arguments with VS-compiled executables but not GCC-compiled ones, then the source of those is probably something else. My first guess would be that array bounds are being overrun. My second guess would be that you are compiling C code as C++, and running afoul of one or more of the (other) differences between them.
That's not to say that you shouldn't take a good look at the constness / writability concerns involved here, but it would suck to go through the whole exercise just to find out that you were ok to begin with. You could still end up with better code, but that's a little tricky to sell to the boss when they ask why the bug hasn't been fixed yet.
No, there is absolutely nothing wrong with passing a string literal, empty or not. Quite the opposite — if you try to "fix" your code by doing your trivial change, you will hide the bug and make life harder for whoever is going to fix it in future.
I encountered this same problem and the cause was a similar situation in a recently executed function where, crucially, the contents of the string were being changed.
Prior to the call where this strange behaviour was being observed, there was a call to older code with a parameter of PSTR type. Not realizing that the contents of that parameter were going to be changed, a programmer had supplied an empty string. The code was only ever updating the first character so declaring a char type parameter and supplying the address of that was sufficient to solve the problem that was exhibiting in the later call.
I recently ran into a question. I know that a pointer initialized from a string constant, as in the code below, points into the .rodata region, and that this region is read-only.
However, I saw in the C11 standard that writing to this memory address is undefined behavior.
I am aware that Borland's Turbo C compiler can write where the pointer points. Is that because the processor operated in real mode on some systems of the time, such as MS-DOS? Or is it independent of the operating mode of the processor? Is there any other compiler that writes to the pointer and does not take any memory breach failure using the processor in protected mode?
#include <stdio.h>
int main(void) {
char *st = "aaa";
*st = 'b';
return 0;
}
When this code is compiled with Turbo C on MS-DOS, the write to memory succeeds.
As has been pointed out, trying to modify a constant string in C results in undefined behavior. There are several reasons for this.
One reason is that the string may be placed in read-only memory. This allows it to be shared across multiple instances of the same program, and doesn't require the memory to be saved to disk if the page it's on is paged out (since the page is read-only and thus can be reloaded later from the executable). It also helps detect run-time errors by giving an error (e.g. a segmentation fault) if an attempt is made to modify it.
Another reason is that the string may be shared. Many compilers (e.g., gcc) will notice when the same literal string appears more than once in a compilation unit, and will share the same storage for it. So if a program modifies one instance, it could affect others as well.
There is also never a need to do this, since the same intended effect can easily be achieved by using a static character array. For instance:
#include <stdio.h>
int main(void) {
static char st_arr[] = "aaa";
char *st = st_arr;
*st = 'b';
return 0;
}
This does exactly what the posted code attempted to do, but without any undefined behavior. It also takes the same amount of memory. In this example, the string "aaa" is used as an array initializer, and does not have any storage of its own. The array st_arr takes the place of the constant string from the original example, but (1) it will not be placed in read-only memory, and (2) it will not be shared with any other references to the string. So it's safe to modify it, if in fact that's what you want.
Is there any other compiler that writes to the pointer and does not take any memory breach failure using the processor in protected mode?
GCC 3 and earlier used to support gcc -fwritable-strings to let you compile old K&R C where this was apparently legal, according to https://gcc.gnu.org/onlinedocs/gcc-3.3.6/gcc/Incompatibilities.html. (It's undefined behaviour in ISO C and thus a bug in an ISO C program.) That option defines the behaviour of the assignment which ISO C leaves undefined.
GCC 3.3.6 manual - C Dialect options
-fwritable-strings
Store string constants in the writable data segment and don't uniquize them. This is for compatibility with old programs which assume they can write into string constants.
Writing into string constants is a very bad idea; “constants” should be constant.
GCC 4.0 removed that option (release notes); the last release of the GCC 3 series was 3.4.6, in March 2006, although apparently the option had already become buggy in that version.
gcc -fwritable-strings would treat string literals like non-const anonymous character arrays (see #gnasher's answer), so they go in the .data section instead of .rodata, and thus get linked into a segment of the executable that's mapped to read+write pages, not read-only. (Executable segments have basically nothing to do with x86 segmentation, it's just a start+range memory-mapping from the executable file to memory.)
And it would disable duplicate-string merging, so char *foo() { return "hello"; } and char *bar() { return "hello"; } would return different pointer values, instead of merging identical string literals.
Related:
How can some GCC compilers modify a constant char pointer?
https://softwareengineering.stackexchange.com/questions/294748/why-are-c-string-literals-read-only
Linker option: still Undefined Behaviour so probably not viable
On GNU/Linux, linking with ld -N (--omagic) will make the text (as well as data) section read+write. This may apply to .rodata even though modern GNU Binutils ld puts .rodata in its own section (normally with read but not exec permission) instead of making it part of .text. Having .text writeable could easily be a security problem: you never want a page with write+exec at the same time, otherwise some bugs like buffer overflows can turn into code-injection attacks.
To do this from gcc, use gcc -Wl,-N to pass on that option to ld when linking.
This doesn't do anything about it being Undefined Behaviour to write const objects. e.g. the compiler will still merge duplicate strings, so writing into one char *foo = "hello"; will affect all other uses of "hello" in the whole program, even across files.
What to use instead:
If you want something writeable, use static char foo[] = "hello"; where the quoted string is just an array initializer for a non-const array. As a bonus, this is more efficient than static char *foo = "hello"; at global scope, because there's one fewer level of indirection to get to the data: it's just an array instead of a pointer stored in memory.
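A brief sketch of the array form. The names greeting, greeting_size and shout_greeting are made up for illustration. Because greeting is an array, sizeof reports the size of the data itself (five letters plus the terminator) and the bytes are writable.

```c
#include <stddef.h>

/* Writable array: the literal is only an initializer here. */
static char greeting[] = "hello";

/* Hypothetical accessor: sizeof an array is the size of its contents. */
size_t greeting_size(void)
{
    return sizeof greeting;
}

/* Modifying the array is well defined, unlike writing through a
   pointer to a string literal. */
char *shout_greeting(void)
{
    greeting[0] = 'H';
    return greeting;
}
```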
You are asking whether or not the platform may cause undefined behavior to be defined. The answer to that question is yes.
But you are also asking whether or not the platform defines this behavior. In fact it does not.
Under some optimization hints, the compiler will merge string constants, so that writing to one constant will write to the other uses of that constant. I used this compiler once, it was quite capable of merging strings.
Don't write this code. It's not good. You will regret writing code in this style when you move onto a more modern platform.
Your literal "aaa" produces a static array of four const char 'a', 'a', 'a', '\0' in an anonymous location and returns a pointer to the first 'a', cast to char*.
Trying to modify any of the four characters is undefined behaviour. Undefined behaviour can do anything, from modifying the char as intended, pretending to modify the char, doing nothing, or crashing.
It's basically the same as static const char anonymous[4] = { 'a', 'a', 'a', '\0' }; char* st = (char*) &anonymous [0];
To add to the correct answers above, DOS runs in real mode, so there is no read-only memory. All memory is flat and writable. Hence, writing to the literal (or to any sort of const variable) worked predictably in practice at the time, even though the language standard leaves it undefined.
The question is plain and simple, s is a string, I suddenly got the idea to try to use printf(s) to see if it would work and I got a warning in one case and none in the other.
char* s = "abcdefghij\n";
printf(s);
// Warning raised with gcc -std=c11:
// format not a string literal and no format arguments [-Wformat-security]
// On the other hand, if I use
char* s = "abc %d efg\n";
printf(s, 99);
// I get no warning whatsoever, why is that?
// Update, I've tested this:
char* s = "random %d string\n";
printf(s, 99, 50);
// Results: no warning, output "random 99 string".
So what's the underlying difference between printf(s) and printf("%s", s) and why do I get a warning in just one case?
In the first case, the non-literal format string could perhaps come from user code or user-supplied (run-time) data, in which case it might contain %s or other conversion specifications, for which you've not passed the data. This can lead to all sorts of reading problems (and writing problems if the string includes %n — see printf() or your C library's manual pages).
In the second case, the format string controls the output and it doesn't matter whether any string to be printed contains conversion specifications or not (though the code shown prints an integer, not a string). The compiler (GCC or Clang is used in the question) assumes that because there are arguments after the (non-literal) format string, the programmer knows what they're up to.
The first is a 'format string' vulnerability. You can search for more information on the topic.
GCC knows that most times the single argument printf() with a non-literal format string is an invitation to trouble. You could use puts() or fputs() instead. It is sufficiently dangerous that GCC generates the warnings with the minimum of provocation.
The more general problem of a non-literal format string can also be problematic if you are not careful — but extremely useful assuming you are careful. You have to work harder to get GCC to complain: it requires both -Wformat and -Wformat-nonliteral to get the complaint.
From the comments:
So ignoring the warning, as if I really know what I am doing and there will be no errors, is one or another more efficient to use or are they the same? Considering both space and time.
Of your three printf() statements, given the tight context that the variable s is as assigned immediately above the call, there is no actual problem. But you could use puts(s) if you omitted the newline from the string or fputs(s, stdout) as it is and get the same result, without the overhead of printf() parsing the entire string to find out that it is all simple characters to be printed.
The second printf() statement is also safe as written; the format string matches the data passed. There is no significant difference between that and simply passing the format string as a literal — except that the compiler can do more checking if the format string is a literal. The run-time result is the same.
The third printf() passes more data arguments than the format string needs, but that is benign. It isn't ideal, though. Again, the compiler can check better if the format string is a literal, but the run-time effect is practically the same.
From the printf() specification linked to at the top:
Each of these functions converts, formats, and prints its arguments under control of the format. The format is a character string, beginning and ending in its initial shift state, if any. The format is composed of zero or more directives: ordinary characters, which are simply copied to the output stream, and conversion specifications, each of which shall result in the fetching of zero or more arguments. The results are undefined if there are insufficient arguments for the format. If the format is exhausted while arguments remain, the excess arguments shall be evaluated but are otherwise ignored.
In all these cases, there is no strong indication of why the format string is not a literal. However, one reason for wanting a non-literal format string might be that sometimes you print the floating point numbers in %f notation and sometimes in %e notation, and you need to choose which at run-time. (If it is simply based on value, %g might be appropriate, but there are times when you want the explicit control — always %e or always %f.)
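As a sketch of that run-time choice, the hypothetical helper below picks between two fixed, trusted format strings and writes into a caller-supplied buffer with snprintf(). The name format_value and its parameters are assumptions made for illustration. GCC would be expected to flag the snprintf() call only when -Wformat-nonliteral is enabled.

```c
#include <stdio.h>

/* Hypothetical helper: choose %f or %e notation at run time.
   Both candidate formats are literals in the program, so no
   untrusted data ever reaches the format parameter. */
int format_value(char *out, size_t out_size, double value, int scientific)
{
    const char *format = scientific ? "%.2e" : "%.2f";
    return snprintf(out, out_size, format, value);
}
```

Both formats consume exactly one double, so the argument list matches whichever one is selected.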
The warning says it all.
First, to discuss about the issue, as per the signature, the first parameter to printf() is a format string which can contain format specifiers (conversion specifier). In case, a string contains a format specifier and the corresponding argument is not supplied, it invokes undefined behavior.
So, a cleaner (or safer) approach (of printing a string which needs no format specification) would be puts(s); over printf(s); (the former does not process s for any conversion specifiers, removing the reason for the possible UB in the latter case). You can choose fputs(), if you're worried about the ending newline that automatically gets added in puts().
That said, regarding the warning option, -Wformat-security from the online gcc manual
At present, this warns about calls to printf and scanf functions where the format string is not a string literal and there are no format arguments, as in printf (foo);. This may be a security hole if the format string came from untrusted input and contains %n.
In your first case, there's only one argument supplied to printf(), which is not a string literal, rather a variable, which can be very well generated/ populated at run time, and if that contains unexpected format specifiers, it may invoke UB. Compiler has no way to check for the presence of any format specifier in that. That is the security problem there.
In the second case, the accompanying argument is supplied, the format specifier is not the only argument passed to printf(), so the first argument need not to be verified. Hence the warning is not there.
Update:
Regarding the third one, with more arguments than required by the supplied format string
printf(s, 99, 50);
quoting from C11, chapter §7.21.6.1
[...] If the format is exhausted while arguments remain, the excess arguments are
evaluated (as always) but are otherwise ignored. [...]
So, passing excess argument is not a problem (from the compiler perspective) at all and it is well defined. NO scope for any warning there.
There are two things in play in your question.
The first is covered succinctly by Jonathan Leffler - the warning you're getting is because the string isn't literal and doesn't have any format specifiers in it.
The other is the mystery of why the compiler doesn't issue a warning that your number of arguments doesn't match the number of specifiers. The short answer is "because it doesn't," but more specifically, printf is a variadic function. It takes any number of arguments after the initial format specification - from 0 on up. The compiler can't check to see if you gave the right amount; that's up to the printf function itself, and leads to the undefined behavior that Joachim mentioned in comments.
EDIT:
I'm going to give further answer to your question, as a means of getting on a small soapbox.
What's the difference between printf(s) and printf("%s", s)? Simple - in the latter, you're using printf as it's declared. "%s" is a const char *, and it will subsequently not generate the warning message.
In your comments to other answers, you mentioned "Ignoring the warning...". Don't do this. Warnings exist for a reason, and should be resolved (otherwise they're just noise, and you'll miss warnings that actually matter among the cruft of all the ones that don't.)
Your issue can be resolved in several ways.
const char* s = "abcdefghij\n";
printf(s);
will resolve the warning, because you're now using a const pointer, and there are none of the dangers that Jonathan mentioned. (You could also declare it as const char* const s, but don't have to. The first const is important, because it then matches the declaration of printf, and because const char* s means that characters pointed to by s can't change, i.e. the string is a literal.)
Or, even simpler, just do:
printf("abcdefghij\n");
This is implicitly a const pointer, and also not a problem.
So what's the underlying difference between printf(s) and printf("%s", s)
"printf(s)" will treat s as a format string. If s contains format specifiers then printf will interpret them and go looking for varargs. Since no varargs actually exist this will likely trigger undefined behaviour.
If an attacker controls "s" then this is likely to be a security hole.
printf("%s",s) will just print what is in the string.
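To make the difference concrete, here is a small sketch that uses snprintf() so the result can be inspected. The helper name copy_verbatim is hypothetical. Because the text is passed as data for "%s" rather than as the format, any conversion specifications inside it are copied literally and never interpreted.

```c
#include <stdio.h>

/* Hypothetical helper: the format is a fixed literal and the
   possibly untrusted text is only an argument to it. */
int copy_verbatim(char *out, size_t out_size, const char *text)
{
    return snprintf(out, out_size, "%s", text);
}
```

Passing the same text as the format itself would make the library look for arguments that were never supplied.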
and why do I get a warning in just one case?
Warnings are a balance between catching dangerous stupidity and not creating too much noise.
C programmers are in the habit of using printf and various printf like functions* as generic print functions even when they don't actually need formatting. In this environment it's easy for someone to make the mistake of writing printf(s) without thinking about where s came from. Since formatting is pretty useless without any data to format printf(s) has little legitimate use.
printf(s, arguments) on the other hand indicates that the programmer deliberately intended formatting to take place.
Afaict this warning is not turned on by default in upstream gcc, but some distros are turning it on as part of their efforts to reduce security holes.
* Both standard C functions like sprintf and fprintf and functions in third party libraries.
The underlying reason: printf is declared like:
int printf(const char *fmt, ...) __attribute__ ((format(printf, 1, 2)));
This tells gcc that printf is a function with a printf-style interface where the format string comes first. IMHO it must be literal; I don't think there's a way to tell the good compiler that s is actually a pointer to a literal string it had seen before.
Read more about __attribute__ here.
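The same attribute can be attached to your own variadic functions so that GCC and Clang check their format strings too. This is only a sketch: the macro PRINTF_LIKE and the function format_into are hypothetical names, and the attribute is a compiler extension, so it is hidden behind a macro that expands to nothing elsewhere.

```c
#include <stdarg.h>
#include <stdio.h>

/* The format attribute is a GCC/Clang extension. */
#if defined(__GNUC__)
#define PRINTF_LIKE(fmt_index, first_arg) \
    __attribute__((format(printf, fmt_index, first_arg)))
#else
#define PRINTF_LIKE(fmt_index, first_arg)
#endif

/* Parameter 3 is the format string; checked arguments start at 4. */
int format_into(char *out, size_t out_size, const char *fmt, ...)
    PRINTF_LIKE(3, 4);

int format_into(char *out, size_t out_size, const char *fmt, ...)
{
    va_list args;
    va_start(args, fmt);
    int written = vsnprintf(out, out_size, fmt, args);
    va_end(args);
    return written;
}
```

With the attribute in place, a call whose arguments do not match a literal format should draw the same -Wformat diagnostics as a call to printf().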
In the code below, I have two different local char* variables declared in two different functions.
Each variable is initialized to point to a constant string, and the contents of the two strings are identical.
Checking in runtime, the variables are initialized to point to the same address in memory.
So the compiler must have assigned the same (constant) value to each one of them.
How is that possible?
#include <stdio.h>
void PrintPointer()
{
char* p = "abc";
printf("%p\n",p);
}
int main()
{
char* p = "abc";
printf("%p\n",p);
PrintPointer();
return 0;
}
It has nothing to do with the preprocessor. But the compiler is explicitly allowed (not required) by the standard to share the memory for identical string literals. For details on when this happens, you must consult your compiler's documentation.
For example, here's the relevant documentation for VC2013:
In some cases, identical string literals may be pooled to save space in the executable file. In string-literal pooling, the compiler causes all references to a particular string literal to point to the same location in memory, instead of having each reference point to a separate instance of the string literal. To enable string pooling, use the /GF compiler option.
The C++ standard says in N3797 2.14.5/12:
Whether all string literals are distinct (that is, are stored in nonoverlapping objects) is implementation defined. The effect of attempting to modify a string literal is undefined.
The C standard now contains the same wording. Historically it was possible to modify string literals at run-time in C, but this is now Undefined Behaviour. Some compilers may allow it, some not.
Technically, the compiler does it by storing string literals in the symbol table. If an identical string is seen more than once, the same symbolic reference is used each time. The same technique might well be used for other literals, but would not be so easily detected.
The preprocessor, by the way, has nothing to do with it.
How is that possible?
It's possible because the compiler keeps track of values like that. But no, the preprocessor generally doesn't get involved in things like this; the preprocessor does things like macro substitutions that modify the code before the compiler starts working. In this case, though, we're talking about actual code:
char* p = "abc";
and that's the domain of the compiler, not the preprocessor.
So the compiler must have assigned the same (constant) value to each one of them. How is that possible?
If you have two identical string literals, as you do here, then the compiler is allowed to combine them into a single one; apparently, your compiler does that. It's also allowed to store them separately.
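A small sketch for checking what a particular compiler does. The function names are hypothetical. The result is deliberately not predicted here, since either outcome is permitted and it may change with optimization settings.

```c
/* Two functions that each return a pointer to an identical literal. */
static const char *first_literal(void)  { return "abc"; }
static const char *second_literal(void) { return "abc"; }

/* Returns 1 if the compiler pooled the two literals into one object,
   0 if it stored them separately. Both results are allowed. */
int literals_share_storage(void)
{
    return first_literal() == second_literal();
}
```

Code should never depend on either answer; compare string contents with strcmp() rather than comparing addresses.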
I am working on a C project (still pretty new to C), and I am trying to remove all of the warnings when it's compiled.
The original coders of this project have made a type called dyn_char (dynamic char arr) and it's an unsigned char * type. Here's a copy of one of the warnings:
warning: argument #1 is incompatible with prototype: prototype:
pointer to char : ".../stdio_iso.h", line 210 argument : pointer to
unsigned char
They also use lots of standard string functions like strlen(); so the way that I have been removing these warnings is like this:
strlen((char *)myDynChar);
I can do this but some of the files have hundreds of these warnings. I could do a Find and Replace to search for strlen( and replace with strlen((char*), but is there a better way?
Is it possible to use a Macro to do something similar? Maybe something like this:
#define strlen(s) strlen((char *)s)
Firstly, would this work? Secondly, if so, is it a bad idea to do this?
Thanks!
This is an annoying problem, but here's my two cents on it.
First, if you can confidently change the type of dyn_char to just be char *, I would do that. Perhaps if you have a robust test program or something you can try it out and see if it still works?
If not, you have two choices as far as I can see: fix what's going into strlen(), or have your compiler ignore those warnings (or ignore them yourself)! I'm not one for ignoring warnings unless I have to, but as far as fixing what goes into strlen...
If your underlying type is unsigned char *, then casting what goes into strlen() is basically telling the compiler to assume that the argument, for the purposes of being passed to strlen(), is a char *. If strlen() is the only place this is causing an issue and you can't safely change the type, then I'd consider a search-and-replace to add in casts to be the preferable option. You could redefine strlen with a #define like you suggested (I just tried it out and it worked for me), but I would strongly recommend not doing this. If anything, I'd search-replace strlen() with USTRLEN() or something (a fake function name), and then use that as your casting macro. Overriding C library functions with your own names transparently is a maintainability nightmare!
Two points on that: first, you're using a different name. Second, you're using all-caps, as is the convention for defining such a macro.
This may or may not work
strlen may be a macro in the system header, in which case you will get a warning about redefining a macro, and you won't get the functionality of the existing macro.
(The Visual Studio stdlib does many interesting things with macros in <string.h>. strcpy is defined this way:
__DEFINE_CPP_OVERLOAD_STANDARD_FUNC_0_1(char *, __RETURN_POLICY_DST, __EMPTY_DECLSPEC, strcpy, _Pre_cap_for_(_Source) _Post_z_, char, _Dest, _In_z_ const char *, _Source))
I wouldn't be surprised at all if #defining strcpy broke this.)
Search and replace may be your best option. Don't hide subtle differences like this behind macros - you will just pass your pain on to the next maintainer.
Instead of adding a cast to all the calls, you may want to change all the calls to dyn_strlen, which is a function you create that calls strlen with the appropriate cast.
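A minimal sketch of such a wrapper, assuming (as the question states) that the project's dynamic strings are unsigned char pointers. The parameter is const-qualified here on the assumption that callers never need the wrapper to modify the string.

```c
#include <stddef.h>
#include <string.h>

/* Wrapper with its own name: the cast lives in exactly one place,
   and the parameter type still gives the compiler something to check. */
size_t dyn_strlen(const unsigned char *text)
{
    return strlen((const char *)text);
}
```

Unlike a macro that redefines strlen, this leaves the standard function untouched and rejects arguments of unrelated pointer types.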
You can define a function:
char *uctoc(unsigned char*p){ return (char*)(p); }
and do the search replace strstr(x with strstr(uctoc(x). At least you can have some type checking. Later you can convert uctoc to a macro for performance.