In my SAX XML parsing callback (Xcode 4, LLVM), I make a lot of calls to
this type of code:
static const char* kFoo = "Bar";
void SaxCallBack(char* sax_string,.....)
{
if ( strcmp(sax_string, kFoo, strlen(kFoo) ) == 0)
{
}
}
Is it safe to assume that strlen(kFoo) is optimized by the compiler?
(The Apple sample code had pre-calculated strlen(kFoo), but I think this is error prone for large numbers of constant strings.)
Edit: Motivation for optimizing: parsing my SVG map on iPod touch 2G takes 5 seconds (!) using NSXMLParser. So, I want to switch to libxml2, and optimize the string comparisons.
If by "LLVM" you mean clang, then yes, you can count on clang -O to optimize the strlen away. Here is what the code for your function looks like:
_SaxCallBack:
Leh_func_begin1:
pushq %rbp
Ltmp0:
movq %rsp, %rbp
Ltmp1:
leaq L_.str1(%rip), %rsi
movl $3, %edx
callq _strncmp
...
I changed the strcmp into strncmp, but the third argument has indeed been replaced by the immediate $3.
Note that gcc 4.2.1 -O3 does not optimize this strlen call, and that you can only expect it to work in the precise conditions of your question (especially, the string and the call to strlen must be in the same file).
Don't write things like:
static const char* kFoo = "Bar";
You've created a variable named kFoo that points to constant data. The compiler might be able to detect that this variable does not change and optimize it out, but if not, you've bloated your program's data segment.
Also don't write things like:
static const char *const kFoo = "Bar";
Now your variable kFoo is const-qualified and non-modifiable, but if it's used in position independent code (shared libraries etc.), the contents will still vary at runtime and thus it will add startup and memory cost to your program. Instead, use:
static const char kFoo[] = "Bar";
or even:
#define kFoo "Bar"
In general you can't count on it. However, you could use 'sizeof' and apply it to a string literal. Of course, this means that you can't define 'kFoo' the way it was originally defined. Note that 'sizeof' also counts the terminating null character, so subtract 1 to get the same value as strlen.
The following should work on all compilers and at all optimization levels.
#define kFoo "..."
... strncmp(... sizeof(kFoo) - 1)
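As a minimal sketch of the sizeof approach, assuming the constant is declared as an array rather than a pointer (the helper name starts_with_foo is hypothetical and not from the question):

```c
#include <assert.h>
#include <string.h>

/* Array, not pointer: sizeof yields the size of the literal, NUL included. */
static const char kFoo[] = "Bar";

/* Hypothetical helper: returns 1 when s begins with kFoo, 0 otherwise. */
int starts_with_foo(const char *s)
{
    assert(s != NULL);
    /* sizeof(kFoo) - 1 is a compile-time constant equal to strlen("Bar"). */
    return strncmp(s, kFoo, sizeof(kFoo) - 1) == 0;
}
```

Because the length is a constant expression, it does not depend on the optimizer recognizing strlen. Applied to a pointer such as the original kFoo, sizeof would instead give the size of the pointer, which is why the declaration has to change.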
Follow-up question:
Have you tested the following ?
static std::string const kFoo = "BAR";
void SaxCallBack(char* sax_string,.....)
{
if ( sax_string == kFoo)
{
}
}
It's a net win in readability, but I have no idea about the performance cost.
As an alternative, if you must dispatch by yourself, I have found that using a state-machine like approach (with stack) is much better readability-wise, and might also win performance-wise (instead of having a large number of tags to switch on you only have the tags that can be met right now).
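A rough sketch of what such a state-machine dispatcher with a stack might look like in C. The tag names ("svg", "g"), the state names and the functions on_start/on_end are all hypothetical and only illustrate the idea that each state compares against the few tags that are valid at that point:

```c
#include <assert.h>
#include <string.h>

enum state { ST_ROOT, ST_SVG, ST_GROUP, ST_IGNORED };

#define MAX_DEPTH 32

struct parser {
    enum state stack[MAX_DEPTH]; /* one entry per open element */
    int depth;                   /* number of open elements */
};

static enum state current(const struct parser *p)
{
    return p->depth > 0 ? p->stack[p->depth - 1] : ST_ROOT;
}

/* Call for every start tag. Returns 0 on success, -1 if nesting is too deep. */
int on_start(struct parser *p, const char *name)
{
    enum state next = ST_IGNORED;

    assert(p != NULL && name != NULL);
    if (p->depth >= MAX_DEPTH)
        return -1;

    switch (current(p)) {
    case ST_ROOT:                /* only <svg> is expected at the top level */
        if (strcmp(name, "svg") == 0)
            next = ST_SVG;
        break;
    case ST_SVG:
    case ST_GROUP:               /* only <g> is handled inside svg or g */
        if (strcmp(name, "g") == 0)
            next = ST_GROUP;
        break;
    case ST_IGNORED:             /* children of unknown elements are skipped */
        break;
    }
    p->stack[p->depth++] = next;
    return 0;
}

/* Call for every end tag. Returns 0 on success, -1 on unbalanced input. */
int on_end(struct parser *p)
{
    assert(p != NULL);
    if (p->depth == 0)
        return -1;
    p->depth--;
    return 0;
}
```

Each SAX start-element callback would call on_start and each end-element callback on_end, so the number of string comparisons per tag depends on the current state rather than on the total number of known tags.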
Related
In a C99 program, under the (theoretical) assumption that I'm not using variable-length arrays, and each of my automatic variables can only exist once at a time in the whole stack (by forbidding circular function calls and explicit recursion), if I sum up all the space they are consuming, could I declare that this is the maximal stack size that can ever happen?
A bit of context here: I told a friend that I wrote a program that uses no dynamic memory allocation ("malloc") and allocates all memory statically (by modeling all my state variables in a struct, which I then declared global). He then told me that if I'm using automatic variables, I still make use of dynamic memory. I argued that my automatic variables are not state variables but control variables, so my program is still to be considered static. We then discussed that there has to be a way to make a statement about the absolute worst-case behaviour of my program, so I came up with the above question.
Bonus question: If the assumptions above hold, I could simply declare all automatic variables static and would end up with a "truly" static program?
Even if array sizes are constant, a C implementation could allocate arrays and even structures dynamically. I'm not aware of any that do (anyone?) and it would appear quite unhelpful. But the C Standard doesn't make such guarantees.
There is also (almost certainly) some further overhead in the stack frame (the data added to the stack on call and released on return).
You would need to declare all your functions as taking no parameters and returning void to ensure no program variables in the stack. Finally the 'return address' of where execution of a function is to continue after return is pushed onto the stack (at least logically).
So having moved all parameters, automatic variables and return values to your 'state' struct, there will still be something going onto the stack - probably.
I say probably because I'm aware of a (non-standard) embedded C compiler that forbids recursion and can determine the maximum size of the stack by examining the call tree of the whole program and identifying the call chain that reaches the peak size of the stack.
You could achieve this with a monstrous pile of goto statements (some conditional, where a function is logically called from two places) or by duplicating code.
It's often important in embedded code on devices with tiny memory to avoid any dynamic memory allocation and know that any 'stack-space' will never overflow.
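To illustrate the call-tree analysis mentioned above, here is a small sketch of how a tool could compute the worst-case stack depth of a recursion-free program. The table layout, the frame sizes and the name worst_stack are hypothetical; real frame sizes depend on the compiler and ABI and would have to come from the toolchain:

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CALLEES 4

/* Hypothetical description of one function in a recursion-free program. */
struct func {
    size_t frame_bytes;        /* assumed frame size, return address included */
    int callees[MAX_CALLEES];  /* indices into the table, -1 ends the list */
};

/* Worst-case stack use on entering function f: its own frame plus the
   deepest chain among its callees. Terminates only if the call graph is
   acyclic, which is exactly the "no recursion" assumption. */
size_t worst_stack(const struct func *table, int f)
{
    size_t deepest = 0;
    int i;

    assert(table != NULL && f >= 0);
    for (i = 0; i < MAX_CALLEES && table[f].callees[i] >= 0; i++) {
        size_t d = worst_stack(table, table[f].callees[i]);
        if (d > deepest)
            deepest = d;
    }
    return table[f].frame_bytes + deepest;
}
```

Note that the result is the maximum over call chains, not the sum of all automatic variables in the program: two functions that are never active at the same time can share the same stack space. The analysis itself uses recursion, which is fine since it would run on the development host rather than on the target.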
I'm happy this is a theoretical discussion. What you suggest is a mad way to write code and would throw away most of the (ultimately limited) services C provides as infrastructure for procedural coding (pretty much the call stack).
Footnote: See the comment below about the 8-bit PIC architecture.
Bonus question: If the assumptions above hold, I could simply declare
all automatic variables static and would end up with a "truly" static
program?
No. This would change the function of the program. static variables are initialized only once.
Compare these two functions:
int canReturn0Or1(void)
{
static unsigned a=0;
a++;
if(a>1)
{
return 1;
}
return 0;
}
int willAlwaysReturn0(void)
{
unsigned a=0;
a++;
if(a>1)
{
return 1;
}
return 0;
}
In a C99 program, under the (theoretical) assumption that I'm not using variable-length arrays, and each of my automatic variables can only exist once at a time in the whole stack (by forbidding circular function calls and explicit recursion), if I sum up all the space they are consuming, could I declare that this is the maximal stack size that can ever happen?
No, because of function pointers..... Read n1570.
Consider the following code, where rand(3) is some pseudo random number generator (it could also be some input from a sensor) :
typedef int foosig(int);
int foo(int x) {
foosig* fptr = (x>rand())?&foo:NULL;
if (fptr)
return (*fptr)(x);
else
return x+rand();
}
An optimizing compiler (such as some recent GCC suitably invoked with enough optimizations) would make a tail-recursive call for (*fptr)(x). Other compilers won't.
Depending on how you compile that code, it would use a bounded stack or could produce a stack overflow. With some ABIs and calling conventions, both the argument and the result could go through a processor register and won't consume any stack space.
Experiment with a recent GCC (e.g. on Linux/x86-64, some GCC 10 in 2020) invoked as gcc -O2 -fverbose-asm -S foo.c then look inside foo.s. Change the -O2 to a -O0.
Observe that the naive recursive factorial function could be compiled into some iterative machine code with a good enough C compiler and optimizer. In practice GCC 10 on Linux compiling the below code:
int fact(int n)
{
if (n<1) return 1;
else return n*fact(n-1);
}
as gcc -O3 -fverbose-asm tmp/fact.c -S -o tmp/fact.s produces the following assembler code:
.type fact, @function
fact:
.LFB0:
.cfi_startproc
endbr64
# tmp/fact.c:3: if (n<1) return 1;
movl $1, %eax #, <retval>
testl %edi, %edi # n
jle .L1 #,
.p2align 4,,10
.p2align 3
.L2:
imull %edi, %eax # n, <retval>
subl $1, %edi #, n
jne .L2 #,
.L1:
# tmp/fact.c:5: }
ret
.cfi_endproc
.LFE0:
.size fact, .-fact
.ident "GCC: (Ubuntu 10.2.0-5ubuntu1~20.04) 10.2.0"
And you can observe that the call stack is not increasing above.
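For comparison, the loop in that assembly corresponds roughly to the following hand-written C. The name fact_iter is hypothetical, and this is only a reading of the generated code, not something GCC emits as C:

```c
/* Hand-written equivalent of the compiled loop: no call, no stack growth. */
int fact_iter(int n)
{
    int result = 1;   /* movl $1, %eax */

    while (n >= 1) {  /* testl/jle on entry, jne at the bottom of the loop */
        result *= n;  /* imull %edi, %eax */
        n -= 1;       /* subl $1, %edi */
    }
    return result;
}
```

Whether a given compiler performs this transformation is exactly the point of the answer: the C source alone does not tell you how much stack the recursive version will use.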
If you have serious and documented arguments against GCC, please submit a bug report.
BTW, you could write your own GCC plugin which would choose to randomly apply or not such an optimization. I believe it stays conforming to the C standard.
The above optimization is essential for many compilers generating C code, such as Chicken/Scheme or Bigloo.
A related theorem is Rice's theorem. See also this draft report funded by the CHARIOT project.
See also the Compcert project.
Consider:
#include <stdio.h>
char toUpper(char);
int main(void)
{
char ch, ch2;
printf("lowercase input: ");
ch = getchar();
ch2 = toUpper(ch);
printf("%c ==> %c\n", ch, ch2);
return 0;
}
char toUpper(char c)
{
if(c>='a' && c<='z')
c = c - 32;
}
In the toUpper function, the return type is char, but there isn't any "return" in toUpper(). I compiled the source code with gcc (GCC) 4.5.1 20100924 (Red Hat 4.5.1-4) on Fedora 14.
Of course, a warning is issued: "warning: control reaches end of non-void function", but the program works well.
What happened to that code when it was compiled with gcc?
When the C program was compiled into assembly language, your toUpper function ended up like this, perhaps:
_toUpper:
LFB4:
pushq %rbp
LCFI3:
movq %rsp, %rbp
LCFI4:
movb %dil, -4(%rbp)
cmpb $96, -4(%rbp)
jle L8
cmpb $122, -4(%rbp)
jg L8
movzbl -4(%rbp), %eax
subl $32, %eax
movb %al, -4(%rbp)
L8:
leave
ret
The subtraction of 32 was carried out in the %eax register. And in the x86 calling convention, that is the register in which the return value is expected to be! So... you got lucky.
But please pay attention to the warnings. They are there for a reason!
It depends on the Application Binary Interface and which registers are used for the computation.
E.g. on x86, the return value is stored in EAX, so gcc is most likely using that register to hold the result of the calculation as well.
Essentially, c is pushed into the spot that should later be filled with the return value; since it's not overwritten by use of return, it ends up as the value returned.
Note that relying on this (in C, or any other language where this isn't an explicit language feature, like Perl), is a Bad Idea™. In the extreme.
One missing thing that's important to understand is that it's rarely a diagnosable error to omit a return statement. Consider this function:
int f(int x)
{
if (x!=42) return x*x;
}
As long as you never call it with an argument of 42, a program containing this function is perfectly valid C and does not invoke any undefined behavior, despite the fact that it would invoke UB if you called f(42) and subsequently attempted to use the return value.
As such, while it's possible for a compiler to provide warning heuristics for missing return statements, it's impossible to do so without false positives or false negatives. This is a consequence of the impossibility of solving the halting problem.
I can't tell you the specifics of your platform as I don't know it, but there is a general answer to the behaviour you see.
When some function that has a return is compiled, the compiler will use a convention on how to return that data. It could be a machine register, or a defined memory location such as via a stack or whatever (though generally machine registers are used). The compiled code may also use that location (register or otherwise) while doing the work of the function.
If the function doesn't return anything, then the compiler will not generate code that explicitly fills that location with a return value. However, like I said above, it may use that location during the function. When you write code that reads the return value (ch2 = toUpper(ch);), the compiler will write code that uses its convention on how to retrieve that return value from the conventional location. As far as the caller code is concerned, it will just read that value from the location, even if nothing was written explicitly there. Hence you get a value.
Now look at Ray's example. The compiler used the EAX register to store the results of the upper casing operation. It just so happens, this is probably the location that return values are written to. On the calling side, ch2 is loaded with the value that's in EAX - hence a phantom return. This is only true of the x86 range of processors, as on other architectures the compiler may use a completely different scheme in deciding how the convention should be organised.
However, good compilers will try to optimise according to a set of local conditions, knowledge of code, rules, and heuristics. So an important thing to note is that it is just luck that this works. The compiler could optimise and not do this or whatever - you should not rely on the behaviour.
You should keep in mind that such code may crash depending on the compiler. For example, Clang may generate a ud2 instruction at the end of such a function, and your app will then crash at run time.
There are no local variables, so the value on the top of the stack at the end of the function will be the parameter c. The value at the top of the stack upon exiting, is the return value. So whatever c holds, that's the return value.
I have tried a small program:
#include <stdio.h>
int f1() {
}
int main() {
printf("TEST: <%d>\n", f1());
printf("TEST: <%d>\n", f1());
printf("TEST: <%d>\n", f1());
printf("TEST: <%d>\n", f1());
printf("TEST: <%d>\n", f1());
}
Result:
TEST: <1>
TEST: <10>
TEST: <11>
TEST: <11>
TEST: <11>
I have used the MinGW-GCC compiler, so there might be differences.
You could just play around and try, e.g., a char function.
As long as you don't use the result value, it will still work fine.
#include <stdio.h>
char f1() {
}
int main() {
f1();
}
But I would still recommend either declaring the function void or returning some value.
Your function seems to need a return:
char toUpper(char c)
{
if(c>='a'&&c<='z')
c = c - 32;
return c;
}
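The c - 32 arithmetic in the corrected function assumes an ASCII-compatible character set. A possible alternative, sketched here under the hypothetical name to_upper_portable, is to delegate to the standard toupper from <ctype.h>:

```c
#include <ctype.h>

/* Hypothetical name, chosen so it does not clash with the question's toUpper. */
char to_upper_portable(char c)
{
    /* toupper expects a value representable as unsigned char (or EOF),
       hence the cast before the call. */
    return (char)toupper((unsigned char)c);
}
```

Every path through this function ends in a return statement, so the "control reaches end of non-void function" warning does not apply.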
I am using C99 under GCC.
I have a function declared static inline in a header that I cannot modify.
The function never returns but is not marked __attribute__((noreturn)).
How can I call the function in a way that tells the compiler it will not return?
I am calling it from my own noreturn function, and partly want to suppress the "noreturn function returns" warning but also want to help the optimizer etc.
I have tried including a declaration with the attribute but get a warning about the repeated declaration.
I have tried creating a function pointer and applying the attribute to that, but it says the function attribute cannot apply to a pointed function.
In the function you defined, which calls the external function, add a call to __builtin_unreachable, which is built into at least the GCC and Clang compilers and is marked noreturn. In fact, this function does nothing else and should not be called. It's only here so that the compiler can infer that program execution will stop at this point.
static inline void external_function(void) // lacks the noreturn attribute
{ /* does not return */ }
__attribute__((noreturn)) void your_function(void) {
external_function(); // the compiler thinks execution may continue ...
__builtin_unreachable(); // ... and now it knows it won't go beyond here
}
Edit: Just to clarify a few points raised in the comments, and generally give a bit of context:
A function has only two ways of not returning: loop forever, or short-circuit the usual control-flow (e.g. throw an exception, jump out of the function, terminate the process, etc.)
In some cases, the compiler may be able to infer and prove through static analysis that a function will not return. Even theoretically, this is not always possible, and since we want compilers to be fast only obvious/easy cases are detected.
__attribute__((noreturn)) is an annotation (like const) which is a way for the programmer to inform the compiler that they are absolutely sure a function will not return. Following the trust-but-verify principle, the compiler tries to prove that the function does indeed not return. It may then issue an error if it proves the function may return, or a warning if it was not able to prove whether the function returns or not.
__builtin_unreachable has undefined behaviour because it is not meant to be called. It's only meant to help the compiler's static analysis. Indeed the compiler knows that this function does not return, so any following code is provably unreachable (except through a jump).
Once the compiler has established (either by itself, or with the programmer's help) that some code is unreachable, it may use this information to do optimizations like these:
Remove the boilerplate code used to return from a function to its caller, if the function never returns
Propagate the unreachability information, i.e. if the only execution path to a code point is through unreachable code, then this point is also unreachable. Examples:
if a function does not return, any code following its call and not reachable through jumps is also unreachable. Example: code following __builtin_unreachable() is unreachable.
in particular, if the only path to a function's return is through unreachable code, the function can be marked noreturn. That's what happens for your_function.
any memory location / variable only used in unreachable code is not needed, therefore setting/computing the content of such data is not needed.
any computation which is provably (1) unnecessary (previous bullet) and (2) free of side effects (such as pure functions) may be removed.
Illustration:
The call to external_function cannot be removed because it might have side-effects. In fact, it probably has at least the side effect of terminating the process!
The return boilerplate of your_function may be removed.
Here's another example showing how code before the unreachable point may be removed
int compute(int) __attribute__((pure)); /* expensive computation, defined elsewhere */
if(condition) {
int x = compute(input); // (1) no side effect => keep if x is used
// (8) x is not used => remove
printf("hello "); // (2) reachable + side effect => keep
your_function(); // (3) reachable + side effect => keep
// (4) unreachable beyond this point
printf("word!\n"); // (5) unreachable => remove
printf("%d\n", x); // (6) unreachable => remove
// (7) mark 'x' as unused
} else {
// follows unreachable code, but can jump here
// from reachable code, so this is reachable
do_stuff(); // keep
}
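Putting the pieces of this answer together, a compilable sketch might look like the following. It assumes GCC or Clang (both __attribute__((noreturn)) and __builtin_unreachable are extensions); external_function here is only a stand-in for the header function that cannot be modified, and fail and checked_div are hypothetical names:

```c
#include <stdlib.h>

/* Stand-in for the unmodifiable header function: it never returns,
   but carries no noreturn annotation. */
static inline void external_function(void)
{
    exit(EXIT_FAILURE);
}

/* Hypothetical noreturn wrapper around the unannotated function. */
__attribute__((noreturn)) static void fail(void)
{
    external_function();
    __builtin_unreachable(); /* undefined behaviour if ever actually reached */
}

/* Hypothetical caller: the compiler may treat code after fail() as dead. */
int checked_div(int a, int b)
{
    if (b == 0)
        fail();
    return a / b;
}
```

Callers then use fail() instead of external_function(), and the compiler can treat everything after the call as unreachable without complaining that a noreturn function returns.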
Several solutions:
redeclaring your function with the __attribute__
You should try to modify that function in its header by adding __attribute__((noreturn)) to it.
You can redeclare some functions with new attribute, as this stupid test demonstrates (adding an attribute to fopen) :
#include <stdio.h>
extern FILE *fopen (const char *__restrict __filename,
const char *__restrict __modes)
__attribute__ ((warning ("fopen is used")));
void
show_map_without_care (void)
{
FILE *f = fopen ("/proc/self/maps", "r");
do
{
char lin[64];
fgets (lin, sizeof (lin), f);
fputs (lin, stdout);
}
while (!feof (f));
fclose (f);
}
overriding with a macro
Finally, you could define a macro like
#define func(A) {func(A); __builtin_unreachable();}
(this uses the fact that inside a macro, the macro name is not macro-expanded).
If your never-returning func is declared as returning e.g. int you'll use a statement expression like
#define func(A) ({func(A); __builtin_unreachable(); (int)0; })
Macro-based solutions like above won't always work, e.g. if func is passed as a function pointer, or simply if some guy codes (func)(1) which is legal but ugly.
redeclaring a static inline with the noreturn attribute
And the following example:
// file ex.c
// declare exit without any standard header
void exit (int);
// define myexit as a static inline
static inline void
myexit (int c)
{
exit (c);
}
// redeclare it as noreturn
static inline void myexit (int c) __attribute__ ((noreturn));
int
foo (int *p)
{
if (!p)
myexit (1);
if (p)
return *p + 2;
return 0;
}
when compiled with GCC 4.9 (from Debian/Sid/x86-64) as gcc -S -fverbose-asm -O2 ex.c, gives an assembly file containing the expected optimization:
.type foo, @function
foo:
.LFB1:
.cfi_startproc
testq %rdi, %rdi # p
je .L5 #,
movl (%rdi), %eax # *p_2(D), *p_2(D)
addl $2, %eax #, D.1768
ret
.L5:
pushq %rax #
.cfi_def_cfa_offset 16
movb $1, %dil #,
call exit #
.cfi_endproc
.LFE1:
.size foo, .-foo
You could play with #pragma GCC diagnostic to selectively disable a warning.
Customizing GCC with MELT
Finally, you could customize your recent gcc using the MELT plugin and coding your simple extension (in the MELT domain specific language) to add the attribute noreturn when encountering the desired function. It is probably a dozen MELT lines, using register_finish_decl_first and a match on the function name.
Since I am the main author of MELT (free software GPLv3+) I could perhaps even code that for you if you ask, e.g. here or preferably on gcc-melt@googlegroups.com; give the concrete name of your never-returning function.
The MELT code would probably look like:
;;file your_melt_mode.melt
(module_is_gpl_compatible "GPLv3+")
(defun my_finish_decl (decl)
(let ( (tdecl (unbox :tree decl))
)
(match tdecl
(?(tree_function_decl_named
?(tree_identifier ?(cstring_same "your_function_name")))
;;; code to add the noreturn attribute
;;; ....
))))
(register_finish_decl_first my_finish_decl)
The real MELT code is slightly more complex. You want to define your_adding_attr_mode there. Ask me for more.
Once you coded your MELT extension your_melt_mode.melt for your needs (and compiled that MELT extension into your_melt_mode.quicklybuilt.so as documented in the MELT tutorials) you'll compile your code with
gcc -fplugin=melt \
-fplugin-arg-melt-extra=your_melt_mode.quicklybuilt \
-fplugin-arg-melt-mode=your_adding_attr_mode \
-O2 -I/your/include -c yourfile.c
In other words, you just add a few -fplugin-* flags to your CFLAGS in your Makefile !
BTW, I'm just coding something quite similar in the MELT monitor (on github: https://github.com/bstarynk/melt-monitor ..., file meltmom-process.melt).
With a MELT extension, you won't get any additional warning, since the MELT extension would alter the internal GCC AST (a GCC Tree) of the declared function on the fly!
Customizing GCC with MELT is probably the most bullet-proof solution, since it is modifying the GCC internal AST. Of course, it is probably the most costly solution (and it is GCC specific and might need -small- changes when GCC is evolving, e.g. when using the next version of GCC), but as I am trying to show it is quite easy in your case.
PS. In 2019, GCC MELT is an abandoned project. If you want to customize GCC (for any recent version of GCC, e.g. GCC 7, 8, or 9), you need to write your own GCC plugin in C++.
It is said that we can write multiple declarations but only one definition. Now if I implement my own strcpy function with the same prototype :
char * strcpy ( char * destination, const char * source );
Then am I not redefining the existing library function? Shouldn't this display an error? Or is it somehow related to the fact that the library functions are provided in object code form?
EDIT: Running the following code on my machine says "Segmentation fault (core dumped)". I am working on Linux and have compiled without using any flags.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char *strcpy(char *destination, const char *source);
int main(){
char *s = strcpy("a", "b");
printf("\nThe function ran successfully\n");
return 0;
}
char *strcpy(char *destination, const char *source){
printf("in duplicate function strcpy");
return "a";
}
Please note that I am not trying to implement the function. I am just trying to redefine a function and asking for the consequences.
EDIT 2:
After applying the suggested changes by Mats, the program no longer gives a segmentation fault although I am still redefining the function.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char *strcpy(char *destination, const char *source);
int main(){
char a[2];
char *s = strcpy(a, "b");
printf("\nThe function ran successfully\n");
return 0;
}
char *strcpy(char *destination, const char *source){
printf("in duplicate function strcpy");
return "a";
}
C11(ISO/IEC 9899:201x) §7.1.3 Reserved Identifiers
— Each macro name in any of the following subclauses (including the future library
directions) is reserved for use as specified if any of its associated headers is included;
unless explicitly stated otherwise.
— All identifiers with external linkage in any of the following subclauses (including the
future library directions) are always reserved for use as identifiers with external
linkage.
— Each identifier with file scope listed in any of the following subclauses (including the
future library directions) is reserved for use as a macro name and as an identifier with
file scope in the same name space if any of its associated headers is included.
If the program declares or defines an identifier in a context in which it is reserved, or defines a reserved identifier as a macro name, the behavior is undefined. Note that this doesn't mean you can't do that; as this post shows, it can be done with gcc and glibc.
glibc §1.3.3 Reserved Names provides a clearer reason:
The names of all library types, macros, variables and functions that come from the ISO C standard are reserved unconditionally; your program may not redefine these names. All other library names are reserved if your program explicitly includes the header file that defines or declares them. There are several reasons for these restrictions:
Other people reading your code could get very confused if you were using a function named exit to do something completely different from what the standard exit function does, for example. Preventing this situation helps to make your programs easier to understand and contributes to modularity and maintainability.
It avoids the possibility of a user accidentally redefining a library function that is called by other library functions. If redefinition were allowed, those other functions would not work properly.
It allows the compiler to do whatever special optimizations it pleases on calls to these functions, without the possibility that they may have been redefined by the user. Some library facilities, such as those for dealing with variadic arguments (see Variadic Functions) and non-local exits (see Non-Local Exits), actually require a considerable amount of cooperation on the part of the C compiler, and with respect to the implementation, it might be easier for the compiler to treat these as built-in parts of the language.
That's almost certainly because you are passing in a destination that is a "string literal".
char *s = strcpy("a", "b");
In addition, the compiler knows "I can do strcpy inline", so your function never gets called.
You are trying to copy "b" over the string literal "a", and that won't work.
Make a char a[2]; and strcpy(a, "b"); and it will run - it probably won't call your strcpy function, because the compiler inlines small strcpy calls even if you don't have optimisation enabled.
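A small sketch of that suggestion: copy into a writable array rather than into a string literal, and check that the source fits. The helper name copy_into is hypothetical; it simply wraps the standard strcpy:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: copies src into a caller-supplied writable buffer.
   Returns dest on success, or NULL if src (plus its NUL) does not fit. */
char *copy_into(char *dest, size_t dest_size, const char *src)
{
    assert(dest != NULL && src != NULL);
    if (strlen(src) + 1 > dest_size)
        return NULL;   /* refuse instead of overflowing the buffer */
    return strcpy(dest, src);
}
```

With char a[2]; the call copy_into(a, sizeof a, "b") succeeds because a is ordinary writable storage, whereas passing the literal "a" as the destination asks strcpy to write into memory that may be read-only.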
Putting the matter of trying to modify non-modifiable memory aside, keep in mind that you are formally not allowed to redefine standard library functions.
However, in some implementations you might notice that providing another definition for a standard library function does not trigger the usual "multiple definition" error. This happens because in such implementations standard library functions are defined as so-called "weak symbols". For example, the GCC standard library is known for that.
The direct consequence of that is that when you define your own "version" of a standard library function with external linkage, your definition overrides the "weak" standard definition for the entire program. You will notice that not only does your code now call your version of the function, but all calls from all pre-compiled [third-party] libraries are also dispatched to your definition. It is intended as a feature, but you have to be aware of it to avoid "using" this feature inadvertently.
You can read about it here, for one example
How to replace C standard library function ?
This feature of the implementation doesn't violate the language specification, since it operates within uncharted area of undefined behavior not governed by any standard requirements.
Of course, the calls that use intrinsic/inline implementation of some standard library function will not be affected by the redefinition.
Your question is misleading.
The problem that you see has nothing to do with the re-implementation of a library function.
You are just trying to write to non-writable memory, that is, the memory where the string literal "a" exists.
To put it simply, the following program gives a segmentation fault on my machine (compiled with gcc 4.7.3, no flags):
#include <string.h>
int main(int argc, const char *argv[])
{
strcpy("a", "b");
return 0;
}
But then, why the segmentation fault if you are calling a version of strcpy (yours) that doesn't write the non-writable memory? Simply because your function is not being called.
If you compile your code with the -S flag and have a look at the assembly code that the compiler generates for it, there will be no call to strcpy (because the compiler has "inlined" that call, the only relevant call that you can see from main, is a call to puts).
.file "test.c"
.section .rodata
.LC0:
.string "a"
.align 8
.LC1:
.string "\nThe function ran successfully"
.text
.globl main
.type main, @function
main:
.LFB2:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
subq $16, %rsp
movw $98, .LC0(%rip)
movq $.LC0, -8(%rbp)
movl $.LC1, %edi
call puts
movl $0, %eax
leave
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE2:
.size main, .-main
.section .rodata
.LC2:
.string "in duplicate function strcpy"
.text
.globl strcpy
.type strcpy, @function
strcpy:
.LFB3:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
subq $16, %rsp
movq %rdi, -8(%rbp)
movq %rsi, -16(%rbp)
movl $.LC2, %edi
movl $0, %eax
call printf
movl $.LC0, %eax
leave
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE3:
.size strcpy, .-strcpy
.ident "GCC: (Ubuntu/Linaro 4.7.3-1ubuntu1) 4.7.3"
I think Yu Hao's answer has a great explanation for this, in the quote from the glibc manual:
The names of all library types, macros, variables and functions that
come from the ISO C standard are reserved unconditionally; your
program may not redefine these names. All other library names are
reserved if your program explicitly includes the header file that
defines or declares them. There are several reasons for these
restrictions:
[...]
It allows the compiler to do whatever special optimizations it pleases
on calls to these functions, without the possibility that they may
have been redefined by the user.
Your example can work this way (with strdup):
#include <stdio.h>
#include <string.h>
char *strcpy(char *destination, const char *source);
int main(){
char *s = strcpy(strdup("a"), strdup("b"));
printf("\nThe function ran successfully\n");
return 0;
}
char *strcpy(char *destination, const char *source){
printf("in duplicate function strcpy");
return strdup("a");
}
output :
in duplicate function strcpy
The function ran successfully
The way to interpret this rule is that you cannot have multiple definitions of a function end up in the final linked object (the executable). So, if all the objects included in the link have only one definition of a function, then you are good. Keeping this in mind, consider the following scenarios.
Let's say you redefine a function somefunction() that is defined in some library. Your function is in main.c (main.o) and in the library the function is in an object named someobject.o. Remember that in the final link, the linker only looks for unresolved symbols in the libraries. Because somefunction() is resolved already from main.o, the linker does not even look for it in the libraries and does not pull in someobject.o. The final link has only one definition of the function, and things are fine.
Now imagine that there is another symbol anotherfunction() defined in someobject.o that you also happen to call. The linker will try to resolve anotherfunction() from someobject.o, and pull it in from the library, and it will become a part of the final link. Now you have two definitions of somefunction() in the final link - one from main.o and another from someobject.o, and the linker will throw an error.
I use this one frequently:
void my_strcpy(char *dest, char *src)
{
int i;
i = 0;
while (src[i])
{
dest[i] = src[i];
i++;
}
dest[i] = '\0';
}
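You can also get a bounded version by modifying just one line. Note that, unlike the standard strncpy, it always terminates dest, so dest needs room for n + 1 bytes:
void my_strncpy(char *dest, char *src, int n)
{
int i;
i = 0;
/* check the bound first so src[n] is never read */
while (i < n && src[i])
{
dest[i] = src[i];
i++;
}
dest[i] = '\0';
}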
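As a hedged sketch of the same idea, a variant that takes the size of the destination buffer makes the bound harder to get wrong. The name copy_bounded and its return value are my own assumptions, not part of the answer above:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from the original answer): copies at most
   dest_size - 1 characters from src and always null-terminates dest.
   Returns the number of characters copied, excluding the terminator. */
size_t copy_bounded(char *dest, size_t dest_size, const char *src)
{
    size_t i = 0;

    assert(dest != NULL && src != NULL);
    if (dest_size == 0)
        return 0;               /* no room even for the terminator */

    /* Stop one short of dest_size so the terminator always fits. */
    while (i + 1 < dest_size && src[i] != '\0') {
        dest[i] = src[i];
        i++;
    }
    dest[i] = '\0';
    return i;
}
```

Called as copy_bounded(buf, sizeof buf, src), it truncates src when it does not fit instead of writing past the end of buf.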
Religious arguments aside:
Option1:
if (pointer[i] == NULL) ...
Option2:
if (!pointer[i]) ...
In C, is Option1 functionally equivalent to Option2?
Does the latter resolve quicker due to the absence of a comparison?
I prefer the explicit style (first version). It makes it obvious that there is a pointer involved and not an integer or something else but it's just a matter of style.
From a performance point of view, it should make no difference.
Equivalent. It says so in the language standard. And people have the damndest religious preferences!
I like the second, other people like the first.
Actually, I prefer a third kind to the first:
if (NULL == ptr) {
...
}
Because then I:
won't be able to miss and just type one =
won't miss the == NULL and mistake it for the opposite if the condition is long (multiple lines)
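A minimal sketch of that constant-on-the-left style, in case it helps. The function count_null is a made-up example, not something from the question:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: counts the NULL entries among the first n slots. */
size_t count_null(char *const *items, size_t n)
{
    size_t count = 0;
    size_t i;

    assert(NULL != items || 0 == n);
    for (i = 0; i < n; i++) {
        /* Constant on the left: typing `NULL = items[i]` by mistake is
           rejected by the compiler, because NULL is not an lvalue. */
        if (NULL == items[i])
            count++;
    }
    return count;
}
```

With the operands the other way round, the same typo (items[i] = NULL) would compile and silently overwrite the slot.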
Functionally they are equivalent.
Even if a NULL pointer is not "0" (all zero bits), if (!ptr) compares with the NULL pointer.
The following is incorrect. It's still here because there are many comments referring to it:
Do not compare a pointer with literal zero, however. It will work almost everywhere but is undefined behavior IIRC.
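To make the equivalence concrete, here is a small sketch with the three spellings side by side. The function names are hypothetical; the point is only that each form tests against the null pointer, and that a literal 0 in a pointer comparison is a null pointer constant, so that form is well defined too:

```c
#include <assert.h>
#include <stddef.h>

/* Three spellings of the same test. Each one compares p with the
   null pointer, whatever bit pattern the platform uses for it. */
int is_null_explicit(const void *p) { return p == NULL; }
int is_null_literal(const void *p)  { return p == 0; }
int is_null_implicit(const void *p) { return !p; }

/* Hypothetical self-check: the three forms agree for any pointer. */
void check_forms_agree(const void *p)
{
    assert(is_null_explicit(p) == is_null_literal(p));
    assert(is_null_explicit(p) == is_null_implicit(p));
}
```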
It is often useful to assume that compiler writers have at least a minimum of intelligence. Your compiler is not written by concussed ducklings. It is written by human beings, with years of programming experience, and years spent studying compiler theory. This doesn't mean that your compiler is perfect, and always knows best, but it does mean that it is perfectly capable of handling trivial automatic optimizations.
If the two forms are equivalent, then why wouldn't the compiler just translate one into the other to ensure both are equally efficient?
If if (pointer[i] == NULL) was slower than if (!pointer[i]), wouldn't the compiler just change it into the second, more efficient form?
So no, assuming they are equivalent, they are equally efficient.
As for the first part of the question, yes, they are equivalent. The language standard actually states this explicitly somewhere -- a pointer evaluates to true if it is non-NULL, and false if it is NULL, so the two are exactly identical.
Almost certainly no difference in performance. I prefer the implicit style of the second, though.
NULL is defined in several standard header files (for example <stddef.h>), typically like this:
#define NULL ((void*)0)
So either way, you are comparing against zero, and the compiler should optimize both the same way. Every processor has some "optimization" or opcode for comparing with zero.
Early optimization is bad. Micro-optimization is also bad: unless you are trying to squeeze every last Hz out of your CPU, there is no point in doing it. As people have already shown, the compiler will optimize most of your code away anyway.
It's best to make your code as concise and readable as possible. If this is more readable
if (!ptr)
than this
if (NULL==ptr)
then use it. As long as everyone who will be reading your code agrees.
Personally I use the fully spelled-out form (NULL==ptr) so it is clear what I am checking for. It might be longer to type, but I can easily read it. I'd think the ! in !ptr would be easy to miss if reading too quickly.
It really depends on the compiler. I'd be surprised if most modern C compilers didn't generate virtually identical code for the specific scenario you describe.
Get your compiler to generate an assembly listing for each of those scenarios and you can answer your own question (for your particular compiler :)).
And even if they are different, the performance difference will probably be irrelevant in practical applications.
Turn on compiler optimization and they're basically the same
tested this on gcc 4.3.3
int main (int argc, char** argv) {
char c = getchar();
int x = (c == 'x');
if(x == NULL)
putchar('y');
return 0;
}
vs
int main (int argc, char** argv) {
char c = getchar();
int x = (c == 'x');
if(!x)
putchar('y');
return 0;
}
gcc -O -o test1 test1.c
gcc -O -o test2 test2.c
diff test1 test2
produced no output :)
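Note that the test above compares an int with NULL rather than a pointer. A pointer-based variant might look like the sketch below. The function names are my own, and strchr is used only so that the compiler cannot know the pointer's value in advance. Putting both functions in one file (say, a hypothetical nulltest.c) and compiling with gcc -O -S lets you compare the generated code for the two directly:

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if s contains no 'x'. Explicit comparison with NULL. */
int lacks_x_explicit(const char *s)
{
    const char *hit;

    assert(s != NULL);
    hit = strchr(s, 'x');       /* NULL when 'x' is not found */
    if (hit == NULL)
        return 1;
    return 0;
}

/* Same test, written with the implicit form. */
int lacks_x_implicit(const char *s)
{
    const char *hit;

    assert(s != NULL);
    hit = strchr(s, 'x');
    if (!hit)
        return 1;
    return 0;
}
```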
I did an assembly dump, and found the difference between the two versions:
@@ -11,8 +11,7 @@
pushl %ecx
subl $20, %esp
movzbl -9(%ebp), %eax
- movsbl %al,%eax
- testl %eax, %eax
+ testb %al, %al
It looks like the latter actually generates one instruction and the first generates two, but this is pretty unscientific.
This is gcc, no optimizations:
test1.c:
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char *argv[])
{
char *pointer[5];
if(pointer[0] == NULL) {
exit(1);
}
exit(0);
}
test2.c: Change pointer[0] == NULL to !pointer[0]
gcc -S test1.c, gcc -S test2.c, diff -u test1.s test2.s
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char *argv[])
{
char pointer[5];
/* This makes no sense: you are comparing a char value to a pointer */
if(pointer[0] == NULL) {
exit(1);
}
...
}
=> ...
movzbl 9(%ebp), %eax # your code compares a 1 byte value to a signed 4 bytes one
movsbl %al,%eax # Will result in sign extension...
testl %eax, %eax
...
Beware: gcc should have emitted a warning here. If it did not, compile with the -Wall flag.
That said, you should always compile with optimization enabled.
BTW, declare your variable with the volatile keyword to keep gcc from optimizing it away...
Always mention your compiler build version :)
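As a rough sketch of the volatile suggestion, here is how the pointer test from earlier could be written so the check is less likely to be folded away. The function name slot_is_null and the SLOT_COUNT macro are hypothetical. Unlike the original test program, the array is initialized, so the read is well defined:

```c
#include <assert.h>
#include <stddef.h>

#define SLOT_COUNT 5

/* Hypothetical variant of the test program, written as a function.
   The elements are volatile-qualified, so the compiler has to perform
   the load instead of assuming it already knows the value. */
int slot_is_null(size_t index)
{
    /* First element set explicitly; the rest are zero-initialized,
       which for pointers means null pointers. */
    char *volatile pointer[SLOT_COUNT] = { NULL };

    assert(index < SLOT_COUNT);
    return pointer[index] == NULL;
}
```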