Which is faster: Increment or equation with addition arithmetic - c

Example:
a : ++i;
b : i++;
c : i += 1;
d : i = i + 1;
Assuming a, b, c, and d all start executing at exactly the same moment, which one will finish first?

Using gcc 5.2 to compile this program:
#include<stdio.h>
int main()
{
int i = 0;
++i;
i++;
i += 1;
i = i + 1;
return 0;
}
It gives this ASM:
main:
push rbp
mov rbp, rsp
mov DWORD PTR [rbp-4], 0
add DWORD PTR [rbp-4], 1 #++i
add DWORD PTR [rbp-4], 1 #i++
add DWORD PTR [rbp-4], 1 #i += 1
add DWORD PTR [rbp-4], 1 #i = i + 1
mov eax, 0
pop rbp
ret
This means that with gcc 5.2 all four statements execute at exactly the same speed.
The same appears to hold for gcc versions 4.4.7 through 5.2.
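To compare the four forms one at a time rather than inside a single main, each can be placed in its own function that returns the result. This is only a sketch; the function names are made up for illustration and do not come from the question. Compiling it with something like gcc -O2 -S should let you check whether your compiler emits the same instructions for all four.

```c
/* Four ways to add one; the names are illustrative, not from the question. */
int inc_pre(int i)      { ++i;       return i; }
int inc_post(int i)     { i++;       return i; }
int inc_compound(int i) { i += 1;    return i; }
int inc_plain(int i)    { i = i + 1; return i; }
```

All four return the same value for the same input, which is why the compiler is free to generate the same code for them.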

In this particular example all four expressions have the exact same externally observable result so a competent compiler should generate the exact same code for them.
The compiler doesn't slavishly read the code and generate a few instructions for each statement. It reasons about what the result of the code should be according to the standard and generates the code needed for the whole program to behave as required. Asking performance questions about single statements is therefore almost always meaningless. Let me show an example:
void foo(unsigned int a, unsigned int b) { unsigned int i = a * b; }
void bar(unsigned int a, unsigned int b) { unsigned int i = a + b; }
Which one is faster? Function foo or bar? Many would say "of course multiplication is slower", but most likely the answer is: both are equally fast because a very simple dead store optimization will see that nothing uses i, so there's no need to compute it, so the compiler can optimize the functions down to nothing. Let's try it:
$ cat > foo.c
void foo(unsigned int a, unsigned int b) { unsigned int i = a * b; }
void bar(unsigned int a, unsigned int b) { unsigned int i = a + b; }
$ cc -S -fomit-frame-pointer -O2 foo.c
$ cat foo.s
[... I edited out irrelevant spam to make this more readable ...]
_foo: ## @foo
retq
_bar: ## @bar
retq
The only instruction in both functions is retq which just returns from the function.
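If the result is actually used, for example by returning it, the compiler can no longer drop the computation. A minimal sketch, where foo_used and bar_used are hypothetical variants of foo and bar:

```c
/* Returning the value makes it observable, so it cannot be removed as a dead store. */
unsigned int foo_used(unsigned int a, unsigned int b) { return a * b; }
unsigned int bar_used(unsigned int a, unsigned int b) { return a + b; }
```

Compiling this the same way should show a multiply in one function and an add in the other. Whether that difference matters in a real program still depends on the target CPU and the surrounding code.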

Modern compilers are smart enough to optimize all of the four cases to improve the performance.
You should note that in the last expression i = i+1, i will be evaluated twice.

In programming, unary operators have higher precedence than other operators, so they are executed before the other operators. The pre- and post-increment operators are examples of unary operators, while c and d use binary operators and are hence executed later. Also, c is just shorthand notation for d, hence both take the same time. Of a and b, a is executed earlier than b, as post increment is faster than pre increment.
I hope this answer helps.

Related

Do external variables always need to be volatile when compiled with gcc?

I tried to set an external variable and get its value afterwards, but the value I got was not correct. Do external variables always need to be volatile when compiled with gcc?
The code is as follows (updated the complete code, the previous access to the memory address 0x00100000 is changed to the another variable):
main.c
extern unsigned int ex_var;
extern unsigned int another;
int main ()
{
ex_var = 56326;
unsigned int *pos=&ex_var+16;
for (int i = 0; i < 6; i++ )
{
*pos++ = 1;
}
another = ex_var;
}
another.c
unsigned int ex_var; // the address of this variable is set to right before valid_address
unsigned int valid_address[1024]; // to indicate valid memory address
unsigned int another;
And the value set to another is not 56326.
Note: another.c may look strange; it exists only to indicate that the memory region after ex_var is valid. For the actual running example on bare metal, please refer to this post.
Here is the disassembly of main.c. It is compiled with x86-64 gcc 12.2 with -O1 option:
main:
mov eax, OFFSET FLAT:ex_var+64
.L2:
add rax, 4
mov DWORD PTR [rax-4], 1
cmp rax, OFFSET FLAT:ex_var+88
jne .L2
mov eax, DWORD PTR ex_var[rip]
mov DWORD PTR another[rip], eax
mov eax, 0
ret
It can be found that the code for setting the external variable ex_var is optimized out.
I tried several versions of gcc, including x86-64 gcc, x86 gcc, arm64 gcc, and arm gcc, and it seems that all tested gcc versions above 8.x have this issue. Note that optimization level -O1 or above is needed to reproduce it.
The code can be found at this link at Compiler Explorer.
Update:
This bug in the above code is not related to external references.
Here is another example code that has the same bug. Note that it should be compiled with -O1 or above. You can try it at Compiler Explorer.
#include <stdio.h>
unsigned int var;
// In an embedded environment, LD files can be used to place valid_address right after var
volatile unsigned int valid_address[1024];
int main ()
{
var = 56326;
unsigned int *ttb=&var;
ttb += 16;
for (int i = 0; i < 8; i++ )
{
*ttb++ = 1;
}
valid_address[0] = var;
printf("Value is: %d", valid_address[0]);
}
If you compile this code with gcc like
gcc -O1 main1.c
and execute this code, you might get the following output:
Value is: 0
Which is not correct.
The calculation &ex_var+16 is not defined by the C standard (because it only defines pointer arithmetic within an object, including to the address just beyond its end) and the assignment *pos++ = 1 is not defined by the C standard (because, for the purposes of the standard, pos does not point to an object). When there is behavior not defined by the C standard on a code path, the standard does not define any behavior on the code path.
You can make the behavior defined, to the extent the compiler can see, by declaring ex_var as an array of unknown size, so that the address calculation and the assignments would be defined if this translation unit were linked with another that defined ex_var to be an array of sufficient size:
extern unsigned int ex_var[];
int main ()
{
ex_var[0] = 56326;
unsigned int *pos = ex_var+16;
for (int i = 0; i < 6; i++ )
{
*pos++ = 1;
}
*(volatile unsigned int*)(0x00100000) = ex_var[0];
}
(Note that *(volatile unsigned int*)(0x00100000) = remains not defined by the C standard, but GCC is intended for some use in bare-metal environments and appears to work with this. Additional compilation switches might be necessary to ensure it is defined for GCC’s purposes.)
This yields assembly that sets ex_var[0] and uses it in the assignment to 0x00100000:
main:
mov DWORD PTR ex_var[rip], 56326
…
mov eax, DWORD PTR ex_var[rip]
mov DWORD PTR ds:1048576, eax
mov eax, 0
ret
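When the memory layout is under your control, another way to stay within defined behavior is to put the variable and the region after it into a single array, so that every pointer stays inside one object. This is a sketch with hypothetical names (region, fill_and_read), not the code from the question:

```c
#define REGION_WORDS 1024

/* Element 0 plays the role of ex_var; the rest is the memory that follows it. */
unsigned int region[1 + REGION_WORDS];

unsigned int fill_and_read(void)
{
    region[0] = 56326;
    unsigned int *pos = region + 16;   /* still inside the same array object */
    for (int i = 0; i < 6; i++) {
        *pos++ = 1;
    }
    return region[0];                  /* the store to region[0] must be honored */
}
```

Because the loop only touches elements 16 through 21 of the same array, the compiler has no license to discard the store to element 0.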

Why does it return a random value other than the value I give to the function?

In a C program, there is a swap function that takes a parameter called x. I expect it to change the value of x inside swap and return it to main.
When I pass a variable as the argument, I get the result I want, but when I pass an integer literal directly, the program produces random output.
#include <stdio.h>
int swap (int x) {
x = 20;
}
int main(void){
int y = 100;
int a = swap(y);
printf ("Value: %d", a);
return 0;
}
Output of this code: 100 (As I wanted)
But this code:
#include <stdio.h>
int swap (int x) {
x = 20;
}
int main(void){
int a = swap(100);
printf ("Value: %d", a);
return 0;
}
returns random values such as Value: 779964766 or Value: 1727975774.
In both programs I pass an integer value to the function, even the same value, so why are the outputs different?
First of all, C functions are call-by-value: the int x arg in the function is a copy. Modifying it doesn't modify the caller's copy of whatever they passed, so your swap makes zero sense.
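The call-by-value point can be seen in a small sketch (the function names here are made up for illustration):

```c
/* Assigning to the parameter changes only the callee's local copy. */
void set_copy_to_20(int x)
{
    x = 20;
    (void)x;   /* silence "set but not used" warnings */
}

int value_after_call(void)
{
    int y = 100;
    set_copy_to_20(y);
    return y;   /* y is still 100 in the caller */
}
```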
Second, you're using the return value of the function, but you don't have a return statement. In C (unlike C++), it's not undefined behaviour for execution to fall off the end of a non-void function (for historical reasons: before void existed, function return types defaulted to int). But it is still undefined behaviour for the caller to use a return value when the function didn't return one.
In this case, returning 100 was the effect of the undefined behaviour (of using the return value of a function where execution falls off the end without a return statement). This is a coincidence of how GCC compiles in debug mode (-O0):
GCC -O0 likes to evaluate non-constant expressions in the return-value register, e.g. EAX/RAX on x86-64. (This is actually true for GCC across architectures, not just x86-64). This actually gets abused on codegolf.SE answers; apparently some people would rather golf in gcc -O0 as a language than ANSI C. See this "C golfing tips" answer and the comments on it, and this SO Q&A about why i=j inside a function putting a value in RAX. Note that it only works when GCC has to load a value into registers, not just do a memory-destination increment like add dword ptr [rbp-4], 1 for x++ or whatever.
In your case (with your code compiled by GCC10.2 on the Godbolt compiler explorer)
int y=100; stores 100 directly to stack memory (the way GCC compiles your code).
int a = swap(y); loads y into EAX (for no apparent reason), then copies to EDI to pass as an arg to swap. Since GCC's asm for swap doesn't touch EAX, after the call, EAX=y, so effectively the function returns y.
But if you call it with swap(100), GCC doesn't end up putting 100 into EAX while setting up the args.
The way GCC compiles your swap, the asm doesn't touch EAX, so whatever main left there is treated as the return value.
main:
...
mov DWORD PTR [rbp-4], 100 # y=100
mov eax, DWORD PTR [rbp-4] # load y into EAX
mov edi, eax # copy it to EDI (first arg-passing reg)
call swap # swap(y)
mov DWORD PTR [rbp-8], eax # a = EAX as the retval = y
...
But with your other main:
main:
... # nothing that touches EAX
mov edi, 100
call swap
mov DWORD PTR [rbp-4], eax # a = whatever garbage was there on entry to main
...
(The later ... reloads a as an arg for printf, matching the ISO C semantics because GCC -O0 compiles each C statement to a separate block of asm; thus the later ones aren't affected by the earlier UB (unlike in the general case with optimization enabled), so do just print whatever's in a's memory location.)
The swap function compiles like this (again, GCC10.2 -O0):
swap:
push rbp
mov rbp, rsp
mov DWORD PTR [rbp-4], edi
mov DWORD PTR [rbp-4], 20
nop
pop rbp
ret
Keep in mind none of this has anything to do with valid portable C. This (using garbage left in memory or registers) is one of the kinds of things you see in practice from C that invokes undefined behaviour, but certainly not the only thing. See also What Every C Programmer Should Know About Undefined Behavior from the LLVM blog.
This answer is just answering the literal question of what exactly happened in asm. (I'm assuming un-optimized GCC because that easily explains the result, and x86-64 because that's a common ISA, especially when people forget to mention any ISA.)
Other compilers are different, and GCC will be different if you enable optimization.
You need to either return the value or use a pointer.
Returning the value:
#include <stdio.h>
int swap (int x) {
x = 20;
return x;
}
int main(void){
int a = swap(100);
printf ("Value: %d", a);
return 0;
}
Using a pointer:
#include <stdio.h>
void swap (int* x) {
(*x) = 20;
}
int main(void){
int a;
swap(&a);
printf ("Value: %d", a);
return 0;
}

Read flag register from C program

For the sake of curiosity I'm trying to read the flag register and print it out in a nice way.
I've tried reading it using gcc's asm keyword, but I can't get it to work. Any hints on how to do it? I'm running an Intel Core 2 Duo and Mac OS X. The following code is what I have. I hoped it would tell me if an overflow happened:
#include <stdio.h>
int main (void){
int a=10, b=0, bold=0;
printf("%d\n",b);
while(1){
a++;
__asm__ ("pushf\n\t"
"movl 4(%%esp), %%eax\n\t"
"movl %%eax , %0\n\t"
:"=r"(b)
:
:"%eax"
);
if(b!=bold){
printf("register changed \n %d\t to\t %d",bold , b);
}
bold = b;
}
}
This gives a segmentation fault. When I run gdb on it I get this:
Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_INVALID_ADDRESS at address: 0x000000005fbfee5c
0x0000000100000eaf in main () at asm.c:9
9 asm ("pushf \n\t"
You can use the PUSHF/PUSHFD/PUSHFQ instruction (see http://siyobik.info/main/reference/instruction/PUSHF%2FPUSHFD for details) to push the flag register onto the stack. From there on you can interpret it in C. Otherwise you can test directly (against the carry flag for unsigned arithmetic or the overflow flag for signed arithmetic) and branch.
(to be specific, to test for the overflow bit you can use JO (jump if set) and JNO (jump if not set) to branch -- it's bit #11 (0-based) in the register)
About the EFLAGS bit layout: http://en.wikibooks.org/wiki/X86_Assembly/X86_Architecture#EFLAGS_Register
A very crude Visual C syntax test (just wham-bam / some jumps to debug flow), since I don't know about the GCC syntax:
int test2 = 2147483647; // max 32-bit signed int (0x7fffffff)
unsigned int flags_w_overflow, flags_wo_overflow;
__asm
{
mov ebx, test2 // ebx = test value
// test for no overflow
xor eax, eax // eax = 0
add eax, ebx // add ebx
jno no_overflow // jump if no overflow
testoverflow:
// test for overflow
xor ecx, ecx // ecx = 0
inc ecx // ecx = 1
add ecx, ebx // overflow!
pushfd // store flags (32 bits)
jo overflow // jump if overflow
jmp done // jump if not overflown :(
no_overflow:
pushfd // store flags (32 bits)
pop edx // edx = flags w/o overflow
jmp testoverflow // back to next test
overflow:
jmp done // yeah we're done here :)
done:
pop eax // eax = flags w/overflow
mov flags_w_overflow, eax // store
mov flags_wo_overflow, edx // store
}
if (flags_w_overflow & (1 << 11)) __asm int 0x3 // overflow bit set correctly
if (flags_wo_overflow & (1 << 11)) __asm int 0x3 // overflow bit set incorrectly
return 0;
This may be a case of the XY problem. To check for overflow you do not need to read the hardware overflow flag, because the flag can easily be calculated from the sign bits.
An illustrative example is what happens if we add 127 and 127 using 8-bit registers. 127+127 is 254, but using 8-bit arithmetic the result would be 1111 1110 binary, which is -2 in two's complement, and thus negative. A negative result out of positive operands (or vice versa) is an overflow. The overflow flag would then be set so the program can be aware of the problem and mitigate this or signal an error. The overflow flag is thus set when the most significant bit (here considered the sign bit) is changed by adding two numbers with the same sign (or subtracting two numbers with opposite signs). Overflow never occurs when the sign of two addition operands are different (or the sign of two subtraction operands are the same).
Internally, the overflow flag is usually generated by an exclusive or of the internal carry into and out of the sign bit. As the sign bit is the same as the most significant bit of a number considered unsigned, the overflow flag is "meaningless" and normally ignored when unsigned numbers are added or subtracted.
https://en.wikipedia.org/wiki/Overflow_flag
So the C implementation is
int add(int a, int b, int* overflowed)
{
// do an unsigned addition to prevent UB due to signed overflow
unsigned int r = (unsigned int)a + (unsigned int)b;
// if a and b have the same sign and the result's sign is different from a and b
// then the addition overflowed
*overflowed = !!((~(a ^ b) & (a ^ r)) & 0x80000000);
return (int)r;
}
This way it works portably on any architecture, unlike your solution, which only works on x86. Smart compilers may recognize the pattern and change it to use the overflow flag if possible. On most RISC architectures like MIPS or RISC-V there is no flag, and all signed/unsigned overflow must be checked in software by analyzing the sign bits like that.
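The 8-bit example from the quoted passage can be checked with the same sign-bit formula. The sketch below uses a hypothetical add8 function that works on uint8_t bit patterns, interpreted as two's-complement values, so that no signed arithmetic can overflow:

```c
#include <stdint.h>

/* Adds two 8-bit two's-complement bit patterns and reports signed overflow.
   Working on uint8_t avoids any undefined behaviour. */
uint8_t add8(uint8_t a, uint8_t b, int *overflowed)
{
    uint8_t r = (uint8_t)(a + b);   /* wraps modulo 256 */
    /* overflow: a and b have the same sign bit, and r's sign bit differs */
    *overflowed = ((~(a ^ b) & (a ^ r)) & 0x80) != 0;
    return r;
}
```

For a = b = 127 this yields the bit pattern 0xFE (-2 as a signed byte) with overflowed set to 1, matching the description above.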
Some compilers have intrinsics for checking overflow, like __builtin_add_overflow in Clang and GCC. With that intrinsic you can also easily see how the overflow is calculated on non-flag architectures. For example, on ARM it's done like this:
add w3, w0, w1 # r = a + b
eon w0, w0, w1 # a = a ^ ~b
eor w1, w3, w1 # b = b ^ r
str w3, [x2] # store sum ([x2] = r)
and w0, w1, w0 # a = a & b = (a ^ ~b) & (b ^ r)
lsr w0, w0, 31 # overflowed = a >> 31
ret
which is just a variation of what I've written above
See also
Checking overflow in C
Detecting signed overflow in C/C++
Is it possible to access the overflow flag register in a CPU with C++?
Very detailed explanation of Overflow and Carry flags evaluation techniques
For unsigned int it's much easier
unsigned int a, b, result = a + b;
int overflowed = (result < a);
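Wrapped into a function with initialized inputs, the unsigned check might look like this (add_unsigned is a hypothetical name):

```c
/* Unsigned addition wraps modulo 2^N, so a wrapped sum is smaller than either operand. */
unsigned int add_unsigned(unsigned int a, unsigned int b, int *overflowed)
{
    unsigned int result = a + b;
    *overflowed = (result < a);
    return result;
}
```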
The compiler can reorder instructions, so you cannot rely on your lahf being next to the increment. In fact, there may not be an increment at all. In your code, you don't use the value of a, so the compiler can completely optimize it out.
So, either write the increment + check in assembler, or write it in C.
Also, lahf loads only ah (8 bits) from eflags, and the Overflow flag is outside of that. Better use pushf; pop %eax.
Some tests:
#include <stdio.h>
int main (void){
int a=2147483640, b=0, bold=0;
printf("%d\n",b);
while(1){
a++;
__asm__ __volatile__ ("pushf \n\t"
"pop %%eax\n\t"
"movl %%eax, %0\n\t"
:"=r"(b)
:
:"%eax"
);
if((b & 0x800) != (bold & 0x800)){
printf("register changed \n %x\t to\t %x\n",bold , b);
}
bold = b;
}
}
$ gcc -Wall -o ex2 ex2.c
$ ./ex2 # Works by sheer luck
0
register changed
200206 to 200a96
register changed
200a96 to 200282
$ gcc -Wall -O -o ex2 ex2.c
$ ./ex2 # Doesn't work, the compiler hasn't even optimized yet!
0
You can't assume anything about how GCC implemented the a++ operation, or whether it even did the computation before your inline asm, or before a function call.
You could make a an (unused) input to your inline asm, but gcc could still have chosen to use lea to copy-and-add instead of inc or add, or constant-propagation after inlining could have turned it into a mov-immediate.
And of course gcc could have done some other computation that writes FLAGS right before your inline asm.
There is no way to make a++; asm(...) safe for this
Stop now, you're on the wrong track. If you insist on using asm, you need to do the add or inc inside the asm so you can read the flags output. If you only care about the overflow flag, use SETCC, specifically seto %0, to create an 8-bit output value. Or better, use GCC6 flag-output syntax to tell the compiler that a boolean output result is in the OF condition in FLAGS at the end of your inline asm.
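As a rough sketch of the "do the add inside the asm" approach with seto: the function name add_reports_of is made up, the asm is GNU C extended asm for x86 and x86-64 only, and the fallback branch for other targets is my own addition so the file still compiles there.

```c
#include <stdbool.h>

/* GNU C inline asm, x86/x86-64 only: the add happens inside the asm statement,
   so OF still reflects it when seto runs. Other targets fall back to the builtin. */
bool add_reports_of(int a, int b, int *sum)
{
#if defined(__i386__) || defined(__x86_64__)
    unsigned char of;
    __asm__ ("addl %2, %0\n\t"      /* a += b, sets the flags */
             "seto %1"              /* of = OF */
             : "+r"(a), "=q"(of)
             : "r"(b)
             : "cc");
    *sum = a;
    return of != 0;
#else
    return __builtin_add_overflow(a, b, sum);
#endif
}
```

Since the addition is performed by the add instruction rather than by a C expression, the wraparound itself is not C-level signed overflow.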
Also, signed overflow in C is undefined behaviour, so actually causing overflow in a++ is already a bug. It usually won't manifest itself if you somehow detect it after the fact, but if you use a as an array index or something gcc may have widened it to 64-bit to avoid redoing sign-extension.
GCC has builtins for add with overflow detection, since gcc5
There are builtins for signed/unsigned add, sub, and mul, see the GCC manual, that avoid signed-overflow UB and tell you if there was overflow.
bool __builtin_add_overflow (type1 a, type2 b, type3 *res) is the generic version
bool __builtin_sadd_overflow (int a, int b, int *res) is the signed int version
bool __builtin_saddll_overflow (long long int a, long long int b, long long int *res) is the signed 64-bit long long version.
The compiler will attempt to use hardware instructions to implement these built-in functions where possible, like conditional jump on overflow after addition, conditional jump on carry etc.
There's a saddl version in case you want the operation for whatever size long is on the target platform. (For x86-64 gcc, int is always 32-bit, long long is always 64-bit, but long depends on Windows vs. non-Windows. For platforms like AVR, int would be 16-bit, and only long would be 32-bit.)
int checked_add_int(int a, int b, bool *of) {
int result;
*of = __builtin_sadd_overflow(a, b, &result);
return result;
}
compiles with gcc -O3 for x86-64 System V to this asm, on Godbolt
checked_add_int:
mov eax, edi
add eax, esi # can't use the normal lea eax, [rdi+rsi]
seto BYTE PTR [rdx]
and BYTE PTR [rdx], 1 # silly compiler, it's already 0/1
ret
ICC19 uses setcc into an integer register and then stores that, same difference as far as uops, but worse code-size.
After inlining to a caller that did if(of) {} it should just jo or jno instead of actually using setcc to create an integer 0/1; in general this should inline efficiently.
Also, since gcc7, there's a builtin to ask if an addition (after promotion to a given type) would overflow, without returning the value.
#include <stdbool.h>
int overflows(int a, int b) {
bool of = __builtin_add_overflow_p(a, b, (int)0);
return of;
}
compiles with gcc -O3 for x86-64 System V to this asm, also on Godbolt
overflows:
xor eax, eax
add edi, esi
seto al
ret
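The generic builtins mentioned above also exist for subtraction and multiplication. A minimal sketch for multiplication, assuming GCC 5 or later (or a Clang that provides the same builtin); checked_mul_int is a hypothetical wrapper name:

```c
#include <stdbool.h>

/* Returns true if a * b does not fit in int; *product receives the wrapped result. */
bool checked_mul_int(int a, int b, int *product)
{
    return __builtin_mul_overflow(a, b, product);
}
```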
See also Detecting signed overflow in C/C++
Others have offered good alternate code and reasons why what you're trying to do probably doesn't give the result you want, but the actual bug in your code is that you corrupted the stack state by pushing without popping. I would rewrite the asm as:
pushf
pop %0
Or you could just add $4,%%esp at the end of your asm to fix the stack pointer if you prefer the inefficient way.
The following C program will read the FLAGS register when compiled with GCC on any x86 or x86_64 machine following a calling convention in which integers are returned in %eax. You may need to pass the -zexecstack argument to the compiler.
#include<stdio.h>
#include<stdlib.h>
int(*f)()=(void*)L"\xc3589c";
int main( int argc, char **argv ) {
if( argc < 3 ) {
printf( "Usage: %s <augend> <addend>\n", *argv );
return 0;
}
int a=atoi(argv[1])+atoi(argv[2]);
int b=f();
printf("%d CF %d PF %d AF %d ZF %d SF %d TF %d IF %d DF %d OF %d IOPL %d NT %d RF %d VM %d AC %d VIF %d VIP %d ID %d\n", a, b&1, b/4&1, b>>4&1, b>>6&1, b>>7&1, b>>8&1, b>>9&1, b>>10&1, b>>11&1, b>>12&3, b>>14&1, b>>16&1, b>>17&1, b>>18&1, b>>19&1, b>>20&1, b>>21&1 );
}
The funny looking string literal disassembles to
0x0000000000000000: 9C pushfq
0x0000000000000001: 58 pop rax
0x0000000000000002: C3 ret

Is a += b more efficient than a = a + b in C?

I know in some languages the following:
a += b
is more efficient than:
a = a + b
because it removes the need for creating a temporary variable. Is this the case in C? Is it more efficient to use += (and, therefore also -= *= etc)
So here's a definitive answer...
$ cat junk1.c
#include <stdio.h>
int main()
{
long a, s = 0;
for (a = 0; a < 1000000000; a++)
{
s = s + a * a;
}
printf("Final sum: %ld\n", s);
}
michael@isolde:~/junk$ cat junk2.c
#include <stdio.h>
int main()
{
long a, s = 0;
for (a = 0; a < 1000000000; a++)
{
s += a * a;
}
printf("Final sum: %ld\n", s);
}
michael@isolde:~/junk$ for a in *.c ; do gcc -O3 -o ${a%.c} $a ; done
michael@isolde:~/junk$ time ./junk1
Final sum: 3338615082255021824
real 0m2.188s
user 0m2.120s
sys 0m0.000s
michael@isolde:~/junk$ time ./junk2
Final sum: 3338615082255021824
real 0m2.179s
user 0m2.120s
sys 0m0.000s
...for my computer and my compiler running on my operating system. Your results may or may not vary. On my system, however, the time is identical: user time 2.120s.
Now just to show you how impressive modern compilers can be, you'll note that I used the expression a * a in the assignment. This is because of this little problem:
$ cat junk.c
#include <stdio.h>
int main()
{
long a, s = 0;
for (a = 0; a < 1000000000; a++)
{
s = s + a;
}
printf("Final sum: %ld\n", s);
}
michael@isolde:~/junk$ gcc -O3 -S junk.c
michael@isolde:~/junk$ cat junk.s
.file "junk.c"
.section .rodata.str1.1,"aMS",@progbits,1
.LC0:
.string "Final sum: %ld\n"
.text
.p2align 4,,15
.globl main
.type main, @function
main:
.LFB22:
.cfi_startproc
movabsq $499999999500000000, %rdx
movl $.LC0, %esi
movl $1, %edi
xorl %eax, %eax
jmp __printf_chk
.cfi_endproc
.LFE22:
.size main, .-main
.ident "GCC: (Ubuntu 4.4.3-4ubuntu5) 4.4.3"
.section .note.GNU-stack,"",@progbits
The compiler figured out my loop and unrolled it to the point of calculating the cumulative sum and just embedded that as a constant which it proceeded to print out, skipping any kind of looping construct entirely. In the face of optimizers that clever do you really think you're going to find any meaningful edge in distinguishing between s = s + a and s += a?!
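The constant in the movabsq instruction is exactly what the closed form n(n-1)/2 gives for n = 1000000000. The sketch below, with hypothetical function names, puts the closed form next to the loop from junk.c so the two can be compared for small n:

```c
/* Sum of 0, 1, ..., n-1 in closed form. */
long long sum_below(long long n)
{
    return n * (n - 1) / 2;
}

/* The same sum computed by a loop like the one in junk.c, for comparison. */
long long sum_below_loop(long long n)
{
    long long s = 0;
    for (long long a = 0; a < n; a++) {
        s = s + a;
    }
    return s;
}
```

sum_below(1000000000) evaluates to 499999999500000000, the value embedded in the generated code.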
This is a compiler specific question really, but I expect all modern compilers would give the same result. Using Visual Studio 2008:
int main() {
int a = 10;
int b = 30;
a = a + b;
int c = 10;
int d = 50;
c += d;
}
The line a = a + b has disassembly
0014139C mov eax,dword ptr [a]
0014139F add eax,dword ptr [b]
001413A2 mov dword ptr [a],eax
The line c += d has disassembly
001413B3 mov eax,dword ptr [c]
001413B6 add eax,dword ptr [d]
001413B9 mov dword ptr [c],eax
Which is the same. They are compiled into the same code.
It depends on what a is.
a += b in C is by definition equivalent to a = a + b, except that from the abstract point of view a is evaluated only once in the former variant. If a is a "pure" value, i.e. if evaluating it once vs. evaluating it many times makes no impact on the program's behavior, then a += b is strictly equivalent to a = a + b in all regards, including efficiency.
In other words, in situations when you actually have a free choice between a += b and a = a + b (meaning that you know that they do the same thing) they will generally have exactly the same efficiency. Some compilers might have difficulties when a stands for a function call (for one example; probably not what you meant), but when a is a non-volatile variable the machine code generated for both expressions will be the same.
For another example, if a is a volatile variable, then a += b and a = a + b have different behavior and, therefore, different efficiency. However, since they are not equivalent, your question simply does not apply in such cases.
In the simple cases shown in the question, there is no significant difference. Where the assignment operator scores is when you have an expression such as:
s[i]->m[j1].k = s[i]->m[jl].k + 23; // Was that a typo?
vs:
s[i]->m[j1].k += 23;
Two benefits - and I'm not counting less typing. There's no question about whether there was a typo when the first and second expressions differ; and the compiler doesn't evaluate the complex expression twice. The chances are that won't make much difference these days (optimizing compilers are a lot better than they used to be), but you could have still more complex expressions (evaluating a function defined in another translation unit, for example, as part of the subscripting) where the compiler may not be able to avoid evaluating the expression twice:
s[i]->m[somefunc(j1)].k = s[i]->m[somefunc(j1)].k + 23;
s[i]->m[somefunc(j1)].k += 23;
Also, you can write (if you're brave):
s[i++]->m[j1++].k += 23;
But you cannot write:
s[i++]->m[j1++].k = s[i]->m[j1].k + 23;
s[i]->m[j1].k = s[i++]->m[j1++].k + 23;
(or any other permutation) because the order of evaluation is not defined.
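The difference in how often the left-hand expression is evaluated can be made visible with a subscript function that counts its calls. This is an illustrative sketch; counted_index and the other names are invented and stand in for somefunc above:

```c
int index_calls;   /* counts how often the subscript expression is evaluated */

int counted_index(int i)
{
    ++index_calls;
    return i;
}

/* Compound assignment: the left-hand side is evaluated once. */
int calls_with_compound(int *v)
{
    index_calls = 0;
    v[counted_index(1)] += 23;
    return index_calls;
}

/* Expanded form: the subscript expression appears twice and is evaluated twice. */
int calls_with_expanded(int *v)
{
    index_calls = 0;
    v[counted_index(1)] = v[counted_index(1)] + 23;
    return index_calls;
}
```

Because counted_index has a visible side effect, the compiler must call it once in the first function and twice in the second, whatever the optimization level.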
a += b
is more efficient than
a = a + b
because the former takes you 6 keystrokes and the latter takes you 9 keystrokes.
With modern hardware, even if the compiler is stupid and uses slower code for one than the other, the total time saved over the lifetime of the program may possibly be less than the time it takes you to type the three extra key strokes.
However, as others have said, the compiler almost certainly produces exactly the same code so the former is more efficient.
Even if you factor in readability, most C programmers probably mentally parse the former more quickly than the latter because it is such a common pattern.
In virtually all cases, the two produce identical results.
Other than with a truly ancient or incompetently written compiler there should be no difference, as long as a and b are normal variables so the two produce equivalent results.
If you were dealing with C++ rather than C, operator overloading would allow there to be more substantial differences though.

How to deal with -Wconversion warnings from GCC?

I'm building my project with GCC's -Wconversion warning flag. (gcc (Debian 4.3.2-1.1) 4.3.2) on a 64bit GNU/Linux OS/Hardware. I'm finding it useful in identifying where I've mixed types or lost clarity as to which types should be used.
It's not so helpful in most of the other situations that trigger its warnings, and I'm asking how I'm meant to deal with these:
enum { A = 45, B, C }; /* fine */
char a = A; /* huh? seems to not warn about A being int. */
char b = a + 1; /* warning converting from int to char */
char c = B - 2; /* huh? ignores this *blatant* int too.*/
char d = (a > b ? b : c); /* warning converting from int to char */
Due to the unexpected results of the above tests (cases a and c), I'm also asking for these differences to be explained.
Edit: And is it over-engineering to cast all these with (char) to prevent the warning?
Edit2: Some extra cases (following on from above cases):
a += A; /* warning converting from int to char */
a++; /* ok */
a += (char)1; /* warning converting from int to char */
Aside from that, what I'm asking is subjective and I'd like to hear how other people deal with the conversion warnings in cases like these when you consider that some developers advocate removing all warnings.
YAE:
One possible solution is to just use ints instead of chars, right? Well, actually, not only does it require more memory, it is slower too, as can be demonstrated by the following code. The maths expressions are just there to get the warnings when built with -Wconversion. I assumed the version using char variables would run slower than the one using ints due to the conversions, but on my (64bit dual core II) system the int version is slower.
#include <stdio.h>
#ifdef USE_INT
typedef int var;
#else
typedef char var;
#endif
int main()
{
var start = 10;
var end = 100;
var n = 5;
int b = 100000000;
while (b > 0) {
n = (start - 5) + (n - (n % 3 ? 1 : 3));
if (n >= end) {
n -= (end + 7);
n += start + 2;
}
b--;
}
return 0;
}
Pass -DUSE_INT to gcc to build the int version of the above snippet.
When you say /* int */ do you mean it's giving you a warning about that? I'm not seeing any warnings at all in this code with gcc 4.0.1 or 4.2.1 with -Wconversion. The compiler is converting these enums into constants. Since everything is known at compile time, there is no reason to generate a warning. The compiler can optimize out all the uncertainty (the following is Intel with 4.2.1):
movb $45, -1(%rbp) # a = 45
movzbl -1(%rbp), %eax
incl %eax
movb %al, -2(%rbp) # b = 45 + 1
movb $44, -3(%rbp) # c = 44 (the math is done at compile time)
movzbl -1(%rbp), %eax
cmpb -2(%rbp), %al
jle L2
movzbl -2(%rbp), %eax
movb %al, -17(%rbp)
jmp L4
L2:
movzbl -3(%rbp), %eax
movb %al, -17(%rbp)
L4:
movzbl -17(%rbp), %eax
movb %al, -4(%rbp) # d = (a > b ? b : c)
This is without turning on optimizations. With optimizations, it will calculate b and d for you at compile time and hardcode their final values (if it actually needs them for anything). The point is that gcc has already worked out that there can't be a problem here because all the possible values fit in a char.
EDIT: Let me amend this somewhat. There is a possible error in the assignment of b, and the compiler will never catch it, even if it's certain. For example, if b=a+250;, then this will be certain to overflow b but gcc will not issue a warning. It's because the assignment to a is legal, a is a char, and it's your problem (not the compiler's) to make sure that math doesn't overflow at runtime.
Maybe the compiler can already see that all the values fit into a char so it doesn't bother warning. I'd expect the enum to be resolved right at the beginning of the compilation.
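As for handling the cases that do warn: a + 1 has type int because of the integer promotions, so assigning it to a char is a narrowing conversion. One option, sketched here with a hypothetical helper name rather than taken from the answers above, is to check the range once and make the narrowing explicit with a cast, which should keep -Wconversion quiet:

```c
#include <limits.h>
#include <stdbool.h>

/* Stores v in *out and returns true if v fits in a char; otherwise leaves *out alone. */
bool int_to_char(int v, char *out)
{
    if (v < CHAR_MIN || v > CHAR_MAX) {
        return false;
    }
    *out = (char)v;   /* explicit cast: the narrowing is intentional and in range */
    return true;
}
```

A call such as int_to_char(a + 1, &b) then replaces char b = a + 1; and reports failure instead of silently truncating.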
