I have an R package with C compiled code that's been relatively stable for quite a while and is frequently tested against a broad variety of platforms and compilers (windows/osx/debian/fedora gcc/clang).
More recently a new platform was added to test the package against:
Logs from checks with gcc trunk aka 10.0.1 compiled from source
on Fedora 30. (For some archived packages, 10.0.0.)
x86_64 Fedora 30 Linux
FFLAGS="-g -O2 -mtune=native -Wall -fallow-argument-mismatch"
CFLAGS="-g -O2 -Wall -pedantic -mtune=native -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong -fstack-clash-protection -fcf-protection"
CXXFLAGS="-g -O2 -Wall -pedantic -mtune=native -Wno-ignored-attributes -Wno-deprecated-declarations -Wno-parentheses -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong -fstack-clash-protection -fcf-protection"
At which point the compiled code promptly started segfaulting along these lines:
*** caught segfault ***
address 0x1d00000001, cause 'memory not mapped'
I've been able to reproduce the segfault consistently by using the rocker/r-base docker container with gcc-10.0.1 with optimization level -O2. Running a lower optimization gets rid of the problem. Running any other set-up, including under valgrind (both -O0 and -O2), UBSAN (gcc/clang), shows no problems at all. I'm also reasonably sure this ran under gcc-10.0.0, but don't have the data.
I ran the gcc-10.0.1 -O2 version with gdb and noticed something that seems odd to me:
While stepping through the highlighted section it appears the initialization of the second elements of the arrays is skipped (R_alloc is a wrapper around malloc that self garbage collects when returning control to R; the segfault happens before return to R). Later, the program crashes when the un-initialized element (in the gcc-10.0.1 -O2 version) is accessed.
I fixed this by explicitly initializing the element in question everywhere in the code that eventually led to the usage of the element, but it really should have been initialized to an empty string, or at least that's what I would have assumed.
Am I missing something obvious or doing something stupid? Both are reasonably likely as C is my second language by far. It's just strange that this just cropped up now, and I can't figure out what the compiler is trying to do.
UPDATE: Instructions to reproduce this, although this will only reproduce so long as debian:testing docker container has gcc-10 at gcc-10.0.1. Also, don't just run these commands if you don't trust me.
Sorry this is not a minimal reproducible example.
docker pull rocker/r-base
docker run --rm -ti --security-opt seccomp=unconfined \
rocker/r-base /bin/bash
apt-get update
apt-get install gcc-10 gdb
gcc-10 --version # confirm 10.0.1
# gcc-10 (Debian 10-20200222-1) 10.0.1 20200222 (experimental)
# [master revision 01af7e0a0c2:487fe13f218:e99b18cf7101f205bfdd9f0f29ed51caaec52779]
mkdir ~/.R
touch ~/.R/Makevars
echo "CC = gcc-10
CFLAGS = -g -O2 -Wall -pedantic -mtune=native -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong -fstack-clash-protection -fcf-protection
" >> ~/.R/Makevars
R -d gdb --vanilla
Then in the R console, after typing run to get gdb to run the program:
f.dl <- tempfile()
f.uz <- tempfile()
github.url <- 'https://github.com/brodieG/vetr/archive/v0.2.8.zip'
download.file(github.url, f.dl)
unzip(f.dl, exdir=f.uz)
install.packages(
file.path(f.uz, 'vetr-0.2.8'), repos=NULL,
INSTALL_opts="--install-tests", type='source'
)
# minimal set of commands to segfault
library(vetr)
alike(pairlist(a=1, b="character"), pairlist(a=1, b=letters))
alike(pairlist(1, "character"), pairlist(1, letters))
alike(NULL, 1:3) # not a wild card at top level
alike(list(NULL), list(1:3)) # but yes when nested
alike(list(NULL, NULL), list(list(list(1, 2, 3)), 1:25))
alike(list(NULL), list(1, 2))
alike(list(), list(1, 2))
alike(matrix(integer(), ncol=7), matrix(1:21, nrow=3))
alike(matrix(character(), nrow=3), matrix(1:21, nrow=3))
alike(
matrix(integer(), ncol=3, dimnames=list(NULL, c("R", "G", "B"))),
matrix(1:21, ncol=3, dimnames=list(NULL, c("R", "G", "B")))
)
# Adding tests from docs
mx.tpl <- matrix(
integer(), ncol=3, dimnames=list(row.id=NULL, c("R", "G", "B"))
)
mx.cur <- matrix(
sample(0:255, 12), ncol=3, dimnames=list(row.id=1:4, rgb=c("R", "G", "B"))
)
mx.cur2 <-
matrix(sample(0:255, 12), ncol=3, dimnames=list(1:4, c("R", "G", "B")))
alike(mx.tpl, mx.cur2)
Inspecting in gdb pretty quickly shows (if I understand correctly) that
CSR_strmlen_x is trying to access the string that was not initialized.
UPDATE 2: this is a highly recursive function, and on top of that the string initialization bit gets called many, many times. This is mostly b/c I was being lazy, we only need the strings initialized for the one time we actually encounter something we want to report in the recursion, but it was easier to initialize every time it is possible to encounter something. I mention this because what you'll see next shows multiple initializations, but only one of them (presumably the one with address <0x1400000001>) is being used.
I can't guarantee that the stuff I'm showing here is directly related to the element that caused the segfault (though it is the same illegal address access), but as @nate-eldredge asked, it does show that the array element is not initialized either just before return or just after return in the calling function. Note the calling function is initializing 8 of these, and I show them all, with all of them filled with either garbage or inaccessible memory.
UPDATE 3, disassembly of function in question:
Breakpoint 1, ALIKEC_res_strings_init () at alike.c:75
75 return res;
(gdb) p res.current[0]
$1 = 0x7ffff46a0aa5 "%s%s%s%s"
(gdb) p res.current[1]
$2 = 0x1400000001 <error: Cannot access memory at address 0x1400000001>
(gdb) disas /m ALIKEC_res_strings_init
Dump of assembler code for function ALIKEC_res_strings_init:
53 struct ALIKEC_res_strings ALIKEC_res_strings_init() {
0x00007ffff4687fc0 <+0>: endbr64
54 struct ALIKEC_res_strings res;
55
56 res.target = (const char **) R_alloc(5, sizeof(const char *));
0x00007ffff4687fc4 <+4>: push %r12
0x00007ffff4687fc6 <+6>: mov $0x8,%esi
0x00007ffff4687fcb <+11>: mov %rdi,%r12
0x00007ffff4687fce <+14>: push %rbx
0x00007ffff4687fcf <+15>: mov $0x5,%edi
0x00007ffff4687fd4 <+20>: sub $0x8,%rsp
0x00007ffff4687fd8 <+24>: callq 0x7ffff4687180 <R_alloc@plt>
0x00007ffff4687fdd <+29>: mov $0x8,%esi
0x00007ffff4687fe2 <+34>: mov $0x5,%edi
0x00007ffff4687fe7 <+39>: mov %rax,%rbx
57 res.current = (const char **) R_alloc(5, sizeof(const char *));
0x00007ffff4687fea <+42>: callq 0x7ffff4687180 <R_alloc@plt>
58
59 res.target[0] = "%s%s%s%s";
0x00007ffff4687fef <+47>: lea 0x1764a(%rip),%rdx # 0x7ffff469f640
0x00007ffff4687ff6 <+54>: lea 0x18aa8(%rip),%rcx # 0x7ffff46a0aa5
0x00007ffff4687ffd <+61>: mov %rcx,(%rbx)
60 res.target[1] = "";
61 res.target[2] = "";
0x00007ffff4688000 <+64>: mov %rdx,0x10(%rbx)
62 res.target[3] = "";
0x00007ffff4688004 <+68>: mov %rdx,0x18(%rbx)
63 res.target[4] = "";
0x00007ffff4688008 <+72>: mov %rdx,0x20(%rbx)
64
65 res.tar_pre = "be";
66
67 res.current[0] = "%s%s%s%s";
0x00007ffff468800c <+76>: mov %rax,0x8(%r12)
0x00007ffff4688011 <+81>: mov %rcx,(%rax)
68 res.current[1] = "";
69 res.current[2] = "";
0x00007ffff4688014 <+84>: mov %rdx,0x10(%rax)
70 res.current[3] = "";
0x00007ffff4688018 <+88>: mov %rdx,0x18(%rax)
71 res.current[4] = "";
0x00007ffff468801c <+92>: mov %rdx,0x20(%rax)
72
73 res.cur_pre = "is";
74
75 return res;
=> 0x00007ffff4688020 <+96>: lea 0x14fe0(%rip),%rax # 0x7ffff469d007
0x00007ffff4688027 <+103>: mov %rax,0x10(%r12)
0x00007ffff468802c <+108>: lea 0x14fcd(%rip),%rax # 0x7ffff469d000
0x00007ffff4688033 <+115>: mov %rbx,(%r12)
0x00007ffff4688037 <+119>: mov %rax,0x18(%r12)
0x00007ffff468803c <+124>: add $0x8,%rsp
0x00007ffff4688040 <+128>: pop %rbx
0x00007ffff4688041 <+129>: mov %r12,%rax
0x00007ffff4688044 <+132>: pop %r12
0x00007ffff4688046 <+134>: retq
0x00007ffff4688047: nopw 0x0(%rax,%rax,1)
End of assembler dump.
UPDATE 4:
So, trying to parse through the standard here are the parts of it that seem relevant (C11 draft):
6.3.2.3 Par7 Conversions > Other Operands > Pointers
A pointer to an object type may be converted to a pointer to a
different object type. If the resulting pointer is not correctly
aligned 68) for the referenced type, the behavior is undefined.
Otherwise, when converted back again, the result shall compare
equal to the original pointer. When a pointer to an object is
converted to a pointer to a character type, the result points to the
lowest addressed byte of the object. Successive increments of
the result, up to the size of the object, yield pointers to the
remaining bytes of the object.
6.5 Par6 Expressions
The effective type of an object for an access to its stored value is the
declared type of the object, if any. 87) If a value is stored into
an object having no declared type through an lvalue having a
type that is not a character type, then the type of the lvalue becomes
the effective type of the object for that access and for subsequent
accesses that do not modify the stored value. If a value is
copied into an object having no declared type
using memcpy or memmove, or is copied as an array of character type, then
the effective type of the modified object for that access and for
subsequent accesses that do not modify the value is the effective type
of the object from which the value is copied, if it has one. For all
other accesses to an object having no declared type, the effective
type of the object is simply the type of the lvalue used for the
access.
87) Allocated objects have no declared type.
IIUC R_alloc returns an offset into a malloced block that is guaranteed to be double aligned, and the size of the block after the offset is of the requested size (there is also allocation before the offset for R specific data). R_alloc casts that pointer to (char *) on return.
Section 6.2.5 Par 29
A pointer to void shall have the same representation and
alignment requirements as a pointer to a character
type. 48) Similarly, pointers to qualified or unqualified versions
of compatible types shall have the same representation and
alignment requirements. All pointers to structure types shall have
the same representation and alignment requirements as each other.
All pointers to union types shall have the same
representation and alignment requirements as each other.
Pointers to other types need not have the same representation
or alignment requirements.
48) The same representation and alignment requirements are meant to imply interchangeability as arguments to functions, return values from functions, and members of unions.
So the question is "are we allowed to recast the (char *) to (const char **) and write to it as (const char **)". My reading of the above is that so long as pointers on the systems the code runs on have alignment compatible with double alignment, then it's okay.
Are we violating "strict aliasing"? i.e.:
6.5 Par 7
An object shall have its stored value accessed only by an lvalue
expression that has one of the following types: 88)
— a type compatible with the effective type of the object
...
88) The intent of this list is to specify those circumstances in which an object may or may not be aliased.
So, what should the compiler think the effective type of the object pointed to by res.target (or res.current) is? Presumably the declared type (const char **), or is this actually ambiguous? It feels to me that it isn't in this case only because there is no other 'lvalue' in scope that accesses the same object.
I'll admit I'm struggling mightily to extract sense from these sections of the standard.
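As a side note on the effective-type question, the 6.5 Par6 text quoted above singles out memcpy, so one way to sidestep the cast altogether is to copy the pointer values in and out of the raw buffer with memcpy. The following is only a minimal sketch under stated assumptions: demo_alloc is a hypothetical malloc-backed stand-in for R_alloc (it is not R's implementation), and slot_set and slot_get are names invented for this illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for R_alloc: returns char * like the real one,
 * but is backed by plain malloc so the example is standalone. */
char *demo_alloc(size_t n, size_t size) {
    return malloc(n * size);
}

/* Store a string pointer into slot idx of the raw buffer via memcpy,
 * so the buffer is never written through a const char ** lvalue. */
void slot_set(char *raw, size_t idx, const char *s) {
    memcpy(raw + idx * sizeof s, &s, sizeof s);
}

/* Read the string pointer back out of slot idx, again via memcpy. */
const char *slot_get(const char *raw, size_t idx) {
    const char *s;
    memcpy(&s, raw + idx * sizeof s, sizeof s);
    return s;
}
```

This trades convenience for certainty about aliasing: every access goes through a helper instead of plain array indexing.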
Summary: This appears to be a bug in gcc, related to string optimization. A self-contained testcase is below. There was initially some doubt as to whether the code is correct, but I think it is.
I have reported the bug as PR 93982. A proposed fix was committed but it does not fix it in all cases, leading to the followup PR 94015 (godbolt link).
You should be able to work around the bug by compiling with the flag -fno-optimize-strlen.
I was able to reduce your test case to the following minimal example (also on godbolt):
struct a {
const char ** target;
};
char* R_alloc(void);
struct a foo(void) {
struct a res;
res.target = (const char **) R_alloc();
res.target[0] = "12345678";
res.target[1] = "";
res.target[2] = "";
res.target[3] = "";
res.target[4] = "";
return res;
}
With gcc trunk (gcc version 10.0.1 20200225 (experimental)) and -O2 (all other options turned out to be unnecessary), the generated assembly on amd64 is as follows:
.LC0:
.string "12345678"
.LC1:
.string ""
foo:
subq $8, %rsp
call R_alloc
movq $.LC0, (%rax)
movq $.LC1, 16(%rax)
movq $.LC1, 24(%rax)
movq $.LC1, 32(%rax)
addq $8, %rsp
ret
So you are quite right that the compiler is failing to initialize res.target[1] (note the conspicuous absence of movq $.LC1, 8(%rax)).
It is interesting to play with the code and see what affects the "bug". Perhaps significantly, changing the return type of R_alloc to void * makes it go away, and gives you "correct" assembly output. Maybe less significantly but more amusingly, changing the string "12345678" to be either longer or shorter also makes it go away.
Previous discussion, now resolved - the code is apparently legal.
The question I have is whether your code is actually legal. The fact that you take the char * returned by R_alloc() and cast it to const char **, and then store a const char * seems like it might violate the strict aliasing rule, as char and const char * are not compatible types. There is an exception that allows you to access any object as char (to implement things like memcpy), but this is the other way around, and as best I understand it, that's not allowed. It makes your code produce undefined behavior and so the compiler can legally do whatever the heck it wants.
If this is so, the correct fix would be for R to change their code so that R_alloc() returns void * instead of char *. Then there would be no aliasing problem. Unfortunately, that code is outside your control, and it's not clear to me how you can use this function at all without violating strict aliasing. A workaround might be to interpose a temporary variable, e.g. void *tmp = R_alloc(); res.target = tmp; which solves the problem in the test case, but I'm still not sure if it's legal.
However, I am not sure of this "strict aliasing" hypothesis, because compiling with -fno-strict-aliasing, which AFAIK is supposed to make gcc allow such constructs, does not make the problem go away!
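The temporary-variable workaround mentioned above can be sketched as follows. This is an illustration only: fake_R_alloc is a hypothetical malloc-backed stand-in for R_alloc, struct strings and strings_init are names made up for the example, and, as noted, I am not certain this changes anything as far as the standard is concerned.

```c
#include <stdlib.h>

struct strings {
    const char **target;
};

/* Hypothetical stand-in for R_alloc (assumption: malloc-backed, returns char *). */
char *fake_R_alloc(size_t n, size_t size) {
    return malloc(n * size);
}

struct strings strings_init(void) {
    struct strings res;
    /* Route the char * through a void * temporary instead of casting it
     * directly to const char **. */
    void *tmp = fake_R_alloc(5, sizeof(const char *));
    res.target = tmp;
    if (res.target != NULL) {
        res.target[0] = "12345678";
        for (int i = 1; i < 5; ++i)
            res.target[i] = "";   /* every remaining slot gets an empty string */
    }
    return res;
}
```

A caller would use res.target exactly as before; only the way the pointer is obtained differs.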
Update. Trying some different options, I found that either -fno-optimize-strlen or -fno-tree-forwprop will result in "correct" code being generated. Also, using -O1 -foptimize-strlen yields the incorrect code (but -O1 -ftree-forwprop does not).
After a little git bisect exercise, the error seems to have been introduced in commit 34fcf41e30ff56155e996f5e04.
Update 2. I tried digging into the gcc source a little bit, just to see what I could learn. (I don't claim to be any sort of compiler expert!)
It looks like the code in tree-ssa-strlen.c is meant to keep track of strings appearing in the program. As near as I can tell, the bug is that in looking at the statement res.target[0] = "12345678"; the compiler conflates the address of the string literal "12345678" with the string itself. (That seems to be related to this suspicious code which was added in the aforementioned commit, where if it tries to count the bytes of a "string" that is actually an address, it instead looks at what that address points to.)
So it thinks that the statement res.target[0] = "12345678", instead of storing the address of "12345678" at the address res.target, is storing the string itself at that address, as if the statement were strcpy(res.target, "12345678"). Note for what's ahead that this would result in the trailing nul being stored at address res.target+8 (at this stage in the compiler, all offsets are in bytes).
Now when the compiler looks at res.target[1] = "", it likewise treats this as if it were strcpy(res.target+8, ""), the 8 coming from the size of a char *. That is, as if it were simply storing a nul byte at address res.target+8. However, the compiler "knows" that the previous statement already stored a nul byte at that very address! As such, this statement is "redundant" and can be discarded (here).
This explains why the string has to be exactly 8 characters long to trigger the bug. (Though other multiples of 8 can also trigger the bug in other situations.)
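The offset arithmetic behind that coincidence can be written out as a small sketch. The helper names are invented for this illustration and have nothing to do with the gcc source; the point is only that a copied 8-character string would put its terminating nul at byte offset 8, which is also where element 1 of an array of 8-byte pointers begins.

```c
#include <stddef.h>
#include <string.h>

/* Byte offset at which strcpy(dst, s) would write the terminating nul. */
size_t nul_offset_if_copied(const char *s) {
    return strlen(s);
}

/* Byte offset of element 1 in an array of string pointers. */
size_t second_slot_offset(void) {
    return sizeof(const char *);
}

/* Nonzero when the (wrongly) assumed nul byte lands exactly on slot 1. */
int offsets_collide(const char *s) {
    return nul_offset_if_copied(s) == second_slot_offset();
}
```

On a platform with 8-byte pointers, offsets_collide("12345678") is true while a 7-character or 9-character string gives false, which matches the observation that changing the string length hides the problem.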
If in C I write:
int num;
Before I assign anything to num, is the value of num indeterminate?
Static variables (file scope and function static) are initialized to zero:
int x; // zero
int y = 0; // also zero
void foo() {
static int x; // also zero
}
Non-static variables (local variables) are indeterminate. Reading them prior to assigning a value results in undefined behavior.
void foo() {
int x;
printf("%d", x); // the compiler is free to crash here
}
In practice, they tend to just have some nonsensical value in there initially - some compilers may even put in specific, fixed values to make it obvious when looking in a debugger - but strictly speaking, the compiler is free to do anything from crashing to summoning demons through your nasal passages.
As for why it's undefined behavior instead of simply "undefined/arbitrary value", there are a number of CPU architectures that have additional flag bits in their representation for various types. A modern example would be the Itanium, which has a "Not a Thing" bit in its registers; of course, the C standard drafters were considering some older architectures.
Attempting to work with a value with these flag bits set can result in a CPU exception in an operation that really shouldn't fail (eg, integer addition, or assigning to another variable). And if you go and leave a variable uninitialized, the compiler might pick up some random garbage with these flag bits set - meaning touching that uninitialized variable may be deadly.
0 if static or global, indeterminate if storage class is auto
C has always been very specific about the initial values of objects. If global or static, they will be zeroed. If auto, the value is indeterminate.
This was the case in pre-C89 compilers and was so specified by K&R and in DMR's original C report.
This was the case in C89, see section 6.5.7 Initialization.
If an object that has automatic
storage duration is not initialized
explicitly, its value is
indeterminate. If an object that has
static storage duration is not
initialized explicitly, it is
initialized implicitly as if every
member that has arithmetic type were
assigned 0 and every member that has
pointer type were assigned a null
pointer constant.
This was the case in C99, see section 6.7.8 Initialization.
If an object that has automatic
storage duration is not initialized
explicitly, its value is
indeterminate. If an object that has
static storage duration is not
initialized explicitly, then: — if it
has pointer type, it is initialized to
a null pointer; — if it has arithmetic
type, it is initialized to (positive
or unsigned) zero; — if it is an
aggregate, every member is initialized
(recursively) according to these
rules; — if it is a union, the first
named member is initialized
(recursively) according to these
rules.
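The recursive rule for aggregates can be illustrated with a short sketch. struct record and its members are invented for the example; the only thing being shown is that an object with static storage duration and no initializer has every member zeroed, with pointers becoming null pointers.

```c
#include <stddef.h>

struct record {
    int count;
    double ratio;
    const char *name;
    int history[4];
};

/* Static storage duration, no initializer: every member is zero-initialized. */
static struct record default_record;

/* Returns 1 if every member of the static object has its zero value. */
int default_record_is_zeroed(void) {
    if (default_record.count != 0) return 0;
    if (default_record.ratio != 0.0) return 0;
    if (default_record.name != NULL) return 0;
    for (size_t i = 0; i < 4; ++i)
        if (default_record.history[i] != 0) return 0;
    return 1;
}
```

The same struct declared as a local variable without an initializer would give no such guarantee.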
As to what exactly indeterminate means, I'm not sure for C89, C99 says:
3.17.2 indeterminate value: either an unspecified value or a trap
representation
But regardless of what standards say, in real life, each stack page actually does start off as zero, but when your program looks at any auto storage class values, it sees whatever was left behind by your own program when it last used those stack addresses. If you allocate a lot of auto arrays you will see them eventually start neatly with zeroes.
You might wonder, why is it this way? A different SO answer deals with that question, see: https://stackoverflow.com/a/2091505/140740
It depends on the storage duration of the variable. A variable with static storage duration is always implicitly initialized with zero.
As for automatic (local) variables, an uninitialized variable has indeterminate value. Indeterminate value, among other things, means that whatever "value" you might "see" in that variable is not only unpredictable, it is not even guaranteed to be stable. For example, in practice (i.e. ignoring the UB for a second) this code
int num;
int a = num;
int b = num;
does not guarantee that variables a and b will receive identical values. Interestingly, this is not some pedantic theoretical concept; this readily happens in practice as a consequence of optimization.
So in general, the popular answer that "it is initialized with whatever garbage was in memory" is not even remotely correct. Uninitialized variable's behavior is different from that of a variable initialized with garbage.
Ubuntu 15.10, Kernel 4.2.0, x86-64, GCC 5.2.1 example
Enough standards, let's look at an implementation :-)
Local variable
Standards: undefined behavior.
Implementation: the program allocates stack space, and never moves anything to that address, so whatever was there previously is used.
#include <stdio.h>
int main() {
int i;
printf("%d\n", i);
}
compile with:
gcc -O0 -std=c99 a.c
outputs:
0
and decompiles with:
objdump -dr a.out
to:
0000000000400536 <main>:
400536: 55 push %rbp
400537: 48 89 e5 mov %rsp,%rbp
40053a: 48 83 ec 10 sub $0x10,%rsp
40053e: 8b 45 fc mov -0x4(%rbp),%eax
400541: 89 c6 mov %eax,%esi
400543: bf e4 05 40 00 mov $0x4005e4,%edi
400548: b8 00 00 00 00 mov $0x0,%eax
40054d: e8 be fe ff ff callq 400410 <printf@plt>
400552: b8 00 00 00 00 mov $0x0,%eax
400557: c9 leaveq
400558: c3 retq
From our knowledge of x86-64 calling conventions:
%rdi is the first printf argument, thus the string "%d\n" at address 0x4005e4
%rsi is the second printf argument, thus i.
It comes from -0x4(%rbp), which is the first 4-byte local variable.
At this point, rbp is in the first page of the stack, which has been allocated by the kernel, so to understand that value we would need to look into the kernel code and find out what it sets that memory to.
TODO does the kernel set that memory to something before reusing it for other processes when a process dies? If not, the new process would be able to read the memory of other finished programs, leaking data. See: Are uninitialized values ever a security risk?
We can then also play with our own stack modifications and write fun things like:
#include <assert.h>
int f() {
int i = 13;
return i;
}
int g() {
int i;
return i;
}
int main() {
f();
assert(g() == 13);
}
Note that GCC 11 seems to produce a different assembly output, and the above code stops "working", it is undefined behavior after all: Why does -O3 in gcc seem to initialize my local variable to 0, while -O0 does not?
Local variable in -O3
Implementation analysis at: What does <value optimized out> mean in gdb?
Global variables
Standards: 0
Implementation: .bss section.
#include <stdio.h>
int i;
int main() {
printf("%d\n", i);
}
gcc -O0 -std=c99 a.c
compiles to:
0000000000400536 <main>:
400536: 55 push %rbp
400537: 48 89 e5 mov %rsp,%rbp
40053a: 8b 05 04 0b 20 00 mov 0x200b04(%rip),%eax # 601044 <i>
400540: 89 c6 mov %eax,%esi
400542: bf e4 05 40 00 mov $0x4005e4,%edi
400547: b8 00 00 00 00 mov $0x0,%eax
40054c: e8 bf fe ff ff callq 400410 <printf@plt>
400551: b8 00 00 00 00 mov $0x0,%eax
400556: 5d pop %rbp
400557: c3 retq
400558: 0f 1f 84 00 00 00 00 nopl 0x0(%rax,%rax,1)
40055f: 00
# 601044 <i> says that i is at address 0x601044 and:
readelf -SW a.out
contains:
[25] .bss NOBITS 0000000000601040 001040 000008 00 WA 0 0 4
which says 0x601044 is right in the middle of the .bss section, which starts at 0x601040 and is 8 bytes long.
The ELF standard then guarantees that the section named .bss is completely filled with zeros:
.bss This section holds uninitialized data that contribute to the
program’s memory image. By definition, the system initializes the
data with zeros when the program begins to run. The section occu-
pies no file space, as indicated by the section type, SHT_NOBITS.
Furthermore, the type SHT_NOBITS is efficient and occupies no space on the executable file:
sh_size This member gives the section’s size in bytes. Unless the sec-
tion type is SHT_NOBITS , the section occupies sh_size
bytes in the file. A section of type SHT_NOBITS may have a non-zero
size, but it occupies no space in the file.
Then it is up to the Linux kernel to zero out that memory region when loading the program into memory when it gets started.
That depends. If that definition is global (outside any function) then num will be initialized to zero. If it's local (inside a function) then its value is indeterminate. In theory, even attempting to read the value has undefined behavior -- C allows for the possibility of bits that don't contribute to the value, but have to be set in specific ways for you to even get defined results from reading the variable.
The basic answer is, yes it is undefined.
If you are seeing odd behavior because of this, it may depend on where the variable is declared. If it is declared within a function, on the stack, then the contents will more than likely be different every time the function gets called. If it has static or module scope, it is implicitly initialized to zero and will not change until something assigns to it.
Because computers have finite storage capacity, automatic variables will typically be held in storage elements (whether registers or RAM) that have previously been used for some other arbitrary purpose. If such a variable is used before a value has been assigned to it, that storage may hold whatever it held previously, and so the contents of the variable will be unpredictable.
As an additional wrinkle, many compilers may keep variables in registers which are larger than the associated types. Although a compiler would be required to ensure that any value which is written to a variable and read back will be truncated and/or sign-extended to its proper size, many compilers will perform such truncation when variables are written and expect that it will have been performed before the variable is read. On such compilers, something like:
uint16_t hey(uint32_t x, uint32_t mode)
{ uint16_t q;
if (mode==1) q=2;
if (mode==3) q=4;
return q; }
uint32_t wow(uint32_t mode) {
return hey(1234567, mode);
}
might very well result in wow() storing the values 1234567 and mode into
registers 0 and 1, respectively, and calling hey(). Since x isn't needed
within hey(), and since functions are supposed to put their return value into
register 0, the compiler may allocate register 0 to q. If mode is 1 or
3, register 0 will be loaded with 2 or 4, respectively, but if it is some
other value, the function may return whatever was in register 0 (i.e. the
value 1234567) even though that value is not within the range of uint16_t.
To avoid requiring compilers to do extra work to ensure that uninitialized
variables never seem to hold values outside their domain, and avoid needing
to specify indeterminate behaviors in excessive detail, the Standard says
that use of uninitialized automatic variables is Undefined Behavior. In
some cases, the consequences of this may be even more surprising than a
value being outside the range of its type. For example, given:
void moo(int mode)
{
if (mode < 5)
launch_nukes();
hey(0, mode);
}
a compiler could infer that because invoking moo() with a mode which is
greater than 3 will inevitably lead to the program invoking Undefined
Behavior, the compiler may omit any code which would only be relevant
if mode is 4 or greater, such as the code which would normally prevent
the launch of nukes in such cases. Note that neither the Standard, nor
modern compiler philosophy, would care about the fact that the return value
from "hey" is ignored--the act of trying to return it gives a compiler
unlimited license to generate arbitrary code.
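For comparison, a variant of hey() that stays within defined behavior simply gives q a value on every path. This is a minimal sketch: the name hey_defined and the fallback value 0 are arbitrary choices made for the illustration, not something the original code specifies.

```c
#include <stdint.h>

/* Same shape as hey(), but q has a defined value on every path. */
uint16_t hey_defined(uint32_t x, uint32_t mode) {
    uint16_t q = 0;   /* fallback chosen arbitrarily for this sketch */
    (void)x;          /* unused, kept only to mirror the original signature */
    if (mode == 1) q = 2;
    if (mode == 3) q = 4;
    return q;
}
```

With this version a compiler has no license to assume anything about mode, so a caller such as moo() keeps its range check intact.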
If the storage class is static or global, then during loading the BSS initialises the variable or memory location (ML) to 0 unless the variable is initially assigned some value. In the case of local uninitialized variables, a trap representation may be left in the memory location. So if any of your registers containing important info is overwritten by the compiler, the program may crash,
but some compilers may have a mechanism to avoid such a problem.
I was working with the NEC V850 series when I realised there is a trap representation, with bit patterns that represent undefined values for data types other than char. When I took an uninitialized char I got a default value of zero due to the trap representation. This might be useful for anyone using the NEC V850ES.
As far as I have seen, it mostly depends on the compiler, but in most cases the value is assumed to be 0 by compilers.
I got a garbage value with VC++, while TC gave 0.
I printed it like this:
int i;
printf("%d", i);
I did an experiment to see what kind of assembly language would be generated if I tried to get the same function compiled in twice. I did the following:
I created two simple test files and their corresponding headers. Let's call them a.c/a.h, and b.c/b.h. Here are the contents of those files:
a.h:
#ifndef __A_H__
#define __A_H__
int a( void );
#endif
b.h:
#ifndef __B_H__
#define __B_H__
int b( void );
#endif
a.c:
#include "a.h"
int a( void )
{
return 1;
}
b.c:
#include "b.h"
#include "a.h"
int b( void )
{
return 1 + a();
}
I then created a static archive for a:
gcc -c a.c -o a.o
ar -rsc a.a a.o
and the same for b, including the static archive for a this time:
gcc -c b.c -o b.o
ar -rsc b.a a.a b.o
At this point, I disassemble the static archive for b to verify that it has assembly code for both functions a() and b(). It does.
Now, I define one last file:
main.c:
#include <stdio.h>
#include "a.h"
#include "b.h"
int main( void )
{
printf( "%d %d\n", a(), b() );
return 0;
}
and I compile it thusly:
gcc main.c a.a b.a -o main
This works fine. When I disassemble it, I see the following definitions for a and b in the code:
0000000000400561 <a>:
400561: 55 push %rbp
400562: 48 89 e5 mov %rsp,%rbp
400565: b8 01 00 00 00 mov $0x1,%eax
40056a: 5d pop %rbp
40056b: c3 retq

000000000040056c <b>:
40056c: 55 push %rbp
40056d: 48 89 e5 mov %rsp,%rbp
400570: e8 ec ff ff ff callq 400561 <a>
400575: 83 c0 01 add $0x1,%eax
400578: 5d pop %rbp
400579: c3 retq
40057a: 66 0f 1f 44 00 00 nopw 0x0(%rax,%rax,1)
As you can see, the code has clearly defined b as calling a rather than inlining it, however, there is only one definition of a in the code, no duplicates.
It seems that gcc has either:
Detected the duplicate object code and removed the duplicates
--or--
the b archive was used first, and it included the reference to int a(), so the a archive was ignored.
My question is: is this behavior circumstantial to my test or is it standard, and can I expect the same behavior from other compilers? Obviously duplicate code is one problem, however there could be duplicate global references as well. Is it safe/good practice to build a large application that has multiple dependency paths to the same static archive? Are there less obvious situations than just duplicate symbol names where issues can arise when doing this?
Asking this because I've been playing with this idea for a project I'm on, and want to make the right choices.
My question is: is this behavior circumstantial to my test or is it standard, and can I expect the same behavior from other compilers?
As far as the compiler itself is concerned, there is no issue: you have one definition for each function among your sources.
As far as ar is concerned, you also have no issue: neither of the archives you built contains any duplicate symbols.
Different linkers may exhibit different behaviors, however. It is conceivable that some would reject linking archives that contain duplicate external symbols. Typical UNIX linkers will handle the situation you present, but they may vary in some details, such as whether a duplicate copy of function a() is included in the binary.
Obviously duplicate code is one problem, however there could be duplicate global references as well. Is it safe/good practice to build a large application that has multiple dependency paths to the same static archive?
"Multiple paths to the same static archive" does not seem to be a good characterization of the situation you present. In neither case do you provide the same archive more than once. Rather, in the b case you provide different archives with duplicate members. Linkers generally do not have problems with specifying the same archive multiple times in the same link command. Under some circumstances it may even be necessary to do so; it should not present a problem.
Providing distinct archives with duplicate members probably will not present a problem, except possibly for bloating your code with duplicate function implementations. This is a bit less certain, but I doubt it would present a problem in practice.
Whether that's good practice is a matter of opinion, but I'm inclined to think not. It's also not clear to me what gain you see in such an approach. On the other hand, I won't be sharpening any stakes or preparing any kindling if you decide to go ahead anyway.
I'm looking for a way to find the names of the variables accessed by a given instruction (that performs a memory access).
Using debugging symbols and, for example, addr2line or objdump it's easy to convert instruction addresses into source code files + line numbers, but unfortunately often a single source code line contains more than one variable so this method does not have sufficiently fine granularity.
I've found that objdump is able to convert instruction addresses to global variables. But I haven't yet found a way to do this for local variables. For example, in the example below, I'd like to know that the instruction at address 0x4004c4 is accessing the local variable "local_hello" and that the instruction at address 0x4004c9 is accessing the local variable "local_hello2".
Hello.c:
int global_hello = 4;
int main(){
int local_hello = 3;
int local_hello2 = 0;
local_hello2 = global_hello + local_hello;
return local_hello2;
}
Using "objdump -S hello":
local_hello2 = global_hello + local_hello;
4004be: 8b 15 cc 03 20 00 mov 0x2003cc(%rip),%edx # 600890 <global_hello>
4004c4: 8b 45 fc mov -0x4(%rbp),%eax
4004c7: 01 d0 add %edx,%eax
4004c9: 89 45 f8 mov %eax,-0x8(%rbp)
This might work for simple programs built with no or only moderate optimization, but it becomes difficult once the compiler optimizes more aggressively.
You might want to look into gdb sources to learn about the efforts to connect variables to optimized compiler output.
What's your objective, after all?
For the bounty: How can this behavior be disabled on a case-by-case basis without disabling or lowering the optimization level?
The following conditional expression was compiled on MinGW GCC 3.4.5, where a is of type signed long, and m is of type unsigned long.
if (!a && m > 0x002 && m < 0x111)
The CFLAGS used were -g -O2. Here is the corresponding assembly GCC output (dumped with objdump)
120: 8b 5d d0 mov ebx,DWORD PTR [ebp-0x30]
123: 85 db test ebx,ebx
125: 0f 94 c0 sete al
128: 31 d2 xor edx,edx
12a: 83 7d d4 02 cmp DWORD PTR [ebp-0x2c],0x2
12e: 0f 97 c2 seta dl
131: 85 c2 test edx,eax
133: 0f 84 1e 01 00 00 je 257 <_MyFunction+0x227>
139: 81 7d d4 10 01 00 00 cmp DWORD PTR [ebp-0x2c],0x110
140: 0f 87 11 01 00 00 ja 257 <_MyFunction+0x227>
Instructions 120-131 can easily be traced as first evaluating !a, followed by the evaluation of m > 0x002. The first conditional jump does not occur until 133. By this time, two expressions have been evaluated, regardless of the outcome of the first expression, !a. If a were nonzero (so that !a is false), the expression could (and should) be concluded immediately, which is not done here.
How does this relate to the C standard, which requires Boolean operators to short-circuit as soon as the outcome can be determined?
The C standard only specifies the behavior of an "abstract machine"; it does not specify the generation of assembly. As long as the observable behavior of a program matches that on the abstract machine, the implementation can use whatever physical mechanism it likes for implementing the language constructs. The relevant section in the standard (C99) is 5.1.2.3 Program execution.
It is probably a compiler optimization since comparing integral types has no side effects. You could try compiling without optimizations or using a function that has side effects instead of the comparison operator and see if it still does this.
For example, try
if (printf("a") || printf("b")) {
printf("c\n");
}
and it should print ac
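The same experiment can be made checkable by counting evaluations instead of printing. The following is a minimal sketch; the helper names left, right, and right_calls_for_or are hypothetical and exist only for illustration. Because incrementing a counter is a side effect, the compiler may not evaluate the right operand when the left one already decides the result.

```c
#include <assert.h>

/* Hypothetical counters recording how often each operand is evaluated. */
static int left_calls;
static int right_calls;

/* Each helper has a visible side effect: it bumps its counter. */
static int left(int value)  { left_calls++;  return value; }
static int right(int value) { right_calls++; return value; }

/* Evaluates left(l) || right(r) and reports how often right() ran. */
int right_calls_for_or(int l, int r)
{
    left_calls = 0;
    right_calls = 0;
    if (left(l) || right(r)) {
        /* Taken when either operand is nonzero. */
    }
    /* The left operand of || is always evaluated exactly once. */
    assert(left_calls == 1);
    return right_calls;
}
```

Calling right_calls_for_or(1, 1) should report zero evaluations of the right operand, while right_calls_for_or(0, 1) should report one, at any optimization level.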
As others have mentioned, this assembly output is a compiler optimization that doesn't affect program execution (as far as the compiler can tell). If you want to selectively disable this optimization, you need to tell the compiler that your variables should not be optimized across the sequence points in the code.
Sequence points occur at the end of each full expression (the controlling expressions of if, switch, while and do, all three expressions of for, and the expression in a return statement) and after the first operand of the logical OR, logical AND, conditional (?:) and comma operators.
To prevent compiler optimization across these points, you must declare your variable volatile. In your example, you can specify
volatile long a;
unsigned long m;
{...}
if (!a && m > 0x002 && m < 0x111) {...}
The reason that this works is that volatile is used to instruct the compiler that it can't predict the behavior of an equivalent machine with respect to the variable. Therefore, it must strictly obey the sequence points in your code.
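As a hedged sketch of this idea: the snippet above marks only a as volatile, whereas the version below assumes that both operands are volatile, since the comparisons on a non-volatile m have no side effects and could still be evaluated eagerly. The function name in_window and the pointer parameters are hypothetical. Each read of a volatile object counts as a side effect, so the compiler has to perform the reads exactly as the abstract machine prescribes, which means *m is not read when *a is nonzero. This constrains the memory accesses, not the exact instruction sequence.

```c
#include <assert.h>

/* Hypothetical helper: both operands are accessed through volatile
 * pointers, so every read must happen as the abstract machine says. */
int in_window(volatile long *a, volatile unsigned long *m)
{
    assert(a && m);
    /* *m is read only if !*a is true; the second read of *m happens
     * only if the first comparison succeeded. */
    return !*a && *m > 0x002 && *m < 0x111;
}
```

Note that *m may be read twice here, once per comparison, because the compiler is not allowed to merge volatile reads.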
The compiler's optimising: it loads a into EBX, sets AL (the low byte of EAX) to the result of !a, sets DL (the low byte of EDX) to the result of m > 0x002, and then branches on the AND of EAX and EDX. This saves a branch and leaves the code running faster, without making any difference at all in terms of side effects.
If you compile with -O0 rather than -O2, I imagine it will produce more naive assembly that more closely matches your expectations.
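The branch-saving transformation can be mimicked at the C level. This is only a sketch: the function names short_circuit, eager, and check_equivalence are hypothetical, and eager is not what GCC literally emits (the listing above still branches before the third comparison). It shows why the compiler is free to evaluate the operands unconditionally: when no operand has side effects, combining them with bitwise AND yields the same result as the short-circuit form.

```c
#include <assert.h>

/* The condition as written in the question. */
int short_circuit(long a, unsigned long m)
{
    return !a && m > 0x002 && m < 0x111;
}

/* Every comparison is evaluated unconditionally and the 0/1 results
 * are combined with bitwise AND, so no branch is needed. */
int eager(long a, unsigned long m)
{
    return (!a) & (m > 0x002) & (m < 0x111);
}

/* Compares both forms over a small range of inputs. */
void check_equivalence(void)
{
    for (long a = -2; a <= 2; a++)
        for (unsigned long m = 0; m <= 0x200; m++)
            assert(short_circuit(a, m) == eager(a, m));
}
```

Since the two functions are indistinguishable to the caller, an optimizer may pick whichever form it estimates to be cheaper.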
The code is behaving correctly (i.e., in accordance with the requirements of the language standard) either way.
It appears that you're trying to find a way to generate specific assembly code. Of two possible assembly code sequences, both of which behave the same way, you find one satisfactory and the other unsatisfactory.
The only really reliable way to guarantee the satisfactory assembly code sequence is to write the assembly code explicitly. gcc does support inline assembly.
C code specifies behavior. Assembly code specifies machine code.
But all this raises the question: why does it matter to you? (I'm not saying it shouldn't, I just don't understand why it should.)
EDIT: How exactly are a and m defined? If, as you suggest, they're related to memory-mapped devices, then they should be declared volatile -- and that might be exactly the solution to your problem. If they're just ordinary variables, then the compiler can do whatever it likes with them (as long as it doesn't affect the program's visible behavior) because you didn't ask it not to.