If in C I write:
int num;
Before I assign anything to num, is the value of num indeterminate?
Static variables (file scope and function static) are initialized to zero:
int x; // zero
int y = 0; // also zero
void foo() {
    static int x; // also zero
}
Non-static variables (local variables) are indeterminate. Reading them prior to assigning a value results in undefined behavior.
void foo() {
    int x;
    printf("%d", x); // undefined behavior: the program is free to crash here
}
In practice, they tend to just have some nonsensical value in there initially - some compilers may even put in specific, fixed values to make it obvious when looking in a debugger - but strictly speaking, the compiler is free to do anything from crashing to summoning demons through your nasal passages.
As for why it's undefined behavior instead of simply "undefined/arbitrary value", there are a number of CPU architectures that have additional flag bits in their representation for various types. A modern example would be the Itanium, which has a "Not a Thing" bit in its registers; of course, the C standard drafters were considering some older architectures.
Attempting to work with a value with these flag bits set can result in a CPU exception in an operation that really shouldn't fail (eg, integer addition, or assigning to another variable). And if you go and leave a variable uninitialized, the compiler might pick up some random garbage with these flag bits set - meaning touching that uninitialized variable may be deadly.
0 if static or global, indeterminate if storage class is auto
C has always been very specific about the initial values of objects. If global or static, they will be zeroed. If auto, the value is indeterminate.
This was the case in pre-C89 compilers and was so specified by K&R and in DMR's original C report.
This was the case in C89, see section 6.5.7 Initialization.
If an object that has automatic
storage duration is not initialized
explicitly, its value is
indeterminate. If an object that has
static storage duration is not
initialized explicitly, it is
initialized implicitly as if every
member that has arithmetic type were
assigned 0 and every member that has
pointer type were assigned a null
pointer constant.
This was the case in C99, see section 6.7.8 Initialization.
If an object that has automatic
storage duration is not initialized
explicitly, its value is
indeterminate. If an object that has
static storage duration is not
initialized explicitly, then: — if it
has pointer type, it is initialized to
a null pointer; — if it has arithmetic
type, it is initialized to (positive
or unsigned) zero; — if it is an
aggregate, every member is initialized
(recursively) according to these
rules; — if it is a union, the first
named member is initialized
(recursively) according to these
rules.
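To make the "recursively" part of that wording concrete, here is a minimal sketch that checks the defaults on an aggregate with static storage duration. The struct record type and the function names are made up for illustration; only the zero/null defaults come from the quoted text.

```c
#include <stddef.h>

/* Hypothetical aggregate mixing arithmetic, pointer and array members. */
struct record {
    int         count;
    double      ratio;
    const char *label;
    int         tags[3];
};

/* Static storage duration, no explicit initializer. */
static struct record file_scope_record;

/* Returns 1 if every member of *r holds its zero/null default, else 0. */
static int is_zeroed(const struct record *r)
{
    if (r->count != 0 || r->ratio != 0.0 || r->label != NULL)
        return 0;
    for (size_t i = 0; i < sizeof r->tags / sizeof r->tags[0]; i++)
        if (r->tags[i] != 0)
            return 0;
    return 1;
}

/* Checks both a file-scope object and a function-static object. */
int static_objects_start_zeroed(void)
{
    static struct record function_static_record; /* also static storage */
    return is_zeroed(&file_scope_record) && is_zeroed(&function_static_record);
}
```

Calling static_objects_start_zeroed() should return 1 on a conforming implementation. Doing the same check on an automatic struct record with no initializer would read indeterminate values, so it is deliberately left out.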
As to what exactly indeterminate means, I'm not sure for C89, C99 says:
3.17.2 indeterminate value: either an unspecified value or a trap
representation
But regardless of what standards say, in real life, each stack page actually does start off as zero, but when your program looks at any auto storage class values, it sees whatever was left behind by your own program when it last used those stack addresses. If you allocate a lot of auto arrays you will see them eventually start neatly with zeroes.
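If you actually want an auto array to start out as zeroes, don't rely on what the stack happened to contain; ask for it. A minimal sketch of the two usual ways to do that (the function names and the array size are just for illustration):

```c
#include <stddef.h>
#include <string.h>

/* Sums an automatic array that was zeroed with a brace initializer. */
int sum_brace_initialized(void)
{
    int values[16] = {0};   /* first element set to 0, the rest are zeroed too */
    int sum = 0;
    for (size_t i = 0; i < sizeof values / sizeof values[0]; i++)
        sum += values[i];
    return sum;
}

/* Sums an automatic array that was zeroed with memset before any read. */
int sum_memset_initialized(void)
{
    int values[16];
    int sum = 0;
    memset(values, 0, sizeof values);   /* all-bits-zero is the value 0 for int */
    for (size_t i = 0; i < sizeof values / sizeof values[0]; i++)
        sum += values[i];
    return sum;
}
```

Both functions return 0 every time, regardless of what previously lived at those stack addresses.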
You might wonder, why is it this way? A different SO answer deals with that question, see: https://stackoverflow.com/a/2091505/140740
It depends on the storage duration of the variable. A variable with static storage duration is always implicitly initialized with zero.
As for automatic (local) variables, an uninitialized variable has indeterminate value. Indeterminate value, among other things, means that whatever "value" you might "see" in that variable is not only unpredictable, it is not even guaranteed to be stable. For example, in practice (i.e. ignoring the UB for a second) this code
int num;
int a = num;
int b = num;
does not guarantee that variables a and b will receive identical values. Interestingly, this is not some pedantic theoretical concept; it readily happens in practice as a consequence of optimization.
So in general, the popular answer that "it is initialized with whatever garbage was in memory" is not even remotely correct. An uninitialized variable's behavior is different from that of a variable initialized with garbage.
Ubuntu 15.10, Kernel 4.2.0, x86-64, GCC 5.2.1 example
Enough standards, let's look at an implementation :-)
Local variable
Standards: undefined behavior.
Implementation: the program allocates stack space, and never moves anything to that address, so whatever was there previously is used.
#include <stdio.h>
int main() {
    int i;
    printf("%d\n", i);
}
compile with:
gcc -O0 -std=c99 a.c
outputs:
0
and decompiles with:
objdump -dr a.out
to:
0000000000400536 <main>:
400536: 55 push %rbp
400537: 48 89 e5 mov %rsp,%rbp
40053a: 48 83 ec 10 sub $0x10,%rsp
40053e: 8b 45 fc mov -0x4(%rbp),%eax
400541: 89 c6 mov %eax,%esi
400543: bf e4 05 40 00 mov $0x4005e4,%edi
400548: b8 00 00 00 00 mov $0x0,%eax
40054d: e8 be fe ff ff callq 400410 <printf@plt>
400552: b8 00 00 00 00 mov $0x0,%eax
400557: c9 leaveq
400558: c3 retq
From our knowledge of x86-64 calling conventions:
%rdi is the first printf argument, thus the string "%d\n" at address 0x4005e4
%rsi is the second printf argument, thus i.
It comes from -0x4(%rbp), which is the first 4-byte local variable.
At this point, rbp points into the first page of the stack, which has been allocated by the kernel, so to understand that value we would have to look into the kernel code and find out what it sets that memory to.
TODO does the kernel set that memory to something before reusing it for other processes when a process dies? If not, the new process would be able to read the memory of other finished programs, leaking data. See: Are uninitialized values ever a security risk?
We can then also play with our own stack modifications and write fun things like:
#include <assert.h>
int f() {
    int i = 13;
    return i;
}
int g() {
    int i;
    return i;
}
int main() {
    f();
    assert(g() == 13);
}
Note that GCC 11 seems to produce a different assembly output, and the above code stops "working", it is undefined behavior after all: Why does -O3 in gcc seem to initialize my local variable to 0, while -O0 does not?
Local variable in -O3
Implementation analysis at: What does <value optimized out> mean in gdb?
Global variables
Standards: 0
Implementation: .bss section.
#include <stdio.h>
int i;
int main() {
    printf("%d\n", i);
}
gcc -O0 -std=c99 a.c
compiles to:
0000000000400536 <main>:
400536: 55 push %rbp
400537: 48 89 e5 mov %rsp,%rbp
40053a: 8b 05 04 0b 20 00 mov 0x200b04(%rip),%eax # 601044 <i>
400540: 89 c6 mov %eax,%esi
400542: bf e4 05 40 00 mov $0x4005e4,%edi
400547: b8 00 00 00 00 mov $0x0,%eax
40054c: e8 bf fe ff ff callq 400410 <printf@plt>
400551: b8 00 00 00 00 mov $0x0,%eax
400556: 5d pop %rbp
400557: c3 retq
400558: 0f 1f 84 00 00 00 00 nopl 0x0(%rax,%rax,1)
40055f: 00
# 601044 <i> says that i is at address 0x601044 and:
readelf -SW a.out
contains:
[25] .bss NOBITS 0000000000601040 001040 000008 00 WA 0 0 4
which says 0x601044 is right in the middle of the .bss section, which starts at 0x601040 and is 8 bytes long.
The ELF standard then guarantees that the section named .bss is completely filled with zeros:
.bss This section holds uninitialized data that contribute to the
program’s memory image. By definition, the system initializes the
data with zeros when the program begins to run. The section occupies
no file space, as indicated by the section type, SHT_NOBITS.
Furthermore, the type SHT_NOBITS is efficient and occupies no space on the executable file:
sh_size This member gives the section’s size in bytes. Unless the section
type is SHT_NOBITS, the section occupies sh_size
bytes in the file. A section of type SHT_NOBITS may have a non-zero
size, but it occupies no space in the file.
Then it is up to the Linux kernel to zero out that memory region when loading the program into memory when it gets started.
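For comparison, here is a small sketch with one global that has no initializer and one that has a non-zero initializer. The variable and function names are made up. On typical ELF toolchains the first lands in .bss and the second in .data, but that placement is an assumption about the toolchain, not something the C standard promises.

```c
/* No initializer: static storage duration, so it reads as 0 at startup.
   On typical ELF toolchains this is placed in .bss (assumption). */
int counter_without_initializer;

/* Non-zero initializer: the value has to be stored in the executable.
   On typical ELF toolchains this is placed in .data (assumption). */
int counter_with_initializer = 42;

int read_without_initializer(void)
{
    return counter_without_initializer;
}

int read_with_initializer(void)
{
    return counter_with_initializer;
}
```

Running the same readelf -SW a.out as above on a program containing these should show where each one ended up on your system.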
That depends. If that definition is global (outside any function) then num will be initialized to zero. If it's local (inside a function) then its value is indeterminate. In theory, even attempting to read the value has undefined behavior -- C allows for the possibility of bits that don't contribute to the value, but have to be set in specific ways for you to even get defined results from reading the variable.
The basic answer is, yes it is undefined.
If you are seeing odd behavior because of this, it may depend on where the variable is declared. If it is declared within a function, on the stack, then the contents will more than likely be different every time the function gets called. If it has static or module scope, it is zero-initialized rather than undefined, and it will not change between calls.
Because computers have finite storage capacity, automatic variables will typically be held in storage elements (whether registers or RAM) that have previously been used for some other arbitrary purpose. If such a variable is used before a value has been assigned to it, that storage may hold whatever it held previously, and so the contents of the variable will be unpredictable.
As an additional wrinkle, many compilers may keep variables in registers which are larger than the associated types. Although a compiler would be required to ensure that any value which is written to a variable and read back will be truncated and/or sign-extended to its proper size, many compilers will perform such truncation when variables are written and expect that it will have been performed before the variable is read. On such compilers, something like:
uint16_t hey(uint32_t x, uint32_t mode)
{
    uint16_t q;
    if (mode==1) q=2;
    if (mode==3) q=4;
    return q;
}
uint32_t wow(uint32_t mode) {
    return hey(1234567, mode);
}
might very well result in wow() storing the values 1234567 and mode into
registers 0 and 1, respectively, and calling hey(). Since x isn't needed
within "hey", and since functions are supposed to put their return value
into register 0, the compiler may allocate register 0 to q. If mode is 1
or 3, register 0 will be loaded with 2 or 4, respectively, but if it is
some other value, the function may return whatever was in register 0 (i.e.
the value 1234567) even though that value is not within the range of uint16_t.
To avoid requiring compilers to do extra work to ensure that uninitialized
variables never seem to hold values outside their domain, and avoid needing
to specify indeterminate behaviors in excessive detail, the Standard says
that use of uninitialized automatic variables is Undefined Behavior. In
some cases, the consequences of this may be even more surprising than a
value being outside the range of its type. For example, given:
void moo(int mode)
{
    if (mode < 5)
        launch_nukes();
    hey(0, mode);
}
a compiler could infer that because invoking moo() with a mode which is
greater than 3 will inevitably lead to the program invoking Undefined
Behavior, the compiler may omit any code which would only be relevant
if mode is 4 or greater, such as the code which would normally prevent
the launch of nukes in such cases. Note that neither the Standard, nor
modern compiler philosophy, would care about the fact that the return value
from "hey" is ignored--the act of trying to return it gives a compiler
unlimited license to generate arbitrary code.
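The way out is to make sure every path assigns q before it is returned. A minimal sketch, where the names hey_fixed and wow_fixed and the default of 0 for "other" modes are my own choices rather than anything the original code specifies:

```c
#include <stdint.h>

/* Same shape as hey(), but q has a defined value on every path. */
uint16_t hey_fixed(uint32_t x, uint32_t mode)
{
    uint16_t q = 0;   /* hypothetical default for modes other than 1 and 3 */
    (void)x;          /* unused, kept only to mirror the original signature */
    if (mode == 1) q = 2;
    if (mode == 3) q = 4;
    return q;
}

uint32_t wow_fixed(uint32_t mode)
{
    return hey_fixed(1234567, mode);
}
```

With this version the compiler has nothing to infer from undefined behavior: wow_fixed(1) gives 2, wow_fixed(3) gives 4, and any other mode gives 0.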
If the storage class is static or global, then during loading the BSS initialises the variable or memory location (ML) to 0, unless the variable is explicitly assigned an initial value. For local uninitialized variables, the memory location may hold a trap representation. So if a register containing important information is overwritten by the compiler, the program may crash,
but some compilers may have a mechanism to avoid such a problem.
I was working with the NEC V850 series when I realised there is a trap representation, with bit patterns that represent undefined values, for data types other than char. When I took an uninitialized char I got a default value of zero, due to the trap representation. This might be useful for anyone using the NEC V850ES.
As far as I have seen, it mostly depends on the compiler, but in most cases the value is assumed to be 0 by the compilers.
I got a garbage value with VC++, while TC gave the value 0.
I printed it like this:
int i;
printf("%d", i);
I have an R package with C compiled code that's been relatively stable for quite a while and is frequently tested against a broad variety of platforms and compilers (windows/osx/debian/fedora gcc/clang).
More recently a new platform was added to test the package again:
Logs from checks with gcc trunk aka 10.0.1 compiled from source
on Fedora 30. (For some archived packages, 10.0.0.)
x86_64 Fedora 30 Linux
FFLAGS="-g -O2 -mtune=native -Wall -fallow-argument-mismatch"
CFLAGS="-g -O2 -Wall -pedantic -mtune=native -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong -fstack-clash-protection -fcf-protection"
CXXFLAGS="-g -O2 -Wall -pedantic -mtune=native -Wno-ignored-attributes -Wno-deprecated-declarations -Wno-parentheses -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong -fstack-clash-protection -fcf-protection"
At which point the compiled code promptly started segfaulting along these lines:
*** caught segfault ***
address 0x1d00000001, cause 'memory not mapped'
I've been able to reproduce the segfault consistently by using the rocker/r-base docker container with gcc-10.0.1 with optimization level -O2. Running a lower optimization gets rid of the problem. Running any other set-up, including under valgrind (both -O0 and -O2), UBSAN (gcc/clang), shows no problems at all. I'm also reasonably sure this ran under gcc-10.0.0, but don't have the data.
I ran the gcc-10.0.1 -O2 version with gdb and noticed something that seems odd to me:
While stepping through the highlighted section it appears the initialization of the second elements of the arrays is skipped (R_alloc is a wrapper around malloc that self garbage collects when returning control to R; the segfault happens before return to R). Later, the program crashes when the uninitialized element (in the gcc-10.0.1 -O2 version) is accessed.
I fixed this by explicitly initializing the element in question everywhere in the code that eventually led to the usage of the element, but it really should have been initialized to an empty string, or at least that's what I would have assumed.
Am I missing something obvious or doing something stupid? Both are reasonably likely as C is my second language by far. It's just strange that this just cropped up now, and I can't figure out what the compiler is trying to do.
UPDATE: Instructions to reproduce this, although this will only reproduce so long as debian:testing docker container has gcc-10 at gcc-10.0.1. Also, don't just run these commands if you don't trust me.
Sorry this is not a minimal reproducible example.
docker pull rocker/r-base
docker run --rm -ti --security-opt seccomp=unconfined \
rocker/r-base /bin/bash
apt-get update
apt-get install gcc-10 gdb
gcc-10 --version # confirm 10.0.1
# gcc-10 (Debian 10-20200222-1) 10.0.1 20200222 (experimental)
# [master revision 01af7e0a0c2:487fe13f218:e99b18cf7101f205bfdd9f0f29ed51caaec52779]
mkdir ~/.R
touch ~/.R/Makevars
echo "CC = gcc-10
CFLAGS = -g -O2 -Wall -pedantic -mtune=native -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong -fstack-clash-protection -fcf-protection
" >> ~/.R/Makevars
R -d gdb --vanilla
Then in the R console, after typing run to get gdb to run the program:
f.dl <- tempfile()
f.uz <- tempfile()
github.url <- 'https://github.com/brodieG/vetr/archive/v0.2.8.zip'
download.file(github.url, f.dl)
unzip(f.dl, exdir=f.uz)
install.packages(
file.path(f.uz, 'vetr-0.2.8'), repos=NULL,
INSTALL_opts="--install-tests", type='source'
)
# minimal set of commands to segfault
library(vetr)
alike(pairlist(a=1, b="character"), pairlist(a=1, b=letters))
alike(pairlist(1, "character"), pairlist(1, letters))
alike(NULL, 1:3) # not a wild card at top level
alike(list(NULL), list(1:3)) # but yes when nested
alike(list(NULL, NULL), list(list(list(1, 2, 3)), 1:25))
alike(list(NULL), list(1, 2))
alike(list(), list(1, 2))
alike(matrix(integer(), ncol=7), matrix(1:21, nrow=3))
alike(matrix(character(), nrow=3), matrix(1:21, nrow=3))
alike(
matrix(integer(), ncol=3, dimnames=list(NULL, c("R", "G", "B"))),
matrix(1:21, ncol=3, dimnames=list(NULL, c("R", "G", "B")))
)
# Adding tests from docs
mx.tpl <- matrix(
integer(), ncol=3, dimnames=list(row.id=NULL, c("R", "G", "B"))
)
mx.cur <- matrix(
sample(0:255, 12), ncol=3, dimnames=list(row.id=1:4, rgb=c("R", "G", "B"))
)
mx.cur2 <-
matrix(sample(0:255, 12), ncol=3, dimnames=list(1:4, c("R", "G", "B")))
alike(mx.tpl, mx.cur2)
Inspecting in gdb pretty quickly shows (if I understand correctly) that
CSR_strmlen_x is trying to access the string that was not initialized.
UPDATE 2: this is a highly recursive function, and on top of that the string initialization bit gets called many, many times. This is mostly b/c I was being lazy, we only need the strings initialized for the one time we actually encounter something we want to report in the recursion, but it was easier to initialize every time it is possible to encounter something. I mention this because what you'll see next shows multiple initializations, but only one of them (presumably the one with address <0x1400000001>) is being used.
I can't guarantee that the stuff I'm showing here is directly related to the element that caused the segfault (though it is the same illegal address access), but as @nate-eldredge asked, it does show that the array element is not initialized either just before return or just after return in the calling function. Note the calling function is initializing 8 of these, and I show them all, with all of them filled with either garbage or inaccessible memory.
UPDATE 3, disassembly of function in question:
Breakpoint 1, ALIKEC_res_strings_init () at alike.c:75
75 return res;
(gdb) p res.current[0]
$1 = 0x7ffff46a0aa5 "%s%s%s%s"
(gdb) p res.current[1]
$2 = 0x1400000001 <error: Cannot access memory at address 0x1400000001>
(gdb) disas /m ALIKEC_res_strings_init
Dump of assembler code for function ALIKEC_res_strings_init:
53 struct ALIKEC_res_strings ALIKEC_res_strings_init() {
0x00007ffff4687fc0 <+0>: endbr64
54 struct ALIKEC_res_strings res;
55
56 res.target = (const char **) R_alloc(5, sizeof(const char *));
0x00007ffff4687fc4 <+4>: push %r12
0x00007ffff4687fc6 <+6>: mov $0x8,%esi
0x00007ffff4687fcb <+11>: mov %rdi,%r12
0x00007ffff4687fce <+14>: push %rbx
0x00007ffff4687fcf <+15>: mov $0x5,%edi
0x00007ffff4687fd4 <+20>: sub $0x8,%rsp
0x00007ffff4687fd8 <+24>: callq 0x7ffff4687180 <R_alloc@plt>
0x00007ffff4687fdd <+29>: mov $0x8,%esi
0x00007ffff4687fe2 <+34>: mov $0x5,%edi
0x00007ffff4687fe7 <+39>: mov %rax,%rbx
57 res.current = (const char **) R_alloc(5, sizeof(const char *));
0x00007ffff4687fea <+42>: callq 0x7ffff4687180 <R_alloc@plt>
58
59 res.target[0] = "%s%s%s%s";
0x00007ffff4687fef <+47>: lea 0x1764a(%rip),%rdx # 0x7ffff469f640
0x00007ffff4687ff6 <+54>: lea 0x18aa8(%rip),%rcx # 0x7ffff46a0aa5
0x00007ffff4687ffd <+61>: mov %rcx,(%rbx)
60 res.target[1] = "";
61 res.target[2] = "";
0x00007ffff4688000 <+64>: mov %rdx,0x10(%rbx)
62 res.target[3] = "";
0x00007ffff4688004 <+68>: mov %rdx,0x18(%rbx)
63 res.target[4] = "";
0x00007ffff4688008 <+72>: mov %rdx,0x20(%rbx)
64
65 res.tar_pre = "be";
66
67 res.current[0] = "%s%s%s%s";
0x00007ffff468800c <+76>: mov %rax,0x8(%r12)
0x00007ffff4688011 <+81>: mov %rcx,(%rax)
68 res.current[1] = "";
69 res.current[2] = "";
0x00007ffff4688014 <+84>: mov %rdx,0x10(%rax)
70 res.current[3] = "";
0x00007ffff4688018 <+88>: mov %rdx,0x18(%rax)
71 res.current[4] = "";
0x00007ffff468801c <+92>: mov %rdx,0x20(%rax)
72
73 res.cur_pre = "is";
74
75 return res;
=> 0x00007ffff4688020 <+96>: lea 0x14fe0(%rip),%rax # 0x7ffff469d007
0x00007ffff4688027 <+103>: mov %rax,0x10(%r12)
0x00007ffff468802c <+108>: lea 0x14fcd(%rip),%rax # 0x7ffff469d000
0x00007ffff4688033 <+115>: mov %rbx,(%r12)
0x00007ffff4688037 <+119>: mov %rax,0x18(%r12)
0x00007ffff468803c <+124>: add $0x8,%rsp
0x00007ffff4688040 <+128>: pop %rbx
0x00007ffff4688041 <+129>: mov %r12,%rax
0x00007ffff4688044 <+132>: pop %r12
0x00007ffff4688046 <+134>: retq
0x00007ffff4688047: nopw 0x0(%rax,%rax,1)
End of assembler dump.
UPDATE 4:
So, trying to parse through the standard here are the parts of it that seem relevant (C11 draft):
6.3.2.3 Par7 Conversions > Other Operands > Pointers
A pointer to an object type may be converted to a pointer to a
different object type. If the resulting pointer is not correctly
aligned 68) for the referenced type, the behavior is undefined.
Otherwise, when converted back again, the result shall compare
equal to the original pointer. When a pointer to an object is
converted to a pointer to a character type, the result points to the
lowest addressed byte of the object. Successive increments of
the result, up to the size of the object, yield pointers to the
remaining bytes of the object.
6.5 Par6 Expressions
The effective type of an object for an access to its stored value is the
declared type of the object, if any. 87) If a value is stored into
an object having no declared type through an lvalue having a
type that is not a character type, then the type of the lvalue becomes
the effective type of the object for that access and for subsequent
accesses that do not modify the stored value. If a value is
copied into an object having no declared type
using memcpy or memmove, or is copied as an array of character type, then
the effective type of the modified object for that access and for
subsequent accesses that do not modify the value is the effective type
of the object from which the value is copied, if it has one. For all
other accesses to an object having no declared type, the effective
type of the object is simply the type of the lvalue used for the
access.
87) Allocated objects have no declared type.
IIUC R_alloc returns an offset into a malloced block that is guaranteed to be double aligned, and the size of the block after the offset is of the requested size (there is also allocation before the offset for R specific data). R_alloc casts that pointer to (char *) on return.
Section 6.2.5 Par 29
A pointer to void shall have the same representation and
alignment requirements as a pointer to a character
type. 48) Similarly, pointers to qualified or unqualified versions
of compatible types shall have the same representation and
alignment requirements. All pointers to structure types shall have
the same representation and alignment requirements as each other.
All pointers to union types shall have the same
representation and alignment requirements as each other.
Pointers to other types need not have the same representation
or alignment requirements.
48) The same representation and alignment requirements are meant to imply interchangeability as arguments to functions, return values from functions, and members of unions.
So the question is "are we allowed to recast the (char *) to (const char **) and write to it as (const char **)". My reading of the above is that so long as pointers on the systems the code runs on have alignment compatible with double alignment, then it's okay.
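The alignment part of that reading can at least be checked mechanically. A small sketch using C11's _Alignof, with function names made up for illustration; it only answers the alignment question, not the effective-type one:

```c
#include <stddef.h>

/* Returns 1 when storage aligned for double is also suitably aligned
   for const char *. Alignments are powers of two, so "<=" is enough. */
int double_alignment_suits_char_pointer(void)
{
    return _Alignof(const char *) <= _Alignof(double);
}

/* Same check against max_align_t, the alignment malloc must satisfy. */
int max_alignment_suits_char_pointer(void)
{
    return _Alignof(const char *) <= _Alignof(max_align_t);
}
```

On common 32-bit and 64-bit platforms I would expect both to return 1, but the first one is a property of the platform rather than a guarantee of the language.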
Are we violating "strict aliasing"? i.e.:
6.5 Par 7
An object shall have its stored value accessed only by an lvalue
expression that has one of the following types: 88)
— a type compatible with the effective type of the object
...
88) The intent of this list is to specify those circumstances in which an object may or may not be aliased.
So, what should the compiler think the effective type of the object pointed to by res.target (or res.current) is? Presumably the declared type (const char **), or is this actually ambiguous? It feels to me that it isn't in this case only because there is no other 'lvalue' in scope that accesses the same object.
I'll admit I'm struggling mightily to extract sense from these sections of the standard.
Summary: This appears to be a bug in gcc, related to string optimization. A self-contained testcase is below. There was initially some doubt as to whether the code is correct, but I think it is.
I have reported the bug as PR 93982. A proposed fix was committed but it does not fix it in all cases, leading to the followup PR 94015 (godbolt link).
You should be able to work around the bug by compiling with the flag -fno-optimize-strlen.
I was able to reduce your test case to the following minimal example (also on godbolt):
struct a {
    const char ** target;
};
char* R_alloc(void);
struct a foo(void) {
    struct a res;
    res.target = (const char **) R_alloc();
    res.target[0] = "12345678";
    res.target[1] = "";
    res.target[2] = "";
    res.target[3] = "";
    res.target[4] = "";
    return res;
}
With gcc trunk (gcc version 10.0.1 20200225 (experimental)) and -O2 (all other options turned out to be unnecessary), the generated assembly on amd64 is as follows:
.LC0:
.string "12345678"
.LC1:
.string ""
foo:
subq $8, %rsp
call R_alloc
movq $.LC0, (%rax)
movq $.LC1, 16(%rax)
movq $.LC1, 24(%rax)
movq $.LC1, 32(%rax)
addq $8, %rsp
ret
So you are quite right that the compiler is failing to initialize res.target[1] (note the conspicuous absence of movq $.LC1, 8(%rax)).
It is interesting to play with the code and see what affects the "bug". Perhaps significantly, changing the return type of R_alloc to void * makes it go away, and gives you "correct" assembly output. Maybe less significantly but more amusingly, changing the string "12345678" to be either longer or shorter also makes it go away.
Previous discussion, now resolved - the code is apparently legal.
The question I have is whether your code is actually legal. The fact that you take the char * returned by R_alloc() and cast it to const char **, and then store a const char * seems like it might violate the strict aliasing rule, as char and const char * are not compatible types. There is an exception that allows you to access any object as char (to implement things like memcpy), but this is the other way around, and as best I understand it, that's not allowed. It makes your code produce undefined behavior and so the compiler can legally do whatever the heck it wants.
If this is so, the correct fix would be for R to change their code so that R_alloc() returns void * instead of char *. Then there would be no aliasing problem. Unfortunately, that code is outside your control, and it's not clear to me how you can use this function at all without violating strict aliasing. A workaround might be to interpose a temporary variable, e.g. void *tmp = R_alloc(); res.target = tmp; which solves the problem in the test case, but I'm still not sure if it's legal.
However, I am not sure of this "strict aliasing" hypothesis, because compiling with -fno-strict-aliasing, which AFAIK is supposed to make gcc allow such constructs, does not make the problem go away!
Update. Trying some different options, I found that either -fno-optimize-strlen or -fno-tree-forwprop will result in "correct" code being generated. Also, using -O1 -foptimize-strlen yields the incorrect code (but -O1 -ftree-forwprop does not).
After a little git bisect exercise, the error seems to have been introduced in commit 34fcf41e30ff56155e996f5e04.
Update 2. I tried digging into the gcc source a little bit, just to see what I could learn. (I don't claim to be any sort of compiler expert!)
It looks like the code in tree-ssa-strlen.c is meant to keep track of strings appearing in the program. As near as I can tell, the bug is that in looking at the statement res.target[0] = "12345678"; the compiler conflates the address of the string literal "12345678" with the string itself. (That seems to be related to this suspicious code which was added in the aforementioned commit, where if it tries to count the bytes of a "string" that is actually an address, it instead looks at what that address points to.)
So it thinks that the statement res.target[0] = "12345678", instead of storing the address of "12345678" at the address res.target, is storing the string itself at that address, as if the statement were strcpy(res.target, "12345678"). Note for what's ahead that this would result in the trailing nul being stored at address res.target+8 (at this stage in the compiler, all offsets are in bytes).
Now when the compiler looks at res.target[1] = "", it likewise treats this as if it were strcpy(res.target+8, ""), the 8 coming from the size of a char *. That is, as if it were simply storing a nul byte at address res.target+8. However, the compiler "knows" that the previous statement already stored a nul byte at that very address! As such, this statement is "redundant" and can be discarded (here).
This explains why the string has to be exactly 8 characters long to trigger the bug. (Though other multiples of 8 can also trigger the bug in other situations.)
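If you want to check a particular compiler without reading assembly, a runtime test along these lines may help. This is only a sketch: demo_alloc is a hypothetical stand-in for R_alloc (it returns char * like the real one, but is backed by calloc so that a skipped store shows up as a null pointer on mainstream platforms), and I have not confirmed that it reproduces the bug in this exact form.

```c
#include <stdlib.h>
#include <string.h>

struct strings {
    const char **target;
};

/* Hypothetical stand-in for R_alloc(): returns char *, zero-filled memory. */
static char *demo_alloc(size_t n, size_t size)
{
    return calloc(n, size);
}

static struct strings strings_init(void)
{
    struct strings res;
    res.target = (const char **) demo_alloc(5, sizeof(const char *));
    if (res.target == NULL)
        return res;
    res.target[0] = "12345678";   /* exactly 8 characters, as in the test case */
    res.target[1] = "";
    res.target[2] = "";
    res.target[3] = "";
    res.target[4] = "";
    return res;
}

/* Returns 1 if all five slots hold the expected strings, 0 otherwise. */
int all_slots_initialized(void)
{
    struct strings res = strings_init();
    if (res.target == NULL)
        return 0;
    int ok = strcmp(res.target[0], "12345678") == 0;
    for (int i = 1; i < 5; i++)
        ok = ok && res.target[i] != NULL && res.target[i][0] == '\0';
    free(res.target);
    return ok;
}
```

A correct compiler should make all_slots_initialized() return 1. A return of 0 would suggest that one of the stores was dropped.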
In some library code, there's a pattern of setting up callbacks for events, where the callback may receive an argument. The callback itself may not do anything with the argument, or the argument might be 0, but it is passed. Decomposing it down to the basics, it looks like the following:
#include <stdio.h>
#include <string.h>
void callback_1(char *data) {
    printf("Length of data: %zu\n", strlen(data));
}
void callback_2() {
    printf("No parameters used\n");
}
typedef void (*Callback)(char *);
int main(void) {
    Callback callback;
    callback = callback_1;
    callback("test");
    callback = callback_2;
    callback("test");
    return 0;
}
This compiles and runs on GCC 4.9.3 (32-bit) without anything unexpected.
A good callback function has a signature like callback_1, but occasionally I forget the data parameter if it's not being used. While I'm aware that C isn't always typesafe (especially with regards to void pointers), I expected a warning, since if I had provided a mismatched parameter, e.g. int data, I would receive a warning about incompatible types. If the typedef for Callback didn't accept a parameter, I would receive a compilation error if a callback function had one in the signature.
Is there a way in C to get a warning for the case where a function pointer is assigned to a function where the signature is missing an argument? What happens on the stack if the callback is missing the parameter? Are there possible repercussions of missing the parameter in the callback?
This is because you have callback_2 defined as:
void callback_2()
The empty parenthesis means it takes an unspecified number of arguments. So it qualifies to be assigned to type Callback.
If you change the definition to this:
void callback_2(void)
This explicitly specifies that the function takes 0 arguments, and you'll get an "assignment from incompatible pointer type" warning.
In order to properly catch this condition, compile with -Wstrict-prototypes along with -Wall -Wextra and you'll get the following if you declare or define a function with an empty argument list:
warning: function declaration isn’t a prototype
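If a callback genuinely doesn't need the argument, one way to keep the types honest is to give it the full signature anyway and discard the parameter explicitly. A minimal sketch, where the bookkeeping variables and function names are made up for illustration:

```c
#include <stddef.h>
#include <string.h>

typedef void (*Callback)(char *);

static size_t last_length;  /* length seen by the most recent callback */
static int    call_count;   /* how many callbacks have run */

static void callback_uses_data(char *data)
{
    last_length = strlen(data);
    call_count++;
}

static void callback_ignores_data(char *data)
{
    (void)data;             /* parameter accepted but deliberately unused */
    last_length = 0;
    call_count++;
}

/* Runs both callbacks through the same pointer type.
   Returns the number of calls made, or -1 if the first one saw bad data. */
int run_callbacks(void)
{
    char text[] = "test";
    Callback callback;
    size_t seen;

    call_count = 0;
    callback = callback_uses_data;
    callback(text);
    seen = last_length;     /* 4 for "test" */

    callback = callback_ignores_data;
    callback(text);

    return (seen == 4) ? call_count : -1;
}
```

Both functions now match Callback exactly, so the assignments need no cast, and a callback that forgets the parameter while being declared with (void) gets diagnosed at the assignment.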
Your code not only has undefined behavior, it also uses an obsolescent feature.
6.7.5.3/14
An identifier list declares only the identifiers of the parameters of
the function. An empty list in a function declarator that is part of a
definition of that function specifies that the function has no
parameters. The empty list in a function declarator that is not part
of a definition of that function specifies that no information about
the number or types of the parameters is supplied.
Notice the difference. void f(); is a declarator without a definition. void f() {} is a declarator with a definition.
6.5.2.2/2:
If the expression that denotes the called function has a type that
includes a prototype, the number of arguments shall agree with the
number of parameters. Each argument shall have a type such that its
value may be assigned to an object with the unqualified version of the
type of its corresponding parameter.
6.11.6
The use of function declarators with empty parentheses (not
prototype-format parameter type declarators) is an obsolescent
feature.
It's true that void f() and void f(void) are compatible types, but since f() defines a function that takes no parameters, calling it with parameters is undefined behavior.
OK, enough pedantry, so what actually happens? There is no name mangling in C, so the linker only sees the name of the function. GCC and Clang both emit code that calls the functions in exactly the same way. They store the pointer to the function in a stack slot, load the argument (the "test" string) into a register, then call it. Nothing fishy really happens here. Here's what I get with objdump:
00000000004006b3 <callback_2>:
...
I only included this to show the address of callback_2. The address in this example for callback_1 is 400686.
First some general stack stuff:
4006c4: 55 push rbp
4006c5: 48 89 e5 mov rbp,rsp
4006c8: 48 83 ec 10 sub rsp,0x10
We store the address of callback_1:
4006cc: 48 c7 45 f8 86 06 40 mov QWORD PTR [rbp-0x8],0x400686
4006d3: 00
4006d4: 48 8b 45 f8 mov rax,QWORD PTR [rbp-0x8]
Then our "test" string. Using objdump -s -j .rodata shows that our string starts at offset 7 within the row at address 4007b0.
4007b0 73207573 65640074 65737400 s used.test.
1234567
4007b0 + 7 is 4007b7.
4006d8: bf b7 07 40 00 mov edi,0x4007b7
Call callback_1:
4006dd: ff d0 call rax
Repeat for callback_2:
4006df: 48 c7 45 f8 b3 06 40 mov QWORD PTR [rbp-0x8],0x4006b3
4006e6: 00
4006e7: 48 8b 45 f8 mov rax,QWORD PTR [rbp-0x8]
4006eb: bf b7 07 40 00 mov edi,0x4007b7
4006f0: ff d0 call rax
So why didn't you get any warning? Well, technically you're not violating any constraint of the language, and undefined behavior is not required to be diagnosed. The moral of the story is: if you think code needs a warning, don't write it. But C is a tricky language. So long as you know the language, the caveats and what exactly your compiler is doing, you should be fine.
I'm looking for a way to find the names of the variables accessed by a given instruction (that performs a memory access).
Using debugging symbols and, for example, addr2line or objdump it's easy to convert instruction addresses into source code files + line numbers, but unfortunately often a single source code line contains more than one variable so this method does not have sufficiently fine granularity.
I've found that objdump is able to convert instruction addresses to global variables. But I haven't yet found a way to do this for local variables. For example, in the example below, I'd like to know that the instruction at address 0x4004c4 is accessing the local variable "local_hello" and that the instruction at address 0x4004c9 is accessing the local variable "local_hello2".
Hello.c:
int global_hello = 4;
int main(){
int local_hello = 3;
int local_hello2 = 0;
local_hello2 = global_hello + local_hello;
return local_hello2;
}
Using "objdump -S hello":
local_hello2 = global_hello + local_hello;
4004be: 8b 15 cc 03 20 00 mov 0x2003cc(%rip),%edx # 600890 <global_hello>
4004c4: 8b 45 fc mov -0x4(%rbp),%eax
4004c7: 01 d0 add %edx,%eax
4004c9: 89 45 f8 mov %eax,-0x8(%rbp)
This might work for simple programs with no or only moderate optimization levels but will become difficult with compiler optimization.
You might want to look into gdb sources to learn about the efforts to connect variables to optimized compiler output.
What's your objective, after all?
I think the question says it all. An example covering most standards from C89 to C11 would be helpful. I thought of this one, but I guess it is just undefined behaviour:
#include <stdio.h>
int main( int argc, char* argv[] )
{
const char *s = NULL;
printf( "%c\n", s[0] );
return 0;
}
EDIT:
As some votes requested clarification: I wanted to have a program with a usual programming error (the simplest I could think of was a segfault) that is guaranteed (by the standard) to abort. This is a bit different from the minimal segfault question, which doesn't care about this assurance.
raise() can be used to raise a segfault:
raise(SIGSEGV);
A segmentation fault is an implementation defined behavior. The standard does not define how the implementation should deal with undefined behavior and in fact the implementation could optimize out undefined behavior and still be compliant. To be clear, implementation defined behavior is behavior which is not specified by the standard but the implementation should document. Undefined behavior is code that is non-portable or erroneous and whose behavior is unpredictable and therefore can not be relied on.
If we look at the C99 draft standard §3.4.3 undefined behavior which comes under the Terms, definitions and symbols section in paragraph 1 it says (emphasis mine going forward):
behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements
and in paragraph 2 says:
NOTE Possible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message).
If, on the other hand, you simply want a method defined in the standard that will cause a segmentation fault on most Unix-like systems then raise(SIGSEGV) should accomplish that goal. Although, strictly speaking, SIGSEGV is defined as follows:
SIGSEGV an invalid access to storage
and §7.14 Signal handling <signal.h> says:
An implementation need not generate any of these signals, except as a result of explicit calls to the raise function. Additional signals and pointers to undeclarable functions, with macro definitions beginning, respectively, with the letters SIG and an uppercase letter or with SIG_ and an uppercase letter,219) may also be specified by the implementation. The complete set of signals, their semantics, and their default handling is implementation-defined; all signal numbers shall be positive.
The standard only mentions undefined behavior. It knows nothing about memory segmentation. Also note that the code that produces the error is not standard-conformant. Your code cannot invoke undefined behavior and be standard conformant at the same time.
Nonetheless, the shortest way to produce a segmentation fault on architectures that do generate such faults would be:
int main()
{
*(int*)0 = 0;
}
Why is this sure to produce a segfault? Because access to memory address 0 is always trapped by the system; it can never be a valid access (at least not by userspace code.)
Note of course that not all architectures work the same way. On some of them, the above could not crash at all, but rather produce other kinds of errors. Or the statement could be perfectly fine, even, and memory location 0 is accessible just fine. Which is one of the reasons why the standard doesn't actually define what happens.
A correct program doesn't produce a segfault. And you cannot describe deterministic behaviour of an incorrect program.
A "segmentation fault" is a thing that an x86 CPU does. You get it by attempting to reference memory in an incorrect way. It can also refer to a situation where memory access causes a page fault (i.e. trying to access memory that's not loaded into the page tables) and the OS decides that you had no right to request that memory. To trigger those conditions, you need to program directly for your OS and your hardware. It is nothing that is specified by the C language.
If we assume we are not raising a signal by calling raise, a segmentation fault is likely to come from undefined behavior. Undefined behavior is undefined, and a compiler is free to refuse to translate the program, so no answer relying on undefined behavior is guaranteed to fail on all implementations. Moreover, a program which invokes undefined behavior is an erroneous program.
But this one is the shortest I can get that segfault on my system:
main(){main();}
(I compile with gcc and -std=c89 -O0).
And by the way, does this program really invoke undefined behavior?
main;
That's it.
Really.
Essentially, what this does is it defines main as a variable.
At link time, variables and functions are both just symbols (addresses in memory), so the linker does not distinguish between them, and this code does not produce an error.
However, the problem rests in how the system runs executables. In a nutshell, the C runtime links an environment-preparing entry point into every executable, which basically boils down to "call main".
In this particular case, however, main is a variable, so it is placed in a non-executable section of memory called .bss, intended for variables (as opposed to .text for the code). Trying to execute code in .bss violates its specific segmentation, so the system throws a segmentation fault.
To illustrate, here's (part of) an objdump of the resulting file:
# (unimportant)
Disassembly of section .text:
0000000000001020 <_start>:
1020: f3 0f 1e fa endbr64
1024: 31 ed xor %ebp,%ebp
1026: 49 89 d1 mov %rdx,%r9
1029: 5e pop %rsi
102a: 48 89 e2 mov %rsp,%rdx
102d: 48 83 e4 f0 and $0xfffffffffffffff0,%rsp
1031: 50 push %rax
1032: 54 push %rsp
1033: 4c 8d 05 56 01 00 00 lea 0x156(%rip),%r8 # 1190 <__libc_csu_fini>
103a: 48 8d 0d df 00 00 00 lea 0xdf(%rip),%rcx # 1120 <__libc_csu_init>
# This is where the program should call main
1041: 48 8d 3d e4 2f 00 00 lea 0x2fe4(%rip),%rdi # 402c <main>
1048: ff 15 92 2f 00 00 callq *0x2f92(%rip) # 3fe0 <__libc_start_main#GLIBC_2.2.5>
104e: f4 hlt
104f: 90 nop
# (nice things we still don't care about)
Disassembly of section .data:
0000000000004018 <__data_start>:
...
0000000000004020 <__dso_handle>:
4020: 20 40 00 and %al,0x0(%rax)
4023: 00 00 add %al,(%rax)
4025: 00 00 add %al,(%rax)
...
Disassembly of section .bss:
0000000000004028 <__bss_start>:
4028: 00 00 add %al,(%rax)
...
# main is in .bss (variables) instead of .text (code)
000000000000402c <main>:
402c: 00 00 add %al,(%rax)
...
# aaand that's it!
PS: This won't work if you compile to a flat executable. Instead, you will cause undefined behaviour.
On some platforms, a standard-conforming C program can fail with a segmentation fault if it requests too many resources from the system. For instance, allocating a large object with malloc can appear to succeed, but later, when the object is accessed, it will crash.
Note that such a program is not strictly conforming; programs which meet that definition have to stay within each of the minimum implementation limits.
A standard-conforming C program cannot produce a segmentation fault otherwise, because the only other ways are via undefined behavior.
The SIGSEGV signal can be raised explicitly with raise(SIGSEGV), but the standard does not define any other circumstance in which it is delivered.
(In this answer, "standard-conforming" means: "Uses only the features described in some version of the ISO C standard, avoiding unspecified, implementation-defined or undefined behavior, but not necessarily confined to the minimum implementation limits.")
The simplest form considering the smallest number of characters is:
++*(int*)0;
Most of the answers to this question are talking around the key point, which is: The C standard does not include the concept of a segmentation fault. (It does define the signal number SIGSEGV in <signal.h>, but it does not define any circumstance where that signal is delivered, other than raise(SIGSEGV), which as discussed in other answers doesn't count.)
Therefore, there is no "strictly conforming" program (i.e. program that uses only constructs whose behavior is fully defined by the C standard, alone) that is guaranteed to cause a segmentation fault.
Segmentation faults are defined by a different standard, POSIX. This program is guaranteed to provoke either a segmentation fault, or the functionally equivalent "bus error" (SIGBUS), on any system that is fully conforming with POSIX.1-2008 including the Memory Protection and Advanced Realtime options, provided that the calls to sysconf, posix_memalign and mprotect succeed. My reading of C99 is that this program has implementation-defined (not undefined!) behavior considering only that standard, and therefore it is conforming but not strictly conforming.
#define _XOPEN_SOURCE 700
#include <sys/mman.h>
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <errno.h>
int main(void)
{
size_t pagesize = sysconf(_SC_PAGESIZE);
if (pagesize == (size_t)-1) {
fprintf(stderr, "sysconf: %s\n", strerror(errno));
return 1;
}
void *page;
int err = posix_memalign(&page, pagesize, pagesize);
if (err || !page) {
fprintf(stderr, "posix_memalign: %s\n", strerror(err));
return 1;
}
if (mprotect(page, pagesize, PROT_NONE)) {
fprintf(stderr, "mprotect: %s\n", strerror(errno));
return 1;
}
*(long *)page = 0xDEADBEEF;
return 0;
}
It's hard to define a method to segmentation fault a program on undefined platforms. A segmentation fault is a loose term that is not defined for all platforms (eg. simple small computers).
Considering only the operating systems that support processes, processes can receive notification that a segmentation fault occurred.
Further, limiting operating systems to 'unix like' OSes, a reliable method for a process to receive a SIGSEGV signal is kill(getpid(),SIGSEGV)
As is the case in most cross-platform problems, each platform may (and usually does) have a different definition of seg-faulting.
But to be practical, current mac, lin and win OSes will segfault on
*(int*)0 = 0;
Further, it's not bad behaviour to cause a segfault. Some implementations of assert() cause a SIGSEGV signal which might produce a core file. Very useful when you need to autopsy.
What's worse than causing a segfault is hiding it:
try
{
anyfunc();
}
catch (...)
{
printf("?\n");
}
which hides the origin of an error and all you've got to go on is:
?
.
Here's another way I haven't seen mentioned here:
int main() {
void (*f)(void);
f();
}
In this case f is an uninitialized function pointer. Calling it is undefined behavior, which in practice typically shows up as a segmentation fault.
For the bounty: How can this behavior be disabled on a case-by-case basis without disabling or lowering the optimization level?
The following conditional expression was compiled on MinGW GCC 3.4.5, where a is of type signed long, and m is of type unsigned long.
if (!a && m > 0x002 && m < 0x111)
The CFLAGS used were -g -O2. Here is the corresponding assembly GCC output (dumped with objdump)
120: 8b 5d d0 mov ebx,DWORD PTR [ebp-0x30]
123: 85 db test ebx,ebx
125: 0f 94 c0 sete al
128: 31 d2 xor edx,edx
12a: 83 7d d4 02 cmp DWORD PTR [ebp-0x2c],0x2
12e: 0f 97 c2 seta dl
131: 85 c2 test edx,eax
133: 0f 84 1e 01 00 00 je 257 <_MyFunction+0x227>
139: 81 7d d4 10 01 00 00 cmp DWORD PTR [ebp-0x2c],0x110
140: 0f 87 11 01 00 00 ja 257 <_MyFunction+0x227>
120-131 can easily be traced as first evaluating !a, followed by the evaluation of m > 0x002. The first conditional jump does not occur until 133. By this time, two expressions have been evaluated, regardless of the outcome of the first expression, !a. If a is nonzero (so !a is false), the expression can (and should) be concluded immediately, which is not done here.
How does this relate to the C standard, which requires Boolean operators to short-circuit as soon as the outcome can be determined?
The C standard only specifies the behavior of an "abstract machine"; it does not specify the generation of assembly. As long as the observable behavior of a program matches that on the abstract machine, the implementation can use whatever physical mechanism it likes for implementing the language constructs. The relevant section in the standard (C99) is 5.1.2.3 Program execution.
It is probably a compiler optimization since comparing integral types has no side effects. You could try compiling without optimizations or using a function that has side effects instead of the comparison operator and see if it still does this.
For example, try
if (printf("a") || printf("b")) {
printf("c\n");
}
and it should print ac
As others have mentioned, this assembly output is a compiler optimization that doesn't affect program execution (as far as the compiler can tell). If you want to selectively disable this optimization, you need to tell the compiler that your variables should not be optimized across the sequence points in the code.
Sequence points are control expressions (the evaluations in if, switch, while, do and all three sections of for), logical ORs and ANDs, conditionals (?:), commas and the return statement.
To prevent compiler optimization across these points, you must declare your variable volatile. In your example, you can specify
volatile long a;
unsigned long m;
{...}
if (!a && m > 0x002 && m < 0x111) {...}
The reason that this works is that volatile is used to instruct the compiler that it can't predict the behavior of an equivalent machine with respect to the variable. Therefore, it must strictly obey the sequence points in your code.
The compiler's optimising - it gets the result into EBX, moves it to AL, part of EAX, does the second check into EDX, then branches based on the comparison of EAX and EDX. This saves a branch and leaves the code running faster, without making any difference at all in terms of side effects.
If you compile with -O0 rather than -O2, I imagine it will produce more naive assembly that more closely matches your expectations.
The code is behaving correctly (i.e., in accordance with the requirements of the language standard) either way.
It appears that you're trying to find a way to generate specific assembly code. Of two possible assembly code sequences, both of which behave the same way, you find one satisfactory and the other unsatisfactory.
The only really reliable way to guarantee the satisfactory assembly code sequence is to write the assembly code explicitly. gcc does support inline assembly.
C code specifies behavior. Assembly code specifies machine code.
But all this raises the question: why does it matter to you? (I'm not saying it shouldn't, I just don't understand why it should.)
EDIT: How exactly are a and m defined? If, as you suggest, they're related to memory-mapped devices, then they should be declared volatile -- and that might be exactly the solution to your problem. If they're just ordinary variables, then the compiler can do whatever it likes with them (as long as it doesn't affect the program's visible behavior) because you didn't ask it not to.