Here is some simple code where clang and gcc behave differently.
int t;
extern void abort (void);

int f(int t, const int *a)
{
    const int b[] = { 1, 2, 3 };
    if (!t)
        return f(1, b);
    return b == a;
}

int main(void)
{
    if (f(0, 0))
        abort ();
    return 0;
}
Clang:
> clang -v
clang version 4.0.1 (tags/RELEASE_401/final)
Target: x86_64-unknown-linux-gnu
Thread model: posix
> clang test.c
> ./a.out
Aborted
GCC:
> gcc -v
Target: x86_64-suse-linux
Thread model: posix
gcc version 7.2.0 (GCC)
> gcc test.c
> ./a.out
> echo $?
0
The reason is pretty obvious: the behavior is implementation-defined, and clang merges the constant local arrays into a single global one.
But let's say I want consistent behavior. Can I turn some switch on or off in clang to disable this optimization and make it honestly create distinct local arrays (even constant ones) for different stack frames?
The option you're looking for in clang is -fno-merge-all-constants, and you can enable the opposite in gcc with -fmerge-all-constants. But the gcc documentation for this option is telling:
Languages like C or C++ require each variable, including multiple instances of the same variable in recursive calls, to have distinct locations, so using this option results in non-conforming behavior.
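As a quick sanity check (a sketch, assuming the test program above is saved as test.c), recompiling with that flag should make clang create distinct arrays, so f(0, 0) returns 0 and the program exits cleanly:

    > clang -fno-merge-all-constants test.c
    > ./a.out
    > echo $?
    0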
The only bit that somehow might suggest that clang is allowed to get away with this is (C11, 6.5.2.4):
String literals, and compound literals with const-qualified types, need not designate distinct objects.
The problem here is that your code doesn't have a compound literal.
There is in fact a bug report for clang about this and it appears that the developers are aware that this is non-conforming: https://bugs.llvm.org/show_bug.cgi?id=18538
The interesting comment in there is:
This is the only case I can think of (off the top of my head) where clang deliberately does not conform to the standard by default, and has a flag to make it conform. There are a few other places where we deliberately don't conform because we think the standard is wrong (and generally we try to get the standard fixed in those cases).
It does appear that clang is reusing the same array for variable b in both local scopes of f(), but that cannot be justified on the basis of implementation-defined behavior. Implementation-defined behaviors are explicitly called out in the standard, and conforming implementations document their actual behavior for each area of implementation-defined behavior. This is not an area where the standard grants such latitude.
On the contrary, the behavior of the clang-generated program is non-conforming, and Clang is non-conforming for producing such code. The standard specifically says that
For [an object with automatic storage duration] that does not have a variable length array type,
its lifetime extends from entry into the block with which it is
associated until execution of that block ends in any way. (Entering an
enclosed block or calling a function suspends, but does not end,
execution of the current block.) If the block is entered recursively,
a new instance of the object is created each time.
(C2011, 6.2.4/6; emphasis added)
The objects in question in this case do have automatic storage duration and do not have variable-length array type, so the standard specifies that they be distinct objects. That they are arrays with a const-qualified element type does not permit clang to reuse the array.
But let's say I want consistent behavior. Can I turn some switch on or off in clang to disable this optimization and make it honestly create distinct local arrays (even constant ones) for different stack frames?
You can and should report a bug against Clang, unless this issue has already been reported. The time frame for that bug being fixed is probably longer than you want to wait, but I do not find documentation of any command-line flag that would modulate this behavior.
The other answer suggests that there is in fact an option controlling this behavior. I'm entirely prepared to believe that, as I have previously found Clang's documentation to be incomplete in other ways, but it should not be necessary to explicitly turn off such an option to achieve language conformance.
Related
First of all, I know this way of programming is not good practice. For an explanation of why I'm doing this, read on after the actual question.
When declaring a function in C like this:
int f(n, r) {…}
The types of r and n will default to int. The compiler will likely generate a warning about it, but let's choose to ignore that.
Now suppose we call f but, accidentally or otherwise, leave out an argument:
f(25);
This will still compile just fine (tested with both gcc and clang). However, there is no warning from gcc about the missing argument.
So my question is:
Why does this not produce a warning (in gcc) or error?
What exactly happens when it is executed? I assume I'm invoking undefined behaviour but I'd still appreciate an explanation.
Note that it does not work the same way when I declare int f(int n, int r) {…}, neither gcc nor clang will compile this.
Now if you're wondering why I would do such a thing: I was playing Code Golf and tried to shorten my code, which used a recursive function f(n, r). I needed a way to call f(n, 0) implicitly, so I defined F(n) { return f(n, 0); }, which was a few too many bytes for my taste. So I wondered whether I could just omit the second parameter. I can't: it still compiles but no longer works.
While optimizing this code, it was pointed out to me that I could just leave out a return at the end of my function – no warning from gcc about this either. Is gcc just too tolerant?
You don't get any diagnostics from the compiler because you are not using modern "prototyped" function declarations. If you had written
int f(int n, int r) {…}
then a subsequent f(25) would have triggered a diagnostic. With the compiler on the computer I'm typing this on, it's actually a hard error.
"Old-style" function declarations and definitions intentionally cause the compiler to relax many of its rules, because the old-style code that they exist for backward compatibility with would do things like this all the dang time. Not the thing you were trying to do, hoping that f(25) would somehow be interpreted as f(25, 0), but, for instance, f(25) where the body of f never looks at the r argument when its n argument is 25.
The pedants commenting on your question are pedantically correct when they say that literally anything could happen (within the physical capabilities of the computer, anyway; "demons will fly out of your nose" is the canonical joke, but it is, in fact, a joke). However, it is possible to describe two general classes of things that are what usually happens.
With older compilers, what usually happens is, code is generated for f(25) just as it would have been if f only took one argument. That means the memory or register location where f will look for its second argument is uninitialized, and contains some garbage value.
With newer compilers, on the other hand, the compiler is liable to observe that any control-flow path passing through f(25) has undefined behavior, and based on that observation, assume that all such control-flow paths are never taken, and delete them. Yes, even if it's the only control-flow path in the program. I have actually witnessed Clang spit out main: ret for a program all of whose control-flow paths had undefined behavior!
GCC not complaining about f(n, r) { /* no return statement */ } is another case like the first class above, where the old-style function definition relaxes a rule. void was invented in the 1989 C standard; prior to that, there was no way to say explicitly that a function does not return a value. So you don't get a diagnostic because the compiler has no way of knowing that you didn't mean to do that.
Independently of that, yes, GCC's default behavior is awfully permissive by modern standards. That's because GCC itself is older than the 1989 C standard and nobody has reexamined its default behavior in a long time. For new programs, you should always use -Wall, and I recommend also at least trying -Wextra, -Wpedantic, -Wstrict-prototypes, and -Wwrite-strings. In fact, I recommend going through the "Warning Options" section of the manual and experimenting with all of the additional warning options. (Note however that you should not use -std=c11, because that has a nasty tendency to break the system headers. Use -std=gnu11 instead.)
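For instance, a reasonable starting command line under those recommendations might be (the exact flag set is a suggestion, not gospel):

    $ gcc -std=gnu11 -Wall -Wextra -Wpedantic -Wstrict-prototypes -Wwrite-strings -O2 test.c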
First off, the C standard doesn't distinguish between warnings and errors. It only talks about "diagnostics". In particular, a compiler can always produce an executable (even if the source code is completely broken) without violating the standard.1
The types of r and n will default to int.
Not anymore. Implicit int has been gone from C since 1999. (And your test code requires C99 because for (int i = 0; ... isn't valid in C90).
In your test code gcc does issue a diagnostic for this:
.code.tio.c: In function ‘f’:
.code.tio.c:2:5: warning: type of ‘n’ defaults to ‘int’ [-Wimplicit-int]
It's not valid code, but gcc still produces an executable (unless you enable -Werror).
If you add the required types (int f(int n, int r)), it uncovers the next issue:
.code.tio.c: In function ‘main’:
.code.tio.c:5:3: error: too few arguments to function ‘f’
Here gcc somewhat arbitrarily decided not to produce an executable.
Relevant quotes from C99 (and probably C11 too; this text hasn't changed in the n1570 draft):
6.9.1 Function definitions
Constraints
[...]
If the declarator includes an identifier list, each declaration in the declaration list shall
have at least one declarator, those declarators shall declare only identifiers from the
identifier list, and every identifier in the identifier list shall be declared.
Your code violates a constraint (your function declarator includes an identifier list, but there is no declaration list), which requires a diagnostic (such as the warning from gcc).
Semantics
[...] If the
declarator includes an identifier list, the types of the parameters shall be declared in a
following declaration list.
Your code violates this shall rule, so it has undefined behavior. This applies even if the function is never called!
6.5.2.2 Function calls
Constraints
[...]
If the expression that denotes the called function has a type that includes a prototype, the
number of arguments shall agree with the number of parameters. [...]
Semantics
[...]
[...] If the number of arguments does not equal the number of parameters, the
behavior is undefined. [...]
The actual call also has undefined behavior if the number of arguments passed doesn't match the number of parameters the function has.
As for omitting return: This is actually valid as long as the caller doesn't look at the returned value.
Reference (6.9.1 Function definitions, Semantics):
If the } that terminates a function is reached, and the value of the function call is used by
the caller, the behavior is undefined.
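A minimal illustration of that rule (a sketch):

    int g(int x)
    {
        if (x)
            return 1;
        /* falling off the end here is allowed... */
    }

    int main(void)
    {
        g(0);         /* fine: the return value is never used */
        return g(0);  /* undefined behavior: the value is used */
    }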
1 The sole exception seems to be the #error directive, about which the standard says:
The implementation shall not successfully translate a preprocessing translation unit
containing a #error preprocessing directive unless it is part of a group skipped by
conditional inclusion.
#include <stdio.h>

int main()
{
    const int a = 1;
    int *p = (int *)&a;
    (*p)++;
    printf("%d %d\n", *p, a);
    if (a == 1)
        printf("No\n");  // "No" in g++.
    else
        printf("Yes\n"); // "Yes" in gcc.
    return 0;
}
The above code prints No when compiled with g++ and Yes when compiled with gcc. Can anybody please explain the reason behind this?
Your code triggers undefined behaviour because you are modifying a const object (a). It doesn't have to produce any particular result, not even on the same platform, with the same compiler.
Although the exact mechanism for this behaviour isn't specified, you may be able to figure out what is happening in your particular case by examining the assembly produced by the code (you can see it by using the -S flag). Note that compilers are allowed to make aggressive optimizations by assuming the code has well-defined behaviour. For instance, a could simply be replaced by 1 wherever it is used.
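For instance, one plausible outcome in the g++ case (a sketch of the transformation, not actual compiler output) is that the constant is folded into the comparison, so the increment through p never matters:

    (*p)++;       /* modifies the storage behind a */
    if (1 == 1)   /* 'a' replaced by its initializer at compile time */
        printf("No\n");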
From the C++ Standard (1.9 Program execution)
4 Certain other operations are described in this International
Standard as undefined (for example, the effect of attempting to
modify a const object). [ Note: This International Standard imposes
no requirements on the behavior of programs that contain undefined
behavior. —end note ]
Thus your program has undefined behaviour.
In your code, notice the following two lines:
const int a = 1;     // a is of type const int
int *p = (int *)&a;  // p is of type int *
You are putting the address of a const int variable into an int * and then trying to modify the value through it, which should have been treated as const. This is not allowed and invokes undefined behaviour.
For your reference, as mentioned in chapter 6.7.3, C11 standard, paragraph 6
If an attempt is made to modify an object defined with a const-qualified type through use
of an lvalue with non-const-qualified type, the behavior is undefined. If an attempt is
made to refer to an object defined with a volatile-qualified type through use of an lvalue
with non-volatile-qualified type, the behavior is undefined
So, to cut a long story short, you cannot rely on the outputs for comparison. They are the result of undefined behaviour.
Okay, we have here 'identical' code passed to 'the same' compiler, but once with a C flag and the other time with a C++ flag. As far as any reasonable user is concerned, nothing has changed. The code should be interpreted identically by the compiler because nothing significant has happened.
Actually, that's not true. While I would be hard pressed to point to it in a standard, the precise interpretation of 'const' has slight differences between C and C++. In C it's very much an add-on: the 'const' flag says that this normal variable 'a' should not be written to by the code around here, but there is a possibility that it will be written to elsewhere. With C++ the emphasis is much more on the immutable-constant concept, and the compiler knows that this constant is more akin to an 'enum' than a normal variable.
So I expect this slight difference means that slightly different parse trees are generated, which eventually leads to different assembler.
This sort of thing is actually fairly common: code that's in the C/C++ subset does not always compile to exactly the same assembler, even with 'the same' compiler. It tends to be caused by other language features meaning that there are some things you can't prove about the code right now in one of the languages but are okay in the other.
Usually C is the performance winner (as was re-discovered by the Linux kernel devs) because it's a simpler language, but in this example C++ would probably turn out faster (unless the C dev switches to a macro or enum and catches the unreasonable act of taking the address of an immutable constant).
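One place where this difference in the meaning of 'const' is visible in the languages themselves (a side note, easy to verify): a const int with a constant initializer is a constant expression in C++ but not in C.

    const int a = 1;
    int arr[a];  /* inside a function: a fixed-size array in C++,
                    but a VLA in C99 (and invalid in C90), because
                    a is not a constant expression in C */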
Please see the following code and its result:
foo.c:
const int extern_const = 1;
main.c:
#include <stdio.h>

extern int extern_const;

int main(void)
{
    printf("before: %d\n", extern_const);
    extern_const = 2;
    printf("after : %d\n", extern_const);
    return 0;
}
Compile and run:
$ gcc -shared -fpic foo.c -o libfoo.so
$ gcc main.c -L. -lfoo -o test
$ LD_LIBRARY_PATH=`pwd` ./test
before: 1
after : 2
I declared a const int variable extern_const, and it resides in a shared library libfoo.so.
In main.c, I declared extern_const as just extern int, not extern const int, and changed the value from 1 to 2. Is this safe and effective?
The execution result shows that the assignment works anyway. I've heard that overwriting a const value causes undefined behavior, and in fact, when I compiled foo.c and main.c together (without creating a shared library), the program actually ended with a segmentation fault before the second printf.
What I want to know is the following:
1. Is it safe to change the value of a const variable in an external library, generally?
2. If not, is it safe for GCC/GNU toolchains?
3. If both 1. and 2. are wrong, did I just get a lucky case of undefined behavior?
4. If 1. or 2. is right, what makes the difference between the cases with and without the library?
Modifying constant objects is Undefined Behavior. Anything may happen.
In your case, you might simply have been unlucky: GCC does not yet pool all constant variables and literals, so it did not put this one into a read-only section (define more of them and it might happen), and your main() is the first and last code accessing that external constant object (though under the false flag of non-const).
6.7.3 Type qualifiers §6
If an attempt is made to modify an object defined with a const-qualified type through use
of an lvalue with non-const-qualified type, the behavior is undefined. If an attempt is
made to refer to an object defined with a volatile-qualified type through use of an lvalue
with non-volatile-qualified type, the behavior is undefined.133)
1. No.
2. No.
3. Yes.
Basically, it's up to the compiler/toolchain/operating system to provide protection for const variables. Some combinations of those go out of their way to make sure that what's supposed to be read-only will be read-only even if it means allocating a whole page (several kB) just to store one variable. Others make different trade-offs and will not waste a lot of space just to protect one variable and trust the programmer to not do crazy things like this.
The other answers are absolutely correct about this being Undefined Behavior and thus something you should not do. The reason your broken code is "working" for you is a side effect of the way dynamic linking works, namely copy relocations.
Basically, what happens is that, since the main executable is not position-independent, it has to have the addresses of all data objects it accesses directly hard-coded as immediates in the instructions which perform the accesses. Thus, the linker allocates writable space in the main program's writable data segment (writable, since the value, which comes from a shared library at runtime, can't be known) and includes an instruction to the dynamic linker to copy the value from the shared library into the main program's data when performing relocations at startup. Any references from the shared library are then patched up to point to the new copy in the main program's data.
If you want to see your code fail, try to compile your main program as a position-independent executable:
$ gcc -fPIE -pie main.c -L. -lfoo -o test
and see what happens. Note that PIE is the default on many hardened systems. Likewise, the ABIs for some CPU architectures (MIPS is one, if I'm not mistaken) never need copy relocations, and thus your program should crash even without PIE on such archs.
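If you want to see the mechanism itself, you can inspect the relocations of the non-PIE executable (a quick check; the output format varies by binutils version):

    $ readelf -r test | grep COPY

On x86-64 this should show an R_X86_64_COPY relocation for extern_const, i.e. the writable copy in the main program's data segment described above.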
In foo.c, add:
int y;

void f(void) {
    y = extern_const;
}
Compile foo.c with optimization to assembly. Read the assembly for f(). If your compiler is like mine, you should see that f() has been optimized to the equivalent of y = 1;
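For example (a sketch; the exact output depends on compiler version and target):

    $ gcc -O2 -S foo.c

On x86-64 you would likely find something like movl $1, y(%rip) in the body of f(): the compiler stored the literal 1 instead of loading extern_const from memory.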
This means that according to my compiler (and probably yours), what you did is not okay, and you deserve any strange demon that may fly out of your nose as a result of declaring extern_const differently in different files.
Your sample is violating the one definition rule: the two declarations of extern_const have incompatible types. This is undefined behavior, so anything can happen during the compile phase, and your program is not guaranteed to do anything useful.
From the C standard:
J.2 Undefined behavior
Two declarations of the same object or function specify types that are not compatible
(6.2.7).
also
6.2.7 Compatible type and composite type
2 All declarations that refer to the same object or function shall have compatible type;
otherwise, the behavior is undefined.
also
6.7.3 Type qualifiers
10 For two qualified types to be compatible, both shall have the identically qualified version
of a compatible type; the order of type qualifiers within a list of specifiers or qualifiers
does not affect the specified type.
Besides, as others stated, it is also undefined behavior to modify a const object:
6.7.3 Type qualifiers
6 If an attempt is made to modify an object defined with a const-qualified type through use
of an lvalue with non-const-qualified type, the behavior is undefined. If an attempt is
made to refer to an object defined with a volatile-qualified type through use of an lvalue
with non-volatile-qualified type, the behavior is undefined.133)
How do I ensure that each and every field of my structures is initialized in GCC when using designated initializers? (I'm especially interested in function pointers.) (I'm using C, not C++.)
Here is an example:
typedef struct {
    int a;
    int b;
} foo_t;

typedef struct {
    void (*Start)(void);
    void (*Stop)(void);
} bar_t;

foo_t fooo = {
    5
};

foo_t food = {
    .b = 4
};

bar_t baro = {
    NULL
};

bar_t bard = {
    .Start = NULL
};
-Wmissing-field-initializers does not help at all. It works for fooo only in GCC (mingw 4.7.3, 4.8.1), and clang does only marginally better (no warnings for food and bard).
I'm sure there is a reason for not producing warnings for designated initializers (even when I explicitly ask for them), but I want/need them. I do not want to initialize structures based on order/position, because that is more error prone (for example, swapping Start and Stop won't even give a warning). And neither gcc nor clang will warn that I failed to explicitly initialize a field (when initializing by name). I also don't want to litter my code with if (x.y == NULL) checks, for multiple reasons, one of which is that I want compile-time warnings and not runtime errors.
At least splint will give me warnings in all 4 cases, but unfortunately I cannot use splint all the time (it chokes on some of the code: it fails to parse some C99 constructs and GCC extensions).
Note: if I use a real function instead of NULL, GCC will also show a warning for baro (but not bard).
I searched google and stack overflow but only found related questions and have not found answer for this specific problem.
The best match I have found is 'Ensure that all elements in a structure are initialized', which asks pretty much the same question but has no satisfying answer.
Is there a better way dealing with this that I have not mentioned?
(Maybe other code analysis tool? Preferably something (free) that can be integrated into Eclipse or Visual Studio...)
If I'm not mistaken, the C standard specifies that the remaining fields are automatically initialized to 0.
So what you are asking for, a compilation error when fields are not initialized, would be out of line with the (modern?) C specification.
C99 standard, page 127 in: http://www.open-std.org/jtc1/sc22/WG14/www/docs/n1256.pdf
If there are fewer initializers in a brace-enclosed list than there are elements or members
of an aggregate, or fewer characters in a string literal used to initialize an array of known
size than there are elements in the array, the remainder of the aggregate shall be
initialized implicitly the same as objects that have static storage duration.
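To illustrate with the question's foo_t (a sketch):

    #include <assert.h>

    typedef struct {
        int a;
        int b;
    } foo_t;

    foo_t food = { .b = 4 };

    int main(void)
    {
        /* guaranteed by the paragraph quoted above: the remainder is
           initialized as for static storage duration, i.e. to zero */
        assert(food.a == 0);
        return 0;
    }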
gcc's -Wmissing-field-initializers is documented not to warn with designated initializers. There is a request for an enhancement, -Wmissing-field-initializers=2, that would then warn: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39589
So I suggest you add your wish to that bug report, or maybe even provide a patch. From my experience with open-source software, adding a patch is best.
The four initializations you have shown all initialize the rest of the structure. It's initialized to 0 (or the equivalent for the type, e.g. NULL for pointers).
A colleague of mine recently got bitten badly by writing out of bounds to a static array on the stack (he added an element to it without increasing the array size). Shouldn't the compiler catch this kind of error? The following code compiles cleanly with gcc, even with the -Wall -Wextra options, and yet it is clearly erroneous:
int main(void)
{
    int a[10];
    a[13] = 3; // oops, overwrote the return address
    return 0;
}
I'm positive that this is undefined behavior, although I can't find an excerpt from the C99 standard saying so at the moment. But in the simplest case, where the size of an array is known at compile time and the indices are known at compile time, shouldn't the compiler emit a warning at the very least?
GCC does warn about this. But you need to do two things:
1. Enable optimization. Without at least -O2, GCC is not doing enough analysis to know what a is, and that you ran off the edge.
2. Change your example so that a[] is actually used; otherwise GCC generates a no-op program and has completely discarded your assignment.
$ cat foo.c
int main(void)
{
    int a[10];
    a[13] = 3; // oops, overwrote the return address
    return a[1];
}
$ gcc -Wall -Wextra -O2 -c foo.c
foo.c: In function ‘main’:
foo.c:4: warning: array subscript is above array bounds
BTW: If you returned a[13] in your test program, that wouldn't work either, as GCC optimizes out the array again.
Have you tried -fmudflap with GCC? These are runtime checks, but they are useful, as most often you have to deal with runtime-calculated indices anyway. Instead of silently continuing to work, it will notify you about those bugs.
-fmudflap -fmudflapth -fmudflapir
For front-ends that support it (C and C++), instrument all risky pointer/array dereferencing operations, some standard library string/heap functions, and some other associated constructs with range/validity tests. Modules so instrumented should be immune to buffer overflows, invalid heap use, and some other classes of C/C++ programming errors. The instrumentation relies on a separate runtime library (libmudflap), which will be linked into a program if -fmudflap is given at link time. Run-time behavior of the instrumented program is controlled by the MUDFLAP_OPTIONS environment variable. See "env MUDFLAP_OPTIONS=-help a.out" for its options.
Use -fmudflapth instead of -fmudflap to compile and to link if your program is multi-threaded. Use -fmudflapir, in addition to -fmudflap or -fmudflapth, if instrumentation should ignore pointer reads. This produces less instrumentation (and therefore faster execution) and still provides some protection against outright memory corrupting writes, but allows erroneously read data to propagate within a program.
Here is what mudflap gives me for your example:
[js@HOST2 cpp]$ gcc -fstack-protector-all -fmudflap -lmudflap mudf.c
[js@HOST2 cpp]$ ./a.out
*******
mudflap violation 1 (check/write): time=1229801723.191441 ptr=0xbfdd9c04 size=56
pc=0xb7fb126d location=`mudf.c:4:3 (main)'
/usr/lib/libmudflap.so.0(__mf_check+0x3d) [0xb7fb126d]
./a.out(main+0xb9) [0x804887d]
/usr/lib/libmudflap.so.0(__wrap_main+0x4f) [0xb7fb0a5f]
Nearby object 1: checked region begins 0B into and ends 16B after
mudflap object 0x8509cd8: name=`mudf.c:3:7 (main) a'
bounds=[0xbfdd9c04,0xbfdd9c2b] size=40 area=stack check=0r/3w liveness=3
alloc time=1229801723.191433 pc=0xb7fb09fd
number of nearby objects: 1
[js@HOST2 cpp]$
It has a bunch of options. For example, it can fork off a gdb process upon violations, can show you where your program leaked (using -print-leaks), or detect uninitialized variable reads. Use MUDFLAP_OPTIONS=-help ./a.out to get a list of options. Since mudflap only outputs addresses and not file names and lines of the source, I wrote a little gawk script:
/^ / {
    file = gensub(/([^(]*).*/, "\\1", 1);
    addr = gensub(/.*\[([x[:xdigit:]]*)\]$/, "\\1", 1);
    if (file && addr) {
        cmd = "addr2line -e " file " " addr
        cmd | getline laddr
        print $0 " (" laddr ")"
        close(cmd)
        next;
    }
}
1 # print all other lines
Pipe the output of mudflap into it, and it will display the sourcefile and line of each backtrace entry.
Also consider -fstack-protector[-all]:
-fstack-protector
Emit extra code to check for buffer overflows, such as stack smashing attacks. This is done by adding a guard variable to functions with vulnerable objects. This includes functions that call alloca, and functions with buffers larger than 8 bytes. The guards are initialized when a function is entered and then checked when the function exits. If a guard check fails, an error message is printed and the program exits.
-fstack-protector-all
Like -fstack-protector except that all functions are protected.
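For the question's example, recompiling with the guard enabled may catch the overwrite when main() returns (a sketch; whether the stray store actually lands on the guard depends on the stack layout):

    $ gcc -fstack-protector-all test.c
    $ ./a.out
    *** stack smashing detected ***: terminated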
You're right, the behavior is undefined. C99 pointers must point within or just one element beyond declared or heap-allocated data structures.
I've never been able to figure out how the gcc people decide when to warn. I was shocked to learn that -Wall by itself will not warn of uninitialized variables; at minimum you need -O, and even then the warning is sometimes omitted.
I conjecture that because unbounded arrays are so common in C, the compiler probably doesn't have a way in its expression trees to represent an array that has a size known at compile time. So although the information is present at the declaration, I conjecture that by the time of the use it is already lost.
I second the recommendation of valgrind. If you are programming in C, you should run valgrind on every program, all the time until you can no longer take the performance hit.
It's not a static array.
Undefined behavior or not, it's writing to an address 13 integers from the beginning of the array. What's there is your responsibility. There are several C techniques that intentionally misallocate arrays for reasonable reasons. And this situation is not unusual in incomplete compilation units.
Depending on your flag settings, there are a number of features of this program that would be flagged, such as the fact that the array is never used. And the compiler might just as easily optimize it out of existence and not tell you - a tree falling in the forest.
It's the C way. It's your array, your memory, do what you want with it. :)
(There are any number of lint tools for helping you find this sort of thing; and you should use them liberally. They don't all work through the compiler though; Compiling and linking are often tedious enough as it is.)
The reason C doesn't do it is that C doesn't have the information. A statement like
int a[10];
does two things: it allocates sizeof(int)*10 bytes of space (plus, potentially, a little dead space for alignment), and it puts an entry in the symbol table that reads, conceptually,
a : address of a[0]
or in C terms
a : &a[0]
and that's all. In fact, in C you can interchange *(a+i) with a[i] in (almost*) all cases with no effect, BY DEFINITION. So your question is equivalent to asking "why can I add any integer to this (address) value?"
* Pop quiz: what is the one case in which this isn't true?
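A quick illustration of that definitional equivalence (a sketch):

    int a[10];
    a[13] = 3;      /* by definition the same as... */
    *(a + 13) = 3;  /* ...this pointer arithmetic, which the compiler
                       has no general obligation to bounds-check */
    13[a] = 3;      /* even this is legal: E1[E2] is *((E1)+(E2)),
                       and addition commutes */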
The C philosophy is that the programmer is always right. So the compiler will silently allow you to access whatever memory address you give it, assuming that you always know what you are doing, and will not bother you with a warning.
I believe that some compilers do in certain cases. For example, if my memory serves me correctly, newer Microsoft compilers have a "Buffer Security Check" option which will detect trivial cases of buffer overruns.
Why don't all compilers do this? Either (as previously mentioned) the internal representation used by the compiler doesn't lend itself to this type of static analysis, or it just isn't high enough on the writers' priority list. Which, to be honest, is a shame either way.
shouldn't the compiler emit a warning at the very least?
No; C compilers generally do not perform array bounds checks. The obvious negative effect of this is, as you mention, an error with undefined behavior, which can be very difficult to find.
The positive side of this is a possible small performance advantage in certain cases.
There are some extensions in gcc for that (on the compiler side):
http://www.doc.ic.ac.uk/~awl03/projects/miro/
On the other hand, splint, rat, and quite a few other static code analysis tools would have found that.
You can also use valgrind on your code and see the output:
http://valgrind.org/
Another widely used library seems to be libefence.
It's simply a design decision that was made once, which now leads to these things.
The -fbounds-checking option is available with gcc.
It's worth going through this article:
http://www.doc.ic.ac.uk/~phjk/BoundsChecking.html
'le dorfier' has given an apt answer to your question, though: it's your program, and that is the way C behaves.