I want to write a C program to see the difference between static and dynamic allocation.
That's my idea, but it doesn't work.
It simply initializes an array of size 10, but assigns 100 elements instead of 10. I then initialize another array, large enough that I hope it will overwrite the 90 elements that aren't part of array1, and finally I print out the 100 elements of array1.
#include <stdio.h>

int main(void)
{
    int i;
    int array1[10];
    int array2[10000];

    /* deliberately writes 90 elements past the end of array1 (undefined behavior) */
    for (i = 0; i < 100; i++)
        array1[i] = i;

    for (i = 0; i < 10000; i++)
        array2[i] = i + 1;

    for (i = 0; i < 100; i++)
    {
        printf("%d \n", array1[i]);
    }

    return 0;
}
What I hoped to get was garbage beyond the first 10 elements when using static allocation; afterwards, I would use malloc and realloc to ensure that the 100 elements are stored correctly. But unfortunately, it seems that the memory is large enough that the remaining 90 elements aren't overwritten!
I tried running the code on Linux and using "ulimit" to limit the memory size, but that didn't work either.
Any ideas please?
C doesn't actually do any bounds checking on arrays. It depends on the OS to ensure that you are accessing valid memory.
Accessing outside the array bounds is undefined behavior. The C99 draft standard, Annex J.2 (Undefined behavior), includes the following point:
An array subscript is out of range, even if an object is apparently accessible with the
given subscript (as in the lvalue expression a[1][7] given the declaration int
a[4][5]) (6.5.6).
In this example you are declaring a stack-based array. Accessing it out of bounds reaches into already-allocated stack space. Here, undefined behavior is not working in your favor, as there is no segfault to alert you. It's the programmer's responsibility to handle boundary conditions when writing code in C/C++.
You do get garbage after the first 10 elements of array1. Everything after element 9 is simply not part of the array's allocation on the stack and can be overwritten at any time. When the program prints the 100 elements of array1, you may see remnants of either for loop, because the two arrays are allocated next to each other and usually haven't been written over yet. In a larger program, other variables might occupy the space after these two example arrays.
When you access array1[10] and higher index values, the program will just keep writing into adjacent memory locations even though they don't "belong" to your array. At some point you might try to access a memory location that's forbidden, but as long as you're mucking with memory that the OS has given to your program, this will run. The results will be unpredictable though. It could happen that this will corrupt data that belongs to another variable in your program, for example. It could also happen that the value that you wrote there will still be there when you go back to read it if no other variable has been "properly assigned" that memory location. (This seems to be what's happening in the specific case that you posted.)
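To make that concrete, here is a minimal sketch of my own (deliberately invoking undefined behavior, so the result depends entirely on your compiler and stack layout): an out-of-bounds store into a may or may not clobber its neighbor b.

#include <stdio.h>

int main(void)
{
    int a[2] = {0, 0};
    int b = 42;            /* may happen to sit right after a on the stack */

    a[2] = 7;              /* out of bounds: undefined behavior */

    printf("b = %d\n", b); /* might print 42, might print 7, might crash */
    return 0;
}

Whether b actually changes depends on the optimizer and the stack layout; there is no portable way to predict it.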
All of that being said, I'm not clear at all how this relates to potential differences between static and dynamic memory allocation since you've only done static allocation in the program and you've deliberately introduced a bug.
Changing the memory size won't resolve your problem, because when you create your two arrays, the second one will typically sit right after the first one in memory.
Your code should do what you think it will, and on my computer, it does.
Here's my output :
0
1
2
3
4
5
6
7
8
9
10
11
1
2
3
4
5
...
What OS are you running your code on? (I'm on 64-bit Linux.)
Anyway, as everybody has told you, DON'T EVER DO THIS IN A REAL PROGRAM. Writing outside an array is undefined behaviour and could cause your program to crash.
Writing out of bounds of an array will prove nothing and is not well-defined. Generally, there's nothing clever or interesting involved in invoking undefined behavior. The only thing you'll achieve by that is random crashes.
If you wish to know where a variable is allocated, you have to look at addresses. Here's one example:
#include <stdio.h>
#include <stdlib.h>

int main (void)
{
    int stack;                          /* automatic storage: the stack */
    static int data = 1;                /* initialized static storage: .data */
    static int bss = 0;                 /* zero-initialized static storage: .bss */
    int* heap = malloc(sizeof(*heap));  /* dynamic storage: the heap */

    printf("stack: %p\n", (void*)&stack);
    printf(".data: %p\n", (void*)&data);
    printf(".bss: %p\n", (void*)&bss);
    printf(".heap: %p\n", (void*)heap);

    free(heap);
    return 0;
}
This should print 4 distinctly different addresses (.data and .bss probably close to each other, though). To know exactly where a certain memory area starts, you either need to check a linker script or use a system-specific API. And once you know a memory area's offset and size, you can determine whether a variable is stored within one of the different memory segments.
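For illustration only, output on a 64-bit Linux machine might look something like this (these addresses are invented for the example; yours will differ on every run, especially with address space layout randomization):

stack: 0x7ffd4a2b3c44
.data: 0x55f0d2e01010
.bss: 0x55f0d2e01014
.heap: 0x55f0d2e022a0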
As far as I know, the C compiler (I am using GCC 6) scans the code in order to:
find syntax issues;
allocate memory for the program (the static allocation concept).
So why does this code work?
#include <stdio.h>

int main(){
    int integers_amount; // each int is typically 4 bytes
    printf("How many integers do you want to store? \n");
    scanf("%d", &integers_amount);

    int array[integers_amount];
    printf("Size of array: %zu\n", sizeof(array)); // Should be 4 times integers_amount

    for(int i = 0; i < integers_amount; i++){
        int integer;
        printf("Type the integer: \n");
        scanf("%d", &integer);
        array[i] = integer;
    }

    for(int j = 0; j < integers_amount; j++){
        printf("Integer typed: %d \n", array[j]);
    }

    return 0;
}
My point is:
How does the C compiler infer the size of the array during compilation time?
I mean, it is declared, but its value has not been provided yet at compilation time. I really believed that the compiler allocated the needed amount of memory (in bytes) at compilation time; that is, in fact, the very concept of static allocation.
From what I can see, the allocation for the variable 'array' is done during runtime, only after the user has entered the 'size' of the array. Is that correct?
I thought dynamic allocation was used so that only the needed memory is used (let's say I declare an integer array of size 10 because I don't know how many values the user will need to hold there, but I end up using only 7, so I waste 12 bytes).
If those sizes are known at runtime, I can allocate only the memory needed. However, that doesn't seem to be the whole story, because from the code we can see that this array is also allocated only during runtime.
Can I have some help understanding that?
Thanks in advance.
How does the C compiler infer the size of the array during compilation time?
It's what's called a variable length array, or VLA for short; the size is determined at runtime, but it's a one-off: you cannot resize it afterwards. Some compilers even warn you about the usage of such arrays, as they are stored on the stack, which has a very limited size, so they can potentially cause a stack overflow.
From what I could see, the allocation for the variable 'array' is done during runtime, only after the user has informed the 'size' of the array. Is that correct?
Yes, that is correct. That's why these can be dangerous: the compiler doesn't know the size of the array at compile time, so if it's too large there is nothing it can do to avoid problems. For that reason C++ forbids VLAs.
let's say that I declare an integer array of size 10 because I don't know how many values the user will need to hold there, but I ended up only using 7, so I have a waste of 12 bytes
Contrary to fixed-size arrays, a variable length array's size can be determined at runtime, but once its size is set you can no longer change it; for that you need dynamic memory allocation (discussed ahead), if you are really set on having the exact size needed and not one byte more.
Anyway, if you are expecting an outside value to set the size of the array, odds are that it is the size you need; if not, well, there is nothing you can do aside from the mentioned dynamic memory allocation. In any case it's better to have a little wasted space than too little space.
Can I have some help understanding that?
There are three concepts I find relevant to the discussion:
Fixed size arrays, i.e. int array[10]:
Their size is defined at compile time; they cannot be resized and are useful if you already know the size they should have.
Variable length arrays, i.e. int array[size], size being a non constant variable:
Their size is defined at runtime, but can only be set once; they are useful if the size of the array depends on external values, e.g. user input or some value retrieved from a file.
Dynamically allocated arrays, i.e. int *array = malloc(sizeof *array * size), where size may or may not be a constant:
These are used when your array will need to be resized, or if it's too large to store on the stack, which has limited size. You can change its size at any point in your code using realloc, which may simply resize the array in place or, as @Peter reminded, may allocate a new array and copy the contents of the old one over. A short sketch contrasting all three kinds follows.
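Here is a minimal sketch of my own contrasting the three (the variable names are mine, chosen for illustration):

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    /* 1. Fixed size array: size is a compile-time constant. */
    int fixed[10];

    /* 2. Variable length array: size set once at runtime, not resizable. */
    int size = 5;                      /* imagine this came from user input */
    int vla[size];

    /* 3. Dynamically allocated array: resizable via realloc. */
    int *dyn = malloc(sizeof *dyn * size);
    if (dyn == NULL)
        return 1;

    int *tmp = realloc(dyn, sizeof *dyn * size * 2);  /* grow it */
    if (tmp != NULL)
        dyn = tmp;

    printf("%zu %zu\n", sizeof fixed, sizeof vla);  /* e.g. 40 20 with 4-byte int */
    free(dyn);
    return 0;
}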
Variables defined inside functions, like array in your snippet (main is a function like any other!), have "automatic" storage duration; typically, this translates to them being on the "stack", a universal concept for a first in/last out storage which gets built and unbuilt as functions are entered and exited.
The "stack" simply is an address which keeps track of the current edge of unused storage available for local variables of a function. The compiler emits code for moving it "forward" when a function is entered in order to accommodate the memory needs of local variables and to move it "backward" when the program flow leaves the function (the double quotes are there because the stack may as well grow towards smaller addresses).
Typically these stack adjustments upon entering into and returning from functions are computed at compile time; after all, the local variables are all visible in the program code. But principally, nothing keeps a program from changing the stack pointer "on the fly". Very early on, Unixes made use of this and provided a function which dynamically allocates space on the stack, called alloca(). The FreeBSD man page says: "The alloca() function appeared in Version 32V AT&T UNIX" (which was released in 1979).
alloca behaves very much like malloc, except that the storage is lost when the current function returns, and that it is subject to the usual stack size restrictions.
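A hedged sketch of how alloca is typically used (it is non-standard: on glibc the declaration lives in <alloca.h>; other systems may differ):

#include <alloca.h>   /* non-standard header; location varies by system */
#include <stdio.h>
#include <string.h>

void greet(const char *name)
{
    /* allocated in greet's stack frame, released automatically on return */
    char *buf = alloca(strlen(name) + 8);
    strcpy(buf, "Hello, ");
    strcat(buf, name);
    puts(buf);
}   /* never return buf to the caller: its storage is gone here */

int main(void)
{
    greet("world");
    return 0;
}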
So the first part of the answer is that your array does not have static storage duration. The memory where local variables will reside is not known at compile time (for example, a function with local variables in it may or may not be called at all, depending on run-time user input!). If it were, your astonishment would be entirely justified.
The second part of the answer is that array is a variable length array, a fairly new feature of the C programming language which was only added in 1999. It declares an object on the stack whose size is not known until run time (leading to the anti-paradigmatic consequence that sizeof(array) is not a compile time constant!).
One could argue that variable length arrays are only syntactic sugar around an alloca call; but alloca is, although widely available, not part of any standard.
Let us consider this piece of code below.
#include <stdio.h>
#define N 100

int main()
{
    int n;
    scanf("%d", &n);
    if (n > 0) {
        int m[N][N] = {0}; /* m only exists inside this block */
    }
    return 0;
}
I would like to understand the behavior of this code with regard to memory. I would like to answer the following questions:
Will the memory necessary to store the matrix m be allocated only if n > 0, or is it allocated at the beginning of the program, independently of the condition?
Will the memory needed for the matrix m be released at the end of the scope of the if?
As #dbush says, code can only legally access the memory named m while inside the scope of the if block.
Where the allocation is actually done will depend on the compiler and optimizer settings.
With no optimization, both gcc and clang do adjust the stack pointer (allocating memory) at the entry to main, but perform the initialization to zero only if n is greater than zero.
With -O3 optimization, they both only call scanf and return 0, since those are the only observable effects. Neither compiler actually sets aside memory or looks at the value scanned, or attempts initialization.
Evidence: https://godbolt.org/z/6WRhxz
The memory used by m is only valid within the scope of the if block. So if you were to save a pointer to it outside of the block you would invoke undefined behavior if you tried to dereference that pointer outside the block.
That being said, an implementation may choose to set aside stack space for all local variables regardless of scope when a function is entered. For example, if I take your code and change N to 2000 the application immediately core dumps as soon as it starts, which indicates that it attempts to allocate space on the stack for an object that is too large.
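A small sketch of the dangling-pointer case described above (the commented-out line is the undefined part):

#include <stdio.h>

int main(void)
{
    int n = 1;
    int *p = NULL;

    if (n > 0) {
        int m[4][4] = {0};
        p = &m[0][0];          /* fine: m is alive inside this block */
        printf("%d\n", *p);    /* fine for the same reason */
    }

    /* printf("%d\n", *p);        undefined behavior: m's lifetime is over */
    return 0;
}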
Having recently switched to C, I've been told a thousand ways to Sunday that referencing a value that hasn't been initialized isn't good practice and leads to unexpected behavior. Specifically (because my previous language initializes integers to 0), I was told that integers might not be equal to zero when uninitialized. So I decided to put that to the test.
I wrote the following piece of code to test this claim:
#include <stdlib.h>
#include <stdio.h>
#include <stdbool.h>
#include <assert.h>
int main(){
    size_t counter = 0;
    size_t testnum = 2000; //The number of ints to allocate and test.
    for(size_t i = 0; i < testnum; i++){
        int* temp = malloc(sizeof(int));
        assert(temp != NULL); //Just in case there's no space.
        if(*temp == 0) counter++; //Reading *temp here is undefined behavior.
    }
    printf(" %zu", counter);
    return 0;
}
I compiled it like so (in case it matters):
gcc -std=c99 -pedantic name-of-file.c
Based on what my instructors had said, I expected temp to point to a random integer, and that the counter would not be incremented very often. However, my results blow this assumption out of the water:
testnum: || code returns:
2 2
20 20
200 200
2000 2000
20000 20000
200000 200000
2000000 2000000
... ...
The results go on for a couple more powers of 10 (*2), but you get the point.
I then tested a similar version of the above code, but first I initialized an integer array, set every even index to one more than its previous (uninitialized) value, freed the array, and then ran the code above, testing the same number of integers as the size of the array (i.e. testnum). These results are much more interesting:
testnum: || code returns:
2 2
20 20
200 175
2000 1750
20000 17500
200000 200000
2000000 2000000
... ...
Based on this, it's reasonable to conclude that C reuses freed memory (obviously), and that some of those new integer pointers point to addresses containing the previously incremented integers. My question is why all of my integer pointers in the first test consistently point to 0. Shouldn't they point to whatever free space on the heap my computer has offered the program, which could (and at some point should) contain non-zero values?
In other words, why does it seem like all of the new heap space that my c program has access to has been wiped to all 0s?
As you already know, you are invoking undefined behavior, so all bets are off. To explain the particular results you are observing ("why is uninitialized memory that I haven't written to all zeros?"), you first have to understand how malloc works.
First of all, malloc does not just directly ask the system for a page whenever you call it. It has an internal "cache" from which it can hand you memory. Let's say you call malloc(16) twice. The first time you call malloc(16), it will scan the cache, see that it's empty, and request a fresh page (4KB on most systems) from the OS. It then splits this page into two chunks, gives you the smaller chunk, and saves the other chunk in its cache. The second time you call malloc(16), it will see that it has a large enough chunk in its cache, and allocate memory by splitting that chunk again.
Freeing memory simply returns it to the cache. There, it may (or may not) be merged with other chunks to form a bigger chunk, and is then used for other allocations. Depending on the details of your allocator, it may also choose to return free pages to the OS if possible.
Now the second piece of the puzzle -- any fresh pages you obtain from the OS are filled with 0s. Why? Imagine it simply handed you an unused page that was previously used by some other process that has now terminated. Now you have a security problem, because by scanning that "uninitialized memory", your process could potentially find sensitive data such as passwords and private keys that were used by the previous process. Note that there is no guarantee by the C language that this happens (it may be guaranteed by the OS, but the C specification doesn't care). It's possible that the OS filled the page with random data, or didn't clear it at all (especially common on embedded devices).
Now you should be able to explain the behavior you're observing. The first time, you are obtaining fresh pages from the OS, so they are empty (again, this is an implementation detail of your OS, not the C language). However, if you malloc, free, then malloc again, there is a chance that you are getting back the same memory that was in the cache. This cached memory is not wiped, since the only process that could have written to it was your own. Hence, you just get whatever data was previously there.
Note: this explains the behavior for your particular malloc implementation. It doesn't generalize to all malloc implementations.
First off, you need to understand, that C is a language that is described in a standard and implemented by several compilers (gcc, clang, icc, ...). In several cases, the standard mentions that certain expressions or operations result in undefined behavior.
What is important to understand is that this means you have no guarantees on what the behavior will be. In fact any compiler/implementation is basically free to do whatever it wants!
In your example, this means you cannot make any assumptions about what the uninitialized memory will contain. So assuming it will be random, or contain elements of a previously freed object, is just as wrong as assuming it is zero, because any of that could happen at any time.
Many compilers (or OSes) will consistently do the same thing (such as the 0s you observed), but that is also not guaranteed.
(To maybe see different behaviors, try using a different compiler or different flags.)
Undefined behavior does not mean "random behavior" nor does it mean "the program will crash." Undefined behavior means "the compiler is allowed to assume that this never happens," and "if this does happen, the program could do anything." Anything includes doing something boring and predictable.
Also, the implementation is allowed to define any instance of undefined behavior. For instance, ISO C never mentions the header unistd.h, so #include <unistd.h> has undefined behavior, but on an implementation conforming to POSIX, it has well-defined and documented behavior.
The program you wrote is probably observing uninitialized malloced memory to be zero because, nowadays, the system primitives for allocating memory (sbrk and mmap on Unix, VirtualAlloc on Windows) always zero out the memory before returning it. That's documented behavior for the primitives, but it is not documented behavior for malloc, so you can only rely on it if you call the primitives directly. (Note that only the malloc implementation is allowed to call sbrk.)
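As a hedged, Linux/POSIX-specific sketch of that documented primitive behavior (anonymous mmap pages are specified to be zero-filled):

#include <stdio.h>
#include <sys/mman.h>

int main(void)
{
    /* ask the OS directly for a fresh anonymous page */
    unsigned char *page = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED)
        return 1;

    printf("%d\n", page[0]);  /* documented to print 0 for anonymous mappings */
    munmap(page, 4096);
    return 0;
}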
A better demonstration is something like this:
#include <stdio.h>
#include <stdlib.h>

int
main(void)
{
  {
    int *x = malloc(sizeof(int));
    *x = 0xDEADBEEF;
    free(x);
  }
  {
    int *y = malloc(sizeof(int));
    /* reading *y is deliberately undefined: it was never initialized */
    printf("%08X\n", (unsigned)*y);
  }
  return 0;
}
which has pretty good odds of printing "DEADBEEF" (but is allowed to print 00000000, or 5E5E5E5E, or make demons fly out of your nose).
Another better demonstration would be any program that makes a control-flow decision based on the value of an uninitialized variable, e.g.
int foo(int x)
{
int y;
if (y == 5)
return x;
return 0;
}
Current versions of gcc and clang will generate code that always returns 0, but the current version of ICC will generate code that returns either 0 or the value of x, depending on whether register EDX is equal to 5 when the function is called. Both possibilities are correct; so is generating code that always returns x, and so is generating code that makes demons fly out of your nose.
Useless deliberations, wrong assumptions, wrong test. In your test, every malloc hands you sizeof(int) of fresh memory. To see the UB you wanted to see, you should put something in the allocated memory and then free it; otherwise you never reuse it, you just leak it. Most OSes clear all the memory given to a program before executing it, for security reasons (so when the program starts, everything has been zeroed or initialized to the static values).
Change your program to:
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

int main(){
    size_t counter = 0;
    size_t testnum = 2000; //The number of ints to allocate and test.
    for(size_t i = 0; i < testnum; i++){
        int* temp = malloc(sizeof(int));
        assert(temp != NULL); //Just in case there's no space.
        if(*temp == 0) counter++;
        *temp = rand(); //Write something so reused memory is no longer zero.
        free(temp);     //Hand it back so the next malloc can reuse it.
    }
    printf(" %zu", counter);
    return 0;
}
I'm currently learning C programming, and since I'm a Python programmer, I'm not entirely sure about the inner workings of C. I just stumbled upon a really weird thing.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void test_realloc(){
    // So this is the original place allocated for my string
    char * curr_token = malloc(2*sizeof(char));
    // This is really weird because I only allocated 2x char size in bytes
    strcpy(curr_token, "Davi");
    curr_token[4] = 'd';
    // I guess it somehow overwrote data outside the allocated memory?
    // I was hoping this would result in an exception ( I guess not? )
    printf("Current token > %s\n", curr_token);
    // Looks like it's still printable, wtf???
    char *new_token = realloc(curr_token, 6);
    curr_token = new_token;
    printf("Current token > %s\n", curr_token);
}

int main(){
    test_realloc();
    return 0;
}
So the question is: how come I'm able to write more chars into a string than its allocated size? I know I'm supposed to manage malloc'ed memory myself, but does that mean there is no indication that something is wrong when I write outside the designated memory?
What I was trying to accomplish
Allocate a 4 char ( + null char ) string where I would write 4 chars of my name
Reallocate memory to accommodate the last character of my name
I know I'm supposed to manage malloc'ed memory myself, but does that mean there is no indication that something is wrong when I write outside the designated memory?
Welcome to C programming :). In general, this is correct: you can do something wrong and receive no immediate feedback that was the case. In some cases, indeed, you can do something wrong and never see a problem at runtime. In other cases, however, you'll see crashes or other behaviour that doesn't make sense to you.
The key term is undefined behavior. This is a concept that you should become familiar with if you continue programming in C. It means just like it sounds: if your program violates certain rules, the behaviour is undefined - it might do what you want, it might crash, it might do something different. Even worse, it might do what you want most of the time, but just occasionally do something different.
It is this mechanism which allows C programs to be fast - since they don't at runtime do a lot of the checks that you may be used to from Python - but it also makes C dangerous. It's easy to write incorrect code and be unaware of it; then later make a subtle change elsewhere, or use a different compiler or operating system, and the code will no longer function as you wanted. In some cases this can lead to security vulnerabilities, since unwanted behavior may be exploitable.
Suppose that you have an array as shown below.
int arr[5] = {6,7,8,9,10};
From the basics of arrays, the name of an array decays to a pointer to its first element. Here, arr is the name of the array; used in an expression, it points to the base element, whose value is 6. Hence *arr, literally *(arr+0), gives you 6 as the output, and *(arr+1) gives you 7, and so on.
Here, the size of the array is 5 integer elements. Now try accessing the 10th element, even though the size of the array is 5 integers: arr[10]. This is not going to give you an error; rather, it gives you some garbage value. As arr is just a pointer, the dereference is done as arr+0, arr+1, arr+2, and so on. In the same manner, you can also access arr+10 using the same base pointer.
Now apply this to your context. Though you have allocated memory for only 2 bytes, you can access memory beyond those two bytes using the pointer; hence it does not throw an error. You may be able to predict the output on your machine, but it is not guaranteed that you can predict the output on another machine (maybe the memory you are allocating on your machine is filled with zeros, and maybe those particular memory locations are being used for the first time ever!). In the statement
char *new_token = realloc(curr_token, 6); note that you are reallocating the memory, now 6 bytes, from the block pointed to by curr_token to the new_token pointer. After that, the size of the block behind new_token will be 6 bytes.
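For reference, here is a sketch of what the function might look like with the bounds respected (the sizes assume the 4-letter prefix "Davi" plus the terminator; this is one possible fix, not the only one):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    char *tok = malloc(5);            /* room for "Davi" + '\0' */
    if (tok == NULL)
        return 1;
    strcpy(tok, "Davi");

    char *tmp = realloc(tok, 6);      /* grow before appending; contents kept */
    if (tmp == NULL) {
        free(tok);
        return 1;
    }
    tok = tmp;
    tok[4] = 'd';
    tok[5] = '\0';

    printf("Current token > %s\n", tok);
    free(tok);
    return 0;
}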
Usually malloc is implemented in such a way that it allocates chunks of memory aligned to a paragraph (a fundamental alignment), which is typically 16 bytes.
So when you request, for example, 2 bytes, malloc actually allocates 16 bytes. This allows the same chunk of memory to be reused when realloc is called.
According to the C Standard (7.22.3 Memory management functions)
...The pointer returned if the allocation succeeds is suitably aligned so
that it may be assigned to a pointer to any type of object
with a fundamental alignment requirement and then used to access such an
object or an array of such objects in the space allocated
(until the space is explicitly deallocated).
Nevertheless, you should not rely on such behavior, because it is not normative; code that depends on it invokes undefined behavior.
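If you want to see how much your allocator really set aside, glibc offers a non-portable query, malloc_usable_size (glibc-specific; the values are illustrative, and you should never write into the extra space in portable code):

#include <malloc.h>   /* glibc-specific header */
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    char *p = malloc(2);
    if (p == NULL)
        return 1;

    /* typically reports more than the 2 bytes requested, e.g. 24 on glibc */
    printf("requested 2, usable %zu\n", malloc_usable_size(p));
    free(p);
    return 0;
}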
No automatic bounds checking is performed in C.
The program behaviour is unpredictable.
If you go writing in memory reserved for another process, you will end up with a segmentation fault; otherwise you will silently corrupt data, etc.
I came across part of a question in which I get output, but I need an explanation of why it is correct and why it works.
char arr[4];
strcpy(arr,"This is a link");
printf("%s",arr);
When I compile and execute, I get the following output.
Output:
This is a link
The short answer why it worked (that time) is -- you got lucky. Writing beyond the end of an array is undefined behavior. Undefined behavior is just that, undefined: it could just as easily cause a segmentation fault as produce output (though generally, stack corruption is the result).
When handling character arrays in C, you are responsible to ensure you have allocated sufficient storage. When you intend to use the array as a character string, you must also allocate sufficient storage for each character +1 for the nul-terminating character at the end (which is the very definition of a nul-terminated string in C).
Why did it work? Generally, when you request say char arr[4]; the compiler only guarantees that you have 4 bytes allocated for arr. However, depending on the compiler, the alignment, etc., the compiler may actually hand arr whatever it uses as a minimum allocation unit. Meaning that while you have only requested 4 bytes, and are only guaranteed 4 usable bytes, the compiler may have actually set aside 8, 16, 32, 64, or 128, etc. bytes.
Or, again, you were just lucky that arr was the last allocation requested and nothing had yet requested or written to the memory starting at byte 5 following arr.
The point being, you requested 4 bytes and are only guaranteed to have 4 bytes available. Yes, it may work in that one printf before anything else takes place in your code, but your code is wholly unreliable, and you are playing Russian roulette with stack corruption (if it has not already taken place).
In C, the responsibility falls to you to ensure your code, storage, and memory use are all well-defined, and that you do not wander off into the realm of the undefined, because if you do, all bets are off and your code isn't worth the bytes it is stored in.
How could you make your code well-defined? Appropriately limit and validate each required step in your code. For your snippet, you could use strncpy instead of strcpy and then affirmatively nul-terminate arr before calling printf, e.g.
char arr[4] = ""; /* initialize all values */
strncpy(arr,"This is a link", sizeof arr); /* limit copy to bytes available */
arr[sizeof arr - 1] = 0; /* affirmatively nul-terminate */
printf ("%s\n",arr);
Now, you can rely on the contents of arr throughout the remainder of your code.
Your code has a memory issue (a buffer overrun). The function strcpy copies bytes until it hits the null character; the function printf prints until it hits the null character.
There is no guarantee on the behavior of this piece of code.
It's just like this: you told me "I'll pick you up at 5:00 p.m.," and when you came, I would be there (that part is guaranteed). But I can't guarantee whether I grabbed you a cup of coffee or not, because you didn't tell me you wanted one. Maybe I'm very nice and bought two cups of coffee, or maybe I'm a cheapskate and just bought one for myself.
It may work. It may not. It may fail immediately and obviously. It may fail at some arbitrary future time and in subtle ways that will drive you insane.
That is the often-insidious nature of undefined behaviour. Don't do it.
If it works at all, it's totally by accident and in no way guaranteed. It's possible that you're overwriting stuff on the stack or in other memory (depending on the implementation and how/where the actual variable arr is defined(a)) but that the memory being overwritten is not used after that point (given the simple nature of the code).
That possibility of it working accidentally in no way makes it a good idea.
For the language lawyers among us, section J.2 (instances of undefined behaviour) of C11 clearly states:
An array subscript is out of range, even if an object is apparently accessible with the given subscript (as in the lvalue expression a[1][7] given the declaration int a[4][5]).
That informative section references 6.5.6, which is normative, and which states when discussing pointer/integer addition (of which a[b] is an example):
If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined. If the result points one past the last element of the array object, it shall not be used as the operand of a unary * operator that is evaluated.
(a) For example, on my system, declaring the variable inside main causes the program to crash because the buffer overflow trashes the return address on the stack.
However, if I put the declaration at file level (outside of main), it seems to run just fine, printing the message then exiting the program.
But I assure you that's only because the memory you've trashed is not important for the continuation of the program in this case. It will almost certainly be important in anything more substantial than this example.
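The distinction the quoted 6.5.6 passage draws can be seen in a tiny sketch: a pointer one past the end may be formed and compared, but never dereferenced.

int main(void)
{
    int a[4] = {0};
    int *end = a + 4;                /* legal: points one past the last element */

    for (int *p = a; p != end; ++p)  /* legal: comparing against it */
        *p = 1;

    /* *end = 1;                        undefined: dereferencing it */
    return 0;
}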
Your code will seem to work as long as the printf is placed just after strcpy, but it is still wrong coding.
Try the following, and it won't work:
#include <stdio.h>
#include <string.h>

int main(void)
{
    int j;
    char arr[4];
    int i;
    strcpy(arr, "This is a link"); /* overflows arr, trashing neighboring memory */
    i = 0; /* these writes land in the region the overflow corrupted */
    j = 0;
    printf("%s", arr);
    return 0;
}
To understand why this is so, you must understand the idea of the stack. All local variables are allocated on the stack. Hence, in your code, 4 bytes are allocated for arr, and when you copy a string larger than 4 bytes, you overwrite/corrupt some other memory. Because you accessed arr immediately after strcpy, the overwritten area, which may belong to other variables, had not yet been updated by the program; that's why your printf works fine. But as I suggested in the example code, where other variables that fall into the overwritten memory region are updated afterwards, you won't get the correct (or rather, desired) output.
Your code also works because the stack grows downwards; had it grown the other way, you would not have gotten the desired output either.