While I was doing "Learn C The Hard Way" examples, I thought to myself:
I set int a = 10; but where does that value 10 actually live? Can I access it manually from the outside while my program is running?
Here's a little C code snippet for demonstration purposes:
int main(int argc, char const *argv[]) {
    int a = 10;
    int b = 5;
    int c = a + b;
    return 0;
}
I opened up the GNU Project Debugger (GDB) and entered:
break main
run
next 2
From what I understood 0x7fff5bffb04 is a memory address of int c. I then used hexdump -C /dev/mem system call to dump the entire memory into the terminal.
Now the question is where do I look for the variable c in this massive hex dump? My hope is that given the address 0x7fff5bffb04 I can find its value, which is 15. Also, bonus question, what does each column in hexdump -C represent? (I know the last column is ASCII representation)
I then used hexdump -C /dev/mem system call to dump the entire memory into the terminal.
Your hexdump dumped physical memory addresses. The address 0x7fff5bffb04 is a virtual address of the variable in the process you are debugging. It is mapped to some physical address, but you will not be able to find which without examining kernel mapping tables (as Mat already told you in a comment).
To examine virtual address space, use /proc/<pid>/mem (as Barmar already told you in a comment).
But this entire exercise is pointless, because you already can examine the virtual memory in GDB, and you are not going to see anything when you look at virtual memory that GDB didn't already show you much more conveniently [1].
[1] Except you could see GDB-inserted breakpoints, but you are not expected to understand that :-)
Firstly, there is no reason why the values would even exist in RAM. More than likely, the machine code for this program simply keeps the values in CPU registers. You would have to have more bytes (try at least 512) and set them to a random value, which you could then search for in the memory dump.
You are far better off looking at the assembly code produced by the C compiler.
I have the following program in C:
#include <stdio.h>

int main(void) {
    int i = 0;
    for (int k = 0; k < 10; k++)
        printf("Number: %d", k);
    printf("Hello\n");
    return 0;
}
When I run it in gdb it gives me a listing of all the registers, but I don't see the variable k in any of those registers. For example, in the below screenshot, I know k=4, but I don't see that value in any of the registers. Where would this number be stored then?
I know k=4, but I don't see that value in any of the registers. Where would this number be stored then?
If you optimized the program, the value would indeed likely be stored in a register (but the program will be much harder to debug).
Without optimization, the value is stored on the stack (to be precise, given the disassembly, it is stored at location $rbp-8), and is loaded into a register by the very next instruction (the one before which you have stopped).
If you do stepi and look at the value of $rax, you will find it right there.
P.S. info locals will give you info about local variables.
Update:
What does stepi do?
It executes a single machine instruction, then stops. You can find this out by reading the manual, or by using the help stepi GDB command.
What/where is $rbp-8? Could you please explain a bit more about what that is and how it works?
That is something that would be covered in every introductory x86 programming book or tutorial.
Briefly, the current state of the program execution can be described as a series of linked activation records or "frames". On x86 without optimization, the $RBP register is usually used as a frame pointer register (i.e. it points to the current frame). Locals are stored at negative offsets from the frame pointer (here, k is stored at offset -8).
I am very new to C, it's my second high-level programming language after Java. I have gotten most of the basics down, but for whatever reason I am unable to write a single character to screen memory.
This program is compiled using Turbo C for DOS on an Am486-DX4-100 running at 120 MHz. The graphics card is a very standard VLB Diamond Multimedia Stealth SE using a Trio32 chip.
For an OS I am running PC-DOS 2000 with an ISO codepage loaded. I am running in standard MDA/CGA/EGA/VGA style 80 column text mode with colour.
Here is the program as I have it written:
#include <stdio.h>

int main(void) {
    unsigned short int *Video = (unsigned short int *)0xB8000;
    *Video = 0x0402;
    getchar();
    return 0;
}
As I stated, I am very new to C, so I apologize if my error seems obvious, I was unable to find a solid source on how to do this that I could understand.
To my knowledge, in real mode on the x86 platform, the screen memory for text mode starts at 0xB8000. Each character is stored in two bytes, one for the character, and one for the background/foreground. The idea is to write the value 0x0402 (which should be a red smiling face) to 0xB8000. This should put it at the top left of the screen.
I have considered the possibility that the screen is scrolling and thus immediately removing my character after it is written. To rule this out, I have tried two things:
Repeatedly write this value using a loop
Write it a bit further down.
I can read and print the value I wrote to memory, so it's obviously still somewhere in memory, but for whatever reason I do not get anything onscreen. I'm obviously doing something wrong, however I do not know what could be the issue. If any other details are needed, please ask. Thank you for any possible help you can give.
In real mode, a mechanism called 20-bit segment:offset addressing is used to address the first full 1MiB of memory. 0xb8000 is a physical memory address. You need to use something called a far pointer, which allows you to address memory with real mode segmentation. The different types of pointers are described in this Stack Overflow answer.
0xb8000 can be represented as a segment of 0xb800 and an offset of 0x0000. The calculation to get the physical address is segment*16+offset: 0xb800*16+0x0000=0xb8000. With this in mind, you can include dos.h and use the MK_FP C macro to initialize a far pointer to such an address given a segment and an offset.
From the documentation MK_FP is defined as:
MK_FP() Make a Far Pointer
#include <dos.h>
void far *MK_FP(seg,off);
unsigned seg; Segment
unsigned off; Offset
MK_FP() is a macro that makes a far pointer from its component segment 'seg' and offset 'off' parts.
Returns: A far pointer.
Your code could be written like this:
#include <stdio.h>
#include <dos.h>

int main(void) {
    unsigned short int far *Video = (unsigned short int far *)MK_FP(0xB800, 0x0000);
    *Video = 0x0402;
    getchar();
    return 0;
}
The memory segment address depends on the video mode used:
0xA0000 for EGA/VGA graphics modes (64 KB)
0xB0000 for monochrome text mode (32 KB)
0xB8000 for color text mode and CGA-compatible graphics modes (32 KB)
To access VRAM directly you need a 32-bit pointer that holds both the segment and the offset; otherwise you would write into your own heap instead, which usually leads to undefined behaviour.
char far *Video = (char far *)0xb8000000;
See also: What are near, far and huge pointers?
As @stacker pointed out, in a 16-bit environment you need to assign the pointer carefully. AFAIK you need to use the far keyword (my gosh, what nostalgia).
Also make sure you don't compile in the so-called "Huge" memory model. It's incompatible with far addressing, because every 32-bit pointer is automatically "normalized" to 20 bits. Try selecting the "Large" memory model.
I was just reading about memory allocation, and can't help wonder this question:
Do both
int x = 4;
and
int this_is_really_really_long_name_for_an_integer_variable = 4;
occupy the same amount of memory (the total memory occupied by the variable, not just sizeof(int))?
I understand that this question is related to 'programming languages and compiler construction'. But, I haven't got to study it :(
In general they occupy the same amount of space, i.e. sizeof(int). However, one could argue that the object file is a different matter: the amount of data the variable stores does not change, but the symbol table (and any debugging information) has to record the name, so a longer name makes the file larger. Consider the following example.
$ cat short.c && gcc -c short.c && wc -c short.o
int x = 0;
927 short.o
$ cat long.c && gcc -c long.c && wc -c long.o
int this_is_really_really_long_name_for_an_integer_variable = 0;
981 long.o
The difference in size (981 - 927 = 54 bytes) is exactly the difference between the lengths of the variables' names (55 characters versus 1).
From a run-time efficiency and memory usage point of view it does not matter, though.
In C? Yes, these variables will occupy the same amount of space.
The variable name is used only by the compiler at compile time.
But there are some languages that store variable names at run time.
The length of the variable name has no bearing on the amount of storage reserved for it; in most cases, the variable name isn't preserved in the generated machine code.
32 bits (assuming a 32-bit int), since the compiler will not store the name. It handles the variable as an address only.
The int container occupies only 32 bits.
Variable names are only used for address binding at compile time. They are stored in the symbol table during lexical processing, which is one phase of the compilation process.
Once address binding is done, the variable name is no longer used, so its length does not matter. The variable only takes 32 bits.
I'm working on a practice problem set for C programming, and I've encountered this question. I'm not entirely sure what the question is asking for. Given that xDEADBEEF is the halt instruction, where do we inject it? Why is the FP relevant in this question? Thank you!
You’ve been assigned as the lead computer engineer on an interplanetary space mission to Jupiter. After several months in space, the ship’s main computer, a HAL9000, begins to malfunction and starts killing off the crew members. You’re the last crew member left alive and you need to trick the HAL 9000 computer into executing a HALT instruction. The good news is that you know that the machine code for a halt instruction is (in hexadecimal) xDEADBEEF (in decimal, this is -559,038,737). The bad news is that the only program that the HAL 9000 operating system is willing to actually run is chess. Fortunately, we have a detailed printout of the source code for the chess program (an excerpt of all the important parts is given below). Note that the getValues function reads a set of non-zero integers and places each number in sequence in the array x. The original author of the program obviously expected us to just provide two positive numbers, however there’s nothing in the program that would stop us from inputting three or more numbers. We also know that the stack will use memory locations between 8000 and 8999, and that the initial frame pointer value will be 8996.
void getValues(void) {
    int x[2]; // array to hold input values
    int k = 0;
    int n;
    n = readFromKeyboard(); // whatever you type on the keyboard is assigned to n
    while (n != 0) {
        x[k] = n; // no bounds check: k can run past the end of x
        k = k + 1;
        n = readFromKeyboard(); // whatever you type on the keyboard is assigned to n
    }
    /* the rest of this function is not relevant */
}

int main(void) {
    int x;
    getValues();
    /* the rest of main is not relevant */
}
What sequence of numbers should you type on the keyboard to force the computer to execute a halt instruction?
SAMPLE Solution
One of the first three numbers should be -559038737. The fourth number must be the address of where 0xdeadbeef was placed into memory. Typical values for the 4th number are 8992 (0xdeadbeef is the second number) or 8991 (0xdeadbeef is first number).
What you want to do is overflow the input such that the program will return into a set of instructions you have overwritten at the return address.
The problem lies here:
int x[2]; // array to hold input values
By passing in more than two values, you can overwrite memory that you shouldn't. Explaining the sample solution:
First input -559,038,737 puts xDEADBEEF in memory
Second input -559,038,737, why not.
Third number -559,038,737 can't hurt
Fourth number 8992 is the address we want the function to return into.
When the function call returns, it will return to the address that we overwrote the return address on the stack with (8992).
Here are some handy resources, and an excerpt:
The actual buffer-overflow hack works like this:
Find code with overflow potential.
Put the code to be executed in the buffer, i.e., on the stack.
Point the return address to the same code you have just put on the stack.
Also a good book on the topic is "Hacking: The art of exploitation" if you like messing around with stacks and calling procedures.
In your case, it seems they are looking for you to encode your instructions in integers passed to the input.
An article on buffer overflowing
Hint: Read about buffer overflow exploits.
I am trying to do an example from the Smashing the Stack for Fun and Profit in C, but am kind of stuck at a point,
following is the code (I have a 64-bit machine with Ubuntu 64-bit):
#include <stdio.h>

void func(int a, int b, int c);

int main()
{
    int x;
    x = 0;
    func(1,2,3);
    x = 1;
    printf("x is : %d\n", x);
    return 0;
}

void func(int a, int b, int c)
{
    char buffer[1];
    int *ret;
    ret = buffer + 17;
    (*ret) += 7;
}
The above code works fine, and on returning the x=1 line is not executed, but I can't understand the logic behind ret = buffer + 17;. Shouldn't it be ret = buffer + 16;, i.e. 8 bytes for buffer and 8 for the saved base pointer on the stack?
Secondly, my understanding is that char buffer[1] is taking 8 bytes (owing to 64-bit arch)
and if I increase this buffer to say buffer[2], still the same code should work fine, BUT this is not happening and it starts giving seg fault.
Regards,
Numan
'char' on every architecture I've used is 8 bits wide irrespective of whether it's an 8 bit micro, a 16 bit micro, a 32 bit PC, or a 64 bit new PC. int, on the other hand, tends to follow the word size on smaller machines, although on 64-bit Linux it is still 32 bits; it is pointers that are 8 bytes there.
The order in which the locals are put on the stack can be implementation specific. My guess is that your compiler is putting "int *ret" on the stack before "char buffer[1]". So, to get to the return address, we have to go through "char buffer[1]" (1 byte), "int *ret" (8 bytes), and the saved base pointer (8 bytes) for a total of 17 bytes.
Here's a description of the stack frame on x86 64-bit:
http://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-035-computer-language-engineering-spring-2010/projects/x86-64
Step through the disassembly in gdb (disassemble, stepi, nexti) and look at the registers at each step (info registers).
Here is how you can step through the disassembly:
gdb ./myprogram
break main
run
display/4i $pc
stepi
stepi
...
info registers
...
You should also know (you probably already do given that you got part of it working) that on many distros, the stack protector is enabled by default in gcc. You can manually disable it with -fno-stack-protector.
With a lot of this stack smashing stuff, your best friend is gdb. Since you're segfaulting already you're already writing memory you're not supposed to be (a good sign). A more effective way to do it right is to change the return address to somewhere else that's a valid address (e.g. to func's address or to some shellcode you've got). A great resource I'd recommend is the Shellcoder's Handbook, but since you're on a 64-bit architecture a lot of the examples need a bit of work to get going.
Aside from (or better yet, in addition to) running a debugger, you can also use the printf "%p" construct to print the addresses of your variables, e.g.:
printf("buf: %p\n", buffer); //&buffer[0] works too; &buffer works for an array
printf("ret: %p\n", &ret);
printf("a: %p\n", &a);
Printing the addresses of various things can give great insight into how your compiler/implementation is arranging things in the background. And you can do it directly from C code, too!
Consider taking a look at stealth's borrowed code chunk technique, if you're interested in x64 buffer overflow exploitation.