Examining memory with x86 processor [duplicate] - c

I'm new to GDB and have a problem with it. I have an x86 processor, which means that the eip register should hold 4 bytes. I compiled some C code and set a breakpoint at main(). Typing x/x $eip gives me back "0xd02404c7" (hexadecimal), which as far as I know is a machine-language instruction. So my question is: if this machine instruction is 4 bytes in size, then the command "x/4x $eip" should display 16 bytes, and it shows me this:
0x8048426 <main+9>: 0xd02404c7 0xe8080484 0xfffffebe 0x9066c3c9
So I'm confused. If this is 16 bytes, then why does it show them as located at the same memory address, when one register in a 32-bit processor should contain only 4 bytes? Thank you.

Typing x/x $eip gives me back "0xd02404c7" (hexadecimal), which as far as I know is a machine-language instruction.
No, it gives you the raw bytes of your code. These raw bytes can cover less than one, exactly one, or several machine instructions. The shortest x86 instruction takes up just one byte; the longest takes 15 bytes.
So my question is: if this machine instruction is 4 bytes in size.
An address is 4 bytes, but the instruction itself may contain 1 to 15 bytes. You can see the relationship between bytes and instructions by running (gdb) disas/r main
So every memory address can store 4 machine instructions?
Not at all. Every memory address corresponds to 1 byte of memory. That byte may contain an entire (single-byte) instruction, or it may be the start or middle of a multi-byte instruction, or it may not contain any instruction at all (e.g. if the address points into the .data section).

Related

Why I want to add an address to the stack pointer?

I'm trying to understand the first microcorruption challenge.
I want to ask about the first line of the main function.
Why would they add that address to the stack pointer?
This looks like a 16-bit ISA (see footnote 1), otherwise the disassembly makes no sense.
0xff9c is -100 in 16-bit 2's complement, so it looks like this is reserving 100 bytes of stack space for main to use. (Stacks grow downward on most machines). It's not an address, just a small offset.
See MSP430 Assembly Stack Pointer Behavior for a detailed example of MSP430 stack layout and usage.
Footnote 1: MSP430, possibly? http://mspgcc.sourceforge.net/manual/x82.html It's a 16-bit ISA with those register names and those mnemonics, and I think its machine code uses variable-length instructions of 2, 4, or 6 bytes.
It's definitely not ARM; call and jmp are not ARM mnemonics; that would be bl and b. Also, ARM uses op dst, src1, src2 syntax, while this disassembly uses op src, dst.

ARM Assembly loop using PC?

I am currently learning ARM assembly and I have some questions. When reading the docs, I found that register 15 is the program counter, which stores the address of the next instruction, and that when an instruction is done, it is incremented by 4 bytes (or 2 in Thumb mode).
So, my question is: if I run an instruction that subtracts 4 bytes from the PC, it would return to the previous instruction, wouldn't it? And then it would go back over and over again, so it would be an infinite loop?
Thanks, and sorry if it is an obvious question.
Regards,
Pedro.
You have to look at this on an instruction-by-instruction basis, since for some instructions modifying the PC is unpredictable. For those where it is legal, modifying the program counter essentially causes a jump to the address you store in the program counter. You don't have to worry about the two-instructions-ahead behavior when doing so (the offsets are 8 and 4 bytes, not 4 and 2: two instructions ahead).
Yes - a jump/branch instruction is exactly what you're describing - it's an instruction which modifies the PC. If you arrange the result of the jump to put the program counter back where it was then, yes, you'll loop on the spot.
Note that the value read from the PC is not really the address of the next instruction but the address of the current instruction +4 (in Thumb mode) or +8 (in ARM mode). So in ARM this is 2 instructions later, but in Thumb it may not be (as instructions can be 16-bit or 32-bit).

How many machine instructions can single memory address store?

I'm new to GDB and currently trying to examine memory. I guess the title says everything. Basically, I compiled some C code and set a breakpoint at main. When I type x/x $eip it gives me back some machine instruction, 0xd02404c7.
On the second try, x/5x $eip gives back:
0x8048426 <main+9>: 0xd02404c7 0xe8080484 0xfffffebe 0x9066c3c9
0x8048436: 0x90669066
So I got a little confused here. The space between the addresses 0x8048426 and 0x8048436 is equal to 10. So it turns out that four instructions took "10 addresses". My questions are: Can a memory address store a maximum of 4 machine instructions?
Why did it take "10 addresses" to store 4 machine instructions?
Is there any relationship between how many bits a processor has and how many machine instructions a single memory address can store?
Sorry if the question sounds silly.
The space between the addresses 0x8048426 and 0x8048436 is equal to 10. So it turns out that four instructions took "10 addresses"
Not quite: it's equal to 0x10, which is a hexadecimal number, equal to 16 in decimal.
So those instructions are taking 16 bytes.
Can Memory address store maximum of 4 machine instructions?
Addresses have a granularity of 1 byte. That is, one address refers to exactly one byte.
Machine instructions take one or more bytes. So a single memory address, a single byte, can store a maximum of 1 machine instruction, at least on x86.
Why did it take "10 addresses" to store 4 machine instructions?
Each of the numbers you see is not an instruction. They are 4-byte words, the unit your 32-bit CPU usually works with, and GDB is simply displaying raw memory in that unit.
Is there any relationship between how many bits a processor has and how many machine instructions a single memory address can store?
Not really. A single memory address can store at most one instruction, because instructions are at least 1 byte long (on x86).
But the number of bits a processor has can indicate that it has access to an extended or different instruction set.

C Buffer overflow - Return address not expressible in ASCII

I'm trying to overflow a buffer of 64 bytes.
The buffer is being filled by a call to gets.
My understanding is that I need to write a total of 65 bytes to fill the buffer, and then write another 4 bytes to fill the stack frame pointer.
The next 4 bytes should overwrite the return address.
However, the address that I wish to write is 804846A.
Is this same as 0x0804846A? If so, I'm finding it hard to enter 04 (^D)
Should this be entered in reverse order? (6A 84 04 08)?
Some initial experiments that I ran, with the input being ZZZZZ..(64 times)..AAAABBBB,
ended up setting the ebp register to 0x42414141.
The architecture in question is x86.
update: I managed to get ASCII codes 0x04 and 0x08 working. The issue seems to be with 0x84. I tried copying the symbol corresponding to 0x84 from http://www.ascii-code.com which is apparently „. However, C seems to resolve this symbol into a representation greater than 1 byte.
I also tried to use ä as mentioned in http://www.theasciicode.com.ar
This also resulted in a representation greater than 1 byte.
You seem to be depending on implementation details of a particular compiler and CPU architecture. For example:
Not all CPU architectures use a frame pointer at all.
Endianness varies across different CPUs, and this would affect whether you need to "reverse" the bytes or not.
Where the stack metainformation (the frame pointer, etc.) is located with respect to a given local variable will differ between compilers, and even between the same compiler using different optimization options.

Why does an 8-byte array (C) in 64-bit Ubuntu take 16 bytes?

I've recently been (relearning) lower level CS material and I've been exploring buffer overflows. I created a basic C program that has an 8-byte array char buffer[8];. I then used GDB to explore and disassemble the program and step through its execution. I'm on a 64-bit version of Ubuntu, and I noticed that my 8-byte char array is actually represented in 16 bytes in memory - the high order bits all just being 0.
E.g. Instead of 0xDEADBEEF 0x12345678 as I might expect to represent the 8 byte array, it's actually something like 0x00000000 0xDEADBEEF 0x00000000 0x12345678.
I did some googling and was able to get GCC to compile my program as a 32-bit program (using -m32 flag) - which resulted in the expected 8 bytes as normal.
I'm just looking for an unambiguous explanation as to why the 8-byte character array is represented in 16 bytes on a 64-bit system. Is it because the minimum word size / addressable unit is 8 bytes (64 bits) and GDB is simply printing based on an 8-byte word size?
Hopefully this is clear, but let me know if clarification is needed.
64-bit systems are geared toward aligning memory to 16-byte boundaries (16-byte stack alignment is part of the System V ABI). For stack allocations there are two parts to this: first, the stack itself needs to be aligned; second, any allocations then try to preserve that alignment.
This explains the first part, why the 8-byte array takes 16 bytes on the stack. As to why it gets split into two 8-byte qwords, that is harder to tell, as you haven't provided any code (assembly or C) showing how this buffer is used. Trying to replicate this using mingw64 gives the 16-byte alignment, but not the odd layout you are seeing.
Of course, the other possibility, given the lack of assembly, is that GDB is displaying 2 QWORDs even though the data is in fact 2 DWORDs (in other words, try using p/x (char[8]) to dump the contents...).
