Editing ELF binary call instruction - c

I am playing around with manipulating a binary's call instructions. I have the code below:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
void myfunc2(char *str2, char *str1);
void myfunc(char *str2, char *str1)
{
memcpy(str2 + strlen(str2), str1, strlen(str1));
}
int main(int argc, char **argv)
{
char str1[4] = "tim";
char str2[10] = "hello ";
myfunc((char *)&str2, (char *)&str1);
printf("%s\n", str2);
myfunc2((char *)&str2, (char *)&str1);
printf("%s\n", str2);
return 0;
}
void myfunc2(char *str2, char *str1)
{
memcpy(str2, str1, strlen(str1));
}
I have compiled the binary and using readelf or objdump I can see that my two functions reside at:
46: 000000000040072c 52 FUNC GLOBAL DEFAULT 13 myfunc2
54: 000000000040064d 77 FUNC GLOBAL DEFAULT 13 myfunc
Using the command objdump -D test (test is my binary's name), I can see that main has two callq instructions that call my functions. I tried to edit the first one to point to myfunc2 using the above address 72c, but that does not work; it causes the binary to fail.
000000000040069a <main>:
40069a: 55 push %rbp
40069b: 48 89 e5 mov %rsp,%rbp
40069e: 48 83 ec 40 sub $0x40,%rsp
4006a2: 89 7d cc mov %edi,-0x34(%rbp)
4006a5: 48 89 75 c0 mov %rsi,-0x40(%rbp)
4006a9: 64 48 8b 04 25 28 00 mov %fs:0x28,%rax
4006b0: 00 00
4006b2: 48 89 45 f8 mov %rax,-0x8(%rbp)
4006b6: 31 c0 xor %eax,%eax
4006b8: c7 45 d0 74 69 6d 00 movl $0x6d6974,-0x30(%rbp)
4006bf: 48 b8 68 65 6c 6c 6f movabs $0x206f6c6c6568,%rax
4006c6: 20 00 00
4006c9: 48 89 45 e0 mov %rax,-0x20(%rbp)
4006cd: 66 c7 45 e8 00 00 movw $0x0,-0x18(%rbp)
4006d3: 48 8d 55 d0 lea -0x30(%rbp),%rdx
4006d7: 48 8d 45 e0 lea -0x20(%rbp),%rax
4006db: 48 89 d6 mov %rdx,%rsi
4006de: 48 89 c7 mov %rax,%rdi
4006e1: e8 67 ff ff ff callq 40064d <myfunc>
4006e6: 48 8d 45 e0 lea -0x20(%rbp),%rax
4006ea: 48 89 c7 mov %rax,%rdi
4006ed: e8 0e fe ff ff callq 400500 <puts@plt>
4006f2: 48 8d 55 d0 lea -0x30(%rbp),%rdx
4006f6: 48 8d 45 e0 lea -0x20(%rbp),%rax
4006fa: 48 89 d6 mov %rdx,%rsi
4006fd: 48 89 c7 mov %rax,%rdi
400700: e8 27 00 00 00 callq 40072c <myfunc2>
400705: 48 8d 45 e0 lea -0x20(%rbp),%rax
400709: 48 89 c7 mov %rax,%rdi
40070c: e8 ef fd ff ff callq 400500 <puts@plt>
400711: b8 00 00 00 00 mov $0x0,%eax
400716: 48 8b 4d f8 mov -0x8(%rbp),%rcx
40071a: 64 48 33 0c 25 28 00 xor %fs:0x28,%rcx
400721: 00 00
400723: 74 05 je 40072a <main+0x90>
400725: e8 f6 fd ff ff callq 400520 <__stack_chk_fail@plt>
40072a: c9 leaveq
40072b: c3 retq
I suspect I need to do something with calculating the address information through relative location or using the lea/mov instructions.
Any assistance to learn how to modify the call function would be greatly appreciated - please no pointers on editing strings like the howtos all over most of the internet...

In order to rewrite the address, you have to know the exact way the callq instructions are encoded.
Let's take the disassembly output of the first call:
4006e1: e8 67 ff ff ff callq 40064d <myfunc>
4006e6: ...
You can see that the instruction is encoded in 5 bytes. The e8 byte is the opcode, and 67 ff ff ff is its operand: a 32-bit value stored in little-endian order, i.e. 0xffffff67. At this point one might ask what 67 ff ff ff has to do with 0x40064d.
The answer is that e8 encodes a so-called "relative call": the operand is a signed displacement, and the jump is performed relative to the address of the next instruction. To retarget the call you have to calculate the distance between 4006e6 and the function you want to call, then store that distance in these 4 bytes. An absolute near call (opcode ff) works differently: it takes its target from a register or memory operand, so there would be no displacement to recompute.
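To prove that this is the case, consider the following arithmetic. The sum overflows 32 bits and the carry is discarded, leaving 0x40064d (equivalently, 0xffffff67 read as a signed value is -0x99, and 0x4006e6 - 0x99 == 0x40064d):
0x004006e6 + 0xffffff67 == 0x10040064d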
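To make the arithmetic above concrete, here is a minimal sketch in C that builds the 5 bytes of a relative call from the address of the call instruction and the address of the target. The function name encode_call_rel32 is made up for this example; it is not part of any tool.

```c
#include <stdint.h>

/* Hypothetical helper: build the 5 bytes of an x86-64 `call rel32`
 * placed at call_addr so that it transfers control to target.
 * Returns 0 on success, -1 if the target is out of rel32 range. */
int encode_call_rel32(uint64_t call_addr, uint64_t target, uint8_t out[5])
{
    /* The displacement is measured from the next instruction,
     * which starts 5 bytes after the call opcode. */
    int64_t diff = (int64_t)target - (int64_t)(call_addr + 5);
    if (diff < INT32_MIN || diff > INT32_MAX)
        return -1;

    uint32_t rel = (uint32_t)(int32_t)diff;
    out[0] = 0xE8;                           /* opcode: call rel32 */
    out[1] = (uint8_t)(rel & 0xFF);          /* little-endian displacement */
    out[2] = (uint8_t)((rel >> 8) & 0xFF);
    out[3] = (uint8_t)((rel >> 16) & 0xFF);
    out[4] = (uint8_t)((rel >> 24) & 0xFF);
    return 0;
}
```

For the call at 4006e1, a target of 0x40064d (myfunc) gives back the original e8 67 ff ff ff, while a target of 0x40072c (myfunc2) gives e8 46 00 00 00, since 0x40072c - 0x4006e6 == 0x46. Those are the bytes to write over the first call; where they sit in the file depends on the file offset of the segment, which readelf -l reports.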


Inline functions expansion with inner switch

Consider the scenario where you have a complex procedure that implements a state machine and is driven by a state that never changes during a call to kernel, as in the code illustrated below.
void mode1(int recursion);
void mode2(int recursion);
static inline void kernel(int recursion, int mode){
if(!recursion) return;
// branches all lead to similar switch cases here.
// logical branches and loops can be quite complicated.
switch(mode){
default: return;
case 1: mode1(recursion-1);
case 2: mode2(recursion-1);
}
}
void mode1(int recursion){
kernel(recursion,1);
}
void mode2(int recursion){
kernel(recursion,2);
}
If only mode1 and mode2 functions are called elsewhere, can recent compilers eliminate the inner branches?
All functions are in the same compilation unit.
I came across this while implementing an interpreter for a subset of SPIR-V bytecode. The inner branch selects between finding out how much to allocate, building up the AST, and doing the actual evaluation of expressions. The kernel takes care of traversing the tree, with all the switches on instruction opcodes. Writing separate functions for each state would be even more difficult to maintain, since the kernel already takes up 1000+ lines of code, and moving up and down between copies of the same traversal point and keeping them identical can be really difficult.
(I know modern C++ has constexpr if, but this is pure C code.)
Edit:
I've tried with msvc compiler with the following code:
uint32_t interp_code[]={1,2,1,1,2};
void mode1(const uint32_t* code, int recursion);
void mode2(const uint32_t* code, int recursion);
static INLINE void kernel(const uint32_t* code, int recursion, int mode)
{
if (!recursion) return;
// branches all lead to similar switch cases here.
// logical branches and loops can be quite complicated.
switch (*code) {
default: return;
case 1:
switch (mode) {
default: return;
case 1: mode1(code + 1, recursion - 1);
case 2: mode2(code + 1, recursion - 1);
}
case 2:
switch(mode) {
default: return;
case 1: mode2(code + 1, recursion - 1);
case 2: mode1(code + 1, recursion - 1);
}
}
}
void mode1(const uint32_t* code,int recursion)
{
kernel(code, recursion, 1);
}
void mode2(const uint32_t* code, int recursion)
{
kernel(code, recursion, 2);
}
int main()
{
mode1(interp_code, 5);
return 0;
}
With INLINE defined as inline, the compiler emitted calls to kernel (O2 optimizations); with INLINE defined as __forceinline, the two modes were compiled separately, each with kernel expanded inside it and no call to kernel.
Disassembly for inline:
31: void mode1(const uint32_t* code,int recursion)
32: {
00007FF65F8910C0 48 83 EC 28 sub rsp,28h
33: kernel(code, recursion, 1);
00007FF65F8910C4 85 D2 test edx,edx
00007FF65F8910C6 74 6C je mode1+74h (07FF65F891134h)
00007FF65F8910C8 48 89 5C 24 30 mov qword ptr [rsp+30h],rbx
00007FF65F8910CD 48 8D 59 04 lea rbx,[rcx+4]
00007FF65F8910D1 48 89 74 24 38 mov qword ptr [rsp+38h],rsi
00007FF65F8910D6 48 89 7C 24 20 mov qword ptr [rsp+20h],rdi
00007FF65F8910DB 8D 7A FF lea edi,[rdx-1]
00007FF65F8910DE 66 90 xchg ax,ax
00007FF65F8910E0 8B 4B FC mov ecx,dword ptr [rbx-4]
00007FF65F8910E3 8B F7 mov esi,edi
00007FF65F8910E5 83 E9 01 sub ecx,1
00007FF65F8910E8 74 07 je mode1+31h (07FF65F8910F1h)
00007FF65F8910EA 83 F9 01 cmp ecx,1
00007FF65F8910ED 75 36 jne mode1+65h (07FF65F891125h)
00007FF65F8910EF EB 1A jmp mode1+4Bh (07FF65F89110Bh)
00007FF65F8910F1 8B D7 mov edx,edi
00007FF65F8910F3 48 8B CB mov rcx,rbx
00007FF65F8910F6 E8 C5 FF FF FF call mode1 (07FF65F8910C0h)
00007FF65F8910FB 41 B8 02 00 00 00 mov r8d,2
00007FF65F891101 8B D7 mov edx,edi
00007FF65F891103 48 8B CB mov rcx,rbx
00007FF65F891106 E8 F5 FE FF FF call kernel (07FF65F891000h)
00007FF65F89110B 41 B8 02 00 00 00 mov r8d,2
00007FF65F891111 8B D7 mov edx,edi
00007FF65F891113 48 8B CB mov rcx,rbx
00007FF65F891116 E8 E5 FE FF FF call kernel (07FF65F891000h)
00007FF65F89111B FF CF dec edi
00007FF65F89111D 48 83 C3 04 add rbx,4
00007FF65F891121 85 F6 test esi,esi
00007FF65F891123 75 BB jne mode1+20h (07FF65F8910E0h)
00007FF65F891125 48 8B 74 24 38 mov rsi,qword ptr [rsp+38h]
00007FF65F89112A 48 8B 5C 24 30 mov rbx,qword ptr [rsp+30h]
00007FF65F89112F 48 8B 7C 24 20 mov rdi,qword ptr [rsp+20h]
34: }
00007FF65F891134 48 83 C4 28 add rsp,28h
00007FF65F891138 C3 ret
For __forceinline:
31: void mode1(const uint32_t* code,int recursion)
32: {
00007FF670271000 48 83 EC 28 sub rsp,28h
33: kernel(code, recursion, 1);
00007FF670271004 85 D2 test edx,edx
00007FF670271006 74 60 je mode1+68h (07FF670271068h)
00007FF670271008 48 89 5C 24 30 mov qword ptr [rsp+30h],rbx
00007FF67027100D 48 8D 59 04 lea rbx,[rcx+4]
00007FF670271011 48 89 74 24 38 mov qword ptr [rsp+38h],rsi
00007FF670271016 48 89 7C 24 20 mov qword ptr [rsp+20h],rdi
00007FF67027101B 8D 7A FF lea edi,[rdx-1]
00007FF67027101E 66 90 xchg ax,ax
00007FF670271020 8B 4B FC mov ecx,dword ptr [rbx-4]
00007FF670271023 8B F7 mov esi,edi
00007FF670271025 83 E9 01 sub ecx,1
00007FF670271028 74 07 je mode1+31h (07FF670271031h)
00007FF67027102A 83 F9 01 cmp ecx,1
00007FF67027102D 75 2A jne mode1+59h (07FF670271059h)
00007FF67027102F EB 14 jmp mode1+45h (07FF670271045h)
00007FF670271031 8B D7 mov edx,edi
00007FF670271033 48 8B CB mov rcx,rbx
00007FF670271036 E8 C5 FF FF FF call mode1 (07FF670271000h)
00007FF67027103B 8B D7 mov edx,edi
00007FF67027103D 48 8B CB mov rcx,rbx
00007FF670271040 E8 2B 00 00 00 call mode2 (07FF670271070h)
00007FF670271045 8B D7 mov edx,edi
00007FF670271047 48 8B CB mov rcx,rbx
00007FF67027104A E8 21 00 00 00 call mode2 (07FF670271070h)
00007FF67027104F FF CF dec edi
00007FF670271051 48 83 C3 04 add rbx,4
00007FF670271055 85 F6 test esi,esi
00007FF670271057 75 C7 jne mode1+20h (07FF670271020h)
00007FF670271059 48 8B 74 24 38 mov rsi,qword ptr [rsp+38h]
00007FF67027105E 48 8B 5C 24 30 mov rbx,qword ptr [rsp+30h]
00007FF670271063 48 8B 7C 24 20 mov rdi,qword ptr [rsp+20h]
34: }
00007FF670271068 48 83 C4 28 add rsp,28h
00007FF67027106C C3 ret
It seems that with inline the compiler inlined mode2 away entirely, replacing calls to it with calls to a separate kernel function (with mode passed as 2 in r8d). __forceinline forced mode1 and mode2 to be compiled as two function bodies, each with kernel expanded inside it. (This code has no break after the cases, so fall-through is expected.)
Using plain inline yields the same code at O2 as leaving INLINE empty.
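For anyone who wants to try the same thing outside MSVC, here is a minimal sketch of a portable force-inline macro. FORCE_INLINE and the visits counter are names made up for this example, and the kernel is a simplified variant of the one above (it breaks after each case instead of falling through). Whether the optimizer then folds the mode tests in each wrapper is up to the compiler; this only asks for the expansion.

```c
#include <stdint.h>

/* FORCE_INLINE is a name invented for this sketch. */
#if defined(_MSC_VER)
#  define FORCE_INLINE static __forceinline
#elif defined(__GNUC__)
#  define FORCE_INLINE static inline __attribute__((always_inline))
#else
#  define FORCE_INLINE static inline   /* plain hint only */
#endif

int visits[3];   /* visits[m] counts kernel bodies run in mode m */

void mode1(const uint32_t *code, int recursion);
void mode2(const uint32_t *code, int recursion);

FORCE_INLINE void kernel(const uint32_t *code, int recursion, int mode)
{
    if (!recursion) return;
    visits[mode]++;
    switch (*code) {
    case 1:   /* opcode 1 keeps the current mode */
        if (mode == 1) mode1(code + 1, recursion - 1);
        else           mode2(code + 1, recursion - 1);
        break;
    case 2:   /* opcode 2 switches to the other mode */
        if (mode == 1) mode2(code + 1, recursion - 1);
        else           mode1(code + 1, recursion - 1);
        break;
    default:
        return;
    }
}

/* mode is a compile-time constant in each wrapper. */
void mode1(const uint32_t *code, int recursion) { kernel(code, recursion, 1); }
void mode2(const uint32_t *code, int recursion) { kernel(code, recursion, 2); }
```

Calling mode1 on the opcode array {1,2,1,1,2} with a recursion depth of 5 runs the kernel body twice in mode 1 and three times in mode 2.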

Is there a command execution vulnerability in this C program?

So I am working on a challenge problem to find a vulnerability in a C program binary that allows a command to be executed by the program (using the effective UID in Linux).
I am really struggling to find how to do this with this particular program.
The disassembly of the function in question (main function):
**************************************************************
* *
* FUNCTION *
**************************************************************
int __cdecl main(int argc, char * * argv)
int EAX:4 <RETURN>
int Stack[0x4]:4 argc
char * * Stack[0x8]:4 argv XREF[2]: 000109b0(R),
000109dd(R)
undefined4 Stack[-0x8]:4 local_8 XREF[1]: 00010bcb(R)
int Stack[-0xc]:4 in XREF[5]: 000109f0(W),
000109f3(R),
00010ad4(R),
00010b27(R),
00010b59(R)
int Stack[-0x10]:4 fd XREF[6]: 00010a1f(W),
00010a22(R),
00010aa5(R),
00010ab2(R),
00010ac9(R),
00010b4e(R)
pid_t Stack[-0x14]:4 pid XREF[4]: 00010a6b(W),
00010a6e(R),
00010a8b(R),
00010b6a(R)
int[2] Stack[-0x1c]:8 pipefd XREF[3,3]: 00010a3f(*),
00010a95(R),
00010b42(R),
00010abd(R),
00010b0f(R),
00010b36(R)
char Stack[-0x1d]:1 c XREF[2]: 00010b14(*),
00010b23(*)
int Stack[-0x24]:4 status XREF[2]: 00010b66(*),
00010b75(R)
main XREF[5]: Entry Point(*),
_start:00010866(*), 00010d30,
00010da0(*), 00011f34(*)
0001097d 55 PUSH EBP
0001097e 89 e5 MOV EBP,ESP
00010980 53 PUSH EBX
00010981 83 ec 1c SUB ESP,0x1c
00010984 e8 87 16 CALL <EXTERNAL>::geteuid __uid_t geteuid(void)
00 00
00010989 89 c3 MOV EBX,EAX
0001098b e8 80 16 CALL <EXTERNAL>::geteuid __uid_t geteuid(void)
00 00
00010990 53 PUSH EBX
00010991 50 PUSH EAX
00010992 e8 9d 16 CALL <EXTERNAL>::setreuid int setreuid(__uid_t __ruid, __u
00 00
00010997 83 c4 08 ADD ESP,0x8
0001099a e8 75 16 CALL <EXTERNAL>::getegid __gid_t getegid(void)
00 00
0001099f 89 c3 MOV EBX,EAX
000109a1 e8 6e 16 CALL <EXTERNAL>::getegid __gid_t getegid(void)
00 00
000109a6 53 PUSH EBX
000109a7 50 PUSH EAX
000109a8 e8 9b 16 CALL <EXTERNAL>::setregid int setregid(__gid_t __rgid, __g
00 00
000109ad 83 c4 08 ADD ESP,0x8
000109b0 8b 45 0c MOV EAX,dword ptr [EBP + argv]
000109b3 83 c0 04 ADD EAX,0x4
000109b6 8b 00 MOV EAX,dword ptr [EAX]
000109b8 85 c0 TEST EAX,EAX
000109ba 75 21 JNZ LAB_000109dd
000109bc a1 98 1f MOV EAX,[stderr]
01 00
000109c1 50 PUSH EAX
000109c2 6a 22 PUSH 0x22
000109c4 6a 01 PUSH 0x1
000109c6 68 50 0c PUSH s_Please_specify_the_file_to_verif_00010c50 = "Please specify the file to ve
01 00
000109cb e8 50 16 CALL <EXTERNAL>::fwrite size_t fwrite(void * __ptr, size
00 00
000109d0 83 c4 10 ADD ESP,0x10
000109d3 b8 01 00 MOV EAX,0x1
00 00
000109d8 e9 ee 01 JMP LAB_00010bcb
00 00
LAB_000109dd XREF[1]: 000109ba(j)
000109dd 8b 45 0c MOV EAX,dword ptr [EBP + argv]
000109e0 83 c0 04 ADD EAX,0x4
000109e3 8b 00 MOV EAX,dword ptr [EAX]
000109e5 6a 00 PUSH 0x0
000109e7 50 PUSH EAX
000109e8 e8 43 16 CALL <EXTERNAL>::open int open(char * __file, int __of
00 00
000109ed 83 c4 08 ADD ESP,0x8
000109f0 89 45 f8 MOV dword ptr [EBP + in],EAX
000109f3 83 7d f8 00 CMP dword ptr [EBP + in],0x0
000109f7 79 17 JNS LAB_00010a10
000109f9 68 73 0c PUSH DAT_00010c73 = 6Fh o
01 00
000109fe e8 19 16 CALL <EXTERNAL>::perror void perror(char * __s)
00 00
00010a03 83 c4 04 ADD ESP,0x4
00010a06 b8 02 00 MOV EAX,0x2
00 00
00010a0b e9 bb 01 JMP LAB_00010bcb
00 00
LAB_00010a10 XREF[1]: 000109f7(j)
00010a10 6a 02 PUSH 0x2
00010a12 68 78 0c PUSH s_/dev/null_00010c78 = "/dev/null"
01 00
00010a17 e8 14 16 CALL <EXTERNAL>::open int open(char * __file, int __of
00 00
00010a1c 83 c4 08 ADD ESP,0x8
00010a1f 89 45 f4 MOV dword ptr [EBP + fd],EAX
00010a22 83 7d f4 00 CMP dword ptr [EBP + fd],0x0
00010a26 79 17 JNS LAB_00010a3f
00010a28 68 73 0c PUSH DAT_00010c73 = 6Fh o
01 00
00010a2d e8 ea 15 CALL <EXTERNAL>::perror void perror(char * __s)
00 00
00010a32 83 c4 04 ADD ESP,0x4
00010a35 b8 05 00 MOV EAX,0x5
00 00
00010a3a e9 8c 01 JMP LAB_00010bcb
00 00
LAB_00010a3f XREF[1]: 00010a26(j)
00010a3f 8d 45 e8 LEA EAX=>pipefd,[EBP + -0x18]
00010a42 50 PUSH EAX
00010a43 e8 f8 15 CALL <EXTERNAL>::pipe int pipe(int * __pipedes)
00 00
00010a48 83 c4 04 ADD ESP,0x4
00010a4b 85 c0 TEST EAX,EAX
00010a4d 79 17 JNS LAB_00010a66
00010a4f 68 82 0c PUSH DAT_00010c82 = 70h p
01 00
00010a54 e8 c3 15 CALL <EXTERNAL>::perror void perror(char * __s)
00 00
00010a59 83 c4 04 ADD ESP,0x4
00010a5c b8 03 00 MOV EAX,0x3
00 00
00010a61 e9 65 01 JMP LAB_00010bcb
00 00
LAB_00010a66 XREF[1]: 00010a4d(j)
00010a66 e8 d9 15 CALL <EXTERNAL>::fork __pid_t fork(void)
00 00
00010a6b 89 45 f0 MOV dword ptr [EBP + pid],EAX
00010a6e 83 7d f0 00 CMP dword ptr [EBP + pid],0x0
00010a72 79 17 JNS LAB_00010a8b
00010a74 68 87 0c PUSH DAT_00010c87 = 66h f
01 00
00010a79 e8 9e 15 CALL <EXTERNAL>::perror void perror(char * __s)
00 00
00010a7e 83 c4 04 ADD ESP,0x4
00010a81 b8 04 00 MOV EAX,0x4
00 00
00010a86 e9 40 01 JMP LAB_00010bcb
00 00
LAB_00010a8b XREF[1]: 00010a72(j)
00010a8b 83 7d f0 00 CMP dword ptr [EBP + pid],0x0
00010a8f 0f 85 8c JNZ LAB_00010b21
00 00 00
00010a95 8b 45 e8 MOV EAX,dword ptr [EBP + pipefd[0]]
00010a98 6a 00 PUSH 0x0
00010a9a 50 PUSH EAX
00010a9b e8 60 15 CALL <EXTERNAL>::dup2 int dup2(int __fd, int __fd2)
00 00
00010aa0 83 c4 08 ADD ESP,0x8
00010aa3 6a 01 PUSH 0x1
00010aa5 ff 75 f4 PUSH dword ptr [EBP + fd]
00010aa8 e8 53 15 CALL <EXTERNAL>::dup2 int dup2(int __fd, int __fd2)
00 00
00010aad 83 c4 08 ADD ESP,0x8
00010ab0 6a 02 PUSH 0x2
00010ab2 ff 75 f4 PUSH dword ptr [EBP + fd]
00010ab5 e8 46 15 CALL <EXTERNAL>::dup2 int dup2(int __fd, int __fd2)
00 00
00010aba 83 c4 08 ADD ESP,0x8
00010abd 8b 45 ec MOV EAX,dword ptr [EBP + pipefd[1]]
00010ac0 50 PUSH EAX
00010ac1 e8 8a 15 CALL <EXTERNAL>::close int close(int __fd)
00 00
00010ac6 83 c4 04 ADD ESP,0x4
00010ac9 ff 75 f4 PUSH dword ptr [EBP + fd]
00010acc e8 7f 15 CALL <EXTERNAL>::close int close(int __fd)
00 00
00010ad1 83 c4 04 ADD ESP,0x4
00010ad4 ff 75 f8 PUSH dword ptr [EBP + in]
00010ad7 e8 74 15 CALL <EXTERNAL>::close int close(int __fd)
00 00
00010adc 83 c4 04 ADD ESP,0x4
00010adf 6a 00 PUSH 0x0
00010ae1 68 8c 0c PUSH s_-asxml_00010c8c = "-asxml"
01 00
00010ae6 68 93 0c PUSH DAT_00010c93 = 74h t
01 00
00010aeb 68 93 0c PUSH DAT_00010c93 = 74h t
01 00
00010af0 e8 17 15 CALL <EXTERNAL>::execlp int execlp(char * __file, char *
00 00
00010af5 83 c4 10 ADD ESP,0x10
00010af8 68 98 0c PUSH s_execlp_00010c98 = "execlp"
01 00
00010afd e8 1a 15 CALL <EXTERNAL>::perror void perror(char * __s)
00 00
00010b02 83 c4 04 ADD ESP,0x4
00010b05 b8 05 00 MOV EAX,0x5
00 00
00010b0a e9 bc 00 JMP LAB_00010bcb
00 00
LAB_00010b0f XREF[1]: 00010b34(j)
00010b0f 8b 45 ec MOV EAX,dword ptr [EBP + pipefd[1]]
00010b12 6a 01 PUSH 0x1
00010b14 8d 55 e7 LEA EDX=>c,[EBP + -0x19]
00010b17 52 PUSH EDX
00010b18 50 PUSH EAX
00010b19 e8 1e 15 CALL <EXTERNAL>::write ssize_t write(int __fd, void * _
00 00
00010b1e 83 c4 0c ADD ESP,0xc
LAB_00010b21 XREF[1]: 00010a8f(j)
00010b21 6a 01 PUSH 0x1
00010b23 8d 45 e7 LEA EAX=>c,[EBP + -0x19]
00010b26 50 PUSH EAX
00010b27 ff 75 f8 PUSH dword ptr [EBP + in]
00010b2a e8 d5 14 CALL <EXTERNAL>::read ssize_t read(int __fd, void * __
00 00
00010b2f 83 c4 0c ADD ESP,0xc
00010b32 85 c0 TEST EAX,EAX
00010b34 75 d9 JNZ LAB_00010b0f
00010b36 8b 45 ec MOV EAX,dword ptr [EBP + pipefd[1]]
00010b39 50 PUSH EAX
00010b3a e8 11 15 CALL <EXTERNAL>::close int close(int __fd)
00 00
00010b3f 83 c4 04 ADD ESP,0x4
00010b42 8b 45 e8 MOV EAX,dword ptr [EBP + pipefd[0]]
00010b45 50 PUSH EAX
00010b46 e8 05 15 CALL <EXTERNAL>::close int close(int __fd)
00 00
00010b4b 83 c4 04 ADD ESP,0x4
00010b4e ff 75 f4 PUSH dword ptr [EBP + fd]
00010b51 e8 fa 14 CALL <EXTERNAL>::close int close(int __fd)
00 00
00010b56 83 c4 04 ADD ESP,0x4
00010b59 ff 75 f8 PUSH dword ptr [EBP + in]
00010b5c e8 ef 14 CALL <EXTERNAL>::close int close(int __fd)
00 00
00010b61 83 c4 04 ADD ESP,0x4
00010b64 6a 00 PUSH 0x0
00010b66 8d 45 e0 LEA EAX=>status,[EBP + -0x20]
00010b69 50 PUSH EAX
00010b6a ff 75 f0 PUSH dword ptr [EBP + pid]
00010b6d e8 b2 14 CALL <EXTERNAL>::waitpid __pid_t waitpid(__pid_t __pid, i
00 00
00010b72 83 c4 0c ADD ESP,0xc
00010b75 8b 45 e0 MOV EAX,dword ptr [EBP + status]
00010b78 c1 f8 08 SAR EAX,0x8
00010b7b 0f b6 c0 MOVZX EAX,AL
00010b7e 83 f8 01 CMP EAX,0x1
00010b81 74 18 JZ LAB_00010b9b
00010b83 83 f8 02 CMP EAX,0x2
00010b86 74 22 JZ LAB_00010baa
00010b88 85 c0 TEST EAX,EAX
00010b8a 75 2d JNZ LAB_00010bb9
00010b8c 68 9f 0c PUSH DAT_00010c9f = 4Fh O
01 00
00010b91 e8 92 14 CALL <EXTERNAL>::puts int puts(char * __s)
00 00
00010b96 83 c4 04 ADD ESP,0x4
00010b99 eb 2b JMP LAB_00010bc6
LAB_00010b9b XREF[1]: 00010b81(j)
00010b9b 68 a4 0c PUSH s_Your_file_is_not_completely_comp_00010ca4 = "Your file is not completely c
01 00
00010ba0 e8 83 14 CALL <EXTERNAL>::puts int puts(char * __s)
00 00
00010ba5 83 c4 04 ADD ESP,0x4
00010ba8 eb 1c JMP LAB_00010bc6
LAB_00010baa XREF[1]: 00010b86(j)
00010baa 68 ca 0c PUSH s_Your_file_contains_errors_00010cca = "Your file contains errors"
01 00
00010baf e8 74 14 CALL <EXTERNAL>::puts int puts(char * __s)
00 00
00010bb4 83 c4 04 ADD ESP,0x4
00010bb7 eb 0d JMP LAB_00010bc6
LAB_00010bb9 XREF[1]: 00010b8a(j)
00010bb9 68 e4 0c PUSH s_I_can't_tell_if_your_file_is_XHT_00010ce4 = "I can't tell if your file is
01 00
00010bbe e8 65 14 CALL <EXTERNAL>::puts int puts(char * __s)
00 00
00010bc3 83 c4 04 ADD ESP,0x4
LAB_00010bc6 XREF[3]: 00010b99(j), 00010ba8(j),
00010bb7(j)
00010bc6 b8 00 00 MOV EAX,0x0
00 00
LAB_00010bcb XREF[6]: 000109d8(j), 00010a0b(j),
00010a3a(j), 00010a61(j),
00010a86(j), 00010b0a(j)
00010bcb 8b 5d fc MOV EBX,dword ptr [EBP + local_8]
00010bce c9 LEAVE
00010bcf c3 RET
According to Ghidra, this decompiles to:
int main(int argc,char **argv)
{
__uid_t __euid;
__uid_t __ruid;
__gid_t __egid;
__gid_t __rgid;
int iVar1;
int __fd;
int iVar2;
__pid_t __pid;
ssize_t sVar3;
uint uVar4;
int status;
char c;
int pipefd [2];
pid_t pid;
int fd;
int in;
__euid = geteuid();
__ruid = geteuid();
setreuid(__ruid,__euid);
__egid = getegid();
__rgid = getegid();
setregid(__rgid,__egid);
if (argv[1] == (char *)0x0) {
fwrite("Please specify the file to verify\n",1,0x22,stderr);
iVar1 = 1;
}
else {
iVar1 = open(argv[1],0);
if (iVar1 < 0) {
perror("open");
iVar1 = 2;
}
else {
__fd = open("/dev/null",2);
if (__fd < 0) {
perror("open");
iVar1 = 5;
}
else {
iVar2 = pipe(pipefd);
if (iVar2 < 0) {
perror("pipe");
iVar1 = 3;
}
else {
__pid = fork();
if (__pid < 0) {
perror("fork");
iVar1 = 4;
}
else if (__pid == 0) {
dup2(pipefd[0],0);
dup2(__fd,1);
dup2(__fd,2);
close(pipefd[1]);
close(__fd);
close(iVar1);
execlp("tidy","tidy","-asxml",0);
perror("execlp");
iVar1 = 5;
}
else {
while( true ) {
sVar3 = read(iVar1,&c,1);
if (sVar3 == 0) break;
write(pipefd[1],&c,1);
}
close(pipefd[1]);
close(pipefd[0]);
close(__fd);
close(iVar1);
waitpid(__pid,&status,0);
uVar4 = status >> 8 & 0xff;
if (uVar4 == 1) {
puts("Your file is not completely compliant");
}
else if (uVar4 == 2) {
puts("Your file contains errors");
}
else if (uVar4 == 0) {
puts("OK!");
}
else {
puts("I can\'t tell if your file is XHTML-compliant");
}
iVar1 = 0;
}
}
}
}
}
return iVar1;
}
It appears it is (to summarize) opening the file passed as the first argument using open in read only mode. If successful, it is forking and using the child process to execute tidy to validate the file is valid XHTML.
Nothing about it stands out to me as an obvious vulnerability that I can use here. I've looked into vulnerabilities for the tidy command, but wasn't really able to find anything useful for this.
Any help would be much appreciated!
In regular C code, execlp("tidy","tidy","-asxml",0); is incorrect as execlp() expects a null pointer argument to mark the end of the argument list.
0 is a null pointer when used in a pointer context, which this is not. Yet on architectures where pointers have the same size and passing convention as int, such as 32-bit Linux, passing 0 or passing NULL generates the same code, so sloppiness does not get punished.
In 64-bit mode, it would be incorrect to do so but you might get lucky with the x86_64 ABI and a 64-bit 0 value will be passed in this case.
In your own code, avoid such pitfalls and use NULL or (char *)0 as the last argument for execlp(). But on this listing, Ghidra produces code that generates the same assembly code, and in 32-bit mode, passing 0 or (char *)0 produce the same code, so no problem here.
In your context, execlp("tidy","tidy","-asxml",0); shows another problem: it will look for an executable program with the name tidy in the current PATH and run this program as tidy with a command line argument -asxml. Since it changed the effective uid and gid, this is a problem if the program is setuid root because you can create a program named tidy in a directory appearing in the PATH variable before the system directories and this program will be run with the modified rights.
Another potential problem is the program does not check for failure of the system calls setreuid() and setregid(). Although these calls are unlikely to fail for the arguments passed, as documented in the manual pages, it is a grave security error to omit checking for a failure return from setreuid(). In case of failure, the real and effective uid (or gid) is not changed and the process may fork and exec with root privileges.
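As an illustration of the three points above, here is a minimal sketch of how the same steps could be written defensively: the return values of setreuid() and setregid() are checked, the program is run by absolute path so that PATH is not searched, and the argument list ends with a properly typed null pointer. The function names and the location /usr/bin/tidy are assumptions made for this example; they do not come from the challenge binary.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumed install location, for illustration only. */
#define TIDY_PATH "/usr/bin/tidy"

/* Copy the effective IDs into the real IDs, as the original does,
 * but report failure. Returns 0 on success, -1 on error. */
int set_ids_checked(void)
{
    uid_t euid = geteuid();
    gid_t egid = getegid();

    if (setreuid(euid, euid) != 0) {
        perror("setreuid");
        return -1;
    }
    if (setregid(egid, egid) != 0) {
        perror("setregid");
        return -1;
    }
    return 0;
}

/* Replace the current process with tidy; only returns on failure. */
void exec_tidy(void)
{
    /* execl() does not search PATH; (char *)NULL is the sentinel. */
    execl(TIDY_PATH, "tidy", "-asxml", (char *)NULL);
    perror("execl");
    _exit(5);
}
```

With execl() and a fixed path, a tidy placed earlier in PATH is no longer picked up, which is exactly the hole the original execlp() call leaves open.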

Why is this code acting different with a single printf? ucontext.h

When I compile my code below it prints
I am running :)
forever (until I send a keyboard interrupt to the program),
but when I uncomment // printf("done:%d\n", done);, recompile and run it, it prints the message only twice, prints done:1 and then returns.
I'm new to ucontext.h and I'm very confused about how this code works and
why a single printf changes the whole behavior of the code. If you replace the printf with done++; the same thing happens, but if you replace it with done = 2; nothing changes and it behaves as it did with the printf commented out.
Can anyone explain:
Why is this code acting like this and what's the logic behind it?
Sorry for my bad English,
Thanks a lot.
#include <ucontext.h>
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
int main()
{
register int done = 0;
ucontext_t one;
ucontext_t two;
getcontext(&one);
printf("I am running :)\n");
sleep(1);
if (!done)
{
done = 1;
swapcontext(&two, &one);
}
// printf("done:%d\n", done);
return 0;
}
This is a compiler optimization "problem". When the "printf()" is commented out, the compiler deduces that "done" is not used after the "if (!done)", so it does not bother to store 1 into it. But when the "printf()" is present, "done" is read after the "if (!done)", so the compiler performs the store.
Assembly code with the "printf()":
$ gcc ctx.c -o ctx -g
$ objdump -S ctx
[...]
int main(void)
{
11e9: f3 0f 1e fa endbr64
11ed: 55 push %rbp
11ee: 48 89 e5 mov %rsp,%rbp
11f1: 48 81 ec b0 07 00 00 sub $0x7b0,%rsp
11f8: 64 48 8b 04 25 28 00 mov %fs:0x28,%rax
11ff: 00 00
1201: 48 89 45 f8 mov %rax,-0x8(%rbp)
1205: 31 c0 xor %eax,%eax
register int done = 0;
1207: c7 85 5c f8 ff ff 00 movl $0x0,-0x7a4(%rbp) <------- done set to 0
120e: 00 00 00
ucontext_t one;
ucontext_t two;
getcontext(&one);
1211: 48 8d 85 60 f8 ff ff lea -0x7a0(%rbp),%rax
1218: 48 89 c7 mov %rax,%rdi
121b: e8 c0 fe ff ff callq 10e0 <getcontext@plt>
1220: f3 0f 1e fa endbr64
printf("I am running :)\n");
1224: 48 8d 3d d9 0d 00 00 lea 0xdd9(%rip),%rdi # 2004 <_IO_stdin_used+0x4>
122b: e8 70 fe ff ff callq 10a0 <puts@plt>
sleep(1);
1230: bf 01 00 00 00 mov $0x1,%edi
1235: e8 b6 fe ff ff callq 10f0 <sleep@plt>
if (!done)
123a: 83 bd 5c f8 ff ff 00 cmpl $0x0,-0x7a4(%rbp)
1241: 75 27 jne 126a <main+0x81>
{
done = 1;
1243: c7 85 5c f8 ff ff 01 movl $0x1,-0x7a4(%rbp) <----- done set to 1
124a: 00 00 00
swapcontext(&two, &one);
124d: 48 8d 95 60 f8 ff ff lea -0x7a0(%rbp),%rdx
1254: 48 8d 85 30 fc ff ff lea -0x3d0(%rbp),%rax
125b: 48 89 d6 mov %rdx,%rsi
125e: 48 89 c7 mov %rax,%rdi
1261: e8 6a fe ff ff callq 10d0 <swapcontext@plt>
1266: f3 0f 1e fa endbr64
}
printf("done:%d\n", done);
126a: 8b b5 5c f8 ff ff mov -0x7a4(%rbp),%esi
1270: 48 8d 3d 9d 0d 00 00 lea 0xd9d(%rip),%rdi # 2014 <_IO_stdin_used+0x14>
1277: b8 00 00 00 00 mov $0x0,%eax
127c: e8 3f fe ff ff callq 10c0 <printf@plt>
return 0;
Assembly code without the "printf()":
$ gcc ctx.c -o ctx -g
$ objdump -S ctx
[...]
int main(void)
{
11c9: f3 0f 1e fa endbr64
11cd: 55 push %rbp
11ce: 48 89 e5 mov %rsp,%rbp
11d1: 48 81 ec b0 07 00 00 sub $0x7b0,%rsp
11d8: 64 48 8b 04 25 28 00 mov %fs:0x28,%rax
11df: 00 00
11e1: 48 89 45 f8 mov %rax,-0x8(%rbp)
11e5: 31 c0 xor %eax,%eax
register int done = 0;
11e7: c7 85 5c f8 ff ff 00 movl $0x0,-0x7a4(%rbp) <------ done set to 0
11ee: 00 00 00
ucontext_t one;
ucontext_t two;
getcontext(&one);
11f1: 48 8d 85 60 f8 ff ff lea -0x7a0(%rbp),%rax
11f8: 48 89 c7 mov %rax,%rdi
11fb: e8 c0 fe ff ff callq 10c0 <getcontext@plt>
1200: f3 0f 1e fa endbr64
printf("I am running :)\n");
1204: 48 8d 3d f9 0d 00 00 lea 0xdf9(%rip),%rdi # 2004 <_IO_stdin_used+0x4>
120b: e8 80 fe ff ff callq 1090 <puts@plt>
sleep(1);
1210: bf 01 00 00 00 mov $0x1,%edi
1215: e8 b6 fe ff ff callq 10d0 <sleep@plt>
if (!done)
121a: 83 bd 5c f8 ff ff 00 cmpl $0x0,-0x7a4(%rbp)
1221: 75 1d jne 1240 <main+0x77>
{
done = 1; <------------- done is not set here (the store is optimized away by the compiler)
swapcontext(&two, &one);
1223: 48 8d 95 60 f8 ff ff lea -0x7a0(%rbp),%rdx
122a: 48 8d 85 30 fc ff ff lea -0x3d0(%rbp),%rax
1231: 48 89 d6 mov %rdx,%rsi
1234: 48 89 c7 mov %rax,%rdi
1237: e8 74 fe ff ff callq 10b0 <swapcontext@plt>
123c: f3 0f 1e fa endbr64
}
//printf("done:%d\n", done);
return 0;
1240: b8 00 00 00 00 mov $0x0,%eax
}
1245: 48 8b 4d f8 mov -0x8(%rbp),%rcx
1249: 64 48 33 0c 25 28 00 xor %fs:0x28,%rcx
1250: 00 00
1252: 74 05 je 1259 <main+0x90>
1254: e8 47 fe ff ff callq 10a0 <__stack_chk_fail@plt>
1259: c9 leaveq
125a: c3 retq
125b: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
To disable the optimization on "done", add the "volatile" keyword in its definition:
volatile register int done = 0;
This makes the program work in both cases.
(There is some overlap with Rachid K's answer as it was posted while I was writing this.)
I am guessing you are declaring done as register in hopes that it will actually be put in a register, so that its value will be saved and restored by the context switch. But the compiler is never obliged to honor this; most modern compilers ignore register declarations completely and make their own decisions about register usage. And in particular, gcc without optimizations will nearly always put local variables in memory on the stack.
As such, in your test case, the value of done is not restored by the context switch. So when getcontext returns for the second time, done has the same value as when swapcontext was called.
When the printf is present, as Rachid also points out, the done = 1 is actually stored before the swapcontext, so on the second return of getcontext, done has the value 1, the if block is skipped, and the program prints done:1 and exits.
However, when the printf is absent, the compiler notices that the value of done is never used after its assignment (since it assumes swapcontext is a normal function and doesn't know that it will actually return somewhere else), so it optimizes out the dead store (yes, even though optimizations are off). Thus we have done == 0 when getcontext returns the second time, and you get an infinite loop. This is maybe what you were expecting if you thought done would be placed in a register, but if so, you got the "right" behavior for the wrong reason.
If you enable optimizations, you'll see something else again: the compiler notices that done can't be affected by the call to getcontext (again assuming it's a normal function call) and therefore it is guaranteed to be 0 at the if. So the test need not be done at all, because it will always be true. The swapcontext is then executed unconditionally, and as for done, it's optimized completely out of existence, because it no longer has any effect on the code. You'll again see an infinite loop.
Because of this issue, you really can't make any safe assumptions about local variables that have been modified in between the getcontext and swapcontext. When getcontext returns for the second time, you might or might not see the changes. There are further issues if the compiler chose to reorder some of your code around the function call (which it knows no reason not to do, since again it thinks these are ordinary function calls that can't see your local variables).
The only way to get any certainty is to declare a variable volatile. Then you can be sure that intermediate changes will be seen, and the compiler will not assume that getcontext can't change it. The value seen at the second return of getcontext will be the same as at the call to swapcontext. If you write volatile int done = 0; you ought to see just two "I am running" messages, regardless of other code or optimization settings.
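Here is a minimal sketch of the volatile version, wrapped in a function so that the number of passes can be observed. The function name count_passes and the passes counter are made up for this example, and it assumes a libc that provides ucontext.h (glibc does).

```c
#include <ucontext.h>

/* Returns how many times execution passed the point just after
 * getcontext(). */
int count_passes(void)
{
    volatile int done = 0;     /* volatile: every access goes to memory */
    volatile int passes = 0;
    ucontext_t one;
    ucontext_t two;

    getcontext(&one);
    passes++;                  /* runs once per return from getcontext() */
    if (!done) {
        done = 1;              /* this store cannot be dropped */
        swapcontext(&two, &one);   /* resumes just after getcontext(&one) */
    }
    return passes;
}
```

Because done lives in memory and is not part of the saved register state, the second return from getcontext sees done == 1, skips the if block, and the function returns 2.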

Understanding array declaration in C

I'm trying to understand how the C Standard explains that the declaration can cause an error. Consider the following pretty simple code:
int main()
{
char test[1024 * 1024 * 1024];
test[0] = 0;
return 0;
}
Demo
This segfaults. But the following code does not:
int main()
{
char test[1024 * 1024 * 1024];
return 0;
}
Demo
But when I compiled it on my machine, the latter one segfaulted too. The main function looks like this:
00000000000008c6 <main>:
8c6: 55 push %rbp
8c7: 48 89 e5 mov %rsp,%rbp
8ca: 48 81 ec 20 00 00 40 sub $0x40000020,%rsp
8d1: 89 bd ec ff ff bf mov %edi,-0x40000014(%rbp) // <---HERE
8d7: 48 89 b5 e0 ff ff bf mov %rsi,-0x40000020(%rbp)
8de: 64 48 8b 04 25 28 00 mov %fs:0x28,%rax
8e5: 00 00
8e7: 48 89 45 f8 mov %rax,-0x8(%rbp)
8eb: 31 c0 xor %eax,%eax
8ed: b8 00 00 00 00 mov $0x0,%eax
8f2: 48 8b 55 f8 mov -0x8(%rbp),%rdx
8f6: 64 48 33 14 25 28 00 xor %fs:0x28,%rdx
8fd: 00 00
8ff: 74 05 je 906 <main+0x40>
901: e8 1a fe ff ff callq 720 <__stack_chk_fail@plt>
906: c9 leaveq
907: c3 retq
908: 0f 1f 84 00 00 00 00 nopl 0x0(%rax,%rax,1)
90f: 00
As far as I understand, the segfault occurred when executing mov %edi,-0x40000014(%rbp).
I tried to find the explanation in N1570, Section 6.7.9 Initialization, but it does not seem to be the relevant one.
So how does the Standard explain this behavior?
The result is implementation-dependent.
I can think of several reasons why the behaviour could differ:
The compiler sees that the variable isn't used and has no possible side effects, so it optimizes it away (even without optimization flags).
The stack is grown on demand. Since there are no writes to this variable yet, why grow the stack now?
Compilers don't have to use the stack for automatic storage. A compiler could allocate the memory with malloc and free it on exit. Using the heap would allow allocating 1 GB without issues.
The stack size limit is set to 1 GB :)
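To make the stack-versus-heap points concrete, here is a small POSIX-flavoured sketch. The helper names fits_in_stack_limit and touch_heap_buffer are invented for this example. Comparing against RLIMIT_STACK is only a rough estimate, since the limit covers the whole stack and not just one array.

```c
#include <stddef.h>
#include <stdlib.h>
#include <sys/resource.h>

/* Hypothetical helper: returns 1 if `bytes` is below the current soft
 * stack limit (or no limit is set), 0 otherwise or on error. */
int fits_in_stack_limit(size_t bytes)
{
    struct rlimit lim;

    if (getrlimit(RLIMIT_STACK, &lim) != 0)
        return 0;                       /* could not query the limit */
    if (lim.rlim_cur == RLIM_INFINITY)
        return 1;                       /* no soft limit configured */
    return bytes < lim.rlim_cur;
}

/* Hypothetical helper: does what the first test program does (write to
 * element 0), but with heap storage. Returns 1 on success, 0 on failure. */
int touch_heap_buffer(size_t bytes)
{
    char *test = malloc(bytes);

    if (test == NULL)
        return 0;                       /* allocation failure is reported, not a crash */
    test[0] = 0;
    free(test);
    return 1;
}
```

On a system whose soft stack limit is a few megabytes, fits_in_stack_limit((size_t)1024 * 1024 * 1024) would be expected to return 0, while touch_heap_buffer with the same size will often succeed. Neither outcome is promised by the Standard.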

Why does this code prevent gcc & llvm from tail-call optimization?

I have tried the following code with gcc 4.4.5 on Linux and with gcc-llvm on Mac OS X (Xcode 4.2.1). Below are the source and the generated disassembly of the relevant functions. (Added: compiled with gcc -O2 main.c)
#include <stdio.h>
__attribute__((noinline))
static void g(long num)
{
long m, n;
printf("%p %ld\n", &m, n);
return g(num-1);
}
__attribute__((noinline))
static void h(long num)
{
long m, n;
printf("%ld %ld\n", m, n);
return h(num-1);
}
__attribute__((noinline))
static void f(long * num)
{
scanf("%ld", num);
g(*num);
h(*num);
return f(num);
}
int main(void)
{
printf("int:%lu long:%lu unsigned:%lu\n", sizeof(int), sizeof(long), sizeof(unsigned));
long num;
f(&num);
return 0;
}
08048430 <g>:
8048430: 55 push %ebp
8048431: 89 e5 mov %esp,%ebp
8048433: 53 push %ebx
8048434: 89 c3 mov %eax,%ebx
8048436: 83 ec 24 sub $0x24,%esp
8048439: 8d 45 f4 lea -0xc(%ebp),%eax
804843c: c7 44 24 08 00 00 00 movl $0x0,0x8(%esp)
8048443: 00
8048444: 89 44 24 04 mov %eax,0x4(%esp)
8048448: c7 04 24 d0 85 04 08 movl $0x80485d0,(%esp)
804844f: e8 f0 fe ff ff call 8048344 <printf@plt>
8048454: 8d 43 ff lea -0x1(%ebx),%eax
8048457: e8 d4 ff ff ff call 8048430 <g>
804845c: 83 c4 24 add $0x24,%esp
804845f: 5b pop %ebx
8048460: 5d pop %ebp
8048461: c3 ret
8048462: 8d b4 26 00 00 00 00 lea 0x0(%esi,%eiz,1),%esi
8048469: 8d bc 27 00 00 00 00 lea 0x0(%edi,%eiz,1),%edi
08048470 <h>:
8048470: 55 push %ebp
8048471: 89 e5 mov %esp,%ebp
8048473: 83 ec 18 sub $0x18,%esp
8048476: 66 90 xchg %ax,%ax
8048478: c7 44 24 08 00 00 00 movl $0x0,0x8(%esp)
804847f: 00
8048480: c7 44 24 04 00 00 00 movl $0x0,0x4(%esp)
8048487: 00
8048488: c7 04 24 d8 85 04 08 movl $0x80485d8,(%esp)
804848f: e8 b0 fe ff ff call 8048344 <printf@plt>
8048494: eb e2 jmp 8048478 <h+0x8>
8048496: 8d 76 00 lea 0x0(%esi),%esi
8048499: 8d bc 27 00 00 00 00 lea 0x0(%edi,%eiz,1),%edi
080484a0 <f>:
80484a0: 55 push %ebp
80484a1: 89 e5 mov %esp,%ebp
80484a3: 53 push %ebx
80484a4: 89 c3 mov %eax,%ebx
80484a6: 83 ec 14 sub $0x14,%esp
80484a9: 8d b4 26 00 00 00 00 lea 0x0(%esi,%eiz,1),%esi
80484b0: 89 5c 24 04 mov %ebx,0x4(%esp)
80484b4: c7 04 24 e1 85 04 08 movl $0x80485e1,(%esp)
80484bb: e8 94 fe ff ff call 8048354 <__isoc99_scanf@plt>
80484c0: 8b 03 mov (%ebx),%eax
80484c2: e8 69 ff ff ff call 8048430 <g>
80484c7: 8b 03 mov (%ebx),%eax
80484c9: e8 a2 ff ff ff call 8048470 <h>
80484ce: eb e0 jmp 80484b0 <f+0x10>
We can see that g() and h() are mostly identical, except for the & (address-of) operator in front of the argument m of printf() (and the irrelevant difference between %ld and %p).
However, h() is tail-call optimized and g() is not. Why?
In g(), you're taking the address of a local variable and passing it to a function. A "sufficiently smart compiler" should realize that printf does not store that pointer. Instead, gcc and llvm assume that printf might store the pointer somewhere, so the call frame containing m might need to be "live" further down in the recursion. Therefore, no TCO.
It's the & that does it. It tells the compiler that m must be stored on the stack. Even though it is only passed to printf, the compiler has to assume that it might be accessed by somebody else, and thus it must be cleaned from the stack only after the call to g.
In this particular case, as printf is known by the compiler (and it knows that it does not save pointers), it could probably be taught to perform this optimization.
For more info on this, look up 'escape analysis'.
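To see why the compiler has to be conservative, here is a contrived sketch in which the callee really does keep the pointer. The names remember, walk and outermost_value are hypothetical, and num is assumed to be non-negative. If walk reused a single frame for every level of the recursion, the outermost m would be overwritten and the result would be wrong.

```c
#include <stddef.h>

/* Where the callee stashes the first pointer it is given. */
static long *remembered;

/* Stands in for a function that, unlike printf, does store its argument. */
static void remember(long *p)
{
    if (remembered == NULL)
        remembered = p;
}

/* Same shape as g(): take the address of a local, pass it on, recurse.
 * Assumes num >= 0 so that the recursion terminates. */
static long walk(long num)
{
    long m = num;

    remember(&m);            /* the address of m escapes here */
    if (num == 0)
        return *remembered;  /* reads m of the OUTERMOST call, which must still be live */
    return walk(num - 1);
}

/* Returns the value of m in the outermost frame, i.e. num itself. */
long outermost_value(long num)
{
    long result;

    remembered = NULL;
    result = walk(num);
    remembered = NULL;       /* do not keep a pointer into a dead frame */
    return result;
}
```

Because the innermost call reads a local of the outermost call, every frame in between has to stay alive, which is exactly what rules out reusing the frame for the recursive call.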

Resources