I'm a newbie to x86 assembly (Intel syntax) and have been playing around with some simple instructions using GCC inline assembly. I have managed to manipulate numbers and control flow, and am now tackling standard input and output using interrupts. I am using Mac OS X and forcing 32-bit compilation with the -m32 GCC flag.
I have the following for printing a string to standard output:
char* str = "Hello, World!\n";
int strLen = strlen(str);
asm
{
mov eax, 4
push strLen
push str
push 1
push eax
int 0x80
add esp, 16
}
When compiled and run this prints Hello, World! to the console! However, when I try to do some reading from standard input, things don't work as well:
char* str = (char*)malloc(sizeof(char) * 16);
printf("Please enter your name: ");
asm
{
mov eax, 3
push 16
push str
push 0
push eax
int 0x80
add esp, 16
}
printf("Hello, %s!\n", str);
When run, I get a prompt, but without the "Please enter your name: " string. When I enter some input and hit Enter, the prompt string is printed together with the expected output, e.g.
Please enter your name: Hello, Joe Bloggs
!
How do I get the prompt string to appear in the expected location, before the user enters any input?
printf writes using stdio, which does buffering (i.e., what's written doesn't get output straight away). You need to call fflush(stdout) before you issue your read syscall, since syscalls bypass stdio and know nothing about its buffers.
Also, as Kerrek SB has noted, your asm does not have a clobber list and it's not volatile. That means that gcc is free to relocate your assembly code elsewhere in the function (since it's free to assume your assembly code has no side effects), which may have a different effect from what you expect. I recommend you use asm volatile.
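A minimal sketch of the flush-then-read ordering described above. It swaps the raw int 0x80 call for the POSIX read() wrapper, which bypasses stdio in the same way; the function name prompt_and_read and its fd parameter are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: print a prompt through stdio, then read a line with
 * read(2), which knows nothing about stdio's buffers. */
ssize_t prompt_and_read(int fd, const char *prompt, char *buf, size_t size)
{
    assert(size > 0);
    fputs(prompt, stdout);
    fflush(stdout);                      /* push the prompt out before bypassing stdio */
    ssize_t n = read(fd, buf, size - 1); /* leave room for the terminator */
    if (n < 0)
        return -1;
    buf[n] = '\0';                       /* read() does not terminate the buffer */
    buf[strcspn(buf, "\n")] = '\0';      /* drop the newline typed by the user */
    return (ssize_t)strlen(buf);
}
```

Calling prompt_and_read(STDIN_FILENO, "Please enter your name: ", str, 16) would show the prompt first. Stripping the newline also avoids the stray "!" on its own line in the output above.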
I have a C program that writes a NOP character to stdout:
#include <stdio.h>
int main(int argc, char *argv[]) {
fwrite("\x90", 1, sizeof(char), stdout);
return 0;
}
I also have another program that takes input, which I am running in gdb (so I can view the stack).
After running the first program, I copy the NOP from stdout and paste it into gdb as input for the second program.
When viewing the stack I always get this value:
0x00bdbfef
When it should be
0x00000090
Why is this? The problem also seems to occur with Python, but I cannot pinpoint why.
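The UTF-8 byte sequence ef bf bd (keeping in mind that little-endian architectures store the least significant byte first, which is why it shows up reversed as 0x00bdbfef) encodes U+FFFD, the replacement character, usually drawn as a diamond with a question mark inside.
Most likely your terminal cannot render the byte 0x90 (on its own it is not valid UTF-8), so it displays the replacement character instead. When you select and copy that character, what you paste elsewhere is U+FFFD, not the original byte.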
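To make the byte reversal concrete, here is a small sketch that assembles bytes into a 32-bit word the way a little-endian load would. The helper name load_le32 is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Combine up to four bytes, first byte in memory = least significant byte,
 * which is how a little-endian CPU sees them when it loads a 32-bit word. */
uint32_t load_le32(const unsigned char *bytes, size_t n)
{
    assert(n <= 4);
    uint32_t word = 0;
    for (size_t i = 0; i < n; i++)
        word |= (uint32_t)bytes[i] << (8 * i);
    return word;
}
```

Feeding it the three bytes ef bf bd gives 0x00bdbfef, the value seen on the stack, while the single byte 90 gives the expected 0x00000090.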
I'm learning assembly and reading about the BIT instruction on the MSP430.
When trying to compile this code:
int main (void)
{
while(1){
__asm__("BIT R2, 3");
}
return 0;
}
It says: error: odd operand: -3
Yet when writing __asm__("BIT.B R2, 3"); instead, it works.
Could somebody explain this please?
The instruction BIT R2, 3 uses symbolic mode for the destination address (i.e. an offset from the program counter). If you want to use the immediate value 3, you must mark it with # and put it in the source position, BIT #3, R2, because MSP430 operands are written source first and immediate mode is only available for the source operand.
This fails with BIT and not with BIT.B because BIT does a word operation and you are using an odd address, which is illegal. Word operations must be word aligned (i.e. use even addresses) on the MSP430. Byte operations can operate on any byte address, odd or even.
You can get quite detailed information if you read the User Guide for the family of MCU you are using. For example, for the MSP430x2xxx family you would read the https://www.ti.com/lit/ug/slau144j/slau144j.pdf document, Chapter 3 or 4 depending on whether your MCU has the newer 20-bit address core.
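The alignment rule is easy to state in code. This is only a sketch of the even-address check, not a model of the assembler; both helper names are made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* A word access is legal only at an even address (bit 0 clear). */
bool is_word_aligned(uint16_t addr)
{
    return (addr & 1u) == 0;
}

/* Round an address down to the start of the word that contains it. */
uint16_t containing_word(uint16_t addr)
{
    uint16_t word = (uint16_t)(addr & ~1u);
    assert(is_word_aligned(word));
    return word;
}
```

Address 3 fails the check, which matches why the word form is rejected while the byte form is accepted.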
I am reading The Shellcoder's Handbook and I'm currently at chapter 2, where I have a simple program to exploit by overflowing the expected input and then supplying a new address for the ret instruction so that the function return_input is executed twice.
Here is the simple program, written in C:
void return_input (void)
{
char array[30];
gets (array);
printf("%s\n", array);
}
main()
{
return_input();
return 0;
}
And this is the disassembled version of the main function, where we can see the address used by the call instruction.
I use the following command and input the characters that overflow the buffer, followed by the address that should replace the saved return address.
But as you can see, return_input does not run twice; instead it just prints out a question mark and reports a segmentation fault.
gets reads the terminating newline and replaces it with a null byte, so the return address you intended to write ended up broken by that null byte.
The offset you saw in the disassembly is NOT the real address. You compiled the program as a position-independent executable (PIE), so the real address may look like 0x55555????58a. That is why gdb didn't let you insert a breakpoint: you probably tried b *0x58a or something similar. Compiling with -no-pie would make life easier.
I am working on an assignment on creating an operating system using some assembly functions and a 16-bit C compiler. My task is to print strings on screen using interrupt 0x10. Since interrupts have to be invoked from assembly, I have been provided with an assembly file which contains a function called interrupt that takes five arguments: the interrupt number, and the interrupt parameters passed in AX, BX, CX, and DX.
For example, to print 'Q' with the provided function, I need to write like this:
char al = 'Q';
char ah = 0xE;
int ax = ah*256+al;
interrupt(0x10,ax,0,0,0);
OR, simply:
interrupt(0x10,0xE*256+'Q',0,0,0);
in a C program called kernel.c
My task is to write a function printString(char *chars) in C which takes a string and prints it on screen using the discussed assembly function.
I have done it this way:
void printString(char * chars){
int i = 0;
int l = length(chars);
for(; i < l; i++){
interrupt(0x10,0xE*256+chars[i],0,0,0);
}
}
But it prints the string multiple times instead of once.
When I try to print "Hello World", it's printed 11 times because it contains 11 characters; the same happens with other strings.
I think you need to look for a null character to terminate the read. I've noticed the assembly file does some weird stuff with the character buffers too. I even had multiple characters print when I called the interrupt function directly from main().
Adding the line: while(1); keeps main() from returning. The boot loader executing multiple instances of main() is what causes the repeated output.
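As a sketch of the null-terminator approach, the loop below stops at '\0' instead of relying on a separate length call. The interrupt function here is a hypothetical stand-in that only records the AX values it receives, so the example can run on a hosted compiler; in the real kernel it would be the routine from the provided assembly file.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stub for the course-provided assembly routine:
 * it just logs AX so the calls can be inspected. */
#define AX_LOG_MAX 64
static int ax_log[AX_LOG_MAX];
static size_t ax_count = 0;

void interrupt(int number, int ax, int bx, int cx, int dx)
{
    (void)bx; (void)cx; (void)dx;
    assert(number == 0x10);          /* only the video interrupt is used here */
    if (ax_count < AX_LOG_MAX)
        ax_log[ax_count++] = ax;
}

/* Print each character once, stopping at the terminating null. */
void printString(const char *chars)
{
    for (size_t i = 0; chars[i] != '\0'; i++)
        interrupt(0x10, 0xE * 256 + (unsigned char)chars[i], 0, 0, 0);
}
```

In the kernel itself, main() would still need to end with while(1); so that it never returns to the boot loader.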
This code is meant to take a list of names in a text file and convert them to email form,
so Kate Jones becomes kate.jones@yahoo.com.
This code worked fine on Linux Mint 12, but now the exact same code is giving a segfault on Arch Linux.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main()
{
FILE *fp;
fp = fopen("original.txt", "r+");
if (fp == NULL )
{
printf("error opening file 1");
return (1);
}
char line[100];
char mod[30] = "@yahoo.com\n";
while (fgets(line, 100, fp) != NULL )
{
int i;
for (i = 0; i < 100; ++i)
{
if (line[i] == ' ')
{
line[i] = '.';
}
if (line[i] == '\n')
{
line[i] = '\0';
}
}
strcat(line, mod);
FILE *fp2;
fp2 = fopen("final.txt", "a");
if (fp2 == NULL )
{
printf("error opening file 2");
return (1);
}
if (fp2 != NULL )
{
fputs(line, fp2);
fclose(fp2);
}
}
fclose(fp);
return 0;
}
Arch Linux is a fairly fresh install, could it be that there is something else I didn't install that C will need?
I think the problem occurs when your original string plus mod exceeds 100 characters.
When you call strcat, it simply appends the second string to the end of the first, assuming there is enough room in the first buffer, which clearly isn't the case here.
Just increase the size of line, e.g.:
char line[130]; // 130 might be more than what is required since mod is shorter
Also, it is much better to use strncat,
where you can limit the maximum number of characters appended to dst; otherwise, strcat will write beyond the buffer without complaining if given large enough strings.
A word of caution with strncat: its n argument is the maximum number of characters to append, not the total size of the destination, and it always writes a terminating \0 after them. So n should be at most the destination size minus strlen(dst) minus 1. (It is strncpy, not strncat, that can leave the result unterminated.) Its documentation should be read thoroughly before actual use.
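A minimal sketch of a bounded append built on strncat. The helper name append_bounded is made up, and it assumes dst already holds a terminated string that fits in dst_size bytes.

```c
#include <assert.h>
#include <string.h>

/* Append src to dst without writing past dst_size bytes.
 * Returns 1 if all of src fit, 0 if it had to be truncated. */
int append_bounded(char *dst, size_t dst_size, const char *src)
{
    size_t used = strlen(dst);
    assert(used < dst_size);            /* dst must already be a valid string */
    size_t room = dst_size - used - 1;  /* keep one byte for the terminator */
    strncat(dst, src, room);            /* appends at most room chars plus '\0' */
    return strlen(src) <= room;
}
```

With char line[100], a call such as append_bounded(line, sizeof line, mod) would truncate instead of overflowing, and the return value tells you whether that happened.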
Update: Platform specific note
I thought I'd add: it is sheer coincidence that it didn't segfault on Mint but crashed on Arch. In practice the code invokes undefined behavior and should crash sooner or later. There is nothing platform-specific here.
Firstly, your code isn't producing a segmentation fault. Instead, it triggers stack-smashing detection and prints the following libc message to the console:
*** stack smashing detected ***: _executable-name-with-path_ terminated.
Stack buffer overflow bugs are caused when a program writes more data to a buffer located on the stack than was actually allocated for that buffer.
Stack Smashing Protector (SSP) is a GCC extension for protecting applications from such stack-smashing attacks.
And, as said in the other answers, your problem is resolved by increasing the size of line (the first argument to strcat()), from
char line[100]
to
char line[130]; // size of line must be at least `strlen(line) + strlen(mod) + 1`. Though 130 is not perfect, it is safer
Let's see exactly where the issue hits in your code.
For that, here is the disassembly of your main.
(gdb) disas main
Dump of assembler code for function main:
0x0804857c <+0>: push %ebp
0x0804857d <+1>: mov %esp,%ebp
0x0804857f <+3>: and $0xfffffff0,%esp
0x08048582 <+6>: sub $0xb0,%esp
0x08048588 <+12>: mov %gs:0x14,%eax
0x0804858e <+18>: mov %eax,0xac(%esp)
..... //Leaving out Code after 0x0804858e till 0x08048671
0x08048671 <+245>: call 0x8048430 <strcat@plt>
0x08048676 <+250>: movl $0x80487d5,0x4(%esp)
.... //Leaving out Code after 0x08048676 till 0x08048704
0x08048704 <+392>: mov 0xac(%esp),%edx
0x0804870b <+399>: xor %gs:0x14,%edx
0x08048712 <+406>: je 0x8048719 <main+413>
0x08048714 <+408>: call 0x8048420 <__stack_chk_fail@plt>
0x08048719 <+413>: leave
0x0804871a <+414>: ret
Following the usual function prologue:
Instruction at 0x08048582: the stack grows by 0xb0 (176 in decimal) bytes to make room for main's stack contents.
%gs:0x14 provides the random canary value used for stack protection.
Instruction at 0x08048588: loads the above-mentioned value into the eax register.
Instruction at 0x0804858e: stores the contents of eax (the canary value) on the stack at offset 0xac (172) from %esp.
Set breakpoint (1) at 0x0804858e.
(gdb) break *0x0804858e
Breakpoint 1 at 0x804858e: file program_name.c, line 6.
Run the program:
(gdb) run
Starting program: /path-to-executable/executable-name
Breakpoint 1, 0x0804858e in main () at program_name.c:6
6 {
Once the program pauses at breakpoint (1), retrieve the random canary value by printing the contents of register eax.
(gdb) i r eax
eax 0xa3d24300 -1546501376
Keep a breakpoint(2) at 0x08048671 : Exactly before call strcat().
(gdb) break *0x08048671
Breakpoint 2 at 0x8048671: file program_name.c, line 33.
Continue the program execution to reach the breakpoint (2)
(gdb) continue
Continuing.
Breakpoint 2, 0x08048671 in main () at program_name.c:33
Print the stack slot at $esp + 172, where the canary value was stored, with the following gdb command, to confirm that it is unchanged before strcat() is called.
(gdb) p *(int*)($esp + 172)
$1 = -1546501376
Keep a breakpoint (3) at 0x08048676 : Immediately after returning from call strcat()
(gdb) break *0x08048676
Breakpoint 3 at 0x8048676: file program_name.c, line 36.
Continue the program execution to reach the breakpoint (3)
(gdb) continue
Continuing.
Breakpoint 3, main () at program_name.c:36
Print the same stack slot at $esp + 172 again with the following gdb command, to check whether the canary value was corrupted by the call to strcat().
(gdb) p *(int*)($esp + 172)
$2 = 1869111673
But it is corrupted by calling strcat(). You can see that $1 and $2 are not the same.
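The corrupted value is not random. As a quick check, the sketch below splits $2 (1869111673, i.e. 0x6f686179) into its bytes in little-endian memory order; the helper name le32_to_text is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write the four bytes of word, lowest byte first (x86 memory order),
 * into out as a null-terminated string. */
void le32_to_text(uint32_t word, char out[5])
{
    assert(out != NULL);
    for (int i = 0; i < 4; i++)
        out[i] = (char)((word >> (8 * i)) & 0xFF);
    out[4] = '\0';
}
```

For 1869111673 the bytes are 'y', 'a', 'h', 'o', which suggests the canary slot was overwritten by part of the mod string that strcat() appended.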
Let's see what happens as a result of the corrupted canary value.
Instruction at 0x08048704: loads the corrupted canary value from the stack into the edx register.
Instruction at 0x0804870b: XORs the actual canary value with the contents of edx.
Instruction at 0x08048712: if they are the same, jumps directly to the end of main and returns safely. In our case the canary value is corrupted, so the contents of edx do not match the actual canary value. Hence the jump is not taken and __stack_chk_fail is called, which prints the libc message mentioned at the top of this answer and aborts the application.
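The prologue/epilogue pair can be imitated in plain C. This is only a simulation under assumed sizes (an 8-byte buffer followed directly by a 4-byte canary in one array), not what GCC actually emits; copy_and_check and the size macros are made up.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BUF_SIZE   8                                /* assumed buffer size */
#define FRAME_SIZE (BUF_SIZE + sizeof(uint32_t))    /* buffer, then canary slot */

/* Returns 0 if the canary survived the copy, nonzero if it was overwritten,
 * mirroring the xor/je pair in the epilogue. */
uint32_t copy_and_check(const char *data, size_t len, uint32_t canary)
{
    unsigned char frame[FRAME_SIZE];
    uint32_t stored;

    assert(len <= FRAME_SIZE);                          /* stay inside the simulated frame */
    memcpy(frame + BUF_SIZE, &canary, sizeof canary);   /* prologue: store the canary */
    memcpy(frame, data, len);                           /* copy with no check against BUF_SIZE */
    memcpy(&stored, frame + BUF_SIZE, sizeof stored);   /* epilogue: reload the slot */
    return stored ^ canary;                             /* zero means "jump taken" */
}
```

Copying 8 bytes leaves the result at zero, while copying 12 bytes spills into the canary slot and gives a nonzero result, the case in which __stack_chk_fail would be called.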
Since you didn't tell us where it faults I'll just point out some suspect lines:
for(i=0; i<100; ++i)
What if a line is less than 100 chars? This will read uninitialized memory, which is not a good idea.
strcat(line, mod);
What if a line is 90 characters long and then you add 30 more chars? That's 20 out of bounds.
You need to calculate the length and dynamically allocate your strings with malloc, ensure you don't read or write out of bounds, and make sure your strings are always null-terminated. Or you could use C++/std::string to make things easier if it doesn't have to be C.
Instead of checking only for \n at the end of the line, also check for the \r character.
if(line[i] == '\n' || line[i] == '\r')
Also, before using strcat, ensure that line has enough room for mod, for example by checking that strlen(line) + strlen(mod) is less than the size of line. Note that fgets always null-terminates the buffer, so the invalid memory access comes from strcat writing past the end of line when a long input line leaves no room for mod, and that is what causes the crash.
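Putting these suggestions together, here is a sketch that stops at the first \r or \n and checks the total length before writing anything past the buffer. It uses a caller-supplied buffer and snprintf rather than malloc; the function name name_to_email and its suffix parameter are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build "first.last<suffix>" from one input line.
 * Returns 0 on success, -1 if out is too small for the result. */
int name_to_email(const char *line, const char *suffix, char *out, size_t out_size)
{
    assert(out_size > 0);
    size_t name_len = strcspn(line, "\r\n");   /* length up to the line ending */
    int written = snprintf(out, out_size, "%.*s%s", (int)name_len, line, suffix);
    if (written < 0 || (size_t)written >= out_size)
        return -1;                             /* result would not fit */
    for (size_t i = 0; i < name_len; i++)      /* only touch the bytes we copied */
        if (out[i] == ' ')
            out[i] = '.';
    return 0;
}
```

Like the original loop, this does not change letter case. The caller decides what to do with lines that are too long instead of overflowing the buffer.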
Fixed it. I simply increased the size of my line string.