I'm trying to use inline assembly in C with gcc to call interrupt 21h with ah = 07h and get a getchar without echo. This is my code (the main function):
...
int main(int argc, char *argv[])
{
int t, x, y;
char input;
asm(
"movb $0x01, %%ah\n\t"
"int $0x21\n\t"
"movb %%al, %0"
: "=r" (input)
);
printf("Character: %c\n", input);
return 0;
}
...
But it doesn't work: it compiles successfully, but it doesn't do anything.
First of all, you mixed AT&T syntax with a DOS interrupt, so here are answers for each platform:
1. DOS: http://msdn.microsoft.com/en-us/library/5f7adz6y(v=vs.71).aspx
__asm mov ah, 01h
__asm int 21h
Now al contains the byte that was read, as explained here. Note that the interrupt number must be written in hex (21h); a plain 21 is read as decimal.
To pass al to the char variable input, address the variable relative to the stack pointer (esp minus the variable's offset) and store the value with mov byte [esp-offset], al. The Microsoft inline assembler also lets you refer to a C variable by name, as in mov input, al.
2. Linux:
Your assembly is written in AT&T style, so please check this out.
static inline
unsigned read_cr0( void )
{
unsigned val;
asm volatile( "mov %%cr0, %0"
: "=r"(val) );
return val;
}
From the information you give, it seems that you are programming on a platform other than DOS. Interrupt 21h only works on DOS. For Linux, have a look at the ncurses and readline libraries for advanced terminal tricks. If you want to do something more low-level, you might also be interested in the ANSI escape sequences. They give you a way to interact with the terminal.
Instead of using inline assembler, which as others have pointed out is DOS-specific, why don't you just set your terminal to disable echo? See this answer to another question for details on how to do so.
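As a rough illustration of the terminal-settings approach, here is a minimal sketch using the POSIX termios interface. The helper name getch_noecho is made up for this example, and the sketch assumes a POSIX system (Linux, FreeBSD, macOS); it is not the code from the linked answer.

```c
#include <termios.h>
#include <unistd.h>

/* Read one byte from fd with echo and line buffering turned off.
 * Returns the byte (0-255), or -1 on error or end of input.
 * getch_noecho is a hypothetical helper name, not a library function. */
int getch_noecho(int fd)
{
    struct termios saved, raw;
    unsigned char c;
    /* tcgetattr fails when fd is not a terminal (a pipe or a file);
     * in that case there is no echo to disable, so just read. */
    int have_tty = (tcgetattr(fd, &saved) == 0);

    if (have_tty) {
        raw = saved;
        raw.c_lflag &= ~(tcflag_t)(ECHO | ICANON); /* no echo, no line buffering */
        raw.c_cc[VMIN] = 1;   /* read returns after one byte */
        raw.c_cc[VTIME] = 0;  /* no timeout */
        if (tcsetattr(fd, TCSANOW, &raw) != 0)
            return -1;
    }

    ssize_t n = read(fd, &c, 1);

    if (have_tty)
        tcsetattr(fd, TCSANOW, &saved); /* restore the previous settings */

    return (n == 1) ? (int)c : -1;
}
```

Calling getch_noecho(STDIN_FILENO) from main would then behave roughly like the DOS service the question is after, without any inline assembly.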
I am trying to implement the getrusage function in my client-server program, which uses sockets and runs on FreeBSD. I want to print out processor time usage and memory usage.
I have tried to implement the following code, but I am getting the output Illegal instruction (Core dumped)
int getrusage(int who, struct rusage *usage){
int errorcode;
__asm__(
"syscall"
: "=a" (errorcode)
: "a" (117), "D" (who), "S" (usage) //const Sysgetrusage : scno = 117
: "memory"
);
if (errorcode<0) {
printf("error");
}
return 1;
}
UPDATE: I have tried to run this, but I get zero values, random numbers, or negative numbers. Any idea what I am missing?
int getrusage(int who, struct rusage *usage){
int errorcode;
__asm__("push $0;"
"push %2;"
"push %1;"
"movl $117, %%eax;"
"int $0x80;"
:"=r"(errorcode)
:"D"(who),"S"(usage)
:"%eax"
);
if (errorcode<0) {
printf("error");
}
return 1;
}
I would prefer to use the write system call, but it gives me a compilation warning: passing arg 1 of 'strlen' makes pointer from integer without a cast
EDIT: (this is working code now, regarding to comment)
struct rusage usage;
getrusage(RUSAGE_SELF,&usage);
char tmp[300];
write(i, "Memory: ", 7);
sprintf (tmp, "%ld", usage.ru_maxrss);
write(i, tmp, strlen(tmp));
write(i, "Time: ", 5);
sprintf (tmp, "%lds", usage.ru_utime.tv_sec);
write(i, tmp, strlen(tmp));
sprintf (tmp, "%ldms", usage.ru_utime.tv_usec);
write(i, tmp, strlen(tmp));
Any ideas what is wrong?
The reason you are getting an illegal instruction error is that the SYSCALL instruction is only available on 64-bit FreeBSD running a 64-bit program. This is a serious issue, since one of your comments suggests that your code is running on 32-bit FreeBSD.
Under normal circumstances you don't need to write your own getrusage since it is part of the C library (libc) on that platform. It appears you have been tasked to do it with inline assembly.
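For comparison, here is a minimal sketch of the libc route, assuming a POSIX system. The helper name report_usage is hypothetical; it formats the values with snprintf and writes them to a descriptor, much like the write calls in the question. ru_maxrss is reported in kilobytes on FreeBSD and Linux, and other systems may use different units.

```c
#include <stdio.h>
#include <sys/resource.h>
#include <sys/time.h>
#include <unistd.h>

/* Write the caller's peak resident set size and user CPU time to fd.
 * report_usage is a hypothetical helper name.
 * Returns 0 on success, -1 on any failure. */
int report_usage(int fd)
{
    struct rusage usage;
    char buf[128];

    /* Library version of the call: no inline assembly needed. */
    if (getrusage(RUSAGE_SELF, &usage) != 0)
        return -1;

    /* Casts to long keep the format string valid whatever the
     * underlying types of tv_sec and tv_usec are. */
    int len = snprintf(buf, sizeof buf,
                       "Memory: %ld Time: %ld.%06lds\n",
                       (long)usage.ru_maxrss,
                       (long)usage.ru_utime.tv_sec,
                       (long)usage.ru_utime.tv_usec);
    if (len < 0 || (size_t)len >= sizeof buf)
        return -1;

    return (write(fd, buf, (size_t)len) == (ssize_t)len) ? 0 : -1;
}
```

Passing the string length returned by snprintf to write avoids hand-counted lengths, which are easy to get wrong by one.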
64-bit FreeBSD and SYSCALL Instruction
There is a bit of a bug in your 64-bit code since SYSCALL destroys the contents of RCX and R11. Your code may work but may fail in the future especially as the program expands and you enable optimizations. The following change adds those 2 registers to the clobber list:
int errorcode;
__asm__(
"syscall"
: "=a" (errorcode)
: "a" (117), "D" (who), "S" (usage) //const Sysgetrusage : scno = 117
: "memory", "rcx", "r11"
);
Using the memory clobber can lead to generation of inefficient code so I use it only if necessary. As you become more of an expert the need for memory clobber can be eliminated. I would have used a function like the following if I wasn't allowed to use the C library version of getrusage:
int getrusage(int who, struct rusage *usage){
int errorcode;
__asm__(
"syscall"
: "=a"(errorcode), "=m"(*usage)
: "0"(117), "D"(who), "S"(usage)
: "rcx", "r11"
);
if (errorcode<0) {
printf("error");
}
return errorcode;
}
This uses a memory operand as an output constraint and drops the memory clobber. Since the compiler knows how large a rusage structure is, and =m says the asm modifies that memory, we don't need the memory clobber.
32-bit FreeBSD System Calls via Int 0x80
As mentioned in the comments and shown in your updated code, to make a system call in 32-bit code on FreeBSD you have to use int 0x80. This is described in the FreeBSD System Calls Convention. Parameters are pushed on the stack right to left, and you must allocate 4 extra bytes on the stack by pushing any 4-byte value after you push the last parameter.
Your edited code has a few bugs. First, you push the extra 4 bytes before the rest of the arguments. You need to push them after. You also need to adjust the stack after int 0x80 to reclaim the stack space used by the arguments. You pushed three 4-byte values on the stack, so you need to add 12 to ESP after int 0x80.
You also need a memory clobber, because the compiler doesn't know that you have modified memory at all. With the constraints written the way you have them, the structure that usage points to gets modified, but nothing tells the compiler so.
The return value of the int 0x80 will be in EAX but you use the constraint =r. It should have been =a since the return value will be returned in EAX. Since using =a tells the compiler EAX is clobbered you don't need to list it as a clobber anymore.
The modified code could have looked like:
int getrusage(int who, struct rusage *usage){
int errorcode;
__asm__("push %2;"
"push %1;"
"push $0;"
"movl $117, %%eax;"
"int $0x80;"
"add $12, %%esp"
:"=a"(errorcode)
:"D"(who),"S"(usage)
:"memory"
);
if (errorcode<0) {
printf("error");
}
return errorcode;
}
Another way one could have written this with more advanced techniques is:
int getrusage(int who, struct rusage *usage){
int errorcode;
__asm__("push %[usage]\n\t"
"push %[who]\n\t"
"push %%eax\n\t"
"int $0x80\n\t"
"add $12, %%esp"
:"=a"(errorcode), "=m"(*usage)
:"0"(117), [who]"ri"(who), [usage]"r"(usage)
:"cc" /* Don't need this with x86 inline asm but use for clarity */
);
if (errorcode<0) {
printf("error");
}
return errorcode;
}
This uses symbolic names (usage and who) to identify the parameters rather than numerical positions like %3 and %4, which makes the inline assembly easier to follow. Since any 4-byte value can be pushed onto the stack just before int 0x80, we can save a few bytes by simply pushing the contents of any register. In this case I used %%eax. It also uses the =m constraint, as I did in the 64-bit example.
More information on extended inline assembler can be found in the GCC documentation.
I'm trying to use a small amount of AT&T style inline assembly in C with GCC, following an article on CodeProject here. The main reason I wish to do this is to find the old value of the EIP register so that I have a reliable address of instructions in my code. I have written a simple example program to demonstrate my understanding of this concept thus far:
#include <stdio.h>
#include <stdlib.h>
int mainReturnAddress = 0;
int main()
{
asm volatile (
"popl %%eax;"
"pushl %%eax;"
"movl %%eax, %0;"
: "=r" ( mainReturnAddress )
);
printf( "Address : %d\n", mainReturnAddress );
return 0;
}
The purpose of this particular example is to pop 4 bytes from the top of the stack representing the 32 bit return address saved from the EIP register, and then to push it back on the stack. Afterwards, I store it in the global mainReturnAddress variable. Finally, I print the value stored in mainReturnAddress.
The output I receive from this code is 4200560.
Does this code achieve the purpose described above, and is it portable across processors on 32-bit Windows?
In GCC, you should use __builtin_return_address rather than trying to use inline assembly.
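A minimal sketch of that builtin, assuming GCC or Clang. The function name caller_address is made up for illustration.

```c
/* noinline keeps this a real call, so there is an actual return
 * address on the stack (or in the link register) to report.
 * caller_address is a hypothetical helper name. */
__attribute__((noinline))
void *caller_address(void)
{
    /* Level 0 asks for the return address of the current function,
     * i.e. the instruction in the caller that follows the call. */
    return __builtin_return_address(0);
}
```

The result is a pointer, so it should be printed with %p, as in printf("Address: %p\n", caller_address()), rather than with %d.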
I was reading some questions and answers on here and kept coming across this suggestion, but I noticed no one ever actually explained exactly what you need to do on Windows, using Intel and the GCC compiler. Commented below is exactly what I am trying to do.
#include <stdio.h>
int main()
{
int x = 1;
int y = 2;
//assembly code begin
/*
push x into stack; < Need Help
x=y; < With This
pop stack into y; < Please
*/
//assembly code end
printf("x=%d,y=%d",x,y);
getchar();
return 0;
}
You can't just push/pop safely from inline asm if it's going to be portable to systems with a red zone. That includes every non-Windows x86-64 platform. (There's no way to tell gcc you want to clobber it.) Well, you could add rsp, -128 first to skip past the red zone before pushing/popping anything, then restore it later. But then you can't use an "m" constraint, because the compiler might use RSP-relative addressing with offsets that assume RSP hasn't been modified.
But really this is a ridiculous thing to be doing in inline asm.
Here's how you use inline-asm to swap two C variables:
#include <stdio.h>
int main()
{
int x = 1;
int y = 2;
asm("" // no actual instructions.
: "=r"(y), "=r"(x) // request both outputs in the compiler's choice of register
: "0"(x), "1"(y) // matching constraints: request each input in the same register as the other output
);
// apparently "=m" doesn't compile: you can't use a matching constraint on a memory operand
printf("x=%d,y=%d\n",x,y);
// getchar(); // Set up your terminal not to close after the program exits if you want similar behaviour: don't embed it into your programs
return 0;
}
gcc -O3 output (targeting the x86-64 System V ABI, not Windows) from the Godbolt compiler explorer:
.section .rodata
.LC0:
.string "x=%d,y=%d"
.section .text
main:
sub rsp, 8
mov edi, OFFSET FLAT:.LC0
xor eax, eax
mov edx, 1
mov esi, 2
#APP
# 8 "/tmp/gcc-explorer-compiler116814-16347-5i3lz1/example.cpp" 1
# I used "\n" instead of just "" so we could see exactly where our inline-asm code ended up.
# 0 "" 2
#NO_APP
call printf
xor eax, eax
add rsp, 8
ret
C variables are a high level concept; it doesn't cost anything to decide that the same registers now logically hold different named variables, instead of swapping the register contents without changing the varname->register mapping.
When hand-writing asm, use comments to keep track of the current logical meaning of different registers, or parts of a vector register.
The inline-asm didn't lead to any extra instructions outside the inline-asm block either, so it's perfectly efficient in this case. Still, the compiler can't see through it, and doesn't know that the values are still 1 and 2, so further constant-propagation would be defeated. https://gcc.gnu.org/wiki/DontUseInlineAsm
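The same matching-constraint trick can be wrapped in a function. This is only a sketch under the assumption of GCC or Clang; the name swap_by_constraints is hypothetical.

```c
/* Swap two ints without emitting any swap instruction: the empty asm
 * only tells the compiler which register holds which variable.
 * swap_by_constraints is a hypothetical helper name. */
static inline void swap_by_constraints(int *a, int *b)
{
    int x = *a, y = *b;

    __asm__(""                     /* no instructions at all */
            : "=r"(y), "=r"(x)     /* operand 0 is y, operand 1 is x */
            : "0"(x), "1"(y));     /* x arrives in y's register, y in x's */

    *a = x;
    *b = y;
}
```

After int a = 1, b = 2; swap_by_constraints(&a, &b); the values are a == 2 and b == 1. In ordinary code a temporary variable does the same job and stays transparent to the optimizer.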
#include <stdio.h>
int main()
{
int x=1;
int y=2;
printf("x::%d,y::%d\n",x,y);
__asm__( "movl %1, %%eax;"
"movl %%eax, %0;"
:"=r"(y)
:"r"(x)
:"%eax"
);
printf("x::%d,y::%d\n",x,y);
return 0;
}
/* Load x to eax
Load eax to y */
The code above copies x into y; exchanging the values can be done in the same way. Please note that the clobber list tells GCC that the EAX register is modified. For educational purposes this is okay, but I find it more suitable to leave micro-optimizations to the compiler.
You can use extended inline assembly. It is a compiler feature which allows you to write assembly instructions within your C code. A good reference for inline gcc assembly is available here.
The following code swaps the values of x and y using push and pop instructions.
( compiled and tested using gcc on x86_64 )
This is only safe if compiled with -mno-red-zone, or if you subtract 128 from RSP before pushing anything. It will happen to work without problems in some functions: testing with one set of surrounding code is not sufficient to verify the correctness of something you did with GNU C inline asm.
#include <stdio.h>
int main()
{
int x = 1;
int y = 2;
asm volatile (
"pushq %%rax\n" /* Push x into the stack */
"movq %%rbx, %%rax\n" /* Copy y into x */
"popq %%rbx\n" /* Pop x into y */
: "=b"(y), "=a"(x) /* OUTPUT values */
: "a"(x), "b"(y) /* INPUT values */
: /*No need for the clobber list, since the compiler knows
which registers have been modified */
);
printf("x=%d,y=%d",x,y);
getchar();
return 0;
}
Result x=2 y=1, as you expected.
The Intel compiler works in a similar way; I think you just have to change the keyword asm to __asm__. You can find info about inline assembly for the Intel compiler here.
Somebody over at SO posted a question asking how he could "hide" a function. This was my answer:
#include <stdio.h>
#include <stdlib.h>
int encrypt(void)
{
char *text="Hello World";
asm("push text");
asm("call printf");
return 0;
}
int main(int argc, char *argv[])
{
volatile unsigned char *i=encrypt;
while(*i!=0x00)
*i++^=0xBE;
return EXIT_SUCCESS;
}
but, there are problems:
encode.c: In function `main':
encode.c:13: warning: initialization from incompatible pointer type
C:\DOCUME~1\Aviral\LOCALS~1\Temp/ccYaOZhn.o:encode.c:(.text+0xf): undefined reference to `text'
C:\DOCUME~1\Aviral\LOCALS~1\Temp/ccYaOZhn.o:encode.c:(.text+0x14): undefined reference to `printf'
collect2: ld returned 1 exit status
My first question is why is the inline assembly failing ... what would be the right way to do it? Other thing -- the code for "ret" or "retn" is 0x00 , right... my code xor's stuff until it reaches a return ... so why is it SEGFAULTing?
As a high-level point, I'm not quite sure why you're trying to use inline assembly for a simple call to printf, as all you've done is create an incorrect version of a function call. Your inline asm pushes something onto the stack but never pops it off, which will most likely cause problems because GCC isn't aware that you've modified the stack pointer in the middle of the function. This may go unnoticed in a trivial example, but could lead to non-obvious errors in a more complicated function.
Here's a correct implementation of your top function:
int encrypt(void)
{
char *text="Hello World";
char *formatString = "%s\n";
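// volatile really isn't necessary but I just use it by habit
asm volatile("pushl %0;\n\t"
"pushl %1;\n\t"
"call printf;\n\t"
"addl $0x8, %%esp\n\t"
:
: "r"(text), "r"(formatString)
: "eax", "ecx", "edx", "cc", "memory" // the call may clobber caller-saved registers, flags and memory
);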
return 0;
}
As for your last question, the usual opcode for RET is "C3", but there are many variations, have a look at http://pdos.csail.mit.edu/6.828/2009/readings/i386/RET.htm
Your idea of searching for RET is also faulty, because seeing the byte 0xC3 in a stream of instructions does NOT mean you've encountered a ret. The 0xC3 may simply be data or an operand belonging to another instruction. (As a side note, it's particularly hard to parse x86 instructions the way you're doing, because x86 is a CISC architecture with instruction lengths between 1 and 15 bytes.)
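To make the false-positive problem concrete, here is a small sketch. The byte sequence is hand-assembled 32-bit x86 (mov eax, 0xC3 followed by ret), and the names first_c3_byte and sample are made up for this example.

```c
#include <stddef.h>

/* Return the index of the first 0xC3 byte, or -1 if there is none.
 * This is the naive scan, shown only to demonstrate why it is
 * unreliable. first_c3_byte is a hypothetical helper name. */
long first_c3_byte(const unsigned char *code, size_t len)
{
    for (size_t i = 0; i < len; i++)
        if (code[i] == 0xC3)
            return (long)i;
    return -1;
}

/* Hand-assembled 32-bit x86: mov eax, 0xC3 ; ret */
static const unsigned char sample[] = {
    0xB8, 0xC3, 0x00, 0x00, 0x00, /* mov eax, 0x000000C3: C3 is immediate data */
    0xC3                          /* ret: the real one, at offset 5 */
};
```

Here first_c3_byte(sample, sizeof sample) stops at offset 1, in the middle of the mov instruction, while the actual ret sits at offset 5. Finding the real end of a function requires decoding instruction by instruction.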
As another note, not all operating systems allow modification of the text/code segment (where executable instructions are stored), so the code you have in main may not work regardless.
GCC inline asm uses AT&T syntax (if no specific options are selected for using Intel's one).
Here's an example:
int a=10, b;
asm ("movl %1, %%eax;"
"movl %%eax, %0;"
:"=r"(b) /* output */
:"r"(a) /* input */
:"%eax" /* clobbered register */
);
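Thus, your problem is that "text" is a local C variable, so the assembler cannot resolve it by name in your push instruction; it has to be passed to the asm statement as an operand (the call instruction that follows has a similar symbol problem).
See here for reference.
Moreover, your code is not portable between 32-bit and 64-bit environments. Compile it with the -m32 flag to be sure it is built as 32-bit code (GCC will complain anyway if you make a mistake).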
A complete solution to your problem is on this post on GCC Mailing list.
Here's a snippet:
for ( i = method->args_size - 1; i >= 0; i-- ) {
asm( "pushl %0": /* no outputs */: \
"g" (stack_frame->op_stack[i]) );
}
asm( "call *%0" : /* no outputs */ : "g" (fp) :
"%eax", "%ecx", "%edx", "%cc", "memory" );
asm ( "movl %%eax, %0" : "=g" (ret_value) : /* No inputs */ );
On Windows systems there's also an additional asm ( "addl %0, %%esp" : /* No outputs */ : "g" (method->args_size * 4) ); to do. Google for better details.
On 32-bit Windows the symbol is not printf but _printf.
I have been dealing with Nasm and GNU C inline asm in a Linux environment for some time, and this function worked great. Now I am switching to a Windows environment and want to use Masm (with VS2008), but I can't seem to get this to work.
void outportb (unsigned short _port, unsigned short _data)
{
__asm__ __volatile__ ("outb %1, %0" : : "dN" (_port), "a" (_data));
}
When I write something like this...
void outportb (unsigned short _port, unsigned short _data)
{
asm volatile ("outb %1, %0" : : "dN" (_port), "a" (_data));
}
asm is no longer recognised, and volatile throws an error saying "string". I also tried writing _asm volatile, but I get an error saying "inline assembler syntax error in 'opcode'; found 'data type'".
Assuming you're talking about the x86 instruction set, here are a few things to remember:
The instruction "outb" outputs one byte, which is equivalent to type "char" or "unsigned char" in C/C++. To output a 16-bit word (since you're using "unsigned short"), you need to use "outw".
Having said that, it is recommended by Intel (and required by VS) that you use the instruction mnemonic "out", with the port size determined by the operand size. For example, "out dx, ax" is equivalent to "outw", while "out dx, al" is equivalent to "outb".
On x86, the "out" instruction requires the port and the output value to be placed in the (e)dx and {eax/ax/al} registers respectively. While Nasm might do it for you (I don't have the compiler handy, so I can't confirm that), in VS you have to do it the way it is done at the CPU level.
There is no reason to specify the "volatile" keyword with __asm. Any inline assembly instructions cause the VS compiler to disable read caching (which is what the volatile keyword is for).
Here is the code (assuming you're writing into 16-bit port):
void outportw(unsigned short port, unsigned short data)
{
__asm mov ax, data;
__asm mov dx, port;
__asm out dx, ax;
}
If you're writing to an 8-bit port, the code should look like this:
void outportb(unsigned short port, unsigned char data)
{
__asm mov al, data;
__asm mov dx, port;
__asm out dx, al;
}