Strange behavior of asm block in c code - c

I'm trying to create a little example of 'How to use asm block in C code'.
In my example, i'm trying to increment a value of variable which I created in my C code.
This is my code:
int main()
{
unsigned int i = 0;
unsigned int *ptr1;
// Get the address of the variable i.
ptr1 = &i;
// Show ECHO message.
printf_s("Value before '_asm' block:");
printf_s("\ni = %d (Address = ptr1: %d)\n\n", i, ptr1);
_asm {
// Copy the value of i from the memory.
mov bx, word ptr [ptr1]
// Increment the value of i.
inc bx
// Update the new value of i in memory.
mov word ptr [ptr1], bx
}
// Show ECHO message.
printf_s("Value after '_asm' block:");
printf_s("\ni = %d (Address = ptr1: %d)\n\n", i, ptr1);
// Force the console to stay open.
getchar();
return 0;
}
This is the result of the code in the console:
Values before '_asm' block:
i = 0 (Address = ptr1: 1441144)
Values after '_asm' block:
i = 0 (Address = ptr1: 1441145)
This is very wierd. I only want to update the value of the 'i' variable, but it doesn't work.
In addition, the pointer 'ptr1' now points to the next memory block...
Why is this happening ? And how should I solve this problem?
EDIT:
Thanks to the comments below, I solved the problem.
The main change is in this line:
// Increment the value of i.
inc bx
Due to the fact that we want to increment the VALUE of the variable 'i', we should use brackets.
In addition, the bx register should be changed now to 'ebx', that is a 32-bit register.
Because of using the 'ebx' register, the expression 'word ptr' should be replaced with 'dword ptr'.
The code of the asm block, after the editings:
_asm {
// Copy the value of i from the memory.
mov ebx, dword ptr [ptr1]
// Increment the value of i.
inc [ebx]
// Update the new value of i in memory.
mov dword ptr [ptr1], ebx
}

Related

XCode C compiler gives different result about integer pointer to const integer value, why?

In XCode I've tried to manipulate const int value by using pointer. Here is the code:
const int con = 5;
int *p;
p = &con;
(*p) +=1;
printf("Add of constant:%p\n",&con);
printf("Add of pointer:%p\n",p);
printf("%d - %d",con,*p);
Result is like that on XCode:
Add of constant:0x7fff5fbff79c
Add of pointer:0x7fff5fbff79c
5 - 6
but on linux virtual machine values of con and *p is same 6.
Why there is a difference between values on XCode?
Tried this with VisualStudio, get the same result as with XCode. Assembly listing proved #Gerhardh point:
(*p) +=1; successfully increases value of con in RAM
printf("Add of constant:%p\n",&con); prints address of this memory correctly
printf("%d - %d",con,*p); doesn't read con value from RAM, but passes 5 directly into printf. Optimization threw away unnecessary read for value known at compile time. Here is related assembly listing
printf("%d - %d",con,*p);
mov eax,dword ptr [p] //get p
mov ecx,dword ptr [eax] //get *p
push ecx //push *p (3rd param)
push 5 //push 5 (2nd param). No read of con
push offset string "%d - %d" (415800h) //push addr of format string (1st param)
call dword ptr [__imp__printf (4182BCh)] //call printf()
Obviously, compiler on your VM didn't perform the same optimization.

INT 13 Extension Read in C

i can use extended read functions of bios int 13h well from assembly,
with the below code
; *************************************************************************
; Setup DISK ADDRESS PACKET
; *************************************************************************
jmp strtRead
DAPACK :
db 010h ; Packet Size
db 0 ; Always 0
blkcnt:
dw 1 ; Sectors Count
db_add :
dw 07e00h ; Transfer Offset
dw 0 ; Transfer Segment
d_lba :
dd 1 ; Starting LBA(0 - n)
dd 0 ; Bios 48 bit LBA
; *************************************************************************
; Start Reading Sectors using INT13 Func 42
; *************************************************************************
strtRead:
mov si, OFFSET DAPACK; Load DPACK offset to SI
mov ah, 042h ; Function 42h
mov dl, 080h ; Drive ID
int 013h; Call INT13h
i want to convert this to be a c callable function but i have no idea about how to transfer the parameters from c to asm like drive id , sectors count, buffer segment:offset .... etc.
i am using msvc and masm and working with nothing except bios functions.
can anyone help ?!!
update :
i have tried the below function but always nothing loaded into the buffer ??
void read_sector()
{
static unsigned char currentMBR[512] = { 0 };
struct disk_packet //needed for int13 42h
{
byte size_pack; //size of packet must be 16 or 16+
byte reserved1; //reserved
byte no_of_blocks; //nof blocks for transfer
byte reserved2; //reserved
word offset; //offset address
word segment; //segment address
dword lba1;
dword lba2;
} disk_pack;
disk_pack.size_pack = 16; //set size to 16
disk_pack.no_of_blocks = 1; //1 block ie read one sector
disk_pack.reserved1 = 0; //reserved word
disk_pack.reserved2 = 0; //reserved word
disk_pack.segment = 0; //segment of buffer
disk_pack.offset = (word)&currentMBR[0]; //offset of buffer
disk_pack.lba1 = 0; //lba first 32 bits
disk_pack.lba2 = 0; //last 32 bit address
_asm
{
mov dl, 080h;
mov[disk_pack.segment], ds;
mov si, disk_pack;
mov ah, 42h;
int 13h
; jc NoError; //No error, ignore error code
; mov bError, ah; // Error, get the error code
NoError:
}
}
Sorry to post this as "answer"; I want to post this as "comment" but it is too long...
Different compilers have a different syntax of inline assembly. This means that the correct syntax of the following lines:
mov[disk_pack.segment], ds;
mov si, disk_pack;
... depends on the compiler used. Unfortunately I do not use 16-bit C compilers so I cannot help you in this point.
The next thing I see in your program is the following one:
disk_pack.segment = 0; //segment of buffer
disk_pack.offset = (word)&currentMBR[0]; //offset of buffer
With a 99% chance this will lead to a problem. Instead I would to the following:
struct disk_packet //needed for int13 42h
{
byte size_pack;
byte reserved;
word no_of_blocks; // note that this is 16-bit!
void far *data; // <- This line is the main change!
dword lba1;
dword lba2;
} disk_pack;
...
disk_pack.size_pack = 16;
disk_pack.no_of_blocks = 1;
disk_pack.reserved = 0;
disk_pack.data = &currentMBR[0]; // also note the change here
disk_pack.lba1 = 0;
disk_pack.lba2 = 0;
...
Note that some compilers name the keyword "_far" or "__far" instead of "far".
A third problem is that some (buggy) BIOSes require ES to be equal to the segment value from the disk_pack and a fourth one is that many compilers require the inline assembly code not to modify any registers (AX, CX and DX is normally OK).
These two could be solved the following way:
push ds;
push es;
push si;
mov dl, 080h;
// TODO here: Set ds:si to disk_pack in a compiler-specific way
mov es,[si+6];
mov ah, 42h;
int 13h;
...
pop si;
pop es;
pop ds;
In my opinion the "#pragma pack" should not be neccessary because all elements in the structure are propperly aligned.

How do I get info from the stack, using inline assembly, to program in c?

I have a task to do and I'm asking for some help. (on simple c lang')
What I need to do?
I need to check every command on the main c program (using interrupt num 1) and printing a message only if the next command is the same procedure that was sent earlier to the stack, by some other procedure.
What I want to do?
I want to take info from the stack, using inline assembley, and put it on a variable that can be compare on c program itself after returnning to c. (volatile)
This is the program:
#include <stdio.h>
#include <dos.h>
#include <conio.h>
#include <stdlib.h>
typedef void (*FUN_PTR)(void);
void interrupt (*Int1Save) (void); //pointer to interrupt num 1//
volatile FUN_PTR our_func;
char *str2;
void interrupt my_inter (void) //New interrupt//
{volatile FUN_PTR next_command;
asm { PUSH BP
MOV BP,SP
PUSH AX
PUSH BX
PUSH ES
MOV ES,[BP+4]
MOV BX,[BP+2]
MOV AX,ES:[BX]
MOV word ptr next_command,AX
POP ES
POP BX
POP AX
pop BP}
if (our_func==next_command) printf("procedure %s has been called\n",str2);}
void animate(int *iptr,char str[],void (*funptr)(), char fstr[])
{
str2=fstr;
our_func=funptr;
Int1Save = getvect(1); // save old interrupt//
setvect(1,my_inter);
asm { pushf //TF is ON//
pop ax
or ax,100000000B
push ax
popf}}
void unanimate()
{asm { pushf //TF is OFF//
pop ax
and ax,1111111011111111B
push ax
popf}
setvect (1,Int1Save); //restore old interrupt//}
void main(void)
{int i;
int f1 = 1;
int f2 = 1;
int fibo = 1;
animate(&fibo, "fibo", sleep, "sleep");
for(i=0; i < 8; i++)
{
sleep(2);
f1 = f2;
f2 = fibo;
fibo = f1 + f2;} // for//
unanimate();} // main//
My question...
Off course the problem is at "my inter" on the inline assembly. but can't figure it out.
What am I doing wrong? (please take a look at the code above)
I wanted to save the address of the pointer for the specific procedure (sleep) in the volatile our_func. then take the info (address to each next command) from the stack to volatile next_command and then finaly returnning to c and make the compare each time. If the same value (address) is on both variables then to print a specific message.
Hope I'm clear..
10x,
Nir B
Answered as a comment by the OP
I got the answer I wanted:
asm { MOV SI,[BP+18] //Taking the address of each command//
MOV DI,[BP+20]
MOV word ptr next_command+2,DI
MOV word ptr next_command,SI}
if ((*our_func)==(*next_command)) //Making the next_command compare//
printf("procedure %s has been called\n",str2);

C - Inline asm patching at runtime

I am writing a program in C and i use inline asm. In the inline assembler code is have some addresses where i want to patch them at runtime.
A quick sample of the code is this:
void __declspec(naked) inline(void)
{
mov eax, 0xAABBCCDD
call 0xAABBCCDD
}
An say i want to modify the 0xAABBCCDD value from the main C program.
What i tried to do is to Call VirtualProtect an is the pointer of the function in order to make it Writeable, and then call memcpy to add the appropriate values to the code.
DWORD old;
VirtualProtect(inline, len, PAGE_EXECUTE_READWRITE, &old);
However VirtualProtect fails and GetLastError() returns 487 which means accessing invalid address. Anyone have a clue about this problem??
Thanks
Doesn't this work?
int X = 0xAABBCCDD;
void __declspec(naked) inline(void)
{
mov eax, [X]
call [X]
}
How to do it to another process at runtime,
Create a variable that holds the program base address
Get the target RVA (Relative Virtual Address)
Then calculate the real address like this PA=RVA + BASE
then call it from your inline assembly
You can get the base address like this
DWORD dwGetModuleBaseAddress(DWORD dwProcessID)
{
TCHAR zFileName[MAX_PATH];
ZeroMemory(zFileName, MAX_PATH);
HANDLE hProcess = OpenProcess(PROCESS_ALL_ACCESS, true, dwProcessID);
HANDLE hSnapshot = CreateToolhelp32Snapshot(TH32CS_SNAPMODULE, dwProcessID);
DWORD dwModuleBaseAddress = 0;
if (hSnapshot != INVALID_HANDLE_VALUE)
{
MODULEENTRY32 ModuleEntry32 = { 0 };
ModuleEntry32.dwSize = sizeof(MODULEENTRY32);
if (Module32First(hSnapshot, &ModuleEntry32))
{
do
{
if (wcscmp(ModuleEntry32.szModule, L"example.exe") == 0)
{
dwModuleBaseAddress = (DWORD_PTR)ModuleEntry32.modBaseAddr;
break;
}
} while (Module32Next(hSnapshot, &ModuleEntry32));
}
CloseHandle(hSnapshot);
CloseHandle(hProcess);
}
return dwModuleBaseAddress;
}
Assuming you have a local variable and your base address
mov dword ptr ss : [ebp - 0x14] , eax;
mov eax, dword ptr BaseAddress;
add eax, PA;
call eax;
mov eax, dword ptr ss : [ebp - 0x14] ;
You have to restore the value of your Register after the call returns, since this value may be used somewhere down the code execution, assuming you're trying to patch an existing application that may depend on the eax register after your call. Although this method has it disadvantages, but at least it will give anyone idea on what to do.

Multithreading with inline assembly and access to a c variable

I'm using inline assembly to construct a set of passwords, which I will use to brute force against a given hash. I used this website as a reference for the construction of the passwords.
This is working flawlessly in a singlethreaded environment. It produces an infinite amount of incrementing passwords.
As I have only basic knowledge of asm, I understand the idea. The gcc uses ATT, so I compile with -masm=intel
During the attempt to multithread the program, I realize that this approach might not work.
The following code uses 2 global C variables, and I assume that this might be the problem.
__asm__("pushad\n\t"
"mov edi, offset plaintext\n\t" <---- global variable
"mov ebx, offset charsetTable\n\t" <---- again
"L1: movzx eax, byte ptr [edi]\n\t"
" movzx eax, byte ptr [charsetTable+eax]\n\t"
" cmp al, 0\n\t"
" je L2\n\t"
" mov [edi],al\n\t"
" jmp L3\n\t"
"L2: xlat\n\t"
" mov [edi],al\n\t"
" inc edi\n\t"
" jmp L1\n\t"
"L3: popad\n\t");
It produces a non deterministic result in the plaintext variable.
How can i create a workaround, that every thread accesses his own plaintext variable? (If this is the problem...).
I tried modifying this code, to use extended assembly, but I failed every time. Probably due to the fact that all tutorials use ATT syntax.
I would really appreciate any help, as I'm stuck for several hours now :(
Edit: Running the program with 2 threads, and printing the content of plaintext right after the asm instruction, produces:
b
b
d
d
f
f
...
Edit2:
pthread_create(&thread[i], NULL, crack, (void *) &args[i]))
[...]
void *crack(void *arg) {
struct threadArgs *param = arg;
struct crypt_data crypt; // storage for reentrant version of crypt(3)
char *tmpHash = NULL;
size_t len = strlen(param->methodAndSalt);
size_t cipherlen = strlen(param->cipher);
crypt.initialized = 0;
for(int i = 0; i <= LIMIT; i++) {
// intel syntax
__asm__ ("pushad\n\t"
//mov edi, offset %0\n\t"
"mov edi, offset plaintext\n\t"
"mov ebx, offset charsetTable\n\t"
"L1: movzx eax, byte ptr [edi]\n\t"
" movzx eax, byte ptr [charsetTable+eax]\n\t"
" cmp al, 0\n\t"
" je L2\n\t"
" mov [edi],al\n\t"
" jmp L3\n\t"
"L2: xlat\n\t"
" mov [edi],al\n\t"
" inc edi\n\t"
" jmp L1\n\t"
"L3: popad\n\t");
tmpHash = crypt_r(plaintext, param->methodAndSalt, &crypt);
if(0 == memcmp(tmpHash+len, param->cipher, cipherlen)) {
printf("success: %s\n", plaintext);
break;
}
}
return 0;
}
Since you're already using pthreads, another option is making the variables that are modified by several threads into per-thread variables (threadspecific data). See pthread_getspecific OpenGroup manpage. The way this works is like:
In the main thread (before you create other threads), do:
static pthread_key_y tsd_key;
(void)pthread_key_create(&tsd_key); /* unlikely to fail; handle if you want */
and then within each thread, where you use the plaintext / charsetTable variables (or more such), do:
struct { char *plainText, char *charsetTable } *str =
pthread_getspecific(tsd_key);
if (str == NULL) {
str = malloc(2 * sizeof(char *));
str.plainText = malloc(size_of_plaintext);
str.charsetTable = malloc(size_of_charsetTable);
initialize(str.plainText); /* put the data for this thread in */
initialize(str.charsetTable); /* ditto */
pthread_setspecific(tsd_key, str);
}
char *plaintext = str.plainText;
char *charsetTable = str.charsetTable;
Or create / use several keys, one per such variable; in that case, you don't get the str container / double indirection / additional malloc.
Intel assembly syntax with gcc inline asm is, hm, not great; in particular, specifying input/output operands is not easy. I think to get that to use the pthread_getspecific mechanism, you'd change your code to do:
__asm__("pushad\n\t"
"push tsd_key\n\t" <---- threadspecific data key (arg to call)
"call pthread_getspecific\n\t" <---- gets "str" as per above
"add esp, 4\n\t" <---- get rid of the func argument
"mov edi, [eax]\n\t" <---- first ptr == "plainText"
"mov ebx, [eax + 4]\n\t" <---- 2nd ptr == "charsetTable"
...
That way, it becomes lock-free, at the expense of using more memory (one plaintext / charsetTable per thread), and the expense of an additional function call (to pthread_getspecific()). Also, if you do the above, make sure you free() each thread's specific data via pthread_atexit(), or else you'll leak.
If your function is fast to execute, then a lock is a much simpler solution because you don't need all the setup / cleanup overhead of threadspecific data; if the function is either slow or very frequently called, the lock would become a bottleneck though - in that case the memory / access overhead for TSD is justified. Your mileage may vary.
Protect this function with mutex outside of inline Assembly block.

Resources