Handling names in IDAPython - disassembly

I am working on a small IDAPython script.
The script itself works 100% of the time on lines like this:
qword_FFFFFFF006F1E6C0 DCQ 0xFFFFFFF007758C18
It reads the value stored at the qword (0xFFFFFFF007758C18 here), checks whether there is a function at that address, and if there is, renames the qword with the function name + segment info.
Now, sometimes, the disassembly looks like this:
off_FFFFFFF006F1E690 DCQ OSDictionary::withCapacity(uint)
and of course the script breaks here: it expects an address and is given a name.
What I'd like to do is get the address of the second operand (OSDictionary::with...) and then run the script as normal.
Unfortunately, I have no idea how to do that, because this is how I currently get the address:
disas = GetDisasm(addr).split(" ")
fun_addr = disas[1]
....

If you always want the address of the destination, you can use the get_qword function:
https://www.hex-rays.com/products/ida/support/idadoc/1321.shtml
In IDAPython this is the Qword function.
Simply call Qword(addr); it reads the stored value and gives you the address as a number, no matter how the operand is displayed in the listing.
You might need to compensate for endianness (use struct).
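A minimal sketch of the endianness point, runnable outside IDA. It only shows how struct turns 8 raw bytes into the target address; decode_qword is a hypothetical helper name, and the byte string is simply the little-endian layout of the value from the first example line. Inside IDA, Qword(addr) does the read for you, so this step would only matter if you ever work from raw bytes.

```python
import struct


def decode_qword(raw, little_endian=True):
    """Decode exactly 8 raw bytes into an unsigned 64-bit integer."""
    if len(raw) != 8:
        raise ValueError("expected exactly 8 bytes")
    # "<Q" = little-endian unsigned 64-bit, ">Q" = big-endian unsigned 64-bit
    fmt = "<Q" if little_endian else ">Q"
    return struct.unpack(fmt, raw)[0]


# Bytes of 0xFFFFFFF007758C18 as they would be stored in a little-endian image
raw = bytes.fromhex("188c7507f0ffffff")
print(hex(decode_qword(raw)))  # 0xfffffff007758c18
```

The resulting integer can then be handed to the rest of the script in place of the value parsed out of the disassembly text.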

Related

in gdb, can we set a variable as an expression?

In the U-Boot C code, the variable gd is declared like this (ARM):
register volatile gd_t *gd asm ("r9")
Register r9 contains the pointer to struct global_data (typedef'd to gd_t).
While debugging and analyzing, I want to see gd->malloc_base from the C code, but typing "p gd->malloc_base" doesn't work. gdb says: Missing ELF symbol "gd".
I later learned that I should do "p ((gd_t *)$r9)->malloc_base" to see the value.
I type this kind of command many times a day. Is there a way to define a variable in gdb that represents ((gd_t *)$r9)? And why doesn't gdb recognize gd from the code?
What I want is a variable representing ((gd_t *)$r9) so that I can write gd->ram_top or gd->env_has_init, etc., depending on the value I'm curious about.
This comes from a comment by #ssbssa on my question (thanks #ssbssa). I asked him to post it as an answer, but he hasn't, so I am adding the answer myself.
You can define a macro, for example macro define gd ((gd_t *)$r9), and then use p /x gd->reloc_off or p /x gd->flags. You can also put this macro define command in a command file and start gdb like this: arm-none-eabi-gdb u-boot -x gdb_command.

Getting caller function name from the function address

I printed the address of a function in U-Boot by adding the following print:
printf("initcall: %pS \n", (char *)*init_fnc_ptr - reloc_ofs);
The debug print produced the following line. Is there any way to find the function name from the function address?
initcall: 80809c05
When U-Boot is built, a file u-boot.map is written. You can look up the addresses of the functions (before relocation) there.
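The lookup can be scripted. Below is a rough Python sketch, assuming u-boot.map uses the usual GNU ld layout in which a symbol appears on a line of the form "address name". The helper names, the sample map excerpt, and the symbol names in it are all made up for illustration. It also assumes the odd address (80809c05) is a Thumb function pointer with bit 0 set, which is why that bit is cleared before the search.

```python
import bisect
import re

# Matches map lines that consist only of an address and a symbol name
SYMBOL_LINE = re.compile(r"^\s+0x([0-9a-fA-F]+)\s+([A-Za-z_.$][\w.$]*)\s*$")


def load_symbols(map_text):
    """Return a sorted list of (address, name) pairs found in the map text."""
    symbols = []
    for line in map_text.splitlines():
        match = SYMBOL_LINE.match(line)
        if match:
            symbols.append((int(match.group(1), 16), match.group(2)))
    symbols.sort()
    return symbols


def lookup(symbols, addr, thumb=True):
    """Return (name, offset) of the closest symbol at or below addr, else None."""
    if thumb:
        addr &= ~1  # assumption: bit 0 only marks Thumb mode
    addresses = [a for a, _ in symbols]
    index = bisect.bisect_right(addresses, addr) - 1
    if index < 0:
        return None
    base, name = symbols[index]
    return name, addr - base


# Hypothetical excerpt; real symbol names come from your own u-boot.map
SAMPLE_MAP = """
 .text.example_init_a
                0x0000000080809c04       0x3c common/example.o
                0x0000000080809c04                example_init_a
 .text.example_init_b
                0x0000000080809c40       0x20 common/example.o
                0x0000000080809c40                example_init_b
"""

print(lookup(load_symbols(SAMPLE_MAP), 0x80809C05))  # ('example_init_a', 0)
```

For a real build you would pass the contents of u-boot.map instead of SAMPLE_MAP, for example load_symbols(open("u-boot.map").read()).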

x86 assembly read/write to file

I'm trying to write to a file. So far I have the following code to append to the file. Does anybody know why it isn't working? It runs fine, but at the end nothing has changed.
filewritemode: .asciz "a"
filelocation: .asciz "/h/test.txt"
_main:
push $filelocation
push $filewritemode
call _fopen
push $blabla
push %eax
call _fprintf
push $result
call _printf
push $0
call _exit # exit the program
gcc is used to turn the source file into an .exe.
$blabla is currently a string of random characters meant for testing.
It doesn't work because you pushed the parameters of fopen in the wrong order. Parameters must be pushed from last to first, so the mode string has to be pushed before the file name. Apart from that, you keep pushing parameters but never remove them again. That happens to work here because you jump straight to exit, but if you had returned with the ret instruction instead, the program would crash, because it would jump to one of the pushed parameters.
You are passing the arguments to fopen reversed, and you are not checking for errors. In situations like this, ltrace may be your friend.
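To make the ordering rule concrete, here is a toy Python model of a 32-bit cdecl call. It is not real calling-convention code, just a list standing in for the stack, and args_seen_by_callee is a hypothetical helper. The value pushed last ends up closest to the return address, so the callee reads it as its first argument.

```python
def args_seen_by_callee(pushes):
    """Given values in the order they were pushed, return them as argument 1, 2, ..."""
    # The most recent push is on top of the stack, so it becomes argument 1
    return list(reversed(pushes))


# Order used in the question: file name pushed first, mode pushed last
wrong = args_seen_by_callee(["/h/test.txt", "a"])
# Corrected order: last parameter (mode) pushed first
right = args_seen_by_callee(["a", "/h/test.txt"])

print(wrong)  # ['a', '/h/test.txt'] -> fopen(path="a", mode="/h/test.txt")
print(right)  # ['/h/test.txt', 'a'] -> fopen(path="/h/test.txt", mode="a")
```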

Should I unescape bytea field in C-function for Postgresql and if so - how to do it?

I am writing my own C function for PostgreSQL that takes a bytea parameter. The function is defined as follows:
CREATE OR REPLACE FUNCTION putDoc(entity_type int, entity_id int,
doc_type text, doc_data bytea) RETURNS text
AS 'proj_pg', 'call_putDoc'
LANGUAGE C STRICT;
My function call_putDoc, written in C, reads doc_data and passes it to another function such as file_magic to determine the MIME type of the data, and then passes the data to the appropriate file converter.
I call this PostgreSQL function from a PHP script that loads the file content into the last parameter, so I have to pass the file contents through pg_escape_bytea.
When the data is passed to the call_putDoc C function, is it already unescaped, and if not, how do I unescape it?
Edit: As I found, no, the data passed to the C function is not unescaped. How do I unescape it?
When it comes to programming C functions for PostgreSQL, the documentation explains some of the basics, but for the rest it's usually down to reading the source code for the PostgreSQL server.
Thankfully the code is usually well structured and easy to read. I wish it had more doc comments though.
Some vital tools for navigating the source code are either:
A good IDE; or
The find and git grep commands.
In this case, after having a look, I think your bytea argument is being decoded, at least in Pg 9.2; it's possible (though rather unlikely) that 8.4 behaved differently. The server should do that automatically before calling your function, and I suspect you have a programming error in how you are calling your putDoc function from SQL. Without sources it's hard to say more.
Try calling putDoc from psql with some sample data you've verified is correctly escape-encoded for your 8.4 server
Try setting a breakpoint in byteain to make sure it's called before your function
Follow through the steps below to verify that what I've said applies to 8.4.
Set a breakpoint in your function and step through with gdb, using the print function as you go to examine the variables. There are lots of gdb tutorials that'll teach you the required break, backtrace, cont, step, next, print, etc commands, so I won't repeat all that here.
As for what's wrong: You could be double-encoding your data - for example, given your comments I'm wondering if you've base64 encoded data and passed it to Pg with bytea_output set to escape. Pg would then decode it ... giving you a bytea containing the bytea representation of the base64 encoding of the bytes, not the raw bytes themselves. (Edit Sounds like probably not based on comments).
For correct use of bytea see:
http://www.postgresql.org/docs/current/static/runtime-config-client.html
http://www.postgresql.org/docs/current/static/datatype-binary.html
To say more I'd need source code.
Here's what I did:
A quick find -name bytea\* in the source tree locates src/include/utils/bytea.h. A comment there notes that the function definitions are in utils/adt/varlena.c, which turns out to actually be src/backend/utils/adt/varlena.c.
In bytea.h you'll also notice the definition of the bytea_output GUC parameter, which is what you see when you SHOW bytea_output or SET bytea_output in psql.
Let's have a look at a function we know does something with bytea data, like bytea_substr, in varlena.c. It's so short I'll include one of its declarations here:
Datum
bytea_substr(PG_FUNCTION_ARGS)
{
PG_RETURN_BYTEA_P(bytea_substring(PG_GETARG_DATUM(0),
PG_GETARG_INT32(1),
PG_GETARG_INT32(2),
false));
}
Many of the public functions are wrappers around private implementation, so the private implementation can be re-used with functions that have differing arguments, or from other private code too. This is such a case; you'll see that the real implementation is bytea_substring. All the above does is handle the SQL function calling interface. It doesn't mess with the Datum containing the bytea input at all.
The real implementation bytea_substring follows directly below the SQL interface wrappers in this particular case, so read on in varlena.c.
The implementation doesn't seem to refer to the bytea_output GUC, and basically just calls DatumGetByteaPSlice to do the work after handling some border cases. git grep DatumGetByteaPSlice shows us that DatumGetByteaPSlice is in src/include/fmgr.h, and is a macro defined as:
#define DatumGetByteaPSlice(X,m,n) ((bytea *) PG_DETOAST_DATUM_SLICE(X,m,n))
where PG_DETOAST_DATUM_SLICE is
#define PG_DETOAST_DATUM_SLICE(datum,f,c) \
pg_detoast_datum_slice((struct varlena *) DatumGetPointer(datum), \
(int32) (f), (int32) (c))
so it's just detoasting the datum and returning a memory slice. This leaves me wondering: has the decoding been done elsewhere, as part of the function call interface? Or have I missed something?
A look at byteain, the input function for bytea, shows that it's certainly decoding the data. Set a breakpoint in that function and it should trip when you call your function from SQL, showing that the bytea data is really being decoded.
For example, let's see if byteain gets called when we call bytea_substr with:
SELECT substring('1234'::bytea, 2,2);
In case you're wondering how substring(bytea) gets turned into a C call to bytea_substr, look at src/include/catalog/pg_proc.h for the mappings.
We'll start psql and get the pid of the backend:
$ psql -q regress
regress=# select pg_backend_pid();
pg_backend_pid
----------------
18582
(1 row)
then in another terminal connect to that pid with gdb, set a breakpoint, and continue execution:
$ sudo -u postgres gdb -q -p 18582
Attaching to process 18582
... blah blah ...
(gdb) break bytea_substr
Breakpoint 1 at 0x6a9e40: file varlena.c, line 1845.
(gdb) cont
Continuing.
In the 1st terminal we execute in psql:
SELECT substring('1234'::bytea, 2,2);
... and notice that it hangs without returning a result. Good. That's because we tripped the breakpoint in gdb, as you can see in the 2nd terminal:
Breakpoint 1, bytea_substr (fcinfo=0x1265690) at varlena.c:1845
1845 PG_RETURN_BYTEA_P(bytea_substring(PG_GETARG_DATUM(0),
(gdb)
A backtrace with the bt command doesn't show byteain in the call path; everything above bytea_substr is SQL function call machinery. So Pg is decoding the bytea before it passes it to bytea_substr.
You can now detach the debugger with quit. This won't quit the Pg backend, only detach and quit the debugger.
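As a rough picture of what the decoding done by byteain involves, here is a simplified Python model of the two textual bytea formats described in the datatype-binary page linked above (hex and escape). It is a sketch for understanding, not the server's code: decode_bytea_text is a hypothetical name, and treating non-backslash characters as UTF-8 is an assumption.

```python
def decode_bytea_text(text):
    """Decode the textual form of a bytea value into raw bytes (simplified model)."""
    # Hex format: "\x" followed by pairs of hex digits
    if text.startswith("\\x"):
        return bytes.fromhex(text[2:])

    # Escape format: "\\" is one backslash, "\ooo" is one byte given in octal
    out = bytearray()
    i = 0
    while i < len(text):
        ch = text[i]
        if ch != "\\":
            out.extend(ch.encode("utf-8"))  # assumption: client text is UTF-8
            i += 1
        elif text[i + 1:i + 2] == "\\":
            out.append(0x5C)
            i += 2
        elif (len(text) >= i + 4
              and text[i + 1] in "0123"
              and text[i + 2] in "01234567"
              and text[i + 3] in "01234567"):
            out.append(int(text[i + 1:i + 4], 8))
            i += 4
        else:
            raise ValueError("invalid input syntax for type bytea")
    return bytes(out)


print(decode_bytea_text("\\000\\377"))   # b'\x00\xff'
print(decode_bytea_text("\\x31323334"))  # b'1234'
```

If the bytes your C function receives look like the left-hand side of these examples rather than the right-hand side, the value was probably escaped one time too many before it reached the server.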

Checking when a variable is modified

Using Valgrind or any other debugger on Linux, how can one see the places where a variable is modified? I am using gcc. Note that I don't want to step through the code in gdb. I just want to run the program and have the debugger report, at the end, the places in the code where the variable was modified.
Hm, thinking about it, this is not an exact duplicate of "Can I set a breakpoint on 'memory access' in GDB?", because it asks for a little more. So:
Use gdb
Find the address you want to watch (hardware watchpoints only work on an address, so you have to run to the point where the variable or object is instantiated, take its address, and use the watch command on that address).
Attach commands to the watchpoint to give you a backtrace (or any other info you need to collect) and continue.
So you'll have something like:
p &variable
watch *$
commands
bt
c
end
(In gdb, $ is the most recent value in the history and $$ is the one before it, so $ is what refers to the address just printed. You can also use the $n printed by the p command.)
Use Breakpoint Command Lists to do this in gdb. You will need to know the address of the variable to watch. Set a watchpoint with a series of commands like this:
watch *0xfeedface
commands
silent
bt
cont
end
You can also optionally save all this output to a log file. See the gdb documentation for more details.
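To run this unattended, the commands can be put into a gdb command file. The sketch below is a small Python helper that only builds the text of such a file. make_watch_script, the log file name, and the file names in the usage note are hypothetical. It assumes a gdb where "set logging on" is accepted; newer releases prefer "set logging enabled on".

```python
def make_watch_script(expression, log_file="watch.log"):
    """Build the text of a gdb command file that logs a backtrace on every change."""
    lines = [
        "set pagination off",            # never stop to ask for <return>
        f"set logging file {log_file}",  # where the report ends up
        "set logging on",                # newer gdb: set logging enabled on
        f"watch {expression}",
        "commands",
        "silent",
        "bt",
        "cont",
        "end",
        "run",
    ]
    return "\n".join(lines) + "\n"


print(make_watch_script("*(int *)0xfeedface"))
```

Saving the output as watch.gdb and starting gdb with gdb -batch -x watch.gdb ./your_program should leave one backtrace per modification in the log file once the program has finished.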
