Assign string stored in memory into a GDB variable - c

When debugging a C program, how can I assign a string (array of bytes terminated by \0 byte) stored at some known memory location into a GDB convenience variable?
E.g.:
There is a string "hello_world" stored at memory location 0xAAAAAAAA. How can I store the string in a GDB variable $string_variable using that memory location? Using (gdb) set $string_variable = (char *) 0xAAAAAAAA stores the address and not the string itself.

A string convenience variable in GDB is an array of char:
(gdb) set $foo = "bbb"
(gdb) ptype $foo
type = char [4]
Using GDB's CLI, I can't find a straightforward way to create a string convenience variable from an address of a NUL-terminated string of bytes in the debuggee.
What does work is to use GDB's Python extension to get a gdb.Value from the debuggee, then convert it to a string:
(gdb) python gdb.set_convenience_variable("string_variable", gdb.parse_and_eval("(char *)0x555555556011").string())
(gdb) ptype $string_variable
type = char [12]
(gdb) p $string_variable
$3 = "hello_world"
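The char [12] type follows from the string's length: 11 characters plus the terminating NUL. As a rough C analogue of that conversion (an illustration only, not how GDB implements it), the sketch below reads bytes up to the terminator and copies them into a new array of strlen + 1 bytes. The name copy_c_string is hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: duplicate the NUL-terminated string at `addr`
 * into a freshly allocated array of strlen(addr) + 1 bytes.
 * Stores the array size (terminator included) in *size_out.
 * Returns NULL if allocation fails; the caller frees the result. */
char *copy_c_string(const char *addr, size_t *size_out)
{
    size_t size = strlen(addr) + 1;   /* +1 for the trailing '\0' */
    char *copy = malloc(size);
    if (copy == NULL)
        return NULL;
    memcpy(copy, addr, size);         /* copies the terminator too */
    if (size_out != NULL)
        *size_out = size;
    return copy;
}
```

Called on "hello_world", this reports a size of 12, matching the char [12] that GDB shows.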

Related

Resolving memory address in gdb indirection

After compiling with -g to get debug info in a program and running in gdb, I can do the following to print the argument vector:
>>> p __libc_argv
$2 = (char **) 0x7fffffffe9f8
>>> p __libc_argv[0]
$3 = 0x7fffffffec63 "./sample.out"
My question is two-fold:
Why don't __libc_argv[0] and __libc_argv produce the same memory address? Does gdb do some sort of interpretation in the background?
How could I get the memory address of 0x7fffffffec63 from the above? For example:
>>> p __libc_argv
$2 = (char **) 0x7fffffffe9f8
>>> x/s 0x7fffffffec63 <-- how do I figure out this memory address value?
0x7fffffffec63: "./sample.out"
__libc_argv is a pointer to an array of pointers. __libc_argv[0] is the contents of the first element of that array. There's no reason why they should be the same, unless you first did __libc_argv[0] = __libc_argv for some reason. But that wouldn't be reasonable, since the elements of __libc_argv should be pointers to strings, not pointers to arrays.
On the other hand, __libc_argv == &__libc_argv[0] and *__libc_argv == __libc_argv[0].
To get the address you want, just indirect through __libc_argv.
>>> x/s *__libc_argv
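The two identities can also be checked in plain C. This is a minimal sketch; the function names are hypothetical, and vec stands in for __libc_argv.

```c
#include <stddef.h>

/* Hypothetical check: returns 1 when both identities hold for `vec`,
 * a pointer to the first element of an array of string pointers. */
int argv_identities_hold(char **vec)
{
    int same_array_address = (vec == &vec[0]);  /* address of element 0 */
    int same_first_string  = (*vec == vec[0]);  /* contents of element 0 */
    return same_array_address && same_first_string;
}

/* Returns 1 when the array's own address differs from the address
 * stored in its first element, as in the GDB session above. */
int array_differs_from_first_string(char **vec)
{
    return (void *)vec != (void *)vec[0];
}
```

The first function returns 1 for any valid vector. The second returns 1 whenever the pointer array and the first string are separate objects, which is the normal case for an argument vector.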

gdb printing array/strings with a certain offset

I have the following defined:
strings: .asciz "Once\n", "upon\n", "a\n", "time\n", "...\n", ""
And I can see the label is stored at the following memory address, 600109:
>>> info va strings
Non-debugging symbols:
0x0000000000600109 strings
I can print this as:
>>> x/s 0x0000000000600109
0x600109: "Once\n"
>>> x/s 0x0000000000600109+6
0x60010f: "upon\n"
# etc...
Or referencing the variable to get the first string:
>>> x/s &strings
0x600109: "Once\n"
How do I do proper offsets in gdb to do addition on the memory address -- for example, to be able to do x/s &strings+6 to get the value "upon\n"?
What would be the correct way to do the following?
>>> x/s &strings+6
# Cannot perform pointer math on incomplete type "<data variable, no debug info>", try casting to a known type, or void *.
You need to cast this to (void *) to be able to do addition/subtraction on the memory address:
>>> x/s 6 + (void *) &strings
0x60010f: "upon\n"
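The same idea in C, as a sketch: arithmetic on void * is a GNU extension (GDB accepts it too), so portable C casts to a byte pointer first, and then "+ 6" means six bytes. The helper nth_packed_string is a hypothetical name. It walks strings stored back to back the way .asciz lays them out.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return the n-th (0-based) string in a blob of
 * NUL-terminated strings stored back to back, like the .asciz data.
 * The caller must ensure the blob really holds at least n + 1 strings. */
const char *nth_packed_string(const void *base, size_t n)
{
    /* Cast to a byte pointer so that "+ k" means "k bytes further on". */
    const char *p = (const char *)base;
    while (n-- > 0)
        p += strlen(p) + 1;   /* skip the string and its terminator */
    return p;
}
```

With the data from the question, index 1 lands 6 bytes past the start ("Once\n" is 5 characters plus the NUL), which is the "upon\n" string.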

Mechanism of initializing an array with a string constant in C

Does the definition:
char arr_of_chars[] = "hello world";
create a constant character array (null terminated) somewhere in memory, and then copy the content of that array to arr_of_chars, or does it directly assign it to arr_of_chars?
What exactly is the mechanism that works here?
What you're asking is not specified by C. In a nutshell, C is specified in terms of an abstract machine and its observable behavior. In this case, this means all you know is there is an array variable arr_of_chars initialized from a string literal.
When talking about segments, copying, etc, you're already talking about concrete implementations of C and what they're doing. Assuming your arr_of_chars is at file scope and given a target machine/system that knows binaries with data segments, it would be possible for a C compiler to put the initialized array directly in a data segment -- the observable behavior would be no different from an approach where the runtime first copies the bytes to your array.
"...creates a constant character array (null terminated) somewhere in memory, and then copies the content of that array to arr_of_chars"
Indeed. The string literal "hello world" is stored somewhere in the .rodata section of the program, unless the compiler managed to optimize it away entirely (depends on your array's scope). From there it is copied into your array.
This will create a null terminated string hello world\0 in the const segment.
In the main function this string will be copied to the character array.
Let me highlight a few lines from the assembly output to clarify this.
PUBLIC ??_C@_0M@LACCCNMM@hello?5world?$AA@
This creates a public token.
CONST SEGMENT
??_C@_0M@LACCCNMM@hello?5world?$AA@ DB 'hello world', 00H
CONST ENDS
This assigns the constant null terminated string to the token.
lea rax, QWORD PTR arr_of_chars$[rbp]
lea rcx, OFFSET FLAT:??_C@_0M@LACCCNMM@hello?5world?$AA@
mov rdi, rax ; Set destination to stack location
mov rsi, rcx ; Set source to public token
mov ecx, 12 ; Set counter to number of bytes to copy
rep movsb ; Copy one byte from source to destination, advance both, repeat ecx times
This sets up the source and destination and copies 12 bytes, one at a time: the 11 characters of "hello world" plus the null terminator. The destination is a location on the stack and the source is the public token.
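Whatever mechanism a given compiler uses, the observable result can be checked from C itself. The sketch below uses hypothetical function names and shows that the array holds 12 bytes and is a private, writable copy.

```c
#include <stddef.h>
#include <string.h>

/* Size of an array initialized from "hello world": 11 characters
 * plus the terminating '\0'. */
size_t initialized_array_size(void)
{
    char arr_of_chars[] = "hello world";
    return sizeof arr_of_chars;
}

/* Modify one array and report whether a second array initialized from
 * the same literal is still intact (returns 1 if it is). */
int arrays_are_independent(void)
{
    char first[]  = "hello world";
    char second[] = "hello world";
    first[0] = 'J';                       /* writable: it is our own array */
    return strcmp(first, "Jello world") == 0
        && strcmp(second, "hello world") == 0;
}
```

The size of 12 matches the counter loaded into ecx in the listing above.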
This comes down to how strings are stored in C.
Strings can be stored in the following ways:
Strings as character arrays
Strings using character pointers
When strings are declared as character arrays, they are stored like other types of arrays in C. For example, if str[] is an auto variable, the string is stored in the stack segment; if it is a global or static variable, it is stored in the data segment.
Ex.
char str[] = "Hello_world";
When storing strings using character pointers, there are two options:
Read only string in a shared segment.
Ex.
char *str = "Hello_World";
In the above line, "Hello_World" is stored in a shared read-only location, but the pointer str is stored in read-write memory. You can change str to point to something else, but you cannot change the characters it currently points to. So this kind of string should only be used when the string does not need to be modified later in the program.
Dynamically allocated in heap segment.
char *str = NULL;
int size = 6;
str = (char *) malloc(sizeof(char)*size);
*(str+0) = 'H';
*(str+1) = 'E';
*(str+2) = 'L';
*(str+3) = 'L';
*(str+4) = 'O';
*(str+5) = '\0';

GDB debugger : char array type examine and print command

In a C program I have declared a buffer of characters: char buffer_in[500];
When I run this program step by step in GDB, I inspect the buffer with these commands:
(gdb) ptype buffer_in
type = char [500]
(gdb) ptype &buffer_in
type = char (*)[500]
(gdb) p &buffer_in
$9 = (char (*)[500]) 0x7fffffffdb60
(gdb) x buffer_in
0x7fffffffdb60: 0x2e
(gdb) x &buffer_in
0x7fffffffdb60: 0x2e
In C, if I declare an array of characters, the object is referenced like a pointer. If &buffer_in is the address of the first element of the array, why is the output of x buffer_in the same as that of x &buffer_in? I would expect x buffer_in to try to examine address 0x2e, which would be a wrong reference.
Thanks
So, gdb's x command expects a memory address - the command's purpose is to dump some memory in hex. If you give it an array variable, it will assume you mean you want it to dump starting at the address the array is stored at. If you give it a pointer to an array variable, it will assume you mean you want it to dump starting at that pointer. Those two are the same - this is much like the way C is actually compiled.
To put a finer point on it,
printf("0x%8.8lX 0x%8.8lX\n", (unsigned long)buffer_in, (unsigned long)&buffer_in);
prints the same number twice. So you'd expect gdb to dump the same byte from memory when asked to dump each address expression.
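The two expressions share an address but not a type, which is what ptype showed in the question (char [500] versus char (*)[500]). As a sketch, with hypothetical names and a file-scope buffer standing in for buffer_in, the difference shows up as soon as you add 1 to each:

```c
#include <stddef.h>

/* Hypothetical stand-in for buffer_in. */
static char demo_buffer[500];

/* 1 if the array (decayed to a pointer) and &array hold the same address. */
int same_start_address(void)
{
    return (void *)demo_buffer == (void *)&demo_buffer;
}

/* Bytes skipped by adding 1 to the decayed pointer (type char *). */
size_t element_step(void)
{
    return (size_t)((char *)(demo_buffer + 1) - (char *)demo_buffer);
}

/* Bytes skipped by adding 1 to &array (type char (*)[500]). */
size_t array_step(void)
{
    return (size_t)((char *)(&demo_buffer + 1) - (char *)&demo_buffer);
}
```

Here element_step() gives 1 and array_step() gives 500, even though both pointers start at the same address.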

Memory allocation for constant char array

If I write char *p = "Welcome";
I can see the address of p. But what is the address of the string, i.e. at which address is "Welcome" stored?
If I then also write char *s = "Welcome"; will p and s point to the same address?
In a debugger, if you inspect p, you will see the address of the string.
&p is the address of p itself.
And no, p and s are not guaranteed to point to the same address, but they might.
"Welcome" is a string constant and is typically stored in a read-only data section of memory, while the pointer p (if it is a local variable) is created on the stack and points to this string literal.
String constants such as "Welcome" are often placed in the read-only data section of memory.
Here are good explanations about: String literals: where do they go? and data segment
You can find the address of the string constant "Welcome" with:
printf("%p", (void *)p);
If I write again char *s = "Welcome". p and s will point to same address?
Identical string constants may be placed at the same address, or they may not.
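A small sketch of what can and cannot be relied on (the function names are hypothetical). The contents always compare equal and the two pointer variables are always distinct objects, but whether the stored addresses match is left to the compiler.

```c
#include <string.h>

/* 1 if two pointers to identical literals compare equal by content.
 * This always holds. */
int same_contents(void)
{
    const char *p = "Welcome";
    const char *s = "Welcome";
    return strcmp(p, s) == 0;
}

/* 1 if the two pointers hold the same address. The result is
 * unspecified: it depends on whether the compiler merges literals. */
int same_address(void)
{
    const char *p = "Welcome";
    const char *s = "Welcome";
    return p == s;
}

/* 1 if the pointer variables themselves are distinct objects (&p != &s). */
int distinct_pointer_objects(void)
{
    const char *p = "Welcome";
    const char *s = "Welcome";
    return (const void *)&p != (const void *)&s;
}
```

Many compilers merge identical literals, so same_address() often returns 1 in practice, but code should not depend on it.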
