Using personality syscall to make the stack executable - c

I am trying to understand how to make a process stack executable with the personality syscall, so I wrote this code that forks a child process and runs shellcode (which spawns a shell) on the stack. I get a segmentation fault because I don't have execute permission on the stack. What am I doing wrong?
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/personality.h>

int main()
{
    setvbuf(stdout, 0, 2, 0); // 2 == _IONBF: unbuffered stdout
    unsigned char shellcode[] = "\x48\x31\xf6\x56\x48\xbf\x2f\x62\x69\x6e\x2f\x2f\x73\x68\x57\x54\x5f\x6a\x3b\x58\x99\x0f\x05"; // execve("/bin//sh")
    if(personality(READ_IMPLIES_EXEC | ADDR_NO_RANDOMIZE) == -1) // returns 0
    {
        printf("personality failed");
        exit(0);
    }
    int (*ret)() = (int(*)())shellcode;
    if(fork() == 0) // child process
        ret();
    return 0;
}
Compiled with gcc file.c -o file.o
$ uname -r
4.4.179-0404179-generic
$ readelf -l file.o
...
GNU_STACK 0x000000 0x00000000 0x00000000 0x00000 0x00000 RW 0x10

There are two problems:
You cannot change the personality of a process after it has started. Calling personality(READ_IMPLIES_EXEC) does not do anything per se: it just sets the current process's personality value (a simple 32-bit integer) and that's it. For the change to take effect, a new program needs to be executed (e.g. through execve). The currently running program image (and any children forked from it) will not be affected.
Linux will ignore the READ_IMPLIES_EXEC personality flag if the ELF includes a PT_GNU_STACK program header that specifies that the stack should not be executable, which is usually the default choice by compilers.
By default GCC will create ELFs with a PT_GNU_STACK program header whose flags are set to RW and not RWX. In order to have an executable stack, you will have to pass the -z execstack option to GCC when compiling, which will set PT_GNU_STACK to RWX. You can check this with readelf -l your_elf (note: readelf will show E instead of X for program header flags).
Therefore, in your case, gcc -zexecstack -o file file.c should do what you want, and you don't need to call personality() nor fork() really. Just put your shellcode on the stack and jump into it. In theory you could also locate the program headers in the ELF file and edit the flags for PT_GNU_STACK manually (7 = RWX), e.g. using a hex editor, but that'd be more work than needed.
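If you recompile that way and re-run the readelf check from the question, the GNU_STACK entry should now carry the E flag, along these lines (a sketch of the expected output, not taken from the original post):
$ gcc -z execstack -o file file.c
$ readelf -l file
...
GNU_STACK 0x000000 0x00000000 0x00000000 0x00000 0x00000 RWE 0x10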
So, at the end of the day:
I am trying to understand how to make a process stack executable with the personality syscall
You cannot. Personality only affects new executions, not already-running ones, and on top of that, ELF properties such as the PT_GNU_STACK program header supersede personality. You can however re-compile your program as explained above.
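If you really wanted to go the personality route, the canonical trick is to set the personality and then re-execute the same binary so that the flag applies to the freshly loaded image. A minimal sketch of that pattern (my illustration, not part of the original answer, and still subject to the PT_GNU_STACK override described above):
#include <stdio.h>
#include <unistd.h>
#include <sys/personality.h>

int main(int argc, char *argv[], char *envp[])
{
    // Query the current personality without modifying it.
    unsigned long persona = personality(0xffffffff);
    if (!(persona & READ_IMPLIES_EXEC)) {
        personality(persona | READ_IMPLIES_EXEC);
        // Re-exec ourselves; the flag takes effect for the new program image.
        execve("/proc/self/exe", argv, envp);
        perror("execve"); // reached only if execve failed
        return 1;
    }
    printf("READ_IMPLIES_EXEC is set for this execution\n");
    return 0;
}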
NOTE: you can nonetheless use mprotect() to change the permissions of the stack's memory pages to RWX at runtime, provided you can derive a page-aligned stack address (e.g. by taking the address of a local variable in a function and zeroing out the lowest 12 bits).
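For instance, a minimal sketch of that approach (my illustration, assuming 4 KiB pages; a robust version would query the page size with sysconf(_SC_PAGESIZE)):
#include <stdio.h>
#include <stdint.h>
#include <sys/mman.h>

int main(void)
{
    unsigned char code_on_stack[64] = { 0xc3 }; // 0xc3 = ret: returns immediately
    // Round down to the start of the containing page (lowest 12 bits zeroed).
    uintptr_t page = (uintptr_t)code_on_stack & ~(uintptr_t)0xfff;
    // Cover two pages in case the buffer straddles a page boundary.
    if (mprotect((void *)page, 2 * 0x1000, PROT_READ | PROT_WRITE | PROT_EXEC) != 0) {
        perror("mprotect");
        return 1;
    }
    ((void (*)(void))code_on_stack)();
    puts("executed code on the stack");
    return 0;
}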
This is enough information for older kernels like yours (4.4), but since Linux v5.8 the situation is a bit more nuanced. Assuming you are on x86, you can take a look at this comment in the source code for an explanation:
/*
 * An executable for which elf_read_implies_exec() returns TRUE will
 * have the READ_IMPLIES_EXEC personality flag set automatically.
 *
 * The decision process for determining the results are:
 *
 *                 CPU: | lacks NX*  | has NX, ia32     | has NX, x86_64 |
 * ELF:                 |            |                  |                |
 * ---------------------|------------|------------------|----------------|
 * missing PT_GNU_STACK | exec-all   | exec-all         | exec-none      |
 * PT_GNU_STACK == RWX  | exec-stack | exec-stack       | exec-stack     |
 * PT_GNU_STACK == RW   | exec-none  | exec-none        | exec-none      |
 *
 *  exec-all  : all PROT_READ user mappings are executable, except when
 *              backed by files on a noexec-filesystem.
 *  exec-none : only PROT_EXEC user mappings are executable.
 *  exec-stack: only the stack and PROT_EXEC user mappings are executable.
 *
 *  *this column has no architectural effect: NX markings are ignored by
 *   hardware, but may have behavioral effects when "wants X" collides with
 *   "cannot be X" constraints in memory permission flags, as in
 *   https://lkml.kernel.org/r/20190418055759.GA3155@mellanox.com
 *
 */
#define elf_read_implies_exec(ex, executable_stack)	\
	(mmap_is_ia32() && executable_stack == EXSTACK_DEFAULT)

Related

Cannot create anonymous mapping with MAP_32BIT on MacOS

I'm on a 64-bit system, but want to use mmap to allocate pages within the first 2GB of memory. On Linux, I can do this with the MAP_32BIT flag:
#include <sys/mman.h>
#include <stdio.h>

int main() {
    void *addr = mmap(
        NULL,                                    // address hint
        4096,                                    // size
        PROT_READ | PROT_WRITE,                  // permissions
        MAP_32BIT | MAP_PRIVATE | MAP_ANONYMOUS, // flags
        -1,                                      // file descriptor
        0                                        // offset
    );
    if (addr == MAP_FAILED)
        perror("mmap");
    else
        printf("%p", addr);
}
Godbolt link demonstrating that this works on Linux. As of version 10.15, MacOS also allegedly supports the MAP_32BIT flag. However, when I compile and run the program on my system (11.3), it fails with ENOMEM. The mapping does work when MAP_32BIT is removed.
I have a few potential explanations for why this doesn't work, but none of them are very compelling:
The permissions are wrong somehow (although removing either PROT_READ or PROT_WRITE didn't solve it).
I need to specify an address hint for this to work, for some reason.
MacOS (or my version of it) simply doesn't support MAP_32BIT for anonymous mappings.
The problem is the "zero page": on some 32-bit Unixes, the lowest page of memory is commonly kept unmapped so that accesses through NULL pointers can be detected and signal an error. On 64-bit systems, MacOS extends this to the entire first 4 GiB of the address space by default: the executable gets a __PAGEZERO segment covering that range. mmap therefore refuses to place mappings in this region, since it is already reserved by the page-zero segment.
This can be simply changed using a linker option:
$ cc -Wl,-pagezero_size,0x1000 test.c
$ ./a.out
0xb0e5000
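If you want to see the reservation itself, the Mach-O page-zero segment shows up in the binary's load commands; a quick way to eyeball it (the exact formatting varies with the otool version) is the following, where vmsize shows how much address space is reserved:
$ otool -l ./a.out | grep -A2 __PAGEZERO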

bash unable to execute the executable file

I'm trying to compile a C program on my Linux system using the make utility.
This is what happened when I tried to create the .o file:
#make size_of.o
cc -c -o size_of.o size_of.c
The compilation ran correctly, but when I executed the resulting file I got this error:
#./size_of.o
bash: ./size_of.o: cannot execute binary file
Then I ran make again, without the .o suffix:
#make size_of
cc size_of.o -o size_of
This time the compiling and executing worked as I expected.
Is there a problem with the program, or can you tell me what's wrong?
How can I fix this problem, and is there some difference between executable files in C?
This is the program:
#include <stdio.h>

int main(void) {
    printf("char %zu bytes\n", sizeof(char));
    printf("short %zu bytes\n", sizeof(short));
    printf("int %zu bytes\n", sizeof(int));
    printf("long %zu bytes\n", sizeof(long));
    printf("float %zu bytes\n", sizeof(float));
    printf("double %zu bytes\n", sizeof(double));
    printf("long double %zu bytes\n", sizeof(long double));
    return 0;
}
and this is the output:
char 1 bytes
short 2 bytes
int 4 bytes
long 4 bytes
float 4 bytes
double 8 bytes
long double 12 bytes
.o files are object files, not executables. You specifically told the compiler to create only an object file, because you used the -c flag. You don't run object files; they feed into a linker (along with other things) to create the executable file.
The general (simplified) process is:
Phase
-----
                 +---------+
                 | main.c  |  (source)
                 +---------+
                      |
Compile...............|.......................
                      |
                      V
                 +---------+       +-----------+
                 | main.o  |       | libs etc. |  (object)
                 +---------+       +-----------+
                      |                  |
Link..................|..................|.....
                      |                  |
                      +---------+--------+
                                |
                                V
                           +---------+
                           |  main   |  (executable)
                           +---------+
You fix that either by turning the object file into an executable, as you've done later in the process, though I would do it as:
cc -o size_of size_of.o
Or simply create the executable directly from the source file:
cc -o size_of size_of.c
And, if you're using make, make sure you have an actual Makefile. Otherwise, you get default rules which may not be what you want. It could be as simple as:
size_of: size_of.c Makefile
	gcc -o size_of size_of.c
In the first make invocation you are setting the make target to be an object file (the target has a .o extension). The built-in make rule for object files just compiles and assembles them (no linking), so an object file is what you get.
The second invocation is actually asking make to build an executable file.
GNU make has a set of built-in rules for different targets. Please see this link for details:
https://www.gnu.org/software/make/manual/html_node/Catalogue-of-Rules.html
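Roughly, the two built-in rules involved look like this (a simplified sketch of what that page documents, not the verbatim definitions):
# compile a .c source into a .o object file ("make size_of.o" used this)
%.o: %.c
	$(CC) $(CPPFLAGS) $(CFLAGS) -c $< -o $@

# link an object file into an executable ("make size_of" used this)
%: %.o
	$(CC) $(LDFLAGS) $^ $(LOADLIBES) $(LDLIBS) -o $@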

Linux: is it possible to share code between processes?

I wonder: is it possible for a Linux process to call code located in the memory of another process?
Let's say we have a function f() in process A and we want process B to call it. What I thought about is using mmap with the MAP_SHARED and PROT_EXEC flags to map the memory containing the function's code and passing the pointer to B, assuming that f() will not call any other function from A's binary. Will it ever work? If yes, then how do I determine the size of f() in memory?
=== EDIT ===
I know, that shared libraries will do exactly that, but I wonder if it's possible to dynamically share code between processes.
Yes, you can do that, but the first process must first have created the shared memory via mmap, backed either by a memory-mapped file or by a shared area created with shm_open.
If you are sharing compiled code then that's what shared libraries were created for. You can link against them in the ordinary way and the sharing will happen automatically, or you can load them manually using dlopen (e.g. for a plugin).
Update:
As the code has been generated by a compiler, you will have relocations to worry about. The compiler does not produce code that will Just Work anywhere. It expects that the .data section is in a certain place and that the .bss section has been zeroed. The GOT will need to be populated. Any static constructors will have to be called.
In short, what you want is probably dlopen. This facility allows you to open a shared library as if it were a file and then extract function pointers by name. Each program that dlopens the library will share the code sections, thus saving memory, but each will have its own copy of the data section, so they do not interfere with each other.
Beware that you need to compile your library code with -fPIC or else you won't get any code sharing either (actually, the linkers and dynamic loaders for many architectures probably don't support libraries that aren't PIC anyway).
The standard approach is to put the code of f() in a shared library libfoo.so. Then you can either link against that library (e.g. by building program A with gcc -Wall a.c -lfoo -o a.bin), or load it dynamically (e.g. in program B) using dlopen(3) and then retrieve the address of f using dlsym.
When you compile a shared library you want to:
compile each source file foo1.c with gcc -Wall -fPIC -c foo1.c -o foo1.pic.o into position independent code, and likewise foo2.c into foo2.pic.o
link all of them into libfoo.so with gcc -Wall -shared foo*.pic.o -o libfoo.so; notice that you can link additional shared libraries into libfoo.so (e.g. by appending -lm to the linking command)
See also the Program Library Howto.
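For the dlopen route, here is a minimal consumer sketch (my illustration; it assumes f has the signature int f(void) and that libfoo.so sits in the current directory):
#include <stdio.h>
#include <dlfcn.h>

int main(void)
{
    // Load the library and resolve all symbols now.
    void *handle = dlopen("./libfoo.so", RTLD_NOW);
    if (!handle) {
        fprintf(stderr, "dlopen: %s\n", dlerror());
        return 1;
    }
    // Look up f by name and call it through the pointer.
    int (*f)(void) = (int (*)(void))dlsym(handle, "f");
    if (!f) {
        fprintf(stderr, "dlsym: %s\n", dlerror());
        return 1;
    }
    printf("f() = %d\n", f());
    dlclose(handle);
    return 0;
}
Build it with gcc -Wall b.c -ldl -o b.bin (older glibc needs -ldl; recent glibc has folded dlopen into libc).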
You could play insane tricks by mmap-ing another process's /proc/1234/mem, but that is not reasonable at all. Use shared libraries.
PS. You can dlopen a great many (hundreds of thousands of) shared objects, i.e. lib*.so files; you may want to dlclose them (but practically you don't have to).
It would be possible to do so, but that's exactly what shared libraries are for.
Also, beware that you need to check that the address of the shared memory is the same in both processes, otherwise any references that are "absolute" (that is, a pointer to something in the shared code) will point to the wrong place. And as with shared libraries, the bitness of the code has to be the same, and as with all shared memory, you need to make sure you don't "mess up" things for the other process if you modify any of the shared memory.
Determining the size of a function ranges from "hard" to "nearly impossible", depending on the actual code generated and the level of information you have available. Debug symbols will include the size of a function, but beware: I have seen compilers generate code where two functions share the same "return" piece of code (that is, the compiler generates a jump into another function that has the same bit of code to return the result, because it saves a few bytes, and there was already going to be a jump anyway, e.g. an if/else the compiler has to jump around).
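For example, when the toolchain does record symbol sizes, you can ask the symbol table; for the foo() used in the demonstration further down, this would look something like the following (illustrative output; the second column is the size in bytes, 0x18 = 24):
$ nm -S ./A | grep ' foo$'
000000000000088a 0000000000000018 T foo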
not directly
that's what shared libraries are for
relocations
Oh no! Anyways...
Here's the insane, unreasonable, not-good, purely academic demonstration of this capability. It was fun for me, I hope it's fun for you.
Overview
Program A will use shm_open to create a shared memory object, and mmap to map it into its memory space. Then it will copy some code from a function defined in A into the shared memory. Then program B will open up the shared memory, execute the function and, just for kicks, make a very simple modification to the code. Then A will execute the code to demonstrate that the change took effect.
Again, this is no recommendation for how to solve a problem, it's an academic demonstration.
// A.c
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>

int foo(int y) {
    int x = 14;
    return x + y;
}

int main(int argc, char *argv[]) {
    const size_t mem_size = 0x1000;
    // create shared memory objects
    int shared_fd = shm_open("foobar2", O_RDWR | O_CREAT, 0777);
    ftruncate(shared_fd, mem_size);
    void *shared_mem =
        mmap(NULL, mem_size, PROT_READ | PROT_WRITE | PROT_EXEC, MAP_SHARED, shared_fd, 0);
    // copy function to shared memory
    const size_t fn_size = 24;
    memcpy(shared_mem, &foo, fn_size);
    // wait
    getc(stdin);
    // execute the shared function
    int (*shared_foo)(int) = shared_mem;
    printf("shared_foo(3) = %d\n", shared_foo(3));
    // clean up
    shm_unlink("foobar2");
}
Note the use of PROT_READ | PROT_WRITE | PROT_EXEC in the call to mmap. This program is compiled with
gcc A.c -lrt -o A
The constant fn_size was determined by looking at the output of objdump -dj .text A
...
000000000000088a <foo>:
 88a:	55                   	push   %rbp
 88b:	48 89 e5             	mov    %rsp,%rbp
 88e:	89 7d ec             	mov    %edi,-0x14(%rbp)
 891:	c7 45 fc 0e 00 00 00 	movl   $0xe,-0x4(%rbp)
 898:	8b 55 fc             	mov    -0x4(%rbp),%edx
 89b:	8b 45 ec             	mov    -0x14(%rbp),%eax
 89e:	01 d0                	add    %edx,%eax
 8a0:	5d                   	pop    %rbp
 8a1:	c3                   	retq
...
I think that's 24 bytes, I dunno. I guess I could put anything larger than that and it would do the same thing. Anything shorter and I'll probably get an exception from the processor. Also, note that the value of x from foo (14, that's (apparently) 0e 00 00 00 in LE) is located at foo + 10. This will be the constant x_offset in program B.
// B.c
#include <stdio.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>

const int x_offset = 10;

int main(int argc, char *argv[]) {
    // open the shared memory object created by A
    int shared_fd = shm_open("foobar2", O_RDWR | O_CREAT, 0777);
    void *shared_mem = mmap(NULL, 0x1000, PROT_EXEC | PROT_WRITE, MAP_SHARED, shared_fd, 0);
    // execute the shared function
    int (*shared_foo)(int) = shared_mem;
    int z = shared_foo(13);
    printf("result: %d\n", z);
    // patch the constant x (14) inside the shared code
    int *x_p = (int *)((char *)shared_mem + x_offset);
    *x_p = 100;
    shm_unlink("foobar2"); // must match the name used in A
}
Anyways first I run A, then I run B. The output of B is:
result: 27
Then I go back to A and push enter, then I get:
shared_foo(3) = 103
Good enough for me.
/dev/shm/foobar2
To completely eliminate the mystique of all this, after running A you can do something like
xxd /dev/shm/foobar2 | vim -
Then, edit that constant 0e 00 00 00 just like before, then save the file with the ol'
:w !xxd -r > /dev/shm/foobar2
and push enter in A and see similar results as above.

How can the exit status of a process depend on whether it's statically built?

A modern system:
% pacman -Q glibc gcc
glibc 2.16.0-4
gcc 4.7.1-6
% uname -sr
Linux 3.5.4-1-ARCH
A trivial program:
% < wtf.c
void main(){}
Let's do static and dynamic builds:
% gcc -o wtfs wtf.c -static
% gcc -o wtfd wtf.c
Everything looks fine:
% file wtf?
wtfd: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.32, BuildID[sha1]=0x4b421af13d6b3ccb6213b8580e4a7b072b6c7c3e, not stripped
wtfs: ELF 64-bit LSB executable, x86-64, version 1 (GNU/Linux), statically linked, for GNU/Linux 2.6.32, BuildID[sha1]=0x1f2a9beebc0025026b89a06525eec5623315c267, not stripped
Could anybody explain this to me?
% for n in $(seq 1 10); do ./wtfd; echo $?; done | xargs
0 0 0 0 0 0 0 0 0 0
% for n in $(seq 1 10); do ./wtfs; echo $?; done | xargs
128 240 48 128 128 32 64 224 160 48
Sure, one can use int main(). And -Wmain will issue a warning (return type of ‘main’ is not ‘int’).
I'd just like to understand what is going on there.
That's EXACTLY the point.
There is no "void main()". There is ALWAYS a result value, and if you don't return one and don't do anything in your program, the return value is what happens to be in the appropiate register at program start (or specifically, whatever happens to be there when main is called from the startup code). Which can certainly depend on what the program is doing before main, such as dealing with shared libs.
EDIT: to get an idea how this can happen, try this:
int foo(void)
{
    return 55;
}

void main(void)
{
    foo();
}
There is no guarantee, of course, but there's a good chance that this program will have an exit code of 55, simply because that's the last value returned by some function. Just imagine that call happened before main.
To further illustrate what Christian is saying: even though you declared void main(), your process will return whatever value was previously in eax (since you are on Linux on the x86 architecture).
void main() {
    asm("movl $55, %eax");
}
So now it always returns 55, because the above code explicitly initializes eax.
$ cc rval.c
$ ./a.out
$ echo $?
55
Again, this example will only work on the current major OSes, since I am assuming the calling convention. There is no reason an OS could not use a different calling convention, with the return value somewhere else (RAM, a register, whatever).
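For completeness: give main its standard signature and the exit status becomes deterministic for both builds (the obvious fix, sketched here for reference):
#include <stdio.h>

int main(void)
{
    return 0; // well-defined exit status, static or dynamic
}
$ gcc -o fixed fixed.c -static && ./fixed; echo $?
0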

Can I find a structure with certain size in c source?

I'm trying to debug a core dump (mainly using gdb), and all I've found out so far is that there is a structure of exactly 124 bytes that is causing problems. Given all the sources of this program, is there a way to find that structure? (I mean, is there a way to find a structure whose size is 124 bytes?)
PS. I know the exact place in memory of this structure, yet there is no clue about its purpose if I look at it. It is also a common structure, so I can make as many core dumps as I wish.
PS2. So far I tried:
to use the regular expression grep '^ *[a-zA-Z][^ ;,."()]* [a-zA-Z][^ ;,."()]*' * | grep -v 'return' | sed 's/[^:]*: *\([^ ]*\).*/\1/' | sort | uniq > tmp.txt, add p sizeof(x) to each found line, and feed that to gdb;
to use info variables in gdb, log the output, extract the variable types, add sizeof(x) to each type, and feed that to gdb.
In a header file which is included by all the source files, define a macro:
#define malloc(size) my_malloc(size, __FILE__, __LINE__)
And then in the implementation:
#undef malloc
void * my_malloc(size_t size, const char* file, int line)
{
//if the size equal to 124 bytes, log it, then you will have a chance know where this kind of allocation happens, so you know the struct.
if(124==size) printf(...);
return malloc(size);
}
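A convenient way to pull that header into every translation unit without touching the sources is the compiler's -include option (debug_malloc.h is a name I'm assuming for the header):
cc -include debug_malloc.h -c foo1.c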
Try to use this:
objdump -W <elf-name> | grep -B 2 "DW_AT_byte_size : 124"
This command dumps all the debugging (DWARF) information in the ELF file and filters for entries whose size is 124 bytes.
