Why is unprivileged recursive unshare(CLONE_NEWUSER) not permitted?

I'm on Ubuntu 17.04.
A single unprivileged unshare of the mount namespace works. You can try it with the unshare(1) command:
$ unshare -m -U /bin/sh
#
However, an unshare within an unshare is not permitted:
$ unshare -m -U /bin/sh
# unshare -m -U /bin/sh
unshare: Operation not permitted
#
Here is a C program that will basically do the same:
#define _GNU_SOURCE
#include <stdio.h>
#include <sched.h>
#include <sys/mount.h>
#include <unistd.h>
int
main(int argc, char *argv[])
{
if(unshare(CLONE_NEWUSER|CLONE_NEWNS) == -1) {
perror("unshare");
return -1;
}
if(unshare(CLONE_NEWUSER|CLONE_NEWNS) == -1) {
perror("unshare2");
return -1;
}
return 0;
}
Why is it not permitted? Where can I find documentation about this? I could not find this information in the unshare or clone man pages, or in the kernel unshare documentation.
Is there a system setting that would allow this?
What I want to achieve:
First unshare: I want to mask few binaries on system with my own versions.
Second unshare: unprivileged chroot.

I'm somewhat guessing here, but I think that the reason is the UID mapping. In order to perform it, certain conditions must be met (from the user_namespaces man page):
In order for a process to write to the /proc/[pid]/uid_map (/proc/[pid]/gid_map) file, all of the following requirements must be met:
1. The writing process must have the CAP_SETUID (CAP_SETGID) capability in the user namespace of the process pid.
2. The writing process must either be in the user namespace of the process pid or be in the parent user namespace of the process pid.
3. The mapped user IDs (group IDs) must in turn have a mapping in the parent user namespace.
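I believe what happens is this: on the first unshare, your effective UID and GID have a mapping in the parent (initial) user namespace, so the call succeeds. Inside the new namespace, no uid_map or gid_map has been written yet, so on the second unshare your IDs have no mapping in what is now the parent namespace, and the system call fails.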
From the unshare(2) manual page:
EPERM CLONE_NEWUSER was specified in flags, but either the effective user ID or the effective group ID of the caller does not have a mapping in the parent namespace (see user_namespaces(7)).
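If the missing mapping is indeed the cause, writing a mapping for the caller after each unshare should let the next unshare succeed. The following is only a minimal sketch under that assumption. The helper names format_id_map, write_file and unshare_mapped are made up for illustration, and it assumes a kernel that permits unprivileged user namespaces and provides /proc/self/setgroups (Linux 3.19 or later). It maps just the caller's own UID and GID onto themselves.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Format one "inside outside count" map line covering a single ID. */
int format_id_map(char *buf, size_t len, long inside, long outside)
{
    int n = snprintf(buf, len, "%ld %ld 1", inside, outside);
    return (n > 0 && (size_t)n < len) ? 0 : -1;
}

/* Write a short string to an existing file such as /proc/self/uid_map. */
int write_file(const char *path, const char *text)
{
    int fd = open(path, O_WRONLY);
    if (fd == -1)
        return -1;
    size_t len = strlen(text);
    ssize_t written = write(fd, text, len);
    close(fd);
    return written == (ssize_t)len ? 0 : -1;
}

/* Hypothetical helper: enter new user and mount namespaces, then map the
 * caller's UID and GID so that they remain mapped for a later unshare. */
int unshare_mapped(void)
{
    char line[64];
    uid_t uid = geteuid();   /* read before unshare; unmapped afterwards */
    gid_t gid = getegid();

    if (unshare(CLONE_NEWUSER | CLONE_NEWNS) == -1)
        return -1;
    /* An unprivileged process must deny setgroups before writing gid_map. */
    if (write_file("/proc/self/setgroups", "deny") == -1)
        return -1;
    if (format_id_map(line, sizeof line, (long)uid, (long)uid) == -1 ||
        write_file("/proc/self/uid_map", line) == -1)
        return -1;
    if (format_id_map(line, sizeof line, (long)gid, (long)gid) == -1 ||
        write_file("/proc/self/gid_map", line) == -1)
        return -1;
    return 0;
}
```

Calling unshare_mapped() twice in a row corresponds to the nested case from the question. With unshare(1), the -r (--map-root-user) option writes a comparable mapping, although it maps the caller to root inside the namespace.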

Related

Is there a system call to run systemctl in C program (not the system() function)?

I am on a Ubuntu 22.04 server. I want to run:
systemctl restart someService
but want to do so in a C program.
Intuitively I tried:
system("systemctl restart someService")
This did not work even though my program has the setuid bit set and is owned by root, because systemctl itself is not setuid root.
I would like to write a program and set its uid to root so that anyone can execute it to restart a certain system service. I assume this is only possible with some direct function rather than the system() call used above. Any suggestions?
I don't think there is a system call that can do the job of systemctl in general. I think your approach of calling the systemctl command from your program is correct. I am not going into the security considerations here, but you should be really careful when writing set-uid programs.
Now, the main issue with your code is that system() should not be used from set-uid binaries, because it doesn't let you control the environment variables, which can be set maliciously before your program is called to change the behavior of the called process. Besides that, system() invokes /bin/sh to run your command, and on some versions of Linux the shell drops privileges, as mentioned in the system(3) man page. The right approach is to use the exec family of functions (such as execve), which offer more control and do not spawn a shell. What you need can be done in the following way:
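#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
int main(void) {
    /* Both calls need effective root; set the group before the user. */
    if (setgid(0) == -1 || setuid(0) == -1) {
        perror("setgid/setuid");
        exit(EXIT_FAILURE);
    }
    char *newargv[] = {"/usr/bin/systemctl", "restart", "someService", NULL};
    char *newenviron[] = { NULL }; /* empty environment */
    execve(newargv[0], newargv, newenviron);
    perror("execve"); /* execve() returns only on error */
    exit(EXIT_FAILURE);
}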
Notice the empty environment above. execve() does not return unless there is an error. If you need the exit status of the systemctl command, combine this with fork() and a wait-family function.
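The combination with fork() could look roughly like the sketch below. The helper name run_with_clean_env is invented for illustration, and the exit code 127 for a failed execve() is only the usual shell convention, not something the answer prescribes.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] with an empty environment and wait for it.
 * Returns the raw wait status, or -1 if fork() or waitpid() failed. */
int run_with_clean_env(char *const argv[])
{
    char *const empty_env[] = { NULL };
    int status;
    pid_t pid = fork();

    if (pid == -1)
        return -1;
    if (pid == 0) {
        /* Child: replace this process image with the requested program. */
        execve(argv[0], argv, empty_env);
        perror("execve");   /* reached only if execve() failed */
        _exit(127);         /* conventional "could not execute" status */
    }
    /* Parent: collect the child's termination status. */
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return status;
}
```

For the question's case the argument vector would be {"/usr/bin/systemctl", "restart", "someService", NULL}, and the result would be inspected with WIFEXITED() and WEXITSTATUS(). The setgid(0) and setuid(0) calls from the answer would still come before the fork.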

Exec function in c is not returning -1 when it should

I am using the execv function to run a program called code.x.
code.x has a part where it guarantees its failure by an assertion.
My code that runs execv is:
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <errno.h>
#include <string.h>
int main()
{
pid_t pid;
char *args[] = { "./code.x",NULL };
pid = fork();
if (pid > 0) {
wait(NULL);
printf("%s\n", strerror(errno));
printf("done\n");
}
else if (pid == 0) {
printf("%s\n", strerror(errno));
execv(args[0], args);
printf("should fail");
}
else {
printf("forkfail");
}
return 1;
}
the code prints
Success
code.x: code.c:15: main: Assertion '0 == 1' failed.
Success
done
"should fail" is never printed and WEXITSTATUS(status) shows that the exit status is 0.
The exec family of functions replace the calling process with a new program in its initial state loaded from an executable file. They can only fail if this replacement fails, e.g. due to the requested file not existing or the invoking user not having permissions to access/execute it.
If an assertion failure in the program ./code.x you're invoking happens, this is long past the point where execv could have failed; at this point, the original program state where execv was performed no longer exists, because it was already replaced. The parent process will see it exit via a wait-family function, and can inspect the status reported by the wait-family function to determine why it exited.
exec* functions succeed if the program starts running. Your program did start running.
An assertion failure causes the program to abort, that is, to be terminated by a signal (SIGABRT) rather than to exit normally. The Linux manual page wait(2) explains that:
WEXITSTATUS(wstatus)
returns the exit status of the child. This consists of the least significant 8 bits of the status argument that the child specified in a call to exit(3) or _exit(2) or as the argument for a return statement in main(). This macro should be employed only if WIFEXITED returned true.
If you didn't check that WIFEXITED(status) is true, then WEXITSTATUS(status) is garbage.
Instead, check WIFSIGNALED(status) and, if it is true, get the signal with WTERMSIG(status), which should equal SIGABRT.
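As a rough illustration of that check, the parent could classify the status as follows. The enum and the helper name wait_and_classify are invented for this sketch and are not part of any library.

```c
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>   /* fork(), for the caller */

enum child_end { CHILD_OTHER, CHILD_EXITED, CHILD_SIGNALED };

/* Hypothetical helper: wait for child pid and report how it ended.
 * On CHILD_EXITED, *detail is the exit status (0..255).
 * On CHILD_SIGNALED, *detail is the terminating signal, e.g. SIGABRT. */
enum child_end wait_and_classify(pid_t pid, int *detail)
{
    int status;

    if (waitpid(pid, &status, 0) == -1)
        return CHILD_OTHER;
    if (WIFEXITED(status)) {
        *detail = WEXITSTATUS(status);   /* valid only if WIFEXITED is true */
        return CHILD_EXITED;
    }
    if (WIFSIGNALED(status)) {
        *detail = WTERMSIG(status);      /* valid only if WIFSIGNALED is true */
        return CHILD_SIGNALED;
    }
    return CHILD_OTHER;
}
```

In the program from the question, wait_and_classify(pid, &detail) would replace wait(NULL). For a child that fails an assertion, the expected result is CHILD_SIGNALED with detail equal to SIGABRT.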
execv successfully did its job. The process ./code.x executed, then exited because of an assertion.
The exec family of functions doesn't care about the new program's return value. Once the new program starts, the calling program's image has been replaced and is gone.
Exec will only return if for some reason the process couldn't be started at all. Specifically, only these errors (taken from the man page) will cause exec to return and set errno to one of these values:
E2BIG The total number of bytes in the environment (envp) and argument list (argv) is too large.
EACCES Search permission is denied on a component of the path prefix of filename or the name of a script interpreter. (See also path_resolution(7).)
EACCES The file or a script interpreter is not a regular file.
EACCES Execute permission is denied for the file or a script or ELF interpreter.
EACCES The filesystem is mounted noexec.
EAGAIN (since Linux 3.1) Having changed its real UID using one of the set*uid() calls, the caller was, and is now still, above its RLIMIT_NPROC resource limit (see setrlimit(2)). For a more detailed explanation of this error, see NOTES.
EFAULT filename or one of the pointers in the vectors argv or envp points outside your accessible address space.
EINVAL An ELF executable had more than one PT_INTERP segment (i.e., tried to name more than one interpreter).
EIO An I/O error occurred.
EISDIR An ELF interpreter was a directory.
ELIBBAD An ELF interpreter was not in a recognized format.
ELOOP Too many symbolic links were encountered in resolving filename or the name of a script or ELF interpreter.
ELOOP The maximum recursion limit was reached during recursive script interpretation (see "Interpreter scripts", above). Before Linux 3.8, the error produced for this case was ENOEXEC.
EMFILE The per-process limit on the number of open file descriptors has been reached.
ENAMETOOLONG filename is too long.
ENFILE The system-wide limit on the total number of open files has been reached.
ENOENT The file filename or a script or ELF interpreter does not exist, or a shared library needed for the file or interpreter cannot be found.
ENOEXEC An executable is not in a recognized format, is for the wrong architecture, or has some other format error that means it cannot be executed.
ENOMEM Insufficient kernel memory was available.
ENOTDIR A component of the path prefix of filename or a script or ELF interpreter is not a directory.
EPERM The filesystem is mounted nosuid, the user is not the superuser, and the file has the set-user-ID or set-group-ID bit set.
EPERM The process is being traced, the user is not the superuser and the file has the set-user-ID or set-group-ID bit set.
EPERM A "capability-dumb" application would not obtain the full set of permitted capabilities granted by the executable file. See capabilities(7).
ETXTBSY The specified executable was open for writing by one or more processes.
The exec function family replaces the existing process image with a new process image. This is why you fork before running another program: the currently running program is completely replaced, including the program counter, which keeps track of the next instruction to execute.
printf("should fail");
is never executed because the instant you called execv(args[0], args), the program counter was moved to execute args[0], leaving behind the execution path that would have resulted in that print statement.
exec returns -1 only if it encountered an error while replacing the image; this has absolutely no relation to the return value of the program being executed. After exec succeeds, the parent and the child do not coordinate with each other at all. Remember: fork() created a new process with its own address space, so the two processes now run separately, each with its own executable image.
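If the parent needs to know whether the exec itself failed (as opposed to the new program failing later), one common technique is a close-on-exec pipe. This is a sketch of that technique, not something the answers above propose, and spawn_and_check_exec is a made-up name.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork, execv(argv[0], argv) in the child, wait for it.
 * Returns 0 if execv succeeded (then *status describes the new program),
 * the errno value from execv if it failed, or -1 on any other failure. */
int spawn_and_check_exec(char *const argv[], int *status)
{
    int fds[2];
    int exec_errno = 0;

    if (pipe(fds) == -1)
        return -1;
    /* Close-on-exec: a successful execv closes the write end by itself. */
    if (fcntl(fds[1], F_SETFD, FD_CLOEXEC) == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        close(fds[0]);
        execv(argv[0], argv);
        /* Only reached when execv failed: tell the parent why. */
        exec_errno = errno;
        if (write(fds[1], &exec_errno, sizeof exec_errno) == -1)
            _exit(126);
        _exit(127);
    }

    close(fds[1]);
    /* read() returns 0 (end of file) if execv succeeded in the child. */
    ssize_t n = read(fds[0], &exec_errno, sizeof exec_errno);
    close(fds[0]);
    if (waitpid(pid, status, 0) == -1 || n == -1)
        return -1;
    return n == (ssize_t)sizeof exec_errno ? exec_errno : 0;
}
```

A nonzero return value such as ENOENT means the program never started. A return value of 0 means it did start, and anything that went wrong afterwards (an assertion failure, for instance) shows up only in *status.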
Some documentation may be of help:
http://man7.org/linux/man-pages/man3/exec.3.html
Hope this helped.

Entering sudo password through c

I asked this question before but no-one gave a straight answer. I want to know how I can enter the sudo password through C code. I'm trying to write a program that executes sudo bash and enters the required password. I know the risks of hardcoding passwords, but I don't mind in this instance.
No. Doing it that way is an antipattern.
There are several alternatives to choose from, depending on the situation:
Use gksudo (if DISPLAY environment variable is set) for a graphical prompt.
Install the scripts to be executed in /usr/share/yourapp/scripts/ or /usr/lib/yourapp/scripts/, and the proper sudo configuration that allows running them with sudo without supplying a password in /etc/sudoers.d/yourapp (or /etc/sudoers in systems without /etc/sudoers.d/)
At the beginning of your program, check if geteuid() == 0. If not, re-execute self using gksudo/sudo, to obtain root privileges.
For normal operations, your program should use only the privileges of the real user who executed the program. To be able to raise the privileges later, the root privileges are "saved". So, initially, your program will drop the privileges using e.g.
uid_t uid = getuid();
gid_t gid = getgid();
if (setresgid(gid, gid, 0) == -1 ||
setresuid(uid, uid, 0) == -1) {
/* Failed: no root privileges! */
}
To re-elevate privileges, you use
if (setresgid(gid, 0, 0) == -1 ||
setresuid(uid, 0, 0) == -1) {
/* Failed: no root privileges! */
}
which changes only the effective identity to root (as setuid binaries do), or
if (setresgid(0, 0, 0) == -1 ||
setresuid(0, 0, 0) == -1) {
/* Failed: no root privileges! */
}
which changes both real and effective identity to root.
Often, the privileges are elevated only to fork a privileged child helper, after which the main program drops its privileges completely using
if (setresgid(gid, gid, gid) == -1 ||
setresuid(uid, uid, uid) == -1) {
/* Failed. */
}
keeping just a socket pair or pipes between the parent and the child; the child can then fork and execute new processes. (If a Unix domain socket is used, the parent can even send new descriptors to be used for the new processes' standard streams via ancillary messages.)
Use filesystem capabilities to give your program the capabilities it needs, without elevating all its privileges.
For example, sudo setcap CAP_NET_BIND_SERVICE=pe /usr/bin/yourapp gives /usr/bin/yourapp the CAP_NET_BIND_SERVICE capability (permitted and effective; not inheritable), which allows your program to bind to any unused TCP/IP and UDP/IP port, including the privileged ports 1-1023. See man 7 capabilities for detailed descriptions of the capabilities.
Use a trusted helper binary (program) to act as sudo for you. If you install this at e.g. /usr/lib/yourapp/execute, you can add the sudo configuration necessary to allow executing it without supplying a password. Alternatively, you can make it setuid root, or give it the necessary capabilities via filesystem capabilities.
To prevent other programs from exploiting this helper, you must ensure it is only executed by your program. One way to ensure that is to have your program create a Unix domain socket pair, leaving one end open in your program and passing the other end to the helper as, for example, descriptor 3. Before doing anything, the helper checks that there is nothing to receive yet (to avoid "pipe stuffing" attacks), and writes a single byte to the parent. The parent responds with a single byte, but with its credentials in an ancillary message. The credentials contain the process ID of the parent. (Do not simply use getppid(), because that allows certain attacks; this socket approach verifies the parent is still alive when we do the check.) Next, the helper uses readlink() to read the /proc/PID/exe pseudo-symlink, where PID is the parent process ID from the credentials ancillary message, and checks that it points to your program. Finally, the helper should send a byte, and receive a byte with the credentials again as an ancillary message, to ensure the parent process is still the same.
The verification process is complex, but necessary, to avoid making it easy to exploit root privileges by misusing the helper. For another approach to do exactly this, look into Apache suEXEC, the helper used by Apache to execute CGI programs with specific user privileges.
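The receiving half of that handshake can be sketched as follows. This is a Linux-specific illustration under stated assumptions: the function names are made up, the path you compare against is whatever your application installs, and the "nothing to receive yet" check and the second round trip described above are left out for brevity.

```c
#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Ask the kernel to attach the sender's credentials to received messages. */
int enable_credentials(int fd)
{
    int on = 1;
    return setsockopt(fd, SOL_SOCKET, SO_PASSCRED, &on, sizeof on);
}

/* Receive one byte together with the sender's kernel-supplied credentials.
 * Returns 0 on success, -1 if no byte or no credentials arrived. */
int recv_byte_with_credentials(int fd, struct ucred *cred)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {
        struct cmsghdr align;   /* forces correct alignment of buf */
        char buf[CMSG_SPACE(sizeof(struct ucred))];
    } control;
    struct msghdr msg;

    memset(&msg, 0, sizeof msg);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = control.buf;
    msg.msg_controllen = sizeof control.buf;

    if (recvmsg(fd, &msg, 0) != 1)
        return -1;
    for (struct cmsghdr *c = CMSG_FIRSTHDR(&msg); c != NULL;
         c = CMSG_NXTHDR(&msg, c)) {
        if (c->cmsg_level == SOL_SOCKET && c->cmsg_type == SCM_CREDENTIALS) {
            memcpy(cred, CMSG_DATA(c), sizeof *cred);
            return 0;
        }
    }
    return -1;
}

/* Return 1 if /proc/PID/exe of the peer points at expected_path, else 0. */
int peer_runs(pid_t pid, const char *expected_path)
{
    char link[64], target[4096];
    ssize_t n;

    snprintf(link, sizeof link, "/proc/%ld/exe", (long)pid);
    n = readlink(link, target, sizeof target - 1);
    if (n == -1)
        return 0;
    target[n] = '\0';   /* readlink() does not terminate the string */
    return strcmp(target, expected_path) == 0;
}
```

The helper would call enable_credentials() on its end of the socket pair before writing its first byte, then recv_byte_with_credentials(), then peer_runs(cred.pid, ...) with the installed path of the main program.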
Let's say you are totally uninterested in doing things in a sensible way, and insist on using passwords. Fine; all I ask is that you don't publish such code, or at least warn your users that it is completely unsafe.
This is not just a crude hack: it is a suicidal one, similar to putting the password to your web site in your e-mail signature, because you only mail to friends who should have admin access to your site in the first place. So, footgun, with a hair trigger, no safety, and armed with buckshot coated in tetrodotoxin. With a nice label with big, child-readable letters saying "Please play with me! I'm safe!", stored in the kids' bedroom.
The simplest thing to do is to execute sudo with the -S flag, which causes it to read the password from the standard input. For example, example.c:
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <string.h>
#include <stdio.h>
#include <errno.h>
int main(void)
{
FILE *cmd;
int status;
cmd = popen("/usr/bin/sudo -S id -un 2>/dev/null", "w");
if (!cmd) {
fprintf(stderr, "Cannot run sudo: %s.\n", strerror(errno));
return EXIT_FAILURE;
}
fprintf(cmd, "Password\n");
fflush(cmd);
status = pclose(cmd);
if (WIFEXITED(status)) {
if (WEXITSTATUS(status) == EXIT_SUCCESS)
fprintf(stderr, "No errors.\n");
else
fprintf(stderr, "Command failed with exit status %d.\n", WEXITSTATUS(status));
} else
if (WIFSIGNALED(status))
fprintf(stderr, "Command killed by signal %d.\n", WTERMSIG(status));
else
fprintf(stderr, "Command failed.\n");
return EXIT_SUCCESS;
}
The command pipe is opened in write mode, so that we can write the password to it. The 2>/dev/null redirection hides the password prompt.
If the password is correct, the above will output what id -un outputs when run as root (i.e.: root) to standard output, and No errors. to standard error.
If the password is incorrect, sudo will retry a couple of times (so nothing will happen for a few seconds), then the program will report Command failed with exit status 1. to standard error, because that's what sudo does when the password is incorrect.

start and terminate cu with a C program

I'm trying to communicate with another UNIX device via ttyS0 using cu (google "cu unix" to find out more about cu). My program works perfectly fine, but the problem is, that after the first execution of the program (establish connection, read logfiles and some other stuff) the terminal is not accessible anymore. I've just posted the core of my problem in a simplified version of the code where I'm only focussing on the actual question that I have:
When I'm doing these commands by hand "cu -l /dev/ttyS0 -s 115200" and "~." (just like in the man pages of cu: ~. terminates the connection) everything works fine. A sequential program like
system("cu -l /dev/ttyS0 -s 115200");
system("~.");
isn't working, because cu is still active and nothing after it is executed. The program just sits there waiting for cu. The same thing would happen in a simple bash script: cu would prevent the program or script from proceeding. That's why I'm using threads. As I said, my actual program works, but it doesn't terminate the way I want it to, and the terminal has to be restarted.
When I execute the following program, I only get
Connected
sh: ~.: not found
pressing enter
cu: can't restore terminal: Input/Output error
Disconnected
and an unusable terminal is left open (can't type or do anything with it)...
#define _BSD_SOURCE
#include <termios.h>
#include <unistd.h>
#include <stdio.h>
#include <fcntl.h>
#include <time.h>
#include <stdlib.h>
#include <pthread.h>
void first(){
system("cu -l /dev/ttyS0 -s 115200");
pthread_exit(NULL);
}
void second(){
system("~."); //also "~.\n" isn't changing anything
pthread_exit(NULL);
}
int main(){
pthread_t thread1, thread2;
pthread_create ( &thread1, NULL, (void*)first, NULL );
sleep(3);
pthread_create ( &thread2, NULL, (void*)second, NULL );
sleep(4);
exit(0);
return 0;
}
When you do it by hand, the ~. you're typing is not taken as a system command, but as input for the cu process that's still running. Best proof is that you didn't have a shell prompt at that time.
So the equivalent is not to do another system("~.") but to pass those characters as input to the first system("cu ...").
For instance:
system("echo '~.' | cu ....");
Obviously this doesn't allow you to open a "cu" connection and send the "~." at a later time. If you wish to do that, I suggest you take a look at the popen function (man 3 popen). This will start a cu process and leave you with a stream (a FILE *) to which you can write your ~. at a later time.
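A minimal sketch of the popen approach is shown here. The helper name run_then_send and the sleep() standing in for the real work are assumptions for illustration. Whether cu accepts its ~. escape from a pipe rather than from a terminal is also an assumption that needs to be tested on the target system.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: start command with a pipe on its standard input,
 * do something else for delay_seconds, then send text and wait for exit.
 * Returns the pclose() result (a wait status), or -1 on failure. */
int run_then_send(const char *command, unsigned delay_seconds,
                  const char *text)
{
    FILE *to_command = popen(command, "w");

    if (to_command == NULL)
        return -1;
    sleep(delay_seconds);   /* stand-in for the real work */
    /* Flush explicitly so the text reaches the command right away. */
    if (fputs(text, to_command) == EOF || fflush(to_command) == EOF) {
        pclose(to_command);
        return -1;
    }
    return pclose(to_command);   /* closes the pipe and waits */
}
```

For the question this would be called as run_then_send("cu -l /dev/ttyS0 -s 115200", 3, "~.\n"). No threads are needed, because popen() returns as soon as the command has been started.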

setgid Operation not permitted

I have a C program that calls setgid() with the group id of the group "agrp", and it is saying "Operation not permitted" when I try to run it.
The program has the following ls -la listing:
-r-xr-s--x 1 root agrp 7508 Nov 18 18:48 setgidprogram
What I want, is setgidprogram to be able to access a file that has the owner otheruser and the group agrp, and permissions set to u+rw,g+rw (User and group read/writeable.)
What am I doing wrong? Does setgidprogram HAVE to have the setuid bit set also? (When I tried it, it worked.)
I am running Fedora 19, and I have SELinux disabled.
EDIT
Here is some example code:
wrap.c:
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>
#include <grp.h>
int main(void)
{
struct group *grp = getgrnam("agrp");
printf("%d\n",grp->gr_gid);
if(setgid(grp->gr_gid) != 0)
{
printf("%s.\n", strerror(errno));
return 1;
}
execl("/tmp/whoami_script.sh", NULL);
printf("%s.\n", strerror(errno));
return 0;
}
/tmp/whoami_script.sh:
#!/usr/bin/bash
id
$ ls -la /tmp/whoami_script.sh wrap
-r-xr-xr-x 1 root agrp 19 Nov 18 19:53 /tmp/whoami_script.sh
---x--s--x 1 root agrp 7500 Nov 18 19:55 wrap
$ ./wrap
1234
uid=1000(auser) gid=1000(auser) groups=1000(auser),0(root),10(wheel)
Is this enough information now?
The original version of the question showed 6550 permission on the file.
If you're not either user root or in group agrp, you need to be able to use the public execute permissions on the program — which are missing. Since it is a binary, you don't need read permission. To fix it:
# chmod o+x setgidprogram
(The # denotes 'as root or via sudo', or equivalent mechanisms.) As it stands, only people who already have the relevant privileges can use the program.
If the program is installed SGID agrp, there is no need for the program to try to do setgid(agrp_gid) internally. The effective GID will be the GID belonging to agrp and the program will be able to access files as any other member of agrp could.
That said, normally you can do a no-op successfully. For example, this code works fine:
#include <stdio.h>
#include <unistd.h>
#include "stderr.h"
int main(int argc, char **argv)
{
err_setarg0(argv[argc-argc]);
gid_t gid = getegid();
if (setgid(gid) != 0)
err_syserr("Failed to setgid(%d)\n", (int)gid);
puts("OK");
return 0;
}
(You just have to accept that the err_*() functions do the error reporting; the argc-argc trick avoids a warning/error from the compiler about the otherwise unused argument argc.)
If you make the program SUID root, then the SGID property doesn't matter much; the program will run with EUID root and that means it can do (almost) anything. If it is SUID root, you should probably be resetting the EUID to the real UID:
setuid(getuid());
before invoking the other program. Otherwise, you're invoking the other program as root, which is likely to be dangerous.
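A slightly more defensive version of that call checks the result and verifies that root cannot be regained. This is only a sketch, and drop_root is a hypothetical name.

```c
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: permanently give up a set-user-ID root identity.
 * Returns 0 on success, -1 if root could not be dropped for good. */
int drop_root(void)
{
    uid_t ruid = getuid();

    /* With an effective UID of root, setuid() sets the real, effective
     * and saved user IDs all at once. */
    if (setuid(ruid) == -1)
        return -1;
    /* If the real user is not root, regaining root must now fail. */
    if (ruid != 0 && setuid(0) == 0)
        return -1;
    return geteuid() == ruid ? 0 : -1;
}
```

The function does not touch the group IDs, so an effective GID obtained from the SGID bit stays in place. It would be called after any privileged setup and before the exec.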
Dissecting POSIX
In his answer, BenjiWiebe states:
The problem was I was only setting my effective GID, not my real GID. Therefore, when I exec'd, the child process was started with the EGID set to the RGID. So, in my code, I used setregid() which worked fine.
Yuck; which system does that? Linux trying to be protective? It is not the way things worked classically on Unix, that's for sure. However, the POSIX standard seems to have wriggle room in the verbiage (for execvp()):
If the ST_NOSUID bit is set for the file system containing the new process image file, then the effective user ID, effective group ID, saved set-user-ID, and saved set-group-ID are unchanged in the new process image. Otherwise, if the set-user-ID mode bit of the new process image file is set, the effective user ID of the new process image shall be set to the user ID of the new process image file. Similarly, if the set-group-ID mode bit of the new process image file is set, the effective group ID of the new process image shall be set to the group ID of the new process image file. The real user ID, real group ID, and supplementary group IDs of the new process image shall remain the same as those of the calling process image. The effective user ID and effective group ID of the new process image shall be saved (as the saved set-user-ID and the saved set-group-ID) for use by setuid().
If I'm parsing that right, then we have a number of scenarios:
1. ST_NOSUID is set.
2. ST_NOSUID is not set, but the SUID or SGID bit is set on the executable.
3. ST_NOSUID is not set, and neither the SUID nor the SGID bit is set on the executable.
In case 1, it is fairly clearly stated that the EUID and EGID of the exec'd process are the same as in the original process (and if the EUID and RUID are different in the original process, they will be different in the child).
In case 2, if the SUID bit is set on the executable, the EUID will be set to the SUID. Likewise if the SGID bit is set on the executable, the EGID will be set to the SGID. It is not specified what happens if the SUID bit is set, the SGID bit is not set, and the original process has different values for EGID and RGID; nor, conversely, is it specified what happens if the SGID bit is set, the SUID bit is not set, and the original process has different values for EUID and RUID.
Case 3, where neither the SUID nor SGID bit is set on the executable, also seems to be unspecified behaviour.
Classically on Unix systems, the EUID and RUID could be different, and the difference would be inherited across multiple (fork() and) exec() operations if the executable does not override the EUID or EGID with its own SUID or SGID bits. However, it is not clear that the POSIX standard mandates or prohibits this; it seems to be unspecified behaviour. The rationale section provides no guidance on the intentions.
If my reading is correct, then I find it amusing that the ST_NOSUID bit means that if a program is launched by a process that is running SUID, then the program on the 'no SUID' file system will be run with different real and effective UID (RUID and EUID), which seems counter-intuitive. It doesn't matter what the SUID and SGID bits on the executable are set to (so the bits on the executable are ignored), but the inherited values of EUID and RUID are maintained.
This code finally worked:
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>
#include <grp.h>
int main(void)
{
gid_t g = getegid();
if(setregid(g, g) != 0)
{
printf("Error setting GID: %s.\n", strerror(errno));
}
execl("/tmp/whoami_script.sh", "/tmp/whoami_script.sh", NULL);
printf("Error: %s.\n", strerror(errno));
return 0;
}
The problem was I was only setting my effective GID, not my real GID. Therefore, when I exec'd, the child process was started with the EGID set to the RGID. So, in my code, I used setregid which worked fine.
