Why is it bad to use system() in linux programming? [duplicate]


Possible Duplicate:
Issuing system commands in Linux from C, C++
I have read in some texts that it is not good to use the system() call in Linux programming, and I wonder what the real reasons are. It presumably consumes more memory and maybe more CPU. Apart from these, what could the reason be?
For example, if I write system("echo 1 > file"); instead of using fopen() and fwrite(), what can a hacker do to my program or Linux system? I saw that system() is not advised because of security issues. But how can a person hack a Linux system just because a program uses the system() call? I would be glad if someone could explain concretely what can go wrong when using system().

Using system("echo 1 > file"); literally isn't a security risk, just a needless execution of another process where you don't need one.
The risk comes in when you build a string programmatically and then use it with system(). Then you have a risk of Shell Command Injection, where you thought you were building one command, but a malicious user subverts your program by giving you carefully crafted inputs that cause your command to do something you didn't expect.
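One common way to avoid the injection risk is to skip the shell entirely and pass the arguments as a vector, so that no character in the input is ever interpreted as shell syntax. Here is a minimal sketch, assuming a POSIX system; run_argv is a hypothetical helper name, not something from the question.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * run_argv (hypothetical helper): run argv[0] with the given arguments
 * without involving /bin/sh. Returns the child's exit status, 127 if the
 * program could not be executed, or -1 if fork/wait failed or the child
 * was killed by a signal.
 */
int run_argv(char *const argv[])
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        execvp(argv[0], argv);  /* only returns on failure */
        _exit(127);
    }
    /* Retry the wait if a signal handler interrupts it. */
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With this, an argument such as "x; exit 9" reaches the program as one literal string; nothing parses the semicolon. The program still has to treat its arguments sensibly, so this removes the shell from the picture but does not replace input validation.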

The problem is that you are trusting the string passed into system() to be safe and to come from a trusted source. Suppose you had something like this:
char *command = NULL;
// Read command from external source
system(command);
Would you be able to trust that command was safe and wouldn't do something nasty like "rm -fr ~/*"? Using fopen() doesn't necessarily make you safe either, though, because a hacker could just pass in the name of a file such as /etc/passwd and read something you don't want exposed. The bottom line is that wherever your program interfaces with the outside world, that is where you need to put in some validation and restrict what an external user can do.
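As an illustration of that kind of validation, here is a small allow-list check for a file name received from outside. It is only a sketch: is_safe_filename is a hypothetical name, and the set of permitted characters is an assumption that a real program would adapt to its own needs.

```c
#include <stdbool.h>
#include <string.h>

/*
 * is_safe_filename (hypothetical): accept only a single path component
 * made of letters, digits, '.', '_' and '-'. Names starting with '-'
 * (option-like) or '.' (covers "." and "..") are rejected.
 */
bool is_safe_filename(const char *name)
{
    static const char allowed[] =
        "abcdefghijklmnopqrstuvwxyz"
        "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
        "0123456789._-";

    if (name == NULL || name[0] == '\0')
        return false;
    if (name[0] == '-' || name[0] == '.')
        return false;
    /* strspn gives the length of the leading run of allowed characters;
       the name is acceptable only if that run covers the whole string. */
    return strspn(name, allowed) == strlen(name);
}
```

Listing what is allowed, rather than trying to list everything dangerous, means that slashes, spaces, semicolons and anything else unexpected are rejected by default.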

system(3) starts another process and runs a command interpreter ("/bin/sh -c") to execute your command. This can be a problem if your program is running with the SUID or SGID bit set. The behaviour of the shell is controlled by many environment variables, and some of these may be used to gain control of the command interpreter. This situation is similar to executing a SUID or SGID shell script.
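One way to reduce that exposure is to start the child with a fixed environment instead of the inherited one. The sketch below assumes a POSIX system; run_clean_env is a hypothetical helper, and the single PATH entry is an assumed minimal environment, not a recommendation for every program.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * run_clean_env (hypothetical): execute the program at `path` with a
 * fixed, minimal environment so that variables set by the caller
 * (LD_PRELOAD, ENV, a hostile PATH, ...) never reach the child.
 * Returns the exit status, 127 if exec failed, -1 on other errors.
 */
int run_clean_env(const char *path, char *const argv[])
{
    /* Assumed minimal environment; extend as the real program requires. */
    static char *const clean_env[] = { "PATH=/usr/bin:/bin", NULL };
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        execve(path, argv, clean_env);  /* absolute path, no PATH search */
        _exit(127);
    }
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because execve() takes the environment as an explicit argument, nothing from the parent's environment is passed along unless it is listed in clean_env.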

Related

Identify environment, z/OS UNIX vs JCL or TSO

What function can I call from inside a C program, to discover what z/OS environment the program is being run in, e.g. is it z/OS UNIX (aka USS) or is it from TSO, say via JCL?

There are two approaches: CEE3INF, and rummage through the z/OS data areas.
CEE3INF has the advantage of being documented and portable to any LE environment, as well as providing information about PIPI that you don't easily find in the z/OS structures.
As an alternative to CEE3INF, there's plenty of information in the system data areas if you just need to distinguish between Batch, TSO, CICS and whether or not you've been dubbed as a USS process. The alternative is easy, and it's especially helpful outside the LE environment...though it's even easy to do in C by just loading up some pointers that you can get by using the XLC DSECT to C-structure conversion utility.
A TSO address space is one where ASCBTSB is non-zero (PSAAOLD->ASCBTSB). A Batch job is one where ASCBJBNI is filled in (PSAAOLD->ASCBJBNI). A CICS address space has TCBCAUF set non-zero (PSATOLD->TCBCAUF).
In any of the above, you can also check whether your task has been dubbed as a UNIX process by checking TCB->STCB->STCBOTCB. If non-zero, you've been dubbed and can use UNIX Services. The OTCBPRLI field has process information like the PID, and THLI has thread-level information.
Note that a given task might be eligible to use USS functions, but it hasn't yet. The "querydub()" function can help you distinguish between a task that's already been dubbed, versus one that can be, but just hasn't been yet.
If you use CEE3INF, there have been some comments about it not working properly outside of the main() function, but I think the issue is a small bug in the sample IBM provides in their documentation. This sample works fine on my z/OS 2.3 and 2.4 systems:
#include <stdio.h>
#include <string.h>
#include <leawi.h>
#include <ceeedcct.h>
int do_call(void)
{
    _INT4 sys_subsys, env_info, member_id, gpid;
    _FEEDBACK fc;
    /* Note &fc: the feedback code is passed by address */
    CEE3INF(&sys_subsys, &env_info, &member_id, &gpid, &fc);
    if ( _FBCHECK(fc, CEE000) != 0 )
    {
        printf("CEE3INF failed with message number %d\n", fc.tok_msgno);
        return -1;  /* the output fields are not meaningful on failure */
    }
    printf("System/Subsystem in hex %08x \n", sys_subsys);
    printf("Enviornment info in hex %08x \n", env_info);
    printf("Member languages in hex %08x \n", member_id);
    printf("GPID information in hex %08x \n", gpid);
    printf("\n");
    return 0;
}
int main(void)
{
    return do_call() == 0 ? 0 : 1;
}
This is the sample code from the IBM manual, except notice in the call to CEE3INF, the IBM doc has a bug ("...fc" instead of "...&fc"). There were comments about CEE3INF not working if called outside of main(), but I think the issue is simply the bug in the sample above.
To test, I compile the code above under the UNIX Services shell using this command:
xlc -o testinf testinf.c
I then run the executable from a z/OS shell session:
> ./testinf
System/Subsystem in hex 02000002
Enviornment info in hex 00540000
Member languages in hex 10000000
GPID information in hex 04020300
This is a z/OS 2.3 system - I get identical results on 2.4.
UPDATE: What does "running in the z/OS UNIX Services environment" mean?
It's easy to understand batch jobs versus TSO sessions versus started tasks, but what's meant by "running in the z/OS UNIX Services environment"? In subsystems like CICS, IMS, or WebSphere "running under xxx" is easy to define because the transactions run inside a special type of service address space...but unfortunately, UNIX Services isn't like that.
Indeed, just about any task running on z/OS can make use of z/OS UNIX Services, so there really isn't a "z/OS UNIX Services environment" that you can define in a traditional way. A parallel would be VSAM...is a program that opens a VSAM file "running in VSAM?". We might care about programs running IDCAMS, programs opening VSAM files, programs using CICS/VSAM - but "running in VSAM" isn't particularly meaningful without further qualification. Plus, "running in VSAM" isn't exclusive with running as batch, STC or TSO user - it's the same with z/OS UNIX services - you can be a batch job, a started task or a TSO user, AND you can also be "running in z/OS UNIX Services" or not.
Here are three very different definitions of "running in z/OS UNIX Services":
Whether the unit of work has been "dubbed" as a UNIX Services process and is therefore ready and able to request UNIX Services kernel functions.
Whether the unit of work is running under a UNIX shell, such as /bin/sh.
Whether an LE program is running with the POSIX(ON) runtime option.
Why would any of this matter? Well, some software - especially things like runtime library functions called by other applications - behaves differently depending on whether the caller is a UNIX process or not.
Imagine writing an "OPEN" function that gets passed a filename as an argument. If your caller is a UNIX process, you might interpret the filename as an actual filename...OPEN(XYZ) is interpreted as "check the current working directory for a file called 'XYZ'". But if the caller isn't dubbed as a UNIX process, then OPEN(XYZ) might mean to open the 'XYZ' DD statement. You can make this determination using the approach I outlined above, since it tells you that your task is in fact dubbed as a UNIX process.
Okay, but what's different between this and #2 above (running under the shell)?
Here's one example. Suppose you have a callable routine that wants to write a message to an output file. Most non-mainframe UNIX applications would simply write to STDOUT or STDERR, but this doesn't always work on z/OS because many applications are UNIX processes, but they aren't running under the shell - and without the shell, STDOUT and STDERR may not exist.
Here's the scenario...
You run a conventional program that has nothing to do with UNIX Services, but it does something to get itself dubbed as a UNIX process. Just as an example, maybe someone puts "DD PATH=/some/unix/file" in the JCL of an age-old COBOL program...miraculously, when this COBOL batch job runs, it's a UNIX process because it makes use of the UNIX Services filesystem.
There are lots of things that can get your task dubbed as a UNIX process...DD PATH is one, but even calling a function that opens a TCP/IP socket or something similarly benign can do the trick. Maybe you're writing a vendor product that's just a batch assembler program, but it opens a TCP/IP socket...that's another common example of UNIX processes that run without a shell.
So why is this a problem? Well, think about what happens if that callable function decides to write its messages to STDERR. Maybe it tests to see if it's running as a UNIX Services process, and if so it writes to STDERR, otherwise it dynamically allocates and writes to a SYSOUT file. Sounds simple, but it won't work for my example of an app having DD PATH.
Where does STDERR come from? Normally, the UNIX shell program sets it up - when you run a program under the shell, the shell typically passes your program three pre-opened file handles for STDIN, STDOUT and STDERR. Since there's no shell in my sample scenario, these file handles weren't passed to the application, so a write to STDERR is going to fail. In fact, there are many things that the shell passes to a child process besides STDIN/STDOUT/STDERR, such as environment variables, signal handling and so forth. (Certainly, the user can manually allocate STDIN/STDOUT/STDERR in his JCL...I'm not talking about that here).
If you want to have software that can handle both running under the shell and not running under the shell, you have more work to do than just seeing if your application has been dubbed as a UNIX process:
Check to see if you're a UNIX process...if not, you can't be running under the shell.
Check to see if you were launched by the shell. There are a variety of ways to do this, but generally you're checking your "parent process" or something like the environment variables you were passed. This isn't always easy to do, since there are actually many different shells on z/OS, so there's not much you can go on to spot the "legitimate" ones. One of the more bulletproof approaches is to get the login shell for the user and check for that.
As an alternative to checking the parent process, you can check for the resource you need directly, such as by calling ioctl() against the STDERR file handle as in my example. This, of course, can be dangerous...imagine the case where an application opens a few sockets and calls your function...what you think are really STDIN/STDOUT/STDERR could in fact be open file handles set up by your caller, and what you write could easily clobber his data.
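The "check for the resource directly" idea can be sketched in plain POSIX C as below. Note that this uses fcntl(F_GETFD) rather than the ioctl() call mentioned above, fd_is_open is a hypothetical helper name, and I am assuming generic POSIX behaviour here rather than anything z/OS specific.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * fd_is_open (hypothetical): nonzero if `fd` currently refers to an open
 * descriptor in this process. fcntl(F_GETFD) fails with EBADF when the
 * descriptor is not open.
 */
int fd_is_open(int fd)
{
    return fcntl(fd, F_GETFD) != -1 || errno != EBADF;
}
```

A caller could test fd_is_open(STDERR_FILENO) before writing and fall back to another destination otherwise. The caveat from the paragraph above still applies: an open descriptor 2 only proves that something is open there, not that it is the STDERR you were hoping for.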
As for my third example - LE programs running with POSIX(ON) - this is largely an issue for developers writing in high-level languages based on the LE runtime, since the behaviors of certain runtime functions are different with POSIX(ON) or POSIX(OFF).
An example is the C programmer writing a function that can be called by both POSIX(ON) and POSIX(OFF) callers. Let's say the function wants to do some background processing under a separate thread...in POSIX(ON) applications, the developer might use pthread_create(), but this won't work in POSIX(OFF). There are actually lots of things in IBM's LE runtime that behave differently depending on the POSIX setting: threads, signal handling, etc etc etc. If you hope to write "universal" code and you need these functions, you'll definitely need to query the POSIX setting at execution time and take different paths depending on how it's set.
So hopefully that sheds some light on the complexity hiding behind this question...three different definitions of "running in z/OS UNIX environment", and three different use-cases illustrating why each is important.

How to run an arbitrary script or executable from memory?

I know I can use a call like execl("/bin/sh", "sh", "-c", some_string, (char *)NULL) to interpret a "snippet" of shell code using a particular shell/interpreter. But in my case I have an arbitrary string in memory that represents some complete script which needs to be run. That is, the contents of this string/memory buffer could be:
#! /bin/bash
echo "Hello"
Or they might be:
#! /usr/bin/env python
print "Hello from Python"
I suppose in theory the string/buffer could even include a valid binary executable, though that's not a particular priority.
My question is: is there any way to have the system launch a subprocess directly from a buffer of memory I give it, without writing it to a temporary file? Or at least, a way to give the string to a shell and have it route it to the proper interpreter?
It seems that all the system calls I've found expect a path to an existing executable, rather than something low level which takes an executable itself. I do not want to parse the shebang or anything myself.

You haven't specified the operating system, but since #! is specific to Unix, I assume that's what you're talking about.
As far as I know, there's no system call that will load a program from a block of memory rather than a file. The lowest-level system call for loading a program is the execve() function, and it requires a pathname of the file to load from.

My question is: is there any way to have the system launch a
subprocess directly from a buffer of memory I give it, without writing
it to a temporary file? Or at least, a way to give the string to a
shell and have it route it to the proper interpreter?
It seems that all the system calls I've found expect a path to an
existing executable, rather than something low level which takes an
executable itself. I do not want to parse the shebang or anything
myself.
Simple answer: no.
Detailed answer:
execl and shebang convention are POSIXisms, so this answer will focus on POSIX systems. Whether the program you want to execute is a script utilizing the shebang convention or a binary executable, the exec-family functions are the way for a userspace program to cause a different program to run. Other interfaces such as system() and popen() are implemented on top of these.
The exec-family functions all expect to load a process image from a file. Moreover, on success they replace the contents of the process in which they are called, including all memory assigned to it, with the new image.
More generally, substantially all modern operating systems enforce process isolation, and one of the central pillars of process isolation is that no process can access another's memory.
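Since the exec-family functions want a file, the usual fallback is the temporary file the question hoped to avoid. Here is a minimal sketch, assuming a POSIX system where /tmp is writable and allows execution; run_script_buffer is a hypothetical helper name. The exec call itself handles the #! line, so the program never has to parse the shebang.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * run_script_buffer (hypothetical): write `len` bytes from `buf` to a
 * private temporary file, execute that file, wait for it, and remove it.
 * Returns the exit status, 127 if exec failed, -1 on other errors.
 */
int run_script_buffer(const char *buf, size_t len)
{
    char path[] = "/tmp/scriptXXXXXX";  /* assumed location */
    int result = -1, status;
    size_t off = 0;
    pid_t pid;
    int fd = mkstemp(path);             /* created readable/writable by owner only */

    if (fd < 0)
        return -1;
    while (off < len) {
        ssize_t n = write(fd, buf + off, len - off);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            goto out;
        }
        off += (size_t)n;
    }
    if (fchmod(fd, S_IRWXU) < 0)        /* add execute permission for the owner */
        goto out;
    close(fd);                          /* close the writer before executing the file */
    fd = -1;

    pid = fork();
    if (pid < 0)
        goto out;
    if (pid == 0) {
        execl(path, path, (char *)NULL);
        _exit(127);
    }
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            goto out;
    }
    if (WIFEXITED(status))
        result = WEXITSTATUS(status);
out:
    if (fd >= 0)
        close(fd);
    unlink(path);
    return result;
}
```

The file exists only for the lifetime of the child, and mkstemp() picks an unpredictable name, but it is still a file on disk for that window.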

Check if command was run directly by the user

Say I want to change the behavior of kill for educational reasons. If a user directly types it in the shell, then nothing will happen. If some other program/entity-who-is-not-the-user calls it, it performs normally. A wrapping if-statement is probably sufficient, but what do I put in that if?
Edit I don't want to do this in the shell. I'm asking about kernel programming.
In line 2296 of the kernel source, kill is defined. I will wrap an if statement around the code inside. In that statement, there should be a check to see whether the one who called this was the user or just some process. The check is the part I don't know how to implement.
Regarding security
Goal:
Block the user from directly calling kill from any shell
Literally everything else is fine and will not be blocked

While other answers are technically true, I think they're being too strict regarding the question. What you want to do is not possible to do in a 100% reliable way, but you can get pretty close by making some reasonable assumptions.
Specifically if you define an interactive kill as:
called by process owned by a logged in user
called directly from/by a process named like a shell (it may be a new process, or it may be a built-in operation)
called by a process which is connected to a serial/pseudo-terminal (possibly also belonging to the logged in user)
then you can check for each of those properties when processing a syscall and make your choice that way.
There are ways this will not be reliable (sudo + expect + sh should work around most of these checks), but it may be enough to have fun with. How to implement those checks is a longer story and probably each point would deserve its own question. Check the documentation about users and pty devices - that should give you a good idea.
Edit: Actually, this may even be possible to implement as an LKM. SELinux can do a similar kind of check.

It looks like you are quite confused and do not understand what exactly a system call is and how a Linux computer works. Everything is done inside some process through system calls.
there should be a check to see whether the one who called this was directly done by the user or just some process
The above sentence makes no sense. Everything is done by some process through some system call. The notion of user exists only as an "attribute" of processes, see credentials(7) (so "directly done by the user" is vague). Read syscalls(2) and spend several days reading about Advanced Linux Programming, then ask a more focused question.
(I really believe you should not dare patching the kernel without knowing quite well what the ALP book above is explaining; then you would ask your question differently)
You should also spend several days or weeks reading about Operating Systems and Computer Architecture. You need to get a more precise idea of how a computer works, and that will take time (perhaps many years); no answer here can cover all of it.
When the user types kill, he is probably using the shell builtin (try which kill and type kill), and the shell calls kill(2). When the user types /bin/kill, he is execve(2)-ing a program which will call kill(2). And the command might not come from the terminal: with echo kill $$ | sh the command comes from a pipe, and with echo kill 1234 | at midnight the kill happens outside of any user interaction, without anyone interactively using the computer, the command being read from some file in /var/spool/cron/atjobs/ (see atd(8)). In all these cases the kernel only sees a SYS_kill system call.
BTW, modifying the kernel's behavior on kill could affect a lot of system software, so be careful when doing that. Read also signal(7) (some signals are not coming from a kill(2)).
You might use isatty(STDIN_FILENO) (see isatty(3)) to detect whether a program is run in a terminal (no need to patch the kernel, you could just patch the shell), but I gave several cases above where it is not. You, and your user, could also write a desktop application (using GTK or Qt) that calls kill(2) and is started from the desktop (it probably won't have any terminal attached when running; read about X11).
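In user space, the terminal checks could look roughly like this. It is a sketch only: both function names are hypothetical, and as explained above neither check proves that a human typed the command.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <unistd.h>

/* fd_is_terminal (hypothetical): nonzero if `fd` is open on a terminal. */
int fd_is_terminal(int fd)
{
    return isatty(fd) == 1;
}

/*
 * has_controlling_terminal (hypothetical): nonzero if this process has a
 * controlling terminal. Opening /dev/tty fails when there is none, for
 * example in a job started by atd.
 */
int has_controlling_terminal(void)
{
    int fd = open("/dev/tty", O_RDONLY);

    if (fd < 0)
        return 0;
    close(fd);
    return 1;
}
```

A patched shell or a wrapper around /bin/kill could refuse to act when fd_is_terminal(STDIN_FILENO) is true, but echo kill $$ | sh would still get through, since stdin is then a pipe.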
See also the notion of session and setsid(2); recent systemd based Linuxes have a notion of multi-seat which I am not familiar with (I don't know what kernel stuff is related to it).
If you only want to change the behavior of interactive terminals running some (well identified) shells, you need only to change the shell -with chsh(1)- (e.g. patch it to remove its kill builtin, and perhaps to avoid the shell doing an execve(2) of /bin/kill), no need to patch the kernel. But this won't prohibit the advanced user to code a small C program calling kill(2) (or even code his own shell in C and use it), compile his C source code, and run his freshly compiled ELF executable. See also restricted shell in bash.
If you just want to learn by making the exercise to patch the kernel and change its behavior for the kill(2) syscall, you need to define what process state you want to filter. So think in terms of processes making the kill(2) syscall, not in terms of "user" (processes do have several user ids)
BTW, patching the kernel is very difficult (if you want that to be reliable and safe), since by definition it is affecting your entire Linux system. The rule of thumb is to avoid patching the kernel when possible .... In your case, it looks like patching the shell could be enough for your goals, so prefer patching the shell (or perhaps patching the libc which is practically used by all shells...) to patching the kernel. See also LD_PRELOAD tricks.
Perhaps you just want the uid 1234 (assuming 1234 is the uid of your user) to be denied by your patched kernel using the kill(2) syscall (so he will need to have a setuid executable to do that), but your question is not formulated this way. That is probably simple to achieve, perhaps by adding in kill_ok_by_cred (near line 692 on Linux 4.4 file kernel/signal.c) something as simple as
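/* deny when the caller (cred, from current_cred()) is uid 1234; tcred is the target's credentials */
if (uid_eq(cred->uid, KUIDT_INIT(1234)))
    return 0;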
But I might be completely wrong (I never patched the kernel, except for some drivers). Surely in a few hours Craig Ester would give a more authoritative answer.

You can use aliases to change the behavior of commands. Aliases are only applied in interactive shells; shell scripts ignore them. For example:
$ alias kill='echo hello'
$ kill
hello
If you want an alias to be available all the time, you could add it to ~/.bashrc (or whatever the equivalent file is if your shell isn't bash).

What is the call for the "lp filename" command in linux in a c program?

I want to use the above command in a C program on Linux.
From what I have found so far, there are system() and exec calls that one may make in code. Is there any other way, using exec or system?
Using system() isn't ideal for a multi-threaded server; what do you suggest?

First make sure you have lp installed in this path. (Using which lp in the terminal).
You may want to understand the lp command. It's a classic unix command to send data to the "line printer", but it works with e.g. .pdf files too nowadays, depending on your printer system. However, it isn't necessarily installed. Sometimes, lpr may work better, too.
See also: http://en.wikipedia.org/wiki/Lp_%28Unix%29
The second part is about executing unix commands. system is the easiest (also the easiest to introduce a security issue into your program!), using fork and execve is one of a number of alternatives (have a look at man execve).

Yes, this code is ok. It will print the file named filename provided that the lp is found at /usr/bin and the filename file exists. You can add checks for that if you want your program to report if something went wrong, other than that it will do exactly what you expect.
Doing system("lp filename"); would work if you don't mind your program blocking after that system() call and until lp finishes.

You could also use popen(3) (instead of system(3)). But you always need to fork a process (both system and popen are calling fork(2)). BTW, if you have a CUPS server you might use some HTTP client protocol library like libcurl but that is probably inconvenient. Better popen or system an lp (or lpr) command.
BTW, printing is a relatively slow and complex operation, so the overhead of forking a process is negligible (I believe you could do that in a server; after all people usually don't print millions of pages). Some libraries might give you some API (e.g. QPrinter in Qt).
Notice that the lp (or lpr) command is not actually doing the printing, it is simply interacting with some print daemon (cupsd, lpd ...) and its spooling system. See e.g. CUPS. So running the lp or lpr command is reasonably fast (much faster than the printing itself), generally a few milliseconds (certainly compatible with a multi-threaded or server application).
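If you go the popen(3) route, the advantage over system(3) is that you can read what the command prints (lp typically reports a request id, though the exact text depends on the print system). A minimal sketch follows; capture_first_line is a hypothetical helper, and the command string is still handed to /bin/sh, so it must not contain unvalidated input.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/wait.h>

/*
 * capture_first_line (hypothetical): run `cmd` through popen(), store the
 * first line of its standard output in `out` (empty string if none), and
 * return the command's exit status, or -1 on failure.
 */
int capture_first_line(const char *cmd, char *out, size_t outsz)
{
    char discard[256];
    int status;
    FILE *fp = popen(cmd, "r");

    if (fp == NULL)
        return -1;
    if (outsz > 0 && fgets(out, (int)outsz, fp) == NULL)
        out[0] = '\0';
    /* Drain the remaining output so the child is not blocked on a full pipe. */
    while (fgets(discard, sizeof discard, fp) != NULL)
        ;
    status = pclose(fp);    /* waits for the child, like waitpid() */
    if (status == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Like system(), this blocks until the command finishes, which for lp is normally just the time needed to hand the job to the spooler.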
Quite often, the command passed to popen or system is constructed (e.g. with snprintf(3) etc...), e.g.
char cmdbuf[128];
snprintf (cmdbuf, sizeof(cmdbuf), "lp %s", filename);
but beware of code injection (think about filename containing foo; rm -rf $HOME) and of buffer overflow
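If the command really has to go through the shell, both problems can be handled by quoting the file name and checking for truncation. The sketch below uses the standard single-quote technique for POSIX shells; shell_quote and build_lp_command are hypothetical helpers, the 512-byte buffer is an arbitrary assumption, and the "--" end-of-options marker is what CUPS lp documents (other lp implementations may differ).

```c
#include <stdio.h>
#include <string.h>

/*
 * shell_quote (hypothetical): wrap `in` in single quotes for /bin/sh,
 * rewriting each embedded ' as '\'' . Returns the length written
 * (excluding the NUL), or -1 if `out` is too small.
 */
int shell_quote(const char *in, char *out, size_t outsz)
{
    size_t n = 0;

    if (outsz < 3)                      /* room for '' and the NUL */
        return -1;
    out[n++] = '\'';
    for (; *in != '\0'; in++) {
        size_t need = (*in == '\'') ? 4 : 1;
        /* keep room for the closing quote and the NUL */
        if (n + need + 2 > outsz)
            return -1;
        if (*in == '\'')
            memcpy(out + n, "'\\''", 4);
        else
            out[n] = *in;
        n += need;
    }
    out[n++] = '\'';
    out[n] = '\0';
    return (int)n;
}

/* build_lp_command (hypothetical): 0 on success, -1 if anything would not fit. */
int build_lp_command(const char *filename, char *cmd, size_t cmdsz)
{
    char quoted[512];
    int n;

    if (shell_quote(filename, quoted, sizeof quoted) < 0)
        return -1;
    n = snprintf(cmd, cmdsz, "lp -- %s", quoted);
    /* snprintf returns the length it wanted to write; >= cmdsz means truncated */
    return (n < 0 || (size_t)n >= cmdsz) ? -1 : 0;
}
```

snprintf() already prevents the overflow itself; the return-value check matters because a silently truncated command is a different command from the one you meant to run.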
Of course, notice that library functions like system, popen, fopen are generally built above existing syscalls(2). Read Advanced Linux Programming

C, runtime test if executable exists in PATH

I am currently writing an application in C, targeting BSD and Linux systems with a hope to being generally portable. This program has a runtime dependency, in this case mplayer.
As it stands I am using execlp() to start mplayer. I am checking the error code of the execlp call and I am testing for EACCES, so I know when I attempt to run mplayer if it exists or not.
Because of the way my program works, mplayer is a required dependency but may not be used for some time after my program starts. As a user experience it is poor for the program to have been running for some time before failing due to mplayer being missing. So I would like to test for mplayer existing as my program starts up. Probably delivering an error message if mplayer is not available.
Now I understand there is a race condition here, so my current handling of an EACCES error will have to stay. We could find a situation where a user starts my program running, then uninstalls mplayer. This is accepted.
My initial thought was to call execlp() early on in execution; however, this results in mplayer visibly starting. To be honest I'd prefer not to be starting mplayer, just testing if I "could" start it (e.g. does a file called mplayer exist somewhere in my path, and is it executable).
A second thought was then to run those precise steps, looking through the path and testing if the matching file is executable. I've not yet coded this for two reasons. The first reason, to be sure execlp is finding the same thing I have found I would have to pass the discovered pathname to execlp, bypassing the builtin PATH searching mechanism. The other reason is simply I feel I'm missing an obvious trick.
Is there a function I should be using to do the search for an executable? Or do I really need to just get on and code it the long way.

Some systems (FreeBSD, Linux) support a which command that searches the user's path for a given command.
I suppose that begs the question in a sense... if your code might run on a variety of systems, you might feel the need to do which which just to determine if which is available. ;-) If that's a problem you might still have to consider building that functionality into your program, but the code could still be a helpful starting point.

with a hope to being generally portable
To POSIX platforms, I suppose? execlp is far from generally available.
There's no portable way to check for a command's availability except trying to execute it. What you could do is copy the path finding logic from BSD execlp (the userland part), or BSD's which command.
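For reference, the path-finding logic is short enough to write by hand. This is a sketch for POSIX systems, not a copy of the BSD code: find_in_path is a hypothetical name, and the fallback PATH used when the variable is unset is an assumption (the real default is implementation-defined).

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * find_in_path (hypothetical): search the directories in PATH for an
 * executable regular file called `name`. On success the full path is
 * stored in `out` and 0 is returned; otherwise -1.
 */
int find_in_path(const char *name, char *out, size_t outsz)
{
    const char *path = getenv("PATH");

    /* Names containing '/' are never looked up in PATH by execlp(). */
    if (name == NULL || name[0] == '\0' || strchr(name, '/') != NULL)
        return -1;
    if (path == NULL)
        path = "/usr/bin:/bin";         /* assumed fallback */

    for (;;) {
        const char *end = strchr(path, ':');
        size_t dirlen = end ? (size_t)(end - path) : strlen(path);
        struct stat st;
        /* An empty PATH entry means the current directory. */
        int n = dirlen == 0
            ? snprintf(out, outsz, "./%s", name)
            : snprintf(out, outsz, "%.*s/%s", (int)dirlen, path, name);

        if (n > 0 && (size_t)n < outsz &&
            stat(out, &st) == 0 && S_ISREG(st.st_mode) &&
            access(out, X_OK) == 0)
            return 0;
        if (end == NULL)
            break;
        path = end + 1;
    }
    return -1;
}
```

At startup you would call find_in_path("mplayer", buf, sizeof buf) and report an error if it returns -1. Passing the discovered path to execl() later guarantees you run the file you checked, at the cost of not noticing a PATH change in between; either way the exec error handling has to stay, as the question already notes.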

There is no certain way in ANSI C. You may try fopen() and check return code.
Try to use stat call (man 2 stat), it exists on Linux, but I'm not sure about BSD.
