C program calling shell script - c

I have a small C program calling a shell script myScript.sh. I am getting the value of ret as 256. Please help me in knowing what went wrong with the system call?
#include <stdio.h>
#include <stdlib.h>

int main()
{
    int ret;
    ret = system("myScript.sh");
    ret >>= ret;
    if (ret != 0)
    {
        printf("ret is [%d]", ret);
    }
}
I am working on a 64-bit UNIX operating system and using the ksh shell.

On my system, man system says:
The system() function returns the exit status of the shell as returned by
waitpid(2), or -1 if an error occurred when invoking fork(2) or
waitpid(2). A return value of 127 means the execution of the shell
failed.
The waitpid man page describes a set of macros such as WEXITSTATUS() that extract the actual exit code from the return value.
I'm not quite sure what you're intending to do with ret >>= ret, but that can't be right.

On *nix, system usually works like this: it calls fork, and the child then calls one of the exec functions to run /bin/sh -c followed by the string you passed to system. That turns the child process into an instance of /bin/sh, which runs the command. The parent calls one of the wait functions to wait for /bin/sh to exit, which it does with the same exit status as the shell script, and system then returns that status.
If you look at the man pages for the wait system call(s):
man 3 wait
You should get some information about what gets returned and some macros that help you make sense of it.
The WIFEXITED(stat_val) macro can be used to test whether the program exited normally, as opposed to being terminated by a signal. Normal exits involve calling the exit system call. If this macro returns a non-zero value, you can use the WEXITSTATUS(stat_val) macro to get the exit status that the program actually returned.
The WIFSIGNALED(stat_val) macro can be used to test if the program was terminated with a signal, and if so the WTERMSIG(stat_val) macro will return the signal number that caused the termination.
There are some other macros that can tell you whether the process was stopped or continued rather than terminated. I don't think they are very helpful for this purpose, but you may want to look into them.
As for what is actually happening in this case, it can be difficult to tell. If the fork call fails, system returns -1 and sets errno to reflect the error. If the fork did not fail, the error may have happened in the child and be more difficult to locate. On your platform, system might do some tests before forking to ensure that you have permission to execute the appropriate files, and set errno accordingly, but maybe not.
You should look into the perror function to print out error messages in the case that errno is set.
If the failure happens after fork and within the child then you either need to get the shell to tell you more about what is happening, or get the shell script to. This may be by including echo statements in the script similarly to using print statements in your C programs.
You should also look into the access function to test if you have permission to read and/or execute files.
If you are using Linux then you should be able to do:
strace -o my_program.strace -f ./my_program
or
ltrace -o my_program.ltrace -f -S ./my_program
and then examine the trace files (named after the -o) to see what the programs and the kernel say to each other. ltrace looks at how the program talks to library functions, while strace looks at system calls, but the -S tells ltrace to also look at system calls. The -f argument tells them both to trace children of the program as they are created.
I just noticed that you said that you were using ksh
As I mentioned, system on a POSIX system should use /bin/sh or a compatible shell. This doesn't mean that /bin/sh won't run /bin/ksh to run your script (or that the kernel won't use the #! line at the beginning of the script file to do this), but it could be a problem. There are ways to run shell scripts in which this line is not used to decide which shell to run. The most notable is:
. myshell.sh
The period and space essentially try to read the text of the file into the current shell session rather than run it in another process (this is useful for setting up an environment). If you were doing:
int x = system(". myshell.sh");
Then that could be a problem.

The exit status of the command is encoded as two bytes:
The high-order byte contains the exit status.
The low-order byte contains the signal that killed it (if any).
Since 0x0100 is 256 decimal, your shell script exited with status 1. Review your shell script and ensure it exits with status 0 when it is successful.

From the Standard (emphasis is mine):
6.5.7/3
If the value of the right operand [of the >> operator] is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined.
So, when you do
ret >>= ret;
and ret < 0 or ret >= CHAR_BIT * sizeof (int) ... anything goes
The return value of the system function can be -1 on error.
If your call returned such a negative value, the next operation ret >>= ret; (the same as ret = -1 >> -1;) has no meaning: you cannot right shift by a negative number of bits.
When you try to do things with no meaning, C is allowed to do anything ... anything at all (that includes doing nothing, doing what you expect, reformatting your hard disk, transferring your bank account to mine, making demons fly out your nose, ..., ..., ...)

Make sure your script is executable and in the path, or use the full path instead.

Nothing went wrong. Did you read the documentation? See:
RETURN VALUE
The value returned is -1 on error (e.g. fork(2) failed), and the
return status of the command otherwise. This latter return status is in the format specified in wait(2).
Thus, the exit code of the command will be WEXITSTATUS(status). In case /bin/sh could not be executed,
the exit status will be that of a command that does exit(127).
Since 256 is not -1, the call did not fail.

Why do you shift the result? Just remove the line ret >>= ret, and it will work.

I am working on linux and it helped to call the script with system("sh script.sh")

Related

What exactly does return 0 do internally in a C program?

I came to know that when writing a C program we write "return 0" to tell the OS that the program executed successfully. My question is: how can we tell the OS, while writing the program and before ever executing it, that the program has executed successfully? Can someone tell me what exactly "return 0" does?
The return value of the main function is passed to the exit() function by the C startup code, providing the OS an exit status available to its parent process.
The OS itself does not do anything with this exit status, but shell scripts and other programs invoking multiple processes may use the exit status to determine what to do next. For example the make utility uses the exit status of the commands, eg: when invoking the C compiler, to proceed with the next command or stop the build process with an error status.
Returning 0 or EXIT_SUCCESS at the end of the main function is a convention to tell the calling process that the program completed successfully. It has become implicit in C99 so legacy programs that do not have a return statement can be considered to have completed successfully when recompiled with a C99 compiler. Nevertheless, it is considered good style to provide the exit status explicitly.
return 0 means returning the integer value 0 from a function, which is generally done inside a program.
You are referring to exit(0), which means that you stop the program successfully. Generally, when you call exit(i) with an integer value i, that value is the error code, indicating what went wrong.
Once you finish writing your code and compiling it, you can execute it wherever you want, correct?
Once it has finished executing, the program you made returns a number (in your example, zero).
In most cases that does absolutely nothing, because nobody is actively looking at that return code.
But you might find yourself in a situation where you want to run that program in an automated environment. Let's say a script calls your program and, if everything goes OK, goes on to do something else, but if it does not, tries again. That script, or any other software that calls your program, can read and interpret that return value.
You can also check manually what a program's return code is:
Windows:
open cmd
run the software
run echo Return Code: %errorlevel%
Linux:
open terminal
run software
run echo Return Code: $?
Those pre-defined variables will give you the value your last command returned and you can do something about it.
By convention (as mentioned before), 0 means that everything was OK, and anything else indicates that there was an error. If you write your code with multiple error checks and return a different value for each kind of failure, you will know straight away what went wrong with your execution.

Does it make sense to exit(0) after a call to execl()?

Consider the following code:
close(channel_data->pty_master);
if (login_tty(channel_data->pty_slave) != 0) // new terminal session
{
exit(1); // fail
}
execl("/bin/sh", "sh", mode, command, NULL); // replace process image
exit(0);
According to the documentation of execl(), the current process image is being replaced and the call returns only on errors.
But why call exit() after a call to execl() if the process image is replaced?
Exec calls may fail. A common epilogue would be:
perror("Some error message");
_exit(127); /*or _exit(1); (127 is what shells return) */
You would usually run _exit rather than exit here, in order to skip atexit hooks and stdio buffer flushes.
It does make sense to call exit after some exec(3) function because they can fail (e.g. when execve(2) is failing). The execve(2) page lists a number of failure reasons.
It would be better to use exit(EXIT_FAILURE) or some other non-zero exit code (conventionally a high exit code like 127 or 126 is used here, to separate a failure of exec from errors in the program it would run), and I recommend calling perror just before that exit. As explained by PSKocick there are good reasons to call _exit (but his arguments could be reversed: one might want to run the atexit handlers and the standard fflush calls by using exit instead).
In your case, failure is unlikely, but imagine however if some other process has removed /bin/sh (e.g. the sysadmin making the stupid mistake of running /bin/rm -rf . in the root directory, or in /bin/, perhaps in some other terminal window).
Still that execve could also fail when system resources are (temporarily) exhausted, e.g for
ENOMEM Insufficient kernel memory was available.
And (in rare cases) this could even happen for /bin/sh;
BTW your exec usage would probably fail (with E2BIG) if (by mistake) command was a string of a million non-null bytes.
As a general coding rule, all important system calls should be checked against failure.
You'll want to call exit because you failed to exec the program in question and you typically don't want that process to hang around since it's not running what you wanted it to run. Since execl only returns on failure, there's no need to check the return status.
In many cases, it also makes sense to print an error message to see why it failed. You should also use an exit code other than 0. A non-zero exit code is used to indicate an abnormal exit, and the parent process can capture that when it calls wait.
execl("/bin/sh", "sh", mode, command, NULL);
perror("command failed");
exit(1);
So yes, it makes sense to call exit, but not necessarily exit(0).
But why call exit() after a call to execl() if the process image is replaced?
As you said, execl() returns only on errors:
execl("/bin/sh", "sh", mode, command, NULL); // replace process image
exit(0);
In the code above, exit() is called only if the execl() call failed.
As Jonathan Leffler suggested in his comment it may be a very good idea to return a value other than zero, since zero indicates success and the code did indeed fail if the program's control flow ever reached the exit() call in the code above.

Get system calls IDs and store them in a .txt file(LINUX)

So I've been struggling with this exercise. I must get all of the system calls made by any given Linux command of my choice (e.g. ls or cd), list them in a .txt file, and have their unique IDs listed beside them.
So far here's what I've got:
strace -o filename.txt ls
This when executed in the Linux shell gives me a "filename.txt" file containing all the system calls of the ls command. Now in my C script:
#include <stdio.h>
#include <stdlib.h>
int main(){
system("strace -o filename.txt ls");
return 0;
}
This should do the same as the previous command, but it's not giving me anything, although the code compiles successfully. How would I go about fixing this, and then get the IDs? I'm using the "stdlib" library because in my research I found that it has some relation to system call IDs, but I haven't found any indication of how to get them. Basically I must read the file I created and give each system call its ID.
The exercise is obviously designed to be solved by using the ptrace() facility, because the strace utility does not have an option to print the syscall number (as far as I know).
Technically, you can use something like
printf '#include <sys/syscall.h>\n' | gcc -dD -E - | awk '$1 == "#define" { m[$2] = $3 } END { for (name in m) if (name ~ /^SYS_/) { v = name; while (v in m) v = m[v]; sub(/^SYS_/, "", name); printf "%s %s\n", v, name } }'
to generate a number of syscall-number syscall-name lines, to be used for mapping syscall names back to syscall numbers, but this would be silly and error-prone. Silly, because being able to use ptrace() gives you much more control than using the strace utility, and using a "clever hack" like above just means you avoid learning how to do that, which in my opinion is by definition self-defeating and therefore utterly silly; and error-prone, because there is absolutely no guarantee that the installed headers match the running architecture. This is especially problematic on multiarch architectures, where you can use -m32 and -m64 compiler options to switch between 32-bit and 64-bit architectures. They typically have completely different syscall numbers.
Essentially, your program should:
fork() a child process.
In the child process:
Enable ptracing by calling prctl(PR_SET_DUMPABLE, 1L)
Make parent process the tracer by calling ptrace(PTRACE_TRACEME, (pid_t)0, (void *)0, (void *)0)
Optionally, set tracing options. For example, call ptrace(PTRACE_SETOPTIONS, getpid(), PTRACE_O_TRACECLONE | PTRACE_O_TRACEEXEC | PTRACE_O_TRACEEXIT | PTRACE_O_TRACEFORK) so that you catch at least clone(), fork(), and exec() family of syscalls.
If you do not set the PTRACE_O_TRACEEXEC option, you should stop the child process at this point using e.g. raise(SIGSTOP);, so that the parent process can start tracing this child.
Execute the command to be traced using e.g. execv(). In particular, if the first command line parameter is the command to run, optionally followed by its options, you can use execvp(argv[1], argv + 1);.
If you set the PTRACE_O_TRACEEXEC option above, then the kernel will auto-pause the child process just before executing the new binary.
If the exec fails, the child process should exit. I like to use exit(127);, to return exit status 127.
In the parent process, use waitpid(childpid, &status, WUNTRACED | WCONTINUED) in a loop, to catch events in the child process.
The very first event should be the initial pause, i.e. WIFSTOPPED(status) being true. (If not, something else went wrong.)
There are three different reasons why waitpid(childpid, &status, WUNTRACED | WCONTINUED) may return:
When the child exits (WIFEXITED(status) will be true).
This should obviously end the tracing, and have the parent tracer process exit, too.
When the child resumes execution (WIFCONTINUED(status) will be true).
You cannot assume that a PTRACE_SYSCALL, PTRACE_SYSEMU, PTRACE_CONT etc. commands have actually caused the child process to continue, until the parent gets this signal. In other words, you cannot just fire ptrace() commands to the child process, and expect them to take place in an orderly fashion! The ptrace() facility is asynchronous, and the call will return immediately; you need to waitpid() for the WIFCONTINUED(status) type of event to know that the child process heeded the command.
When the kernel stopped the child (with SIGTRAP) because the child process is about to execute a syscall. (In the parent, WIFSTOPPED(status) will be true.)
Whenever the child process gets stopped because it is about to execute a syscall, you need to use ptrace(PTRACE_GETREGS, childpid, (void *)0, &regs) to obtain the CPU register state in the child process at the point of syscall execution.
regs is of type struct user, defined in <sys/user.h>. For Intel/AMD architectures, regs.regs.orig_eax (for 32-bit) or regs.regs.orig_rax (for 64-bit) contains the syscall number (SYS_foo as defined in <sys/syscall.h>).
You then need to call ptrace(PTRACE_SYSCALL, childpid, (void *)0, (void *)0) to tell the kernel to execute that syscall, and waitpid() again to wait for the WIFCONTINUED(status) event notifying that it did.
The next WIFSTOPPED(status) type event from waitpid() will occur when the syscall is completed. If you want, you can use PTRACE_GETREGS again to examine regs.regs.eax or regs.regs.rax, which contains the syscall return value; on Intel/AMD, if an error occurred, it will be a negative errno value (i.e. -EACCES, -EINVAL, or similar.)
You need to call ptrace(PTRACE_SYSCALL, childpid, (void *)0, (void *)0) to tell the kernel to continue running the child, until the next syscall.
There are quite a few examples on-line showing some of the details above, although most that I have personally seen are pretty lax on error checking, and occasionally omit checking the WIFCONTINUED(status) waitpid() events. I've even written an answer detailing how to stop and continue individual threads on StackOverflow. Since the technique can be used as a very powerful custom debugging tool, I do recommend you try to learn the facility so you can leverage it in your work, rather than just copy-paste some existing code to get a passing grade on the exercise.

What's the purpose of exit(0) ?

I understand that exit(1) indicates an error, for example:
if (something went wrong)
exit(EXIT_FAILURE);
But what's the purpose of using exit(EXIT_SUCCESS);?
When dealing with processes, maybe? E.g. for fork()?
thanks
This gives the part of the system that invokes the program (usually the command shell) a way to check if the program terminated normally or not.
Edit - start -
By the way, it is possible to query the exit code of an interactive command as well through the use of the $? shell variable. For instance this failed ls command yields an exit code of value 2.
$ ls -3
ls: invalid option -- '3'
Try `ls --help' for more information.
$ echo $?
2
Edit - end -
Imagine a batch file (or shell script) that invokes a series of programs and depending on the outcome of each run may choose some action or the other. This action may consist of a simple message to the user, or the invocation of some other program or set of programs.
This is a way for a program to return a status of its run.
Also, note that zero denotes no problem, any non-zero value indicates a problem.
Programs will often use different non-zero values to pass more information back (other than just non-normal termination). So the non-zero exit value then serves as a more specific error code that can identify a particular problem. This of course depends on the meanings of the code being available (usually/hopefully in the documentation)
For instance, the ls man page has this bit of information at the bottom:
Exit status is 0 if OK, 1 if minor problems, 2 if serious trouble.
For Unix/Linux man pages, look for the section titled EXIT STATUS to get this information.
You can only exit your program with return from within the main function. To exit the program from anywhere else, you can call exit(EXIT_SUCCESS). For example, when the user clicks an exit button.
exit() is a C library function (it finishes by invoking the _exit system call). There's always good information on such functions if you check the man pages:
http://linux.die.net/man/3/exit
On a Linux box, you can simply type man exit into a terminal and this information will come up.
There are two ways of 'normally' exiting a program: returning from main(), or calling exit(). Normally exit() is used, and thought of, for signalling a failure. However, if you are not in main(), you must still exit somehow. exit(0) is usually used to terminate the process when not in main().
main() is actually not a special function to the operating system, only to the runtime environment. The 'function' that actually gets loaded is normally defined as _start() (this is handled by the linker, and beyond the scope of this answer), written in assembly, which simply prepares the environment and calls main(). Upon return from main(), it also calls exit() with the return value from main().

What's the difference between system() in C and Perl?

The system() function will launch a new process from C and a Perl script.
What exactly are the differences between processes called by system() in C and from Perl scripts, in terms of representation of error codes?
A little research brings up:
The return value is the exit status of
the program as returned by the wait
call. To get the actual exit value,
shift right by eight (see below). See
also "exec". This is not what you want
to use to capture the output from a
command, for that you should use
merely backticks or qx//, as described
in "STRING" in perlop. Return value
of -1 indicates a failure to start the
program or an error of the wait(2)
system call (inspect $! for the
reason).
And the docs of wait say:
Behaves like the wait(2) system call
on your system: it waits for a child
process to terminate and returns the
pid of the deceased process, or -1 if
there are no child processes. The
status is returned in $? and
${^CHILD_ERROR_NATIVE} . Note that a
return value of -1 could mean that
child processes are being
automatically reaped, as described in
perlipc.
Sources: This was taken from perldoc. Here's a tutorial on system in Perl.
