Why do we use 42 as the argument of exit when exiting a process? Is it some macro value (the way 1 is the value of the EXIT_FAILURE macro), or does it have some deeper meaning?
if (pid == 0) {
    printf("something\n");
    exit(42);
}
It is kind of clear that it doesn't matter if I use exit(1) or exit(42), but why just 42?
Any number except for 0 would have done. But 42 is the Answer to the Ultimate Question of Life, the Universe, and Everything.
Very popular among IT people...
But why did Douglas Adams pick 42?
I sat at my desk, stared into the garden and thought '42 will do'. I
typed it out. End of story
Such a magic value may be used to indicate the exact exit reason to the parent process. You may treat it as a kind of minimalistic IPC. Of course, both processes must agree on the actual values and their meanings, and must not use the specially reserved exit codes.
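To make the "minimalistic IPC" idea concrete, here is a small sketch in C. The helper name child_exit_code is made up for illustration; it forks a child that exits with the agreed value and returns whatever the parent reads back through waitpid.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that exits with `code` and return the
   value the parent observes, or -1 if fork/waitpid fails or the child did
   not exit normally. */
int child_exit_code(int code)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(code);            /* child: terminate with the agreed value */

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (!WIFEXITED(status))
        return -1;              /* e.g. the child was killed by a signal */
    return WEXITSTATUS(status); /* only the low 8 bits of the value survive */
}
```

Calling child_exit_code(42) in the parent should give back 42. Because only the low 8 bits are passed on, the agreed values need to stay in the range 0 to 255.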
It is kind of clear that it doesn't matter if I use exit(1) or exit(42)
It actually matters a lot.
The exit code can be used by the process that launches the exiting process to know how it completed and why it failed.
The process that launches your program (a shell, for example) can inspect the value of the special parameter $? immediately after your program completes to know whether it succeeded or, if it didn't, why it failed.
Let's say your program downloads a file from a remote site and stores it in a local directory. It expects to use an existing local directory and does not attempt to create it if it doesn't exist. It can exit, for example, with code 37 when the remote file cannot be downloaded because the remote site returns 404 Not Found, with code 62 when it cannot download the file because the network is down (or a timeout happens), and with code 41 when the local directory does not exist.
A bash script, for example, that invokes your program can check the value of $? immediately after your program completes. If the value is 37 (remote file not found), it must not retry, because the error is permanent. On exit code 62 (network issues) it can wait a couple of seconds and try again (the error condition is transient and could disappear after a while). On exit code 41 (local directory not found) it can create the local directory and then launch your program again (a precondition was not met).
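The same decision logic can be written by a parent program in C rather than a shell script. This is only a sketch: the codes 37, 41 and 62 are the made-up values from the example above, and the enum and function names are hypothetical.

```c
/* Hypothetical exit codes taken from the download example above. */
enum {
    DL_REMOTE_NOT_FOUND  = 37,
    DL_LOCAL_DIR_MISSING = 41,
    DL_NETWORK_DOWN      = 62
};

enum next_step {
    STEP_DONE,
    STEP_GIVE_UP,
    STEP_RETRY_LATER,
    STEP_CREATE_DIR_AND_RETRY
};

/* Map the child's exit code (what a shell would show in $?) to an action. */
enum next_step decide_next_step(int exit_code)
{
    switch (exit_code) {
    case 0:
        return STEP_DONE;
    case DL_NETWORK_DOWN:
        return STEP_RETRY_LATER;          /* transient: wait, then retry */
    case DL_LOCAL_DIR_MISSING:
        return STEP_CREATE_DIR_AND_RETRY; /* fix the precondition first */
    case DL_REMOTE_NOT_FOUND:
    default:
        return STEP_GIVE_UP;              /* permanent or unknown failure */
    }
}
```

Treating unknown codes as permanent failures is a design choice of this sketch; a caller that prefers to retry on unknown codes would change the default branch.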
I'm using Camel Exec for automated shutdowns on some of our devices.
The shutdown command is pretty simple, and it mostly works fine:
from(START_DEEP_SLEEP)
    .setBody(constant(null)) // we don't want stdin for exec
    .setHeader(ExecBinding.EXEC_COMMAND_ARGS, constant("""shutdown $shutdownDelay "starting deep sleep shutdown" """))
    .to("exec:sudo")
Obviously, this command will send a shutdown to the application executing it. That isn't much of an issue either, except that it sometimes produces an exit value of 143. I know the meaning of the return value, and it makes sense to see it here, but this only happens on some devices. Most others just return 0. They are all of the same type, so I really don't know where this discrepancy comes from, but it's not even that big an issue. The shutdown works nonetheless.
The problem is that Camel Exec logs this as an error:
ERROR 549 --- [Camel (camel-1) thread #1 - seda://start-deepsleep] o.a.camel.component.exec.ExecProducer : The command ExecCommand [args=[shutdown, now, starting deep sleep shutdown], executable=sudo, timeout=9223372036854775807, outFile=null, workingDir=null, useStderrOnEmptyStdout=false] returned exit value 143
This produces undesired noise in our monitoring, and I would rather not have it logged.
The core issue here is that Camel Exec does not throw, so there's no exception I could handle. It just logs the error, which then gets picked up by our log analysis.
I would like to handle that exit code gracefully without Camel Exec logging an error. The return value is already logged separately anyway. How can I do that?
According to the documentation at http://camel.apache.org/exec.html there is a header, ExecBinding.EXEC_EXIT_VALUE, that is filled with the exit value. Yours should be 143 (the documentation states that this depends on the OS).
That could be a "hook" to handle the log entry, e.g. deleting the last entry with the same error number.
Of course this is only a cosmetic fix. The implementation could be like this:
from(START_DEEP_SLEEP)
    .setBody(constant(null)) // we don't want stdin for exec
    .setHeader(ExecBinding.EXEC_COMMAND_ARGS, constant("""shutdown $shutdownDelay "starting deep sleep shutdown" """))
    .to("exec:sudo")
    .choice() // when() is only valid inside a choice() block
        .when(header(ExecBinding.EXEC_EXIT_VALUE).isEqualTo(143))
            .to("direct:edit_the_log")
    .end()
Please note that I did not test that code. Maybe you access that header with
.when(header(EXEC_EXIT_VALUE))
instead.
Please let me know whether this could be a proper solution.
I am running LTP (Linux Test Project) on an embedded device. The device gets stuck in the following while loop in test case setfsgid03, because getgrgid() always returns NULL when it is called by nobody.
It works fine on the embedded device when it is called by root, and it works fine on an x86 Linux host when it is called by nobody.
Is this caused by some configuration of Linux on the device?
Relevant code snippet is below:
gid = 1;
while (!getgrgid(gid))
    gid++;
getgrgid reads its entries from /etc/group or, with glibc, more generally from the sources specified in /etc/nsswitch.conf. If /etc/group does not exist, or contains no groups besides GID 0, this code will loop at least until gid wraps around (or overflows, if it is signed). If the only entry is the one for nobody at GID -2, it will also take ages to reach that GID.
All in all, the code is utterly bad. I would just ensure that /etc/group has an entry with, say, GID 2. The proper way to find a defined non-root GID would be to call getgrent_r repeatedly until the returned record has gr_gid != 0, and to fail if the entries run out before such a record is found.
I want to use P3DFFT first with NxNxN dimensions and, in between, with 2Nx2Nx2N dimensions. When I tried to implement that, it reported:
P3DFFT Setup error: the problem is already initialized.
Currently multiple setups not supported.
Quit the library using p3dfft_clean before initializing another setup
Alternatively, if I set up P3DFFT twice (without cleaning up the first setup), it fails with exit code 139:
===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= PID 4127 RUNNING AT localhost.localdomain
= EXIT CODE: 139
= CLEANING UP REMAINING PROCESSES
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================
YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation fault (signal 11)
This typically refers to a problem with your application.
Please see the FAQ page for debugging suggestions
I tried making a header file with a single function that sets up P3DFFT for a given dimension, but the error is still there when the function is called twice with two different dimensions.
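The report above pairs exit code 139 with signal 11. That matches the common shell convention of reporting 128 plus the signal number when a process is killed by a signal. A tiny sketch of the decoding, with a made-up function name:

```c
/* Exit codes above 128 conventionally mean "killed by signal (code - 128)".
   Returns the signal number, or 0 if the code does not follow that
   convention. */
int signal_from_exit_code(int code)
{
    return (code > 128 && code < 256) ? code - 128 : 0;
}
```

With this, 139 decodes to signal 11 (SIGSEGV on Linux), which is a crash inside the process rather than an error code chosen by the library.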
I am developing a shared library L which is used by another system service S. In my library I need to call a small program P, which uses Oracle shared libraries.
The small program P is packaged and installed correctly, and the environment variables, such as PATH, LD_LIBRARY_PATH and ORACLE_HOME, are set correctly. I can run P on the command line without any problem.
But when service S calls my library L, which runs the small program P via system(), I get a return code of 127. I've googled it, and people say it's a "command not found" error, probably a PATH issue, so I've tried an absolute path like the following:
int status = system("/usr/bin/myprog --args");
if (status < 0) { ... }
ret = WEXITSTATUS(status);
ret still equals 127.
Any ideas, please? Thank you.
Update
It turns out that the service S is launched via the daemon command. In its init.d script, I found the following line:
daemon /usr/bin/myserv
If I explicitly export all my environment variables (PATH, ORACLE_HOME and LD_LIBRARY_PATH), it works. I don't know whether daemon removes my environment variables.
This excerpt from the man page for system()
-----------------------------------------------------------------
The value returned is -1 on error (e.g., fork(2) failed), and the return status of the command otherwise. This latter return status is in the format specified in wait(2). Thus, the exit code of the command will be WEXITSTATUS(status). In case /bin/sh could not be executed, the exit status will be that of a command that does exit(127).
-----------------------------------------------------------------
indicates that 127 means that /bin/sh could not be executed.
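A sketch of decoding the value returned by system() along the lines of that excerpt is shown below. The function name and the -2 return value for "killed by a signal" are choices made for this illustration only. Note that 127 is also what the shell itself reports when it cannot find the command, so the value alone does not tell the two cases apart.

```c
#include <stdlib.h>
#include <sys/wait.h>

/* Run `cmd` through system(). Returns the command's exit code (0-255),
   -1 if system() itself failed, or -2 if the command was killed by a
   signal. */
int run_and_get_exit_code(const char *cmd)
{
    int status = system(cmd);
    if (status == -1)
        return -1;                  /* fork or wait failed */
    if (WIFEXITED(status))
        return WEXITSTATUS(status); /* normal exit: extract the code */
    return -2;                      /* terminated by a signal */
}
```

Checking WIFEXITED before WEXITSTATUS matters because WEXITSTATUS is only meaningful when the child exited normally.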
Well, I have found the answer in "How to make unix service see environment variables?": the environment variables are removed in the init.d script.
I'm not an expert C programmer. I'm having trouble debugging a program using GDB. (The bug I am trying to fix is unrelated to the problem I am asking about here.) My problem is that the program runs fine when I run the binary directly from a shell, but the program crashes when I run it using GDB.
Here is some information about the program which may be useful: it is a 20+ year old piece of database software, originally written for Solaris (I think) but since ported to Linux, which is setuid (but not to root, thank god).
The program crashes in GDB when trying to open a file for writing. Using GDB, I was able to determine that the crash occurs because the following system call fails:
fd = open(path, O_WRONLY|O_CREAT|O_TRUNC, 0644);
For clarification: path is the path to a lockfile which should not exist. If the lock file exists, then the program shuts down cleanly before it even reaches this system call.
I do not understand why this system call would fail, since 1) The user this program runs as has rwx permissions on the directory containing path (I have verified this by examining the value of the variable stored in path), and 2) the program successfully opens the file for writing when I am not using GDB to debug it.
Are there any reasons why I cannot
The key turns out to be this bit:
... is setuid (but not to root, thank god).
When you run a program under (any) debugger (using any of the stop-and-inspect/modify program facilities), the kernel disables setuid-ness, even for non-root setuid.
If you think about this a bit it makes sense. Consider a game that keeps a "high scores" file, and uses "setuid games" to do this, with:
fd = open(GAME_SCORE_FILE, open_mode, file_mode);
score_data = read_scores(fd);
/* set breakpoint here or so */
if (check_for_new_high_score(current_score, score_data)) {
    printf("congratulations, you've entered the High Scores records!\n");
    save_scores(fd, score_data);
}
close(fd);
Access to the "high scores" file is protected by file permissions: only the "games" user can write to it.
If you run the game under a debugger, though, you can set a breakpoint at the marked line, and set the current_score data to some super-high value and then resume the program.
To avoid allowing debuggers to corrupt the internal data of setuid programs, the kernel simply disables setuid-ness when running code with debug facilities enabled. If you can su (or sudo or whatever) to the user, indicating that you have permission regardless of any debugging, you can then run gdb itself as that user, so that the program runs as the user it "would have" setuid-ed to.
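One way to confirm this diagnosis is to report the real and effective user IDs at the point where the open call fails. The sketch below uses made-up helper names; it reuses the open flags from the question and prints both IDs next to the errno text.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Nonzero when the effective UID differs from the real UID,
   i.e. when setuid is actually in effect for this process. */
int setuid_in_effect(void)
{
    return getuid() != geteuid();
}

/* Open the lock file as in the question; on failure, report the errno
   text and both UIDs on stderr. Returns the descriptor or -1. */
int open_lockfile_verbose(const char *path)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) {
        int saved = errno;          /* fprintf may overwrite errno */
        fprintf(stderr,
                "open(%s) failed: %s (real uid %ld, effective uid %ld)\n",
                path, strerror(saved), (long)getuid(), (long)geteuid());
        errno = saved;
    }
    return fd;
}
```

If the two IDs differ when the binary is run from the shell but are identical under GDB, the setuid bit is being ignored in the debugged process, which would be consistent with the explanation above.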