Experts,
I have a client that connects over ssh to a server (it gets a tty allocated). I have a process A that is running on the server. Now, whenever the client disconnects, I need A to know about the tty that vanishes.
I was thinking since SSHD knows the session dying (after timeout or a simple exit), it can generate a signal to process A.
Is there any other way that A can get information about the tty that vanishes like listening on SIGHUP for the tty? I am writing the code in C on Linux.
Appreciate your help.
POSIX.1 provides a facility, utmpx, which lists the currently logged in users, their terminals, and other information. In Linux, that is the same as utmp; see man 5 utmp for further information.
OpenSSH does maintain utmp records.
Here is a simple example, that lists all users currently logged in from remote machines, the terminal they are using, and the initial process group the user owns:
#define _POSIX_C_SOURCE 200809L
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#include <utmpx.h>
int main(void)
{
struct utmpx *entry;
setutxent();
while ((entry = getutxent()))
if (entry->ut_type == USER_PROCESS && entry->ut_host[0] != '\0')
printf("%s is logged in on /dev/%s from %s and owns process group %d\n",
entry->ut_user, entry->ut_line, entry->ut_host,
(int)getpgid(entry->ut_pid));
return 0;
}
In your case, I would expect process A to maintain a list of remotely connected users, and periodically do a similar loop as above to update the status of known entries and to add new entries; and remove entries that are no longer seen.
New entries then match a "login" event, entries that are no longer seen a "logout" event (and deleted after the loop), and all other events are "still logged in" users.
The loop above is quite lightweight in terms of CPU time used and I/O used. The utmp records (/var/run/utmp in most Linux machines) are in binary form, and if frequently accessed, usually in page cache. Entries are relatively small, and even on servers with a lot of users the file read is well under a megabyte in size. Still, I wouldn't do it in a tight loop.
Personally, I would use inotify to wait for CLOSE_WRITE events on the UTMPX_FILE file (/var/run/utmp on most Linux machines), and reread the records after each event. That way the service would block on the read() on the inotify file descriptor most of the time (not wasting any CPU time), and pretty much immediately react to any login/logout events.
You face two problems, both difficult. The succinct answer is "you can't"; the longer answer is "you can't without making significant modifications".
A signal relays very little information other than the fact that it occurred. If you use sigaction() and SA_SIGINFO, you can find the process ID of the process that sent the signal, but under your scheme, that would be sshd, which isn't dreadfully helpful. Thus, it will be hard (nigh on impossible) to get the information about which terminal via the signal. Obviously, other schemes can be defined, but you'd have to write the information to a file, or something similar.
You'd have to modify sshd to record the information about which terminal it allocates (or is allocated) to its child processes, and then arrange for it to send that information to your Process A when a child terminates. That would be tricky, at best.
These two factors alone make it rather difficult. If you still want to do it, then the way I'd try is by getting sshd to run a special process of your devising, which in turn forks and the child runs the the process that sshd would otherwise run. The parent (a) records which terminal the child is connected to, and (b) waits for the child to terminate. When it does, it writes the terminal information to somewhere that Process A will find it, and exits. You still have to revise sshd, and you have to devise a mechanism whereby the parent process knows what to run as the child process (but that's probably not very hard; you leave the argument list unchanged, but simply have sshd exec your monitor process instead of whatever is specified as argv[0]…the parent uses argv[0] as the file argument to execvp().
This scheme minimizes the changes to sshd (but does still require a non-standard version). And you have to write the parent code carefully, and it has to cooperate with Process A. All decidedly non-trivial.
Related
I know similar questions have been asked, but I think my situation is little bit different. I need to check if child thread is alive, and if it's not print error message. Child thread is supposed to run all the time. So basically I just need non-block pthread_join and in my case there are no race conditions. Child thread can be killed so I can't set some kind of shared variable from child thread when it completes because it will not be set in this case.
Killing of child thread can be done like this:
kill -9 child_pid
EDIT: alright, this example is wrong but still I'm sure there exists way to kill a specific thread in some way.
EDIT: my motivation for this is to implement another layer of security in my application which requires this check. Even though this check can be bypassed but that is another story.
EDIT: lets say my application is intended as a demo for reverse engineering students. And their task is to hack my application. But I placed some anti-hacking/anti-debugging obstacles in child thread. And I wanted to be sure that this child thread is kept alive. As mentioned in some comments - it's probably not that easy to kill child without messing parent so maybe this check is not necessary. Security checks are present in main thread also but this time I needed to add them in another thread to make main thread responsive.
killed by what and why that thing can't indicate the thread is dead? but even then this sounds fishy
it's almost universally a design error if you need to check if a thread/process is alive - the logic in the code should implicitly handle this.
In your edit it seems you want to do something about a possibility of a thread getting killed by something completely external.
Well, good news. There is no way to do that without bringing the whole process down. All ways of non-voluntary death of a thread kill all threads in the process, apart from cancellation but that can only be triggered by something else in the same process.
The kill(1) command does not send signals to some thread, but to a entire process. Read carefully signal(7) and pthreads(7).
Signals and threads don't mix well together. As a rule of thumb, you don't want to use both.
BTW, using kill -KILL or kill -9 is a mistake. The receiving process don't have the opportunity to handle the SIGKILL signal. You should use SIGTERM ...
If you want to handle SIGTERM in a multi-threaded application, read signal-safety(7) and consider setting some pipe(7) to self (and use poll(2) in some event loop) which the signal handler would write(2). That well-known trick is well explained in Qt documentation. You could also consider the signalfd(2) Linux specific syscall.
If you think of using pthread_kill(3), you probably should not in your case (however, using it with a 0 signal is a valid but crude way to check that the thread exists). Read some Pthread tutorial. Don't forget to pthread_join(3) or pthread_detach(3).
Child thread is supposed to run all the time.
This is the wrong approach. You should know when and how a child thread terminates because you are coding the function passed to pthread_create(3) and you should handle all error cases there and add relevant cleanup code (and perhaps synchronization). So the child thread should run as long as you want it to run and should do appropriate cleanup actions when ending.
Consider also some other inter-process communication mechanism (like socket(7), fifo(7) ...); they are generally more suitable than signals, notably for multi-threaded applications. For example you might design your application as some specialized web or HTTP server (using libonion or some other HTTP server library). You'll then use your web browser, or some HTTP client command (like curl) or HTTP client library like libcurl to drive your multi-threaded application. Or add some RPC ability into your application, perhaps using JSONRPC.
(your putative usage of signals smells very bad and is likely to be some XY problem; consider strongly using something better)
my motivation for this is to implement another layer of security in my application
I don't understand that at all. How can signal and threads add security? I'm guessing you are decreasing the security of your software.
I wanted to be sure that this child thread is kept alive.
You can't be sure, other than by coding well and avoiding bugs (but be aware of Rice's theorem and the Halting Problem: there cannot be any reliable and sound static source code program analysis to check that). If something else (e.g. some other thread, or even bad code in your own one) is e.g. arbitrarily modifying the call stack of your thread, you've got undefined behavior and you can just be very scared.
In practice tools like the gdb debugger, address and thread sanitizers, other compiler instrumentation options, valgrind, can help to find most such bugs, but there is No Silver Bullet.
Maybe you want to take advantage of process isolation, but then you should give up your multi-threading approach, and consider some multi-processing approach. By definition, threads share a lot of resources (notably their virtual address space) with other threads of the same process. So the security checks mentioned in your question don't make much sense. I guess that they are adding more code, but just decrease security (since you'll have more bugs).
Reading a textbook like Operating Systems: Three Easy Pieces should be worthwhile.
You can use pthread_kill() to check if a thread exists.
SYNOPSIS
#include <signal.h>
int pthread_kill(pthread_t thread, int sig);
DESCRIPTION
The pthread_kill() function shall request that a signal be delivered
to the specified thread.
As in kill(), if sig is zero, error checking shall be performed
but no signal shall actually be sent.
Something like
int rc = pthread_kill( thread_id, 0 );
if ( rc != 0 )
{
// thread no longer exists...
}
It's not very useful, though, as stated by others elsewhere, and it's really weak as any type of security measure. Anything with permissions to kill a thread will be able to stop it from running without killing it, or make it run arbitrary code so that it doesn't do what you want.
As the parent process is using huge mount of memory, fork may fail with errno of ENOMEM under some configuration of kernel overcommit policy. Even though the child process may only exec low memory-consuming program like ls.
To clarify the problem, when /proc/sys/vm/overcommit_memory is configured to be 2, allocation of (virtual) memory is limited to SWAP + MEMORY * ration(default to 50%).
When a process forks, virtual memory is not copied thanks to COW. But the kernel still need to allocate virtual memory space. As an analogy, fork is like malloc(virtual memory space size) which will not allocate physical memory and writing to shared memory will cause copy of virtual memory and physical memory is allocated. When overcommit_memory is configured to be 2, fork may fail due to virtual memory space allocation.
Is it possible to fork a process without inherit virtual memory space of parent process in the following conditions?
if the child process calls exec after fork
if the child process doesn't call exec and will not using any global or static variable from parent process. For example, the child process just do some logging then quit.
As Basile Starynkevitch answered, it's not possible.
There is, however, a very simple and common solution used for this, that does not rely on Linux-specific behaviour or memory overcommit control: Use an early-forked slave process do the fork and exec.
Have the large parent process create an unix domain socket and fork a slave process as early as possible, closing all other descriptors in the slave (reopening STDIN_FILENO, STDOUT_FILENO, and STDERR_FILENO to /dev/null). I prefer a datagram socket for its simplicity and guarantees, although a stream socket will also work.
In some rare cases it is useful to have the slave process execute a separate dedicated small helper program. In most instances this is not necessary, and makes security design much easier. (In Linux, you can include SCM_CREDENTIALS ancillary messages when passing data using an Unix domain socket, and use the process ID therein to verify the identity/executable the peer is using the /proc/PID/exe pseudo-file.)
In any case, the slave process will block in reading from the socket. When the other end closes the socket, the read/receive will return 0, and the slave process will exit.
Each datagram the slave process receives, describes a command to execute. (Using a datagram allows using C strings, delimited with NUL characters, without any escaping etc.; using an Unix stream socket typically requires you to delimit the "command" somehow, which in turn means escaping the delimiters in the command component strings.)
The slave process creates one or more pipes, and forks a child process. This child process closes the original Unix socket, replaces the standard streams with the respective pipe ends (closing the other ends), and executes the desired command. I personally prefer to use an extra close-on-exec socket in Linux to detect successful execution; in an error case, the errno code is written to the socket, so that the slave-parent can reliably detect the failure and the exact reason, too. If success, the slave-parent closes the unnecessary pipe ends, replies to the original process about the success, with the other pipe ends as SCM_RIGHTS ancillary data. After sending the message, it closes the rest of the pipe ends, and waits for a new message.
On the original process side, the above process is sequential; only one thread may execute start executing an external process at a time. (You simply serialize the access with a mutex.) Several can run at the same time; it is only the request to and response from the slave helper that is serialized.
If that is an issue -- it should not be in typical cases -- you can for example multiplex the connections, by prefixing each message with an ID number (assigned by the parent process, monotonically increasing). In that case, you'll probably use a dedicated thread on the parent end to manage the communications with the slave, as you certainly cannot have multiple threads reading from the same socket at the same time, and expect deterministic results.
Further improvements to the scheme include things like using a dedicated process group for the executed processes, setting limits to them (by setting limits to the slave process), and executing the commands as dedicated users and groups by using a privileged slave.
The privileged slave case is where it is most useful to have the parent execute a separate helper process for it. In Linux, both sides can use SCM_CREDENTIALS ancillary messages via Unix domain sockets to verify the identity (PID, and with ID, the executable) of the peer, making it rather straightforward to implement robust security. (But note that /proc/PID/exe has to be checked more than once, to catch the attacks where a message is sent by a nefarious program, quickly executing the appropriate program but with command-line arguments that cause it to exit soon, making it occasionally look like the correct executable made the request, while a copy of the descriptor -- and thus the entire communications channel -- was in control of a nefariuous user.)
In summary, the original problem can be solved, although the answer to the posed question is No. If the executions are security-sensitive, for example change privileges (user accounts) or capabilities (in Linux), then the design has to be carefully considered, but in normal cases the implementation is quite straight-forward.
I'd be happy to elaborate if necessary.
No, it is not possible. You might be interested by vfork(2) which I don't recommend. Look also into mmap(2) and its MAP_NORESERVE flag. But copy-on-write techniques are used by the kernel, so you practically won't double the RAM consumption.
My suggestion is to have enough swap space to not being concerned by such an issue. So setup your computer to have more available swap space than the largest running process. You can always create some temporary swap file (e.g. with dd if=/dev/zero of=/var/tmp/swapfile bs=1M count=32768 then mkswap /var/tmp/swapfile) then add it as a temporary swap zone (swapon /var/tmp/swapfile) and remove it (swapoff /var/tmp/swapfile and rm /var/tmp/swapfile) when you don't need it anymore.
You probably don't want to swap on a tmpfs file system like /tmp/ often is, since tmpfs file systems are backed up by swap space!.
I dislike memory overcommitment and I disable it (thru proc(5)). YMMV.
I'm not aware of any way to do (2), but for (1) you could try to use vfork which will fork a new process without copying the page tables of the parent process. But this generally isn't recommended for a number of reasons, including because it causes the parent to block until the child performs an execve or terminates.
This is possible on Linux. Use the clone syscall without the flag CLONE_THREAD and with the flag CLONE_VM. The parent and child processes will use the same mappings, much like a thread would; there is no COW or page table copying.
madvise(addr, size, MADV_DONTFORK)
Alternatively, you can call munmap() after fork() to remove the virtual addresses inherited from the parent process.
Is there an easy way to determine if a certain process is running?
I need to know if an instance of my program is running in the background, and if not fork and create the background process.
Normally the race-free way of doing this is:
Open a lock file / pid file for writing (but do not truncate it)
Attempt to take an exclusive lock on it (using fcntl or flock) without blocking
If that fails with EAGAIN, then the other process is already running.
The file descriptor should now be inherited by the daemon and left open for its lifetime
The advantage of doing this over simply storing a PID, is that if somebody reuses the PID, you won't get a false positive.
The biggest problem with storing the pid in the file is that a low-numbered pid used by a system start up daemon can get reused on a subsequent reboot by a different daemon. I have seen this happen.
This is usually done using pidfiles: a file in /var/run/[name].pid containing only the process ID returned by fork().
if pidfile exists:
exit()
else:
create pidfile
pid = start_background()
pidfile.write(pid)
On shutdown: remove pidfile
Linux software, by far and large does not care about the exclusivity of programs, only the resources they use. "Caring" is most often provided by the implementation (E.G. the infrastructure of the distro).
For instance, if you want to run a program, but that program locks up or turns zombie and you have no way to kill it, or it's running as a different user performing some other function. Why should the program care whether another copy of itself is running? Having it do so only seems like an unnecessary restriction.
If it's a process that opens a socket (like a TCP port), have the program fail if it can't open the socket. If it needs exclusive access to a file, have it fail if it can't get it. Support a PID file, but don't make it mandatory.
You'll see this methodology all over GNU software, which is part of what makes it so versatile.
I want only one copy of my program in the system. How can I look for another copies in the system from C-code? I want something like that
# program &
[1] 12586
# program &
Program is already running
The best idea I have is making .lock-files. But I didn't find any guildlines about them.
Thank you.
One daemon I wrote opened a UNIX domain socket for regular client-daemon communication. Other instances then checked whether they could connect to that socket. If they could, another instance was currently running.
Edit: As noted by #psmears, there's a race condition. The other instances should just try to create that same listening socket. That will fail if it is already in use.
Lock files work more often than that special case. You may create an (empty) file in a well known location and then use file locks, say with fcntl(2) and F_SETLK and F_GETLK to set a lock on that file or determine whether a lock is held. May not work over NFS. Locks are cleared when your process dies, so this should work, and is portable (at least to HP-UX). Some daemons like to dump their pid into that file if they determine that no other instance is currently running.
You can use named sempahores, which is a very standard approach to this problem.
Your program calls semctl() to find if there are any active sempahores, then checks to see if you can run. If you find none, then you create the sempahore.
The OS handles the problem of processes being killed off with kill -9 and leaving sempahores.
You need to read the man page for semctl and sem_open for your machines to see what that mechanism
is.
This question already has answers here:
Closed 13 years ago.
Possible Duplicate:
Linux API to list running processes?
How can I detect hung processes in Linux using C?
Under linux the way to do this is by examining the contents of /proc/[PID]/* a good one-stop location would be /proc/*/status. Its first two lines are:
Name: [program name]
State: R (running)
Of course, detecting hung processes is an entirely separate issue.
/proc//stat is a more machine-readable format of the same info as /proc//status, and is, in fact, what the ps(1) command reads to produce its output.
Monitoring and/or killing a process is just a matter of system calls. I'd think the toughest part of your question would really be reliably determining that a process is "hung", rather than meerly very busy (or waiting for a temporary condition).
In the general case, I'd think this would be rather difficult. Even Windows asks for a decision from the user when it thinks a program might be "hung" (on my system it is often wrong about that, too).
However, if you have a specific program that likes to hang in a specific way, I'd think you ought to be able to reliably detect that.
Seeing as the question has changed:
http://procps.sourceforge.net/
Is the source of ps and other process tools. They do indeed use proc (indicating it is probably the conventional and best way to read process information). Their source is quite readable. The file
/procps-3.2.8/proc/readproc.c
You can also link your program to libproc, which sould be available in your repo (or already installed I would say) but you will need the "-dev" variation for the headers and what-not. Using this API you can read process information and status.
You can use the psState() function through libproc to check for things like
#define PS_RUN 1 /* process is running */
#define PS_STOP 2 /* process is stopped */
#define PS_LOST 3 /* process is lost to control (EAGAIN) */
#define PS_UNDEAD 4 /* process is terminated (zombie) */
#define PS_DEAD 5 /* process is terminated (core file) */
#define PS_IDLE 6 /* process has not been run */
In response to comment
IIRC, unless your program is on the CPU and you can prod it from within the kernel with signals ... you can't really tell how responsive it is. Even then, after the trap a signal handler is called which may run fine in the state.
Best bet is to schedule another process on another core that can poke the process in some way while it is running (or in a loop, or non-responsive). But I could be wrong here, and it would be tricky.
Good Luck
You may be able to use whatever mechanism strace() uses to determine what system calls the process is making. Then, you could determine what system calls you end up in for things like pthread_mutex deadlocks, or whatever... You could then use a heuristic approach and just decide that if a process is hung on a lock system call for more than 30 seconds, it's deadlocked.
You can run 'strace -p ' on a process pid to determine what (if any) system calls it is making. If a process is not making any system calls but is using CPU time then it is either hung, or is running in a tight calculation loop inside userspace. You'd really need to know the expected behaviour of the individual program to know for sure. If it is not making system calls but is not using CPU, it could also just be idle or deadlocked.
The only bulletproof way to do this, is to modify the program being monitored to either send a 'ping' every so often to a 'watchdog' process, or to respond to a ping request when requested, eg, a socket connection where you can ask it "Are you Alive?" and get back "Yes". The program can be coded in such a way that it is unlikely to do the ping if it has gone off into the weeds somewhere and is not executing properly. I'm pretty sure this is how Windows knows a process is hung, because every Windows program has some sort of event queue where it processes a known set of APIs from the operating system.
Not necessarily a programmatic way, but one way to tell if a program is 'hung' is to break into it with gdb and pull a backtrace and see if it is stuck somewhere.