Logging compatibly with logrotate - c

I am writing a Linux daemon that writes a log. I'd like the log to be rotated by logrotate. The program is written in C.
Normally, my program would open the log file when it starts, then write entries as needed and then, finally, close the log file on exit.
What do I need to do differently in order to support log rotation using logrotate? As far as I have understood, my program should be able to reopen the log file each time logrotate has finished its work. The sources that I googled didn't, however, specify what reopening the log file exactly means. Do I need to do something about the old file and can I just create another file with the same name? I'd prefer quite specific instructions, like some simple sample code.
I also understood that there should be a way to tell my program when it is time to do the reopening. My program already has a D-Bus interface and I thought of using that for those notifications.
Note: I don't need instructions on how to configure logrotate. This question is only about how to make my own software compatible with it.

There are several common ways:
you use logrotate and your program should be able to catch a signal (usually SIGHUP) as a request to close and reopen its log file. Then logrotate sends the signal in a postrotate script
you use logrotate and your program is not aware of it, but can be restarted. Then logrotate restarts your program in a postrotate script. Cons: if the start of the program is expensive, this may be suboptimal
you use logrotate and your program is not aware of it, but you pass the copytruncate option to logrotate. Then logrotate copies the file and then truncates the original in place. Cons: messages written between the copy and the truncation can be lost. From the logrotate man page:
... Note that there is a very small time slice between copying the file and truncating it, so some logging data might be lost...
you use rotatelogs, a utility shipped with the Apache HTTP server (httpd). Instead of writing directly to a file, your program pipes its logs to rotatelogs, which then manages the different log files. Cons: your program must be able to log to a pipe, or you will need to set up a named FIFO.
But beware: for critical logs, it may be worth closing (or at least flushing) the file after each message, because that ensures every message has left the process's own buffers and been handed to the kernel even if the application crashes afterwards.
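For a single-threaded daemon, the first approach can be sketched roughly as follows: the signal handler only sets a flag, and the logging function closes and reopens the file the next time it is called. The names simple_log_open() and simple_log_line() are made up for this illustration, and SIGHUP is only assumed to be the signal logrotate sends.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdio.h>
#include <string.h>

/* Set by the signal handler, consumed by simple_log_line(). */
static volatile sig_atomic_t reopen_requested = 0;
static FILE *log_stream = NULL;
static const char *log_name = NULL;  /* caller must keep this string alive */

/* Only sets a flag, which is async-signal-safe. */
static void on_rotate_signal(int signum)
{
    (void)signum;
    reopen_requested = 1;
}

/* Install the handler and open the log in append mode. Returns 0 if successful. */
int simple_log_open(const char *path)
{
    struct sigaction act;

    memset(&act, 0, sizeof act);
    sigemptyset(&act.sa_mask);
    act.sa_handler = on_rotate_signal;
    act.sa_flags = SA_RESTART;
    if (sigaction(SIGHUP, &act, NULL) == -1)
        return -1;

    log_name = path;
    log_stream = fopen(path, "a");
    return log_stream ? 0 : -1;
}

/* Write one line, reopening the file first if rotation was signalled. */
int simple_log_line(const char *msg)
{
    if (reopen_requested) {
        reopen_requested = 0;
        if (log_stream)
            fclose(log_stream);             /* lets go of the rotated file */
        log_stream = fopen(log_name, "a");  /* creates the new file if needed */
    }
    if (!log_stream)
        return -1;
    if (fprintf(log_stream, "%s\n", msg) < 0)
        return -1;
    return fflush(log_stream) == 0 ? 0 : -1;
}
```

"Reopening" here means nothing more than fclose() followed by fopen() on the same path: logrotate has already renamed the old file, so the old stream still points at the rotated file until it is closed, and fopen() in append mode creates the fresh one.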

Although man logrotate examples use the HUP signal, I recommend using USR1 or USR2, as it is common to use HUP for "reload configuration". So, in logrotate configuration file, you'd have for example
/var/log/yourapp/log {
rotate 7
weekly
postrotate
/usr/bin/killall -USR1 yourapp
endscript
}
The tricky bit is to handle the case where the signal arrives in the middle of logging. The fact that none of the locking primitives (other than sem_post(), which does not help here) are async-signal safe makes it an interesting issue.
The easiest way to do it is to use a dedicated thread, waiting in sigwaitinfo(), with the signal blocked in all threads. At exit time, the process sends the signal to itself, and joins the dedicated thread. For example,
#define ROTATE_SIGNAL SIGUSR1
static pthread_t log_thread;
static pthread_mutex_t log_lock = PTHREAD_MUTEX_INITIALIZER;
static char *log_path = NULL;
static FILE *volatile log_file = NULL;
int log(const char *format, ...)
{
va_list args;
int retval;
if (!format)
return -1;
if (!*format)
return 0;
va_start(args, format);
pthread_mutex_lock(&log_lock);
/* Never return while still holding log_lock or before va_end(). */
if (log_file)
retval = vfprintf(log_file, format, args);
else
retval = -1;
pthread_mutex_unlock(&log_lock);
va_end(args);
return retval;
}
void *log_sighandler(void *unused)
{
siginfo_t info;
sigset_t sigs;
int signum;
sigemptyset(&sigs);
sigaddset(&sigs, ROTATE_SIGNAL);
while (1) {
signum = sigwaitinfo(&sigs, &info);
if (signum != ROTATE_SIGNAL)
continue;
/* Sent by this process itself, for exiting? */
if (info.si_pid == getpid())
break;
pthread_mutex_lock(&log_lock);
if (log_file) {
fflush(log_file);
fclose(log_file);
log_file = NULL;
}
if (log_path) {
log_file = fopen(log_path, "a");
}
pthread_mutex_unlock(&log_lock);
}
/* Close time. */
pthread_mutex_lock(&log_lock);
if (log_file) {
fflush(log_file);
fclose(log_file);
log_file = NULL;
}
pthread_mutex_unlock(&log_lock);
return NULL;
}
/* Initialize logging to the specified path.
Returns 0 if successful, errno otherwise. */
int log_init(const char *path)
{
sigset_t sigs;
pthread_attr_t attrs;
int retval;
/* Block the rotate signal in all threads. */
sigemptyset(&sigs);
sigaddset(&sigs, ROTATE_SIGNAL);
pthread_sigmask(SIG_BLOCK, &sigs, NULL);
/* Open the log file. Since this is in the main thread,
before the rotate signal thread, no need to use log_lock. */
if (log_file) {
/* You're using this wrong. */
fflush(log_file);
fclose(log_file);
}
log_file = fopen(path, "a");
if (!log_file)
return errno;
log_path = strdup(path);
/* Create a thread to handle the rotate signal, with a tiny stack. */
pthread_attr_init(&attrs);
pthread_attr_setstacksize(&attrs, 65536);
retval = pthread_create(&log_thread, &attrs, log_sighandler, NULL);
pthread_attr_destroy(&attrs);
if (retval)
return errno = retval;
return 0;
}
void log_done(void)
{
pthread_kill(log_thread, ROTATE_SIGNAL);
pthread_join(log_thread, NULL);
free(log_path);
log_path = NULL;
}
The idea is that in main(), before logging or creating any other threads, you call log_init(path-to-log-file), noting that a copy of the log file path is saved. It sets up the signal mask (inherited by any threads you might create), and creates the helper thread. Before exiting, you call log_done(). To log something to the log file, use log() like you would use printf().
I'd personally also add a timestamp before the vfprintf() line, automatically:
struct timespec ts;
struct tm tm;
if (clock_gettime(CLOCK_REALTIME, &ts) == 0 &&
localtime_r(&(ts.tv_sec), &tm) == &tm)
fprintf(log_file, "%04d-%02d-%02d %02d:%02d:%02d.%03ld: ",
tm.tm_year + 1900, tm.tm_mon + 1, tm.tm_mday,
tm.tm_hour, tm.tm_min, tm.tm_sec,
ts.tv_nsec / 1000000L);
This YYYY-MM-DD HH:MM:SS.sss format has the nice benefit that it is close to a worldwide standard (ISO 8601) and sorts in the correct order.

Normally, my program would open the log file when it starts, then
write entries as needed and then, finally, close the log file on exit.
What do I need to do differently in order to support log rotation
using logrotate?
Nothing, provided logrotate is configured with the copytruncate option: your program can then work as if it doesn't know anything about logrotate.
Do I need to do something about the old file and can I just create another file with the same name?
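No. There should be only one log file to be opened and written. With copytruncate, logrotate checks that file and, once its rotation criteria are met (for example, the file has grown too large), copies the contents to a rotated file and truncates the current log file in place. Your program therefore keeps writing to the same open file and doesn't need to know anything about logrotate. Open the log in append mode, so that writes after the truncation land at the new end of the file.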
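One detail matters if the program keeps a single file open while something else truncates it: the open mode. The sketch below mimics the truncation step with truncate() (it does not run logrotate itself) and compares a stream opened with "a" against one opened with "w". The helper names are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Returns the size of the file in bytes, or -1 on error. */
long file_size(const char *path)
{
    struct stat st;
    return stat(path, &st) == 0 ? (long)st.st_size : -1L;
}

/* Writes a 10-byte entry, truncates the file behind the stream's back
   (as an external truncation would), then writes a 4-byte entry through
   the same stream. Returns the final file size, or -1 on error. */
long size_after_truncate(const char *path, const char *mode)
{
    FILE *f = fopen(path, mode);
    if (!f)
        return -1;

    fputs("old entry\n", f);
    fflush(f);

    if (truncate(path, 0) == -1) {
        fclose(f);
        return -1;
    }

    fputs("new\n", f);
    if (fclose(f) != 0)
        return -1;

    return file_size(path);
}
```

With "a" every write is positioned at the current end of the file, so the result is just the 4 new bytes. With "w" the stream keeps its old offset of 10, so the new entry is written after a hole and the file ends up 14 bytes long, starting with NUL bytes.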

Related

attach a terminal to a process running as a daemon (to run an ncurses UI)

I have a (legacy) program which acts as a daemon (in the sense it runs forever waiting for requests to service) but which has an ncurses based user interface which runs on the host.
I would like to alter the program such that if I connect to the host via ssh I can enable the user interface on demand.
I know there is at least one way using pseudo-terminals but I'm not quite sure how to achieve it.
There are two application behaviours I consider interesting:
Run the UI only if the application is running in the foreground on a terminal
If the application runs in the foreground on a terminal - display the UI
If the application runs in the background - do not display the UI
If the application is moved to the background - close the UI
If the application is moved to the foreground of a terminal - open the UI
Create a new UI on demand when someone connects to the server
The application is running in the background
A new user logs in to the machine
They run something which causes an instance of the UI to open in their terminal
Multiple users can have their own instances of the UI.
Notes
There is a simple way to do this using screen. So:
original:
screen mydaemon etc...
new ssh session:
screen -d
screen -r
This detaches the screen session, leaving it running in the background, and then reattaches it to the current terminal. On closing the terminal the screen session becomes detached, so this works quite well.
I'd like to understand what screen does under the hood, both for my own education and to understand how you would put some of that functionality into the application itself.
I know how I would do this for a server connected via a socket. What I would like to understand is how this could be done in principle with pseudo terminals. It is indeed an odd way to make an application work but I think it would serve to explore deeply the powers and limitations of using pseudo-terminals.
For case one, I assume I want the ncurses UI running in a slave terminal, with the master side passing input to it and output from it.
The master process would use something like isatty() to check whether it is currently in the foreground of a terminal and activate or deactivate the UI using newterm() and endwin().
I've been experimenting with this but I have not got it to work yet, as there are some aspects of terminals and ncurses that I have at best not got to grips with yet and at worst fundamentally misunderstand.
Pseudo code for this is:
openpty(masterfd,slavefd)
login_tty();
fork();
ifslave
close(stdin)
close(stdout)
dup_a_new_stdin_from_slavefd();
newterm(NULL, newinfd, newoutfd); (
printw("hello world");
insert_uiloop_here();
endwin();
else ifmaster
catchandforwardtoslave(SIGWINCH);
while(noexit)
{
docommswithslave();
forward_output_as_appropriate();
}
Typically I either get a segfault inside fileno_unlocked() in newterm()
or output on the invoking terminal rather than a new invisible terminal.
Questions
What is wrong with the above pseudo code?
Do I have the master and slave ends the right way around?
What does login_tty actually do here?
Is there any practical difference between openpty() + login_tty() vs posix_openpt() + grantpt()?
Does there have to be a running process associated with the slave or master tty at all times?
Note: This is a different question to ncurses-newterm-following-openpty which describes a particular incorrect/incomplete implementation for this use case and asks what is wrong with it.
This is a good question, and a good example of why we have pseudoterminals.
For the daemon to be able to use an ncurses interface, it requires a pseudoterminal (the slave side of a pseudoterminal pair), which is available from the point the daemon starts executing, continuously, until the daemon exits.
For a pseudoterminal to exist, there must be a process that has an open descriptor to the master side of the pseudoterminal pair. Additionally, it must consume all output from the pseudoterminal slave side (visible stuff output by ncurses). Usually, a library like vterm is used to interpret that output to "draw" the actual text framebuffer into an array (well, usually two arrays - one for the wide characters displayed in each cell (specific row and column), and another for the attributes like color).
For the pseudoterminal pair to work correctly, either the process at the master end is a parent or ancestor of the process running ncurses in the slave end, or the two are completely unrelated. The process running ncurses in the slave end should be in a new session, with the pseudoterminal as its controlling terminal. This is easiest to achieve, if we use a small pseudoterminal "server" that launches the daemon in a child process; and indeed, this is the pattern that is typically used with pseudoterminals.
The first scenario is not really feasible, because there is no parent/master process maintaining the pseudoterminal.
We can provide the behaviour of the first scenario, by adding a small pseudoterminal-providing "janitor" process, whose task is to maintain the pseudoterminal pair in existence, and to consume any ncurses output generated by the process running in the pseudoterminal pair.
However, that behaviour also matches the second scenario.
Put another way, here is what would work:
Instead of launching the daemon directly, we use a custom program, say 'janitor', that creates a pseudoterminal and runs the daemon inside that pseudoterminal.
Janitor will stay running for as long as the daemon runs.
Janitor provides an interface for other processes to "connect" to the master side of the pseudoterminal pair.
This does not necessarily mean 1:1 proxying of data. Usually input (keypresses) to the daemon are provided unmodified, but how the contents of the pseudoterminal "framebuffer", the character-based virtual window contents, are transferred does vary. This is completely under our own control.
To connect to the janitor, we'll need a second helper program.
In the case of 'screen', these two programs are actually the same binary; the behaviour is just controlled by command-line parameters, and keypresses "consumed" by 'screen' itself, to control 'screen' behaviour and not passed to the actual ncurses-based process running in the pseudoterminal.
Thus far, we could just examine tmux or screen sources to see how they do the above; it is very straightforward terminal multiplexing stuff.
However, here we have a very interesting bit I had not considered before; this small bit made me understand the quite important core of this question:
Multiple users can have their own instances of the UI.
A process can only have one controlling terminal. This specifies a certain relationship. For example, when the master side of the controlling terminal is closed, the pseudoterminal pair vanishes, and the descriptors open to the slave side of the pseudoterminal pair become nonfunctional (all operations yield EIO, if I recall correctly); but more than that, every process in the process group receives a HUP signal.
The ncurses newterm() function lets a process connect to an existing terminal or pseudoterminal, at run time. That terminal does not need to be the controlling terminal, nor does the ncurses-using process need to belong to that session. It is important to realize that in this case, the standard streams (standard input, output, and error) are not redirected to the terminal.
So, if there is a way to tell a daemon that it has a new pseudoterminal available, and should open that because there is a user that wants to use the interface the daemon provides, we can have the daemon open and close the pseudoterminals on demand!
Note, however, that this requires explicit co-operation between the daemon, and the processes that are used to connect to the ncurses-based UI the daemon provides. There is no standard way of doing this with arbitrary ncurses-based processes or daemons. For example, as far as I know, nano and top provide no such interface; they only use the pseudoterminal associated with the standard streams.
After posting this answer (hopefully fast enough, before the question is closed because others do not see its validity and its usefulness to other server-side POSIXy developers), I shall construct an example program pair to exemplify the above; probably using a Unix domain socket as the "new UI for this user, please" communications channel, as file descriptors can be passed as ancillary data using Unix domain sockets, and the identity of the user at either end of the socket can be verified (credentials ancillary data).
However, for now, let's go back to the questions asked.
What is wrong with the above pseudo code? [Typically I either get a segfault inside fileno_unlocked() in newterm() or output on the invoking terminal rather than a new invisible terminal.]
newinfd and newoutfd should be the same (or dup()s of) the pseudoterminal slave end file descriptor, slavefd.
I think there should also be an explicit set_term() with the SCREEN pointer returned by newterm() as a parameter. (It could be that it gets automatically called for the very first terminal provided by newterm(), but I'd rather call it explicitly.)
newterm() connects to and prepares a new terminal. The two descriptors usually both refer to the same slave side of a pseudoterminal pair; infd can be some other descriptor where the user keypresses are received from.
Only one terminal can be active in ncurses at a time. You need to use set_term() to select which one will be affected by following printw() etc. calls. (It returns the terminal that was previously active, so that one can do an update to another terminal and then return back to the original terminal.)
(This also means that if a program provides multiple terminals, it must cycle between them, checking for input, and update each terminal, at a relatively high frequency, so that human users feel the UI is responsive, and not "laggy". A crafty POSIX programmer can select or poll on the underlying descriptors, though, and only cycle through terminals that have input pending.)
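The "only cycle through terminals that have input pending" idea can be sketched with poll() on the descriptors underlying each terminal's input stream. This is an illustration only: streams_with_input() and MAX_STREAMS are hypothetical names, and it assumes nothing is sitting unread in the stdio buffers of those streams.

```c
#define _POSIX_C_SOURCE 200809L
#include <poll.h>
#include <stdio.h>

#define MAX_STREAMS 64  /* arbitrary limit for this sketch */

/* Sets ready[i] to 1 if streams[i] has input pending, 0 otherwise,
   waiting at most timeout_ms milliseconds for the first event.
   Returns the number of streams with input, or -1 on error. */
int streams_with_input(FILE *const streams[], size_t count,
                       int timeout_ms, int ready[])
{
    struct pollfd pfds[MAX_STREAMS];
    size_t i;
    int nready = 0;

    if (count > MAX_STREAMS)
        return -1;

    for (i = 0; i < count; i++) {
        pfds[i].fd = fileno(streams[i]);
        pfds[i].events = POLLIN;
        pfds[i].revents = 0;
    }

    if (poll(pfds, (nfds_t)count, timeout_ms) == -1)
        return -1;

    /* Count only readable input; hangups and errors are ignored here. */
    for (i = 0; i < count; i++) {
        ready[i] = (pfds[i].revents & POLLIN) ? 1 : 0;
        nready += ready[i];
    }
    return nready;
}
```

A main loop could call this with the frame duration as the timeout, then call set_term() and getch() only for the terminals whose ready[] entry is set, while still redrawing all of them once per frame.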
Do I have the master and slave ends the right way around?
Yes, I do believe you do. Slave end is the one that sees a terminal, and can use ncurses. Master end is the one that provides keypresses, and does something with the ncurses output (say, draws them to a text-based framebuffer, or proxies to a remote terminal).
What does login_tty actually do here?
There are two commonly used pseudoterminal interfaces: UNIX98 (which is standardized in POSIX), and BSD.
With the POSIX interface, posix_openpt() creates a new pseudoterminal pair, and returns the descriptor to its master side. Closing this descriptor (the last open duplicate) destroys the pair. In the POSIX model, initially the slave side is "locked", and unopenable. unlockpt() removes this lock, allowing the slave side to be opened. grantpt() updates the character device (corresponding to the slave side of the pseudoterminal pair) ownership and mode to match the current real user. unlockpt() and grantpt() can be called in either order, but it makes sense to call grantpt() first; that way the slave side cannot be opened "accidentally" by other processes, before its ownership and access mode have been set properly. POSIX provides the path to the character device corresponding to the slave side of the pseudoterminal pair via ptsname(), but Linux provides a TIOCGPTPEER ioctl (in kernels 4.13 and later) that allows opening the slave end even if the character device node is not shown in the current mount namespace.
Typically, grantpt(), unlockpt(), and opening the slave side of the pseudoterminal pair are done in a child process (that still has access to the master-side descriptor) that has started a new session using setsid(). The child process redirects standard streams (standard input, output, and error) to the slave side of the pseudoterminal, closes its copy of the master-side descriptor, and makes sure the pseudoterminal is its controlling terminal. Usually this is followed by executing the binary that will use the pseudoterminal (usually via ncurses) for its user interface.
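As a rough sketch of the POSIX call sequence only, the following helper creates a pair and opens both ends in the same process. It deliberately leaves out fork(), setsid(), and the standard stream redirection described above, and open_pty_pair() is a name invented for this example.

```c
#define _XOPEN_SOURCE 600
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Close fd without disturbing errno, and report failure. */
static int fail_closing(int fd)
{
    const int saved_errno = errno;
    close(fd);
    errno = saved_errno;
    return -1;
}

/* Create a pseudoterminal pair and open both sides.
   Returns 0 if successful, -1 with errno set otherwise. */
int open_pty_pair(int *masterfd, int *slavefd)
{
    const char *name;
    int master, slave;

    /* O_NOCTTY: do not let the pair become our controlling terminal. */
    master = posix_openpt(O_RDWR | O_NOCTTY);
    if (master == -1)
        return -1;

    /* Fix ownership and mode first, then unlock the slave side. */
    if (grantpt(master) == -1 || unlockpt(master) == -1)
        return fail_closing(master);

    /* Path to the slave character device, e.g. /dev/pts/N on Linux. */
    name = ptsname(master);
    if (!name)
        return fail_closing(master);

    slave = open(name, O_RDWR | O_NOCTTY);
    if (slave == -1)
        return fail_closing(master);

    *masterfd = master;
    *slavefd = slave;
    return 0;
}
```

Anything written to the slave descriptor can then be read from the master descriptor (after the terminal's output processing, so a newline typically arrives as carriage return plus newline), and anything written to the master appears as input on the slave.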
With the BSD interface, openpty() creates the pseudoterminal pair, providing open file descriptors to both sides, and optionally sets the pseudoterminal termios settings and window size. It roughly corresponds to POSIX posix_openpt() + grantpt() + unlockpt() + opening the slave side of the pseudoterminal pair + optionally setting the termios settings and terminal window size.
With the BSD interface, login_tty is run in the child process. It runs setsid() to create a new session, makes the slave side the controlling terminal, redirects standard streams to the slave side of the controlling terminal, and closes the copy of the master side descriptor.
With the BSD interface, forkpty() combines openpty(), fork(), and login_tty(). It returns twice; once in the parent (returning the PID of the child process), and once in the child (returning zero). The child is running in a new session, with the pseudoterminal slave side as its controlling terminal, already redirected to the standard streams.
Is there any practical difference between openpty() + login_tty() vs posix_openpt() + grantpt() [ + unlockpt() + opening the slave side]?
No, not really.
Both Linux and most BSDs tend to provide both. (In Linux, when using the BSD interface, you need to link in the libutil library (-lutil gcc option), but it is provided by the same package that provides the standard C library, and can be assumed to be always available.)
I tend to prefer the POSIX interface, even though it is lots more verbose, but other than kinda preferring POSIX interfaces over BSD ones, I don't even know why I prefer it over the BSD interface. The BSD forkpty() does basically everything for the most common use cases in one call!
Also, instead of relying on ptsname() (or the GNU ptsname_r() extension), I tend to first try the Linux-specific ioctl if it looks like it is available, and fall back to ptsname() if it is not available. So, if anything, I probably should prefer the BSD interface.. but the libutil kinda sorta annoys me a little bit, I guess, so I don't.
I definitely have no objection to others preferring the BSD interface. If anything, I'm a bit puzzled as to how my preference even exists; normally I prefer the simpler, more robust interfaces over the verbose, complex ones.
Does there have to be a running process associated with the slave or master tty at all times?
There has to be a process having the master side of the pseudoterminal open. When the last duplicate of the descriptor is closed, the kernel destroys the pair.
Also, if the process having the master side descriptor does not read from it, the process running in the pseudoterminal will unexpectedly block in some ncurses call. Normally, the calls do not block (or only block for very short durations, shorter than what humans notice). If the process just reads but discards the input, then we do not actually know the contents of the ncurses terminal!
So, we can say that having a process that reads from the pseudoterminal pair master side, keeping a descriptor open to the master side, is absolutely required.
(The slave side is different; because the character device node is usually visible, a process can close its connection to the pseudoterminal temporarily, and just reopen it later. In Linux, when no process has an open descriptor to the slave side, the process reading from or writing to the master side will get EIO errors (read() and write() returning -1 with errno==EIO). I'm not absolutely certain if this is guaranteed behaviour, though; haven't thus far ever relied on it, and only noticed it recently (when implementing an example) myself.)
Here is an example of an ncurses application that animates a bouncing X on each terminal supplied as a parameter:
// SPDX-License-Identifier: CC0-1.0
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <sys/ioctl.h>
#include <locale.h>
#include <curses.h>
#include <time.h>
#include <string.h>
#include <signal.h>
#include <stdio.h>
#include <errno.h>
#ifndef FRAMES_PER_SECOND
#define FRAMES_PER_SECOND 25
#endif
#define FRAME_DURATION (1.0 / (double)(FRAMES_PER_SECOND))
/* Because the terminals are not the controlling terminal for this process,
* this process may not receive the SIGWINCH signal whenever a screen size
* changes. Therefore, we call this function to update it whenever we switch
* between terminals.
*/
extern void _nc_update_screensize(SCREEN *);
/*
* Signal handler to notice if this program - all its terminals -- should exit.
*/
static volatile sig_atomic_t done = 0;
static void handle_done(int signum)
{
done = signum;
}
static int install_done(int signum)
{
struct sigaction act;
memset(&act, 0, sizeof act);
sigemptyset(&act.sa_mask);
act.sa_handler = handle_done;
act.sa_flags = 0;
return sigaction(signum, &act, NULL);
}
/* Difference in seconds between two timespec structures.
*/
static inline double difftimespec(const struct timespec after, const struct timespec before)
{
return (double)(after.tv_sec - before.tv_sec)
+ (double)(after.tv_nsec - before.tv_nsec) / 1000000000.0;
}
/* Sleep the specified number of seconds using nanosleep().
*/
static inline double nsleep(const double seconds)
{
if (seconds <= 0.0)
return 0.0;
const long sec = (long)seconds;
long nsec = (long)(1000000000.0 * (seconds - (double)sec));
if (nsec < 0)
nsec = 0;
if (nsec > 999999999)
nsec = 999999999;
if (sec == 0 && nsec < 1)
return 0.0;
struct timespec req = { .tv_sec = (time_t)sec, .tv_nsec = nsec };
struct timespec rem = { .tv_sec = 0, .tv_nsec = 0 };
if (nanosleep(&req, &rem) == -1 && errno == EINTR)
return (double)(rem.tv_sec) + (double)(rem.tv_nsec) / 1000000000.0;
return 0.0;
}
/*
* Structure describing each client (terminal) state.
*/
struct client {
SCREEN *term;
FILE *in;
FILE *out;
int col; /* Ball column */
int row; /* Ball row */
int dcol; /* Ball direction in column axis */
int drow; /* Ball direction in row axis */
};
static size_t clients_max = 0;
static size_t clients_num = 0;
static struct client *clients = NULL;
/* Add a new terminal, based on device path, and optionally terminal type.
*/
static int add_client(const char *ttypath, const char *term)
{
if (!ttypath || !*ttypath)
return errno = EINVAL;
if (clients_num >= clients_max) {
const size_t temps_max = (clients_num | 15) + 13;
struct client *temps;
temps = realloc(clients, temps_max * sizeof clients[0]);
if (!temps)
return errno = ENOMEM;
clients_max = temps_max;
clients = temps;
}
clients[clients_num].term = NULL;
clients[clients_num].in = NULL;
clients[clients_num].out = NULL;
clients[clients_num].col = 0;
clients[clients_num].row = 0;
clients[clients_num].dcol = +1;
clients[clients_num].drow = +1;
clients[clients_num].in = fopen(ttypath, "r+");
if (!clients[clients_num].in)
return errno;
clients[clients_num].out = fopen(ttypath, "r+");
if (!clients[clients_num].out) {
const int saved_errno = errno;
fclose(clients[clients_num].in);
return errno = saved_errno;
}
clients[clients_num].term = newterm(term, clients[clients_num].in,
clients[clients_num].out);
if (!clients[clients_num].term) {
fclose(clients[clients_num].out);
fclose(clients[clients_num].in);
return errno = ENOMEM;
}
set_term(clients[clients_num].term);
start_color();
cbreak();
noecho();
nodelay(stdscr, TRUE);
keypad(stdscr, TRUE);
scrollok(stdscr, FALSE);
curs_set(0);
clear();
refresh();
clients_num++;
return 0;
}
static void close_all_clients(void)
{
while (clients_num > 0) {
clients_num--;
if (clients[clients_num].term) {
set_term(clients[clients_num].term);
endwin();
delscreen(clients[clients_num].term);
clients[clients_num].term = NULL;
}
if (clients[clients_num].in) {
fclose(clients[clients_num].in);
clients[clients_num].in = NULL;
}
if (clients[clients_num].out) {
fclose(clients[clients_num].out);
clients[clients_num].out = NULL;
}
}
}
int main(int argc, char *argv[])
{
struct timespec curr, prev;
int arg;
if (argc < 2 || !strcmp(argv[1], "-h") || !strcmp(argv[1], "--help")) {
const char *arg0 = (argc > 0 && argv && argv[0] && argv[0][0]) ? argv[0] : "(this)";
fprintf(stderr, "\n");
fprintf(stderr, "Usage: %s [ -h | --help ]\n", arg0);
fprintf(stderr, " %s TERMINAL [ TERMINAL ... ]\n", arg0);
fprintf(stderr, "\n");
fprintf(stderr, "This program displays a bouncing ball animation in each terminal.\n");
fprintf(stderr, "Press Q or . in any terminal, or send this process an INT, HUP,\n");
fprintf(stderr, "QUIT, or TERM signal to quit.\n");
fprintf(stderr, "\n");
return EXIT_SUCCESS;
}
setlocale(LC_ALL, "");
for (arg = 1; arg < argc; arg++) {
if (add_client(argv[arg], NULL)) {
fprintf(stderr, "%s: %s.\n", argv[arg], strerror(errno));
close_all_clients();
return EXIT_FAILURE;
}
}
if (install_done(SIGINT) == -1 ||
install_done(SIGHUP) == -1 ||
install_done(SIGQUIT) == -1 ||
install_done(SIGTERM) == -1) {
fprintf(stderr, "Cannot install signal handlers: %s.\n", strerror(errno));
close_all_clients();
return EXIT_FAILURE;
}
clock_gettime(CLOCK_MONOTONIC, &curr);
while (!done && clients_num > 0) {
size_t n;
/* Wait until it is time for the next frame. */
prev = curr;
clock_gettime(CLOCK_MONOTONIC, &curr);
nsleep(FRAME_DURATION - difftimespec(curr, prev));
/* Update each terminal. */
n = 0;
while (n < clients_num) {
int close_this_terminal = 0;
int ch, rows, cols;
set_term(clients[n].term);
/* Because the terminal is not our controlling terminal,
we may miss SIGWINCH window size change signals.
To work around that, we explicitly check it here. */
_nc_update_screensize(clients[n].term);
/* Process inputs - if we get any */
while ((ch = getch()) != ERR)
if (ch == 'x' || ch == 'X' || ch == 'h' || ch == 'H')
clients[n].dcol = -clients[n].dcol;
else
if (ch == 'y' || ch == 'Y' || ch == 'v' || ch == 'V')
clients[n].drow = -clients[n].drow;
else
if (ch == '.' || ch == 'q' || ch == 'Q')
close_this_terminal = 1;
if (close_this_terminal) {
endwin();
delscreen(clients[n].term);
fclose(clients[n].in);
fclose(clients[n].out);
/* Remove from array. */
clients_num--;
clients[n] = clients[clients_num];
clients[clients_num].term = NULL;
clients[clients_num].in = NULL;
clients[clients_num].out = NULL;
continue;
}
/* Obtain current terminal size. */
getmaxyx(stdscr, rows, cols);
/* Leave a trace of dots. */
if (clients[n].row >= 0 && clients[n].row < rows &&
clients[n].col >= 0 && clients[n].col < cols)
mvaddch(clients[n].row, clients[n].col, '.');
/* Top edge bounce. */
if (clients[n].row <= 0) {
clients[n].row = 0;
clients[n].drow = +1;
}
/* Left edge bounce. */
if (clients[n].col <= 0) {
clients[n].col = 0;
clients[n].dcol = +1;
}
/* Bottom edge bounce. */
if (clients[n].row >= rows - 1) {
clients[n].row = rows - 1;
clients[n].drow = -1;
}
/* Right edge bounce. */
if (clients[n].col >= cols - 1) {
clients[n].col = cols - 1;
clients[n].dcol = -1;
}
clients[n].row += clients[n].drow;
clients[n].col += clients[n].dcol;
mvaddch(clients[n].row, clients[n].col, 'X');
refresh();
/* Next terminal. */
n++;
}
}
close_all_clients();
return EXIT_SUCCESS;
}
This contains no pseudoterminals, and the only real quirk is the use of _nc_update_screensize() to detect if any of the terminals have changed size. (Because they are not our controlling terminal, we do not receive the SIGWINCH signal, and thus ncurses misses the window size change.)
I recommend compiling this with gcc -Wall -Wextra -O2 bounce.c -lncurses -o bounce.
Open a couple of terminal windows, and run tty to see the path to their controlling terminals (usually slave ends of pseudoterminals, /dev/pts/N).
Run ./bounce with one or more of those paths as parameters, and let the bouncing commence.
If you don't want the shell in a window to consume your input, and want the above program to see it, run e.g. sleep 6000 in the terminal windows before running the above command.
This program simply opens two streams to each terminal, and lets ncurses take control of them; basically, it is an example of a multi-terminal ncurses application, and how to juggle them using newterm(), set_term() and so on.
If you supply the same terminal more than once, pressing Q closes them in a random order, so ncurses may not revert the terminal to the original state correctly. (You may need to type reset blindly, to reset the terminal to a workable state; it's a companion command to clear, which just clears the terminal. They don't do anything else, just terminal stuff.)
Instead of providing the paths to the terminal devices as a command-line parameter, the program could just as well run all the time, but listen for incoming Unix domain datagrams, with SOL_SOCKET-level SCM_RIGHTS-type ancillary data, that can be used to duplicate file descriptors between unrelated processes.
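A minimal sketch of that descriptor passing, under the assumption that each message carries exactly one descriptor and one dummy payload byte. send_fd() and recv_fd() are illustrative names, not part of any library.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Send descriptor fd over Unix domain socket sock. Returns 0 if successful. */
int send_fd(int sock, int fd)
{
    char byte = 0;  /* at least one byte of real data must accompany it */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {
        struct cmsghdr align;
        char buf[CMSG_SPACE(sizeof (int))];
    } control;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    memset(&control, 0, sizeof control);
    memset(&msg, 0, sizeof msg);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = control.buf;
    msg.msg_controllen = sizeof control.buf;

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof (int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof (int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one descriptor from sock. Returns the new descriptor, or -1. */
int recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {
        struct cmsghdr align;
        char buf[CMSG_SPACE(sizeof (int))];
    } control;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    int fd;

    memset(&msg, 0, sizeof msg);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = control.buf;
    msg.msg_controllen = sizeof control.buf;

    if (recvmsg(sock, &msg, 0) != 1)
        return -1;

    cmsg = CMSG_FIRSTHDR(&msg);
    if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof (int)))
        return -1;

    memcpy(&fd, CMSG_DATA(cmsg), sizeof (int));
    return fd;
}
```

The receiver gets its own descriptor number referring to the same open file description, so a client could open its terminal, send that descriptor, and the long-running program could wrap it with fdopen() before handing it to newterm().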
However, if one relinquishes the control of a terminal like that (either by opening the terminal, or by passing the terminal file descriptor to another process), the problem is that it is impossible to revoke that access. We can avoid that by using a pseudoterminal in between, and proxying the data between the pseudoterminal and our real terminal. To break the connection, we simply stop proxying data and destroy the pseudoterminal pair, and revert our terminal to its initial state.
Examining the above program, we see that the procedure in pseudocode for taking control of a new terminal is
Obtain two FILE stream handles to the terminal.
The above program uses fopen() to open them like normal. Other programs can use dup() to duplicate a single descriptor, and fdopen() to convert them to stdio FILE stream handles.
Call SCREEN *term = newterm(NULL, in, out) to let ncurses know about this new terminal.
in and out are the two FILE stream handles. The first parameter is the terminal type string; if NULL, the TERM environment variable is used instead. A typical value today is xterm-256color, but ncurses supports a lot of other types of terminals also.
Call set_term(term) to make the new terminal the currently active one.
At this point, we can do normal ncurses setup stuff, like cbreak(); noecho(); and so on.
Relinquishing control of a terminal is also simple:
Call set_term(term) to make that terminal the currently active one.
Call endwin() and delscreen(term).
Close the two FILE streams to the terminal.
Updating terminal contents requires a loop, with each iteration handling one terminal, starting with a set_term(term) call (followed by the _nc_update_screensize(term) call, if we wish to react to window size changes in those terminals).
The above example program uses nodelay() mode, so that getch() will return either a keypress, or ERR if there is no pending input from the current terminal. (At least in Linux, we will get KEY_RESIZE whenever the window size changes, as long as either the terminal is our controlling terminal, or we call _nc_update_screensize().)
But note: if there are other processes also reading from that terminal, say a shell, the input could be read by any of the processes.
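The descriptor passing mentioned earlier (Unix domain sockets with SCM_RIGHTS ancillary data) is not part of the bounce program, but it can be sketched roughly as follows. The helper names send_fd() and recv_fd() are made up for this illustration, and sending a single dummy data byte along with the descriptor is simply the portable choice, since some systems and socket types do not deliver ancillary data without any real data.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Hypothetical helper: send descriptor fd over the Unix domain socket sock.
   Returns 0 on success, -1 on failure. */
static int send_fd(int sock, int fd)
{
    char dummy = 'F';   /* one byte of ordinary data to carry the ancillary data */
    struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
    union {
        struct cmsghdr align;                   /* forces suitable alignment */
        char buf[CMSG_SPACE(sizeof (int))];
    } ctrl;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    memset(&ctrl, 0, sizeof ctrl);
    memset(&msg, 0, sizeof msg);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof ctrl.buf;

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof (int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof (int));

    return (sendmsg(sock, &msg, 0) == 1) ? 0 : -1;
}

/* Hypothetical helper: receive one descriptor from sock.
   Returns the new descriptor (a duplicate owned by the caller), or -1. */
static int recv_fd(int sock)
{
    char dummy;
    struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
    union {
        struct cmsghdr align;
        char buf[CMSG_SPACE(sizeof (int))];
    } ctrl;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    int fd = -1;

    memset(&ctrl, 0, sizeof ctrl);
    memset(&msg, 0, sizeof msg);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof ctrl.buf;

    if (recvmsg(sock, &msg, 0) != 1)
        return -1;

    cmsg = CMSG_FIRSTHDR(&msg);
    if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof (int)))
        return -1;      /* no descriptor was attached to this message */

    memcpy(&fd, CMSG_DATA(cmsg), sizeof (int));
    return fd;
}
```

The receiver would then dup() the received descriptor if it needs two, wrap them with fdopen(), and hand the streams to newterm() as described above.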

copy_to_user returns an error in a char device read function

I've implemented a char device for my kernel module and implemented a read function for it. The read function calls copy_to_user to return data to the caller. I originally implemented the read function in a blocking manner (with wait_event_interruptible), but the problem reproduces even when I implement read in a non-blocking manner. My code is running on a MIPS processor.
The user space program opens the char device and reads into a buffer allocated on the stack.
What I've found is that occasionally copy_to_user will fail to copy any bytes. Moreover, even if I replace copy_to_user with a call to memcpy (only for the purposes of checking... I know this isn't the right thing to do), and print out the destination buffer immediately afterwards, I see that memcpy has failed to copy any bytes.
I'm not really sure how to further debug this - how can I determine why memory is not being copied? Is it possible that the process context is wrong?
EDIT: Here's some pseudo-code outlining what the code currently looks like:
User mode (runs repeatedly):
char buf[BUF_LEN];
FILE *f = fopen(char_device_file, "rb");
fread(buf, 1, BUF_LEN, f);
fclose(f);
Kernel mode:
char_device =
create_char_device(char_device_name,
NULL,
read_func,
NULL,
NULL);
int read_func(char *output_buffer, int output_buffer_length, loff_t *offset)
{
int rc;
if (*offset == 0)
{
spin_lock_irqsave(&lock, flags);
while (get_available_bytes_to_read() == 0)
{
spin_unlock_irqrestore(&lock, flags);
if (wait_event_interruptible(self->wait_queue, get_available_bytes_to_read() != 0))
{
// Got a signal; retry the read
return -ERESTARTSYS;
}
spin_lock_irqsave(&lock, flags);
}
rc = copy_to_user(output_buffer, internal_buffer, bytes_to_copy);
spin_unlock_irqrestore(&lock, flags);
}
else rc = 0;
return rc;
}
It took quite a bit of debugging, but in the end Tsyvarev's hint (the comment about not calling copy_to_user with a spinlock taken) seems to have pointed to the cause.
Our process had a background thread which occasionally launched a new process (fork + exec). When we disabled this thread, everything worked well. The best theory we have is that the fork made all of our memory pages copy-on-write, so when we tried to copy to them, the kernel had to do some work which could not be done with the spinlock taken. Hopefully it at least makes some sense (although I'd have guessed that this would apply only to the child process, and the parent's process pages would simply remain writable, but who knows...).
We rewrote our code to be lockless and the problem disappeared.
Now we just need to verify that our lockless code is indeed safe on different architectures. Easy as pie.

Atomically open and lock file

I have a file foo.hex that is accessed by two processes. One process has O_RDONLY access and the other has O_RDWR access.
When starting the system for the very first time, the reading process should not access the file before the writing process has initialized it.
Thus, I wrote something like this to initialize the file.
fd = open("foo.hex", O_RDWR|O_CREAT, 0666);
flock(fd, LOCK_EX);
init_structures(fd);
flock(fd, LOCK_UN);
This still leaves a window in which the reader process can access the file before it is initialized.
I couldn't find a way to open() and flock() in an atomic fashion. Besides mutexes what other possibilities are there to achieve my goal in an elegant way with as little overhead as possible (since it's only used once, the very first time the system is started)?
Make the writer create a file called "foo.hex.init" instead, and initialize that before renaming it to "foo.hex". This way, the reader can never see the uninitialized file contents.
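A minimal sketch of that idea, under some assumptions: the function name create_initialized() is made up for this example, a plain write() of a caller-supplied buffer stands in for init_structures(fd), and both paths must be on the same filesystem for rename() to be atomic.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create `path` with its contents already in place,
   by writing to `tmp_path` first and renaming it afterwards.
   Returns 0 on success, -1 on failure (errno may be changed by the cleanup). */
static int create_initialized(const char *tmp_path, const char *path,
                              const void *data, size_t len)
{
    int fd = open(tmp_path, O_WRONLY | O_CREAT | O_TRUNC, 0666);
    if (fd == -1)
        return -1;

    /* Stand-in for init_structures(fd). The fsync() asks the kernel to
       flush the data to storage before the file becomes visible. */
    if (write(fd, data, len) != (ssize_t)len || fsync(fd) == -1) {
        close(fd);
        unlink(tmp_path);
        return -1;
    }
    if (close(fd) == -1) {
        unlink(tmp_path);
        return -1;
    }

    /* rename() replaces `path` atomically: a reader opening `path` sees
       either the old file (or no file), or the fully written new one. */
    if (rename(tmp_path, path) == -1) {
        unlink(tmp_path);
        return -1;
    }
    return 0;
}
```

The writer would call something like create_initialized("foo.hex.init", "foo.hex", buffer, size); the reader only ever opens "foo.hex".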
Another approach could be to remove the existing file, recreate it without permissions for any process to access it, then change the file permissions after it's written:
unlink("foo.hex");
fd = open("foo.hex", O_RDWR|O_CREAT|O_EXCL, 0);
init_structures(fd);
fchmod(fd, 0666);
That likely won't work if you're running as root. (Which you shouldn't be doing anyway...)
This would prevent any process from using old data once the unlink() call is made. Depending on your requirements, that may or may not be worth the extra reader code necessary to deal with the file not existing or being inaccessible while the new file is being initialized.
Personally, I'd use the rename( "foo.hex.init", "foo.hex" ) solution unless init_structures() takes significant time, and there's a real, hard requirement to not use old data once new data is available. But sometimes important people aren't comfortable with using old data while any portion of new data is available, and they don't really understand, "If the reader process started two milliseconds earlier it would use the old data anyway".
An alternative approach is for the reader process to sleep a little and retry upon finding that the file doesn't yet exist, or is empty.
int open_for_read(const char *fname)
{
int retries = 0;
for (;;) {
int fd = open(fname, O_RDONLY);
if (fd == -1) {
if (errno != ENOENT) return -1;
goto retry;
}
if (flock(fd, LOCK_SH)) {
close(fd);
return -1;
}
struct stat st;
if (fstat(fd, &st)) {
close(fd);
return -1;
}
if (st.st_size == 0) {
close(fd);
goto retry;
}
return fd;
retry:
if (++retries > MAX_RETRIES) return -1;
sleep(1);
}
/* not reached */
}
You need similar code on the write side, so that if the writer loses the race it doesn't have to be restarted.
There are many ways of inter-process communications.
Perhaps use a named semaphore that the writing process locks before opening and initializing the file? Then the reading process could attempt to lock the semaphore as well, and if it succeeds and the file doesn't exist, it unlocks the semaphore, waits a little while, and retries.
The simplest way though, especially if the file will be recreated by the writing process every time, is already in the answer by John Zwinck.

Write atomically to a file using Write() with snprintf()

I want to be able to write atomically to a file. I am trying to use the write() function, since it seems to guarantee atomic writes on most Linux/Unix systems.
Since I have variable string lengths and multiple printf's, I was told to use snprintf() and pass the result to write() in order to do this properly. After reading the documentation of this function, I did a test implementation as below:
int file = open("file.txt", O_CREAT | O_WRONLY);
if(file < 0)
perror("Error:");
char buf[200] = "";
int numbytes = snprintf(buf, sizeof(buf), "Example string %s", stringvariable);
write(file, buf, numbytes);
From my tests it seems to have worked but my question is if this is the most correct way to implement it since I am creating a rather large buffer (something I am 100% sure will fit all my printfs) to store it before passing to write.
No, write() is not atomic, not even when it writes all of the data supplied in a single call.
Use advisory record locking (fcntl(fd, F_SETLKW, &lock)) in all readers and writers to achieve atomic file updates.
fcntl()-based record locks work over NFS on both Linux and BSDs; flock()-based file locks may not, depending on system and kernel version. (If NFS locking is disabled like it is on some web hosting services, no locking will be reliable.) Just initialize the struct flock with .l_whence = SEEK_SET, .l_start = 0, .l_len = 0 to refer to the entire file.
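As a rough sketch of that initialization, assuming a helper named lock_whole_file() (a name invented here) and assuming you want the call to be retried when a signal interrupts the wait:

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: apply an advisory record lock covering the entire
   file. `type` is F_RDLCK (shared), F_WRLCK (exclusive), or F_UNLCK.
   Blocks until the lock is granted. Returns 0 on success, -1 on error. */
static int lock_whole_file(int fd, short type)
{
    struct flock lock;
    int rc;

    memset(&lock, 0, sizeof lock);
    lock.l_type = type;
    lock.l_whence = SEEK_SET;
    lock.l_start = 0;
    lock.l_len = 0;     /* 0 means "up to end of file, however far it grows" */

    /* F_SETLKW can be interrupted by a signal handler; this sketch retries. */
    do {
        rc = fcntl(fd, F_SETLKW, &lock);
    } while (rc == -1 && errno == EINTR);

    return rc;
}
```

A writer would call lock_whole_file(fd, F_WRLCK) before its write() calls and lock_whole_file(fd, F_UNLCK) afterwards; readers would use F_RDLCK. Keep in mind that these locks are per process and are dropped when the process closes any descriptor referring to the file.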
Use asprintf() to print to a dynamically allocated buffer:
char *buffer = NULL;
int length;
length = asprintf(&buffer, ...);
if (length == -1) {
/* Out of memory */
}
/* ... Have buffer and length ... */
free(buffer);
After adding the locking, do wrap your write() in a loop:
{
const char *p = (const char *)buffer;
const char *const q = (const char *)buffer + length;
ssize_t n;
while (p < q) {
n = write(fd, p, (size_t)(q - p));
if (n > 0)
p += n;
else
if (n != -1) {
/* Write error / kernel bug! */
} else
if (errno != EINTR) {
/* Error! Details in errno */
}
}
}
Although there are some local filesystems that guarantee write() does not return a short count unless you run out of storage space, not all do; especially not the networked ones. Using a loop like above lets your program work even on such filesystems. It's not too much code to add for reliable and robust operation, in my opinion.
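For reference, here is one way the loop could be packaged as a function with concrete error handling. The name write_all() is made up, and reporting an unexpected zero-byte result as EIO is just a choice made for this sketch.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write all `len` bytes of `data` to `fd`, continuing
   after short writes and retrying when interrupted by a signal.
   Returns 0 on success, -1 with errno set on failure. */
static int write_all(int fd, const void *data, size_t len)
{
    const char *p = (const char *)data;
    const char *const q = p + len;

    while (p < q) {
        ssize_t n = write(fd, p, (size_t)(q - p));
        if (n > 0) {
            p += n;             /* short write: continue with the rest */
        } else if (n != -1) {
            errno = EIO;        /* write() returned 0 for a nonzero request */
            return -1;
        } else if (errno != EINTR) {
            return -1;          /* real error; details in errno */
        }
        /* n == -1 with errno == EINTR: nothing was written, just retry */
    }
    return 0;
}
```

With the buffer from asprintf() above, the call would be write_all(fd, buffer, (size_t)length), made while holding the record lock.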
In Linux, you can take a write lease on a file to exclude any other process opening that file for a while.
Essentially, you cannot block a file open, but you can delay it for up to /proc/sys/fs/lease-break-time seconds, typically 45 seconds. The lease is granted only when no other process has the file open, and if any other process tries to open the file, the lease owner gets a signal. (If the lease owner does not release the lease, for example by closing the file, the kernel will automagically break the lease after the lease-break-time is up.)
Unfortunately, these only work in Linux, and only on local files, so they are of limited use.
If readers do not keep the file open, but open, read, and close it every time they read it, you can write a full replacement file (must be on the same filesystem; I recommend using a lock-subdirectory for this), and rename() it over the old file.
All readers will see either the old file or the new file, but those that keep their file open will never see any changes.

locking of copy_[to/from]_user() in linux kernel

As stated in http://www.kernel.org/doc/htmldocs/kernel-hacking.html#routines-copy, these functions "can" sleep.
So, do I always have to take a lock (e.g. a mutex) when using these functions, or are there exceptions?
I'm currently working on a module and saw some kernel oopses on my system, but cannot reproduce them. I have a feeling they are triggered because I currently do no locking around copy_[to/from]_user(). Maybe I'm wrong, but it smells like it has something to do with it.
I have something like:
static unsigned char user_buffer[BUFFER_SIZE];
static ssize_t mcom_write (struct file *file, const char *buf, size_t length, loff_t *offset) {
ssize_t retval;
size_t writeCount = (length < BUFFER_SIZE) ? length : BUFFER_SIZE;
memset((void*)&user_buffer, 0x00, sizeof user_buffer);
if (copy_from_user((void*)&user_buffer, buf, writeCount)) {
retval = -EFAULT;
return retval;
}
*offset += writeCount;
retval = writeCount;
cleanupNewline(user_buffer);
dispatch(user_buffer);
return retval;
}
Is this safe to do, or do I need to lock it against other accesses while copy_from_user is running?
It's a char device I read and write from, and if a special packet in the network is received, there can be concurrent access to this buffer.
You need to do locking iff the kernel side data structure that you are copying to or from might go away otherwise - but it is that data structure you should be taking a lock on.
I am guessing your function mcom_write is a procfs write function (or similar) right? In that case, you most likely are writing to the procfs file, your program being blocked until mcom_write returns, so even if copy_[to/from]_user sleeps, your program wouldn't change the buffer.
You haven't stated how your program works, so it is hard to say anything. If your program is multithreaded and one thread writes while another can change its data, then yes, you need locking, but between the threads of the user-space program, not in your kernel module.
If you have one thread writing, then your write to the procfs file would be blocked until mcom_write finishes, so no locking is needed and your problem is somewhere else (unless there is something else that is wrong with this function, but it's not with copy_from_user).

Resources