Different execution flow using read() and fgets() in C

I have a sample program that takes in an input from the terminal and executes it in a cloned child in a subshell.
#define _GNU_SOURCE
#include <stdlib.h>
#include <sys/wait.h>
#include <sched.h>
#include <unistd.h>
#include <string.h>
#include <signal.h>
int clone_function(void *arg) {
    execl("/bin/sh", "sh", "-c", (char *)arg, (char *)NULL);
}

int main() {
    while (1) {
        char data[512] = {'\0'};
        int n = read(0, data, sizeof(data));
        // fgets(data, 512, stdin);
        // int n = strlen(data);
        if ((strcmp(data, "exit\n") != 0) && n > 1) {
            char *line;
            char *lines = strdup(data);
            while ((line = strsep(&lines, "\n")) != NULL && strcmp(line, "") != 0) {
                void *clone_process_stack = malloc(8192);
                void *stack_top = clone_process_stack + 8192;
                int clone_flags = CLONE_VFORK | CLONE_FS;
                clone(clone_function, stack_top, clone_flags | SIGCHLD, (void *)line);
                int status;
                wait(&status);
                free(clone_process_stack);
            }
        } else {
            exit(0);
        }
    }
    return 0;
}
The above code works on an older Linux system (with minimal RAM) but not on a newer one. By "not working" I mean that if I type a simple command like "ls" I don't see its output on the console, whereas on the older system I do.
Also, if I run the same code under gdb, I see the output printed to the console on the newer system as well.
In addition, if I use fgets() instead of read(), it works as expected on both systems without an issue.
I have been trying to understand the behavior and I couldn't figure it out. I tried running strace. The difference I see is that the wait() return has the output of the ls program in the cases where it works, and nothing in the cases where it does not.
The only thing I can think of is that read(), since it's a system call rather than a library function, behaves differently across systems. But I can't see how that would affect the output.
Can someone point out why I might be observing this behavior?
EDIT
The code is compiled as:
gcc test.c -o test
strace when it's not working as expected is shown below
strace when it's working as expected (only difference is I added a printf("%d\n", n); following the call for read())
Thank you
Shabir

There are multiple problems in your code:
a successful read() system call can return any byte count between 1 and the buffer size, depending on the type of handle and the available input (and 0 at end of file). It does not stop at newlines like fgets(), so you might get line fragments, multiple lines, or multiple lines plus a line fragment.
furthermore, if read fills the whole buffer, as it might when reading from a regular file, there is no trailing null terminator, so passing the buffer to string functions has undefined behavior.
the test if ((strcmp(data, "exit\n") != 0) && n > 1) is performed in the wrong order: first test whether read() succeeded, and only then test the buffer contents.
you do not set the null terminator after the last byte read by read(), relying instead on buffer initialization, which is wasteful and insufficient if read() fills the whole buffer. Instead you should make data one byte longer than the read size argument, and set data[n] = '\0'; if n > 0, as in the sketch below.
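A minimal sketch of that last fix, reusing the names from the question's code:
char data[513];                          /* one byte longer than the read size */
int n = read(0, data, sizeof(data) - 1);
if (n > 0)
    data[n] = '\0';                      /* terminate right after the last byte read */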
Here are ways to fix the code:
using fgets(), you can remove the line splitting code: just remove initial and trailing white space, ignore empty and comment lines, clone and execute the commands.
using read(), you could just read one byte at a time, collecting the bytes into the buffer until you have a complete line, then null terminate the buffer and use the same rudimentary parser as above (see the sketch after this paragraph). This approach mimics fgets(), bypassing the buffering performed by the standard streams: it is quite inefficient, but it avoids reading from handle 0 past the end of the line, thus leaving pending input available for the child process to read.
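A minimal sketch of that byte-at-a-time approach, written as a hypothetical read_line() helper (the name and signature are illustrative, not part of the original code):
#include <unistd.h>

/* Read one line from fd into buf (at most size - 1 bytes), null terminated.
   Returns the number of bytes stored, 0 on end of file, -1 on error. */
ssize_t read_line(int fd, char *buf, size_t size) {
    size_t i = 0;
    while (i < size - 1) {
        char c;
        ssize_t n = read(fd, &c, 1);   /* one byte per system call */
        if (n < 0)
            return -1;                 /* read error */
        if (n == 0)
            break;                     /* end of file */
        buf[i++] = c;
        if (c == '\n')
            break;                     /* complete line collected */
    }
    buf[i] = '\0';
    return (ssize_t)i;
}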

It looks like 8192 is simply too small a value for the stack size on a modern system. execl() needs more than that, so you are hitting a stack overflow. Increase the value to 32768 or so and everything should start working again.
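In the question's code that is a two-line change (32768 being the value suggested above, not a measured minimum):
void *clone_process_stack = malloc(32768);
void *stack_top = clone_process_stack + 32768;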

Related

How is the speed of printf affected by a presence of a forked process and '\n'?

I had this simple shell-like program that works both in interactive and non-interactive mode. I have simplified the code as much as I can to present my question, but it is still a bit long, so sorry for that!
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
/**
 * main - entry point for gbk
 * Return: 0 on success
 */
int main(void)
{
    char *cmd = malloc(1 * sizeof(char)), *cmdargs[2];
    size_t cmdlen = 0;
    int childid, len;
    struct stat cmdinfo;

    while (1)
    {
        printf("#cisfun$ ");
        len = getline(&cmd, &cmdlen, stdin);
        if (len == -1)
        {
            free(cmd);
            exit(-1);
        }
        /* replace the ending newline with \0 */
        cmd[len - 1] = '\0';
        cmdargs[0] = cmd;
        cmdargs[1] = NULL;
        childid = fork();
        if (childid == 0)
        {
            if (stat(*cmdargs, &cmdinfo) == 0 && cmdinfo.st_mode & S_IXUSR)
                execve(cmdargs[0], cmdargs, NULL);
            else
                printf("%s: command not found\n", *cmdargs);
            exit(0);
        }
        else
            wait(NULL);
    }
    free(cmd);
    exit(EXIT_SUCCESS);
}
To summarize what this program does: it first prints the prompt #cisfun$, waits for input in interactive mode (or takes the piped value in non-interactive mode), and creates a child process. The child process checks whether the string passed is a valid executable binary; if it is, it executes it, otherwise it prints a "command not found" message and prompts again.
I have got this program to work fine for most of the scenarios in interactive mode, but when I run it in non-interactive mode all sorts of crazy (unexpected) things start to happen.
For example, when I run echo "/bin/ls"|./a.out, (a.out is the name of the compiled program)
you would first expect the #cisfun$ message to be printed, since that is the first thing performed in the while loop, then the output of the /bin/ls command, and finally the #cisfun$ prompt again. But that isn't what actually happens. Here is what happens:
It is very weird: the ls command runs even before the first prompt message. At first I thought there was some threading going on and printf was slower than the child process executing the ls command, but I am not sure that is true, as I am a noob. Things also get a bit crazier if I print a message with '\n' at the end rather than a bare string (if I change printf("#cisfun$ "); to printf("#cisfun$\n");), the following happens:
It works as it should. So it got me thinking: what is the relation between '\n', fork, and the speed of printf? In short, what is the explanation for this?
The second question I have is: why doesn't my program execute the first command and then go into interactive mode? I don't understand why it terminates after printing the second #cisfun$ message. By checking the status code (255) after exit, I realized the effect is the same as pressing Ctrl+D in interactive mode, which I believe makes getline exit. But I don't understand why an EOF is being seen at the second prompt.

How to use write() or fwrite() for writing data to terminal (stdout)?

I am trying to speed up my C program to spit out data faster.
Currently I am using printf() to give some data to the outside world. It is a continuous stream of data, therefore I am unable to use return(data).
How can I use write() or fwrite() to give the data out to the console instead of file?
Overall my setup consists of a program written in C whose output goes to a Python script, where the data is processed further. I form a pipe:
./program_in_c | script_in_python
This gives an additional benefit on the Raspberry Pi by using more of the processor's cores.
#include <unistd.h>
ssize_t write(int fd, const void *buf, size_t count);
write() writes up to count bytes from the buffer starting at buf to
the file referred to by the file descriptor fd.
The standard output file descriptor is 1 (on Linux, at least).
Consider flushing the stdout buffer before calling the write() system call, to ensure that any output previously buffered by stdio goes out first:
fflush(stdout); // Will now print everything in the stdout buffer
write(1, buf, count);
using fwrite:
size_t fwrite(const void *ptr, size_t size, size_t nmemb, FILE *stream);
The function fwrite() writes nmemb items of data, each size bytes
long, to the stream pointed to by stream, obtaining them from the
location given by ptr.
fflush(stdout);
int buf[8];
fwrite(buf, sizeof(int), sizeof(buf) / sizeof(buf[0]), stdout);
Please refer to the man pages for further reading, in the links below:
fwrite
write
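One caveat worth a sketch: write() may report writing fewer bytes than requested, so careful code loops until everything is out. The write_all() helper below is a hypothetical example, not taken from the man pages:
#include <unistd.h>

/* Write exactly count bytes from buf to fd, retrying on short writes.
   Returns 0 on success, -1 on error. */
int write_all(int fd, const void *buf, size_t count) {
    const char *p = buf;
    while (count > 0) {
        ssize_t n = write(fd, p, count);
        if (n < 0)
            return -1;          /* write error */
        p += n;                 /* skip the bytes already written */
        count -= (size_t)n;
    }
    return 0;
}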
Well, there's little or no win in trying to outdo the buffering system already used by the stdio.h package. If you try to use fwrite() with larger buffers, you'll probably gain no time and use more memory than necessary, as stdio.h selects the best buffer size appropriate to the filesystem where the data is to be written.
A simple program like the following will show that speed is of no concern, as stdio is already buffering output.
#include <stdio.h>

int
main()
{
    int c;
    while ((c = getchar()) >= 0)
        putchar(c);
}
If you try the above and below programs:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int
main()
{
    char buffer[512];
    int n;
    while ((n = read(0, buffer, sizeof buffer)) > 0)
        write(1, buffer, n);
    if (n < 0) {
        perror("read");
        return EXIT_FAILURE;
    }
    return EXIT_SUCCESS;
}
You will see that there's no significant difference, or even that the first program is faster, despite doing I/O on a per-character basis (as B. Kernighan & Dennis Ritchie wrote it in the first edition of "The C Programming Language"). Most probably the first program will win.
The calls to read() and write() involve a system call each, with a buffer size decided by you. The individual getchar() and putchar() calls don't: they just store the characters in a memory buffer whose size is decided by the stdio.h library implementation, based on the filesystem, and flush the buffer once it is full of data. If you grow the buffer size in the second program, you'll get better results up to a point, but past that you'll see no further increase in speed. The number of calls made to the library is insignificant compared to the time spent doing the actual I/O, and selecting a very large buffer eats memory from your system (and a Raspberry Pi is limited in this sense, to 1 GB of RAM). If you end up swapping because of an overly large buffer, you'll lose the battle completely.
Most filesystems have a preferred buffer size, because the kernel does read-ahead (on sequential reads the kernel reads more than you asked for, anticipating that you'll continue reading after you have consumed the data), and this affects the optimum buffer size. For that, the stat(2) system call tells you what the optimum buffer size is, and stdio uses that when it selects the actual buffer size, as in the sketch below.
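As a sketch, that preferred size can be read from the st_blksize field that fstat(2) fills in (a small standalone example; error handling kept minimal):
#include <stdio.h>
#include <sys/stat.h>

int main(void) {
    struct stat st;
    if (fstat(1, &st) == 0)    /* 1 is the standard output descriptor */
        printf("preferred I/O block size: %ld bytes\n", (long)st.st_blksize);
    return 0;
}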
Don't expect to do better (or much better) than the first program listed above, even if you use large enough buffers.
What is not correct (or valid) is to intermix calls that buffer (like the whole stdio package) with basic system calls (like read(2) or write(2)), which do not buffer data. I've seen recommendations to call fflush(3) after write(2), which is totally incoherent. There's nothing to gain, and you'll probably get your output incorrectly ordered if you do part of the calls with printf(3) and part with write(2). This happens especially in pipelines like the one you plan, because there the buffers are not line oriented (another characteristic of buffered output in stdio).
Finally, I recommend that you read "The Unix Programming Environment" by Brian Kernighan and Rob Pike. It will teach you a lot about Unix, and one very good thing is that it will teach you to use the stdio package and the Unix filesystem calls for reading and writing properly. With a little luck you'll find it as a .pdf on the internet.
The next program shows you the effect of buffering:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int
main()
{
    int i;
    char *sep = "";
    for (i = 0; i < 10; i++) {
        printf("%s%d", sep, i);
        sep = ", ";
        sleep(1);
    }
    printf("\n");
}
One would assume you are going to see the program writing the numbers 0 to 9 on the terminal, separated by ", " and paced at one-second intervals.
But due to the buffering, what you observe is quite different: you'll see your program wait for 10 seconds without writing anything at all on the terminal, and at the end write everything in one shot, including the final line end, when the program terminates and the shell shows you the prompt again.
If you change the program to this:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int
main()
{
    int i;
    char *sep = "";
    for (i = 0; i < 10; i++) {
        printf("%s%d", sep, i);
        fflush(stdout);
        sep = ", ";
        sleep(1);
    }
    printf("\n");
}
You'll see the expected output, because you have told stdio to flush the buffer at each loop pass. In both programs you made 10 calls to printf(3), but in the first there was only one write(2), at the end, to write the full buffer. In the second version you forced stdio to do one such write(2) after each printf(), and that showed the data as the program went through the loop.
Be careful, because another characteristic of stdio can confuse you: when you print to a terminal device, printf(3) flushes the output at each \n, but when you run the program through a pipe, it flushes only when the buffer fills up completely. This saves system calls. (In FreeBSD, for example, the buffer size selected by stdio is around 32 KB; that is optimum, and you'll not do better going above that size.)
The console output in C works almost the same way as a file. Once you have included stdio.h, you can write to the console output, named stdout (for "standard output"). In the end, the following statement:
printf("hello world!\n");
is the same as:
char str[] = "hello world\n";
fwrite(str, sizeof(char), sizeof(str) - 1, stdout);
fflush(stdout);

Process returned -1 (0xFFFFFFFF)

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>

int main() {
    printf("Transactional Shell Command Test.\n");
    while (1) {
        printf("Queue:");
        char input[500];
        fgets(input, 500, stdin);
        if (strstr(input, "qb-write")) {
            printf("These are the commands you have queued:\n");
            FILE *cmd = popen("cat /home/$USER/.queueBASH_transactions", "r");
            char buf[256];
            while (fgets(buf, sizeof(buf), cmd) != 0) {
                printf("%s\n", buf);
            }
            pclose(cmd);
        }
        system(strncat("echo ", strncat(input, " >> /home/$USER/.qb_transactions", 500), 500));
        usleep(20000);
    }
    return 0;
}
I am attempting to build a concept for a transactional shell, and I'm having it output every command you enter into a file in the user's home directory. It's not completely finished, but I'm doing one part at a time. When I put any input into the "shell", it crashes. Code::Blocks tells me "Process returned -1 (0xFFFFFFFF)" and then the usual info about the runtime. What am I doing wrong here?
strncat appends to its first argument in place, so you need to pass it a writable buffer as the first argument. You're passing a string literal ("echo "), which, depending on your compiler and runtime environment, may either overwrite unpredictable parts of memory or crash because it's trying to write to read-only memory.
char command[500];
strcpy(command, "echo ");
strncat(command, input, sizeof(command)-1-strlen(command));
strncat(command, " >> /home/$USER/.qb_transactions", sizeof(command)-1-strlen(command));
system(command);
As with the rest of your code, I've omitted error checking, so the command will be truncated if it doesn't fit the buffer. Also note that repeated calls to strncat are inefficient since they involve traversing the string many times to determine its end; it would be more efficient to use the return value and keep track of the remaining buffer size, but I'm leaving this as a follow-up exercise.
Of course invoking a shell to append to a file is a bad idea in the first place. If the input contains shell special characters, they'll be evaluated. You should open the log file and write to it directly.
char log_path[PATH_MAX];                /* PATH_MAX is in <limits.h> */
strcpy(log_path, getenv("HOME"));
strncat(log_path, "/.qb_transactions", PATH_MAX - 1 - strlen(log_path));
FILE *log_file = fopen(log_path, "a");
…
while (1) {
    …
    fputs(input, log_file);
}
fclose(log_file);
(Once again, error checking omitted.)

fgetws can't read non-English characters on Linux

I have a basic C program that reads some lines from a text file containing hundreds of lines in its working directory. Here is the code:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <ctype.h>
#include <string.h>
#include <locale.h>
#include <wchar.h>
#include <wctype.h>
#include <unistd.h>

int main(int argc, const char * argv[]) {
    srand((unsigned)time(0));
    char *nameFileName = "MaleNames.txt";
    wchar_t line[100];
    wchar_t **nameLines = malloc(sizeof(wchar_t*) * 2000);
    int numNameLines = 0;
    FILE *nameFile = fopen(nameFileName, "r");
    while (fgetws(line, 100, nameFile) != NULL) {
        nameLines[numNameLines] = malloc(sizeof(wchar_t) * 100);
        wcsncpy(nameLines[numNameLines], line, 100);
        numNameLines++;
    }
    fclose(nameFile);
    wchar_t *name = nameLines[rand() % numNameLines];
    name[wcslen(name) - 1] = '\0';
    wprintf(L"%ls", name);
    int i;
    for (i = 0; i < numNameLines; i++) {
        free(nameLines[i]);
    }
    free(nameLines);
    return 0;
}
It basically reads my text file (defined as a macro; it exists in the working directory) line by line. The rest is irrelevant. It runs perfectly and as expected on my Mac (with llvm/Xcode). When I try to compile (nothing fancy, again, gcc main.c) and run it on a Linux server, it either:
Exits with error code 2 (meaning no lines are read).
Reads only the first 3 lines from my file with hundreds of lines.
What causes this nondeterministic (and incorrect) behavior? I've tried commenting out the first line (the random seed) and compiling again; it always exits with return code 2.
What is the relation between the random functions and reading a file, and why am I getting this behavior?
UPDATE: I've fixed the malloc to sizeof(wchar_t) * 100 from sizeof(wchar_t) * 50. It didn't change anything. My lines are about 15 characters at most, and there are far fewer than 2000 lines (that is guaranteed).
UPDATE 2:
I've compiled with -Wall, no issues.
I've compiled with -Werror, no issues.
I've run valgrind, which didn't find any leaks either.
I've debugged with gdb; it just doesn't enter the while loop (the fgetws call returns NULL).
UPDATE 3: I'm getting a floating point exception on Linux, as numNameLines is zero.
UPDATE 4: I verify that I have read permissions on MaleNames.txt.
UPDATE 5: I've found that accented, non-English characters (e.g. Â) cause problems while reading lines. fgetws halts on them. I've tried setting the locale (both setlocale(LC_ALL, "en.UTF-8"); and setlocale(LC_ALL, "tr.UTF-8"); separately), but it didn't work.
fgetws() is attempting to read up to 100 wide characters. The malloc() call in the loop allocates 50 wide characters.
The wcscpy() call copies all the wide characters read. If more than 50 wide characters have been read (including the terminating nul) then wcscpy() will overrun the allocated buffer. That results in undefined behaviour.
Instead of multiplying by 50 in the loop, multiply by 100. Or, better yet, compute the length of string read and use that.
Independently of the above, your code will also overrun a buffer if the file contains more than 2000 lines. Your loop needs to check for that.
A number of the functions in your code can fail, and will return a value to indicate that. Your code is not checking for any such failures.
Your code running under OS X is happenstance. The behaviour is undefined, which means there is potential to fail on any host system, when built with any compiler. Appearing to run correctly on one system, and failing on another system, is actually a valid set of responses to undefined behaviour.
Found the solution. It was all about the locale, from the beginning. After experimenting and hours of research, I've stumbled upon this: http://cboard.cprogramming.com/c-programming/142780-arrays-accented-characters.html#post1066035
#include <locale.h>
setlocale(LC_ALL, "");
Setting locale to empty string solved my problem instantly.
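For completeness, a minimal sketch of how the fix fits the question's flow (same file name and wide-character calls as above; error handling kept short):
#include <locale.h>
#include <stdio.h>
#include <wchar.h>

int main(void) {
    setlocale(LC_ALL, "");   /* adopt the environment's locale before any I/O */
    wchar_t line[100];
    FILE *nameFile = fopen("MaleNames.txt", "r");
    if (nameFile == NULL)
        return 1;
    while (fgetws(line, 100, nameFile) != NULL)
        wprintf(L"%ls", line);
    fclose(nameFile);
    return 0;
}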

C Stop stdout from flushing

I am writing a C program where I print to stderr and also use putchar() within the code. I want the console output to show all of the stderr output and then finally flush stdout before the program ends. Does anyone know of a method that will stop stdout from flushing when a putchar('\n') occurs?
I suppose I could just add an if statement to make sure it doesn't putchar any newlines, but I would prefer a line or two of code at the top of the program that stops all flushing until I say fflush(stdout) at the bottom of the program.
What you're trying to do is horribly fragile. C provides no obligation for an implementation of stdio not to flush output, under any circumstances. Even if you get it to work for you, the behavior will depend on not exceeding the buffer size. If you really need this behavior, you should probably buffer the output yourself (possibly writing it to a tmpfile() rather than stdout), then copy it all to stdout as the final step before your program exits, as in the sketch below.
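A rough sketch of that tmpfile() approach (the message strings are placeholders):
#include <stdio.h>

int main(void) {
    FILE *out = tmpfile();                        /* collects would-be stdout output */
    if (out == NULL)
        return 1;
    fprintf(stderr, "diagnostics come first\n");  /* reaches the console immediately */
    fputc('h', out);
    fputc('i', out);
    fputc('\n', out);                             /* nothing reaches the real stdout yet */
    rewind(out);                                  /* final step: copy everything to stdout */
    int c;
    while ((c = getc(out)) != EOF)
        putchar(c);
    fflush(stdout);
    fclose(out);
    return 0;
}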
Run your command from the console using redirection:
my_command >output.txt
All output written to stderr will appear immediately. The stuff written to stdout will go to output.txt.
Windows only. I'm still looking for the Unix solution myself if anyone has it!
Here is a minimal working example for Windows that sends a buffer to stdout without flushing. You can adjust the maximum buffer size before a flush occurs by changing max_buffer, though I imagine there's some upper limit!
#include <windows.h>
#include <string.h>

int main()
{
    const char* my_buffer = "hello, world!";
    HANDLE hStdout = GetStdHandle(STD_OUTPUT_HANDLE);
    int max_buffer = 1000000;
    int num_remaining = strlen(my_buffer);

    while (num_remaining)
    {
        DWORD num_written = 0;
        int buffer_size = num_remaining < max_buffer ? num_remaining : max_buffer;
        int retval = WriteConsoleA(hStdout, my_buffer, buffer_size, &num_written, 0);
        if (retval == 0 || num_written == 0)
        {
            // Handle error
        }
        num_remaining -= num_written;
        if (num_remaining == 0)
        {
            break;
        }
        my_buffer += num_written;
    }
}
You can use setvbuf() to fully buffer output to stdout and provide a large enough buffer size for your purpose:
#include <stdio.h>

int main() {
    // issue this call before any output
    setvbuf(stdout, NULL, _IOFBF, 16384);
    ...
    return 0;
}
Output to stderr is unbuffered by default, so it should go to the console immediately.
Output to stdout is line buffered by default when attached to the terminal. Setting it to _IOFBF (fully buffered) should prevent putchar('\n') from flushing the pending output.
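A short sketch of the combined effect (the 16384 buffer size is arbitrary):
#include <stdio.h>

int main(void) {
    setvbuf(stdout, NULL, _IOFBF, 16384);       /* fully buffer stdout */
    fprintf(stderr, "stderr line, appears first\n");
    putchar('h');
    putchar('i');
    putchar('\n');                              /* no longer triggers a flush */
    fprintf(stderr, "still ahead of stdout\n");
    fflush(stdout);                             /* "hi" appears only now */
    return 0;
}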
