Segmentation fault in pipe - c

Given the following program:
#include <stdio.h>
int main()
{
char buf[1024];
scanf("%s", buf);
printf("----> %s", buf);
return 0;
}
which is executed as follows:
grep ....| a.out
or
echo ....| a.out
I get a Segmentation fault error. Can anyone explain why?

Whatever you are echoing or grepping must contain a run of more than 1023 non-whitespace characters (1024 minus 1 for the null terminator), since %s stops at whitespace.
Instead of using scanf, use fgets and specify a size. Alternatively, use scanf but specify the field width. You can do scanf("%1023s", buf);. If there are more bytes available, you can always do it again to read in the rest.
Given your test input, you should not receive a segfault. I just tried it locally and it worked fine. If you are on Linux, since you wrote a.out instead of ./a.out, depending on how your path is configured you may be running the wrong program (some sort of a.out in your bin folder?)

Don't ever use scanf with unbounded strings. fgets provides a much safer alternative, especially if you provide an intelligent wrapper function like the one in this answer.
I'm assuming that's just sample code here but, just in case it isn't, you can achieve the same effect with:
WhateverYourCommandIs | sed 's/^/----> /'
without having to write your own tool to do the job. In fact, with sed, awk and the likes, you probably never need to write text processing tools yourself.

From the scanf man page:
s Matches a sequence of non-white-space characters; the next pointer must be a pointer to character array that is long enough to hold the input sequence and the terminating null character ('\0'), which is added automatically. The input string stops at white space or at the maximum field width, whichever occurs first.
Specifying a maximum field width prevents the stack overrun:
scanf("%1023s", buf);
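To make sure printf cannot run past the end of the buffer (for example, if scanf reads nothing and leaves buf uninitialized), zero it with memset first: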
memset(buf,0,1024);
So the program becomes:
#include <stdio.h>
#include <string.h>
int main()
{
char buf[1024];
memset(buf,0,1024);
scanf("%1023s", buf);
printf("----> %s", buf);
return 0;
}

Related

How am I reading out of the bounds of my array?

I'm writing a program that will have a command prompt where the user can infinitely input command strings and I will process them as needed.
I have a command-line limit of 200 characters, but for now I am running a hello world test with a limit of 4 characters per command to see how my system handles an input overflow. To my surprise and confusion, even though I declare my input array as command[5], which allocates only 5 characters, I am able to write outside those bounds and read command[7] without getting any exception or runtime error. In the example below, I input hello world as a command, and reading command[7] returns the letter o, which is the correct character (I was expecting an error after trying to read outside the 5-character bound of my array).
Can someone explain what's going on? How can I make sure that the input gets truncated as I was expecting when the user goes over the buffer size that I've established?
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(int argc, char *argv[])
{
char command[5]; //commands can't be longer than 4 characters
char c;
while (1)
{
printf("# "); //print command prompt
scanf("%[^\n]4s", command); //read command
while ((c = getchar()) != '\n' && c != EOF)
{
/*discard overflow input*/;
}
printf("received command: %c\n", command[7]); //echo character from command
if (strcmp(command, "exit") == 0)
{
break;
}
memset(command, 0, sizeof(command)); //clean command buffer
}
return 0;
}
The C standard doesn't specify what happens when you access memory outside of what you've allocated; it is undefined behavior. It could read correctly, it could read something else (if something that owns that memory overwrites it), or it could cause a segmentation fault (if you access memory outside of your program's allocated space).
One option would be to use the width modifier on scanf to ensure you receive at most 4 characters:
scanf("%4s", command);
As multiple people mentioned in the comments and other answers, the C standard establishes that accessing memory outside the allocated bounds results in undefined behavior, but that doesn't mean it will always give an error. I guess that was an interpretation error by me.
Regarding the specific issue I was having where scanf was reading more than 4 characters, the advice provided by @Weather Vane in the comments worked well. All that was needed was changing my scanf command.
From this: scanf("%[^\n]4s", command); //read command
To this: scanf("%4[^\n]s", command); //read command
This way, scanf will only write up to 4 characters into the buffer command, and the contiguous memory will be left untouched. Therefore, if I try to access command[7], I would get garbage or possibly an error.
For anyone wondering about the while loop that discards overflow, see the comment section.

Character Array and Null character

#include <stdio.h>
#include <stdlib.h>
int main()
{
int i;
char str[4];
scanf("%s",str);
printf("%s",str);
}
Input: scan
Output: scan
Here I declare an array of 4 characters and use '%s', which is for strings. I don't understand how we can input 4 char elements and still get the correct answer when one slot should be taken by the null character. The input should only work with up to 3 elements.
scanf() does not check its arguments. You could even enter more than 4 characters and scanf() would happily overwrite the memory area that comes after your array. After that, your program might crash or all kinds of funny things might happen. This is called a buffer overflow and it is a common cause of vulnerabilities in software.
As mentioned, when you take more than 3 characters as input, the extra characters and the '\0' are written outside the array (just after it), overwriting memory that doesn't belong to the array, which causes undefined behavior.
You can use either of these to prevent the buffer overflow:
scanf("%3s",str);
or
fgets(str, sizeof str, stdin);

getline how to limit amount of input as you can with fgets

GNU manual
This quote is from the GNU manual
Warning: If the input data has a null character, you can’t tell. So
don’t use fgets unless you know the data cannot contain a null. Don’t
use it to read files edited by the user because, if the user inserts a
null character, you should either handle it properly or print a clear
error message. We recommend using getline instead of fgets.
As I usually do, I spent time searching before asking a question, and I did find a similar question on Stack Overflow from five years ago:
Why is the fgets function deprecated?
Although GNU recommends getline over fgets, I noticed that getline in stdio.h takes any size line. It calls realloc as needed. If I try to set the size to 10 char:
#include <stdio.h>
#include <stdlib.h>
int main()
{
char *buffer;
size_t bufsize = 10;
size_t characters;
buffer = (char *)malloc(bufsize * sizeof(char));
if( buffer == NULL)
{
perror("Unable to allocate buffer");
exit(1);
}
printf("Type something: ");
characters = getline(&buffer,&bufsize,stdin);
printf("%zu characters were read.\n",characters);
printf("You typed: '%s'\n",buffer);
return(0);
}
In the code above, type any size string, over 10 char, and getline will read it and give you the right output.
There is no need to even malloc, as I did in the code above — getline does it for you. I'm setting the buffer to size 0, and getline will malloc and realloc for me as needed.
#include <stdio.h>
#include <stdlib.h>
int main()
{
char *buffer = NULL; //must be NULL so that getline allocates the buffer itself
size_t bufsize = 0;
size_t characters;
printf("Type something: ");
characters = getline(&buffer,&bufsize,stdin);
printf("%zu characters were read.\n",characters);
printf("You typed: '%s'\n",buffer);
return(0);
}
If you run this code, again you can enter any size string, and it works. Even though I set the buffer size to 0.
I've been looking at safe coding practices from CERT guidelines www.securecoding.cert.org
I was thinking of switching from fgets to getline, but I cannot figure out how to limit the input in getline. I think a malicious attacker could use a loop to send an unlimited amount of data and use up all the RAM available in the heap?
Is there a way of limiting the input size that getline uses or does getline have some limit within the function?
Using fgets is not necessarily problematic. All the GNU manual tells you is that if there is a '\0' byte in the file, there will be one in your buffer too, so you cannot tell whether a null in the buffer is the terminator that fgets added or a null that came from the file. This means you can read a 100-char file into a 200-char buffer and end up with what looks like a 50-char C string.
The stdio.h getline in fact doesn't appear to have any sane length limitation, so fread might be a viable alternative.
Unlike C getline and C++ std::getline(), C++ std::istream::getline() is limited to count characters.
The GNU manual is just bad. Limiting the input length is usually the right thing to do, especially if input is untrusted, and fgets does this correctly. getline cannot be used safely in such a context.

Using snprintf to avoid buffer overrun

I don't understand why I am getting an output like this: StackOver↨< as snprintf should take care of null termination, and the expected output is StackOver. I am using the devcpp IDE.
#include<stdio.h>
#include<string.h>
#include<stdlib.h>
int main(void)
{
char buffer[10];
printf("%d\n", sizeof(buffer));
snprintf(buffer, sizeof(buffer), "%s", "StackOverflow");
printf("%s", buffer);
return 0;
}
The C Standard states that the copied string shall be nul-terminated:
7.21.6.5 The snprintf function
...
Description
The snprintf function is equivalent to fprintf , except that the
output is written into an array (specified by argument s ) rather than
to a stream. If n is zero, nothing is written, and s may be a
null pointer. Otherwise, output characters beyond the n-1st
are discarded rather than being written to the array, and a null
character is written at the end of the characters actually written
into the array. If copying takes place between objects that
overlap, the behavior is undefined.
It appears you are running with an outdated and/or buggy C runtime library as the snprintf() implementation doesn't seem to properly implement the required behavior.
This code is working fine for me. The buffer only has space for 10 characters, so snprintf is only able to write the first 9 characters that you tell it to write ("StackOver"). At the tenth character it stores a terminating null character, since every C string must be null-terminated.
The only suggestion I would make is adding a newline when printing the string at the end:
printf("%s\n", buffer);
The lack of a newline at the end might be the reason why your IDE is showing you that ↨ character.
If you want the buffer to fit "StackOverflow" you need to allocate it to something larger.

String declaration length in C

So I'm writing a small program (I'm new to C, coming from C++), and I want to take in a string of maximum length ten.
I declare a character array as
#define SYMBOL_MAX_LEN 10 //Maximum length a symbol can be from the user (NOT including null character)
.
.
.
char symbol[SYMBOL_MAX_LEN + 1]; //Holds the symbol given by the user (+1 for null character)
So why is it when I use:
scanf("%s", symbol); //Take in a symbol given by the user as a string
I am able to type '01234567890', and the program will still store the entire value?
My questions are:
Does scanf not prevent values from being recorded in the adjacent blocks of memory after symbol?
How could I prevent the user from entering a value of greater than length SYMBOL_MAX_LEN?
Does scanf put the null terminating character into symbol automatically, or is that something I will need to do manually?
You can limit the number of characters scanf() will read like so:
#include <stdio.h>
int main(void) {
char buffer[4];
scanf("%3s", buffer);
printf("%s\n", buffer);
return 0;
}
Sample output:
paul@local:~/src/c/scratch$ ./scanftest
abc
abc
paul@local:~/src/c/scratch$ ./scanftest
abcdefghijlkmnop
abc
paul@local:~/src/c/scratch$
scanf() will add the terminating '\0' for you.
If you don't want to hardcode the length in your format string, you can just construct it dynamically, e.g.:
#include <stdio.h>
#define SYMBOL_MAX_LEN 4
int main(void) {
char buffer[SYMBOL_MAX_LEN];
char fstring[100];
sprintf(fstring, "%%%ds", SYMBOL_MAX_LEN - 1);
scanf(fstring, buffer);
printf("%s\n", buffer);
return 0;
}
For the avoidance of doubt, scanf() is generally a terrible function for dealing with input. fgets() is much better for this type of thing.
Does scanf not prevent values from being recorded in the adjacent blocks of memory after symbol?
As far as I know, No.
How could I prevent the user from entering a value of greater than length SYMBOL_MAX_LEN?
By using buffer safe functions like fgets.
Does scanf put the null terminating character into symbol automatically, or is that something I will need to do manually?
It always adds the nul terminator, but that is only safe if the array has room for it. For example, if your array has length 10 and you input 10 chars, the terminator is written past the end of the array.
I am able to type '01234567890', and the program will still store the entire value?
You are unlucky that you happen to get your desired result: this invokes undefined behavior, so it only appears to work.
Does scanf not prevent values from being recorded in the adjacent blocks of memory after symbol?
No.
How could I prevent the user from entering a value of greater than length SYMBOL_MAX_LEN?
Use fgets.
Does scanf put the null terminating character into symbol automatically, or is that something I will need to do manually?
Yes, scanf adds it automatically.
