Bus error caused while reading in a string - c

#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int main() {
char *input = (char *)malloc(sizeof(char));
input = "\0";
while (1){
scanf("%s\n", input);
if (strcmp(input, "0 0 0") == 0) break;
printf("%s\n",input);
}
}
I'm trying to read in a string of integers until "0 0 0" is entered in.
The program spits out bus error as soon as it executes the scanf line, and I have no clue how to fix it.
Below is the error log.
[1] 59443 bus error

You set input to point to the first element of a string literal (while leaking the recently allocated buffer):
input = "\0"; // now the malloc'd buffer is lost
Then you try to modify said literal:
scanf("%s\n", input);
That is undefined behaviour. You can't write to that location. You can fix that problem by removing the first line, input = "\0";.
Next, note that you're only allocating space for one character:
char *input = (char *)malloc(sizeof(char));
Once you fix the memory leak and the undefined behaviour, you can think about allocating more space. How much space you need is for you to say, but you need enough to contain the longest string you want to read in plus an extra character for the null terminator. For example,
char *input = malloc(257);
would allow you to read in strings up to 256 characters long.

The immediate problem, (thanks to another answer) is that you're initializing input wrong, by pointing it at read-only data, then later trying to write to it via scanf. (Yes, even the lowly literal "" is a pointer to a memory area where the empty string is stored.)
The next problem is semantic: there's no point in trying to initialize it when scanf() will soon overwrite whatever you put there. But if you wanted to, a valid way is input[0] = '\0', which would be appropriate for, say, a loop using strcat().
And finally, waiting in the wings to bite you is a deeper issue: You need to understand malloc() and sizeof() better. You're only allocating enough space for one character, then overrunning the 1-char buffer with a string of arbitrary length (up to the maximum that your terminal will allow on a line.)
A rough cut would be to allocate far more, say 256 chars, than you'll ever need, but scanf is an awful function for this reason -- makes buffer overruns painfully easy especially for novices. I'll leave it to others to suggest alternatives.
Interestingly, the type of crash can indicate something about what you did wrong. A Bus error often relates to modifying read-only memory (which is still a mapped page), such as you're trying to do, but a Segmentation Violation often indicates overrunning a buffer of a writable memory range, by hitting an unmapped page.

input = "\0";
is wrong.
'input' is pointer, not memory.
"\0" is string, not char.
You assigning pointer to a new value which points to a segment of memory which holds constants because "\0" is constant string literal.
Now when you are trying to modify this constant memory, you are getting bus error which is expected.
In your case i assume you wanted to initialize 'input' with empty string.
Use
input[0]='\0';
note single quotes around 0.
Next problem is malloc:
char *input = (char *)malloc(sizeof(char));
you are allocating memory for 1 character only.
When user will enter "0 0 0" which is 5 characters + zero you will get buffer overflow and will probably corrupt some innocent variable.
Allocate enough memory upfront to store all user input. Usual values are 256, 8192 bytes it doesn't matter.
Then,
scanf("%s\n", input);
may still overrun the buffer if user enters alot of text. Use fgets(buf, limit(like 8192), stdin), that would be safer.

Related

Malloc array of characters. String

I understand that assigning memory allocation for string requires n+1 due to the NULL character. However, the question is what if you allocate 10 chars but enter an 11 char string?
#include <stdlib.h>
int main(){
int n;
char *str;
printf("How long is your string? ");
scanf("%d", &n);
str = malloc(n+1);
if (str == NULL) printf("Uh oh.\n");
scanf("%s", str);
printf("Your string is: %s\n", str);
}
I tried running the program but the result is still the same as n+1.
If you allocated a char* of 10 characters but wrote 11 characters to it, you're writing to memory you haven't allocated. This has undefined behavior - it may happen to work, it may crash with a segmentation fault, and it may do something completely different. In short - don't rely on it.
If you overrun an area of memory given you by malloc, you corrupt the RAM heap. If you're lucky your program will crash right away, or when you free the memory, or when your program uses the chunk of memory right after the area you overran. When your program crashes you'll notice the bug and have a chance to fix it.
If you're unlucky your code goes into production, and some cybercriminal figures out how to exploit your overrun memory to trick your program into running some malicious code or using some malicious data they fed you. If you're really unlucky, you get featured in Krebs On Security or some other information security news outlet.
Don't do this. If you're not confident of your ability to avoid doing it, don't use C. Instead use a language with a native string data type. Seriously.
what if you allocate 10 chars but enter an 11 char string?
scanf("%s", str); experiences undefined behavior (UB). Anything may happen including "I tried running the program but the result is still the same as n+1." will appear OK.
Instead always use a width with scanf() and "%s" to stop reading once str[] is full. Example:
char str[10+1];
scanf("%10s", str);
Since n is variable here, consider instead using fgets() to read a line of input.
Note that fgets() also reads and saves a trailing '\n'.
Better to use fgets() for user input and drop scanf() call altogether until you understand why scanf() is bad.
str = malloc(n+1);
if (str == NULL) printf("Uh oh.\n");
if (fgets(str, n+1, stdin)) {
str[strcspn(str, "\n")] = 0; // Lop off potential trailing \n
When you write 11 bytes to a 10-byte buffer, the last byte will be out-of-bounds. Depending on several factors, the program may crash, have unexpected and weird behavior, or may run just fine (i.e., what you are seeing). In other words, the behavior is undefined. You pretty much always want to avoid this, because it is unsafe and unpredictable.
Try writing a bigger string to your 10-byte buffer, such as 20 bytes or 30 bytes. You will see problems start to appear.

Heap Overflow Attack

I am learning about heap overflow attacks and my textbook provides the following vulnerable C code:
/* record type to allocate on heap */
typedef struct chunk {
char inp[64]; /* vulnerable input buffer */
void (*process)(char *); /* pointer to function to process inp */
} chunk_t;
void showlen(char *buf)
{
int len;
len = strlen(buf);
printf("buffer5 read %d chars\n", len);
}
int main(int argc, char *argv[])
{
chunk_t *next;
setbuf(stdin, NULL);
next = malloc(sizeof(chunk_t));
next->process = showlen;
printf("Enter value: ");
gets(next->inp);
next->process(next->inp);
printf("buffer5 done\n");
}
However, the textbook doesn't explain how one would fix this vulnerability. If anyone could please explain the vulnerability and a way(s) to fix it that would be great. (Part of the problem is that I am coming from Java, not C)
The problem is that gets() will keep reading into the buffer until it reads a newline or reaches EOF. It doesn't know the size of the buffer, so it doesn't know that it should stop when it hits its limit. If the line is 64 bytes or longer, this will go outside the buffer, and overwrite process. If the user entering the input knows about this, he can type just the right characters at position 64 to replace the function pointer with a pointer to some other function that he wants to make the program call instead.
The fix is to use a function other than gets(), so you can specify a limit on the amount of input that will be read. Instead of
gets(next->inp);
you can use:
fgets(next->inp, sizeof(next->inp), stdin);
The second argument to fgets() tells it to write at most 64 bytes into next->inp. So it will read at most 63 bytes from stdin (it needs to allow a byte for the null string terminator).
The code uses gets, which is infamous for its potential security problem: there's no way to specify the length of the buffer you pass to it, it'll just keep reading from stdin until it encounters \n or EOF. It may therefore overflow your buffer and write to memory outside of it, and then bad things will happen - it could crash, it could keep running, it could start playing porn.
To fix this, you should use fgets instead.
You can fill up next with more than 64 bytes you will by setting the address for process. Thereby enable one to insert whatever address one wishes. The address could be a pointer to any function.
To fix simple ensure that only 63 bytes (one for null) is read into the array inp - use fgets
The function gets does not limit the amount of text that comes from stdin. If more than 63 chars come from stdin, there will be an overflow.
The gets discards the LF char, that would be an [Enter] key, but it adds a null char at the end, thus the 63 chars limit.
If the value at inp is filled with 64 non-null chars, as it can be directly accessed, the showlen function will trigger an access violation, as strlen will search for the null-char beyond inp to determine its size.
Using fgets would be a good fix to the first problem but it will also add a LF char and the null, so the new limit of readable text would be 62.
For the second, just take care of what is written on inp.

Printing char array in C causes segmentation fault

I did a lot of searching around for this, couldn't find any question with the same exact issue.
Here is my code:
void fun(char* name){
printf("%s",name);
}
char name[6];
sscanf(input,"RECTANGLE_SEARCH(%6[A-Za-z0-9])",name)
printf("%s",name);
fun(name);
The name is grabbed from scanf, and it printed out fine at first. Then when fun is called, there is a segmentation fault when it tries to print out name. Why is this?
After looking in my scrying-glass, I have it:
Your scanf did overflow the buffer (more than 6 byte including terminator read), with ill-effect slightly delayed due to circumstance:
Nobody else relied on or re-used the memory corrupted at first, thus the first printf seems to work.
Somewhere after the first and before the second call to printf the space you overwrote got re-used, so the string you read was no longer terminated before encountering not allocated pages.
Thus, a segmentation-fault at last.
Of course, your program was toast the moment it overflowed the buffer, not later when it finally crashed.
Morale: Never write to memory you have not dedicated for that.
Looking at your edit, the format %6[A-Za-z0-9] tries to read up to 6 characters exclusive the terminator, not inclusive!
Since you're reading 6 characters, you have to declare name to be 7 characters, so there's room for the terminating null character:
char name[7];
Otherwise, you'll get a buffer overflow, and the consequences are undefined. Once you have undefined consequences, anything can happen, including 2 successful calls to printf() followed by a segfault when you call another function.
You're probably walking off the end of the array with your printf statement. Printf uses the terminating null character '\0' to know where the end of the string is. Try allocating your array like this:
char name[6] = {'\0'};
This will allocate your array with every element initially set to the '\0' character, which means that as long as you don't overwrite the entire array with your scanf, printf will terminate before walking off the end.
Are you sure that name is zero byte terminated? scanf can overflow your buffer depending on how you are calling it.
If that happens then printf will read beyond the end of the array resulting in undefined behavior and probably a segmentation fault.

Null termination of char array

Consider following case:
#include<stdio.h>
int main()
{
char A[5];
scanf("%s",A);
printf("%s",A);
}
My question is if char A[5] contains only two characters. Say "ab", then A[0]='a', A[1]='b' and A[2]='\0'.
But if the input is say, "abcde" then where is '\0' in that case. Will A[5] contain '\0'?
If yes, why?
sizeof(A) will always return 5 as answer. Then when the array is full, is there an extra byte reserved for '\0' which sizeof() doesn't count?
If you type more than four characters then the extra characters and the null terminator will be written outside the end of the array, overwriting memory not belonging to the array. This is a buffer overflow.
C does not prevent you from clobbering memory you don't own. This results in undefined behavior. Your program could do anything—it could crash, it could silently trash other variables and cause confusing behavior, it could be harmless, or anything else. Notice that there's no guarantee that your program will either work reliably or crash reliably. You can't even depend on it crashing immediately.
This is a great example of why scanf("%s") is dangerous and should never be used. It doesn't know about the size of your array which means there is no way to use it safely. Instead, avoid scanf and use something safer, like fgets():
fgets() reads in at most one less than size characters from stream and stores them into the buffer pointed to by s. Reading stops after an EOF or a newline. If a newline is read, it is stored into the buffer. A terminating null byte ('\0') is stored after the last character in the buffer.
Example:
if (fgets(A, sizeof A, stdin) == NULL) {
/* error reading input */
}
Annoyingly, fgets() will leave a trailing newline character ('\n') at the end of the array. So you may also want code to remove it.
size_t length = strlen(A);
if (A[length - 1] == '\n') {
A[length - 1] = '\0';
}
Ugh. A simple (but broken) scanf("%s") has turned into a 7 line monstrosity. And that's the second lesson of the day: C is not good at I/O and string handling. It can be done, and it can be done safely, but C will kick and scream the whole time.
As already pointed out - you have to define/allocate an array of length N + 1 in order to store N chars correctly. It is possible to limit the amount of characters read by scanf. In your example it would be:
scanf("%4s", A);
in order to read max. 4 chars from stdin.
character arrays in c are merely pointers to blocks of memory. If you tell the compiler to reserve 5 bytes for characters, it does. If you try to put more then 5 bytes in there, it will just overwrite the memory past the 5 bytes you reserved.
That is why c can have serious security implementations. You have to know that you are only going to write 4 characters + a \0. C will let you overwrite memory until the program crashes.
Please don't think of char foo[5] as a string. Think of it as a spot to put 5 bytes. You can store 5 characters in there without a null, but you have to remember you need to do a memcpy(otherCharArray, foo, 5) and not use strcpy. You also have to know that the otherCharArray has enough space for those 5 bytes.
You'll end up with undefined behaviour.
As you say, the size of A will always be 5, so if you read 5 or more chars, scanf will try to write to a memory, that it's not supposed to modify.
And no, there's no reserved space/char for the \0 symbol.
Any string greater than 4 characters in length will cause scanf to write beyond the bounds of the array. The resulting behavior is undefined and, if you're lucky, will cause your program to crash.
If you're wondering why scanf doesn't stop writing strings that are too long to be stored in the array A, it's because there's no way for scanf to know sizeof(A) is 5. When you pass an array as the parameter to a C function, the array decays to a pointer pointing to the first element in the array. So, there's no way to query the size of the array within the function.
In order to limit the number of characters read into the array use
scanf("%4s", A);
There isn't a character that is reserved, so you must be careful not to fill the entire array to the point it can't be null terminated. Char functions rely on the null terminator, and you will get disastrous results from them if you find yourself in the situation you describe.
Much C code that you'll see will use the 'n' derivatives of functions such as strncpy. From that man page you can read:
The strcpy() and strncpy() functions return s1. The stpcpy() and
stpncpy() functions return a
pointer to the terminating `\0' character of s1. If stpncpy() does not terminate s1 with a NUL
character, it instead returns a pointer to s1[n] (which does not necessarily refer to a valid mem-
ory location.)
strlen also relies on the null character to determine the length of a character buffer. If and when you're missing that character, you will get incorrect results.
the null character is used for the termination of array. it is at the end of the array and shows that the array is end at that point. the array automatically make last character as null character so that the compiler can easily understand that the array is ended.
\0 is an terminator operator which terminates itself when array is full
if array is not full then \0 will be at the end of the array
when you enter a string it will read from the end of the array

Reading In A String and comparing it C

Im trying to create a C based string menu where a user inputs a command and then a block of code runs.
Whatever i do the conditional is never true:
char *input= "";
fgets(input, 50, stdin);
printf("%s",input);
printf("%d",strcmp( input,"arrive\0"));
if(strcmp( input,"arrive\0")==0){....
Im fairly new to c and am finding strings really annoying.
What am i doing wrong?
Note: current code crashes my program :(
Why strcmp always return non 0:
strcmp will return 0 only when the strings are identical. As for why it's evaluating to different always. It is because fgets puts a newline character at the end of your input buffer before the null termination.
/*Will print 0 if you type in arrive<enter>*/
printf("%d",strcmp( input,"arrive\n"));
Why your program crashes:
Another problem is that input should be a char buffer. Like so: char input[1024];
Currently you have input as a pointer to a null terminated string (which is read only memory)
Friendly suggestion:
Also don't put the null terminated \0 inside the string literals. It is implied automatically when you use a string literal. It doesn't matter to double null terminate as far as strcmp is concerned, but it may cause problems elsewhere in your future programs. And people will wonder why you're doing double null termination.
Try :
#define BUFF_LEN 256
char input[BUFF_LEN];
fgets(input, BUFF_LEN, stdin);
What you have , *input is a pointer to an address of memory that has not been allocated, hence can not be used by your program. The result of using it as you are is undefined, but usually leads to a segmentation fault. If you want to access it as a pointer, you will first need to allocate it:
char *input = malloc(BUFF_LEN);
... of course, test that for failure (NULL) then free() it after you are done using it.
Edit:
At least according to the single UNIX specification, fgets() is guaranteed to null terminate the buffer. Its not necessary to initialize input[].
As others have said, it is not necessary to include null / newlines when using strcmp().
I also strongly, strongly advise you to get used to using strncmp() now, while beginning to avoid many problems down the road.
Try replacing the first line with
char input[50];
memset(input, 0, sizeof(input));
Edit:
However, the real problem why strcmp doesn't return 0 is you have to "trim" the string read from fgets, which in most cases, includes a newline character.

Resources