avoiding char buffer overflow more efficiently - c

i wrote a simple in/out program
whenever i run it and enter the input and exceed the char limit i get
*** stack smashing detected ***: terminated Aborted (core dumped)
i searched it up and found it was a gcc thing for safety, i heard it might lead to seg faults so i experimented turning it off with -fno-stack-protector and it ran normally if i exceeded the char limit
but what if i want to write the program when the input length is unknown, is there a safer way to do this? more efficient than increasing the value in char to a ridiculously large value?
the code:
#include <stdio.h>
int main()
{
    char in[1];
    printf("in: ");
    scanf("%s\0", &in);
    printf("\nout: %s\n", in);
}
P.s- im new to C, >2 days old so a simple explanation would be appreciated

char in[1]; can hold only the empty string (a single null terminating byte), which is impossible to use safely with scanf.
Also note that explicitly stating the null terminating byte in a string literal is superfluous, as they are implicitly null terminated.
but what if i want to write the program when the input length is unknown, is there a safer way to do this? more efficient than increasing the value in char to a ridiculously large value?
The counter-questions here are:
What do you consider inefficient?
What do you define as ridiculously large?
As I see it, you have two options:
1. Use dynamically allocated memory to read strings of an arbitrary size.
2. Set a realistic upper limit on the length of input to expect.
An example of #1 can be seen in library functions like POSIX getline or getdelim. Its re-implementation can be as simple as malloc (realloc), getchar, and a loop.
The use of #2 depends greatly on the context of your program, and what it is supposed to do. Perhaps you are reading a single line, and a smallish buffer will suffice. Maybe you are expecting a larger chunk of data, and need more memory. Only you can decide this for yourself.
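As a rough illustration of option #1, here is a minimal sketch of the malloc/realloc-plus-loop idea. The name read_line and the starting capacity of 16 are assumptions made for this example, not part of any library. It uses getc(fp) so it works on any stream; getc(stdin) is equivalent to getchar().

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads one line of any length from fp.
 * Returns a malloc'd, null-terminated string without the newline,
 * or NULL on allocation failure or if EOF is hit before any input.
 * The caller must free() the result. */
char *read_line(FILE *fp)
{
    size_t cap = 16, len = 0;           /* 16 is an arbitrary starting size */
    char *buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    int c;
    while ((c = getc(fp)) != EOF && c != '\n') {
        if (len + 1 == cap) {           /* always keep one byte for '\0' */
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
        buf[len++] = (char)c;
    }
    if (c == EOF && len == 0) {         /* nothing left to read */
        free(buf);
        return NULL;
    }
    buf[len] = '\0';
    return buf;
}
```

A caller would write something like char *s = read_line(stdin); and free(s) when done. Doubling the capacity keeps the number of realloc calls small even for very long lines.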
In any case, it's up to you to avoid undefined behavior by preventing overflows before they happen. It is already too late if one has occurred.
Use field-width specifiers when using %s:
char buf[512];
if (1 != scanf("%511s", buf))
/* read error */;
or use sane functions like fgets, which allow you to pass a buffer size as an argument.
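One weakness of the field-width approach is that the width in the format string and the buffer size can drift apart when one is edited. A common preprocessor idiom derives the width from a single macro. This is only a sketch: MAX_WORD, STR, STR_ and read_word are names invented for this example.

```c
#include <stdio.h>

#define MAX_WORD 15                 /* longest word accepted, excluding '\0' */
#define STR_(x) #x
#define STR(x) STR_(x)              /* expands MAX_WORD before stringizing */

/* Hypothetical helper: reads one whitespace-delimited word from fp into
 * out, which must have room for MAX_WORD + 1 bytes.
 * Returns 1 on success, 0 on EOF or read error. */
int read_word(FILE *fp, char out[MAX_WORD + 1])
{
    /* The format string is built at compile time as "%15s". */
    return fscanf(fp, "%" STR(MAX_WORD) "s", out) == 1;
}
```

With this, a word longer than MAX_WORD is split across successive calls instead of overflowing the buffer.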

stack smashing detected
i searched it up and found it was a gcc thing for safety
That's indeed gcc's way of spotting run-time bugs in your code by inserting so-called "stack canaries" to spot stack corruption/overflows. More errors detected is a good thing.
i heard it might lead to seg faults
No, bugs in your application lead to seg faults. If the compiler provides ways to detect them before the OS, that's a good thing. Dormant but severe bugs in your program are a bad thing. However, the OS would possibly detect the bug too and say "seg fault".
so i experimented turning it off with -fno-stack-protector and it ran normally if i exceeded the char limit
Basically you know that you are an inexperienced driver and afraid you might hit other cars. To solve this, you drive with your eyes closed instead, so you won't see those cars you could hit. That doesn't mean that they disappear.
char in[1]; can only hold 1 byte of data, and if you read or write out of bounds of this array, you invoke undefined behavior, which may manifest itself as stack smashing or seg faults, because you are trying to write to memory that doesn't belong to you. This is the bug, this is the problem. The correct solution is to allocate enough memory.
(You also have a bug scanf("%s\0", &in); -> scanf("%s\0", in);. The & isn't needed since in is an array and automatically "decays" into a pointer to its first element when you pass it to a function.)
One sensible way is to allocate 128 bytes or so, and then restrict the input so that it cannot exceed that size (127 characters plus the null terminator). The proper function to read strings with restricted input length is fgets. So you could either switch to fgets or you could accept that your beginner trial programs need not live up to production quality and just use scanf for now. (You can use scanf safely as shown in another answer, but IMO that's just more cumbersome than using fgets.)
Also I would strongly advise C beginners not to worry about if they allocate 10 bytes or 100 bytes. Learn programming by using a PC and then it won't matter. Optimizing memory consumption is an advanced topic which you will learn later on.

Related

I allocated memory from the heap to store a character but it is holding a string [duplicate]

#include<stdio.h>
#include<stdlib.h>
int main()
{
    char* name;
    name=(char* )malloc(sizeof(char));
    printf("Enter a name: ");
    scanf("%s",name);
    printf("%s",name);
    return 0;
}
The above code perfectly stores a whole string when I am just allocating memory for a single character. How is this possible?
You have undefined behavior. Be very scared. Read Lattner's blog on UB.
printf("%s",name);
This works when name is a genuine string (that is some licit and valid memory zone ending with a zero byte). In your program it is not, you are accessing at least one byte past the malloc-ed zone. You have a buffer overflow.
The only case for a single-char zone when it is a valid string is when that single char contains a zero byte so the string is empty.
The above code perfectly stores a whole string
You just have bad luck (BTW, it is not "perfectly"). It could be worse (e.g. the collapse of the universe) and it could be better (e.g. some segmentation violation, nasal demons, ....). This is why you should be afraid of UB.
(to explain what happened on your particular computer, you need to dive into implementation details, e.g. study your operating system, your compiler, your instruction set, the assembler code generated by your compiler, etc etc...; you don't want to spend years of work on that, and even if you did it stays UB)
You should compile with all warnings and debug info (gcc -Wall -Wextra -g) and use the debugger gdb, valgrind and be sure that every memory location your program is dealing with is valid.
BTW malloc could fail, and you should test against that and sizeof(char) is by definition 1. You should try
char* name = malloc(80);
if (!name) { perror("malloc"); exit(EXIT_FAILURE); }
Please also read the documentation of every function you are using. Start with printf and malloc and scanf. Notice that your use of scanf is dangerous. Better end your printf control strings with a \n or use fflush appropriately (since stdout is often line-buffered). Also download and study the specification n1570 of the C11 programming language.
On some systems, you have more than what the C11 standard guarantees. For example, POSIX has getline, which allocates and grows the line buffer for you. Consider also fgets (in the C11 standard) if you don't have a POSIX system (then you would need complex tricks to read arbitrarily long lines, or else specify and document that your program can only handle lines of at most 79 bytes if using malloc(80)).
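To make the getline suggestion concrete, here is a small sketch that assumes a POSIX system. The function longest_line is a made-up example, not a library call. It shows the usual pattern: start with a NULL pointer and zero capacity, let getline grow the buffer, and free it once at the end.

```c
#define _POSIX_C_SOURCE 200809L     /* exposes getline() on POSIX systems */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>              /* ssize_t */

/* Hypothetical helper: returns the length of the longest line in fp,
 * not counting the newline. getline() grows line as needed. */
size_t longest_line(FILE *fp)
{
    char *line = NULL;              /* getline allocates when this is NULL */
    size_t cap = 0;
    size_t best = 0;
    ssize_t n;

    while ((n = getline(&line, &cap, fp)) != -1) {
        size_t len = (size_t)n;
        if (len > 0 && line[len - 1] == '\n')
            len--;                  /* do not count the newline */
        if (len > best)
            best = len;
    }
    free(line);                     /* one free, however often it grew */
    return best;
}
```

No line length appears anywhere in the code, so there is no arbitrary limit to document. Available memory is still the practical limit.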
It is good manners to avoid arbitrary limits in your code; of course your computer still has limitations (you probably won't be able to handle a line of a hundred billion bytes).
Be aware of character encoding. Use today, in 2017, UTF-8 everywhere (e.g. with the help of libunistring or of glib) but then remember that a Unicode character can span several bytes (that is char-s).
You have allocated memory for just one character which is sufficient to store just the terminating null byte. You need to allocate more.
For example:
char *name = malloc(128); /* allocates 128 bytes */
printf("Enter a name: ");
if (fgets(name, 128, stdin) == NULL) {
/* input failure */
}
Note that scanf() is notoriously error-prone for reading input. Prefer fgets() instead. Also be aware that fgets() stores the newline in the buffer if there's space, and you may want to remove it.
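A common way to drop the newline that fgets stores is strcspn. The sketch below wraps both steps in one function. The name read_trimmed is an assumption for this example.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line into buf (at most size - 1
 * characters) and removes the trailing newline, if fgets stored one.
 * Returns buf, or NULL on EOF or read error. */
char *read_trimmed(char *buf, size_t size, FILE *fp)
{
    if (fgets(buf, (int)size, fp) == NULL)
        return NULL;
    /* strcspn returns the index of the first '\n', or the string length
     * when there is none, so this is a no-op without a newline. */
    buf[strcspn(buf, "\n")] = '\0';
    return buf;
}
```

With the 128-byte buffer from the example above, the call would be read_trimmed(name, 128, stdin).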
Also see: Why does some code carefully cast the values returned by malloc to the pointer type being allocated?
You can only safely store one string into that array, and that string has a length of zero.
Perhaps, to your fortune, this code might appear to work for the rest of your life and you'll never hear any complaints about it... but that's really unlikely.
Type a larger number of characters; mash the keyboard with a few KB and I bet you'll start seeing segmentation faults. Be glad I showed you this, because I probably saved you many hours of debugging!
Having said that, I can't provide a 100% guarantee; it's just an educated guess. Perhaps your system doesn't segfault. Perhaps your system lets hackers bypass security, instead. Who knows? The behaviour is undefined.
I could have saved you a lawsuit due to insecure coding, instead, but we can't give a definition to undefined behaviour other than the definition which the standard programming language gives it, which is: non-portable and erratic.
The reason to avoid this should now be clear to you.

Char array can hold more than expected

I tried to run this code in C and expected runtime error but actually it ran without errors. Can you tell me the reason of why this happens?
char str[10];
scanf("%s",str);
printf("%s",str);
As I initialized the size of array as 10, how can code prints string of more than 10 letters?
As soon as you read or write from an array outside of its bounds, you invoke undefined behavior.
Whenever this happens, the program may do anything it wants. It is even allowed to play you a birthday song although it's not your birthday, or to transfer money from your bank account. Or it may crash, or delete files, or just pretend nothing bad happened.
In your case, it did the latter, but it is in no way guaranteed.
To learn further details about this phenomenon, read something about exploiting buffer overflows, this is a large topic.
C doesn't perform any bounds checking on the array. This can lead to buffer overflow attacks on your executable.
Bounds checking must be done by the programmer to guard against buffer overflows.
Instead of typing in magic numbers when reading input into an array with fgets, pass sizeof(array) as the size. fgets stores at most sizeof(array) - 1 characters, leaving room for the '\0' character.
This is a good question, and the answer is that there is indeed a memory problem. The string is read and stored starting at the address of str, for as many bytes as the actual input requires, and it exceeds the space you allocated for it.
Now, it may not crash immediately, or even ever for short programs, but it's very likely that when you expand the program and define other variables, this string will overrun them, creating weird bugs of all kinds, and it may eventually also crash.
In short, this is a real error, but it's not uncommon to have memory bugs like this one that have no visible effect at first but do create bugs or crash the program later.

Why does declaring a dynamic variable reserve more than specified?

I have the following code
char *str = (char *) malloc(sizeof(char)*5);
printf("Enter a string: ");
scanf("%s", str);
printf("%s\n", str);
This code is supposed to reserve 5 places in memory (e.g. 5 * 8 bits), which means it can store five characters.
Now, when I enter any number of characters (not just up to five), no error occurs, either at compile time or at run time.
Is this normal, or is there an error in my code that I did not understand?
C will not prevent you from shooting yourself in the foot. scanf will happily overwrite the buffer given to it, invoking undefined behavior. This error is not reliably detectable at runtime and will silently corrupt memory and break the runtime of your application in unpredictable ways.
It is your responsibility as the programmer to prevent this from happening - in this case, for example, by replacing scanf with much safer fgets.
You've allocated 5 bytes, but scanf will happily continue writing into *un*allocated memory. This is a buffer overflow, and the C runtime assumes you know what you are doing; no bounds checking is performed.
Don't use scanf. Use fgets to read at most 4 characters plus the terminating null byte into the 5-byte buffer:
char *str = malloc(5);
fgets(str, 5, stdin);
If you type a line with more than 4 characters, fgets stores only the first 4. The extra characters are not discarded: they stay in the input stream and will be returned by the next read.
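If the leftover part of an overlong line should not leak into the next read, it has to be consumed explicitly. The following is a sketch of one way to do that. The name read_line_bounded and its return convention are assumptions made for this example.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line into buf with fgets. If the line
 * did not fit, the unread remainder is consumed so the next read starts
 * on a fresh line. Returns 1 if the whole line fit, 0 if characters
 * were dropped, and -1 on EOF or read error. */
int read_line_bounded(char *buf, size_t size, FILE *fp)
{
    if (fgets(buf, (int)size, fp) == NULL)
        return -1;

    size_t len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';        /* complete line: drop the newline */
        return 1;
    }

    int c, dropped = 0;             /* no newline seen: skip rest of line */
    while ((c = getc(fp)) != EOF && c != '\n')
        dropped = 1;
    return dropped ? 0 : 1;
}
```

With a 5-byte buffer and the input "abcdefgh", buf ends up holding "abcd", the function returns 0, and the next call starts at the following line.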
You have indeed allocated space only for 5 bytes (i.e., strings up to 4 characters + the terminating NUL), but your scanf does not know that. It will blissfully overflow the allocated buffer. You are responsible for ensuring that this does not happen; you need to change your code accordingly. Overflowing the buffer is undefined behaviour so in theory “anything” may happen or not happen as a consequence, but in practice it tends to overwrite other things in adjacent memory, corrupting the contents of other variables and possibly even program code (leading to an exploit where a malicious user can craft the input string so as to execute arbitrary code).
(As an additional note, sizeof(char) is always 1 by definition, so you do not need to multiply by it.)
You asked for 5 chars, you get memory that can contain at least 5 chars; the allocator is free to give you more memory for its internal reasons, but there's no standard way to know how much more it gave to you.
Besides, normally there's no immediate error even if you actually overflow a buffer like you did - the standard does not mandate bounds checking, it just says that this is "undefined behavior", i.e. anything can happen, from "your program seems to work" to "universe death" passing through nasal demons.
What actually happens in most implementations is that you will happily write over whatever happens to be after your buffer - typically other local variables or the return address for stack variables, other memory blocks and allocator's data structures for heap allocations. The effect usually is "impossible" bugs due to changing unrelated variables, heap corruption (typically discovered when you call free), segmentation faults and the like.
You must be very careful with this kind of errors, since buffer overflows not only undermine the stability of your application, but can also be exploited for security breaches. Thus, never carelessly write in a buffer - always use functions that allow you to specify the total size of the buffer and that stop at its boundaries.
When you allocate memory dynamically, more space may be allocated than you requested. malloc implementations generally round the requested size up to a multiple of 8, 16, or some other power of two. That may be one of the reasons you are not getting any error.
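The rounding can be observed on some systems, though only through non-standard means. The sketch below assumes glibc (or another C library that provides malloc_usable_size in <malloc.h>). It is not portable C, and usable_bytes_for is a name invented for this example. It is meant for observation only: writing past the size you requested is still a bug.

```c
#include <malloc.h>   /* malloc_usable_size: a GNU extension, not standard C */
#include <stdlib.h>

/* Hypothetical helper: reports how many bytes the allocator actually set
 * aside for a request of `requested` bytes, or 0 if allocation fails.
 * Assumes a C library that provides malloc_usable_size(). */
size_t usable_bytes_for(size_t requested)
{
    void *p = malloc(requested);
    if (p == NULL)
        return 0;
    size_t usable = malloc_usable_size(p);   /* at least `requested` */
    free(p);
    return usable;
}
```

Calling usable_bytes_for(5) typically reports more than 5 on such systems. The exact figure depends on the allocator and should not be relied on.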

Strcpy a bigger string to a smaller array of char

Why when I do this:
char teststrcpy[5];
strcpy(teststrcpy,"thisisahugestring");
I get this message in run time:
Abort trap: 6
Shouldn't it just overwrite what is in the right of the memory of teststrcpy? If not, what does Abort trap means?
I'm using the GCC compiler under MAC OSX
As a note, and in answer to some comments, I am doing this for playing around C, I'm not going to try to do this in production. Don't you worry folkz! :)
Thanks
I don't own one, but I've read that Mac OS treats overflow differently: it won't allow you to overwrite memory in certain instances, strcpy() being one of them.
On a Linux machine, this code successfully overwrites the adjacent stack memory, but on Mac OS it is prevented (Abort trap) by a stack canary.
You might be able to get around that with the gcc option -fno-stack-protector
Ok, since you're seeing an abort from __strcpy_chk that would mean it's specifically checking strcpy (and probably friends). So in theory you could do the following*:
char teststrcpy[5];
gets(teststrcpy);
Then enter your really long string and it should behave badly, as you wish.
*I am only advising gets in this specific instance in an attempt to get around the OS's protection mechanisms that are in place. Under NO other instances would I suggest anyone use the code. gets is not safe.
Shouldn't it just overwrite what is in the right of the memory of teststrcpy?
Not necessarily, it's undefined behaviour to write outside the allocated memory. In your case, something detected the out-of-bounds write and aborted the programme.
In C, nobody tells you that the buffer is too small. If you insist on copying too many characters into a buffer that is too small, you enter undefined behavior territory.
If you would LIKE to overwrite what's after the 5th char of teststrcpy, you are a scary man. You can copy a string of length 4 into teststrcpy (the fifth char SHOULD be reserved for the null terminator).
Most likely your compiler is using a canary for buffer overflow protection and, thus, raising this exception when there is an overflow, preventing you from writing outside the buffer.
See http://en.wikipedia.org/wiki/Buffer_overflow_protection#Canaries

char pointer overflow

I am new to C and I was wondering if it was possible for a pointer to be overflowed by a vulnerable C function like strcpy(). I have seen it a lot in source code. Is there a way to avoid buffer overflows?
Yes, it is. This is in fact the classic cause of buffer overflow vulnerabilities. The only way to avoid overflowing the buffer is to ensure that you don't do anything that can cause the overflow. In the case of strcpy, one solution is to use strncpy, which takes the size of the destination buffer as an argument. Be aware that strncpy does not null-terminate the destination when the source is too long, so you must add the terminator yourself.
Sure, if you don't allocate enough space for a buffer, then you certainly can:
char* ptr = (char*)malloc(3);
strcpy(ptr, "this is very, very bad"); /* ptr only has 3 bytes allocated! */
However, what's really bad is that this code could work without giving you any errors, but it may overwrite some memory somewhere that could cause your program to blow up later, seemingly randomly, and you could have no idea why. Those are the source of hours (sometimes even days) of frustration, which anyone who's spent any significant amount of time writing C will tell you.
That is why with C, you have to be extremely careful with such things, and double, triple, nth degree check your code. After that, check it again.
Some other approaches are
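#include <stdio.h>
#include <string.h>
#define MAX_LENGTH_NAME 256
void foo(void)
{
    char a[MAX_LENGTH_NAME+1]; // You can also use malloc here
    strncpy(a,"Foxy",MAX_LENGTH_NAME); // copies at most MAX_LENGTH_NAME bytes
    a[MAX_LENGTH_NAME] = '\0'; // strncpy does not terminate on truncation
    snprintf(a,sizeof a,"%s","Foxy"); // always terminates within sizeof a
}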
So it's good to know the size of the allocated memory and then use calls that respect it to avoid buffer overflows.
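Knowing the size also lets you detect truncation instead of silently cutting the string. snprintf returns the length the full result would have had, so comparing that against the buffer size tells you whether everything fit. This is a minimal sketch, and copy_checked is a hypothetical name.

```c
#include <stdio.h>

/* Hypothetical helper: copies src into dst (capacity dst_size bytes),
 * always null-terminating when dst_size > 0. Returns 1 if src fit
 * completely, 0 if it was truncated or dst_size is 0. */
int copy_checked(char *dst, size_t dst_size, const char *src)
{
    if (dst_size == 0)
        return 0;
    /* snprintf writes at most dst_size - 1 characters plus '\0' and
     * returns the number of characters src would have needed. */
    int n = snprintf(dst, dst_size, "%s", src);
    return n >= 0 && (size_t)n < dst_size;
}
```

For a char small[5], copy_checked(small, sizeof small, "thisisahugestring") leaves "this" in small and returns 0, so the caller can decide how to handle the oversized input.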
Static analysis of already written code may point out these kinds of mistakes and you can change it too.

Resources