Closed. This question needs details or clarity. It is not currently accepting answers.
Want to improve this question? Add details and clarify the problem by editing this post.
Closed 8 years ago.
I have recently been reading about buffer overflow attacks, and I can't say that I have understood the big picture; I still have some doubts.
So, to settle my doubts: if my program is written in C and all of the code that reads input or copies/merges buffers checks bounds, can a buffer overflow still occur? Or, put directly, is input (wherever it comes from) the only means an attacker can use to cause a buffer overflow?
For example, consider the following code:
#include <stdio.h>

int main(void) {
    int size = 15;
    char buf[size];
    fgets(buf, size, stdin);
    printf("%s", buf);
}
Is it susceptible to buffer overflows?
Thank you!:)
Actually, there is an error in the code, and coding like that could be a security problem in certain applications! In short, checking return values matters.
Whilst it may be argued that this program is indeed safe, the bigger picture is about the pattern in the code and ensuring that its assumed invariant holds, namely that buf contains a null-terminated string between 0 and 14 bytes long.
From man page :
The fgets() function shall read bytes from stream into the array pointed to by s, until n-1 bytes
are read, or a <newline> is read and transferred to s, or an end-of-file condition is encountered.
The string is then terminated with a null byte.
RETURN VALUE
Upon successful completion, fgets() shall return s. If the stream is at end-of-file, the
end-of-file indicator for the stream shall be set and fgets() shall return a null pointer.
If a read error occurs, the error indicator for the stream shall be set, fgets() shall return
a null pointer, and shall set errno to indicate the error.
If an attacker can arrange for an error condition, no null terminator may be appended to the string, and since buf has automatic storage and is therefore uninitialised, printf(3) may leak information. Think of Heartbleed.
As chux points out, initialising the automatically allocated buffer with buf[0] = '\0';, or declaring buf static so that the system initialises it to 0, ought not to be relied upon, because in the event of an error the state of buf is undefined.
So a check on the return value of fgets is necessary, something more like:
{
    char *s;
    if ((s = fgets(buf, sizeof buf, stdin)) != NULL) {
        puts(s);
    }
}
Here's a link to an article on secure programming, which may be of interest http://www.dwheeler.com/secure-programs/
The use of 'fgets' does prevent the buffer overflow. According to the man page:
The fgets() function reads at most one less than the number of characters specified by size from the given stream and stores them in the string str. Reading stops when a newline character is found, at end-of-file or error. The newline, if any, is retained. If any characters are read and there is no error, a `\0' character is appended to end the string.
Note that this protection holds only if the size you pass is correct. If you set the size larger than the actual buffer, fgets can pull in more data than the buffer can hold, leading to a buffer overflow. It is advisable to use
sizeof(buf)
(on the array itself, not on a pointer to it) to avoid going over the buffer size.
Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 3 years ago.
I declared a character pointer and used it to scan a string at run time; I don't know the number of characters that I'm going to enter, so I didn't use calloc or malloc. The program terminates when it reaches the line scanf("%s", NewMsg_au8).
I'm using CodeBlocks 17.12 editor.
I tried hard-coding one of the input cases with NewMsg_au8="0123456789ABCDEF"; and that works fine.
uint8 * NewMsg_au8;
scanf("%s",NewMsg_au8);//<==
printf("Your entered message is: %s\n",NewMsg_au8);
return NewMsg_au8;
gets(s) and scanf("%s", s) are both unsafe and potentially incorrect because:
with those calls as shown, there is no way for either function to determine the maximum number of characters to store into the array pointed to by s, hence overlong input will cause a buffer overrun leading to undefined behavior.
in your case, it is even worse as s is an uninitialized pointer, so both functions would try to store data at a random address in memory, causing undefined behavior in all cases.
gets() cannot be used safely; it was deprecated and then removed from the C Standard (as of C11).
However, scanf() can be given a limit with a numeric value between % and s:
#include <stdio.h>
#include <string.h>

char *read_string(void) {
    char buf[100];
    if (scanf("%99s", buf) == 1) {
        printf("Your entered message is: %s\n", buf);
        return strdup(buf);  /* return an allocated copy of the input string */
    } else {
        /* no input, probably at end of file */
        return NULL;
    }
}
Note how only 99 characters can be stored into the array buf to allow for the null byte terminator that marks the end of a C string. The %99s conversion specification lets scanf() store at most 100 bytes into buf, including the '\0' terminator.
That is a typical beginner's error. You do not store data in pointers (with gets() or scanf()) but in the buffers they point to.
Therefore, you have 2 solutions:
Use an array big enough to hold the data. You have to decide yourself what "big enough" means, according to the details of your application.
Use a pointer, and then allocate memory with malloc() - the size, again, you have to decide it. Do not forget to deallocate the memory when you no longer need it.
I tried hard-coding one of the input cases with NewMsg_au8="0123456789ABCDEF"; and that works fine.
That is normal, because in that case the compiler automatically allocates enough memory to hold the string.
Please always remember when working with strings: you always need to allocate an extra byte for the terminating null character - the mark of the end of the string. Otherwise, you will need to ask questions again :)
I am learning about heap overflow attacks and my textbook provides the following vulnerable C code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* record type to allocate on heap */
typedef struct chunk {
    char inp[64];            /* vulnerable input buffer */
    void (*process)(char *); /* pointer to function to process inp */
} chunk_t;

void showlen(char *buf)
{
    int len;
    len = strlen(buf);
    printf("buffer5 read %d chars\n", len);
}

int main(int argc, char *argv[])
{
    chunk_t *next;
    setbuf(stdin, NULL);
    next = malloc(sizeof(chunk_t));
    next->process = showlen;
    printf("Enter value: ");
    gets(next->inp);         /* unbounded read: this is the vulnerability */
    next->process(next->inp);
    printf("buffer5 done\n");
}
However, the textbook doesn't explain how one would fix this vulnerability. If anyone could please explain the vulnerability and a way(s) to fix it that would be great. (Part of the problem is that I am coming from Java, not C)
The problem is that gets() will keep reading into the buffer until it reads a newline or reaches EOF. It doesn't know the size of the buffer, so it doesn't know that it should stop when it hits its limit. If the line is 64 bytes or longer, this will go outside the buffer, and overwrite process. If the user entering the input knows about this, he can type just the right characters at position 64 to replace the function pointer with a pointer to some other function that he wants to make the program call instead.
The fix is to use a function other than gets(), so you can specify a limit on the amount of input that will be read. Instead of
gets(next->inp);
you can use:
fgets(next->inp, sizeof(next->inp), stdin);
The second argument to fgets() tells it to write at most 64 bytes into next->inp. So it will read at most 63 bytes from stdin (it needs to allow a byte for the null string terminator).
The code uses gets, which is infamous for its potential security problem: there's no way to specify the length of the buffer you pass to it, it'll just keep reading from stdin until it encounters \n or EOF. It may therefore overflow your buffer and write to memory outside of it, and then bad things will happen - it could crash, it could keep running, it could start playing porn.
To fix this, you should use fgets instead.
If you fill next->inp with more than 64 bytes, the extra bytes overwrite process, which lets an attacker set that pointer to whatever address they wish. The address could point to any function.
To fix this, simply ensure that at most 63 bytes (leaving one for the null terminator) are read into the array inp, for example by using fgets.
The function gets does not limit the amount of text that comes from stdin. If more than 63 chars come from stdin, there will be an overflow.
The gets discards the LF char, that would be an [Enter] key, but it adds a null char at the end, thus the 63 chars limit.
If inp is filled with 64 non-null chars (it can be written to directly), the showlen function may trigger an access violation, because strlen will search for the null char beyond the end of inp to determine the length, which is undefined behavior.
Using fgets would be a good fix for the first problem, but it also stores the LF char along with the null, so the new limit of readable text would be 62.
For the second, just be careful about what is written to inp.
In the various cases that a buffer is provided to the standard library's many string functions, is it guaranteed that the buffer will not be modified beyond the null terminator? For example:
char buffer[17] = "abcdefghijklmnop";
sscanf("123", "%16s", buffer);
Is buffer now required to equal "123\0efghijklmnop"?
Another example:
char buffer[10];
fgets(buffer, 10, fp);
If the read line is only 3 characters long, can one be certain that the 6th character is the same as before fgets was called?
The C99 draft standard does not explicitly state what should happen in those cases, but by considering multiple variations, you can show that it must work a certain way so that it meets the specification in all cases.
The standard says:
%s - Matches a sequence of non-white-space characters.252)
If no l length modifier is present, the corresponding argument shall be a
pointer to the initial element of a character array large enough to accept the
sequence and a terminating null character, which will be added automatically.
Here's a pair of examples that show it must work the way you are proposing to meet the standard.
Example A:
char buffer[4] = "abcd";
char buffer2[10]; // Note that this could be placed at what would be buffer+4
sscanf("123 4", "%s %s", buffer, buffer2);
// Result is buffer = "123\0"
// buffer2 = "4\0"
Example B:
char buffer[17] = "abcdefghijklmnop";
char* buffer2 = &buffer[4];
sscanf("123 4", "%s %s", buffer, buffer2);
// Result: buffer holds '1','2','3','\0','4','\0' followed by "ghijklmnop"
Note that the interface of sscanf doesn't provide enough information to really know that these were different. So, if Example B is to work properly, it must not mess with the bytes after the null character in Example A. This is because it must work in both cases according to this bit of spec.
So implicitly it must work as you stated due to the spec.
Similar arguments can be placed for other functions, but I think you can see the idea from this example.
NOTE:
Providing size limits in the format, such as "%16s", could change the behavior. By the specification, it would be functionally acceptable for sscanf to zero out a buffer to its limits before writing the data into the buffer. In practice, most implementations opt for performance, which means they leave the remainder alone.
When the intent of the specification is to do this sort of zeroing out, it is usually explicitly specified. strncpy is an example. If the length of the string is less than the maximum buffer length specified, it will fill the rest of the space with null characters. The fact that this same "string" function could return a non-terminated string as well makes this one of the most common functions for people to roll their own version.
As far as fgets, a similar situation could arise. The only gotcha is that the specification explicitly states that if nothing is read in, the buffer remains untouched. An acceptable functional implementation could sidestep this by checking to see if there is at least one byte to read before zeroing out the buffer.
Each individual byte in the buffer is an object. Unless some part of the function description of sscanf or fgets mentions modifying those bytes, or even implies their values may change e.g. by stating their values become unspecified, then the general rule applies: (emphasis mine)
6.2.4 Storage durations of objects
2 [...] An object exists, has a constant address, and retains its last-stored value throughout its lifetime. [...]
It's this same principle that guarantees that
#include <stdio.h>
int a = 1;
int main() {
printf ("%d\n", a);
printf ("%d\n", a);
}
attempts to print 1 twice. Even though a is global, printf can access global variables, and the description of printf doesn't mention not modifying a.
Neither the description of fgets nor that of sscanf mentions modifying buffers past the bytes that actually were supposed to be written (except in the case of a read error), so those bytes don't get modified.
The standard is somewhat ambiguous on this, but I think a reasonable reading of it is that the answer is: yes, it's not allowed to write more bytes to the buffer than it read+null. On the other hand, a stricter reading/interpretation of the text could conclude that the answer is no, there's no guarantee. Here's what a publicly available draft says about fgets.
char *fgets(char * restrict s, int n, FILE * restrict stream);
The fgets function reads at most one less than the number of characters specified by n from the stream pointed to by stream into the array pointed to by s. No additional characters are read after a new-line character (which is retained) or after end-of-file. A null character is written immediately after the last character read into the array.
The fgets function returns s if successful. If end-of-file is encountered and no characters have been read into the array, the contents of the array remain unchanged and a null pointer is returned. If a read error occurs during the operation, the array contents are indeterminate and a null pointer is returned.
There's a guarantee about how much it is supposed to read from the input, i.e. stop reading at newline or EOF and not read more than n-1 bytes. Although nothing is said explicitly about how much it's allowed to write to the buffer, the common knowledge is that fgets's n parameter is used to prevent buffer overflows. It's a little strange that the standard uses the ambiguous term read, which may not necessarily imply that fgets can't write more than n bytes to the buffer, if you want to nitpick on the terminology it uses. But note that the same "read" terminology is used about both issues: the n-limit and the EOF/newline limit. So if you interpret the n-related "read" as a buffer-write limit, then [for consistency] you can/should interpret the other "read" the same way, i.e. not write more than what it read when the string is shorter than the buffer.
On the other hand, if you distinguish between the uses of the phrasal verb "read into" (="write") and just "read", then you can't read the committee's text the same way. You are guaranteed that it won't "read into" (="write to") the array more than n bytes, but if the input string is terminated sooner by newline or EOF you're only guaranteed that the rest (of the input) won't be "read"; whether that implies it won't be "read into" (="written to") the buffer is unclear under this stricter reading. The crucial keyword is "into", which is elided, so the problem is whether the completion given by me in brackets in the following modified quote is the intended interpretation:
No additional characters are read [into the array] after a new-line character (which is retained) or after end-of-file.
Frankly a single postcondition stated as a formula (and would be pretty short in this case) would have been a lot more helpful than the verbiage I quoted...
I can't be bothered to try and analyze their writeup about the *scanf family, because I suspect it's going to be even more complicated given all the other things that happen in those functions; their writeup for fscanf is about five pages long... But I suspect a similar logic applies.
is it guaranteed that the buffer will not be modified beyond the null
terminator?
No, there's no guarantee.
Is buffer now required to equal "123\0efghijklmnop"?
Yes. But that's only because you've passed correct parameters to your string-related functions. Should you mess up the buffer length, the input modifiers to sscanf and such, your program will still compile, but it will most likely fail at run time.
If the read line is only 3 characters long, can one be certain that the 6th character is the same as before fgets was called?
Yes. Once fgets() figures out that you have a 3-character input string, it stores the input in the provided buffer, and it doesn't care about the rest of the provided space at all.
Is buffer now required to equal "123\0efghijklmnop"?
Here buffer just holds the string "123", guaranteed to be terminated by NUL.
Yes, the memory allocated for the array buffer is not deallocated; you are only restricting the string so that at most 16 characters can be read into it at any one time. How much is actually written, a single char or the maximum the buffer can take, depends on the input.
For example:
char buffer[4096] = "abc";
actually does something like the following:
memcpy(buffer, "abc", sizeof("abc"));
memset(&buffer[sizeof("abc")], 0, sizeof(buffer)-sizeof("abc"));
The standard requires that if any part of a char array is initialized, the remaining elements are zero-initialized, up to the array's boundary.
There are no guarantees from the standard, which is why it is recommended to use sscanf and fgets with respect to the size of the buffer, as you show in your question (and fgets is considered preferable to gets).
However, some standard functions use null-terminator in their work, e.g. strlen (but I suppose you ask about string modification)
EDIT:
In your example
fgets(buffer, 10, fp);
the characters after the 10th are guaranteed to be left untouched (fgets does not consider the existing content or length of buffer)
EDIT2:
Moreover, when using fgets keep in mind that '\n' will be stored in the buffer, e.g.
"123\n\0fghijklmnop"
instead of expected
"123\0efghijklmnop"
Depends on the function in use (and to a lesser degree its implementation). sscanf will start writing when it encounters its first non-whitespace character, and continue writing until its first whitespace character, where it will add a finishing 0 and return. But a function like strncpy (famously) zeroes out the rest of the buffer.
There is however nothing in the C standard which mandates how these functions behave.
We know that stdin is, by default, a buffered input; the proof of that is in usage of any of the mechanisms that "leave data" on stdin, such as scanf():
#include <stdio.h>

int main(void)
{
    char c[10] = {'\0'};
    scanf("%9s", c);
    printf("%s, and left is: %d\n", c, getchar());
    return 0;
}
./a.out
hello
hello, and left is: 10
10 being newline of course...
I've always been curious, is there any way to "peek" at the stdin buffer without removing whatever may reside there?
EDIT
A better example might be:
scanf("%9[^.]", c);
With an input of "at.ct", now I have "data" (ct\n) left on stdin, not just a newline.
Portably, you can get the next character in the input stream with getchar() and then push it back with ungetc(), which results in a state as if the character wasn't removed from the stream.
The ungetc function pushes the character specified by c (converted to an unsigned char) back onto the input stream pointed to by stream. Pushed-back characters will be returned by subsequent reads on that stream in the reverse order of their pushing.
Only one character of pushback is guaranteed by the standard, but usually, you can push back more.
As mentioned in the other answers and the comments on them, in practice you can almost certainly peek at the buffer if you provide your own buffer with setvbuf, although that is not without problems:
If buf is not a null pointer, the array it points to may be used instead of a buffer allocated by the setvbuf function
that leaves the possibility that the provided buffer may not be used at all.
The contents of the array at any time are indeterminate.
that means you have no guarantee that the contents of the buffer reflects the actual input (and it makes using the buffer undefined behaviour if it has automatic storage duration, if we're picky).
However, in practice the principal problem would be finding out where in the buffer the not-yet-consumed part of the buffered input begins and where it ends.
If you want to look at the stdin buffer without changing it, you could tell the stream to use another buffer with setbuf, using an array you can access:
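char buffer[BUFSIZ];
setbuf(stdin, buffer);       /* setbuf() returns nothing; use setvbuf() to detect failure */
getchar();                   /* makes the stream fill the buffer */
printf("%.15s\n", buffer);   /* at most 15 bytes; the buffer need not be null-terminated */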
This lets you see more than ungetc does, but I don't think you can go further in a portable way.
Actually, this compiles, but the standard does not guarantee the result. Quoting what it says about setvbuf (setbuf behaves the same way):
The contents of the array at any time are indeterminate.
So this is not what you need if you're looking for complete portability and standard compliance, although I can't imagine why the buffer would not contain what is expected. In any case, it works on my computer.
Beware that you have to provide an array of at least BUFSIZ characters to setbuf, and you must not do any I/O operation on the stream before it. If you need more flexibility, take a look at setvbuf.
You could set your own buffer with setvbuf on stdin, and peek there whenever you want.
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 8 years ago.
#include <stdio.h>
int main(void)
{
char str[100]="88888888888888";
char t[20]="";
gets(t);
puts(str);
puts(t);
return 0;
}
The first line
555555555555555555555555555555555
is put in.
Why is str 55555555555? Why isn't str 88888888888888888 or 55555555555588888?
You overran the t buffer and reached the str buffer, where the rest of the input and the null terminator were stored. And puts prints only up to the null terminator.
Pretty much looks like that:
[ t (20) ][str(100) ]
55555555555555555555 5555555555555\0
Note that although t is declared as char[20], when you print it you get the full input (longer than 20), since puts stops at the null terminator (again).
BTW, this is a buffer overflow, not a stack overflow, but a stack overflow is possible with this code as well.
As Binyamin said, it is caused by the overflow you trigger because the input string is too long. However, it is somewhat random: sometimes the two arrays will be placed right next to each other and the string will extend into the neighbouring variable, and sometimes that will not happen.
I advise you to add guard conditions against this kind of overflow.
If you see in the gets documentation:
Notice that gets does not behave exactly as fgets does with stdin as
argument: First, the ending newline character is not included with
gets while with fgets it is. And second, gets does not let you specify
a limit on how many characters are to be read, so you must be careful
with the size of the array pointed by str to avoid buffer overflows.
In your case, if you do not know the size a priori, it may be a better idea to use fgets, as it is more secure (though a bit slower).
When you enter a string of more than 20 5s, it overruns the buffer that was allocated to t and extends into the buffer that was allocated to str.
Then it displays the contents of str, which is the string you entered, starting at the 21st character.
Finally, it displays the contents of t, but since the array t itself doesn't contain a null character, puts continues displaying memory (which is the buffer assigned to str) until it encounters the null character after all the 5s.
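Allocating the buffers at run time, as in this C++ alternative, keeps them in separate heap blocks, but it does not by itself prevent the overflow; the read still has to be limited to the size of t:
#include <cstdio>
#include <cstring>
#include <iostream>

int main(void)
{
    char *str = new char[100];   // new char(100) would allocate a single char
    char *t = new char[20];
    std::strcpy(str, "88888888888888");
    std::cin.getline(t, 20);     // stores at most 19 characters plus '\0'
    puts(str);
    puts(t);
    delete[] str;
    delete[] t;
    return 0;
}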