Is null character supposed to cause stack smashing? - c

I'm testing when stack smashing is detected, and I noticed that it is not detected when I write a null character just after a char buffer.
In this example, I write characters into a char buffer, and then write the null character at the end (compiled with -fstack-protector):
#include <stdio.h>
int main(int argc, char *argv[])
{
char buf[8];
char c;
printf("Enter a string: ");
int i;
for (i = 0; (c = getchar()) != '\n'; i++) buf[i] = c;
buf[i] = '\0';
printf("string = [%s]\n", buf);
return 0;
}
I noticed that writing 9 characters will give me *** stack smashing detected ***: terminated as expected, but writing 8 characters does not. If I input 8 characters in this example, wouldn't the null character be written after the buffer and smash the stack?
I tried writing some other character like 'A' instead of the null character, and then I do get stack smashing detected. Is the null character special in this regard, or is this behaviour unexpected?

I assume that you are aware that writing to the array out of bounds is undefined behavior. Consequently, what happens when executing your code depends on the system being used.
Further, the -fstack-protector option is not defined by the C standard so the standard can't tell exactly what it does. You need to read the documentation for your specific compiler to see if the details are explained. Again the point is... what happens when executing your code depends on the system being used.
A way to detect stack smashing is to put some kind of magic pattern on the current stack frame and just before the function returns, it's checked that the magic pattern is still there. In other words - if the pattern has changed we have "stack smashing"; otherwise all is good.
So for fun I took your code and ran it - with a small modification - on my system.
#include <stdio.h>
int main(void)
{
char buf[8];
char c;
// NOTICE
printf("buf is at: %p\n", (void*)buf);
printf("c is at: %p\n", (void*)&c);
printf("Enter a string: ");
int i;
for (i = 0; (c = getchar()) != '\n'; i++) buf[i] = c;
// NOTICE
if (i == 8 && buf[i] == '\0') puts("Already NUL");
buf[i] = '\0';
printf("string = [%s]\n", buf);
return 0;
}
And I gave the input "12345678" and got:
buf is at: 0x7ffe562b8450
c is at: 0x7ffe562b844b
Enter a string: 12345678
Already NUL
string = [12345678]
Two things to notice:
the variable c is located at a lower address than buf so writing after buf will not end up in c
It printed "Already NUL" so the byte just after buf already contained the value that the program writes when doing buf[i] = '\0'; so the write does not change any value. So the code detecting "stack smashing" will not see any change of stack data.
Once again I changed your code, so that it writes an A instead of a NUL in case there already is a NUL just after buf.
#include <stdio.h>
int main(void)
{
char buf[8];
char c;
printf("buf is at: %p\n", (void*)buf);
printf("c is at: %p\n", (void*)&c);
printf("Enter a string: ");
int i;
for (i = 0; (c = getchar()) != '\n'; i++) buf[i] = c;
if (i == 8 && buf[i] == '\0')
{
puts("Already NUL");
buf[i] = 'A';
}
else
{
buf[i] = '\0';
printf("string = [%s]\n", buf);
}
return 0;
}
and I got:
buf is at: 0x7fffa405e070
c is at: 0x7fffa405e06b
Enter a string: 12345678
Already NUL
*** stack smashing detected ***: <unknown> terminated
So on my system it seems that writing a NUL just after buf will not be detected simply because that part of the stack already contained a NUL. So the write didn't change anything.
But just to repeat... all this depends on the system being used (due to undefined behavior) and depends on how the "stack smashing" is detected.

Your buffer of 8 has only room for 7 characters + 1 null terminator. Therefore you invoke undefined behavior if you attempt to write beyond that. The null character isn't special, undefined behavior means that anything can happen, including "the program seems to work fine". See What is undefined behavior and how does it work?
As a side note, any variable storing the result from getchar() should be declared as int. This is because getchar() may return EOF, which is an int not a char. And yes, it is incredibly stupid to standardize a function named getchar() and have it return an int, not a char. The whole of stdio.h is filled with library design faults and it should therefore be avoided in production-quality code.

Memory errors are unpredictable, so don't read too much into the fact that one didn't happen.
An error may occur when an attempt is made to access a location outside of the valid memory range, but it also may not.

Related

How the while statement is executed in C or how this array-referenced pointers work?

I started learning C and I had this exercise from the book "Prentice Hall - The C Programming Language".
Chapter 5 Exercise 3:
Write a pointer version of the function strcat that we showed in Chapter 2. strcat(s, t) copies the string t to the end of s.
I did the exercise but the first method that came up to my mind was:
void stringcat(char *s, char *t){
int i,j;
i = j = 0;
while(*(s+i) != '\0'){
printf("%d", i);
i++;
}
while ( (*(t+j)) != '\0'){
*(s+i) = *(t+j);
i++;
j++;
}
}
In main I had:
int main(){
char s[] = "Hola";
char t[] = "lala";
stringcat(s,t);
printf("%s\n", s);
}
At first sight I thought it was right but the actual output was Holalalaa.
Of course it was not the output that I expected, but then I coded this:
void stringcat(char *s, char *t){
int i,j;
i = j = 0;
while(*(s+i) != '\0'){
printf("%d", i);
i++;
}
while((*(s+i) = *(t+j)) != '\0'){
i++;
j++;
}
}
And the output was right.
But then I kept thinking about the first code, because it is very similar to the second one: why was the first output wrong? Is it something related to the while statement, or something with pointers? I found it really hard to understand because you can't see what's happening in the array.
Thanks a lot.
Your code has more than the one problem that you found, but let's start with it.
Actually you are asking why
/* ... */
while ((*(t+j)) != '\0') {
*(s+i) = *(t+j);
/* ... */
works differently than
/* ... */
while ((*(s+i) = *(t+j)) != '\0') {
/* ... */
I hope you see it already, now that both cases stand side by side, actually vertically ;-). In the first case the value of t[j] is compared before it is copied to s[i]. In the second case the comparison is done after the copy. That's why the second case copies the terminating '\0' to the target string, and the first case does not.
The output you get comes about by accident; it is undefined behavior, since you are writing beyond the end of the target array. Fortunately for you, both strings are lying in sequence in memory, and you are overwriting the source string with its own characters.
Because your first case does not copy the '\0', the final printf() outputs more characters until a '\0' is encountered. By chance this is the last 'a'.
As others commented, the target string has not enough space for the concatenated string. Provide some more space like this:
char s[10] = "Hola"; /* 10 is enough for both strings and the terminating '\0'. */
However, if you had done this already, the error would have not been revealed, because the last 6 characters of s are initialized with '\0'. Not copying the terminating '\0' makes no difference. You can see this if you use
char s[10] = "Hola\0xxxx";
I don't think that your solution is the expected one. Instead of s[i] you are using *(s + i), which is essentially the same, accessing an array. Consider changing s (and in the course, t) in the function and use just *s.
Side note: The printf() in the function is most probably a leftover from debugging. But I'm sure you know.

Loop through user input with getchar

I have written a small script to detect the full value from the user input with the getchar() function in C. As getchar() only returns the first character i tried to loop through it... The code I have tried myself is:
#include <stdio.h>
int main()
{
char a = getchar();
int b = strlen(a);
for(i=0; i<b; i++) {
printf("%c", a[i]);
}
return 0;
}
But this code does not give me the full value of the user input.
You can do looping part this way
int c;
while((c = getchar()) != '\n' && c != EOF)
{
printf("%c", c);
}
getchar() returns int, not char. And it only returns one char per call. It does, however, return EOF once input terminates.
You do not check for EOF (you actually cannot detect it reliably when you store the result of getchar() in a char).
a is a char, not an array, neither a string, you cannot apply strlen() to it.
strlen() returns size_t, which is unsigned.
Enable most warnings, your compiler wants to help you.
Sidenote: char can be signed or unsigned.
Read a C book! Your code is soo broken and you confused multiple basic concepts. - no offense!
For a starter, try this one:
#include <stdio.h>
int main(void)
{
int ch;
while ( 1 ) {
ch = getchar();
x: if ( ch == EOF ) // done if input terminated
break;
printf("%c", ch); // %c takes an int-argument!
}
return 0;
}
If you want to terminate on other strings, too, #include <string.h> and replace line x: by:
if ( ch == EOF || strchr("\n\r\33", ch) )
That will terminate if ch is one of the chars listed in the string literal (here: newline, return, ESCape). However, it will also match the terminating '\0' (not sure if you can enter that anyway).
Storing that into an array is shown in good C books (at least you will learn how to do it yourself).
Point 1: In your code, a is not of array type. you cannot use array subscript operator on that.
Point 2: In your code, strlen(a); is wrong. strlen() calculates the length of a string, i.e, a null terminated char array. You need to pass a pointer to a string to strlen().
Point 3: getchar() does not loop for itself. You need to put getchar() inside a loop to keep on reading the input.
Point 4: getchar() returns an int. You should change the variable type accordingly.
Point 5: The recommended signature of main() is int main(void).
Keeping the above points in mind, we can write a rough sketch, which will look something like
#include <stdio.h>
#define MAX 10
int main(void) // nice signature. :-)
{
char arr[MAX] = {0}; //to store the input
int ret = 0;
for(int i=0; i<MAX; i++) //don't want to overrun array
{
if ( (ret = getchar())!= EOF) //yes, getchar() returns int
{
arr[i] = ret;
printf("%c", arr[i]);
}
else
;//error handling
}
return 0;
}
getchar() : gets a char (one character), not a string like you want
use fgets() to get a string; gets() (not recommended) and scanf() (not recommended) are the alternatives
but first you need to allocate space for the string : char S[50]
or use a malloc ( #include<stdlib.h> ) :
char *S;
S=(char*)malloc(50);
It looks like you want to read a line (your question mentions a "full value" but you don't explain what that means).
You might simply use fgets for that purpose, with the limitation that you have to provide a fixed size line buffer (and handle - or ignore - the case when a line is larger than the buffer). So you would code
char linebuf[80];
memset (linebuf, 0, sizeof(linebuf)); // clear the buffer
char* lp = fgets(linebuf, sizeof(linebuf), stdin);
if (!lp) {
// handle end-of-file or error
}
else if (!strchr(lp, '\n')) {
/// too short linebuf
}
If you are on a POSIX system (e.g. Linux or MacOSX), you could use getline (which dynamically allocates a buffer). If you want some line editing facility on Linux, consider also readline(3).
Avoid the obsolete gets like the plague.
Once you have read a line into some buffer, you can parse it (e.g. using manual parsing, or sscanf -notice the useful %n conversion specification, and test the result count of sscanf-, or strtol(3) -notice that it can give you the ending pointer- etc...).

Trouble \0 null terminating a string (C)

I seem to have some trouble getting my string to terminate with a \0. I'm not sure if this the problem, so I decided to make a post.
First of all, I declared my strings as:
char *input2[5];
Later in the program, I added this line of code to convert all remaining unused slots to become \0, changing them all to become null terminators. Could've done with a for loop, but yea.
while (c != 4) {
input2[c] = '\0';
c++;
}
In Eclipse when in debug mode, I see that the empty slots now contain 0x0, not \0. Are these the same things? The other string where I declared it as
char input[15] = "";
shows \000 when in debug mode though.
My problem is that I am getting segmentation faults (on a Debian VM; it works on my Ubuntu 12.04 though). My GUESS is that because the string hasn't really been terminated, the compiler doesn't know when it stops and thus continues to try to access memory in the array when it is clearly already out of bounds.
Edit: I will try to answer all other questions soon, but when I change my string declaration to the other suggested one, my program crashes. There is a strtok() function, used to chop my fgets input into strings and then putting them into my input2 array.
So,
input1[0] = 'l'
input1[1] = 's'
input1[2] = '\n'
input2[0] = "ls".
This is a shell simulating program with fork and execvp. I will post more code soon.
Regarding the suggestion:
char *input2[5]; This is a perfectly legal declaration, but it
defined input2 as an array of pointers. To contain a string, it needs
to be an array of char.
I will try that change again. I did try that earlier, but I remember it giving me another run-time error (seg fault?). I think it is because of the way I implemented my strtok() function though. I will check it out again. Thanks!
EDIT 2: I added a response below to update my progress so far. Thanks for all the help!
It is here.
Your code should rather look like this:
char input2[5];
for (int c=0; c < 4; c++) {
input2[c] = '\0';
}
0x0 and \0 are different representations of the same value, 0.
Response 1:
Thanks for all the answers!
I made some changes from the responses, but I reverted the char suggestion (or correct string declaration) because like someone pointed out, I have a strtok function. Strtok requires me to send in a char *, so I reverted back to what I originally had (char * input[5]). I posted my code up to strtok below. My problem is that the program works fine in my Ubuntu 12.04, but gives me a segfault error when I try to run it on the Debian VM.
I am pretty confused as I originally thought the error was because the compiler was trying to access an array index that is already out of bound. That doesn't seem like the problem because a lot of people mentioned that 0x0 is just another way of writing \000. I have posted my debug window's variable section below. Everything seems right though as far as I can see.. hmm..
Input2[0] and input[0], input[1 ] are the focus points.
Here is my code up to the strtok function. The rest is just fork and then execvp call:
int flag = 0;
int i = 0;
int status;
char *s; //for strchr, strtok
char input[15] = "";
char *input2[5];
//char input2[5];
//Prompt
printf("Please enter prompt:\n");
//Reads in input
fgets(input, 100, stdin);
//Remove \n
int len = strlen(input);
if (len > 0 && input[len-1] == '\n')
input[len-1] = ' ';
//At end of string (numb of args), add \0
//Check for & via strchr
s = strchr (input, '&');
if (s != NULL) { //If there is a &
printf("'&' detected. Program not waiting.\n");
//printf ("'&' Found at %s\n", s);
flag = 1;
}
//Now for strtok
input2[i] = strtok(input, " "); //strtok: returns a pointer to the last token found in string, so must declare
//input2 as char * I believe
while(input2[i] != NULL)
{
input2[++i] = strtok( NULL, " ");
}
if (flag == 1) {
i = i - 1; //Removes & from total number of arguments
}
//Sets null terminator for unused slots. (Is this step necessary? Does the C compiler know when to stop?)
int c = i;
while (c < 5) {
input2[c] = '\0';
c++;
}
Q: Why didn't you declare your string char input[5];? Do you really need the extra level of indirection?
Q: while (c < 4) is safer. And be sure to initialize "c"!
And yes, "0x0" in the debugger and '\0' in your source code are "the same thing".
SUGGESTED CHANGE:
char input2[5];
...
c = 0;
while (c < 4) {
input2[c] = '\0';
c++;
}
This will almost certainly fix your segmentation violation.
char *input2[5];
This is a perfectly legal declaration, but it defined input2 as an array of pointers. To contain a string, it needs to be an array of char.
while (c != 4) {
input2[c] = '\0';
c++;
}
Again, this is legal, but since input2 is an array of pointers, input2[c] is a pointer (of type char*). The rules for null pointer constants are such that '\0' is a valid null pointer constant. The assignment is equivalent to:
input2[c] = NULL;
I don't know what you're trying to do with input2. If you pass it to a function expecting a char* that points to a string, your code won't compile -- or at least you'll get a warning.
But if you want input2 to hold a string, it needs to be defined as:
char input2[5];
It's just unfortunate that the error you made happens to be one that a C compiler doesn't necessarily diagnose. (There are too many different flavors of "zero" in C, and they're often quietly interchangeable.)

Where does C store '\0'?

The below code reads a line and return the line length. lim is the length of the array s[].
When the input line length is lim, then s[lim] = '\0'. But the array s[] is only lim-length long, from s[0] to s[lim-1]. Will it cause a buffer overflow? I tested it many times, but the code seemed to work just fine.
int getline(char s[], int lim)
{
int c, i;
for(i = 0; i < lim-1 && ( c = getchar())!= EOF && c!= '\n'; i++)
s[i] = c;
if( c == '\n') {
s[i] = c;
++i;
}
s[i] = '\0';
return i;
}
The '\0' is just another character. It is stored right after the last character of the string.
Often, you can "get away" with writing off the end of a buffer with no obvious harm, but don't do it. It's a bug.
I once had to debug a program that contained an error like this. The program was writing a single byte past the end of one buffer. In the debug build, there was enough extra stuff on the stack that the single byte extra caused no harm; the crash only occurred in the release build, but the debugger didn't really work since it was the non-debug build. This is an example of why it is good to test your code both in a "debug" build and in a release build (compiled the way you would give it to your users).
This is a good example as to how to clearly define an interface - its input and returned value;
" int getline(char s[], int lim) "
One possible definition of "lim" is, maximum number of characters to be copied to s[], excluding the terminating null-character i.e. '\0'
Example:
char arr[] = "hello";
getline(arr, strlen(arr));
The other definition of "lim" is, Maximum number of characters to be copied into s[] (including the terminating null-character)
Example:
char arr[] = "hello";
getline(arr, sizeof(arr));
You seem to be supposing the 2nd definition of "lim".
This is a function straight out of "The C Programming Language" by K&R. It's from chapter one. It works because it is correct.
Consider "cat". This is a four character array {'c','a','t','\0'}. The length of the string is 3.
If s[]="cat" then s[0]='c', s[3]='\0'. Eh?
The string length returned by strlen or what have you is the number of array elements minus one, because the terminator is not counted. The array is allocated to hold all 4 characters. That's where the '\0' is, at the end of the array.
No, it won't cause a buffer overflow, as long as lim is the size of the array: the loop stops while i < lim-1, so the '\0' lands at index lim-1 at the latest. The '\0' is the null character (not the NULL pointer) and marks the end of the string. When you walk a string from the beginning, the position containing the '\0' character is never treated as a position containing valid data.
You could go over the whole array by using while(index < size) as a condition, or over the string by using while(array[position] != '\0')

Taking a string as input and storing them in a character array in C [closed]

I am stumped on how to store strings in an array in C, with each character kept separately. As an example, if the user inputs hellop, I want to store it in a given array, say userText, with userText[0] = h, userText[1] = e, userText[2] = l, and so on. I know this is easy stuff, but I'm still new. So if anyone could help, it would be great. Please explain how to do this using pointers.
#include<stdio.h>
void main()
{
char a[10],c;
int i=0;
while((c=getchar())!='\n')
{
scanf("%c",&a[i++]);
c=getchar();
}
for(i=0;i<11;i++)
printf("%c",a[i]);
}
The program outputs some garbage value (eoeoeoeo\363) when I type in hellop.
To read input I recommend using the fgets function. It's a nice, safe alternative to scanf.
First let's declare a buffer like so:
char user_input[20];
Then we can get user input from the command line in the following manner:
fgets(user_input, 20, stdin);
This will store at most 19 characters from the standard input in the array, followed by a null terminator, so the result always fits in the 20 bytes. Because the limit passed to fgets matches the size of the array declared earlier, the buffer cannot be overrun.
Then let's clear the pesky newline that's been entered into the string using strlen:
user_input[strlen(user_input) -1] = '\0';
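strlen returns the length of the string without the null terminator, so the last character is at index strlen(user_input) - 1. If the whole line fit into the buffer, that character is the newline (\n), and replacing it with a null terminator (\0) ends the string there. If the line was longer than the buffer, or the input ended without a newline, this would cut off a real character (and an empty string would give index -1), so it is safer to check that the last character really is '\n' first.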
Finally, let's print it using printf:
printf("The user has entered '%s'\n", user_input);
To use fgets and printf you will need to declare the following header:
#include <stdio.h>
For strlen we need another header, namely:
#include <string.h>
Job done.
P.S. If I may address the code you've added to your question.
main is normally declared as int main rather than void main which also requires that main returns a value of some sort. For small apps normally return 0; is put just before the closing brace. This return is used to indicate to the OS if the program executed successfully (0 means everything was OK, non-zero means there was a problem).
You are not null-terminating your string which means that if you were to read in any other way other than with a careful loop, you will have problems.
You take input from the user twice - once with getchar and then with scanf.
If you insist on using your code I've modified it a bit:
#include <stdio.h>
int main(void)
{
    char a[10];
    int c; /* int, so that EOF can be told apart from a character */
    int i = 0;
    /* take input until newline, EOF, or 9 characters (one slot is kept for \0) */
    while (i < 9 && (c = getchar()) != EOF && c != '\n')
        a[i++] = c;
    a[i] = '\0'; /* null-terminate the string */
    i = 0;
    while (a[i] != '\0') /* print until we've hit \0 */
        printf("%c", a[i++]);
    return 0;
}
It should now work.
To read a string into char array:
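char *a = NULL; /* getline() allocates the buffer for us */
size_t len = 0;
ssize_t read; /* getline() returns ssize_t: -1 on failure or end of input */
read = getline(&a, &len, stdin);
if (read != -1) {
    /* a now holds the line, including the '\n' if one was read */
}
//free memory
free(a);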
Your code is this (except I've added a bunch of spaces to improve its readability):
1 #include <stdio.h>
2 void main()
3 {
4 char a[10], c;
5 int i = 0;
6 while ((c = getchar()) != '\n')
7 {
8 scanf("%c", &a[i++]);
9 c = getchar();
10 }
11 for (i = 0; i < 11; i++)
12 printf("%c", a[i]);
13 }
Line-by-line analysis:
1. OK (now I've added the space between #include and <stdio.h>).
2. The main() function returns an int.
3. OK (it is hard to get an open brace wrong).
4. Since the return value of getchar() is an int, you need to declare c separately as an int.
5. OK.
6. Needs to account for EOF; should be while ((c = getchar()) != EOF && c != '\n'). You're still very open to buffer overflow, though.
7. OK.
8. Not OK. This reads another character from standard input, and doesn't check for EOF.
9. Not OK. This too reads another character from standard input. But when you go back to the top of the loop, you read another character. So, as things stand, if you type abcdefg at the program, c is assigned 'a' in the loop control, then a[0] is assigned 'b', then c is assigned 'c', then the loop repeats with a[1] getting 'e'. If I'd typed 6 characters plus newline, the loop would terminate cleanly. Because I claimed I typed 7 characters, the third iteration assigns 'g' to c, which is not newline, so a[2] gets the newline, and the program waits for more input with the c = getchar(); statement at the end of the loop.
10. OK (ditto close braces).
11. Not OK. You don't take into account early termination of the loop, and you unconditionally access a non-existent element a[10] of the array a (which only has elements 0..9; C is not BASIC!).
12. OK.
13. You probably need to output a newline after the for loop. You should return 0; at the end of main().
Because your input buffer is so short, it will be best to code a length check. If you'd used char a[4096];, I'd probably not have bothered you about it (though even then, there is a small risk of buffer overflow with potentially undesirable consequences). All of this leads to:
#include <stdio.h>
int main(void)
{
char a[10];
int c;
int i;
int n;
for (i = 0; i < sizeof(a) && ((c=getchar()) != EOF && c != '\n'); )
a[i++] = c;
n = i;
for (i = 0; i < n; i++)
printf("%c", a[i]);
putchar('\n');
return 0;
}
Note that neither the original nor the revised code null terminates the string. For the given usage, that is OK. For general use, it is not.
The final for loop in the revised code and the following putchar() could be replaced (safely) by:
printf("%.*s\n", n, a);
This is safe because the length is specified so printf() won't go beyond the initialized data. To create a null terminated string, the input code needs to leave enough space for it:
for (i = 0; i < sizeof(a)-1 && ((c=getchar()) != EOF && c != '\n')
a[i++] = c;
a[i] = '\0';
(Note the sizeof(a)-1!)
