scanf results in seg fault? - c

gcc compiles perfectly well, but as soon as scanf accepts a string it seg faults. I'm sort of at a loss. Here's the code.
char *line[256];
void *mainThread()
{
while (*line != "quit") {
scanf("%s", *line);
printf("%s", *line);
}
return;
}
Is there something about scanf I'm not understanding here?

First, you are allocating an array of pointers to characters, not an array of char:
char *line[256]; /* allocates 256 pointers to a character -
(pointers are initialized to NULL) */
You need to allocate an array of characters instead:
char line[256]; /* allocates 256 characters */
Second, you need to use strcmp to compare the strings - with !=, you are comparing the pointer (address) stored in line[0] (which is the same as *line) with a pointer to the string literal "quit", and they are always different.
You can use the following SSCCE as a starting point:
#include <string.h>
#include <stdio.h>
char line[256];
int main()
{
while (strcmp(line, "quit") != 0) {
scanf("%s", line);
printf("%s", line);
}
return 0;
}
Some additional notes:
See @Joachim's answer for an explanation of the actual cause of the segmentation fault.
You are declaring your function to return a void* pointer, but you are not returning anything (using return with no argument). You should then simply declare it as void.
Do not use scanf() to read input, since it might read more characters than you have allocated which leads to buffer overflows. Use fgets() instead. See also Disadvantages of scanf.
Always compile with all warnings enabled, and take them seriously - e.g. if you are using gcc, compile with -Wall -pedantic.

When declaring global variables, they are initialized to zero. And as you are declaring an array of pointers to char (instead of an array of char as I think you really intended) you have an array of 256 NULL pointers.
Using the dereference operator * on an array is the same as doing e.g. array[0], which means that as argument to both scanf and printf you are passing line[0], which as explained above, is a NULL pointer. Dereferencing NULL pointers, like scanf and printf will do, is a case of undefined behavior, one that almost always leads to a crash.

It is compiling because your program is syntactically correct. However, it has serious semantic errors which show up as a program crash due to a segfault. The global array line is initialized to zero, like any global variable. Since line is an array of pointers (not an array of characters, which is what was intended), the zeros are interpreted as NULL, the null pointer. *line is the same as line[0], and the string literal "quit" evaluates to a pointer to its first element. Therefore the while condition is the same as
while(NULL != "quit") // always true
Next, scanf("%s", *line); tries to write the input string into the buffer pointed to by line[0] which is NULL - a value which is unequal to the address of any memory location. This will result in segfault and cause the program to crash.
There are other mistakes in your code snippet. Let's take them one by one.
char *line[256];
The above statement defines an array line of 256 pointers to characters, i.e., its type is char *[256]. What you need is an array of characters -
char line[256];
You can't compare arrays in C with == or !=. What you can do is compare them element by element. For strings, you should use the standard library function strcmp. Also, please note that the %s conversion specifier in the format string of scanf reads a string from stdin and writes it into the buffer pointed to by the next argument. It puts a terminating null byte at the end, but it does not check for buffer overrun if you input a string too large for the buffer to hold. This would lead to undefined behaviour and most likely a segfault due to illegal memory access. You should guard against buffer overrun by specifying a maximum field width in the format string.
void *mainThread();
The above function declaration means that mainThread is a function that returns a pointer of type void * and takes an unspecified but fixed number and type of arguments, because empty parentheses mean that no information about the parameter list is provided. You should write void in the parameter list to mean that the function takes no arguments. Also note that the empty return statement in your function causes undefined behaviour if the caller uses the return value, because the function never actually returns one and the caller would read an indeterminate value. Assuming that you want to return the string, it should be defined as -
char line[256];
char *mainThread(void) {
while(strcmp(line, "quit") != 0) {
scanf("%255s", line); // -1 for the terminating null byte
printf("%s", line);
}
return line;
}

Related

Defining strings using pointers Vs. char arrays in C

I am confused about how pointers to characters work. when I run the following code, what happens?
int main()
{
char* word;
scanf("%s",word);
printf("%s",word);
}
The first line in main defines a pointer to char without initialization. scanf should store the word somewhere and give the address to the pointer, right? What if I input a big string, would it overwrite something in memory?
And what happens in the first line in the following code, other than defining a pointer to char? Does the compiler set some limits? Or I can't exceed the size specified, right? If I do, I will have a run time error, right? What is the difference between the two cases?
int main()
{
char word[100];
scanf("%s",word);
printf("%s",word);
}
What about pointers to other types? Can I just keep writing to the following places using offsets?
scanf should store the word somewhere and give the address to the pointer, right?
No. It is the other way around. You define the address where scanf shall store the value. As you fail to initialize the pointer to some valid address, you cause undefined behaviour that might result in a crash in best case or seem to work in worst case.
And what happens in the first line in the following code other than defining a pointer to char.
There is no pointer involved at all. An array is not a pointer. An array provides all the memory it needs to store all its members. A pointer doesn't do this.
Does the compiler set some limits? or I can't exceed the size specified, right?
You can write wherever you want. No one will prevent you from doing this. At least not from trying. If you write to some location that does not belong to the memory you allocated, you again cause undefined behaviour.
The function scanf requires that you pass it the address of a sufficiently large memory buffer for storing the string. If you don't do this, then you will be invoking undefined behavior (i.e. your program may crash).
Simply passing a wild pointer (i.e. an arbitrary memory address) is not sufficient. Rather, you must reserve the memory that you intend to use, for example by declaring an array or by using the function malloc.
Using the %s scanf conversion format specifier by itself is not a good idea, because even if the allocated memory buffer has a size of 100 characters, if the user types more than 99 characters (100 including the terminating null character), then the function will write to the array out of bounds, causing undefined behavior. Therefore, you should always limit the number of characters that are written, in this case by writing %99s instead of simply %s.
Also, before using the result of scanf, you should always check the return value of the function, and only use the result if the function was successful.
int main()
{
char word[100];
if ( scanf( "%99s", word ) == 1 )
printf( "%s\n", word );
else
printf( "input error!\n" );
}
what if I input a big string, would it overwrite something in the memory?
It doesn't have to be a "big" string. Writing even a "small" string to a wild pointer will cause undefined behavior and something important may be overwritten, or your program may crash.
And what happens in the first line in the following code other than defining a pointer to char. Does the compiler set some limits?
The line
char word[100];
will allocate an array of 100 characters, i.e. it will give you a memory buffer that is sufficiently large to store 100 characters. This does not give you a pointer. However, when using the array word in the line
scanf("%s",word);
the array word will decay to a pointer to the first element.
Does the compiler set some limits? or I can't exceed the size specified, right?
The compiler won't prevent you from writing to the array out of bounds, but if you allow this to happen, then your program will have undefined behavior (i.e. your program may crash). Therefore, you probably don't want to allow that to happen.
If done, I will have a run time error, right?
If you are lucky, then yes, your program will crash immediately and you will easily be able to identify and fix the bug. If you are unlucky, then no, your program won't crash, but will work as intended, and you won't notice the bug for a very long time, until much later in development, when one day the bug starts overwriting something important in your program. In that case, the bug will probably be hard to diagnose.
This is because C is not a memory-safe language.
However, because these kinds of bugs are often hard to find, there are tools which can help detect these kinds of bugs, such as valgrind and AddressSanitizer.
According to the description of the conversion specifier %s in the C Standard
If no l length modifier is present, the corresponding argument shall
be a pointer to the initial element of a character array large enough
to accept the sequence and a terminating null character, which will be
added automatically.
That is when you pass a pointer as an argument of the function that corresponds to the format %s it shall point to the first element of a character array where the input string will be stored. The character array shall be large enough to accommodate the entered string (including the appended terminating zero character '\0')
In the first program
int main()
{
char* word;
scanf("%s",word);
printf("%s",word);
}
the pointer word is uninitialized and has an indeterminate value. So these two statements
scanf("%s",word);
printf("%s",word);
invoke undefined behavior.
You need to provide a valid value of the pointer that will point to a character array. For example
char s[100];
char *word = s;
Or you can allocate memory dynamically like
char *word = malloc( 100 * sizeof( char ) );
In the second program
int main()
{
char word[100];
scanf("%s",word);
printf("%s",word);
}
the array word used as an argument is implicitly converted to a pointer to its first element. If you enter a string that fits in the array with 100 elements then the program will behave correctly.
However, if you enter 100 or more characters without embedded spaces then the program will again have undefined behavior.
To avoid such a situation you can limit the number of characters that can be read into the array word by specifying a maximum field width in the following way
scanf("%99s",word);
If you want to input a string that may have embedded spaces you should use another conversion specifier. For example
scanf("%99[^\n]", word );
or
scanf(" %99[^\n]", word );
Here are two demonstration programs that show the difference between the two conversion specifiers used to enter a string.
#include <stdio.h>
int main(void)
{
char word[100];
scanf( "%99s", word );
puts( word );
return 0;
}
If you enter the string
Hello Mohammed Elbagoury
then the program output will be
Hello
And the second program
#include <stdio.h>
int main(void)
{
char word[100];
scanf( "%99[^\n]", word );
puts( word );
return 0;
}
Again, if you enter
Hello Mohammed Elbagoury
then the program output will be
Hello Mohammed Elbagoury
If you enter more than 99 characters then only the first 99 characters will be stored in the array, followed by the terminating zero character '\0'.
As for this question of yours
Can I just keep writing to the following places using offsets?
you can use pointer arithmetic to store data at any position of an array. For example
int a[10];
scanf( "%d", a + 5 );
In this case a number will be written in the element of the array a[5].
The above statement is equivalent to
scanf( "%d", &a[5] );

Word given to the standard in using scanf isn't printed without error

I am trying to printf a simple string but I am not able to.
#include <stdio.h>
int main(){
char *word;
scanf("%s", &word);
printf("%s\n", word);
return 0;
}
When I insert the word my code breaks.
It just stops the program execution but doesn't give me any error.
What am I doing wrong?
Problem 1: you need to allocate space for your word.
Problem 2: Your scanf() syntax is incorrect for a character array.
Problem 3: scanf("%s", ...) itself is susceptible to buffer overruns.
SUGGESTED ALTERNATIVE:
#include <stdio.h>
#define MAXLEN 80
int main(){
char word[MAXLEN];
fgets(word, MAXLEN, stdin);
printf("%s", word);
return 0;
}
word needs space - i.e. memory
So change
char *word;
to
char word[1000]; // Or some other value as appropriate
And to prevent buffer overruns use
scanf("%999s", word); // 1 character for null!
BTW - you do not need the & here; the array word already converts to a pointer to its first element.
By char* word, you are creating a pointer to the string but not actually allocating memory for the string which is causing the code to break.
You can either use malloc to dynamically allocate memory or try something like char str[len] where len is large enough for the longest string you expect plus the terminating null byte.
In C, you need to manage the memory yourself.
char *word just points to a random memory address. No memory has been allocated. When you go to write to it the operating system won't let you and you get a memory access violation.
You need to allocate a specific amount of memory to store the characters, and the terminating null byte. "Strings" in C are arrays of characters, plus a null byte to end it. C doesn't know how much memory has been allocated to a variable.
char word[101];
scanf("%100s", word);
printf("%s\n", word);
That's space for 100 characters, plus the null byte. And scanf is restricted to read only 100 characters so it does not try to access someone else's memory.
Finally, word is already a pointer. No need to take its address.
word is an uninitialised pointer; you have provided no space into which scanf() can write.
Also &word has type char** where the %s format specifier requires a char*.
char word[32] ;
scanf("%31s", word);
printf("%s\n", word);
What am I doing wrong?
Other than in the program itself, the fundamental thing you're doing wrong is failing to compile with warnings enabled:
Why should I always enable compiler warnings?
If you were to compile your program with warnings enabled, you would get something like:
<source>:5:12: warning: format '%s' expects argument of type 'char *', but argument 2 has type 'char **' [-Wformat=]
    5 |     scanf("%s", &word);
      |            ~^   ~~~~~
      |             |   |
      |             |   char **
      |             char *
See this on GodBolt.
That may not tell you exactly what the problem is, but it will direct you to where you're doing something fishy and unexpected.
try this:
#include <stdio.h>
#include <string.h>
int main(void) {
char buffer[80];
fgets(buffer, sizeof(buffer), stdin);
printf("echo %s\n", buffer);
return 0;
}
In C you need to manage memory. In the example above we allocate a buffer of 80 bytes on the stack, ask the standard library to read up to 79 characters (fgets reserves one byte for the terminating null byte), and then print the output.
In your code snippet, you had an uninitialised pointer and then you were giving the address of that pointer (which is a stack address) to scanf. scanf will then proceed to write the input into your stack, starting at the pointer variable itself, which means that a long enough input will typically overwrite the return address of your function.
try this snippet:
#include <stdio.h>
int main(void) {
char *buffer;
printf("%p\n", (void *)&buffer); // This prints a stack address
return 0;
}
I was looking for some reference that would explain the concept of clobbering the stack by writing into it. This was the top link I could find.
https://en.wikibooks.org/wiki/X86_Disassembly/Calling_Conventions
When C calls a function, it will do the following:
push arguments into the stack
call the function, which results in the return address (and, typically, the saved frame pointer) being pushed onto the stack
push into the stack space for local variables (e.g. our char *word;) in your initial example.
On x86 the stack grows downward (i.e. toward lower addresses), so when you write into the memory region of a local variable without having reserved space for what you want to write, you are effectively overwriting the call stack.
It helps to have an idea of how to program a CPU in assembly, especially the stack and function call / return mechanics, in order to properly understand C. It is also extremely useful even when one is using higher-level languages; sooner or later one needs to understand what is going on under all those layers of abstraction.
A pointer is only a pointer: it holds nothing but the location of the object it points to.
char *pointer;
This declaration creates a pointer that does not reference (point to) anything yet.
You need to create an object and then assign the address of this object to the pointer.
Examples:
char *word = malloc(100);
char array[100];
char *word = array;
To scanf a string, the pointer must be assigned the address of an object which is large enough to accommodate this string.
char *word = malloc(100);
scanf("%99s", word); // at most 99 characters plus the terminating null byte
The difference between word and &word is explained here: Quick question about check allocation function algorithm (C)

C program using char* crashing

The goal is to count all the vowels from a char* the user puts in. The program has other functions and this is called from main.
I have also included stdio.h, stdbool.h, and string.h
char* countWord;
int vowels;
printf("Type the word to count vowels:");
scanf("%s", &countWord);
vowels = vowelCount(countWord);
printf("%d", vowels);
The following is the function I used. I also tried strlen(string) which caused a crash as well.
int vowelCount(char* string){
int vowels;
int i;
int size;
printf("function entered");
for (; *string; string++){
if (string[i] == 'a'){
vowels++;
} else if(string[i] == 'e'){
vowels++;
} else if(string[i] == 'i'){
vowels++;
} else if(string[i] == 'o'){
vowels++;
} else if(string[i] == 'u'){
vowels++;
}
}
return vowels;
}
What am I doing wrong? I'm new to C but have experience in other languages.
Thanks in advance.
Exercising undefined behavior...
You have not allocated any space for the pointer to point at, so when you're trying to use it, the behavior is undefined.
Simply make some space for it:
char buffer[1000];
char* countWord = buffer;
and there's another mistake:
scanf("%s", &countWord);
^
You shouldn't use an address of (&) operator here. Just drop it. You're reading a string into the target of the pointer, not the pointer itself.
Also note that you're doing some mixed code in your function. You're using an uninitialized variable i, yet that seems unnecessary since you're incrementing the pointer string. So you want to drop i and change the if statement to
if (*string == 'a')
And be sure to initialize vowels as well:
int vowels = 0;
From the scanf man page, the format specifier %s:
"Matches a sequence of non-white-space characters; the next pointer must be a pointer to char, and the array must be large enough to accept all the sequence and the terminating NUL character."
The pointer your code provides matching the sole %s specifier, however, is the address of the variable declared as a pointer to char: &countWord is of type pointer to pointer to char. Thus, scanf writes (or attempts to write) the matched sequence to a location sized for a pointer to char, not necessarily the sequence length + null terminator. This may thus write into unallocated memory, which is undefined behavior (and oftentimes, a segfault). Simply removing the address-of operator will not suffice on its own to resolve the issue, either, because simply declaring a pointer to char does not allocate the space for the characters you presumably want said pointer to point to at some point.
What you must do to read a string using scanf as you have attempted to is to ensure that sufficient space is allocated to store the sequence you will read, then pass scanf a pointer to that space. These could be allocated statically:
char countWord[512]; // Assumes input sequence will consist of no more than 511 characters, since space is needed for the terminating NUL character
Or dynamically:
char* countWord = malloc(sizeof(char) * 512); // Same size as the above, so input still must be no more than 511 characters, but dynamically allocated so will need to be explicitly freed later to avoid leaking memory
Given the space is allocated and sufficient, you can then pass countWord (which, defined in either of the manners shown above, is effectively a pointer to char, no address-of required*) to scanf.
If the sequence your scanf call is reading as input is under some defined input restriction, you can pass scanf a pointer to sufficient allocated space for the maximum allowable input sequence size, guaranteeing the read sequence will not exceed the allocated space.
This, however, depends on the input following said limits. Better would be for your code to limit how much it might read, so it really is guaranteed that you won't access unallocated memory, even if your input source decides not to behave as expected. scanf provides a mechanism to do this by including a field width with the specifier, e.g.:
scanf("%511s", countWord); // Reads at most 511 bytes of input into the location pointed to by countWord, plus the NUL terminator.
Obviously, instead of 511 characters of input you would choose a suitable number that all valid inputs to the program should fall within.
The other answer noting that variables should be initialized before being read from is also true, but crashes specifically are more likely to result from interacting with unallocated memory than allocated but merely uninitialized memory (not that reading from the latter isn't undefined behavior).
*Technically there are some differences between a char array declared explicitly as an array and a pointer to char, but those differences are not particularly relevant to this question.
Your lack of memory allocation for countWord was already mentioned in other answers.
But you also have a problem with usage of uninitialized variable while counting the vowels.
You iterate using string pointer and add some random value i on top of that.
int vowelCount(char* string){
int i; // <<=== not initialized, holding random garbage value
for (; *string; string++){
if (string[i] == 'a'){ // << adding random index to pointer.
...

Why does this C program which dereferences an aliased area of memory cause a segmentation fault?

#include <stdio.h>
#include <stdlib.h>
int main()
{
char *myptr = calloc(500,1);
char *myptr2 = myptr;
*myptr++ = 'A';
*myptr++ = 'B';
*myptr = '\0';
/* This should dereference the beginning of the memory and print until myptr[2] which is '\0' */
printf("Myptr2 points to: %s\n", *myptr2);
free(myptr);
return(EXIT_SUCCESS);
}
Why is line 13 (the printf line) creating a SIGSEGV? It should be pointing to the beginning of the memory and then printf should print until it hits the '\0'.
Can you tell me conceptually what the problem is? If you dereference a pointer to the memory, what does this cause?
When you have a %s slot in the format string, printf expects to see a char* as the corresponding argument. That's why you should pass myptr2 (which is the address of 'A' and from which the subsequent addresses of the string characters can be deduced).
If you pass *myptr2 instead, you are basically passing the character 'A' itself (with no information whatsoever as to where that particular 'A' is — which would have allowed printf to read the rest of the string). Simply put, printf expected a pointer there, so it attempts to treat the corresponding argument as a pointer.
Now notice that the character you passed (by dereferencing a char*, therefore getting a char with the value of 'A') is not a pointer at all. In a variadic call such as printf it is promoted to int, while a pointer typically has a size of 4 or 8 bytes. This means that printf will most likely treat the value 65 (the code of 'A' in ASCII), possibly combined with whatever other data it finds where it expects the argument, as an address. There can be no guarantees as to what can happen to the program in this case, so the whole incident invokes undefined behavior.
In your code, you should not dereference myptr2 in second argument of printf, so you have to replace:
printf("Myptr2 points to: %s\n", *myptr2);
with:
printf("Myptr2 points to: %s\n", myptr2);
When using %s with printf, you have to pass a pointer to the first character of the string.

Why doesn't gets() take a char pointer argument if it can take a char array?

Considering this code snippet:
#include <stdio.h>
int main()
{
char *s;
gets(s);
printf("%s",s);
return 0;
}
I get a runtime error in this case after entering some input at stdin. However if s is declared as an array, s[size], there is no issue. But considering the gets prototype, char *gets(char *s); shouldn't it work?
The gets function expects s to point to a character array that can accept a string. But in this case, s is uninitialized. So gets tries to dereference an uninitialized pointer. This invokes undefined behavior.
If you were to set s to point to a preexisting array or if you used malloc to allocate space, then you can write to it successfully.
In contrast, if s is defined as an array, it decays to a pointer to the first element of the array when passed to gets. Then gets is able to write to the array.
Note however that gets is unsafe because it makes no attempt to verify the size of the buffer you pass to it. If the user enters a string larger than the buffer, gets will write past the end of the buffer, which again invokes undefined behavior.
You should instead use fgets, which accepts the size of the buffer as a parameter.
