Characters versus arrays in C

Why do 1, 2, and 3 work when 4 generates a segmentation fault? (See below.)
char c[10];
char* d;
1.
scanf("%s", &c);
printf("%s\n", &c);
2.
scanf("%s", c);
printf("%s\n", c);
3.
scanf("%s", &d);
printf("%s\n", &d);
4.
scanf("%s", d);
printf("%s\n", d);

Repeating the code in the question:
char c[10];
char* d;
1.
scanf("%s", &c);
printf("%s\n", &c);
This is likely to work as expected, but in fact the behavior is undefined.
scanf with a "%s" format requires an argument of type char*. &c is of type char (*)[10], i.e., it's a pointer to a char[10] array. It points to the same location in memory as the address of the 0th element of c, but it's of a different type. The same thing happens with the printf: the "%s" format tells it to expect a char* argument, but you're passing it a char(*)[10] argument.
Since scanf is a variadic function, there's no required type checking for arguments other than the format string. The compiler will (probably) happily pass the char (*)[10] value to scanf, assuming that it can handle it. And it probably can, on an implementation where all pointers have the same size, representation, and argument-passing mechanism. But, for example, a C compiler for an exotic architecture could easily make char* pointers bigger than pointers to larger types. Imagine a CPU whose native address points to, say, a 64-bit word; a char* pointer might be composed of a word pointer plus a byte offset.
2.
scanf("%s", c);
printf("%s\n", c);
This is better. c is an array, but in this context an array expression "decays" to a pointer to the array's first element -- which is exactly what scanf with a "%s" format requires. The same thing happens passing c to printf. (But there are still some problems; I'll get to that after the other examples.)
3.
scanf("%s", &d);
printf("%s\n", &d);
Since d is a single char* object, &d is of type char**, and again, you're passing arguments of the wrong type. If all pointers have the same representation (and the same argument-passing mechanism), and the input for the scanf is short enough, this might happen to "work". It treats the char* object as if it were an array of char. If char* is 4 bytes, and the input string is no more than 3 characters long, this will probably work -- as if you had used a char[4] and written the calls correctly. But it's extremely poor practice to store character strings directly into a pointer object, and there's a huge risk of writing past the end of the object, with unpredictable results. (Those unpredictable results include writing into memory that isn't being used for anything else, which could appear to work; such is the nature of undefined behavior.)
(The C standard gives special permission to treat any object as an array of characters, but in this case it's a very bad idea.)
4.
scanf("%s", d);
printf("%s\n", d);
Here the types are all correct, but unless you've initialized d to point to a sufficiently large array of char, it's likely to fail spectacularly (or, worse, appear to work "correctly", which means you've got a subtle bug that will probably show up later).
And now we get to what I mentioned above about other problems.
For example 4, I mentioned that d needs to point to a "sufficiently large" array. How large is "sufficiently large"? There's no answer to that. scanf("%s", ...) reads a whitespace-delimited sequence of characters with no upper bound on its length. If I run your program and hold down the x key, for example, I can provide an input string longer than any buffer you've provided, with unpredictable results (undefined behavior again).
The scanf function's "%s" format cannot be used safely (unless your program runs in an environment where you can control what will appear on the standard input stream).
One good way to read text input is to use fgets to read a line at a time, then use other functions to analyze the result. fgets requires you to specify the maximum length of the input; if the actual input exceeds the limit, it's truncated and left to be read by later calls. It's not quite as convenient as scanf, but it can be done safely. (And never use the gets function; like scanf("%s", ...), it cannot be used safely.)
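The fgets approach described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the helper name read_line and its return convention are invented for this example, and the stream is passed as a parameter so the function can be tried on any FILE *.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line from `in` into `buf`, which has room
 * for `size` bytes (size must be greater than 0).
 * Returns 1 if something was read, 0 on end-of-file or error.
 * A trailing newline, if present, is removed. */
int read_line(char *buf, int size, FILE *in)
{
    /* fgets stores at most size - 1 characters plus the terminating '\0' */
    if (fgets(buf, size, in) == NULL)
        return 0;
    buf[strcspn(buf, "\n")] = '\0';   /* strip the newline, if any */
    return 1;
}
```

A caller would typically declare char buf[100]; and call read_line(buf, sizeof buf, stdin). Input longer than the buffer is not lost: the remainder is returned by the next call.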
Suggested reading:
Section 6 of the comp.lang.c FAQ does an excellent job of explaining C arrays and pointers, and how they're related (and not related). Section 12 discusses C standard I/O.
(I'm sorry this answer is so long; I didn't have time to make it shorter.)

You have undefined behavior in cases 3 and 4.
Cases 1 and 2 behave the same, as both pass the address at which the first element of the array is stored.
Case 3 is undefined, as you pass a pointer to pointer to char where a pointer to char is expected.
Case 4 is undefined, as the pointer d is not initialized.

3 works (on many platforms, and with a warning if you turn those on; technically it is undefined behavior) because you're abusing the pointer (treating &d, which is of type (char **), as (char *) and storing characters inside the memory intended for a pointer). 4 dies because the uninitialized pointer points to a random address.

The important question here is whether there is space in which to store the result.
scanf("%s", &c);
printf("%s\n", &c);
Is there storage? Yes, the address you take is that of the first element of the array. The array exists, so you can put the result there.
scanf("%s", c);
printf("%s\n", c);
Is there storage? Yes. Used like this, the array collapses into a pointer, which is passed same as above.
scanf("%s", &d);
printf("%s\n", &d);
Is there storage? Yes. It's not of the appropriate type, (char **, should be char *), but it shouldn't be any different than casting a char into a pointer type and storing it in a variable declared as a pointer. (Other answers say this is undefined behavior. I don't think it is, casting a char or any other integer type to a char * or other pointer type is well-defined, if ill-advised; show me where the standard says this is undefined.)
scanf("%s", d);
printf("%s\n", d);
Is there storage? Not that you've allocated. It could technically be the case that whatever happens to be in d points to a place in memory that won't segfault. Even if it does, it's not your memory and you could be overwriting something important, or it could change unexpectedly. You haven't told d where to find valid memory to point to, so you're playing pointer Russian roulette.
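As a small sketch of how case 4 becomes valid once d is given storage to point to: the helper name first_word is hypothetical, sscanf on a string stands in for scanf on standard input so the result is repeatable, and the %9s width is an addition that keeps the write inside the 10-byte array.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copies the first whitespace-delimited word of
 * `input` into `dest` (at least 10 bytes). Returns 1 on success, 0 if
 * no word could be read. */
int first_word(const char *input, char dest[10])
{
    char c[10];
    char *d = c;          /* d now points at real storage: the array c */

    /* %9s stores at most 9 characters plus the terminating '\0' */
    if (sscanf(input, "%9s", d) != 1)
        return 0;
    strcpy(dest, d);
    return 1;
}
```

The types are the same as in case 4 (a char * is passed for "%s"); the only difference is that the pointer holds the address of memory the program owns.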

Related

Defining strings using pointers Vs. char arrays in C

I am confused about how pointers to characters work. when I run the following code, what happens?
int main()
{
char* word;
scanf("%s",word);
printf("%s",word);
}
The first line in main defines a pointer to char without initialization. scanf should store the word somewhere and give the address to the pointer, right? What if I input a big string, would it overwrite something in the memory?
And what happens in the first line in the following code other than defining a pointer to char? Does the compiler set some limits? Or I can't exceed the size specified, right? If done, I will have a run time error, right? What is the difference between the two cases?
int main()
{
char word[100];
scanf("%s",word);
printf("%s",word);
}
What about pointers to other types? Can I just keep writing to the following places using offsets?
scanf should store the word somewhere and give the address to the pointer, right?
No. It is the other way around. You define the address where scanf shall store the value. As you fail to initialize the pointer to some valid address, you cause undefined behaviour, which might result in a crash in the best case or seem to work in the worst case.
And what happens in the first line in the following code other than defining a pointer to char.
There is no pointer involved at all. An array is not a pointer. An array provides all the memory it needs to store all its members. A pointer doesn't do this.
Does the compiler set some limits? or I can't exceed the size specified, right?
You can write wherever you want. No one will prevent you from doing this. At least not from trying. If you write to some location that does not belong to the memory you allocated, you again cause undefined behaviour.
The function scanf requires that you pass it the address of a sufficiently large memory buffer for storing the string. If you don't do this, then you will be invoking undefined behavior (i.e. your program may crash).
Simply passing a wild pointer (i.e. an arbitrary memory address) is not sufficient. Rather, you must reserve the memory that you intend to use, for example by declaring an array or by using the function malloc.
Using the %s scanf conversion format specifier by itself is not a good idea, because even if the allocated memory buffer has a size of 100 characters, if the user types more than 99 characters (100 including the terminating null character), then the function will write to the array out of bounds, causing undefined behavior. Therefore, you should always limit the number of characters that are written, in this case by writing %99s instead of simply %s.
Also, before using the result of scanf, you should always check the return value of the function, and only use the result if the function was successful.
#include <stdio.h>

int main(void)
{
    char word[100];
    /* %99s leaves room for the terminating null character */
    if ( scanf( "%99s", word ) == 1 )
        printf( "%s\n", word );
    else
        printf( "input error!\n" );
}
what if I input a big string, would it overwrite something in the memory?
It doesn't have to be a "big" string. Writing even a "small" string to a wild pointer will cause undefined behavior and something important may be overwritten, or your program may crash.
And what happens in the first line in the following code other than defining a pointer to char. Does the compiler set some limits?
The line
char word[100];
will allocate an array of 100 characters, i.e. it will give you a memory buffer that is sufficiently large to store 100 characters. This does not give you a pointer. However, when using the array word in the line
scanf("%s",word);
the array word will decay to a pointer to the first element.
Does the compiler set some limits? or I can't exceed the size specified, right?
The compiler won't prevent you from writing to the array out of bounds, but if you allow this to happen, then your program will have undefined behavior (i.e. your program may crash). Therefore, you probably don't want to allow that to happen.
If done, I will have a run time error, right?
If you are lucky, then yes, your program will crash immediately and you will easily be able to identify and fix the bug. If you are unlucky, then no, your program won't crash, but will appear to work as intended, and you won't notice the bug for a very long time, until much later in development, when one day the bug starts overwriting something important in your program. In that case, the bug will probably be hard to diagnose.
This is because C is not a memory-safe language.
However, because these kinds of bugs are often hard to find, there are tools which can help detect these kinds of bugs, such as valgrind and AddressSanitizer.
According to the description of the conversion specifier %s in the C Standard:
If no l length modifier is present, the corresponding argument shall
be a pointer to the initial element of a character array large enough
to accept the sequence and a terminating null character, which will be
added automatically.
That is, when you pass a pointer as the argument that corresponds to the %s conversion, it shall point to the first element of a character array where the input string will be stored. The character array shall be large enough to accommodate the entered string (including the appended terminating null character '\0').
In the first program
int main()
{
char* word;
scanf("%s",word);
printf("%s",word);
}
the pointer word is uninitialized and has an indeterminate value. So these two statements
scanf("%s",word);
printf("%s",word);
invoke undefined behavior.
You need to provide a valid value of the pointer that will point to a character array. For example
char s[100];
char *word = s;
Or you can allocate memory dynamically like
char *word = malloc( 100 * sizeof( char ) );
In the second program
int main()
{
char word[100];
scanf("%s",word);
printf("%s",word);
}
the array word used as an argument is implicitly converted to a pointer to its first element. If you enter a string that fits in the array with 100 elements, then the program will behave correctly.
However, if you enter 100 or more characters without embedded spaces, then the program will again have undefined behavior.
To avoid such a situation, you can limit the number of characters read into the array word by giving a maximum field width in the conversion specification, the following way
scanf("%99s",word);
If you want to input a string that may have embedded spaces you should use another conversion specifier. For example
scanf("%99[^\n]", word );
or
scanf(" %99[^\n]", word );
Here are two demonstration programs that show the difference between the two conversion specifiers used to enter a string.
#include <stdio.h>
int main(void)
{
char word[100];
scanf( "%99s", word );
puts( word );
return 0;
}
If you enter the string
Hello Mohammed Elbagoury
then the program output will be
Hello
And the second program
#include <stdio.h>
int main(void)
{
char word[100];
scanf( "%99[^\n]", word );
puts( word );
return 0;
}
Again, if you enter
Hello Mohammed Elbagoury
then the program output will be
Hello Mohammed Elbagoury
If you enter more than 99 characters, then only the first 99 characters will be stored in the array, followed by the terminating null character '\0'.
As for your question
Can I just keep writing to the following places using offsets?
you can use pointer arithmetic to store data at any position of an array. For example
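The truncation behavior can be seen on a smaller scale with the sketch below. It is illustrative only: the helper names are hypothetical, the widths 4 and 19 are chosen to make the cut-off visible, and sscanf reads from a string in place of scanf on standard input.

```c
#include <stdio.h>

/* Hypothetical helper: stores at most 4 characters of the first word of
 * `line` in `out`, which must hold at least 5 bytes. Returns 1 on success. */
int first_word_max4(const char *line, char out[5])
{
    return sscanf(line, "%4s", out) == 1;
}

/* Hypothetical helper: stores at most 19 characters, up to but not
 * including a newline, in `out`, which must hold at least 20 bytes. */
int first_line_max19(const char *text, char out[20])
{
    return sscanf(text, "%19[^\n]", out) == 1;
}
```

In both helpers the width is one less than the array size, which leaves room for the terminating null character.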
int a[10];
scanf( "%d", a + 5 );
In this case a number will be written into the array element a[5].
The above statement is equivalent to
scanf( "%d", &a[5] );
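The equivalence of a + 5 and &a[5] can be checked with a short sketch. The helper name store_at is hypothetical, and sscanf parses from a string here in place of scanf so the example does not depend on typed input.

```c
#include <stdio.h>

/* Hypothetical helper: parses an int from `text` and stores it in element
 * `index` of the array starting at `a`, addressing the element with
 * pointer arithmetic. The caller must ensure `index` is within bounds.
 * Returns 1 if a number was stored, 0 otherwise. */
int store_at(int *a, size_t index, const char *text)
{
    /* a + index has the same value and type as &a[index] */
    return sscanf(text, "%d", a + index) == 1;
}
```

The same rule as before applies: nothing stops an out-of-range index from being passed, and writing outside the array is undefined behavior.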

What is different between array String and common array in C?

#include<stdio.h>
int main()
{
char str[7];
int i;
scanf("%s",&str);
for(i=0; i<7; i++)
{
printf("%x(%d) : %c\n",&str[i], &str[i], str[i]);
}
printf("\n\n%x(%d) : %c or %s",&str, &str, str, str);
return 0;
}
I'm confused about pointers to C arrays because of arrays that hold strings.
Actually, I want to save each character of a single line of input.
It works, but I found something strange...
The main issue is that &str and &str[0] have the same address value.
But str gives a string value with %s,
and str[0] gives a char value with %c.
When I used str with %c, it has the first two numbers of str's address.
What is going on in the array?
Where is the real address of the string value?
And how can scanf("%s",&str) distribute the string to each char array space?
Input : 123456789
62fe40(6487616) : 1
62fe41(6487617) : 2
62fe42(6487618) : 3
62fe43(6487619) : 4
62fe44(6487620) : 5
62fe45(6487621) : 6
62fe46(6487622) : 7
62fe40(6487616) : # 123456789
This is result window of my code.
You are confused because the string and the array are the same thing. In memory there is only data (and pointers to that data).
When you allocate an integer or a buffer for a string, you reserve some of this memory. A string in C is defined as a sequence of bytes terminated by one byte with the value 0; its length is not stored anywhere. With a fixed-length array, you have a known size to work with.
The value you use to refer to the string is the pointer to its first character.
When you print with %c, printf expects a char (str[0]), not the pointer. When you print with %s, it expects a pointer to a sequence of chars.
printf("\n\n%x(%d) : %c or %s",&str, &str, str[0], str);
What is different between array String and common array in C?
An array is a contiguous sequence of objects of one type. [1]
A string is a contiguous sequence of characters terminated by the first null character. [2] So a string is simply an array of characters where we mark the end by putting a character with value zero. (Often, strings are temporarily held in larger arrays that have more elements after the null character.)
So every string is an array. A string is simply an array with two extra properties: Its elements are characters, and a zero marks the end.
&str and &str[0] have same address value.
&str is the address of the array. &str[0] is the address of the first element.
These are the same place in memory, because the first element starts in the same place the array does. So, when you print them or examine them, they will often appear the same. (Addresses can have different representations, the same way you might write “200” or “two hundred” or “2•10^2” for the same number. So the same address might sometimes look different. In most modern systems, an address is just a simple number for a place in memory, and you will not see differences. But it can happen.)
printf("%x(%d) : %c\n",&str[i], &str[i], str[i]);
This is not a correct way to print addresses. To print an address properly, convert it to void * and use %p [3]:
printf("%p(%p) : %c\n", (void *) &str[i], (void *) &str[i], str[i]);
printf("\n\n%x(%d) : %c or %s",&str, &str, str, str);
…
I used str with %c then it has first two numbers of str's address.
In the above printf, the third conversion specification is %c, and the corresponding argument is str. %c is intended to be used for a character [4], but you are passing it an argument that is a pointer. What may have happened here is that printf used the pointer you passed it as if it were an int. Then printf may have used a part of that int as if it were a character and printed that. So you saw part of the address shown as a character. However, it is a bit unclear when you write “it has the first two numbers of str's address”. You could show the exact output to clarify that.
Although printf may have used the pointer as if it were an int, the behavior for this is not defined by the C standard. Passing the wrong type for a printf conversion is improper, and other results can occur, including the program printing garbage or crashing.
And how can scanf("%s",&str) distribute String to each char array space?
The proper way to pass str to scanf for %s is to pass the address of the first character, &str[0]. C has a special rule for arrays like str: If an array is used in an expression other than as the operand of sizeof or the address-of operator &, it is converted to a pointer to its first element [5]. So, you can use scanf("%s", str), and it will be the same as scanf("%s", &str[0]).
However, when you use scanf("%s",&str), you are passing the address of the array instead of the address of the first character. Although these are the same location, they are different types. Recall that two different types of pointers to the same address might have different representations. Because scanf does not have knowledge of the actual argument type you pass it, it must rely on the conversion specifier. %s tells scanf to expect a pointer to a character [6]. Passing it a pointer to an array is improper.
C has this rule because some machines have different types of pointers, and some systems might pass different types of pointers in different ways. Nonetheless, often code that passes &str instead of str behaves as the author desired because the C implementation uses the same representation for both pointers. So scanf may actually receive the pointer value that it needs to make %s work [7].
Footnotes
1 C 2018 6.2.5 20. (This means the information comes from the 2018 version of the C standard, ISO/IEC 9899, Information technology—Programming Languages—C, clause 6.2.5, paragraph 20.)
2 C 2018 7.1.1 1. Note that the terminating null character is considered to be a part of the string, although it is not counted by the strlen function.
3 C 2018 7.21.6.1 8.
4 Technically, the argument should have type int, and printf converts it to unsigned char and prints the character with that code. C 2018 7.21.6.1 8.
5 C 2018 6.3.2.1 3. A string literal used to initialize an array, as in char x[] = "Hello";, is also not converted to a pointer.
6 C 2018 7.21.6.2 12.
7 Even if a C implementation uses the same representations for different types of pointers, that does not guarantee that using one pointer type where another is required will work. When a compiler optimizes a program, it relies on the program’s author having obeyed the rules, and the optimizations may change the program in ways that would not break a program that followed the rules but that do break a program that breaks the rules.
A string is just shorthand for a zero-terminated char array. So there is no difference between a string and a "normal" array.
Where is real address for String value??
Arrays are not pointers; they only decay to pointers. So there is no physical location in memory where the address of the array's first element is stored.
The main issue is &str and &str[0] have same address value.
That is not an issue: an array is a chunk of memory, so the address of this chunk is the same as the address of its first element. Only the types are different.
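One way to see "same address, different types" is to compare what adding 1 does to each pointer. The sketch below is illustrative only; the function names are hypothetical, and the 7-element array mirrors the str from the question.

```c
#include <stddef.h>

/* Bytes advanced by adding 1 to a pointer to the whole array (&str),
 * whose type is char (*)[7]. */
size_t step_of_array_pointer(void)
{
    char str[7];
    return (size_t)((char *)(&str + 1) - (char *)&str);
}

/* Bytes advanced by adding 1 to a pointer to the first element (&str[0]),
 * whose type is char *. */
size_t step_of_element_pointer(void)
{
    char str[7];
    return (size_t)((char *)(&str[0] + 1) - (char *)&str[0]);
}

/* Returns 1 if &str and &str[0] refer to the same location in memory. */
int same_location(void)
{
    char str[7];
    return (void *)&str == (void *)&str[0];
}
```

Both pointers start at the same place, but the array pointer steps over all 7 bytes at once while the element pointer steps over a single char.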

Why scanf works normally when using pointer to a pointer?

I'm wondering why this code can work. I'm assuming that scanf is assigning the value to the address of a pointer to a char. I know this expression is undefined, but why can printf, given a pointer, print the correct value?
int main() {
char* p;
p = (char*)malloc(sizeof(char));
scanf("%c", &p);
printf("%c", p);
return 0;
}
And the result is
c
c
p is a variable that holds a memory address, and memory addresses are surely longer than 1 byte. If you store a char value in this variable, the previous value (the malloc'ed memory block) will be lost. printf just treats your variable as a char variable and prints its contents. If you suspected that the char would be stored in the memory block obtained by malloc, no it wasn't.
Try this:
int main() {
char *p, *q;
p = q = (char*)malloc(sizeof(char));
scanf("%c", &p);
printf("%c\n%c\n", p, *q);
return 0;
}
With the scanf(), you are storing (rather forcibly) one byte into a variable that is more than one byte (sizeof(char *), likely 8 bytes on a 64-bit machine). With the printf(), you then read one byte (sizeof(char), always one by standard) of this variable of size sizeof(char *) (more than one byte) and print it. Your variable p is more space than is needed to store a char. Since the sizes don't line up, you're not sure which byte of p will be read by printf(). It could be the byte that scanf() wrote, or it could be garbage data. You just got lucky and printf() read the same byte that scanf() wrote.
If all this sounds a bit uncertain, it is because it involves undefined behaviour. You are using scanf() and printf() improperly, so they make no guarantees as to what will happen. In short, don't do this.
printf() and scanf() don't perform any special type checking on the source/destination given as an argument. They use fancy pointer arithmetic with the arguments on the stack to figure out where to read/write things as needed. After the compiler builds it, printf() and scanf() will not complain. Your compiler should have given you warnings that the types of the arguments given do not match the format string. If it didn't, you either have a bad/old compiler or you should enable more warnings with the command line option -Wall.
To hopefully help explain the other answers, it looks as if you were aiming to do the following - compare the differences then take another look at the other answers and see if that helps, as I'm not sure you're yet clear about what's happening in your code.
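#include <stdio.h>
#include <stdlib.h>

int main(void) {
    char *p = malloc(sizeof(char));
    if (p == NULL) return 1;
    if (scanf("%c", p) == 1)   /* p already is the address to store into */
        printf("%c", *p);      /* dereference p to get the char back */
    free(p);
    return 0;
}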

How to read a string into a void pointer?

I have been trying to read a string from the user into a void pointer in C. So I wrote something like the following:
void *ptr;
ptr = calloc(100,sizeof(char));
printf("Enter the string: ");
fgets(*((char *)ptr),100,stdin);
printf("You entered ");
puts(*((char *)ptr));
I know I may not be doing it the right way, so can anybody please help me show the right way of taking a string input in a void pointer?
I want something as
input:- Enter the string: welcome user
output:- You entered: welcome user
Just convert the void* to a char*:
void *ptr;
ptr = calloc(100,sizeof(char));
printf("Enter the string: ");
fgets((char*)ptr,100,stdin);
printf("You entered ");
puts((char*)ptr);
fgets and puts take a pointer as first argument, so you could use (char*)ptr to convert the pointer.
If you write *((char*)ptr) you treat the void pointer as a char pointer, but also dereference it with * which will give you the first character. This is not what you want here.
You need to remove the dereference operator * in front of the (char *) casts. Try this:
void *ptr;
ptr = calloc(100,sizeof(char));
printf("Enter the string: ");
fgets((char *)ptr,100,stdin);
printf("You entered ");
puts((char *)ptr);
You are dereferencing the pointer when you pass it to fgets and puts, which means you pass the value of the first character in the memory location. Since you used calloc for the allocation, you pass '\0' (the null character, zero).
Your ptr (and also (char *)ptr) points to some memory, i.e. its value is an address, a point in the RAM of your computer. fgets will copy chars from the console input to that memory location, byte by byte. Your expression *((char *)ptr) which you pass as parameter to fgets is, by contrast, the value stored at that location. After calloc() that value is zero, i.e. the character '\0'. fgets thinks that's a pointer, tries to access the memory at address 0, and crashes. The same would happen with puts, because under modern PC operating systems even a read access at "weird" addresses is not allowed for user programs. The solution is, as the others pointed out correctly, to omit the dereferencing "*".
By the way, did you not get a compiler warning? While C is not really type safe, it does have function prototypes, and most compilers warn against wrong parameter types when they are called.
What you want cannot be achieved because of the arguments you pass to fgets().
The function fgets() takes three arguments:
1. A pointer to the first cell of memory where the data is to be stored.
2. The maximum number of characters to be written to memory, including the terminating null character.
3. A stream pointer, from which to read.
What you are doing wrong is the first argument: you are not passing a pointer to fgets(). You are passing *((char *)ptr), which is of type char because of your unnecessary dereferencing. Removing the outer * will cure it, because the type then becomes char * and you can use it as a legal argument for fgets().
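Putting the answers above together, a corrected version might look roughly like the sketch below. The helper name read_line_alloc is hypothetical, the stream is a parameter so the function is not tied to stdin, and the allocation check is an addition that the question's code does not have.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: allocates `size` zeroed bytes (size must be greater
 * than 0) and reads one line from `in` into them. Returns the buffer as
 * void * (the caller frees it), or NULL if allocation or reading failed. */
void *read_line_alloc(FILE *in, int size)
{
    void *ptr = calloc((size_t)size, sizeof(char));
    if (ptr == NULL)
        return NULL;
    /* In C a void * converts to char * implicitly: no dereference needed */
    if (fgets(ptr, size, in) == NULL) {
        free(ptr);
        return NULL;
    }
    return ptr;
}
```

The pointer itself is passed to fgets, never the character it points to. An explicit (char *) cast is harmless but optional in C.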

Problems getting the length of a string in c

This is the code:
void main()
{ char strvek[500];
printf("Mata in ett stort tal: ");
scanf("%s", &strvek);
size_t len1 = strlen(strvek);
printf("%d",&len1);
}
The program ends up Printing the memory adress of len1. I want to store the length of the string in len1. If "hello" is entered I want to have the integer 5 for example.
There are three issues with your code:
1. scanf does take addresses, but since strvek is an array, it "decays" to a pointer when passed to a function, so no ampersand is needed,
2. Users can type more characters than your buffer holds, causing a buffer overflow, and
3. printf does not need an address for ints (your code has undefined behavior)
Here is how you fix the first two problems:
scanf("%499s", strvek); // Limit the size to 499 chars + '\0'; no ampersand in front of strvek
Here is how you fix the last problem:
printf("%zu", len1); // No ampersand; %zu is the conversion for a size_t such as len1
It may be a little hard at first to remember when to use an ampersand with I/O functions. Generally, remember that scanf needs an ampersand except for strings, and printf does not need an ampersand except the %p format specifier (in which case you need to convert the pointer to void*).
Because you are printing the address of len1 instead of its value.
Simply write (using %zu, since len1 is a size_t):
printf("%zu",len1);
What people have failed to mention in other answers (so far) is the reason why scanf wants you to pass addresses of values, while printf wants you to pass the values themselves.
In some sense, there's no technical reason why printf could not have been designed to take the addresses to values to print. If that were the convention, printf would simply go look up what is at that address (using the pointer dereference operator, *)...and print it.
Two things though:
That dereference is an "extra step" which is not needed; because just having a copy of the value itself is enough to transmit the information to printf. C likes to avoid extra steps when it can.
An address-based convention would prohibit using printf on literal values, which don't have addresses. You can't write printf("Value is %d", &10); and have it print Value is 10. This could be worked around by making a variable to store the value in and passing the address of the variable... but as I just said, C likes to avoid extra steps.
Yet with scanf, there is a technical reason why an address is required, and not a value. It needs to receive a place to put the data, such that the caller can look at that data later.
Think about reading in integers, for example. If you passed in an integer value of zero (instead of the address of an integer variable) and that's all scanf had to go on...how would it ever get the value it read back to you?
(In the particular example here, with an array of characters, there is a subtle issue regarding the lack of necessity of the address operator: see How come an array's address is equal to its value in C?...but ignore that and focus on the integer example for the concept. :-P)
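The difference between receiving a value and receiving an address can be sketched with two small functions. Both names are hypothetical and exist only to illustrate the argument; they are not part of any library.

```c
/* Receives a copy of the caller's int; the assignment changes only the
 * local copy and is lost when the function returns. */
void set_by_value(int x)
{
    x = 42;
    (void)x;   /* silence "set but not used" warnings */
}

/* Receives the address of the caller's int and writes through it, which
 * is the only way a function like scanf can hand data back this way. */
void set_by_address(int *x)
{
    *x = 42;
}
```

After int n = 0; set_by_value(n); the variable n is still 0, whereas set_by_address(&n); leaves 42 in n.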
It shall be
char strvek[500];
[...]
scanf("%s", strvek);
Even better do
scanf("%499s", strvek);
to prevent overflowing the buffer.
When scanning in a "string", that is, into a char array, passing the array itself to scanf() lets it decay to a pointer to its 1st element, which makes the & (address-of) operator unnecessary.
To print out a size_t typed variable do:
size_t len1 = ...;
printf("%zu", len1);
printf("%d",&len1) prints the address while printf("%d",len1) prints the length itself. The & operator means "address of"
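The fixes from the answers above can be combined into one small sketch. The helper name format_length is hypothetical, and snprintf writes into a caller-supplied buffer in place of printf writing to the terminal, so the result can be inspected.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: formats the length of `s` into `out` using %zu,
 * the conversion for size_t. Returns what snprintf returns: the number of
 * characters the full result needs, not counting the terminating '\0'. */
int format_length(const char *s, char *out, size_t out_size)
{
    size_t len = strlen(s);                       /* the value, not its address */
    return snprintf(out, out_size, "%zu", len);   /* no ampersand on len */
}
```

For the input "hello" the buffer ends up holding the text 5, which is the result the question asked for.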

Resources