char *strchr( const char *s, int c );
I understand that strchr locates the first occurrence of character c in string s. If c is found, a pointer to c in s is returned. Otherwise, a NULL pointer is returned.
So why does the code below set num to strlen(string) rather than doing what it's designed to do?
num=0;
while((strchr(string,letter))!=NULL)
{
num++;
string++;
}
But this code gives the correct output:
num=0;
while((string=strchr(string,letter))!=NULL)
{
num++;
string++;
}
I fail to see why assigning the returned pointer to another suitably qualified pointer even makes a difference. I'm only testing whether the return value is a NULL pointer or not.
string is a pointer.
In the first example, you just move it right one position, regardless of where (or if!) "letter" was found.
In the second example, every time you find a "letter", you:
a) update "string" to point to the letter, then
b) update "string" again to point just past the letter.
Let me try to put it a different way.
strchr returns a pointer to the located character, or a null pointer if the character does not occur in the string.
In the first snippet the return value is not captured. The pointer simply moves one character to the right on every iteration, and that new position is passed to strchr as the argument.
In short, the snippet counts the characters up to and including the last occurrence of letter:
const char *string = "hello";
char letter = 'l';
num=0;
while((strchr(string,letter))!=NULL)
{
num++;
string++;
}
Like this,
            +---+---+---+---+---+----+
            |'h'|'e'|'l'|'l'|'o'|'\0'|
            +---+---+---+---+---+----+
              ^
              |
+-------+     |
+string +-----+
+-------+

            +---+---+---+---+---+----+
            |'h'|'e'|'l'|'l'|'o'|'\0'|
            +---+---+---+---+---+----+
                  ^
                  |
+-------+         |
+string +---------+
+-------+

            +---+---+---+---+---+----+
            |'h'|'e'|'l'|'l'|'o'|'\0'|
            +---+---+---+---+---+----+
                      ^
                      |
+-------+             |
+string +-------------+
+-------+

            +---+---+---+---+---+----+
            |'h'|'e'|'l'|'l'|'o'|'\0'|
            +---+---+---+---+---+----+
                              ^
                              |
+-------+                     |
+string +---------------------+
+-------+
In the second snippet, the return value of strchr is stored back into string, and the address just past the match is passed as the argument in the next iteration:
const char *string = "hello";
char letter = 'l';
num=0;
while((string = strchr(string,letter))!=NULL)
{
num++;
string++;
}
Something like this,
+-------+
+string +-----+
+-------+     |
              |
  /*input pointer to strchr*/
              |
              v
            +---+---+---+---+---+----+
            |'h'|'e'|'l'|'l'|'o'|'\0'|
            +---+---+---+---+---+----+
                      |
                      |
  /*return pointer from strchr*/
                      |
+-------+             |
+string +<------------+
+-------+

+-------+
+string +-----------------+
+-------+                 |
                          |
  /*input pointer to strchr*/
                          |
                          v
            +---+---+---+---+---+----+
            |'h'|'e'|'l'|'l'|'o'|'\0'|
            +---+---+---+---+---+----+
                          |
                          |
  /*return pointer from strchr*/
+-------+                 |
+string +<----------------+
+-------+

+-------+
+string +---------------------+
+-------+                     |
                              |
  /*input pointer to strchr*/
                              |
                              v
            +---+---+---+---+---+----+
            |'h'|'e'|'l'|'l'|'o'|'\0'|
            +---+---+---+---+---+----+
  /*NULL return from strchr*/
+-------+                     |
+string +<--------------------+
+-------+
First code snippet:
So why does the code below set num to strlen(string) rather than doing what it's designed to do?
The output is not always strlen(string). It depends on the input string and on the character letter passed to strchr(). For example, if the input is
string = "hello"
letter = 'l'
then the output you will get is 4 which is not equal to the length of string "hello". If the input is
string = "hello"
letter = 'o'
then the output you will get is 5 which is equal to the length of string "hello". If the input is
string = "hello"
letter = 'x'
then the output you will get is 0.
The output actually depends on the position of the last occurrence of the character letter in the input string.
The reason is that only one statement modifies the string pointer:
string++;
It works like this:
If the character is present in the string, strchr() returns a non-null value as long as string points at or before the last occurrence of letter. Once string points one character past the last occurrence, strchr() returns NULL and the loop exits, so num equals the 1-based position of the last occurrence of letter. The output therefore lies in the range 0 to strlen(string); it is not always strlen(string). For example:
string = "hello", letter = 'e', num = 0
strchr(string, letter) returns non-null because 'e' is present in the input string
num++; string++;
string = "ello", letter = 'e', num = 1
strchr(string, letter) returns non-null because 'e' is present in the input string
num++; string++;
string = "llo", letter = 'e', num = 2
strchr(string, letter) returns null because it does not find 'e' in the input string, and the loop exits
Output will be 2
Second code snippet:
But this code gives the correct output
Yes. The reason is that the pointer returned by strchr() is assigned back to string. Take the same example as above and assume the input is
string = "hello"
letter = 'l'
strchr(string, letter) returns a pointer to the first occurrence of the character l, and that pointer is assigned to string. So string now points to the first l, which means the string is now
string = "llo"
and in the loop body you are doing
string++;
which makes string point to the character after the one returned by strchr(). Now the string is
string = "lo"
letter = 'l'
and strchr(string, letter) returns a pointer to the character string currently points to, because it matches letter. Due to string++ in the loop body, string then points to the next character
string = "o"
letter = 'l'
and strchr(string, letter) returns NULL, so the loop exits. num is incremented once for each occurrence of letter in string, which is why the second snippet gives the correct output.
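To compare the two loops side by side, here is a minimal sketch that wraps each one in a function. The names count_advancing and count_occurrences are made up for this illustration, and both functions assume letter is not '\0' (otherwise the pointer would step past the terminator).

```c
#include <stddef.h>
#include <string.h>

/* First snippet: the strchr() result is only tested, never stored, so the
   pointer advances one character per iteration. The result is the 1-based
   position of the last occurrence of letter, or 0 if it never occurs.
   Assumption: letter != '\0'. */
size_t count_advancing(const char *string, char letter)
{
    size_t num = 0;
    while (strchr(string, letter) != NULL) {
        num++;
        string++;
    }
    return num;
}

/* Second snippet: the pointer jumps to each match and then steps past it,
   so the result is the number of occurrences of letter.
   Assumption: letter != '\0'. */
size_t count_occurrences(const char *string, char letter)
{
    size_t num = 0;
    while ((string = strchr(string, letter)) != NULL) {
        num++;
        string++;
    }
    return num;
}
```

With "hello" and 'l', count_advancing returns 4 while count_occurrences returns 2.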
I cannot understand this unexpected C behavior where printing argv[0] prints ./a.out instead of the actual content of argv[0] which should be a memory address because argv is an array of pointers.
If I create an array called char name[] = "hello" then I would expect to see h at name[0] and if char * argv[] holds, for example 3 pointers (memory addresses) then logically argv[0] should be a memory address.
My reasoning is that if I wanted to access the actual content of the memory address that argv[0] points to I should need to do *(argv[0]). What is happening here? Is C doing some kind of magic here?
+------+------+------+---+---+----+
|  h   |  e   |  l   | l | o | \0 |
+------+------+------+---+---+----+
   ^---- name[0] = h

+------+------+------+
| 0xA7 | 0xCE | 0xC4 |
+------+------+------+
   ^---- argv[0] = should be 0xA7 (the value of `argv[0]`,
         not the value it points to)
#include <stdio.h>
int main(int argc, char * argv[]) {
char name[] = "hello";
printf("%c \n", name[0]); // expected h
printf("%s \n", argv[0]); // expected 0xA7 (but got ./a.out instead)
}
$ gcc main.c
$ ./a.out arg1 arg2 arg3
My reasoning is that if I wanted to access the actual content of the memory address that argv[0] points to I should need to do *(argv[0]). What is happening here?
printf is doing the dereferencing. When passed "%s" and a pointer (address), printf doesn't print the address. It prints what's at the address. Specifically, it prints ptr[0], ptr[1], ptr[2] etc until a zero is encountered.
(Keep in mind that ptr[i] is identical to *(ptr+i). I'm going to use the former since it's cleaner.)
Let's start with a simpler example.
char *name = "hello";
printf( "%s", name );
 name
+-------------------+       +-----+-----+-----+-----+-----+-----+
| [Some address 1] -------->| 'h' | 'e' | 'l' | 'l' | 'o' |  0  |
+-------------------+       +-----+-----+-----+-----+-----+-----+
Passing "%s", name passes the address of an array containing %s␀ and the address contained by name ("[Some address 1]"). The latter is the address of an array containing hello␀.
This tells printf to print name[0] (h), name[1] (e), name[2] (l), etc until a zero is encountered.
Now let's look at your case.
 argv
+-------------------+       +-------------------+       +-----+-----+-----+-----+-----+-----+-----+-----+
| [Some address 2] -------->| [Some address 3] -------->| '.' | '/' | 'a' | '.' | 'o' | 'u' | 't' |  0  |
+-------------------+       +-------------------+       +-----+-----+-----+-----+-----+-----+-----+-----+
                            | [Some address 4] -------->| 'a' | 'r' | 'g' | '1' |  0  |
                            +-------------------+       +-----+-----+-----+-----+-----+
                            | [Some address 5] -------->| 'a' | 'r' | 'g' | '2' |  0  |
                            +-------------------+       +-----+-----+-----+-----+-----+
                            | [Some address 6] -------->| 'a' | 'r' | 'g' | '3' |  0  |
                            +-------------------+       +-----+-----+-----+-----+-----+
                            |       NULL        |
                            +-------------------+
or just
 argv[0]
+-------------------+       +-----+-----+-----+-----+-----+-----+-----+-----+
| [Some address 3] -------->| '.' | '/' | 'a' | '.' | 'o' | 'u' | 't' |  0  |
+-------------------+       +-----+-----+-----+-----+-----+-----+-----+-----+
Passing "%s", argv[0] passes the address of an array containing %s␀ and the address contained by argv[0] ("[Some address 3]"). The latter is the address of an array containing ./a.out␀.
This tells printf to print argv[0][0] (.), argv[0][1] (/), argv[0][2] (a), etc until a zero is encountered.
%s tells printf “Load characters from the address you are passed and print the characters from there until you find a null character.”
With %s, printf does not print or format the address it is passed. It uses the address to access memory.
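As a rough illustration of what %s does with the address it receives, here is a sketch of a helper that walks the characters one at a time until it hits the zero. The name print_like_s is hypothetical, and this is not how printf is actually implemented; it only mirrors the behaviour described above.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: writes ptr[0], ptr[1], ptr[2], ... to out and stops
   at the terminating zero, which is what %s does conceptually.
   Returns the number of characters written. */
size_t print_like_s(const char *ptr, FILE *out)
{
    size_t i = 0;
    while (ptr[i] != '\0') {
        fputc(ptr[i], out);
        i++;
    }
    return i;
}
```

Calling print_like_s(argv[0], stdout) prints the program name, just like printf("%s", argv[0]). To see the address stored in argv[0] instead, use printf("%p\n", (void *)argv[0]).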
For an array declared like:
char name[] = "hello";
the expression name[0] has the type char and by using the conversion specifier c in the call of printf:
printf("%c \n", name[0]);
the first element of the array is outputted as a character.
This declaration of an array:
char * argv[]
that is used as a parameter declaration is adjusted by the compiler to the declaration:
char **argv
and the expression argv[0] has type char *.
The conversion specifier s is designed to output strings pointed to by corresponding arguments like this:
printf("%s \n", argv[0]);
argv[0] points to a string that contains the name of the program that runs.
If you want to output the expression as an address you need to write:
printf("%p \n", ( void * )argv[0]);
To make it more clear consider this statement:
printf( "%s\n", "Hello" );
Its output, as I think you would expect, is:
Hello
The string literal has the type char[6]. But when it is used as an expression in the call of printf, it is implicitly converted to a pointer to its first element, of type char *.
So the second argument of the call of printf has the type char * the same way as in the call:
printf("%s \n", argv[0]);
where the second expression also has the type char *.
I wanted to copy string with the following code and it didn't copy the '\0'.
void copyString(char *to, char *from)
{
do{
*to++ = *from++;
}while(*from);
}
int main(void)
{
char to[50];
char from[] = "text2copy";
copyString(to, from);
printf("%s", to);
}
This is the output of the code:
text2copyÇ■ ║kvu¡lvu
And every time I rerun the code, the characters after text2copy change, so while(*from) works fine but something random is copied instead of '\0'.
text2copyÖ■ ║kvu¡lvu
text2copy╨■ ║kvu¡lvu
text2copy╡■ ║kvu¡lvu
//etc
Why is this happening?
The problem is that you never copy the '\0' character at the end of the string. To see why, consider this:
The string passed in is stored in an array sized exactly to fit the data, including the terminating '\0':
char from[] = "text2copy";
It looks like this in memory:
             ----+----+----+----+----+----+----+----+----+----+----+----
other memory     | t  | e  | x  | t  | 2  | c  | o  | p  | y  | \0 | other memory
             ----+----+----+----+----+----+----+----+----+----+----+----
                   ^
                  from
Now let's imagine that you have done the loop several times already and you are at the top of the loop and from is pointing to the 'y' character in text2copy:
             ----+----+----+----+----+----+----+----+----+----+----+----
other memory     | t  | e  | x  | t  | 2  | c  | o  | p  | y  | \0 | other memory
             ----+----+----+----+----+----+----+----+----+----+----+----
                                                           ^
                                                          from
The computer executes *to++ = *from++; which copies the 'y' character to to and then increments both to and from. Now the memory looks like this:
             ----+----+----+----+----+----+----+----+----+----+----+----
other memory     | t  | e  | x  | t  | 2  | c  | o  | p  | y  | \0 | other memory
             ----+----+----+----+----+----+----+----+----+----+----+----
                                                                ^
                                                               from
The computer executes } while(*from); and realizes that *from is false because it points to the '\0' character at the end of the string so the loop ends and the '\0' character is never copied.
Now you might think this would fix it:
void copyString(char *to, char *from)
{
do{
*to++ = *from++;
} while(*from);
*to = *from; // copy the \0 character
}
And it does copy the '\0' character, but there are still problems. The code is even more fundamentally flawed because, as @JonathanLeffler said in the comments, for the empty string you peek at the contents of memory after the end of the string, and because that memory was not allocated to you, accessing it causes undefined behaviour:
             ----+----+----
other memory     | \0 | other memory
             ----+----+----
                   ^
                  from
The computer executes *to++ = *from++; which copies the '\0' character to to and then increments both to and from which makes from point to memory you don't own:
             ----+----+----
other memory     | \0 | other memory
             ----+----+----
                        ^
                       from
Now the computer executes }while(*from); and accesses memory that isn't yours. Letting from point one past the end of the array is fine, but dereferencing it when it points to memory that isn't yours is undefined behaviour.
The example I made in the comments suggests saving the value copied into a temporary variable:
void copyString(char *to, char *from)
{
int test;
do{
test = (*to++ = *from++); // save the value copied
} while(test);
}
The reason I suggested that particular way was to show you that the problem was WHAT you were testing, not the fact that the loop condition is tested afterwards. If you save the value copied and then test that saved value, the character gets copied before it is tested (so the \0 gets copied) and you don't read from the incremented pointer (so there is no undefined behaviour).
But the example @JonathanLeffler had in his comments is shorter, easier to understand, and more idiomatic. It does exactly the same thing without declaring a named temporary variable:
void copyString(char *to, char *from)
{
while ((*to++ = *from++) != '\0')
;
}
The code first copies the character and then tests the value that was copied (so the '\0' will be copied) but the incremented pointer is never dereferenced (so there is no undefined behaviour).
The posted code stops looping when it encounters the NUL byte, rather than afterwards.
Regarding:
}while(*from);
I suggest following that line with:
*to = '\0';
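Appending the terminator after the loop works, but with the original do/while the empty string still reads one byte past the end of from. One way to combine the suggestion with a safe loop is sketched below; it is only one possible variant, and the name copy_string and the const qualifier on from are my own additions. As in the original, the caller must make sure to is large enough.

```c
/* Copies the string at from into to, including the terminating '\0'.
   The condition is tested before each copy, so an empty source string
   never causes a read past its terminator. */
void copy_string(char *to, const char *from)
{
    while (*from != '\0') {
        *to++ = *from++;
    }
    *to = '\0';   /* the loop stops at the terminator, so add it here */
}
```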
I'm looking to split a string based on a specific sequence of characters, but only if they are in order.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int main()
{
int i = 0;
char **split;
char *tmp;
split = malloc(20 * sizeof(char *));
tmp = malloc(20 * 12 * sizeof(char));
for(i=0;i<20;i++)
{
split[i] = &tmp[12*i];
}
char *line;
line = malloc(50 * sizeof(char));
strcpy(line, "Test - Number -> <10.0>");
printf("%s\n", line);
i = 0;
while( (split[i] = strsep(&line, " ->")) != NULL)
{
printf("%s\n", split[i]);
i++;
}
}
This will print out:
Test
Number
<10.0
However I just want to split around the -> so it could give the output:
Test - Number
<10.0>
I think the best way to split on an ordered sequence of delimiter characters is
to replicate strtok_r behaviour using strstr, like this:
#include <stdio.h>
#include <string.h>
char *substrtok_r(char *str, const char *substrdelim, char **saveptr)
{
char *haystack;
if(str)
haystack = str;
else
haystack = *saveptr;
char *found = strstr(haystack, substrdelim);
if(found == NULL)
{
*saveptr = haystack + strlen(haystack);
return *haystack ? haystack : NULL;
}
*found = 0;
*saveptr = found + strlen(substrdelim);
return haystack;
}
int main(void)
{
char line[] = "a -> b -> c -> d; Test - Number -> <10.0> ->No->split->here";
char *input = line;
char *token;
char *save;
while((token = substrtok_r(input, " ->", &save)) != NULL)
{
input = NULL;
printf("token: '%s'\n", token);
}
return 0;
}
This behaves like strtok_r but only splits when the substring is found. The
output of this is:
$ ./a
token: 'a'
token: ' b'
token: ' c'
token: ' d; Test - Number'
token: ' <10.0>'
token: 'No->split->here'
And like strtok and strtok_r, it requires that the source string is
modifiable, as it writes the '\0'-terminating byte for creating and returning
the tokens.
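For the exact case in the question, where the line should be split once around the arrow, the same strstr technique can be reduced to a small helper. This is only a sketch: split_once is a hypothetical name, the delimiter is assumed to be non-empty, and the surrounding spaces are treated as part of the delimiter (" -> ") so that they do not end up in the tokens.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: splits line in place at the first occurrence of
   delim. On success, *left and *right point into line and 1 is returned.
   If delim does not occur, line is left untouched and 0 is returned.
   Assumptions: line is modifiable and delim is not empty. */
int split_once(char *line, const char *delim, char **left, char **right)
{
    char *found = strstr(line, delim);
    if (found == NULL)
        return 0;
    *found = '\0';                    /* terminate the left part */
    *left = line;
    *right = found + strlen(delim);   /* skip the whole delimiter */
    return 1;
}
```

Given char line[] = "Test - Number -> <10.0>"; and the delimiter " -> ", left becomes "Test - Number" and right becomes "<10.0>".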
EDIT
Hi, would you mind explaining why '*found = 0' means the return value is only the string in-between delimiters. I don't really understand what is going on here or why it works. Thanks
The first thing you've got to understand is how strings work in C. A string is
just a sequence of bytes (characters) that ends with the '\0'-terminating
byte. I wrote bytes and characters in parentheses, because a character in C is
just a 1-byte value (on most systems a byte is 8 bits long) and the integer
values representing the characters are typically those defined in the ASCII code
table, which are 7-bit values. As you can see from the table, the
value 97 represents the character 'a', 98 represents 'b', etc. Writing
char x = 'a';
is the same as doing
char x = 97;
The value 0 is a special value for strings; it is called NUL (the null character)
or the '\0'-terminating byte. This value is used to tell functions where a
string ends. A function like strlen, which returns the length of a string, does
it by counting how many bytes it encounters until it reaches a byte with
the value 0.
That's why strings are stored in char arrays: a pointer to the first element of the array
gives you the start of the memory block where the sequence of chars is stored.
Let's look at this:
char string[] = { 'H', 'e', 'l', 'l', 'o', 0, 48, 49, 50, 0 };
The memory layout for this array would be
   0     1     2     3     4     5     6     7     8    9
+-----+-----+-----+-----+-----+----+-----+-----+-----+----+
| 'H' | 'e' | 'l' | 'l' | 'o' | \0 | '0' | '1' | '2' | \0 |
+-----+-----+-----+-----+-----+----+-----+-----+-----+----+
or to be more precise with the integer values
  0     1     2     3     4    5   6    7    8    9
+----+-----+-----+-----+-----+---+----+----+----+---+
| 72 | 101 | 108 | 108 | 111 | 0 | 48 | 49 | 50 | 0 |
+----+-----+-----+-----+-----+---+----+----+----+---+
Note that the value 0 represents '\0', 48 represents '0', 49 represents
'1' and 50 represents '2'. If you do
printf("%zu\n", strlen(string));
the output will be 5. strlen will find the value 0 at index 5 and
stop counting. However, string stores two strings, because from index 6
on, a new sequence of characters starts that also terminates with 0, thus making it a
second valid string in the array. To access it, you need a pointer
that points past the first 0 value.
printf("1. %s\n", string);
printf("2. %s\n", string + strlen(string) + 1);
The output would be
1. Hello
2. 012
This property is used in functions like strtok (and mine above) to return you
a substring from a larger string, without the need of creating a copy (that would be
creating a new array, dynamically allocating memory, using strcpy to create
the copy).
Assume you have this string:
char line[] = "This is a sentence;This is another one";
Here you have one string only, because the '\0'-terminating byte comes after
the last 'e' in the string. If I however do:
line[18] = 0; // same as line[18] = '\0';
then I created two strings in the same array:
"This is a sentence\0This is another one"
because I replaced the semicolon ';' with '\0', thus creating a new string
from position 0 to 18 and a second one from position 19 to 38. If I do now
printf("string: %s\n", line);
the output will be
string: This is a sentence
Now let's take a look at the function itself:
char *substrtok_r(char *str, const char *substrdelim, char **saveptr);
The first argument is the source string, the second argument is the delimiter
string and the third one is a double pointer to char. You have to pass a pointer
to a pointer to char. This will be used to remember where the function should
resume scanning next; more on that later.
This is the algorithm:
if str is not NULL:
    start a new scan sequence from str
otherwise:
    resume scanning from the string pointed to by *saveptr
find the position of the delimiter substring given by 'substrdelim'
if no delimiter substring is found:
    if the current character of the scanned text is \0:
        no more tokens to return --> return NULL
    otherwise:
        return the scanned text and set *saveptr to
        point to the \0 character of the scanned text,
        so that the next call ends the scanning
        by returning NULL
otherwise (a delimiter substring was found):
    create a new token that ends at the found delimiter
    by setting the first character of the found
    delimiter to 0
    update *saveptr to the start of the found delimiter
    plus its length, so that *saveptr
    points just past the delimiter
    return the newly created token
This first part is easy to understand:
if(str)
haystack = str;
else
haystack = *saveptr;
Here if str is not NULL, you want to start a new scan sequence. That's why
in main the input pointer is set to point to the start of the string saved
in line. Every subsequent call must be made with str == NULL, which is
why the first thing done in the while loop is to set input = NULL; so
that substrtok_r resumes scanning using *saveptr. This is the standard
behaviour of strtok.
The next step is to look for a delimiting substring:
char *found = strstr(haystack, substrdelim);
The next part handles the case where no delimiting substring is
found (see footnote 2):
if(found == NULL)
{
*saveptr = haystack + strlen(haystack);
return *haystack ? haystack : NULL;
}
*saveptr is updated to point past the whole source, so that it points to the
'\0'-terminating byte. The return line can be rewritten as
if(*haystack == '\0')
return NULL;
else
return haystack;
which says: if the source is already an empty string (see footnote 1), then return
NULL. This means no more substrings can be found, so stop calling the function. This
is also standard behaviour of strtok.
The last part
*found = 0;
*saveptr = found + strlen(substrdelim);
return haystack;
handles the case when a delimiting substring is found. Here
*found = 0;
is basically doing
found[0] = '\0';
which creates substrings as explained above. To make it clear once again:
Before
*found = 0;
*saveptr = found + strlen(substrdelim);
return haystack;
the memory looks like this:
+-----+-----+-----+-----+-----+-----+
| 'a' | ' ' | '-' | '>' | ' ' | 'b' | ...
+-----+-----+-----+-----+-----+-----+
   ^     ^
   |     |
haystack found
*saveptr
After
*found = 0;
*saveptr = found + strlen(substrdelim);
the memory looks like this:
+-----+------+-----+-----+-----+-----+
| 'a' | '\0' | '-' | '>' | ' ' | 'b' | ...
+-----+------+-----+-----+-----+-----+
   ^     ^                  ^
   |     |                  |
haystack found           *saveptr
                         because strlen(substrdelim)
                         is 3
Remember, if I do printf("%s\n", haystack); at this point, because the ' ' at
found has been set to 0, it will print a. *found = 0 created two strings out
of one, as explained above. strtok (and my function, which is based on
strtok) uses the same technique. So when the function does
return haystack;
the string stored in token will be the text before the split. Eventually
substrtok_r returns NULL and the loop exits, because substrtok_r returns
NULL when no more splits can be created, just like strtok.
Footnotes
1. An empty string is a string where the first character is already the
'\0'-terminating byte.
2. This is a very important part. Most of the standard functions in the C
library, like strstr, will not return you a new string in memory; they will
not create a copy and return the copy (unless the documentation says so). They
will return you a pointer into the original, at some offset.
On success strstr will return you a pointer to the start of the substring;
this pointer will be at an offset from the source.
const char *txt = "abcdef";
char *p = strstr(txt, "cd");
Here strstr will return a pointer to the start of the substring "cd" in
"abcdef". To get the offset you do p - txt, which returns how many bytes
apart they are:
b = base address where txt is pointing to
   b    b+1   b+2   b+3   b+4   b+5   b+6
+-----+-----+-----+-----+-----+-----+------+
| 'a' | 'b' | 'c' | 'd' | 'e' | 'f' | '\0' |
+-----+-----+-----+-----+-----+-----+------+
   ^           ^
   |           |
  txt          p
So txt points to address b, p points to address b+2. That's why you get
the offset by doing p-txt, which would be (b+2) - b => 2. So p points to
the original address plus an offset of 2 bytes. Because of this behaviour,
things like *found = 0; work in the first place.
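The offset computation can be wrapped in a small function. This is just a sketch; offset_of is a hypothetical name, and -1 is an arbitrary choice for signalling that the substring was not found.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns how many bytes into txt the first
   occurrence of needle starts, or -1 if needle does not occur. */
ptrdiff_t offset_of(const char *txt, const char *needle)
{
    const char *p = strstr(txt, needle);
    return p != NULL ? p - txt : -1;
}
```

For txt = "abcdef" and needle = "cd" this returns 2, matching (b+2) - b above.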
Note that doing things like txt + 2 will return you a new pointer pointing to
where txt points plus an offset of 2. This is called pointer arithmetic.
It's like regular arithmetic, but here the compiler takes the size of an object
into consideration. char is a type that is defined to have a size of 1,
hence sizeof(char) returns 1. But let's say you have an array of integers:
int arr[] = { 7, 2, 1, 5 };
On my system an int has size of 4, so an int object needs 4 bytes in memory.
This array looks like this in memory:
base = address where arr is stored
address       base      base + 4    base + 8    base + 12
in bytes  +-----------+-----------+-----------+-----------+
          |     7     |     2     |     1     |     5     |
          +-----------+-----------+-----------+-----------+
pointer        arr       arr + 1     arr + 2     arr + 3
arithmetic
Here arr + 1 returns you a pointer pointing to where arr is stored plus an
offset of 4 bytes.
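To see the scaling on your own system, you can measure the distance between two int pointers in bytes by viewing them as pointers to char. The sketch below uses a hypothetical helper name, byte_distance, and makes no assumption about sizeof(int) being 4.

```c
#include <stddef.h>

/* Distance from one int pointer to another, measured in bytes rather than
   in elements. Both pointers must point into (or one past) the same array. */
ptrdiff_t byte_distance(const int *from, const int *to)
{
    return (const char *)to - (const char *)from;
}
```

For int arr[] = { 7, 2, 1, 5 }; the expression (arr + 1) - arr is 1 (one element), while byte_distance(arr, arr + 1) is sizeof(int).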
#include <stdio.h>
#include <string.h>
int main()
{
int i;
char a[10];
for(i=0;i<10;i++)
{
scanf("%s",a);// >how this line is working in memory.
}
return 0;
}
In the above code, I would like to know how the string is saved in memory, since I have initialised it as a 1D character array, but does the array work as a list of strings, or a single string? Why?
In C, a string is a sequence of character values terminated by a 0-valued character - IOW, the string "Hello" is represented as the character sequence 'H', 'e', 'l', 'l', 'o', 0. Strings are stored in arrays of char (or wchar_t for wide strings):
char str[] = "Hello";
In memory, str would look something like this:
+---+
str: |'H'| str[0]
+---+
|'e'| str[1]
+---+
|'l'| str[2]
+---+
|'l'| str[3]
+---+
|'o'| str[4]
+---+
| 0 | str[5]
+---+
It is possible to store multiple strings in a single 1D array, although almost nobody does this:
char strs[] = "foo\0bar";
In memory:
+---+
strs: |'f'| strs[0]
+---+
|'o'| strs[1]
+---+
|'o'| strs[2]
+---+
| 0 | strs[3]
+---+
|'b'| strs[4]
+---+
|'a'| strs[5]
+---+
|'r'| strs[6]
+---+
| 0 | strs[7]
+---+
The string "foo" is stored starting at strs[0], while the string "bar" is stored starting at strs[4].
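If you ever do pack several strings into one array like this, you can walk them by jumping past each terminator in turn. The following is a sketch only: count_packed_strings is a hypothetical name, and it assumes the last byte of the buffer is 0 so that strlen never runs off the end.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: prints every string packed into buf and returns how
   many there are. Assumption: buf[size - 1] == 0, so each strlen() call
   stops inside the buffer. */
size_t count_packed_strings(const char *buf, size_t size)
{
    size_t count = 0;
    size_t pos = 0;
    while (pos < size) {
        printf("string %zu starts at index %zu: %s\n", count, pos, buf + pos);
        pos += strlen(buf + pos) + 1;   /* jump past this string's terminator */
        count++;
    }
    return count;
}
```

Called as count_packed_strings(strs, sizeof strs) with the strs array above, it reports "foo" at index 0 and "bar" at index 4.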
Normally, to store an array of strings, you'd either use a 2D array of char:
char strs[][MAX_STR_LEN] = { "foo", "bar", "bletch" };
or a 1D array of pointers to char:
char *strs[] = { "foo", "bar", "bletch" };
In the first case, the contents of the string are stored within the strs array:
      +---+---+---+---+---+---+---+
strs: |'f'|'o'|'o'| 0 | ? | ? | ? |
      +---+---+---+---+---+---+---+
      |'b'|'a'|'r'| 0 | ? | ? | ? |
      +---+---+---+---+---+---+---+
      |'b'|'l'|'e'|'t'|'c'|'h'| 0 |
      +---+---+---+---+---+---+---+
In the second, each strs[i] points to a different, 1D array of char:
      +---+                  +---+---+---+---+
strs: |   | strs[0] ------>  |'f'|'o'|'o'| 0 |
      +---+                  +---+---+---+---+
      |   | strs[1] ----+
      +---+             |    +---+---+---+---+
      |   | strs[2] -+  +->  |'b'|'a'|'r'| 0 |
      +---+          |       +---+---+---+---+
                     |
                     |       +---+---+---+---+---+---+---+
                     +---->  |'b'|'l'|'e'|'t'|'c'|'h'| 0 |
                             +---+---+---+---+---+---+---+
In your code, a can (and is usually intended to) store a single string that's 9 characters long (not counting the 0 terminator). Like I said, almost nobody stores multiple strings in a single 1D array, but it is possible (in this case, a can store 2 4-character strings).
char a[10];
You've allocated 10 bytes of the stack for a. But right now, it contains garbage because you never gave it a value.
Scanf doesn't know any of this. All it does is copy bytes from the standard input into a, ignorant of its size.
And why are you doing a loop 10 times? You will overwrite a each loop iteration, so you'll only have the value from the final time.
A string is by definition a null-terminated character array. So every character array holds a string as soon as it contains a \0 somewhere, defining the end of that string.
In memory the string is just a bunch of bytes lying in sequence. Take the string "Hello" for example
+---+---+---+---+---+---+
| H | e | l | l | o | \0|
+---+---+---+---+---+---+
Your array char a[10] is such a memory location, with enough space to store 10 characters; when you use a in an expression it converts to a pointer to the beginning of that location ('H' in the example).
By using scanf you are storing a string (character sequence + terminating \0) in that buffer (over and over again). scanf stores the characters in there and adds a terminating \0 after the last one written. This allows you to safely store any character sequence that is at most 9 characters long, since one element is needed for the \0.
Does a 1D array work as a list of strings, or a single string?
That's quite a broad question.
char a[10]; declares an array a of 10 char elements.
char *a[10]; declares an array a of 10 char * elements, each of which can point to a string (once you allocate memory for it and copy a valid C string into it).
In your code:
In scanf("%s",a); a means the address of the first element of the array. So scanf writes data there, overwriting the previous content every time. If your input needs more than 10 elements (including the trailing 0) to be stored, you get undefined behaviour and very possibly a segfault.
You are overwriting the same buffer 10 times in the loop, which means the buffer will contain only the data from the last read; the previous 9 strings will be lost.
Also, entering more than 9 characters causes a buffer overflow, which invokes undefined behaviour.
You should limit the number of characters scanned from the input and then clear the rest of the input line. (Not with fflush(stdin);)
scanf("%9s",a);
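Limiting the width and then discarding the rest of the line can be combined in one helper. This is a sketch under a few assumptions: read_word is a hypothetical name, the stream is passed in as a parameter (use stdin for keyboard input), and the buffer is exactly 10 bytes to match the %9s width.

```c
#include <stdio.h>

/* Hypothetical helper: reads at most 9 non-whitespace characters into buf
   and then throws away whatever is left on the current line.
   Returns 1 if a word was read, 0 on end of input or error. */
int read_word(FILE *in, char buf[10])
{
    int c;
    if (fscanf(in, "%9s", buf) != 1)
        return 0;
    while ((c = fgetc(in)) != '\n' && c != EOF)
        ;   /* discard the rest of the line */
    return 1;
}
```

If the user types a 16-character word, buf receives the first 9 characters and the remaining ones are dropped instead of being picked up by the next read.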
Does a 1D array work as a list of strings, or a single string?
If it's terminated with a null character then yes, it's a single string, like this. And a is the address of the first element.
+---+---+---+---+---+---+----+
| S | t | r | i | n | g | \0 |
+---+---+---+---+---+---+----+
  a  a+1 a+2
And if you pass this array to, e.g., printf(), it will print all characters until it reaches \0.
If you would like to read a list of strings, you have to declare a 2D array, or a pointer-to-pointer-to-char and allocate enough memory for the pointers and the strings.
int i, c;
char a[10][10];
for(i=0;i<10;i++)
{
scanf("%9s",a[i]);
while ((c = getchar()) != '\n' && c != EOF) ; /* discard the rest of the line */
}
I'm trying to compare a string with another string and if they match I want the text "That is correct" to output but I can't seem to get it working.
Here is the code:
int main ()
{
char * password = "Torroc";
char * userInput;
printf("Please enter your password: ");
scanf("%s", userInput);
if (strcmp(password, userInput) == 0) {
printf("That is correct!");
}
}
In your code, the userInput pointer does not point to any storage that can hold the string you read with the scanf call. You need to allocate space for userInput, for example on the stack, before you try to store any string in it. So...
You need to change the following code:
char * userInput;
to:
char userInput[200];
Here, 200 is just an arbitrary value. In your case, choose the maximum length of the string + 1 for the terminating \0.
When you enter characters you need to store the characters somewhere.
char* userInput;
is an uninitialized pointer.
So first you declare an array for your input
char userInput[128];
Now when reading from the keyboard you need to make sure the user does not enter more than 127 characters (plus one for the \0), because anything longer would overflow the array on the stack. The best way to read from the keyboard is to use fgets. It is also good to check the return value: fgets returns NULL on end of input or on a read error. (If the user simply presses ENTER, fgets succeeds and stores just the newline.)
if (fgets(userInput, sizeof(userInput), stdin) != NULL) {
Now you have the string that the user entered plus the end of line character. To remove it you can do something like
char* p = strchr(userInput,'\n');
if ( p != NULL ) *p = '\0';
Now you can compare the strings
if (strcmp(password, userInput) == 0) {
puts("That is correct!");
}
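Putting the steps above together, a minimal sketch could look like this. The function names strip_newline and check_password are made up for this illustration, and the input stream is a parameter so that the same code works with stdin or with a file.

```c
#include <stdio.h>
#include <string.h>

/* Removes the trailing newline that fgets() keeps, if there is one. */
void strip_newline(char *s)
{
    char *p = strchr(s, '\n');
    if (p != NULL)
        *p = '\0';
}

/* Hypothetical helper: reads one line from in and compares it with
   password. Returns 1 on a match, 0 otherwise (including end of input
   or a read error). */
int check_password(FILE *in, const char *password)
{
    char userInput[128];
    if (fgets(userInput, sizeof userInput, in) == NULL)
        return 0;
    strip_newline(userInput);
    return strcmp(password, userInput) == 0;
}
```

In main you would then write something like if (check_password(stdin, "Torroc")) puts("That is correct!");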
When you think about "a string" in C, you should see it as an array of chars.
Let me use the identifier s instead of userInput for brevity:
       0   1   2   3   4   5         9
     +---+---+---+---+---+---+-- --+---+
s -> | p | i | p | p | o | \0| ... |   |
     +---+---+---+---+---+---+-- --+---+
is what
char s[10] = "pippo";
would create.
In other words, it's a block of memory where the first 6 bytes have been initialized as shown. There is no separate pointer variable s anywhere; s names the block itself.
Instead, declaring a char * like in
char *s;
would create a variable that can hold a pointer to char:
 +------------+
s| 0xCF024408 | <-- represents an address
 +------------+
If you think this way, you notice immediately that doing:
scanf("%s",s);
only makes sense in the first case, where there is (hopefully) enough memory to hold the string.
In the second case, the variable s holds an indeterminate address and you will end up writing into an unknown memory area.
For completeness, in cases like:
char *s = "pippo";
you have the following situation in memory:
             0   1   2   3   4   5
           +---+---+---+---+---+---+     Somewhere in the
0x0B320080 | p | i | p | p | o | \0| <-- readonly portion
           +---+---+---+---+---+---+     of memory
 +------------+     a variable pointing
s| 0x0B320080 | <-- to the address where
 +------------+     the string is
You can make s point somewhere else, but you must not change the content of the string pointed to by s (modifying a string literal is undefined behaviour).
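The difference matters as soon as a function writes through the pointer. As a small sketch (capitalize_first is a hypothetical helper, not something from the question), consider:

```c
#include <ctype.h>

/* Hypothetical helper: upper-cases the first character of s in place.
   The caller must pass writable storage, such as a char array. */
void capitalize_first(char *s)
{
    if (s[0] != '\0')
        s[0] = (char)toupper((unsigned char)s[0]);
}
```

With char s[10] = "pippo"; the call capitalize_first(s) is fine and s becomes "Pippo". With char *s = "pippo"; the same call tries to modify a string literal, which is undefined behaviour.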