Please help me understand the difference between using an initialized character array like char line[80]="1:2" (which doesn't work!) and using char line[80] followed by strcpy(line,"1:2").
As I understand it, in the first case I have a character array, memory has been allocated for it, and I am copying a string literal into it. The same is done in the second case. But obviously I am wrong, so what is wrong with my understanding?
#include<stdio.h>
#include<stdlib.h>
#include<string.h>
void tokenize(char* line)
{
char* cmd = strtok(line,":");
while (cmd != NULL)
{
printf ("%s\n",cmd);
cmd = strtok(NULL, ":");
}
}
int main(){
char line[80]; //char line[80]="1:2" won't work
/*
char *line;
line = malloc(80*sizeof(char));
strcpy(line,"1:2");
*/
strcpy(line,"1:2");
tokenize(line);
return 0;
}
You are wrong. The result of these two code snippets
char line[80] = "1:2";
and
char line[80];
strcpy( line, "1:2" );
is the same. That is this statement
char line[80] = "1:2";
does work.:)
There is only one difference. When the array is initialized from the string literal, every element that has no corresponding character in the literal is initialized to '\0', that is, zero-initialized. When strcpy is used instead, the elements that strcpy did not overwrite keep indeterminate values.
Apart from this zero initialization of its "tail", the array is initialized from the string literal the same way as if strcpy had been applied.
The exact equivalence of the result exists between these statements
char line[80] = "1:2";
and
char line[80];
strncpy( line, "1:2", sizeof( line ) );
that is when you use function strncpy and specify the size of the target array.
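The equivalence can be checked byte by byte. The following is a minimal sketch, not part of the original answer; the helper names tail_is_zero and init_matches_strncpy are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if buf[from..size-1] are all '\0'. */
static int tail_is_zero(const char *buf, size_t from, size_t size)
{
    assert(buf != NULL && from <= size);
    for (size_t i = from; i < size; ++i)
        if (buf[i] != '\0')
            return 0;
    return 1;
}

/* Hypothetical helper: returns 1 if the initializer and strncpy
   produce byte-identical 80-element arrays. */
static int init_matches_strncpy(void)
{
    char a[80] = "1:2";            /* rest of the array is zero-initialized */
    char b[80];
    strncpy(b, "1:2", sizeof b);   /* strncpy pads the rest with '\0' */
    return memcmp(a, b, sizeof a) == 0 && tail_is_zero(a, 3, sizeof a);
}
```

The same comparison against a buffer filled with plain strcpy would read indeterminate bytes past the terminator, which is why it is not attempted here.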
If you mean that you passed the string literal itself to tokenize as the argument, then the program has undefined behaviour, because string literals may not be modified.
According to the C Standard (6.4.5 String literals)
7 It is unspecified whether these arrays are distinct provided their
elements have the appropriate values. If the program attempts to
modify such an array, the behavior is undefined.
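If the text to tokenize may come from a string literal, one way to stay within defined behaviour is to run strtok on a writable copy. This is only a sketch under that assumption; count_tokens is a hypothetical name, not a function from the question.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: counts the tokens in `text` without writing to
   `text` itself. Returns -1 if the temporary copy cannot be allocated. */
static int count_tokens(const char *text, const char *delims)
{
    assert(text != NULL && delims != NULL);

    size_t len = strlen(text) + 1;      /* include the terminator */
    char *copy = malloc(len);
    if (copy == NULL)
        return -1;
    memcpy(copy, text, len);            /* strtok will modify this copy */

    int count = 0;
    for (char *tok = strtok(copy, delims); tok != NULL;
         tok = strtok(NULL, delims))
        ++count;

    free(copy);
    return count;
}
```

Because the parameter is const char *, passing a literal such as "1:2" is fine: only the heap copy is ever written to.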
I want to pass a string to a C function through a pointer to char and modify it, but it gave me a segmentation fault. I don't know why.
Note: I know that copying the string into an array of char solves the problem.
I tried putting it in an array of char and passing the array's name to the function, and that works, but I need to know what the problem is with passing the pointer to char.
void convertToLowerCase(char* str){
int i=0;
while(str[i] != '\0')
{
if(str[i]>='A'&& str[i]<='Z'){
str[i]+=32;
}
i++;
}
}
int main(void){
char *str = "AHMEDROSHDY";
convertToLowerCase(str);
}
I expect str to become "ahmedroshdy", but what I actually get is a segmentation fault.
This (you had char str* which is a syntax error, fixed that):
char *str = "AHMEDROSHDY";
is a pointer to a string literal. A string literal must not be modified; it is typically stored in read-only memory.
You modify it here: str[i]+=32;, which is not allowed.
Use an array instead, as @xing suggested, i.e. char str[] = "AHMEDROSHDY";.
To be more precise:
char *str = "AHMEDROSHDY";
str is a pointer to the string literal. String literals in C are not modifiable.
From the C standard:
The behavior is undefined in the following circumstances: ...
The program attempts to modify a string literal
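A minimal sketch of the array-based fix follows. It uses tolower from <ctype.h> instead of adding 32, which is a substitution on my part and not the asker's code; to_lower_in_place and demo_lowercase are hypothetical names.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: lowercases a writable string in place. */
static void to_lower_in_place(char *s)
{
    assert(s != NULL);
    for (; *s != '\0'; ++s)
        *s = (char)tolower((unsigned char)*s);
}

/* Hypothetical demo: returns 1 if the conversion produced the expected text. */
static int demo_lowercase(void)
{
    char str[] = "AHMEDROSHDY";   /* array: a writable copy of the literal */
    to_lower_in_place(str);
    return strcmp(str, "ahmedroshdy") == 0;
}
```

The only change that matters for the crash is char str[] instead of char *str; the function body would work either way on writable memory.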
Generally, you can initialize a pointer with any string literal, like char *str = "Hello". I think this means "Hello" evaluates to the address of 'H'. However, the code below isn't allowed.
#include <stdio.h>
#include <stdlib.h>
typedef struct {
char name[64];
} Student;
Student initialization(char *str) {
//Student tmp = {}; strcpy(tmp.name, str) //(*1)This is allowed.
//Student tmp = {"Hello"}; //(*2)This is allowed.
Student tmp = {str}; //(*3)This is not allowed.
return tmp;
}
int main(void) {
(...)
}
Could anyone tell me the reason why (*2) is allowed but (*3) is not allowed? Compiling this code makes the error below.
warning: initialization makes integer from pointer without a cast [-Wint-conversion]
Student tmp = {str};
^
In all these cases you are trying to initialize a char array, and looking at it that way makes things easier. In (*2) the string literal is written directly in the initializer, so the char array is initialized with the contents of the string literal, just as with a standalone char array.
But in (*3) the initializer is str, a pointer to char rather than a string literal, and a pointer cannot initialize a char array. Instead, the compiler tries to use it to initialize the first element, name[0], which is why the warning says "makes integer from pointer". Note that this fails even if str points to a char array that is not a literal, for the same reason. The standard allows initializing a char array directly from a string literal, not from a pointer.
From standard 6.7.9p14
An array of character type may be initialized by a character string literal or UTF-8 string literal, optionally enclosed in braces. Successive bytes of the string literal (including the terminating null character if there is room or if the array is of unknown size) initialize the elements of the array.
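When the text is only available through a pointer, the array member has to be filled by copying at run time. One possible sketch, assuming truncation is acceptable for names longer than 63 characters; make_student is a hypothetical name and snprintf is my choice of copy routine, not something the answer prescribes.

```c
#include <assert.h>
#include <stdio.h>

typedef struct {
    char name[64];
} Student;

/* Hypothetical helper: builds a Student from a pointer to a string.
   Names that do not fit are truncated; the result is always terminated. */
static Student make_student(const char *name)
{
    assert(name != NULL);
    Student tmp = {0};                                /* zero the whole struct */
    snprintf(tmp.name, sizeof tmp.name, "%s", name);  /* bounded copy */
    return tmp;
}
```

This is the same idea as variant (*1) in the question, with the bound on the destination made explicit.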
I was writing code to reinforce my knowledge and got a segmentation fault, so I realized there is a gap in my knowledge that I need to fill. The problem is with strtok(). The first code runs without a problem, but the second gives a segmentation fault. What is my "imperfect knowledge"? Thank you for your answers.
First code
#include <stdio.h>
#include <string.h>
int main() {
char str[] = "team_name=fenerbahce";
char *token;
token = strtok(str,"=");
while(token != NULL)
{
printf("%s\n",token);
token = strtok(NULL,"=");
}
return 0;
}
Second code
#include <stdio.h>
#include <string.h>
int main() {
char *str= "team_name=fenerbahce";
char *token;
token = strtok(str,"=");
while(token != NULL)
{
printf("%s\n",token);
token = strtok(NULL,"=");
}
return 0;
}
From strtok -
This function is destructive: it writes the '\0' characters in the elements of the string str. In particular, a string literal cannot be used as the first argument of strtok.
In the second case, str points to a string literal, which typically resides in read-only memory. Any attempt to modify a string literal leads to undefined behavior.
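The write that strtok performs can be observed directly on a writable array. A small sketch; the function name strtok_overwrites_delimiter is made up for illustration.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical demo: returns 1 if strtok replaced the '=' with '\0'. */
static int strtok_overwrites_delimiter(void)
{
    char str[] = "team_name=fenerbahce";   /* writable array */
    assert(str[9] == '=');                  /* "team_name" is 9 chars long */

    char *token = strtok(str, "=");

    /* The first token starts at the beginning of the buffer, and the
       delimiter that followed it has been overwritten in place. */
    return token == str && str[9] == '\0' && strcmp(str, "team_name") == 0;
}
```

With char *str pointing at the literal, that same store to str[9] would target the literal's storage, which is the undefined behaviour described above.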
String literals are the strings you write in "". For every such string, no matter where it is used, storage with static duration is automatically allocated to hold it. When you initialize an array with it, you copy its contents into new memory, that of the array. Otherwise you just store a pointer to its static storage.
So this:
int main()
{
const char *str= "team_name=fenerbahce";
}
Is equal to:
const char __unnamed_string[] = { 't', 'e', /* ... */ '\0' };
int main()
{
const char *str= __unnamed_string;
}
And when assigning the string to array, like this:
int main()
{
char str[] = "team_name=fenerbahce";
}
To this:
const char __unnamed_string[] = { 't', 'e', /* ... */ '\0' };
int main()
{
char str[sizeof(__unnamed_string) / sizeof(char)];
for(size_t i = 0; i < sizeof(__unnamed_string) / sizeof(char); ++i)
str[i] = __unnamed_string[i];
}
As you can see, there is a difference. In the first case you are just storing a single pointer, and in the second you are copying the whole string into a local array.
Note: String literals must not be modified, so you should store their address in a pointer to const.
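N4296 (a working draft of the C++ standard; in C the array is not const-qualified, but modifying it is still undefined) states in § 2.13.5/8: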
Ordinary string literals and UTF-8 string literals are also referred
to as narrow string literals. A narrow string literal has type “array
of n const char”, where n is the size of the string as defined below,
and has static storage duration
The reason behind this decision is probably because this way, such arrays can be stored in read-only segments and thus optimize the program somehow. For more info about this decision see.
Note1:
N4296, § 2.13.5/16, states:
Evaluating a string-literal results in a string literal object with
static storage duration, initialized from the given characters as
specified above.
Which means exactly what I said: for every string literal, an unnamed object with static storage duration is created holding its contents.
char *str= "team_name=fenerbahce";
char str[]= "team_name=fenerbahce";
The "imperfect" knowledge is about the difference between arrays and pointers! It's about memory you cannot modify when you create a string through a pointer.
When you write a string literal, some memory is allocated to store its values (the characters of the string). Below I refer to this as the "memory allocated at the start".
When you create a string using an array, you create an array that contains the same characters as the string, so you allocate additional memory.
When you create a string using a pointer, you point to the address of the memory that already contains that string (the memory allocated at the start).
You have to assume that the memory allocated at the start is not writable. Writing to it is undefined behavior, which in practice often shows up as a segmentation fault, so don't do it.
The array's memory, by contrast, is writable! That's why a function like strtok, which modifies its argument, can be used only in that case.
I found this program on-line, that claims to split a string on the format "firstName/lastName". I tested, and it works:
char *splitString(char* ptrS, char c){
while( *ptrS != c ){
if( *ptrS == '\0'){
return NULL;
}
ptrS++;
}
return ptrS;
}
int main(){
char word[100];
char* firstName;
char* lastName;
printf("Please insert a word : ");
fgets(word, sizeof(word), stdin);
word[strlen(word)-1] = '\0';
firstName = word;
lastName = splitString(word, '/');
if(lastName == NULL ){
printf("Error: / not found in string\n");
exit(-1);
}
*lastName = '\0';
lastName++;
printf("First name %s, Last name %s\n",firstName, lastName);
return(0);
}
What I'm seeing here, however, is only one char array being created, with the firstName and lastName pointers being placed properly. Maybe it is because I'm a little confused about the relation between pointers and arrays, but I have a few questions about this program:
How many strings -as char arrays- are produced after the program is executed?
Are the char pointers used in this program the same as strings?
Why I can use those char pointers used as strings in printf? I can use char pointers as strings on every C program?
As a corollary, what's the relationship between pointers and arrays? Can they be used interchangeably?
How many strings -as char arrays- are produced after the program is executed?
You have one char[], of size 100. However, at the end of the program's execution, you actually have two strings stored in it.
A string is a sequence of chars stored contiguously in memory and terminated with '\0' (the null character).
In this program we take the input, and search for the '/' character. When found, it is replaced with the string terminator:
"John/Smith\0"
"John\0Smith\0"
The pointer to the beginning of the array will allow us to access the first string ("John"), and the pointer that previously pointed to the '/' character is incremented, so that it now points to "Smith"
initialisation:
"John/Smith\0"
^
|
word/firstname
after lastName = splitString(word, '/');:
"John/Smith\0"
^ ^
| lastname
word/firstname
after *lastName = '\0'; lastName++;:
"John\0Smith\0"
^ ^
| lastname
word/firstname
So there is still only one array (word), but you have two pointers (firstName and lastName) to the strings within it.
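The same in-place split can be sketched with strchr from the standard library. This is an illustration of the idea rather than the program from the question; split_in_place is a hypothetical name.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: cuts `s` at the first `sep` by overwriting it
   with '\0'. Returns a pointer to the part after the separator, or
   NULL if `sep` does not occur in `s`. */
static char *split_in_place(char *s, char sep)
{
    assert(s != NULL && sep != '\0');
    char *p = strchr(s, sep);
    if (p == NULL)
        return NULL;
    *p = '\0';      /* terminates the first string */
    return p + 1;   /* start of the second string, same array */
}
```

Called on a writable array holding "John/Smith", it leaves "John" at the start of the array and returns a pointer five characters in, at "Smith". No second array is created.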
Are the char pointers used in this program the same as strings?
The char* pointers are not strings, but they point to them. You can pass the pointers to any function that is expecting a string (I expand a bit below)
Why I can use those char pointers used as strings in printf? I can use char pointers as strings on every C program?
In C (and C++), arrays cannot be passed to functions by value. A function like this:
int func(char string[]){}
will not actually be passed the array itself, but instead, it will be passed a pointer to the beginning of the array.
The [] notation is often used where the function operates on an array (in function parameter lists, char* str and char str[] are equivalent), as it signals that the pointer refers to an array rather than to a single char (or int, etc.).
printf() takes a pointer to the string and, knowing that it is a string (thanks to the %s conversion specifier in the format string), prints each char starting from the memory location the pointer identifies until it hits the '\0' character.
A valid string always has this terminator, and so when working with strings, this is safe.
You will often see functions that take an array, and an additional parameter to denote the size of the array:
int do_something_with_array(int array[], size_t array_length){}
This is because, without a terminating value or a known array size, there is no way to tell that you have left the array and started processing out of bounds. That is not allowed, and it can corrupt memory or cause the runtime to kill your program (a crash).
It is a common misconception that pointers and arrays are the same.
In most cases you handle arrays via pointers, so that handling the data is the same whether you have an array or a pointer to one. There are some things to keep in mind however; this as an example.
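One difference that is easy to check is sizeof: applied to an array it gives the size of the whole array, applied to a pointer parameter it gives the size of a pointer. A minimal sketch with made-up function names:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: the parameter could equally be written char s[];
   either way the callee receives a pointer, not the array. */
static size_t size_seen_by_callee(char *s)
{
    assert(s != NULL);
    return sizeof s;            /* size of a pointer */
}

/* Hypothetical demo: returns 1 if the array size is visible in the
   caller but not in the callee. */
static int array_size_is_lost(void)
{
    char word[100] = "John";
    return sizeof word == 100
        && size_seen_by_callee(word) == sizeof(char *);
}
```

This is why functions that work on non-string arrays usually take the length as a separate parameter.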
How many strings -as char arrays- are produced after the program is executed?
There is one char array in this program, namely word[100]; the others, firstName and lastName, are just char pointers.
Are the char pointers used in this program the same as strings?
Char pointers are very different from strings. A char pointer is something you can use to point at a char or at a string. In this program firstName and lastName are only char pointers, used to point at different places in the character array word[] in order to split it.
I'm experiencing a strange issue with my C code. I'm trying to split string using strtok function, but I get access violation exception. Here's my code:
char *token;
char *line = "LINE TO BE SEPARATED";
char *search = " ";
token = strtok(line, search); // <-- this call causes the crash
However, if I change char *line to char line[], everything works as expected and I don't get any error.
Can anyone explain why I get this (strange to me) behavior with strtok? I thought char* and char[] were exactly the same type.
UPDATE
I'm using MSVC 2012 compiler.
strtok() modifies the string that it parses. If you use:
char* line = "...";
then a string literal is being modified, which is undefined behaviour. When you use:
char line[] = "...";
then a copy of the string literal is being modified.
When assigning "LINE TO BE SEPARATED" to char *line, you make line point to a constant string stored in the program executable. You are not allowed to modify it. You should declare that kind of variable as const char *.
When declared as char[], your string is stored in an array on the stack of your function. Thus, you are able to modify it.
char *s = "any string"
is a definition of a pointer that points to a string, i.e. an array of chars. In the example above, s points to a constant string.
char s[] = "any string"
is a definition of an array of chars. In the example above, s is an array of chars containing the characters {'a','n','y',' ','s','t','r','i','n','g','\0'}
strtok changes the content of your input string: it replaces the delimiters in your string with '\0' (the null character).
So you cannot use strtok with constant strings like this:
char *s="any string"
You can use strtok with dynamically allocated memory or with an array, like:
char *s = malloc(20 * sizeof(char)); //dynamic allocation for string
strcpy(s,"any string");
char s[20] = "any string"; //array with room for 20 chars
char s[] = "any string"; //array sized to fit the string
To answer your question: char* and char[] are not the same type.
This:
char *line = "LINE TO BE SEPARATED";
Is a pointer to a string literal, which is typically placed in read-only memory. You can't change this string.
This, however:
char line[] = "LINE TO BE SEPARATED";
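Is now a character array (the quoted text was copied into the array) placed on the stack. You are allowed to modify the characters in this array.
So in both cases the characters live in a character array, but in different parts of memory: the pointer refers to the literal's own read-only storage, while the array holds a writable copy.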