I wrote a small function foo that changes a string.
When I use the function, I sometimes get a segmentation fault (SIGSEGV). Whether it happens depends on how the string is initialized. In the calling function main, one string (TestString1) is initialized by allocating memory and calling strcpy. I can change that string correctly.
The other string (TestString2) is initialized where the variable is declared. I cannot change this string; I get the segmentation fault instead.
Why is this?
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
void foo(char *Expr)
{
*Expr = 'a';
}
int main()
{
char *TestString1;
char *TestString2 = "test ";
TestString1 = malloc (sizeof(char) * 100);
strcpy(TestString1, "test ");
foo(TestString1);
foo(TestString2);
return 0;
}
In the case of TestString2, you set it to the address of a string constant. These constants cannot be modified, and typically reside in a read-only section of memory. Because of this, you invoke undefined behavior which in this case manifests as a crash.
The case of TestString1 is valid because it points to dynamically allocated memory which you are allowed to change.
Related
I'm a beginner in C; I usually use C++. I am trying to work with a struct that has a char array in it, but when I declare another char *str, it raises a segfault.
My code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
typedef struct s_obj t_obj;
struct s_obj {
char *str;
};
int main() {
char *str; // if this line is removed, there is no segmentation fault
t_obj obj;
printf("%lu\n",strlen(obj.str));
return(0);
}
I tried to understand what you mean by "the argument of strlen must be a string", @anastaciu, so I tried to write code that does that, but the result is the same: a segfault when another char *str is declared.
I can't find a way to initialize the char *str in my struct.
typedef struct s_obj t_obj;
struct s_obj {
char *str;
};
int main() {
char *str; // if this line is removed, there is no segmentation fault
t_obj obj;
obj.str = strcpy(obj.str, "truc");
// printf("%lu\n",strlen(obj.str));
printf("%s\n",obj.str);
return(0);
}
The line
printf("%lu\n", strlen(obj.str));
invokes undefined behavior. The argument of strlen must be a string, that is, a null-terminated char array. obj.str is not a string; it is just an uninitialized pointer. You need to allocate memory for it or otherwise make it point to a valid memory location.
For example:
t_obj obj;
obj.str = calloc(100, sizeof *obj.str); //99 character string, 0 initialized
//malloc does not "clear" the allocated memory
//if you use it you can't use strlen before strcpy
printf("%zu\n",strlen(obj.str)); //will print 0, the string is empty
obj.str = strcpy(obj.str, "truc");
printf("%zu\n",strlen(obj.str)); //will print 4, the length of the string
The fact that the program does not behave as badly when you remove char *str; is well within the scope of undefined behavior.
C is not C++ ;) Seems you've missed an important difference regarding the two.
C++ Example:
#include <string>
struct t_obj {
std::string str;
};
void foo(){
t_obj obj; // <-- In C++ this is enough to get a properly initialized instance.
}
In C++, this code will give you a properly initialized object with an (also initialized) string.
But in C (as in your sample):
typedef struct t_obj t_obj;
struct t_obj {
char *str;
};
void foo(){
t_obj obj; // <-- Nothing gets initialized here.
}
There is no initialization as in the C++ example above. obj will simply be a chunk of (not initialized) memory. You have to initialize it yourself.
There's also a problem with your second sample:
strcpy does not work that way. You need to pass an allocated chunk of memory to strcpy, and it will copy the data to the place you gave it.
But when you pass an uninitialized pointer, strcpy will try to write the data to some arbitrary place in memory.
I think the question "What's the difference between C strings and C++ strings?" might be helpful. It explains some details about the differences between C and C++ strings.
In either case, you are using obj.str uninitialised.
The address it holds is indeterminate, and the content of the memory location it points to is also indeterminate. So it is not null-terminated, and using it with strlen() (or any other function that expects a string argument) causes an out-of-bounds access, which is an invalid memory access and in turn invokes undefined behaviour.
For reference, C11, chapter 7.24, String handling <string.h>
[...]Unless explicitly stated otherwise in the description of a particular function in this subclause, pointer arguments on such a call shall still have valid values,[...]
At least, initialize the pointers to a null-value.
Now the code works fine like this:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
typedef struct s_obj t_obj;
struct s_obj {
char *str;
};
int main() {
char *str;
t_obj obj;
if (!(obj.str = malloc(strlen("truc") + 1))) // room for "truc" plus the terminating '\0'
return (1);
strcpy(obj.str, "truc");
printf("%s\n",obj.str);
free(obj.str);
return(0);
}
The following program works fine, and I don't understand why:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
void xyz(char **value)
{
// *value = strdup("abc");
*value = "abc"; // <-- ??????????
}
int main(void)
{
char *s1;
xyz(&s1);
printf("s1 : %s \n", s1);
}
Output :
s1 : abc
My understanding was that I have to use strdup() to allocate memory for a string in C when I have not allocated memory for it myself. But in this case the program seems to work fine by just assigning the string value using " ". Can anyone please explain?
String literals don't exist in the ether. They reside in your program's memory and have an address.
Consequently you can assign that address to pointers. The behavior of your program is well defined, and nothing bad will happen, so long as you don't attempt to modify a literal through a pointer.
For that reason, it's best to make the compiler work for you by being const correct. Prefer to mark the pointee type as const whenever possible, and your compiler will object to modification attempts.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
void xyz(char const **value)
{
*value = "abc";
}
int main(void)
{
char const *s1;
xyz(&s1);
printf("s1 : %s \n", s1);
s1[0] = 'a'; // <-- compile error on this line: s1 points to const char
}
Your program works fine because a string literal such as "abc" is a character array. When you assign a string literal to a pointer, the compiler creates that array and the expression yields the address of its first element, just as the name of any other array does.
So in your program you have passed the address of a char pointer to the function xyz, and
*value = "abc";
stores the address of that array in the char pointer. It is worth knowing that compilers typically place the array in read-only memory. Any attempt to modify it through the pointer is undefined behavior, which often shows up as a crash at run time; you only get a compile-time error if the pointer is declared const char *.
You can define a string in C with char *str = "some string";, str is a pointer which points to the location of the first letter in a string.
The two following programs are similar, but the first uses a structure and the second does not.
Why does this code work (with no warnings)?
#include <stdio.h>
#include <string.h>
struct prova
{
char *stringa;
};
int main()
{
struct prova p;
strcpy (p.stringa, "example\0");
printf("%s\n", p.stringa);
return 0;
}
But the following code doesn't work?
Segmentation fault (core dumped)
With this warning:
code.c: In function ‘main’:
code.c:8:9: warning: ‘stringa’ is used uninitialized in this function [-Wuninitialized]
strcpy (stringa, "example\0");
#include <stdio.h>
#include <string.h>
int main()
{
char *stringa;
strcpy (stringa, "example\0");
printf("%s\n", stringa);
return 0;
}
Thank you!
Neither is correct because you copy to an address specified by an uninitialized variable. Therefore both programs invoke undefined behaviour.
The fact that one of the programs works is down to pure chance. One possible form of undefined behaviour is that your program runs correctly.
You need to initialize the pointer to refer to a sufficiently sized block of memory. For instance:
char *stringa = malloc(8);
Note that you do not need to add a null terminator to a string literal. That is implicit. So, given this memory allocation you can then write:
strcpy(stringa, "example");
You need to give the string some memory to copy the characters into.
Use malloc.
Besides, the first example is not correct either, even if it compiles and happens to run.
When you write
struct prova
{
char *stringa;
};
int main()
{
struct prova p;
strcpy (p.stringa, "example\0");
Notice that p.stringa points nowhere in particular, yet you copy to it.
The following C program attempts to fetch and print the host name of the current RHEL host. It throws a segmentation fault on this machine. As per the definition of gethostname I should be able to pass a char pointer, shouldn't I?
When I use a char array instead (like char hname[255]), the call to gethostname works. (If I did this how would I return the array to main?)
#include <stdio.h>
#include <unistd.h>
char * fetchHostname()
{
// using "char hname[255]" gets me around the issue;
// however, I dont understand why I'm unable to use
// a char pointer.
char *hname;
gethostname(hname, 255 );
return hname;
}
int main()
{
char *hostname = fetchHostname();
return 0;
}
Output:
pmn@rhel /tmp/temp > gcc -g test.c -o test
pmn@rhel /tmp/temp >
pmn@rhel /tmp/temp > ./test
Segmentation fault
pmn@rhel /tmp/temp >
As the gethostname man page says:
The gethostname() function shall return the standard host name for
the current machine. The namelen argument shall
specify the size of the array pointed to by the name argument.
The returned name shall be null-terminated, except that
if namelen is an insufficient length to hold the host name,
then the returned name shall be truncated and it is
unspecified whether the returned name is null-terminated.
You need a place to store the name the function returns, so declare hostname as an array, not a pointer.
#include <limits.h> // HOST_NAME_MAX
#include <unistd.h>
char * fetchHostname(char *hostname, size_t size)
{
// The caller owns the buffer; gethostname only fills it in.
gethostname(hostname, size);
return hostname;
}
int main()
{
char hostname[HOST_NAME_MAX + 1];
fetchHostname(hostname, sizeof hostname);
return 0;
}
When I use a char array instead (like char hname[255]), the call to
gethostname works. (If I did this how would I return the array to
main?)
By passing a pointer to the array from main() to your function. Note that this approach makes your function fetchHostname() just a wrapper around gethostname():
#include <stdio.h>
#include <unistd.h>
void fetchHostname(char *hname)
{
gethostname(hname,255);
}
int main()
{
char hostname[256];
fetchHostname(hostname);
return 0;
}
Or by declaring your hname[] array local static, so it is valid even after the program leaves the function (this approach is not thread-safe):
#include <stdio.h>
#include <unistd.h>
char *fetchHostname (void)
{
static char hname[256];
gethostname(hname,255);
return hname;
}
int main()
{
char *hostname;
hostname = fetchHostname();
return 0;
}
Though there are many technically correct answers I don't think that they actually explain to you where you went wrong.
gethostname(char *name, size_t len); is documented here and it basically says that the parameter name is an array of characters that you have allocated where this function will copy the hostname into. And how you do that is explained in the many other wonderful answers here.
That is why this works if you make the array yourself but causes a segmentation fault if you just give it a pointer.
Also, you were giving this function an uninitialized pointer so when it tried to copy data to that address (which is just some completely random place in memory) it caused your program to terminate because it was trying to write a string where it isn't allowed to.
This type of mistake tells me that you need to brush up on what pointers actually are and that you need to understand that they can be used to allow a function to return a value in a different way than using the return statement.
Update: please see @Tio Pepe's answer.
You need to allocate space in your array (either statically or dynamically):
char hname[HOST_NAME_MAX + 1];
otherwise you are passing an uninitialised pointer that could point anywhere.
Your call to gethostname():
gethostname(hname, 255);
is a contract that says here is a pointer that points to at least 255 characters of allocated space.
Also, you are trying to return a pointer to space allocated on the stack. That's not good.
You need to dynamically allocate the character array if you want to return it.
char * hname;
hname = malloc((HOST_NAME_MAX +1) * sizeof(char));
But be aware that you now have to manage when that space gets freed.
Using a pointer to a local variable after the function has returned is undefined behavior, because the variable's lifetime has ended. If you allocate the buffer on the heap (via dynamic allocation), you will not have this problem.
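#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
char * fetchHostname()
{
char* hname= malloc(sizeof(char) * 256);
if (hname == NULL)
return NULL;
gethostname(hname, 255);
hname[255] = '\0'; // gethostname may not terminate a truncated name
return hname;
}
int main()
{
char *hostname = fetchHostname();
if (hostname != NULL)
printf("%s\n", hostname); // never pass data as the format string
free(hostname);
return 0;
}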
char *hname; //defined inside a function and would be destroyed after the function is executed
After fetchHostname() returns, the address stored in hostname is not valid, and accessing it can result in a segmentation fault.
Can I initialize string after declaration?
char *s;
s = "test";
instead of
char *s = "test";
You can, but keep in mind that with those statements you are storing in s a pointer to a read-only string allocated elsewhere. Any attempt to modify it results in undefined behavior (i.e., on some compilers it may work, but often it will just crash). That's why you usually use a const char * for this.
Yes, you can.
#include <stdio.h>
int
main(void)
{
// `s' is a pointer to `const char' because `s' may point to a string which
// is in read-only memory.
char const *s;
s = "hello";
puts(s);
return 0;
}
NB: It doesn't work with arrays.
#include <stdio.h>
int
main(void)
{
char s[32];
s = "hello"; // Error: an array cannot be assigned to.
puts(s);
return 0;
}
It is correct for pointers (as mentioned above) because the string inside the quotes is allocated by the compiler at compile time, so you can point to its memory address. The problems come when you try to change its contents, or when you have a fixed-size array and try to make it point there.