I am confused about some basics in C string declaration. I tried out the following code and I noticed some difference:
char* foo(){
char* str1="string";
char str2[7]="string";
char* str3=(char)malloc(sizeof(char)*7);
return str1;
/* OR: return str2; */
/* OR: return str3; */
}
void main() {
printf("%s",foo());
return 0;
}
I made foo() return str1/2/3 one at a time, and tried to print the result in the main. str2 returned something weird, but str1 and str3 returned the actual "string".
1.Now, what's the difference between the three declarations? I think the reason why str2 didn't work is because it is declared as a local variable, is that correct?
2.Then what about str1? If the result remains after the foo() ended, wouldn't that cause memory leak?
3.I'm simply trying to write a function that returns a string in C, and use the value returned by that function for other stuff, which str declaration above should I use?
Thanks in advance!
char* str1="string";
This makes str1 a pointer; it points to the first character of the string literal. You should define it as const, because you're not allowed to modify a string literal:
const char *str1 = "string";
...
char str2[7]="string";
This makes str2 an array of char (not a pointer), and copies the contents of the string literal into it. There's no need to define it as const; the array itself is writable. You can also omit the size and let it be determined by the initializer:
char str2[] = "string";
Then sizeof str2 == 7 (6 bytes for "string" plus 1 for the terminating '\0').
This:
char* str3=(char)malloc(sizeof(char)*7);
is written incorrectly, and it shouldn't even compile; at the very least, you should have gotten a warning from your compiler. You're casting the result of malloc() to type char. You should be converting it to char*:
char *str3 = (char*)malloc(sizeof(char) * 7);
But the cast is unnecessary, and can mask errors in some cases; see question 7.7 and following in the comp.lang.c FAQ:
char *str3 = malloc(sizeof(char) * 7);
But sizeof(char) is 1 by definition, so you can just write:
char *str3 = malloc(7);
malloc() allocates memory, but it doesn't initialize it, so if you try to print the string that str3 points to, you'll get garbage -- or even a run-time crash if the allocated space doesn't happen to contain a terminating null character '\0'. You can initialize it with strcpy(), for example:
char *str3 = malloc(7);
if (str3 == NULL) {
fprintf(stderr, "malloc failed\n");
exit(EXIT_FAILURE);
}
strcpy(str3, "string");
You have to be very careful that the data you're copying is no bigger than the allocated space. (No, `strncpy() is not the answer to this problem.)
void main() is incorrect; it should be int main(void). If your textbook told you to use void main() find a better textbook; its author doesn't know C very well.
And you need appropriate #include directives for any library functions you're using: <stdio.h> for printf(), <stdlib.h> for exit() and malloc(), and <string.h> for strcpy(). The documentation for each function should tell you which header to include.
I know this is a lot to absorb; don't expect to understand it all right away.
I mentioned the comp.lang.c FAQ; it's an excellent resource, particularly section 6, which discusses arrays and pointers and the often confusing relationship between them.
As for your question 3, how to return a string from a C function, that turns out to be surprisingly complicated because of the way C does memory allocation (basically it leaves to to manage it yourself). You can't safely return a pointer to a local variable, because the variable ceases to exist when the function returns, leaving the caller with a dangling pointer, so returning your str2 is dangerous. Returning a string literal is ok, since that corresponds to an anonymous array that exists for the entire execution of your program. You can declare an array with static and return a pointer to it, or you can use malloc() (which is the most flexible approach, but it means the caller needs to free() the memory), or you can require the caller to pass in a pointer to a buffer into which your function will copy the result.
Some languages let you build a string value and simply return it from a function. C, as you're now discovering, is not one of those languages.
char* str1="string";
This creates a pointer to a literal string that will be located on either .data or .text segments and is accessible at all times. Whenever you do something like that, be sure to declare it const, because if you try to modify it, nasty things might happen.
char str2[7]="string";
This creates a local buffer on the stack with a copy of the literal string. It becomes unavailable once the function returns. That explains the weird result you're getting.
char* str3=(char)malloc(sizeof(char)*7);
This creates a buffer on the heap (uninitialized) that will be available until you free it. And free it you must, or you will get a memory leak.
The string literal "string" is stored as a 7-element array of char with static extent, meaning that the memory for it is allocated at program startup and held until the program terminates.
The declaration
char *str1 = "string";
assigns the address of the string literal to str1. Even though the variable str1 ceases to exist when the function exits, its value (the address of the literal "string") is still valid outside of the function.
The declaration
char str2[7] = "string";
declares str2 as an array of char, and copies the contents of the string literal to it. When the function exits, str2 ceases to exist, and its contents are no longer meaningful.
The declaration
char *str3 = (char*) malloc(sizeof(char) * 7);
which can be simplified to
char *str3 = malloc(sizeof *str3 * 7);
allocates an uninitialized 7-byte block of memory and copies its address to str3. When the function exits, the variable str3 ceases to exist, but the memory it points to is still allocated. As written, this is a memory leak, because you don't preserve the value of the pointer in the calling code. Note that, since you don't copy anything to this block, the output in main will be random.
Also, unless your compiler documentation explicitly lists void main() as a legal signature for the main function, use int main(void) instead.
Related
I'm trying to understand how this memory allocation thing works when it comes to char and strings.
I know that the name of a declared array is just like a pointer to the first element of the array, but that array will live in the stack of the memory.
On the other hand, we use malloc when we want to use the heap of the memory, but I found that you can just initialize a char pointer and assign a string to it on the very same declaration line, so I have some questions regarding this matter:
1) If I just initialize a char pointer and assign a string to it, where is this information living? In the heap? In the stack?
Ex: char *pointer = "Hello world";
2) I tried to use malloc to initialize my char pointer and use it later to assign a string to it, but I can't compile it, I get error, what is wrong with this logic? this is what I'm trying to do:
char *pointer = malloc(sizeof(char) * 12);
*pointer = "Hello world";
Can you please help me understand more about pointers and memory allocation? Thank you very much! :)
1) If I just initialize a char pointer and assign a string to it,
where is this information living? In the heap? In the stack? Ex: char
*pointer = "Hello world";
Neither on the stack or on the heap. "Hello world" is a string literal, generally created in the rodata (read-only data) segment of the executable. In fact, unless you specify differently, the compiler is free to only store a single copy of "Hello world" even if you assign it to multiple pointers. While typically you cannot assign strings to pointers, since this is a string literal, you are actually assigning the address for the literal itself -- which is the only reason that works. Otherwise, as P__J__ notes, you must copy strings from one location to another.
2) I tried to use malloc to initialize my char pointer and use it
later to assign a string to it, but I can't compile it, I get error,
what is wrong with this logic? this is what I'm trying to do:
char *pointer = malloc(sizeof(char) * 12);
*pointer = "Hello world";
You are mixing apples and oranges here. char *pointer = malloc (12); allocates storage for 12-characters (bytes) and then assigns the beginning address for that new block of storage to pointer as its value. (remember, a pointer is just a normal variable that holds the address to something else as its value)
For every allocation, there must be a validation that the call succeeded, or you will need to handle the failure. Allocation can, and does fail, and when it fails, malloc, calloc * realloc all return NULL. So each time you allocate, you
char *pointer = malloc(12); /* note: sizeof (char) is always 1 */
if (pointer == NULL) { /* you VALIDATE each allocation */
perror ("malloc-pointer");
return 1;
}
Continuing with your case above, you have allocated 12-bytes and assigned the starting address for the new memory block to pointer. Then, you inexplicably, derefernce pointer (e.g. *pointer which now has type char) and attempt to assign the address of the string literal as that character.
*pointer = "Hello world"; /* (invalid conversion between pointer and `char`) */
What you look like you want to do is to copy "Hello world" to the new block of memory held by pointer. To do so, since you already know "Hello world" is 12-characters (including the nul-terminating character), you can simply:
memcpy (pointer, "Hello world", 12);
(note: if you already have the length, there is no need to call strcpy and cause it to scan for the end-of-string, again)
Now your new allocated block of memory contains "Hello world", and the memory is mutable, so you can change any of the characters you like.
Since you have allocated the storage, it is up to you to free (pointer); when that memory is no longer in use.
That in a nutshell is the difference between assigning the address of a string literal to a pointer, or allocating storage and assigning the first address in the block of new storage to your pointer, and then copying anything you like to that new block (so long as you remain within the allocated bounds of the memory block allocated).
Look things over and let me know if you have further questions.
1) If I just initialize a char pointer and assign a string to it, where is this information living? In the heap? In the stack? Ex: char *pointer = "Hello world";
The string literal "Hello world" has static storage duration. This means it exists somewhere in memory accessible to your program, for as long as your program is running.
The exact place where it is located is implementation specific.
2) I tried to use malloc to initialize my char pointer and use it later to assign a string to it, but I can't compile it, I get error, what is wrong with this logic? this is what I'm trying to do:
char *pointer = malloc(sizeof(char) * 12);
*pointer = "Hello world";
This doesn't work, since *pointer is the first char in the dynnamically allocated memory (returned by malloc()). The string literal "Hello world" is represented as an array of characters. An array of characters cannot be stored in a single char.
What you actually need to do in this case is copy data from the string literal to the dynamically allocated memory.
char *pointer = malloc(sizeof(char) * 12);
strcpy(pointer, "Hello world"); /* strcpy() is declared in standard header <string.h> *
Note that this doesn't change the data that represents the string literal. It copies the data in that string literal into the memory pointed at by pointer (and allocated dynamically by malloc()).
If you really want pointer to point at the (first character of) the string literal, then do
const char *pointer = "Hello world";
The const represents the fact that modifying a string literal gives undefined behaviour (and means the above prevents using pointer to modify that string literal).
If you want to write really bad code you could also do
char *pointer = "Hello world"; /* Danger Will Robinson !!!! */
or (same net effect)
char *pointer;
pointer = "Hello world"; /* Danger Will Robinson !!!! */
This pointer can now be used to modify contents of the string literal - but that causes undefined behaviour. Most compilers (if appropriately configured) will give warnings about this - which is one of many hints that you should not do this.
Long question but the answer is very easy. Firstly you need to understand what the pointer is.
*pointer = "Hello world";
Here you try to assign a pointer to the char. If you remove the * then you will assign the pointer to the string literal to the pointer.
Unless you have overloaded the assign operator it will not copy it to the allocated memory. Your malloc is pointless as you assign new value to the pointer and the memory allocated by the malloc is lost
You need to strcpy it instead
where is this information living?
It is implementation dependant where the string literals are stored. It can be anywhere. Remember that you can't modify the string literals. Attempt to do so is an Undefined Behaviour
I know that the name of a declared array is just like a pointer to the first element of the array,…
This is not correct. An array behaves similarly to a pointer in many expressions (because it is automatically converted), but it is not just like a pointer to the first element. When used as the operand of sizeof or unary &, an array will be an array; it will not be converted to a pointer. Additionally, an array cannot be assigned just like a pointer can; you cannot assign a value to an array.
… but that array will live in the stack of the memory.
The storage of an array depends on its definition. If an object is defined outside of any function, it has static storage duration, meaning it exists for the entire execution of the program. If it is defined inside a function without _Thread_local or static has automatic storage duration, meaning it exists until execution of its associated block of code ends. C implementations overwhelmingly use the stack for objects of automatic storage duration (neglecting the fact that optimization can often make use of the stack unnecessary), but alternatives are possible.
On the other hand, we use malloc when we want to use the heap of the memory, but I found that you can just initialize a char pointer and assign a string to it on the very same declaration line, so I have some questions regarding this matter:
1) If I just initialize a char pointer and assign a string to it, where is this information living? In the heap? In the stack? Ex: char *pointer = "Hello world";
For the string literal "Hello world", the C implementation creates an array of characters with static storage duration, the same as if you had written char MyString[12]; or int x; at file scope. Assuming this array is not eliminated or otherwise altered by optimization, it is in some general area of memory that the C implementation uses for data built into the program (possibly a “.rodata” or similar read-only section).
Then, for char *pointer, the C implementation creates a pointer. If this definition appears outside a function, it has static storage duration, and the C implementation uses some general area of memory for that. If it appears inside a function, it has automatic storage duration, and the C implementation likely uses stack space for it.
Then, for the = "Hello world", the C implementation uses the array to initialize pointer. To do this, it converts the array to a pointer to its first element, and it uses that pointer as the initial value of pointer.
2) I tried to use malloc to initialize my char pointer and use it later to assign a string to it, but I can't compile it, I get error, what is wrong with this logic? this is what I'm trying to do:
char *pointer = malloc(sizeof(char) * 12);
*pointer = "Hello world";
The second line is wrong because *pointer has a different role in a declaration than it does in an expression statement.
In char *pointer = …;, *pointer presents a “picture” of what should be a char. This is how declarations work in C. It says *pointer is a char, and therefore pointer is a pointer to a char. However, the thing being defined and initialized is not *pointer but is pointer.
In contrast, in *pointer = "Hello world";, *pointer is an expression. It takes the pointer pointer and applies the * operator to it. Since pointer is a pointer to char, *pointer is a char. Then *pointer = "Hello world"; attempts to assign "Hello world" to a char. This is an error because "Hello world" is not a char.
What you are attempting to do is to assign a pointer to "Hello world" to pointer. (More properly, a pointer to the first character of "Hello world".) To do this, use:
pointer = "Hello world";
I am sorry, I might me asking a dumb question but I want to understand is there any difference in the below assignments? strcpy works in the first case but not in the second case.
char *str1;
*str1 = "Hello";
char *str2 = "World";
strcpy(str1,str2); //Works as expected
char *str1 = "Hello";
char *str2 = "World";
strcpy(str1,str2); //SEGMENTATION FAULT
How does compiler understand each assignment?Please Clarify.
Edit: In the first snippet you wrote *str1 = "Hello" which is equivalent to assigning to str[0], which is obviously wrong, because str1 is uninitialized and therefore is an invalid pointer. If we assume that you meant str1 = "Hello", then you are still wrong:
According to C specs, Attempting to modify a string literal results in undefined behavior: they may be stored in read-only storage (such as .rodata) or combined with other string literals so both snippets that you provided will yield undefined behavior.
I can only guess that in the second snippet the compiler is storing the string in some read-only storage, while in the first one it doesn't, so it works, but it's not guaranteed.
Sorry, both examples are very wrong and lead to undefined behaviour, that might or might not crash. Let me try to explain why:
str1 is a dangling pointer. That means str1 points to somewhere in your memory, writing to str1 can have arbitrary consequences. For example a crash or overriding some data in memory (eg. other local variables, variables in other functions, everything is possible)
The line *str1 = "Hello"; is also wrong (even if str1 were a valid pointer) as *str1 has type char (not char *) and is the first character of str1 which is dangling. However, you assign it a pointer ("Hello", type char *) which is a type error that your compiler will tell you about
str2 is a valid pointer but presumably points to read-only memory (hence the crash). Normally, constant strings are stored in read-only data in the binary, you cannot write to them, but that's exactly what you do in strcpy(str1,str2);.
A more correct example of what you want to achieve might be (with an array on the stack):
#define STR1_LEN 128
char str1[STR1_LEN] = "Hello"; /* array with space for 128 characters */
char *str2 = "World";
strncpy(str1, str2, STR1_LEN);
str1[STR1_LEN - 1] = 0; /* be sure to terminate str1 */
Other option (with dynamically managed memory):
#define STR1_LEN 128
char *str1 = malloc(STR1_LEN); /* allocate dynamic memory for str1 */
char *str2 = "World";
/* we should check here that str1 is not NULL, which would mean 'out of memory' */
strncpy(str1, str2, STR1_LEN);
str1[STR1_LEN - 1] = 0; /* be sure to terminate str1 */
free(str1); /* free the memory for str1 */
str1 = NULL;
EDIT: #chqrlie requested in the comments that the #define should be named STR1_SIZE not STR1_LEN. Presumably to reduce confusion because it's not the length in characters of the "string" but the length/size of the buffer allocated. Furthermore, #chqrlie requested not to give examples with the strncpy function. That wasn't really my choice as the OP used strcpy which is very dangerous so I picked the closest function that can be used correctly. But yes, I should probably have added, that the use of strcpy, strncpy, and similar functions is not recommended.
There seems to be some confusion here. Both fragments invoke undefined behaviour. Let me explain why:
char *str1; defines a pointer to characters, but it is uninitialized. It this definition occurs in the body of a function, its value is invalid. If this definition occurs at the global level, it is initialized to NULL.
*str1 = "Hello"; is an error: you are assigning a string pointer to the character pointed to by str1. str1 is uninitialized, so it does not point to anything valid, and you channot assign a pointer to a character. You should have written str1 = "Hello";. Furthermore, the string "Hello" is constant, so the definition of str1 really should be const char *str1;.
char *str2 = "World"; Here you define a pointer to a constant string "World". This statement is correct, but it would be better to define str2 as const char *str2 = "World"; for the same reason as above.
strcpy(str1,str2); //Works as expected NO it does not work at all! str1 does not point to a char array large enough to hold a copy of the string "World" including the final '\0'. Given the circumstances, this code invokes undefined behaviour, which may or may not cause a crash.
You mention the code works as expected: it only does no in appearance: what really happens is this: str1 is uninitialized, if it pointed to an area of memory that cannot be written, writing to it would likely have crashed the program with a segmentation fault; but if it happens to point to an area of memory where you can write, and the next statement *str1 = "Hello"; will modify the first byte of this area, then strcpy(str1, "World"); will modify the first 6 bytes at that place. The string pointed to by str1 will then be "World", as expected, but you have overwritten some area of memory that may be used for other purposes your program may consequently crash later in unexpected ways, a very hard to find bug! This is definitely undefined behaviour.
The second fragment invokes undefined behaviour for a different reason:
char *str1 = "Hello"; No problem, but should be const.
char *str2 = "World"; OK too, but should also be const.
strcpy(str1,str2); //SEGMENTATION FAULT of course it is invalid: you are trying to overwrite the constant character string "Hello" with the characters from the string "World". It would work if the string constant was stored in modifiable memory, and would cause even greater confusion later in the program as the value of the string constant was changed. Luckily, most modern environemnts prevent this by storing string constants in a read only memory. Trying to modify said memory causes a segment violation, ie: you are accessing the data segment of memory in a faulty way.
You should use strcpy() only to copy strings to character arrays you define as char buffer[SOME_SIZE]; or allocate as char *buffer = malloc(SOME_SIZE); with SOME_SIZE large enough to hold what you are trying to copy plus the final '\0'
Both code are wrong, even if "it works" in your first case. Hopefully this is only an academic question! :)
First let's look at *str1 which you are trying to modify.
char *str1;
This declares a dangling pointer, that is a pointer with the value of some unspecified address in the memory. Here the program is simple there is no important stuff, but you could have modified very critical data here!
char *str = "Hello";
This declares a pointer which will point to a protected section of the memory that even the program itself cannot change during execution, this is what a segmentation fault means.
To use strcpy(), the first parameter should be a char array dynamically allocated with malloc(). If fact, don't use strcpy(), learn to use strncpy() instead because it is safer.
Following are some basic questions that I have with respect to strings in C.
If string literals are stored in read-only data segment and cannot be changed after initialisation, then what is the difference between the following two initialisations.
char *string = "Hello world";
const char *string = "Hello world";
When we dynamically allocate memory for strings, I see the following allocation is capable enough to hold a string of arbitary length.Though this allocation work, I undersand/beleive that it is always good practice to allocate the actual size of actual string rather than the size of data type.Please guide on proper usage of dynamic allocation for strings.
char *str = (char *)malloc(sizeof(char));
scanf("%s",str);
printf("%s\n",str);
1.what is the difference between the following two initialisations.
The difference is the compilation and runtime checking of the error as others already told about this.
char *string = "Hello world";--->stored in read only data segment and
can't be changed,but if you change the value then compiler won't give
any error it only comes at runtime.
const char *string = "Hello world";--->This is also stored in the read
only data segment with a compile time checking as it is declared as
const so if you are changing the value of string then you will get an
error at compile time ,which is far better than a failure at run time.
2.Please guide on proper usage of dynamic allocation for strings.
char *str = (char *)malloc(sizeof(char));
scanf("%s",str);
printf("%s\n",str);
This code may work some time but not always.The problem comes at run-time when you will get a segmentation fault,as you are accessing the area of memory which is not own by your program.You should always very careful in this dynamic memory allocation as it will leads to very dangerous error at run time.
You should always allocate the amount of memory you need correctly.
The most error comes during the use of string.You should always keep in mind that there is a '\0' character present at last of the string and during the allocation its your responsibility to allocate memory for this.
Hope this helps.
what is the difference between the following two initialisations.
String literals have type char* for legacy reasons. It's good practice to only point to them via const char*, because it's not allowed to modify them.
I see the following allocation is capable enough to hold a string of arbitary length.
Wrong. This allocation only allocates memory for one character. If you tried to write more than one byte into string, you'd have a buffer overflow.
The proper way to dynamically allocate memory is simply:
char *string = malloc(your_desired_max_length);
Explicit cast is redundant here and sizeof(char) is 1 by definition.
Also: Remember that the string terminator (0) has to fit into the string too.
In the first case, you're explicitly casting the char* to a const one, meaning you're disallowing changes, at the compiler level, to the characters behind it. In C, it's actually undefined behaviour (at runtime) to try and modify those characters regardless of their const-ness, but a string literal is a char *, not a const char *.
In the second case, I see two problems.
The first is that you should never cast the return value from malloc since it can mask certain errors (especially on systems where pointers and integers are different sizes). Specifically, unless there is an active malloc prototype in place, a compiler may assume that it returns an int rather than the correct void *.
So, if you forget to include stdlib.h, you may experience some funny behaviour that the compiler couldn't warn you about, because you told it with an explicit cast that you knew what you were doing.
C is perfectly capable of implicit casting between the void * returned from malloc and any other pointer type.
The second problem is that it only allocates space for one character, which will be the terminating null for a string.
It would be better written as:
char *string = malloc (max_str_size + 1);
(and don't ever multiply by sizeof(char), that's a waste of time - it's always 1).
The difference between the two declarations is that the compiler will produce an error (which is much preferable to a runtime failure) if an attempt to modify the string literal is made via the const char* declared pointer. The following code:
const char* s = "hello"; /* 's' is a pointer to 'const char'. */
*s = 'a';
results in the VC2010 emitted the following error:
error C2166: l-value specifies const object
An attempt to modify the string literal made via the char* declared pointer won't be detected until runtime (VC2010 emits no error), the behaviour of which is undefined.
When malloc()ing memory for storing of strings you must remember to allocate one extra char for storing the null terminator as all (or nearly all) C string handling functions require the null terminator. For example, to allocate a buffer for storing "hello":
char* p = malloc(6); /* 5 for "hello" and 1 for null terminator. */
sizeof(char) is guaranteed to be 1 so is unrequired and it is not necessary to cast the return value of malloc(). When p is no longer required remember to free() the allocated memory:
free(p);
Difference between the following two initialisations.
first, char *string = "Hello world";
- "Hello world" stored in stack segment as constant string and its address is assigned to pointer'string' variable.
"Hello world" is constant. And you can't do string[5]='g', and doing this will cause a segmentation fault.
Where as 'string' variable itself is not constant. And you can change its binding:
string= "Some other string"; //this is correct, no segmentation fault
const char *string = "Hello world";
Again "Hello world" stored in stack segment as constant string and its address is assigned to 'string' variable.
And string[5]='g', and this cause segmentation fault.
No use of const keyword here!
Now,
char *string = (char *)malloc(sizeof(char));
Above declaration same as first one but this time you are assignment is dynamic from Heap segment (not from stack)
The code:
char *string = (char *)malloc(sizeof(char));
Will not hold a string of arbitrary length. It will allocate a single character and return a pointer to char character. Note that a pointer to a character and a pointer to what you call a string are the same thing.
To allocate space for a string you must do something like this:
char *data="Hello, world";
char *copy=(char*)malloc(strlen(data)+1);
strcpy(copy,data);
You need to tell malloc exactly how many bytes to allocate. The +1 is for the null terminator that needs to go onto the end.
As for literal string being stored in a read-only segment, this is an implementation issue, although is pretty much always the case. Most C compilers are pretty relaxed about const'ing access to these strings, but attempting to modify them is asking for trouble, so you should always declare them const char * to avoid any issues.
That particular allocation may appear to work as there's probably plenty of space in the program heap, but it doesn't. You can verify it by allocating two "arbitrary" strings with the proposed method and memcpy:ing some long enough string to the corresponding addresses. In the best case you see garbage, in the worst case you'll have segmentation fault or assert from malloc or free.
No guides I've seen seem to explain this very well.
I mean, you can allocate memory for a char*, or write char[25] instead? What's the difference? And then there are literals, which can't be manipulated? What if you want to assign a fixed string to a variable? Like, stringVariable = "thisIsALiteral", then how do you manipulate it afterwards?
Can someone set the record straight here? And in the last case, with the literal, how do you take care of null-termination? I find this very confusing.
EDIT: The real problem seems to be that as I understand it, you have to juggle these different constructs in order to accomplish even simple things. For instance, only char * can be passed as an argument or return value, but only char[] can be assigned a literal and modified. I feel like it's obvious that we frequently/always needs to be able to do both, and that's where my pitfall is.
What is the difference between an allocated char* and char[25]?
The lifetime of a malloc-ed string is not limited by the scope of its declaration. In plain language, you can return malloc-ed string from a function; you cannot do the same with char[25] allocated in the automatic storage, because its memory will be reclaimed upon return from the function.
Can literals be manipulated?
String literals cannot be manipulated in place, because they are allocated in read-only storage. You need to copy them into a modifiable space, such as static, automatic, or dynamic one, in order to manipulate them. This cannot be done:
char *str = "hello";
str[0] = 'H'; // <<== WRONG! This is undefined behavior.
This will work:
char str[] = "hello";
str[0] = 'H'; // <<=== This is OK
This works too:
char *str = malloc(6);
strcpy(str, "hello");
str[0] = 'H'; // <<=== This is OK too
How do you take care of null termination of string literals?
C compiler takes care of null termination for you: all string literals have an extra character at the end, filled with \0.
Your question refers to three different constructs in C: char arrays, char pointers allocated on the heap, and string literals. These are all different is subtle ways.
Char arrays, which you get by declaring char foo[25] inside a function, that memory is allocated on the stack, it exists only within the scope you declared it, but exactly 25 bytes have been allocated for you. You may store whatever you want in those bytes, but if you want a string, don't forget to use the last byte to null-terminate it.
Character pointers defined with char *bar only hold a pointer to some unallocated memory. To make use of them you need to point them to something, either an array as before (bar = foo) or allocate space bar = malloc(sizeof(char) * 25);. If you do the latter, you should eventually free the space.
String literals behave differently depending on how you use them. If you use them to initialize a char array char s[] = "String"; then you're simply declaring an array large enough to exactly hold that string (and the null terminator) and putting that string there. It's the same as declaring a char array and then filling it up.
On the other hand, if you assign a string literal to a char * then the pointer is pointing to memory you are not supposed to modify. Attempting to modify it may or may not crash, and leads to undefined behavior, which means you shouldn't do it.
Since other aspects are answered already, i would only add to the question "what if you want the flexibility of function passing using char * but modifiability of char []"
You can allocate an array and pass the same array to a function as char *. This is called pass by reference and internally only passes the address of actual array (precisely address of first element) instead of copying the whole. The other effect is that any change made inside the function modifies the original array.
void fun(char *a) {
a[0] = 'y'; // changes hello to yello
}
main() {
char arr[6] = "hello"; // Note that its not char * arr
fun(arr); // arr now contains yello
}
The same could have been done for an array allocated with malloc
char * arr = malloc(6);
strcpy(arr, "hello");
fun(arr); // note that fun remains same.
Latter you can free the malloc memory
free(arr);
char * a, is just a pointer that can store address, which might be of a single variable or might be the first element of an array. Be ware, we have to assign to this pointer before actually using it.
Contrary to that char arr[SIZE] creates an array on the stack i.e. it also allocates SIZE bytes. So you can directly access arr[3] (assuming 3 is less than SIZE) without any issues.
Now it makes sense to allow assigning any address to a, but not allowing this for arr, since there is no other way except using arr to access its memory.
I have written this code which is simple
#include <stdio.h>
#include <string.h>
void printLastLetter(char **str)
{
printf("%c\n",*(*str + strlen(*str) - 1));
printf("%c\n",**(str + strlen(*str) - 1));
}
int main()
{
char *str = "1234556";
printLastLetter(&str);
return 1;
}
Now, if I want to print the last char in a string I know the first line of printLastLetter is the right line of code. What I don't fully understand is what the difference is between *str and **str. The first one is an array of characters, and the second??
Also, what is the difference in memory allocation between char *str and str[10]?
Thnks
char* is a pointer to char, char ** is a pointer to a pointer to char.
char *ptr; does NOT allocate memory for characters, it allocates memory for a pointer to char.
char arr[10]; allocates 10 characters and arr holds the address of the first character. (though arr is NOT a pointer (not char *) but of type char[10])
For demonstration: char *str = "1234556"; is like:
char *str; // allocate a space for char pointer on the stack
str = "1234556"; // assign the address of the string literal "1234556" to str
As #Oli Charlesworth commented, if you use a pointer to a constant string, such as in the above example, you should declare the pointer as const - const char *str = "1234556"; so if you try to modify it, which is not allowed, you will get a compile-time error and not a run-time access violation error, such as segmentation fault. If you're not familiar with that, please look here.
Also see the explanation in the FAQ of newsgroup comp.lang.c.
char **x is a pointer to a pointer, which is useful when you want to modify an existing pointer outside of its scope (say, within a function call).
This is important because C is pass by copy, so to modify a pointer within another function, you have to pass the address of the pointer and use a pointer to the pointer like so:
void modify(char **s)
{
free(*s); // free the old array
*s = malloc(10); // allocate a new array of 10 chars
}
int main()
{
char *s = malloc(5); // s points to an array of 5 chars
modify(&s); // s now points to a new array of 10 chars
free(s);
}
You can also use char ** to store an array of strings. However, if you dynamically allocate everything, remember to keep track of how long the array of strings is so you can loop through each element and free it.
As for your last question, char *str; simply declares a pointer with no memory allocated to it, whereas char str[10]; allocates an array of 10 chars on the local stack. The local array will disappear once it goes out of scope though, which is why if you want to return a string from a function, you want to use a pointer with dynamically allocated (malloc'd) memory.
Also, char *str = "Some string constant"; is also a pointer to a string constant. String constants are stored in the global data section of your compiled program and can't be modified. You don't have to allocate memory for them because they're compiled/hardcoded into your program, so they already take up memory.
The first one is an array of characters, and the second??
The second is a pointer to your array. Since you pass the adress of str and not the pointer (str) itself you need this to derefence.
printLastLetter( str );
and
printf("%c\n",*(str + strlen(str) - 1));
makes more sense unless you need to change the value of str.
You might care to study this minor variation of your program (the function printLastLetter() is unchanged except that it is made static), and work out why the output is:
3
X
The output is fully deterministic - but only because I carefully set up the list variable so that it would be deterministic.
#include <stdio.h>
#include <string.h>
static void printLastLetter(char **str)
{
printf("%c\n", *(*str + strlen(*str) - 1));
printf("%c\n", **(str + strlen(*str) - 1));
}
int main(void)
{
char *list[] = { "123", "abc", "XYZ" };
printLastLetter(list);
return 0;
}
char** is for a string of strings basically - an array of character arrays. If you want to pass multiple character array arguments you can use this assuming they're allocated correctly.
char **x;
*x would dereference and give you the first character array allocated in x.
**x would dereference that character array giving you the first character in the array.
**str is nothing else than (*str)[0] and the difference between *str and str[10] (in the declaration, I assume) I think is, that the former is just a pointer pointing to a constant string literal that may be stored somewhere in global static memory, whereas the latter allocates 10 byte of memory on the stack where the literal is stored into.
char * is a pointer to a memory location. for char * str="123456"; this is the first character of a string. The "" are just a convenient way of entering an array of character values.
str[10] is a way of reserving 10 characters in memory without saying what they are.(nb Since the last character is a NULL this can actually only hold 9 letters. When a function takes a * parameter you can use a [] parameter but not the other way round.
You are making it unnecessarily complicated by taking the address of str before using it as a parameter. In C you often pass the address of an object to a function because it is a lot faster then passing the whole object. But since it is already a pointer you do not make the function any better by passing a pointer to a pointer. Assuming you do not want to change the pointer to point to a different string.
for your code snippet, *str holds address to a char and **str holds address to a variable holding address of a char. In another word, pointer to pointer.
Whenever, you have *str, only enough memory is allocated to hold a pointer type variable(4 byte on a 32 bit machine). With str[10], memory is already allocated for 10 char.