The function below uses const char *s1
What the function does is perhaps not important. It returns 1 if the string
contains characters other than allowed characters. It returns 0 if it
doesn't.
int TEST (const char *s1);
int TEST (const char *s1) {
char * s2= "o123";
return s1[ strspn(s1, s2) ] != '\0';
}
The function seems to work just fine if I were to remove const portion from it.
Why do I need const terminology in that function, if it seems to work fine without it? What is the importance of it? What are the implications of
using?
not using?
When you have const char *s1 you are telling the compiler that you will not modify the contents of the memory that s1 is pointing to. It's a semantic signal for the compiler so you do not make mistakes in the function (attempts to change the contents will results in errors). It might also allow the compiler to add some optimizations.
But more importantly it's a signal to other programmers reading your code, and using your function. And with "other programmers" I also include you in a few weeks, months or years time, when you might come back to the code and have forgotten the details of it.
importance of “const”?
3 reasons:
Applicability
Code correctness
Speed
Consider what the calling code is allowed to do
int TEST_with_const(const char *s1);
int TEST_without_const(char *s1);
char *cs;
const char *ccs
...
TEST_with_const(cs); // allowed
TEST_with_const(ccs); // allowed
TEST_without_const(cs); // allowed
TEST_without_const(ccs); // fails to compile.
The last one fails because TEST_without_const(char *s1); is telling the outside world, I want a pointer to memory that I might alter. ccs is not a pointer to modifiable memory.
Assuming TEST_without_const(char *s1); has the same body of int TEST (const char *s1), it did not need a pointer to modifiable memory
By using const char *s1 the function's body will complain during compilation should a modification of the memory pointed to by s1 is attempted. If the function design does not need to modify the memory pointed to by s1, this check helps insure correctness.
Depending on complier, some optimizations can occur with const char *, but not with char *.
const in this function prevents you to write something to memory where variable points to.
Here, you are only checking value if it is not equal to \0 and it is ok because you do only read operation.
int TEST (const char *s1) {
char * s2= "o123";
return s1[ strspn(s1, s2) ] != '\0';
}
If you do this, trying to write through s1 pointer:
int TEST (const char *s1) {
char * s2= "o123";
s1[0] = '4'; //Error, s1 is const pointer, can't write to memory pointed to by s1
return s1[ strspn(s1, s2) ] != '\0';
}
const pointer in your case means that this function should not write any data to input pointer, only read operation can be performed. You can in fact change where s1 points, only write to memory through s1 is not allowed. This is to prevent any mistakes to now write unexpectedly because it may in some cases lead to undefined behaviour.
The usage of const prevents you or somebody else working with your code from doing something accidentally wrong. Moreover it is for documentation purposes as the reader of the code knows that these functions are not changing the characters (read-only) the pointer points to.
Further the compiler may optimize your code if you use const correctness because it knows that these values are not changing inside the functions being read-only. But the first reason of documenting and making your function safer is more important.
There are three forms you always come accross with: char* s1, const char* s1 and const char* const s1. The forth but rarely needed form is: char* const. The meanings are as follows:
1.
char* s1:
In this case s1 is just a pointer to memory of one or more characters.
The following can be/can not be done inside of the function:
/* Function might changes content of s1 */
int TEST (char* s1)
{
s1[0] = 'A'; /* works */
*s1 = 'A'; /* same as above works */
++s1; /* this works because the copy of the pointer is non constant */
...
}
The following calls can be/can not be made:
char* str;
const char* const_str;
...
TEST(str); /* works as str as no restrictions */
TEST(const_str); /* fails because function might change const const_str */
2.
const char* s1:
The term: const char* s1 means s1 is a pointer to memory of one or more characters that can't be changed with this pointer. The following can be/can not be done inside of the function:
/* Function is read-only and doesn't changes content of s1 */
int CONST_TEST (const char* s1)
{
s1[0] = 'A'; /* compiler fail */
*s1 = 'A'; /* same as above compiler fail */
++s1; /* this works because the copy of the pointer is non constant */
...
}
The following calls can be/can not be made:
char* str;
const char* const_str;
...
CONST_TEST(str); /* works as str as no restrictions */
CONST_TEST(const_str); /* works because function is read-only and const_str is it also */
The compiler will fail compiling and tell you that you try to write to a memory location that is marked as constant.
3.
const char* const s1:
This means: s1 is a constant pointer to memory of one or more characters that can't be changed with this pointer. This is an extension to the first approach, the pointer s1 itself is passed by value and this copy can't be changes inside the function. So that you can not do something like ++s1 inside the function, therefore the pointer will allways point to the same memory location.
4.
char* const s1:
This is like case 3 but without s1 pointing to memory marked as constant. Therefore the content s1 is pointing to can be changed like case 1 but with a pointer being constant.
Related
Is there a reason strcpy's signature is this:
char *strcpy(char *dest, const char *src);
instead of this?
char *strcpy(char *const dest, const char *src);
As far as I know, the function will never change the pointer.
Am I misunderstanding what const pointers should be used for? In my mind, when a function I write accepts a pointer which won't be changed (via realloc, etc.), then I mark it as a const pointer so the caller can be assured their pointer won't be moved on them. (In case they have other structs/etc. referencing that pointer location that would become outdated)
Is that an OK practice, or will it have unintended consequences?
The source code for strcpy is very roughly this:
char *strcpy(char *dest, const char *src)
{
while (*dest++ = *src++);
}
Here we actually do modify dest but this has no consequence for the caller because inside the strcpyfunction, dest is a local variable.
But following code won't compile because dest is const:
char *strcpy(char * const dest, const char *src)
{
while (*dest++ = *src++);
}
we would need to write this:
char *strcpy(char * const dest, const char *src)
{
char *temp = dest;
while (*temp++ = *src++);
}
which introduces a non necessary temp variable.
Qualifiers on function arguments are completely ignored in declarations (prototypes). This is required by the C language, and it makes sense, because there is no possible meaning such qualification could have to callers of the function.
In the case of const char *src, the argument src is not what's qualified; the type it points to is qualified. In your hypothetical declaration for dest, the qualifier applies to dest and is thus meaningless.
char *foo(char *const dest, const char *src) { ... } means the pointer dest will not change in the body of the function.
It does not mean the data pointed to by dest will or will not change.
const char *src assures the calling code that the data pointed to by src will not change.
In calling a function like strcpy(d,s) or foo(d,s), the calling code does not care if the function changes its copy of the pointer. All the calling code cares about is if the data pointed to by s or d is changed and that is control by the const of the left side of *
char *dest, // data pointed by `dest` may or may not change.
const char *src // data pointed by `src` will not change change because of `src`.
It is meaningless to mark it as const as you suggest. In C, function arguments are passed as copy. It means that the variable dest inside strcpy() is actually a new variable (pushed on the stack), which holds the same content (here, the address).
Look at this function prototype:
void foo(int const a);
This has no semantic value, because we know that the original variable a that we passed to foo() can't be changed because it is a copy. Only the copy will potentially change. When foo return, we are guaranteed that the original variable a is unchanged.
In function parameters, you want to use the keyword const only if the function could actually persistently change the state of the variable. For example:
size_t strlen(const char *s);
This marks the content of the variable s (ie, the value stored at the address s points to) as const. Therefore, you are guaranteed your string will be unchanged when strlen returns.
As far as I know, the function will never change the pointer.
Yes, but you can. You are free to change the pointer. No need to make dest pointer const.
For example:
int main(void)
{
char *s = "Hello";
char *d = malloc(6);
strcpy(d, s);
puts(d);
strcpy(d, "World");
puts(d);
}
Here is my question. in C, i saw code like this:
char *s = "this is a string";
but then, s is not actually pointing to an actual memory right?
and if you try to use s to modify the string, the result is undefined.
my question is, what is the point of assigning a string to the pointer
then?
thanks.
char *s = "this is a string";
This is a string literal. So the string is stored in read-only location and that memory address is returned to s . So when you try to write to the read-only location you see undefined behavior and might see a crash.
Q1:s is not actually pointing to an actual memory right?
You are wrong s is holding the memory address where this string is stored.
Q2:what is the point of assigning a string to the pointer then?
http://en.wikipedia.org/wiki/String_literal
When you do a char *s = "this is a string";, the memory is automatically allocated and populated with this string and a pointer to that memory is returned back to the caller (you). So, you do not need to explicitly put the string to some memory.
s is not actually pointing to an actual memory right?
Wrong, it does point to an actual memory whose allocation implementation is hidden from you. And this memory lies in the Read-Only sector of memory, so that it can't be changed/modified. Hence the keyword const as these literals are called constant literals.
if you try to use s to modify the string, the result is undefined.
Because, you are trying to modify memory which is marked as Read-Only.
what is the point of assigning a string to the pointer then?
Another way to achieve the same is,
char temp[260] = {0} ;
char *s ;
strcpy (temp, "this is a string");
s = temp ;
Here the memory temp is managed by you.
char *s = "this is a string" ;
Here the memory is managed by the OS.
By using const char * instead of a char [] the string will be stored in read only memory space. This allows the compiler to eliminate string duplication.
Try running this program:
#include <stdio.h>
int main()
{
const char *s1 = "This is a string";
const char *s2 = "This is a string";
if (s1 == s2) {
puts("s1 == s2");
} else {
puts("s1 != s2");
}
}
For the me it outputs s1 == s2 which means that the string pointers point to the same memory location.
Now try replacing const char * with char []:
int main()
{
const char s1[] = "This is a string";
const char s2[] = "This is a string";
if (s1 == s2) {
puts("s1 == s2");
} else {
puts("s1 != s2");
}
}
This outputs s1 != s2 which means that the compiler had to duplicate the string memory.
By using char * instead of char [] the compiler can do these optimizations that will decrease the size of you executable file.
Also note that you should not use char *s = "string". You should use const char *s = "string" instead. char *s is deprecated and unsafe. By using const char * you avoid the mistake of passing the string to a function that tries to modify the string.
s is not actually pointing to an actual memory right?
Technically, it is pointing to read-only memory. But the compiler is allowed to do whatever it wants as long as if follows the as-if rule. For example, if you never use s, it can be removed from your code completely.
Since it is read-only, any attempt to modify it is undefined behaviour. You can and should use const to indicate that the target of a pointer is immutable:
const char* s = "Hello const";
my question is, what is the point of assigning a string to the pointer then?
Just like storing a constant to any other type. You don't always need to modify strings. But you may want to pass a pointer to a string around to functions that don't care whether they point to a literal or to an array you own:
void foo(const char* str) {
// I won't modify the target of str. I don't care who owns it.
printf("foo: %s", str);
}
void bar(const char* str) {}
char* a = "Hello, this is a literal";
char b[] = "Hello, this is a char array and I own it";
foo(a);
bar(a);
foo(b);
Look at this code:
char *s = "this is a string";
printf("%s",s);
as you can see,I used "assigning a string to the pointer".Is that clear?
And know that s is pointing to an actual memory,but it is read-only.
If you are assigning like this,
char *s = "this is a string";
It will stored in the read only memory. So this is the reason for the undefined behavior. In this s will pointing to the some memory in the read-only area.
If you print the address like this you can get the some memory address.
printf("%p",s);
So in this case, if you allocate a memory and the copy the value to that pointer, you can access that pointer like array.
Everybody else has told you about the read-only memory and potential for undefined behavior if you attempt to modify the string, so I'll skip that part and answer the question, "what is the point of assigning a string to a pointer then?".
There are two reasons why
1) Just for brevity. After assigning the string to the pointer you can refer to the string as s instead of repeatedly typing "this is a string". This assumes of course that you intend to use the string in multiple function calls.
2) Because you may want to change the string that the pointer references. For example, in the following code, s is initialized assuming that the code will succeed, and is subsequently changed if there's a failure. At the end, the string that s points to is printed.
const char *s = "Yay, it worked!!!";
if ( openTheFile() == FAILED )
s = "Dang, couldn't open the file";
else if ( readTheFile() == FAILED )
s = "Oops, there's nothing in the file";
printf( "%s\n", s );
Note that const char * means that the string that s points to cannot be changed. It doesn't mean that s itself can't be changed.
In your case the pointer s is just pointing to the location that carries the first literal of the string. So if we want to change the string, it creates confusion as pointer s is pointing to the previous location. And if you want to change string using pointer, you should take care of the previous string's ending (i.e NULL).
im trying to copy a const char array to some place in the memory and point to it .
lets say im defining this var under the main prog :
char *p = NULL;
and sending it to a function with a string :
myFunc(&p, "Hello");
now i want that at the end of this function the pointer will point to the letter H but if i puts() it, it will print Hello .
here is what i tried to do :
void myFunc(char** ptr , const char strng[] ) {
*ptr=(char *) malloc(sizeof(strng));
char * tmp=*ptr;
int i=0;
while (1) {
*ptr[i]=strng[i];
if (strng[i]=='\0') break;
i++;
}
*ptr=tmp;
}
i know its a rubbish now, but i would like to understand how to do it right, my idea was to allocate the needed memory, copy a char and move forward with the pointer, etc..
also i tried to make the ptr argument byreferenec (like &ptr) but with no success due to a problem with the lvalue and rvalue .
the only thing is changeable for me is the function, and i would like not to use strings, but chars as this is and exercise .
thanks for any help in advance.
Just replace all the char* with std::string. Do that until you have a very specific reason to not use existing utilities, which is something you don't have as a beginner. None of the code above requires malloc() or raw pointers.
Some more notes:
const char strng[] as parameter is the same as const char* strng. The array syntax doesn't make it an array, it remains a pointer. I don't use this syntax in order to avoid this confusion.
Use static_cast or one of the other C++ casts not the C-style like (char*)malloc(..). The reason is that they are safer.
Check the return value of malloc(), it can return null. Also, you must call free() eventually, otherwise your application leaks memory.
Finally, the pointer does point to the 'H', which is just the first element of the string. Output *p instead of p to see this.
You code work as desired except
*ptr[i]=strng[i];
should be
(*ptr)[i]=strng[i];
Without the parens, it acts like `*(ptr[i])=strng[i];
2) Also
malloc(sizeof(strng));
s/b
malloc(strlen(strng)+1);
You may want to look at strdup(strng).
[edit per OP's request]
difference between *(ptr[i]) and (*ptr)[i]
// OP desired function
(*ptr)[i] = 'x';
// Dereference ptr, getting the address of a char *array.
// Assign 'x' to the i'th element of that char * array.
// OP post with the undesired function
*(ptr[i]) = 'x';
// This is like
char *q = ptr[i];
*q = 'x';
// This make sense _if_ ptr were an array of char *.
// Get the i'th char * from ptr and assign to q.
// Assign 'x' to to the location pointer to by q.
This is all the code needed...
nothing more...
void myFunc(char **pp, char * str){
*pp = str;
}
The only issue here is that "Hello" resides in read only section because it is a constant string... so you can't change "Hello" to something else...
I tried to write my own implementation of the strchr() method.
It now looks like this:
char *mystrchr(const char *s, int c) {
while (*s != (char) c) {
if (!*s++) {
return NULL;
}
}
return (char *)s;
}
The last line originally was
return s;
But this didn't work because s is const. I found out that there needs to be this cast (char *), but I honestly don't know what I am doing there :( Can someone explain?
I believe this is actually a flaw in the C Standard's definition of the strchr() function. (I'll be happy to be proven wrong.) (Replying to the comments, it's arguable whether it's really a flaw; IMHO it's still poor design. It can be used safely, but it's too easy to use it unsafely.)
Here's what the C standard says:
char *strchr(const char *s, int c);
The strchr function locates the first occurrence of c
(converted to a char) in the string pointed to by s. The
terminating null character is considered to be part of the string.
Which means that this program:
#include <stdio.h>
#include <string.h>
int main(void) {
const char *s = "hello";
char *p = strchr(s, 'l');
*p = 'L';
return 0;
}
even though it carefully defines the pointer to the string literal as a pointer to const char, has undefined behavior, since it modifies the string literal. gcc, at least, doesn't warn about this, and the program dies with a segmentation fault.
The problem is that strchr() takes a const char* argument, which means it promises not to modify the data that s points to -- but it returns a plain char*, which permits the caller to modify the same data.
Here's another example; it doesn't have undefined behavior, but it quietly modifies a const qualified object without any casts (which, on further thought, I believe has undefined behavior):
#include <stdio.h>
#include <string.h>
int main(void) {
const char s[] = "hello";
char *p = strchr(s, 'l');
*p = 'L';
printf("s = \"%s\"\n", s);
return 0;
}
Which means, I think, (to answer your question) that a C implementation of strchr() has to cast its result to convert it from const char* to char*, or do something equivalent.
This is why C++, in one of the few changes it makes to the C standard library, replaces strchr() with two overloaded functions of the same name:
const char * strchr ( const char * str, int character );
char * strchr ( char * str, int character );
Of course C can't do this.
An alternative would have been to replace strchr by two functions, one taking a const char* and returning a const char*, and another taking a char* and returning a char*. Unlike in C++, the two functions would have to have different names, perhaps strchr and strcchr.
(Historically, const was added to C after strchr() had already been defined. This was probably the only way to keep strchr() without breaking existing code.)
strchr() is not the only C standard library function that has this problem. The list of affected function (I think this list is complete but I don't guarantee it) is:
void *memchr(const void *s, int c, size_t n);
char *strchr(const char *s, int c);
char *strpbrk(const char *s1, const char *s2);
char *strrchr(const char *s, int c);
char *strstr(const char *s1, const char *s2);
(all declared in <string.h>) and:
void *bsearch(const void *key, const void *base,
size_t nmemb, size_t size,
int (*compar)(const void *, const void *));
(declared in <stdlib.h>). All these functions take a pointer to const data that points to the initial element of an array, and return a non-const pointer to an element of that array.
The practice of returning non-const pointers to const data from non-modifying functions is actually an idiom rather widely used in C language. It is not always pretty, but it is rather well established.
The reationale here is simple: strchr by itself is a non-modifying operation. Yet we need strchr functionality for both constant strings and non-constant strings, which would also propagate the constness of the input to the constness of the output. Neither C not C++ provide any elegant support for this concept, meaning that in both languages you will have to write two virtually identical functions in order to avoid taking any risks with const-correctness.
In C++ you wild be able to use function overloading by declaring two functions with the same name
const char *strchr(const char *s, int c);
char *strchr(char *s, int c);
In C you have no function overloading, so in order to fully enforce const-correctness in this case you would have to provide two functions with different names, something like
const char *strchr_c(const char *s, int c);
char *strchr(char *s, int c);
Although in some cases this might be the right thing to do, it is typically (and rightfully) considered too cumbersome and involving by C standards. You can resolve this situation in a more compact (albeit more risky) way by implementing only one function
char *strchr(const char *s, int c);
which returns non-const pointer into the input string (by using a cast at the exit, exactly as you did it). Note, that this approach does not violate any rules of the language, although it provides the caller with the means to violate them. By casting away the constness of the data this approach simply delegates the responsibility to observe const-correctness from the function itself to the caller. As long as the caller is aware of what's going on and remembers to "play nice", i.e. uses a const-qualified pointer to point to const data, any temporary breaches in the wall of const-correctness created by such function are repaired instantly.
I see this trick as a perfectly acceptable approach to reducing unnecessary code duplication (especially in absence of function overloading). The standard library uses it. You have no reason to avoid it either, assuming you understand what you are doing.
Now, as for your implementation of strchr, it looks weird to me from the stylistic point of view. I would use the cycle header to iterate over the full range we are operating on (the full string), and use the inner if to catch the early termination condition
for (; *s != '\0'; ++s)
if (*s == c)
return (char *) s;
return NULL;
But things like that are always a matter of personal preference. Someone might prefer to just
for (; *s != '\0' && *s != c; ++s)
;
return *s == c ? (char *) s : NULL;
Some might say that modifying function parameter (s) inside the function is a bad practice.
The const keyword means that the parameter cannot be modified.
You couldn't return s directly because s is declared as const char *s and the return type of the function is char *. If the compiler allowed you to do that, it would be possible to override the const restriction.
Adding a explicit cast to char* tells the compiler that you know what you're doing (though as Eric explained, it would be better if you didn't do it).
UPDATE: For the sake of context I'm quoting Eric's answer, since he seems to have deleted it:
You should not be modifying s since it is a const char *.
Instead, define a local variable that represents the result of type char * and use that in place of s in the method body.
The Function Return Value should be a Constant Pointer to a Character:
strchr accepts a const char* and should return const char* also. You are returning a non constant which is potentially dangerous since the return value points into the input character array (the caller might be expecting the constant argument to remain constant, but it is modifiable if any part of it is returned as as a char * pointer).
The Function return Value should be NULL if No matching Character is Found:
Also strchr is supposed to return NULL if the sought character is not found. If it returns non-NULL when the character is not found, or s in this case, the caller (if he thinks the behavior is the same as strchr)
might assume that the first character in the result actually matches (without the NULL return value
there is no way to tell whether there was a match or not).
(I'm not sure if that is what you intended to do.)
Here is an Example of a Function that Does This:
I wrote and ran several tests on this function; I added a few really obvious sanity checks to avoid potential crashes:
const char *mystrchr1(const char *s, int c) {
if (s == NULL) {
return NULL;
}
if ((c > 255) || (c < 0)) {
return NULL;
}
int s_len;
int i;
s_len = strlen(s);
for (i = 0; i < s_len; i++) {
if ((char) c == s[i]) {
return (const char*) &s[i];
}
}
return NULL;
}
You're no doubt seeing compiler errors anytime you write code that tries to use the char* result of mystrchr to modify the string literal being passed to mystrchr.
Modifying string literals is a security no-no, because it can lead to abnormal program termination and possibly denial-of-service attacks. Compilers may warn you when you pass a string literal to a function taking char*, but they aren't required to.
How do you use strchr correctly? Let's look at an example.
This is an example of what not to do:
#include <stdio.h>
#include <string.h>
/** Truncate a null-terminated string $str starting at the first occurence
* of a character $c. Return the string after truncating it.
*/
const char* trunc(const char* str, char c){
char* pc = strchr(str, c);
if(pc && *pc && *(pc+1)) *(pc+1)=0;
return str;
}
See how it modifies the string literal str via the pointer pc? That's no bueno.
Here's the right way to do it:
#include <stdio.h>
#include <string.h>
/** Truncate a null-terminated string $str of $sz bytes starting at the first
* occurrence of a character $c. Write the truncated string to the output buffer
* $out.
*/
char* trunc(size_t sz, const char* str, char c, char* out){
char* c_pos = strchr(str, c);
if(c_pos){
ptrdiff_t c_idx = c_pos - str;
if((size_t)n < sz){
memcpy(out, str, c_idx); // copy out all chars before c
out[c_idx]=0; // terminate with null byte
}
}
return 0; // strchr couldn't find c, or had serious problems
}
See how the pointer returned by strchr is used to compute the index of the matching character in the string? The index (also equal to the length up to that point, minus one) is then used to copy the desired part of the string to the output buffer.
You might think "Aw, that's dumb! I don't want to use strchr if it's just going to make me memcpy." If that's how you feel, I've never run into a use case of strchr, strrchr, etc. that I couldn't get away with using a while loop and isspace, isalnum, etc. Sometimes it's actually cleaner than using strchr correctly.
I'm learning the C language.
My question is:
Why is the param of strlen a "const" ?
size_t strlen(const char * string);
I'm thinking it's because string is an address so it doesn't change after initialization. If this is right, does that mean every time you build a function using a pointer as a param, it should be set to a constant ?
Like if I decide to build a function that sets an int variable to its double, should it be defined as:
void timesTwo(const int *num)
{
*num *= 2;
}
or
void timesTwo(int *num)
{
*num *= 2;
}
Or does it make no difference at all ?
C string is a pointer to a zero-terminated sequence of characters. const in front of char * indicates to the compiler and to the programmer calling the function that strlen is not going to modify the data pointed to by the string pointer.
This point is easier to understand when you look at strcpy:
char * strcpy ( char * destination, const char * source );
its second argument is const, but its first argument is not. This tells the programmer that the data pointed to by the first pointer may be modified by the function, while the data pointed to by the second pointer will remain constant upon return from strcpy.
The parameter to strlen function is a pointer to a const because the function is not expected to change what the pointer points to - it is only expected to do something with the string without altering it.
In your function 'timesTwo', if you intend to change the value that 'num' points to, you should not pass it in as a pointer-to-const. So, use the second example.
Basically, the function is promising that it will not modify the contents of of the input string through that pointer; it says that the expression *string may not be written to.
Here are the various permutations of the const qualifier:
const char *s; // s may be modified, *s may not be modified
char const *s; // same as above
char * const s; // s may not be modified, *s may be modified
const char * const s; // neither s nor *s may be modified
char const * const s; // same as above