c - why convert main() argument to const - c

(I am a beginner in C, maybe my question is not very smart, but I did google before I ask.)
I saw following code in git source code:
int main(int argc, char **av) {
const char **argv = (const char **) av;
// ... more code ...
}
It converts char **av to const char **argv, I thought it meant to make the argument immutable, but I wrote a program and found that both argv and argv[i] are mutable.
Question 1: What is the purpose & goodness of that line of code?
Question 2: What is the behavior of a const pointer? I did google but didn't find a good answer.
#Update
I tested more, following the answers, and it seems that argv[i][j] is immutable, but argv and argv[i] are mutable.
So const on the pointed-to type makes the pointed-to value immutable through that pointer, but the pointer itself is still mutable.
Thus I guess the major purpose of the code from git is also to prevent change of the original arguments.
testing code:
#include <stdio.h>
int main(int argc, char * av[]) {
// modify value pointed by a non-const pointer - ok
av[0][0] = 'h';
printf("argv[0] = %s\n", av[0]);
// modify const pointer itself - ok
const char **argv = (const char **) av;
argv[0] = "fake";
printf("argv[0] = %s\n", argv[0]);
char *arr[] = {"how", "are", "you"};
argv = (const char **)arr;
printf("argv[0] = %s\n", argv[0]);
// modify the value itself which is pointed by a const pointer - bad, an error will be thrown,
/*
argv[0][0] = 'x';
printf("argv[0] = %s\n", argv[0]);
*/
return 0;
}
The current code compiles and runs without warning or error, but if the 2 commented lines at the end are uncommented, compilation fails with the following error:
error: assignment of read-only location ‘**argv’

In practice it is not very useful here (and the generated code won't change much, if the compiler is optimizing).
However, the characters reached through argv are not modifiable, so the compiler would for instance catch as an error an assignment like
argv[1][0] = '_'; // wrong
A const thing cannot be assigned to. So a const pointer can't be assigned, and a pointer to const means that the dereferenced pointer is a location which cannot be assigned. (and you can mix both: having a const pointer to const)
BTW, main (in standard C99) is a very special function. You cannot declare it in arbitrary ways (it should almost always be declared int main(int, char**) or int main(void)), and the rules for calling it differ between the languages: C allows main to be called recursively, while C++ forbids calling it at all. So declaring int main(int, const char**) is not one of the forms the standard guarantees; an implementation may accept it, but it is not portable.

1) There is really no point in using a const pointer to access the parameters later on, except to make sure they are not changed.
2) The purpose of const pointers is to make sure that they are not changed throughout the code. You can live without them, but it helps avoiding bugs.

Related

Is there a better way to fetch this value from "char **"?

I'm studying C in order to start doing some fun low-level code stuff, and I've stumbled into a scenario which I can't wrap my head around, and I'm sure it's because I don't have a lot of experience with it.
Currently my code is very simple: it takes some arguments, and gets the first parameter passed to main, and stores it as a path string. The first question that came to mind, was whether it would be correct to store the main params as char *args[] or char **args, and I decided to go with char **args since according to this question there could be some scenarios where the first would not be accessible, and I just wanted to make a code that would be as complete as possible, and learn the whys on the process.
Here is the code:
int main(int argc, char **args) {
if (args[1] == NULL) return 1;
// Get path of input file
char *path = &*args[1];
fputs(path, stdout);
return 0;
}
Given the code above, what would be a better way of fetching the value stored in *args[1]? It seems very cryptic when I look at it, and it took me a while to get to it as well.
My understanding is that char **args, is a pointer, to an array of pointers. Thus, if I'm to store a string or any other value for later use in one of the indexes of args, I would have to assign a new pointer to a memory location (*path), and assign the value of the given index to it (&*args[i]). Am I over complicating things? Or is this thought process correct?
For starters these two function declarations
int main(int argc, char **args)
and
int main(int argc, char *args[])
are fully equivalent because the compiler adjusts the parameter that has an array type to pointer to the array element type.
In the initializer expression of this declaration
char *path = &*args[1];
applying the two operators & and * sequentially is redundant. So you may just write
char *path = args[1];
Also in general instead of the condition in the if statement
if ( args[1] == NULL) return 1;
it is safer to write
if ( argc < 2 ) return 1;
You can simply write:
char *path = args[1];
& and * operators are inverses of each other, so &* or *& can simply be removed from an expression.

Re-Initializing a const char array in C is not giving error

While initializing a const char array, I tried to change the string and I was able to change it without any issue.
I was learning how to initialize a const char array.
I think I am doing some mistake here which I am not able to find.
int main(int argc, char const *argv[])
{
const char *strs[10];
strs[0] = "wwww.google.com";
printf("%s\n", strs[0]);
strs[1] = "https://wwww.google.com";
strs[0] = "ss";
printf("%s\n", strs[0]);
return 0;
}
Output:
1st init: wwww.google.com
2nd init: ss -> Here, I expect it to throw error
const char* s = "Hi";
tells the compiler that the content that the pointer points to is constant. This means that s[0] = 'P'; will result in a compilation error. But you can modify the pointer. On the other hand,
char* const s = "Hi";
tells the compiler that the pointer is constant. This means that s = "Pi"; will result in a compilation error. But no compilation error will be thrown when you try to modify the string*
Your code depicts the former behaviour, not the latter as you seem to have thought
* Modifying string literals will invoke Undefined Behaviour
const char *strs[10];
strs is an array of 10 pointers to const char. You can change the pointers; you cannot change the chars
strs[2] = NULL; // ok: change the pointer
strs[0][0] = '#'; // wrong; cannot change the char
Maybe try
const char * const strs[10] = {"www.google.com",
"https://www.google.com",
"www.google.com/",
"https://www.google.com/",
NULL, NULL };
which makes strs an array of 10 read-only pointers to const char. You cannot change the pointers after initialization.
To put this in simple English (not necessarily 100% accurate but serves to conceptualise), this
const char *strs[10];
declares an array strs whose elements are non-constant pointers to constant characters. Thus, the elements of the array (the pointers) can be changed, but the characters they point to cannot be changed through them

Passing params to function and casting

I am new to C language and I have some misunderstanding in the following exercise:
void printAllStrings(const char** arr[])
{
while(*arr!=NULL)
{
char** ptr=(char**)*arr;
while(*ptr!=NULL)
{
printf("%s\n",*ptr);
ptr++;
}
arr++;
}
}
int main()
{
char* arrP1[]={"father","mother",NULL};
char* arrP2[]={"sister","brother","grandfather",NULL};
char* arrP3[]={"grandmother",NULL};
char* arrP4[]={"uncle","aunt",NULL};
char** arrPP[]={arrP1,arrP2,arrP3,arrP4,NULL};
printf("Before sort :\n");
printAllStrings(arrPP);
sort(arrPP);
printf("\nAfter sort :\n");
printAllStrings(arrPP);
printf("\nMaximum length string : %s \n",maxLengthString(arrPP));
return 0;
}
The code above prints all strings.
My questions is:
In the printAllStrings function the passed parameter (char** arr[]) is an array of strings; could we pass a pointer to pointer - char** arr?
What is the meaning of this row char** ptr=(char**)*arr; I understand that this is casting of pointers to char type.
But why does the pointer have to be cast if it already points to char type?
In the printAllStrings function the passed parameter (char** arr[]) is an array of strings; could we pass a pointer to pointer - char** arr[].
In your examples above, you have char** arr[] and char** arr[] (the two are the same) so your "could we pass?" question is unclear. If you are asking if you could change the parameter to (char ***arr), then yes, you could because the first level of indirection (e.g. [ ]) is converted to a pointer.
What is the meaning of this row char** ptr=(char**)*arr; I understand that this is casting of pointers to char type. But why does the pointer have to be cast if it already points to char type?
The reason is your parameter is const char** arr[] and then you declare char** ptr which discards the const qualifier on the strings that arr ultimately refers to. const char ** and char ** are not the same. So when you attempt to initialize ptr with the dereferenced arr, e.g. (char** ptr=*arr;) the compiler complains about the discard of the const qualifier.
Rather than fixing the problem correctly, e.g.
const char **ptr = *arr;
you "fudge" the initialization to force the discard of the const qualifier -- resulting in ptr not retaining the const type which can prevent the compiler from warning when you attempt to use ptr in a non-constant way (really bad things happen when you just cast away const qualifiers)
I may be wrong -- but it looks like the point of the assignment is to have you preserve the const nature of the string literals you use to initialize your array of pointers. So rather than declaring the arrays as:
char* arrP1[]={"father","mother",NULL};
You should declare them as arrays of pointers to const char, e.g.
const char *arrP1[]={"father","mother",NULL};
Your parameter for printAllStrings then makes sense, and the compiler will warn if you try to do something you are not allowed to do, like changing the string literals, e.g. if you try:
arrP1[0][0] = 'l';
(the compiler will throw error: assignment of read-only location ‘*arrP1[0]’)
If you carry the types consistently through your code, you will not have to "fudge" with any casts anywhere, and the compiler can help protect you from yourself. For example, a simple rework of your types to make sure your string literals are const qualified (while still allowing you to sort your arrays) could be done with something like:
#include <stdio.h>
#include <string.h>
void printAllStrings (const char **arr[])
{
while (*arr != NULL) {
const char **ptr = *arr;
while (*ptr != NULL) {
printf ("%s\n", *ptr);
ptr++;
}
arr++;
}
}
const char *maxLengthString (const char **arr[])
{
size_t max = 0;
const char *longest = NULL;
while (*arr != NULL) {
const char **ptr = *arr;
while (*ptr != NULL) {
size_t len = strlen (*ptr);
if (len > max) {
max = len;
longest = *ptr;
}
ptr++;
}
arr++;
}
return longest;
}
int main (void) {
const char *arrP1[] = {"father", "mother", NULL};
const char *arrP2[] = {"sister", "brother", "grandfather", NULL};
const char *arrP3[] = {"grandmother", NULL};
const char *arrP4[] = {"uncle", "aunt", NULL};
const char **arrPP[] = {arrP1, arrP2, arrP3, arrP4, NULL};
printf ("Before sort :\n");
printAllStrings (arrPP);
// sort (arrPP); /* you didn't post sort, so the following swaps */
const char **tmp = arrPP[0]; /* simple swap example */
arrPP[0] = arrPP[1];
arrPP[1] = tmp;
printf ("\nAfter sort :\n");
printAllStrings (arrPP);
printf ("\nMaximum length string : %s \n", maxLengthString (arrPP));
return 0;
}
(You didn't post sort(), so above the elements are simply swapped to show your arrPP retains the ability to be sorted, and a quick implementation of maxLengthString() was added to make your last statement work -- but note, it just finds the first of any longest strings if more than one are the same length)
Example Use/Output
$ ./bin/array_ptp_const_char
Before sort :
father
mother
sister
brother
grandfather
grandmother
uncle
aunt
After sort :
sister
brother
grandfather
father
mother
grandmother
uncle
aunt
Maximum length string : grandfather
Look things over and let me know if you have further questions. I'm not sure if this is what you were looking for, but based on your code and questions, it seemed the most logical choice.
I can only answer the 2nd question as I can not understand the 1st the way it is now.
char** ptr=(char**)*arr;
// this is the same but maybe confusing because arr is a pointer now and it gets iterated in the loop.
char** ptr=(char**)arr[0];
arr has the type pointer (decayed from an array) to pointer to pointer to char.
ptr has the type pointer to pointer to char.
As you can see ptr has one level of reference less than arr. arr holds all the pointers to your declared arrays of type pointer to char. (arrP1, arrP2, arrP3, arrP4).
By dereferencing arr you get the pointer to one of these arrays. (arrP1 in the first iteration)
Then you print where the pointer stored in ptr[0] points to and iterate to ptr[1], to print this. After the loop arr gets iterated and yields the pointer to arrP2 and you start again with the ptr-loop.
You have to keep in mind that arrays and pointers are quite similar(but not exactly the same) in their usage and passing an array to function lets it decay to a pointer.
Edit: David's answer is great, I misread the focus of the 2nd question, so this answer is a bit offtopic. I will leave it here, because I think it still helps understanding what happens with all the pointer magic. For a relevant answer especially about const correctness David's answer is the one.

Can I pass a const char* array to execv?

This is the prototype for execv:
int execv(const char *path, char *const argv[]);
Can I pass an array of const char pointers as the second argument?
This example program gives a warning when USE_CAST is not set:
#include <unistd.h>
int main(int argc, char *argv[])
{
if (argc > 0) {
const char *exe_name = "/bin/echo", *message = "You ran";
const char *exe_args[] = { exe_name, message, argv[0], NULL };
#ifdef USE_CAST
execv("/bin/echo", (char **) exe_args);
#else
execv("/bin/echo", exe_args);
#endif
}
return 0;
}
When compiling, gcc says, "passing argument 2 of 'execv' from incompatible pointer type" if I don't use the cast.
From the POSIX documentation for execv (halfway through the Rationale section), it looks like the second argument is a char *const array only for backwards compatibility:
The statement about argv[] and envp[] being constants is included to make explicit to future writers of language bindings that these objects are completely constant. ... It is unfortunate that the fourth column cannot be used...
where the "fourth column" refers to const char* const[].
Is the (char **) cast safe to use here? Should I create a char * array and pass that to execv instead?
Can I pass an array of const char pointers as the second argument?
Well yes, you already know that you can cast in order to do so.
From the POSIX documentation for execv (halfway through the Rationale section), it looks like the second argument is a char *const array only for backwards compatibility:
I wouldn't put it in those terms, but yes, there is a compatibility aspect to the chosen signature. The section you reference explains that C does not have a wholly satisfactory way to express the degree of const-ness that POSIX requires execv() to provide for the arguments. POSIX guarantees that the function will not change either the pointers in argv or the strings to which they point.
With that being the case, I think it not unreasonable to cast the argv pointer as you propose to do, though I would leave a comment in my code explaining why doing so is safe.
On the other hand, you should consider simply leaving the const off of your array declaration:
char *exe_name = "echo", *message = "You ran";
char *exe_args[] = { exe_name, message, argv[0], NULL };
Or, in your simple example, even this would do:
char *exe_args[] = { "echo", "You ran", argv[0], NULL };
C string literals correspond to arrays of type char, not const char, so this is perfectly legal as far as C is concerned, even though actually trying to modify the contents of those strings might fail.
On the third hand, modern C (C99 and later) has compound literals, which can serve as array literals, so you could even do this:
execv("/bin/echo", (char *[]) { "echo", "You ran ", argv[0], NULL });
In that last case you don't even have a cast (the thing that resembles one is just part of the syntax for a compound literal).
If you're just going to cast away the const, then you shouldn't use const to begin with. Most (dare I say all) compilers will accept the following code
char *exe_name = "/bin/echo";
char *message = "You ran";
char *exe_args[] = { exe_name, message, argv[0], NULL };
execv( exe_args[0], exe_args );
If that is not pedantically correct enough for you, then the other option is
char exe_name[] = "/bin/echo";
char message[] = "You ran";
char *exe_args[] = { exe_name, message, argv[0], NULL };
execv( exe_args[0], exe_args );
Note that execv is going to make copies of the strings (to create the argv array for the executable), so it doesn't matter whether the strings are actually const or not.

Using void pointers in generic accessor function with ANSI C

I have a working example of a piece of C code that I'm using to teach myself about using pointers effectively in a non-trivial application. (I have a dream to contribute a missing feature to a C library which I'm relying on.)
My sample code looks like this:
#include <stdio.h>
#include <stdlib.h>
struct config_struct {
int port;
char *hostname;
};
typedef struct config_struct config;
void setup(config*);
void change(config*);
void set_hostname(config*, char*);
void get_hostname_into(config*, char**);
void teardown(config*);
void inspect(config*);
int main() {
char* hostname;
config* c;
c = calloc( 1, sizeof(config));
setup(c);
inspect(c);
change(c);
inspect(c);
set_hostname(c, "test.com");
inspect(c);
get_hostname_into(c, &hostname);
inspect(c);
printf("retrieved hostname is %s (%p)\n", hostname, &hostname);
teardown(c);
printf("retrieved hostname is %s (%p) (after teardown)\n", hostname, &hostname);
return EXIT_SUCCESS;
}
void setup(config* c) {
c->port = 9933;
c->hostname = "localhost";
}
void change(config* c) {
c->port = 12345;
c->hostname = "example.com";
}
void set_hostname(config* c, char* new_hostname) {
c->hostname = new_hostname;
}
void get_hostname_into(config* c, char** where) {
*where = c->hostname;
}
void teardown(config* c) {
free(c);
}
void inspect(config* c) {
printf("c is at %p\n", (void *) c);
printf("c is %lu bytes\n", (unsigned long) sizeof(*c));
printf("c:port is %d (%p)\n", c->port, (void *) &(c->port));
printf("c:hostname is %s (%p)\n", c->hostname, (void *) &(c->hostname));
}
It's required by the nature of the library (the function is get_session_property(session*, enum Property, void*)), thus I'm looking for a way to dereference a void pointer. I was able to successfully implement this for an int, but have been kicking my heels trying to figure out how to do it for a char* (something about a void* to int makes some sense, but I can't fathom how to do it for void* to char*).
My successful implementation (with tests) for the library is on my Github fork of the project, here.
The closest I have come is:
enum Property { Port, Hostname };
void get_property(config*, enum Property, void*);
void get_property(config* c, enum Property p, void* target) {
switch(p) {
case Port:
{
int *port;
port = (int *) target;
*port = c->port;
}
break;
case Hostname:
{
char *hostname;
hostname = (char *) target;
*hostname = c->hostname;
}
break;
}
}
Which mercifully doesn't segfault, but also leaves char *get_hostname_into_here null, raising the warning (which I can't figure out:)
untitled: In function ‘get_property’:
untitled:33: warning: assignment makes integer from pointer without a cast
Full source code of my contrived example here; when answering, please explain, or recommend any reading you have on using void pointers and/or good C style. It seems like everyone has a different idea, and a couple of people I know in the real world simply said "the library is doing it wrong, don't use void pointers". Whilst it would be nice if the library would make the struct public, for encapsulation and other good reasons I think the void pointers, generic function approach is perfectly reasonable in this case.
So, what am I doing wrong in my hostname branch of the get_property() function that the char* is NULL after the call to get_property(c, Hostname, &get_hostname_into_here);
char *get_hostname_into_here;
get_property(c, Hostname, &get_hostname_into_here);
printf("genericly retrieved hostname is %s (%p)\n", get_hostname_into_here, &get_hostname_into_here);
// Expect get_hostname_into_here not to be NULL, but it is.
full source code for example (with output).
Like I said in my comments above, it's not possible to give a precise answer, because it's not clear what the aim is here. But I see two possibilities:
1: target is pointing at a char buffer
If this is the case, then it would seem that you'll need to copy the contents of your string into that buffer. This is not possible to do safely, because you don't know how big the receiving buffer is. But if you don't care about that, then you need to do something like:
strcpy((char *)target, c->hostname);
2: target is pointing at a char *
If this is the case, then the intention is presumably either to modify that char * to point at the existing string, or to dynamically create a new buffer, copy the string, and then modify the char * to point at it.
So either:
char **p = (char **)target;
*p = c->hostname;
or:
char **p = (char **)target;
*p = malloc(strlen(c->hostname)+1);
strcpy(*p, c->hostname);
Note
You get the warning message because in this line:
*hostname = c->hostname;
*hostname is of type char, whereas c->hostname is of type char *. The compiler is telling you that this conversion doesn't make any sense. If I were you, I would set your compiler up to treat warnings as errors (e.g. with the -Werror flag for GCC), because warnings should always be adhered to!
The get_property function should be altered so that target is a double void pointer, meaning that you can change the pointer itself (not only the memory it refers to):
void get_property (config *c, enum Property p, void **target) {
switch (p) {
case Port:
*((int *) (*target)) = c->port;
break;
case Hostname:
*target = c->hostname;
break;
}
}
And then use the function like that:
int port;
int *pport = &port;
char *hostname;
get_property(c, Port, (void **) &pport);
get_property(c, Hostname, (void **) &hostname);
There's no answer to your question until you provide more details about the get_property function. It is clear that the void *target parameter is used to pass an external "space" in which you are supposed to place the result - the value of the requested property.
What is the nature of that recipient space?
In the case of an int property it is pretty clear from your code: the pointer points to some int object in which you are supposed to place the property value. Which is what you do correctly.
But what about string properties? There are at least two possibilities here
1) The void *target parameter points to the beginning of a char [] buffer, which is supposedly large enough to receive any property value. In that case your code should look as follows
case Hostname:
{
char *hostname = target;
strcpy(hostname, c->hostname);
}
break;
The function in this case would be called as
char hostname_buffer[1024];
get_property(c, Hostname, hostname_buffer);
This is actually the "correct" way to do it, except that you need to take certain steps to make sure you don't overrun the target buffer by some long property value.
2) The void *target parameter points to a pointer of char * type, which is supposed to receive the hostname pointer value from the property. (In that case the target actually holds a char ** value.) The code would look as
case Hostname:
{
char **hostname = target;
*hostname = c->hostname;
}
break;
The function in this case would be called as
char *hostname;
get_property(c, Hostname, &hostname);
This second variant doesn't look good to me, since in this case you are essentially returning a pointer to internal data of property structure. It is not a good idea to give the outside world access to the internals of [supposedly opaque] data structure.
P.S. One generally does not need to explicitly cast to and from void * pointers in the C language.
get_hostname_into_here is defined as:
char *get_hostname_into_here;
And you're passing a reference to it, namely a char**. In get_property, you're casting the void* into a char* instead of a char**, and then dereferencing it before the assignment. In order to get the string correctly, use:
case Hostname:
{
char **hostname;
hostname = (char **) target;
*hostname = c->hostname;
}
break;
Let's consider a simplified example of your get_property function which is the same in all the important ways:
void get_property_hostname(config* c, void* target) {
char * hostname = (char *) target;
*hostname = c->hostname;
}
On the first line of the function, you are making a "char *" pointer which points to the same location as "target". On the second line of the function, when you write *hostname = ..., you are writing to the char that hostname points at, so you are writing to the first byte of memory that target points to. This is not what you want; you are only giving one byte of data to the caller of the function. Also, the compiler complains because the left-hand side of the assignment has type "char" while the right-hand side has the type "char *".
There are at least three correct ways to return a string in C:
1) Return a pointer to the original string
If you do this, the user will have access to the string and could modify it if he wanted to. You must tell him not to do that. Putting the const qualifier on it will help achieve that.
const char * get_property_hostname(config* c) {
return c->hostname;
}
2) Duplicate the string and return a pointer to the duplicate
If you do this, the caller of the function must pass the duplicate string to free() when he is done using it. See the documentation of strdup.
char * get_property_hostname(config * c) {
return strdup(c->hostname);
}
3) Write the string to a buffer that the caller has allocated
If you do this, then it is up to the caller of the function when and how he wants to allocate and free the memory. This is what a lot of APIs in the Microsoft Windows operating system do because it offers the most flexibility to the caller of the function.
void get_property_hostname(config * c, char * buffer, size_t buffer_size)
{
if (strlen(c->hostname)+1 > buffer_size)
{
// Avoid buffer overflows and return the empty string.
buffer[0] = 0;
}
else
{
strcpy(buffer, c->hostname);
}
}
Then to use this function, you can do something like:
void foo(){
char buffer[512];
get_property_hostname(c, buffer, sizeof(buffer));
...
// buffer is on stack, so it gets freed automatically when foo returns
}
EDIT 1: I will leave it as an exercise for you to figure out how to integrate the ideas presented here back into your generic get_property function. If you take the time to understand what is going on here, it shouldn't be too hard, but you may have to add some extra parameters.
EDIT 2: Here's how you would adapt method 1 to use a void pointer that points to a char * instead of using a return value:
void get_property_hostname(config* c, void * target) {
*(char **)target = c->hostname;
}
Then you would call it like this:
void foobar() {
char * name;
get_property_hostname(c, &name);
...
}

Resources