I always mess up how to use const int*, const int * const, and int const * correctly. Is there a set of rules defining what you can and cannot do?
I want to know all the do's and all don'ts in terms of assignments, passing to the functions, etc.
Read it backwards (as driven by Clockwise/Spiral Rule):
int* - pointer to int
int const * - pointer to const int
int * const - const pointer to int
int const * const - const pointer to const int
Now the first const can be on either side of the type so:
const int * == int const *
const int * const == int const * const
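One way to convince yourself of these equivalences is to let the compiler compare the types. This is only a sketch: pointee_is_const is a made-up helper name, not a standard facility, and the checks rely on std::is_same from <type_traits>.

```cpp
#include <type_traits>

// A leading const and a const right after the type name give the same type.
static_assert(std::is_same<const int*, int const*>::value,
              "const int* and int const* are one type");
static_assert(std::is_same<const int* const, int const* const>::value,
              "same for the const-pointer variants");

// A const after the * is a different type: the pointer itself is const.
static_assert(!std::is_same<int const*, int* const>::value,
              "pointer-to-const differs from const-pointer");

// Hypothetical helper: reports whether the pointee type of P is const.
template <class P>
constexpr bool pointee_is_const() {
    return std::is_const<typename std::remove_pointer<P>::type>::value;
}
```

If any of the static_assert lines were wrong, the file would fail to compile, so no run is needed to check them.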
If you want to go really crazy you can do things like this:
int ** - pointer to pointer to int
int ** const - a const pointer to a pointer to an int
int * const * - a pointer to a const pointer to an int
int const ** - a pointer to a pointer to a const int
int * const * const - a const pointer to a const pointer to an int
...
And to make sure we are clear on the meaning of const:
int a = 5, b = 10, c = 15;
const int* foo; // pointer to constant int.
foo = &a; // assignment to where foo points to.
/* dummy statement*/
*foo = 6; // error: the value of a can't be changed through the pointer.
foo = &b; // the pointer foo can be changed.
int *const bar = &c; // constant pointer to int
// note, you actually need to set the pointer
// here because you can't change it later ;)
*bar = 16; // the value of c can be changed through the pointer.
/* dummy statement*/
bar = &a; // not possible because bar is a constant pointer.
foo is a variable pointer to a constant integer. This lets you change what you point to but not the value that you point to. Most often this is seen with C-style strings where you have a pointer to a const char. You may change which string you point to but you can't change the content of these strings. This is important when the string itself is in the data segment of a program and shouldn't be changed.
bar is a constant or fixed pointer to a value that can be changed. This is like a reference without the extra syntactic sugar. Because of this fact, usually you would use a reference where you would use a T* const pointer unless you need to allow NULL pointers.
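To make the comparison concrete, here is a small sketch. All three function names are invented for illustration; the commented-out lines are the ones a compiler would reject.

```cpp
// A const pointer to a mutable int: the pointee can change, the pointer cannot.
void add_one_via_pointer(int* const p) {
    if (p != nullptr) {   // a pointer may be null, so check first
        *p += 1;
    }
    // p = nullptr;       // would not compile: p itself is const
}

// The reference version: same effect, and no null state to handle.
void add_one_via_reference(int& r) {
    r += 1;
}

// A pointer to const char: it can be re-pointed, but the chars are read-only.
const char* pick_greeting(bool formal) {
    const char* text = "hi";
    if (formal) {
        text = "good day";  // changing what text points to is fine
    }
    // text[0] = 'H';       // would not compile: the chars are const
    return text;
}
```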
For those who don't know about Clockwise/Spiral Rule:
Start from the name of the variable and move clockwise (in this case, move backward) to the next pointer or type. Repeat until the expression ends.
I think everything is answered here already, but I just want to add that you should beware of typedefs! They're NOT just text replacements.
For example:
typedef char *ASTRING;
const ASTRING astring;
The type of astring is char * const, not const char *. This is one reason I always tend to put const to the right of the type, and never at the start.
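The typedef claim can be checked at compile time. A minimal sketch, where mark_first is a hypothetical function added only to show what the resulting type allows:

```cpp
#include <type_traits>

typedef char* ASTRING;

// const applies to the typedef'd type as a whole, i.e. to the pointer.
static_assert(std::is_same<const ASTRING, char* const>::value,
              "const ASTRING is a const pointer to char");
static_assert(!std::is_same<const ASTRING, const char*>::value,
              "it is not a pointer to const char");

// Hypothetical function: the pointer is const, the characters are not.
void mark_first(const ASTRING s) {
    s[0] = 'X';      // allowed: the pointee is plain char
    // s = nullptr;  // would not compile: s is char* const
}
```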
Like pretty much everyone pointed out:
What’s the difference between const X* p, X* const p and const X* const p?
You have to read pointer declarations right-to-left.
const X* p means "p points to an X that is const": the X object can't be changed via p.
X* const p means "p is a const pointer to an X that is non-const": you can't change the pointer p itself, but you can change the X object via p.
const X* const p means "p is a const pointer to an X that is const": you can't change the pointer p itself, nor can you change the X object via p.
Constant reference:
A reference to a variable (here an int) through which the variable cannot be modified. We mainly pass a variable by reference because that avoids copying the actual value, but there is a side effect: the reference is an alias for the actual variable, so we could accidentally change the original through our full access to the alias. Making the reference const prevents this side effect.
int var0 = 0;
const int &ptr1 = var0;
ptr1 = 8; // Error
var0 = 6; // OK
Constant pointers
Once a constant pointer points to a variable then it cannot point to any other variable.
int var1 = 1;
int var2 = 0;
int *const ptr2 = &var1;
ptr2 = &var2; // Error
Pointer to constant
A pointer through which one cannot change the value of the variable it points to is known as a pointer to constant.
int const * ptr3 = &var2;
*ptr3 = 4; // Error
Constant pointer to a constant
A constant pointer to a constant is a pointer that can neither change the address it's pointing to nor change the value kept at that address.
int var3 = 0;
int var4 = 0;
const int * const ptr4 = &var3;
*ptr4 = 1; // Error
ptr4 = &var4; // Error
This question shows precisely why I like to do things the way I mentioned in my question "Is const after type id acceptable?"
In short, I find the easiest way to remember the rule is that the "const" goes after the thing it applies to. So in your question, "int const *" means that the int is constant, while "int * const" would mean that the pointer is constant.
If someone decides to put it at the very front (eg: "const int *"), as a special exception in that case it applies to the thing after it.
Many people like to use that special exception because they think it looks nicer. I dislike it, because it is an exception, and thus confuses things.
The general rule is that the const keyword applies to what precedes it immediately. Exception, a starting const applies to what follows.
const int* is the same as int const* and means "pointer to constant int".
const int* const is the same as int const* const and means "constant pointer to constant int".
Edit:
For the Dos and Don'ts, if this answer isn't enough, could you be more precise about what you want?
Simple Use of const.
The simplest use is to declare a named constant. To do this, one declares a constant as if it were a variable but adds const before it. One has to initialize it immediately in the declaration because, of course, one cannot set the value later, as that would be altering it. For example:
const int Constant1=96;
will create an integer constant, unimaginatively called Constant1, with the value 96.
Such constants are useful for parameters which are used in the program but do not need to be changed after the program is compiled. It has an advantage for programmers over the C preprocessor #define command in that it is understood and used by the compiler itself, not just substituted into the program text by the preprocessor before reaching the main compiler, so error messages are much more helpful.
It also works with pointers, but one has to be careful where const is placed to determine whether the pointer or what it points to is constant, or both. For example:
const int * Constant2
declares that Constant2 is variable pointer to a constant integer and:
int const * Constant2
is an alternative syntax which does the same, whereas
int * const Constant3
declares that Constant3 is constant pointer to a variable integer and
int const * const Constant4
declares that Constant4 is constant pointer to a constant integer. Basically ‘const’ applies to whatever is on its immediate left (other than if there is nothing there in which case it applies to whatever is its immediate right).
ref: http://duramecho.com/ComputerInformation/WhyHowCppConst.html
It's simple but tricky. Please note that we can apply the const qualifier to any data type (int, char, float, etc.).
Let's see the below examples.
const int *p ==> *p is read-only [p is a pointer to a constant integer]
int const *p ==> *p is read-only [p is a pointer to a constant integer]
int *p const ==> Wrong Statement. Compiler throws a syntax error.
int *const p ==> p is read-only [p is a constant pointer to an integer].
As pointer p here is read-only, the declaration and definition should be in same place.
const int *p const ==> Wrong Statement. Compiler throws a syntax error.
const int const *p ==> *p is read-only [the duplicate const is accepted by C99 and later; C++ rejects it]
const int *const p ==> *p and p are read-only [p is a constant pointer to a constant integer]. As pointer p here is read-only, the declaration and definition should be in same place.
int const *p const ==> Wrong Statement. Compiler throws a syntax error.
int const int *p ==> Wrong Statement. Compiler throws a syntax error.
int const const *p ==> *p is read-only and is equivalent to int const *p [again valid in C99 and later only; C++ rejects the duplicate const]
int const *const p ==> *p and p are read-only [p is a constant pointer to a constant integer]. As pointer p here is read-only, the declaration and definition should be in same place.
I had the same doubt as you until I came across this book by the C++ guru Scott Meyers. Refer to the third Item in this book, where he talks in detail about using const.
Just follow this advice
If the word const appears to the left of the asterisk, what's pointed to is constant
If the word const appears to the right of the asterisk, the pointer itself is constant
If const appears on both sides, both are constant
An easy way to remember:
If const is before * then the value is constant.
If const is after * then the address is constant.
If const appears both before and after * then both the value and the address are constant.
e.g.
int * const var; //here address is constant.
int const * var; //here value is constant.
int const * const var; // both value and address are constant.
The C and C++ declaration syntax has repeatedly been described as a failed experiment, by the original designers.
Instead, let's name the type “pointer to Type”; I’ll call it Ptr_:
template< class Type >
using Ptr_ = Type*;
Now Ptr_<char> is a pointer to char.
Ptr_<const char> is a pointer to const char.
And const Ptr_<const char> is a const pointer to const char.
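A quick compile-time check of those three readings, plus one more case. This is a sketch: length_of is a hypothetical helper added only so there is something to call.

```cpp
#include <type_traits>

template <class Type>
using Ptr_ = Type*;

static_assert(std::is_same<Ptr_<char>, char*>::value,
              "pointer to char");
static_assert(std::is_same<Ptr_<const char>, const char*>::value,
              "pointer to const char");
static_assert(std::is_same<const Ptr_<const char>, const char* const>::value,
              "const pointer to const char");
// With the alias, a leading const always applies to the pointer.
static_assert(std::is_same<const Ptr_<char>, char* const>::value,
              "const pointer to char");

// Hypothetical helper: counts characters through a pointer to const char.
int length_of(Ptr_<const char> s) {
    int n = 0;
    while (s[n] != '\0') {
        ++n;
    }
    return n;
}
```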
For me, the position of const i.e. whether it appears to the LEFT or RIGHT or on both LEFT and RIGHT relative to the * helps me figure out the actual meaning.
A const to the LEFT of * indicates that the object pointed by the pointer is a const object.
A const to the RIGHT of * indicates that the pointer is a const pointer.
The following table is taken from Stanford CS106L Standard C++ Programming Laboratory Course Reader.
There are many other subtle points surrounding const correctness in C++. I suppose the question here has simply been about C, but I'll give some related examples since the tag is C++ :
You often pass large arguments like strings as TYPE const & which prevents the object from being either modified or copied. Example :
TYPE& TYPE::operator=(const TYPE &rhs) { ... return *this; }
But TYPE & const is meaningless because references are always const.
You should always label class methods that do not modify the class as const, otherwise you cannot call the method from a TYPE const & reference. Example :
bool TYPE::operator==(const TYPE &rhs) const { ... }
There are common situations where both the return value and the method should be const. Example :
const TYPE TYPE::operator+(const TYPE &rhs) const { ... }
In fact, const methods must not return internal class data as a reference-to-non-const.
As a result, one must often create both a const and a non-const method using const overloading. For example, if you define T const& operator[] (unsigned i) const;, then you'll probably also want the non-const version given by :
inline T& operator[] (unsigned i) {
// call the const overload, then strip the const it added to the result
return const_cast<T&>(
static_cast<const TYPE&>(*this)[i]
);
}
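For a version of the const-overloading pattern that compiles on its own, here is a sketch with a made-up class. IntBox and read_at are illustrative names, and the four-element array is an arbitrary choice.

```cpp
#include <cstddef>

// Hypothetical fixed-size container, used only to show the pattern.
class IntBox {
public:
    // const version: callable through a const IntBox&, gives read-only access.
    const int& operator[](std::size_t i) const { return data_[i]; }

    // non-const version: reuse the const one, then cast away the const
    // that was only added for the call. This is safe because *this is
    // known to be non-const inside a non-const member function.
    int& operator[](std::size_t i) {
        return const_cast<int&>(static_cast<const IntBox&>(*this)[i]);
    }

private:
    int data_[4] = {0, 0, 0, 0};
};

// Reads through a const reference, so only the const overload is usable here.
int read_at(const IntBox& box, std::size_t i) {
    return box[i];
}
```

The point of delegating is that any bounds checking or logging lives in one place, the const overload.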
Afaik, there are no const functions in C, non-member functions cannot themselves be const in C++, const methods might have side effects, and the compiler cannot use const functions to avoid duplicate function calls. In fact, even a simple int const & reference might witness the value to which it refers be changed elsewhere.
int const v and const int v are identical.
The const with the int on either sides will make pointer to constant int:
const int *ptr=&i;
or:
int const *ptr=&i;
const after * will make constant pointer to int:
int *const ptr=&i;
In this case all of these are pointer to constant integer, but none of these are constant pointer:
const int *ptr1=&i, *ptr2=&j;
In this case all are pointer to constant integer and ptr2 is constant pointer to constant integer. But ptr1 is not constant pointer:
int const *ptr1=&i, *const ptr2=&j;
if const is to the left of *, it refers to the value (it doesn't matter whether it's const int or int const)
if const is to the right of *, it refers to the pointer itself
it can be both at the same time
An important point: const int *p does not mean the value you are referring to is constant! It means that you can't change it through that pointer (meaning, you can't assign *p = ...). The value itself may be changed in other ways. E.g.
int x = 5;
const int *p = &x;
x = 6; //legal
printf("%d", *p); // prints 6
*p = 7; //error
This is meant to be used mostly in function signatures, to guarantee that the function can't accidentally change the arguments passed.
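As a sketch of both points (the function names sum and observe_change are invented for this example): a pointer-to-const parameter stops the function from writing, while the caller's object stays as mutable as it was declared.

```cpp
#include <cstddef>

// Hypothetical function: it cannot modify the ints it is given.
long sum(const int* values, std::size_t count) {
    long total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        total += values[i];
        // values[i] = 0;   // would not compile: *values is read-only here
    }
    return total;
}

// The pointee is only read-only through p; x itself can still change.
int observe_change() {
    int x = 5;
    const int* p = &x;
    x = 6;        // legal: x is not const
    return *p;    // reads the updated value
}
```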
This mostly addresses the second line: best practices, assignments, function parameters etc.
General practice. Try to make everything const that you can. Or to put that another way, make everything const to begin with, and then remove exactly the minimum set of consts necessary to allow the program to function. This will be a big help in attaining const-correctness, and will help ensure that subtle bugs don't get introduced when people try and assign into things they're not supposed to modify.
Avoid const_cast<> like the plague. There are one or two legitimate use cases for it, but they are very few and far between. If you're trying to change a const object, you'll do a lot better to find whoever declared it const in the first place and talk the matter over with them to reach a consensus as to what should happen.
Which leads very neatly into assignments. You can assign into something only if it is non-const. If you want to assign into something that is const, see above. Remember that in the declarations int const *foo; and int * const bar; different things are const - other answers here have covered that issue admirably, so I won't go into it.
Function parameters:
Pass by value: e.g. void func(int param) you don't care one way or the other at the calling site. The argument can be made that there are use cases for declaring the function as void func(int const param) but that has no effect on the caller, only on the function itself, in that whatever value is passed cannot be changed by the function during the call.
Pass by reference: e.g. void func(int &param) Now it does make a difference. As just declared func is allowed to change param, and any calling site should be ready to deal with the consequences. Changing the declaration to void func(int const &param) changes the contract, and guarantees that func can now not change param, meaning what is passed in is what will come back out. As others have noted this is very useful for cheaply passing a large object that you don't want to change. Passing a reference is a lot cheaper than passing a large object by value.
Pass by pointer: e.g. void func(int *param) and void func(int const *param) These two are pretty much synonymous with their reference counterparts, with the caveat that the called function now needs to check for nullptr unless some other contractual guarantee assures func that it will never receive a nullptr in param.
Opinion piece on that topic. Proving correctness in a case like this is hellishly difficult, it's just too damn easy to make a mistake. So don't take chances, and always check pointer parameters for nullptr. You will save yourself pain and suffering and hard to find bugs in the long term. And as for the cost of the check, it's dirt cheap, and in cases where the static analysis built into the compiler can manage it, the optimizer will elide it anyway. Turn on Link Time Code Generation for MSVC, or LTO/WHOPR (-flto, I think) for GCC, and you'll get it program wide, i.e. even in function calls that cross a source code module boundary.
At the end of the day all of the above makes a very solid case to always prefer references to pointers. They're just safer all round.
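The four parameter styles above, side by side. This is just a sketch; every function name here is made up for the example.

```cpp
#include <cstddef>
#include <string>

// By value: the top-level const only stops the body from reassigning n.
// The caller cannot tell the difference.
int doubled(int const n) {
    return n * 2;
}

// By non-const reference: the function may change the caller's object.
void append_suffix(std::string& s) {
    s += "!";
}

// By const reference: no copy is made and no modification is possible.
std::size_t text_length(std::string const& s) {
    return s.size();
}

// By pointer to const: like the const reference, but null is possible,
// so the function checks before dereferencing.
int value_or(int const* p, int fallback) {
    return (p != nullptr) ? *p : fallback;
}
```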
Just for the sake of completeness for C, following the others' explanations; not sure about C++.
pp - pointer to pointer
p - pointer
data - the thing pointed, in examples x
[brackets] - read-only variable
Pointer
p data - int *p;
p [data] - int const *p;
[p] data - int * const p;
[p] [data] - int const * const p;
Pointer to pointer
pp p data - int **pp;
[pp] p data - int ** const pp;
pp [p] data - int * const *pp;
pp p [data] - int const **pp;
[pp] [p] data - int * const * const pp;
[pp] p [data] - int const ** const pp;
pp [p] [data] - int const * const *pp;
[pp] [p] [data] - int const * const * const pp;
// Example 1
int x;
x = 10;
int *p = NULL;
p = &x;
int **pp = NULL;
pp = &p;
printf("%d\n", **pp);
// Example 2
int x;
x = 10;
int *p = NULL;
p = &x;
int ** const pp = &p; // Definition must happen during declaration
printf("%d\n", **pp);
// Example 3
int x;
x = 10;
int * const p = &x; // Definition must happen during declaration
int * const *pp = NULL;
pp = &p;
printf("%d\n", **pp);
// Example 4
int const x = 10; // Definition must happen during declaration
int const * p = NULL;
p = &x;
int const **pp = NULL;
pp = &p;
printf("%d\n", **pp);
// Example 5
int x;
x = 10;
int * const p = &x; // Definition must happen during declaration
int * const * const pp = &p; // Definition must happen during declaration
printf("%d\n", **pp);
// Example 6
int const x = 10; // Definition must happen during declaration
int const *p = NULL;
p = &x;
int const ** const pp = &p; // Definition must happen during declaration
printf("%d\n", **pp);
// Example 7
int const x = 10; // Definition must happen during declaration
int const * const p = &x; // Definition must happen during declaration
int const * const *pp = NULL;
pp = &p;
printf("%d\n", **pp);
// Example 8
int const x = 10; // Definition must happen during declaration
int const * const p = &x; // Definition must happen during declaration
int const * const * const pp = &p; // Definition must happen during declaration
printf("%d\n", **pp);
N-levels of Dereference
Just keep going, but may humanity excommunicate you.
int x = 10;
int *p = &x;
int **pp = &p;
int ***ppp = &pp;
int ****pppp = &ppp;
printf("%d \n", ****pppp);
const int* - pointer to constant int object.
You can change the value of the pointer; you can not change the value of the int object, the pointer points to.
const int * const - constant pointer to constant int object.
You can not change the value of the pointer nor the value of the int object the pointer points to.
int const * - pointer to constant int object.
This statement is equivalent to 1. const int* - You can change the value of the pointer but you can not change the value of the int object, the pointer points to.
Actually, there is a 4th option:
int * const - constant pointer to int object.
You can change the value of the object the pointer points to but you can not change the value of the pointer itself. The pointer will always point to the same int object but this value of this int object can be changed.
If you want to determine the type of a certain C or C++ construct, you can use the Clockwise/Spiral Rule made by David Anderson; not to be confused with Anderson's Rule made by Ross J. Anderson, which is something quite distinct.
simple mnemonic:
type pointer <- * -> pointee name
I like to think of int *i as declaring "the dereference of i is int"; in this sense, const int *i means "the deref of i is const int", while int *const i means "deref of const i is int".
(the one danger of thinking like this is it may lead to favoring int const *i style of declaration, which people might hate/disallow)
Nobody has mentioned the system underlying declarations which Kernighan and Ritchie pointed out in their C book:
Declarations mimic expressions.
I'll repeat this because it is so essential and gives a clear strategy to parse even the most complicated declarations:
Declarations mimic expressions.
The declarations contain the same operators as expressions the declared identifier can appear in later, with the same priority they have in expressions. This is why the "clockwise spiral rule" is wrong: The evaluation order is strictly determined by the operator precedences, with complete disregard for left, right or rotational directions.
Here are a few Examples, in order of increasing complexity:
int i;: When i is used as-is, it is an expression of type int. Therefore, i is an int.
int *p;: When p is dereferenced with *, the expression is of type int. Therefore, p is a pointer to int.
const int *p;: When p is dereferenced with *, the expression is of type const int. Therefore, p is a pointer to const int.
int *const p;: p is const. If this constant expression is dereferenced with *, the expression is of type int. Therefore, p is a const pointer to int.
const int *const p;: p is const. If this constant expression is dereferenced with *, the expression is of type const int. Therefore, p is a const pointer to const int.
So far we didn't have any issues with operator precedence yet: We simply evaluated right-to-left. This changes when we have fun with arrays of pointers and pointers to arrays. You may want to have a cheat sheet open.
int a[3];: When we apply the array indexing operator to a, the result is an int. Therefore, a is an array of int.
int *a[3];: Here the indexing operator has higher precedence, so we apply it first: When we apply the array indexing operator to a, the result is an int *. Therefore, a is an array of pointers to int. This is not uncommon.
int (*a)[3];: Here the operator precedence is overridden by round parentheses, exactly as in any expression. Consequently, we dereference first. We know now that a is a pointer to some type. *a, the dereferenced pointer, is an expression of that type. When we apply the array indexing operator to *a, we obtain a plain int, which means that *a is an array of three ints, and a is a pointer to that array. This is fairly uncommon outside of C++ templates, which is why the operator precedences are not catering to this case. Note how the use of such a pointer is the model for its declaration: int i = (*a)[1];. The parentheses are mandatory to dereference first.
int (*a)[3][2];: There is nothing preventing anybody from having pointers to multi-dimensional arrays, one case where circular spiral clockwise advice becomes obvious nonsense.
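A small sketch of the array cases, with the types checked by the compiler. The names grid, row_starts, row and second_of_row are invented for this example.

```cpp
#include <type_traits>

int grid[2][3] = {{1, 2, 3}, {4, 5, 6}};

// Array of three pointers to int: indexing binds first.
int* row_starts[3] = {&grid[0][0], &grid[1][0], &grid[0][2]};

// Pointer to an array of three ints: parentheses force the dereference first.
int (*row)[3] = &grid[1];

// Dereferencing row yields the whole array, not a single int.
static_assert(std::is_same<decltype(*row), int (&)[3]>::value,
              "*row is an array of three ints");
// Indexing row_starts yields a pointer to int.
static_assert(std::is_same<decltype(row_starts[0]), int*&>::value,
              "row_starts[0] is a pointer to int");

// The use has the same shape as the declaration: (*row)[1].
int second_of_row() {
    return (*row)[1];
}
```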
A thing that sometimes comes up in real life are function pointers. We need parentheses there as well because the function call operator (operator()() in C++, simple syntax rule in C) has higher priority than the dereferencing operator*(), again because it's more common to have functions returning pointers than pointers to functions:
int *f();: Function call first, so f is a function. The call must be dereferenced to result in an int, so the return value is a pointer to int. Usage: int i = *f();.
int (*fp)();: Parentheses change order of operator application. Because we must dereference first we know that fp is a pointer to something. Because we can apply the function call operator to *fp we know (in C) that fp is a pointer to a function; in C++ we only know that it is something for which operator()() is defined. Since the call takes no parameters and returns an int, fp is in C++ a pointer to a function with that signature. (In C, an empty parameter list indicates that nothing is known about the parameters, but future C specifications may forbid that obsolete use.)
int *(*fp)();: Of course we can return pointers to int from a function pointed to.
int (*(*fp)())[3];: Dereference first, hence a pointer; apply function call operator next, hence a pointer to function; dereference the return value again, hence a pointer to a function returning a pointer; apply the indexing operator to that: pointer to function returning pointer to array. The result is an int, hence pointer to function returning pointer to array of ints.
All parentheses are necessary: As discussed, we must prioritize dereferencing of the function pointer with (*fp) before anything else happens. Obviously, we need the function call; and since the function returns a pointer to an array (not to its first element!), we must dereference that as well before we can index it. I admit that I wrote a test program to check this because I wasn't sure, even with this fool-proof method ;-). Here it is:
#include <iostream>
using namespace std;
int (*f())[3]
{
static int arr[3] = {1,2,3};
return &arr;
}
int (*(*fp)())[3] = &f;
int main()
{
for(int i=0; i<3; i++)
{
cout << (*(*fp)())[i] << endl;
}
}
Note how beautifully the declaration mimics the expression!
A lot of people answered correctly; I will just organize things here and add some extra info that is missing from the given answers.
const is a keyword in the C language, also known as a qualifier. const can be applied to the declaration of any variable to specify that its value will not be changed.
const int a=3,b;
a=4; // gives an error
b=5; // gives an error, as b is also const int
You have to initialize it in the declaration itself, as there is no way to assign it afterwards.
How to read?
Just read every statement from right to left and it works smoothly.
3 main things
type a. p is ptr to const int
type b. p is const ptr to int
type c. p is const ptr to const int
[Error] if * comes before int
two types
1. const int *
2. const const int *
we look first
Major type 1. const int*
ways to arrange 3 things at 3 places 3!=6
i. * at start
*const int p [Error]
*int const p [Error]
ii. const at start
const int *p type a. p is ptr to const int
const *int p [Error]
iii. int at start
int const *p type a.
int * const p type b. p is const ptr to int
Major type 2. const const int*
ways to arrange 4 things at 4 places in which 2 are alike 4!/2!=12
i. * at start
* int const const p [Error]
* const int const p [Error]
* const const int p [Error]
ii. int at start
int const const *p type a. p is ptr to const int
int const * const p type c. p is const ptr to const int
int * const const p type b. p is const ptr to int
iii. const at start
const const int *p type a.
const const * int p [Error]
const int const *p type a.
const int * const p type c.
const * int const p [Error]
const * const int p [Error]
squeezing all in one
type a. p is ptr to const int (5)
const int *p
int const *p
int const const *p
const const int *p
const int const *p
type b. p is const ptr to int (2)
int * const p
int * const const p;
type c. p is const ptr to const int (2)
int const * const p
const int * const p
just little calculation
1. const int * p total arrangements (6) [Errors] (3)
2. const const int * p total arrangements (12) [Errors] (6)
little Extra
int const * p,p2 ;
here p is ptr to const int (type a.)
but p2 is just const int please note that it is not ptr
int * const p,p2 ;
similarly
here p is const ptr to int (type b.)
but p2 is just int, not even const int
int const * const p,p2 ;
here p is const ptr to const int (type c.)
but p2 is just const int.
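The "little extra" about multi-variable declarations can be verified with decltype. A sketch in C++ (the variables need initializers there, and target and sum_plain are names invented for the example):

```cpp
#include <type_traits>

int target = 1;

// The * belongs to p1 only; p2 is a plain const int.
int const *p1 = &target, p2 = 2;
// The "* const" belongs to q1 only; q2 is a plain, modifiable int.
int *const q1 = &target, q2 = 3;

static_assert(std::is_same<decltype(p1), int const*>::value,
              "p1 is a pointer to const int");
static_assert(std::is_same<decltype(p2), int const>::value,
              "p2 is a const int, not a pointer");
static_assert(std::is_same<decltype(q1), int* const>::value,
              "q1 is a const pointer to int");
static_assert(std::is_same<decltype(q2), int>::value,
              "q2 is a plain int");

// Uses p2 and q2 as the ordinary integers they are.
int sum_plain() {
    return p2 + q2;
}
```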
Finished
I'm taking a specialization on Coursera and in a lesson it explains the qsort() function that sorts a given array:
void qsort(void *base, size_t nmemb, size_t size, int (*compar)(const void *, const void *));
where we should provide qsort() with four parameters - the array to sort, number of elements in the array, size of each element of the array, and a pointer to a function (compar) which takes two const void *s and returns an int. The lesson says that we need to write the compar function to be compatible with the qsort function, so if we would like to compare two strings the function should look like:
int compareStrings(const void * s1vp, const void * s2vp) {
// first const: s1vp actually points at (const char *)
// second const: cannot change *s1vp (is a const void *)
const char * const * s1ptr = s1vp;
const char * const * s2ptr = s2vp;
return strcmp(*s1ptr, *s2ptr);
}
void sortStringArray(const char ** array, size_t nelements) {
qsort(array, nelements, sizeof(const char *), compareStrings);
}
It says: Note that the pointers passed in are pointers to the elements in the array (that is, they point at the boxes in the array), even though those elements are themselves pointers (since they are strings). When we convert them from void *s, we must take care to convert them to the correct type—here, const char * const *—and use them appropriately, or our function will be broken in some way. For example, consider the following broken code:
// BROKEN DO NOT DO THIS!
int compareStrings(const void * s1vp, const void * s2vp) {
const char * s1 = s1vp;
const char * s2 = s2vp;
return strcmp(s1, s2);
}
The thing that I can't really get is why didn't we consider s1vp and s2vp as pointers to pointers? I mean, since the arguments passed to the function compareStrings are addresses of pointers pointing to strings (address of pointer), shouldn't we have declared s1vp and s2vp as int compareStrings(const void ** s1vp, const void ** s2vp) since they are receiving addresses of pointers?
In other words, I'm passing, for example, the address of the first element of the array of strings, which is actually a pointer, to s1vp. So now s1vp is receiving the address of a pointer, not of a plain variable, so we should declare it as a pointer to pointer, right? It gives me a warning when I try to do so...
A void * can point to any datatype. The fact that the datatype in question is also a pointer doesn't change things.
Also, you can't change the signature of the comparison function, otherwise it would be incompatible with what qsort is expecting and can lead to undefined behavior.
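To see the signature and the conversion working together, here is the lesson's code adapted as a sketch for C++, which needs an explicit static_cast where C converts from void * implicitly. The function names are the ones from the question.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Each void* that qsort hands over is the address of one array element.
// The elements are const char*, so the real type is const char* const*.
int compareStrings(const void* s1vp, const void* s2vp) {
    const char* const* s1ptr = static_cast<const char* const*>(s1vp);
    const char* const* s2ptr = static_cast<const char* const*>(s2vp);
    return std::strcmp(*s1ptr, *s2ptr);
}

// The comparator keeps the exact signature qsort expects;
// only the body knows what the void pointers really point at.
void sortStringArray(const char** array, std::size_t nelements) {
    std::qsort(array, nelements, sizeof(const char*), compareStrings);
}
```

The parameter type stays const void * because that is a single level of "pointer to something"; the fact that the something is itself a pointer only shows up after the cast.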
I am a beginner programmer in C who wants to get used to terminology and pointers.
I have found the following working function prototype while searching for a way to sort the elements of a numerical array. The function was qsort and it utilized pointers. Now what I understood is that the word "const" ensures that the values a and b are unchanged but not the pointers. Correct me if I am wrong here. My questions are:
Why do we use void * in the function; can we not use int * from the start?
How does the construction *(int*)a in the return part work?
Why does the qsort algorithm need this many arguments?
int compare (const void *a, const void *b)
{
return ( *(int*)a - *(int*)b );
}
Many thanks for the answers.
PS: That is a pretty complicated task for me.
qsort was made this way so it could be used as a generic sorter. If it used int from the start, it could only be used to sort integers. This way you could also, for example, sort strings by passing qsort a compare function built on strcmp.
*(int*)a casts a to a pointer-to-int and then dereferences it, so you get the integer stored at a. Note that this doesn't change a or the value that a points to.
qsort requires 4 arguments: the array to sort, the number of elements in that array and the size of the elements and finally the compare function. It needs all this information, because again, it is made to be as generic as possible.
It needs the number of elements because in C pointers don't carry information about the size of the buffer that follows them. And it needs to know the size of each element so it can pass the elements to the compare function correctly. For example, to compare ints you would pass sizeof(int) as the size parameter. To compare strings you would use sizeof(char *).
ADDIT as suggested by H2CO3 the reason for using const void * is to indicate that the compare function may not change the value pointed to by a and b. This, of course, to ensure that sorting the array doesn't suddenly change the values in the array. And, as H2CO3 said, it would be cleaner to cast to (const int *) so that you cannot accidentally change the value after casting it:
return *(const int *)a - *(const int *)b;
You could also get rid of the cast with:
int compare(const void * a, const void * b){
const int * ia = a;
const int * ib = b;
return *ia - *ib;
}
depending on your tastes regarding casts. (I prefer to avoid them)
Finally, to clarify the asterisks:
*(int *)a
^ ^
| └ cast to integer pointer
└ dereference (integer) pointer
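One thing the subtraction-based comparators above do not handle: *a - *b can overflow when the two ints are far apart (for example a very large positive and a very large negative value). A common alternative, which is my substitution and not something from the answers here, compares instead of subtracting. The names compare_ints and sort_ints are made up for this sketch, written in C++ with explicit casts.

```cpp
#include <cstddef>
#include <cstdlib>

// Returns -1, 0 or 1 without doing any arithmetic on the values,
// so it cannot overflow.
int compare_ints(const void* a, const void* b) {
    const int ia = *static_cast<const int*>(a);
    const int ib = *static_cast<const int*>(b);
    return (ia > ib) - (ia < ib);
}

// Thin wrapper so the four qsort arguments are visible in one place:
// the array, the element count, the element size, the comparator.
void sort_ints(int* values, std::size_t count) {
    std::qsort(values, count, sizeof(int), compare_ints);
}
```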
Now what I understood is that the word "const" ensures that the values a and b are unchanged but not the pointers
You understood wrong.
const int *a;
declare a as pointer to constant int type. This means that the word const ensures that you can't modify the value of the variable a points to by modifying *a.
Why do we use void * the function can we not use int * from the start?
void * can point to a variable of any type.
How does the construction *(int*)a in the return part work?
*(int *)a casts a to a pointer to int and then dereferences it to get the value stored at the location it points to.
The other answers are excellent. I just want to add that it's often easier to read if you are very explicit in your callback function.
int compare (const void *a, const void *b)
{
return ( *(int*)a - *(int*)b );
}
becomes
int compare (const void *a, const void *b)
{
int ia = *(int *)a;
int ib = *(int *)b;
return ia - ib;
}
In this case it's not too important, but as your compare function gets more complex, you may want to convert your variables to "your type" before doing the compare.
Since you asked in the comment below, here is a very step by step version:
int compare (const void *a, const void *b)
{
int *pa = (int *)a;
int *pb = (int *)b;
int ia = *pa;
int ib = *pb;
return ia - ib;
}
qsort() function is an example of a generic algorithm that was implemented as a general-purpose routine. The idea is to make it useful for sorting arbitrary objects, not just int or float. Because of that (and because of the C language design), qsort() resorts to taking a comparison function as a parameter that accepts two generic (in C sense) pointers. It is up to that function (provided by qsort() user) to cast these pointers to correct type, perform correct comparison and return an indication of ordering.
Similarly, since qsort() doesn't know beforehand how large objects are, it takes the object size as a parameter. As far as qsort() is concerned, the objects are blobs of bytes of equal size contiguously arranged in memory.
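As a rough illustration of the "blobs of bytes" view, the sketch below sorts an array of structs. The struct point type and the function names are hypothetical, invented only for this example; qsort itself only ever sees the element size and the comparator.

```c
#include <stdlib.h>

/* Hypothetical record type, used only for illustration. */
struct point {
    int x;
    int y;
};

/* Orders points by x first, then by y. */
int compare_points(const void *a, const void *b)
{
    const struct point *pa = a;
    const struct point *pb = b;
    if (pa->x != pb->x)
        return (pa->x > pb->x) - (pa->x < pb->x);
    return (pa->y > pb->y) - (pa->y < pb->y);
}

void sort_points(struct point *pts, size_t n)
{
    /* sizeof *pts tells qsort how many bytes make up one element */
    qsort(pts, n, sizeof *pts, compare_points);
}
```

qsort moves whole sizeof(struct point) chunks around without knowing what is inside them; only compare_points interprets the bytes.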
Finally, since none of the operations qsort() performs can cause an error, it doesn't return an error code. Actually there is one situation where qsort() might fail, namely when illegal parameters are passed to it. But, in the tradition of many other standard C library routines, it does not guarantee any error checking on its parameters, and the behavior is undefined in such a case.
This is from a 'magic' array library that I'm using.
void
sort(magic_list *l, int (*compare)(const void **a, const void **b))
{
qsort(l->list, l->num_used, sizeof(void*),
(int (*)(const void *,const void *))compare);
}
My question is: what on earth is the last argument to qsort doing?
(int (*)(const void *, const void*))compare)
qsort takes int (*comp_fn)(const void *,const void *) as its comparator argument, but this sort function takes a comparator with double pointers. Somehow, the line above converts the double pointer version to a single pointer version. Can someone help explain?
That's exactly what the cast you quoted does: it converts a pointer of type
int (*)(const void **, const void **)
to a pointer of type
int (*)(const void *, const void *)
The latter is what is expected by qsort.
Things like this are encountered rather often in bad quality code. For example, when someone wants to sort an array of ints, they often write a comparison function that accepts pointers to int
int compare_ints(const int *a, const int *b) {
return (*a > *b) - (*a < *b);
}
and when the time comes to actually call qsort they forcefully cast it to the proper type to suppress the compiler's complaints
qsort(array, n, sizeof *array, (int (*)(const void *,const void *)) compare_ints);
This is a "hack", which leads to undefined behavior. It is, obviously, a bad practice. What you see in your example is just a less direct version of the same "hack".
The proper approach in such cases would be to declare the comparison function as
int compare_ints(const void *pa, const void *pb) {
int a = *(const int *) pa;
int b = *(const int *) pb;
return (a > b) - (a < b);
}
and then use it without any casts
qsort(array, n, sizeof *array, compare_ints);
In general, if one expects their comparison functions to be used as comparators in qsort (and similar functions), one should implement them with const void * parameters.
The last argument to qsort is casting a function pointer taking double pointers, to one taking single pointers that qsort will accept. It's simply a cast.
On most hardware you can assume that pointers all look the same at the hardware level. For example, in a system with flat 64bit addressing pointers will always be a 64bit integer quantity. The same is true of pointers to pointers or pointers to pointers to pointers to pointers.
Therefore, whatever method is used to invoke a function with two pointers will work with any function that takes two pointers. The specific type of the pointers doesn't matter.
qsort treats pointers generically, as though each is opaque. So it doesn't know or care how they're dereferenced. It knows what order they're currently in and uses the compare argument to work out what order they should be in.
The library you're using presumably keeps lists of pointers to pointers about. It has a compare function that can compare two pointers to pointers. So it casts that across to pass to qsort. It's just syntactically nicer than, e.g.
qsort(l->list, l->num_used, sizeof(void*), compare);
/* elsewhere */
int compare(const void *ptr1, const void *ptr2)
{
// these are really pointers to pointers, so cast them across
const void **real_ptr1 = (const void **)ptr1;
const void **real_ptr2 = (const void **)ptr2;
// do whatever with real_ptr1 and 2 here, e.g.
return (*real_ptr2)->sort_key - (*real_ptr1)->sort_key;
}
It is casting a function pointer. I imagine that the reason is so that compare can be applied to the pointers that are dereferenced rather than whatever they are pointing to.
(int (*)(const void *,const void *))compare is a C style cast to cast the function pointer compare to a function pointer with two const void * args.
The last argument is a function pointer. It specifies that it takes a pointer to a function that returns an int and takes two const void ** arguments.
Suppose I have an array of pointers to char in C:
char *data[5] = { "boda", "cydo", "washington", "dc", "obama" };
And I wish to sort this array using qsort:
qsort(data, 5, sizeof(char *), compare_function);
I am unable to come up with the compare function. For some reason this doesn't work:
int compare_function(const void *name1, const void *name2)
{
const char *name1_ = (const char *)name1;
const char *name2_ = (const char *)name2;
return strcmp(name1_, name2_);
}
I did a lot of searching and found that I had to use ** inside of qsort:
int compare_function(const void *name1, const void *name2)
{
const char *name1_ = *(const char **)name1;
const char *name2_ = *(const char **)name2;
return strcmp(name1_, name2_);
}
And this works.
Can anyone explain the use of *(const char **)name1 in this function? I don't understand it at all. Why the double pointer? Why didn't my original function work?
Thanks, Boda Cydo.
If it helps keep things straight in your head, the type that you should cast the pointers to in your comparator is the same as the original type of the data pointer you pass into qsort (that the qsort docs call base). But for qsort to be generic, it just handles everything as void*, regardless of what it "really" is.
So, if you're sorting an array of ints, then you will pass in an int* (converted to void*). qsort will give you back two void* pointers to the comparator, which you convert to int*, and dereference to get the int values that you actually compare.
Now replace int with char*:
if you're sorting an array of char*, then you will pass in a char** (converted to void*). qsort will give you back two void* pointers to the comparator, which you convert to char**, and dereference to get the char* values you actually compare.
In your example, because you're using an array, the char** that you pass in is the result of the array of char* "decaying" to a pointer to its first element. Since the first element is a char*, a pointer to it is a char**.
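Putting that together for the array of char * from the question, a minimal sketch might look like this. The names compare_strings and sort_strings are invented for the example, and the array is taken as const char ** on the assumption that the strings themselves are never modified.

```c
#include <stdlib.h>
#include <string.h>

int compare_strings(const void *a, const void *b)
{
    /* a and b point at array elements, and each element is a char *,
       so they are really pointers to pointers to char */
    const char *sa = *(const char *const *)a;
    const char *sb = *(const char *const *)b;
    return strcmp(sa, sb);
}

/* Hypothetical helper: sorts n string pointers into strcmp order. */
void sort_strings(const char **strings, size_t n)
{
    /* each element is one pointer, so the element size is sizeof(char *) */
    qsort(strings, n, sizeof strings[0], compare_strings);
}
```

Only the pointers in the array are rearranged; the characters they point to stay where they are.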
Imagine your data was double data[5] .
Your compare method would receive pointers (double*, passed as void*) to the elements (double).
Now replace double with char* again.
qsort is general enough to sort arrays consisting of things other than pointers. That's why the size parameter is there. It cannot pass the array elements to the comparison function directly, as it does not know at compile time how large they are. Therefore it passes pointers. In your case you get pointers to char *, i.e. char **.
The comparison function takes pointers to the type of object that's in the array you want to sort. Since the array contains char *, your comparison function takes pointers to char *, aka char **.
Maybe it is easier to give you a code example of my own. I am trying to sort an array of TreeNode pointers, and the first few lines of my comparator look like:
int compareTreeNode(const void* tt1, const void* tt2) {
const TreeNode *t1, *t2;
t1=*(const TreeNode**)tt1;
t2=*(const TreeNode**)tt2;
After that you do your comparison using t1 and t2.
from man qsort:
The contents of the array are sorted in ascending
order according to a comparison function pointed to by
compar, which is called with two arguments that **point**
to the objects being compared.
So it sounds like the comparison function gets pointers to the array elements. Now a pointer to a char * is a char **
(i.e. a pointer to a pointer to a character).
char *data[5] = { "boda", "cydo", "washington", "dc", "obama" };
is a statement asking the compiler for an array of size 5 of character pointers. You have initialized those pointers to string literals, but to the compiler, it's still an array of five pointers.
When you pass that array into qsort, the array of pointers decays into a pointer pointing to the first element, in accordance with C array parameter passing rules.
Therefore you must process one level of indirection before you can get to the actual character arrays containing the constants.
@bodacydo, here is a program that may explain what the other answers are trying to convey, but in the context of integers:
#include <stdio.h>
void fun(int **ptr); /* declared before it is used in main() */
int main(void)
{
int i, j;
int *x[2] = {&i, &j};
i = 10; j = 20;
/* %p expects a void *, so the pointers are cast explicitly */
printf("in main() address of i = %p, address of j = %p\n", (void *)&i, (void *)&j);
fun(x);
fun(x + 1);
return 0;
}
void fun(int **ptr)
{
printf("value (it would be an address) of decayed element received = %p, double dereferenced value is %d\n", (void *)*ptr, **ptr);
printf("the decayed value can also be printed as *(int **)ptr = %p\n", (void *)*(int **)ptr);
}
qsort() passes pointers to the array elements to the user-defined comparison function. Since your elements are char * (pointers to char arrays), the comparison function receives pointers to pointers, hence char **, and has to dereference once to get at the char * values.