Type of a pointer? - c

There are countless questions about pointers here on SO, and countless resources on the internet, but I still haven't been able to understand this.
This answer quotes A Tutorial on Pointers and Arrays in C: Chapter 3 - Pointers and Strings:
int puts(const char *s);
For the moment, ignore the const. The parameter passed to puts() is a pointer, that is the value of a pointer (since all parameters in C are passed by value), and the value of a pointer is the address to which it points, or, simply, an address. Thus when we write puts(strA); as we have seen, we are passing the address of strA[0].
I don't understand this, at all.
Why does puts() need a pointer to a string constant? puts() doesn't modify and return its argument, just writes it to stdout, and then the string is discarded.
Ignoring the why, how is it that puts()'s prototype, which explicitly takes a pointer to a string constant, accepts a string literal, not a pointer to one? That is, why does puts("hello world"); work when puts()'s prototype would indicate that puts() needs something more like char hello[] = "hello world"; puts(&hello);?
If you give, for instance, printf() a pointer to a string constant, which is apparently what it wants, GCC will complain and your program will segfault, because:
error: format ‘%s’ expects argument of type ‘char *’, but argument 2 has type ‘char (*)[6]’
But giving printf() a string constant, not a pointer to a string, works fine.
This Programmers.SE question's answers make a lot of sense to me.
Going off that question's answers, pointers are just numbers which represent a position in memory. Numbers for memory addresses are unsigned ints, and C is written in (native) C and assembly, so pointers are simply architecture-defined uints.
But this is not the case, since the compiler is very clear in its errors about how int, int * and int ** are not the same. They are a pathway that eventually points to something in memory.
Why do functions that need a pointer accept something which is not a pointer, and reject a pointer?
I'm aware a "string constant" is actually an array of characters but I'm trying to simplify here.

The expression "hello world" has type char[12].
In most contexts, use of an array is converted to a pointer to its first element: in the case of "hello world" it is converted to a pointer to the 'h', of type char*.
When using puts("hello world"), the array is converted to char*.
Note that the conversion from an array of a specific size loses the size information.
char array[42];
printf("size of array is %d\n", (int)sizeof array);
printf("size of pointer is %d\n", (int)sizeof &array[0]);

puts() doesn't need a pointer to a string, it needs a pointer (*) to a character (char). It happens that in C, a pointer to a character (char *) can be assimilated to a string (an array of chars), provided that the end of the string is a null character \0.

Why does puts() need a pointer to a string constant? puts() doesn't modify and return its argument, just writes it to stdout, and then the string is discarded.
puts receives a pointer to the first character in a string; it will then "walk" down that string until it sees a 0 terminator. A naive implementation would look something like this:
void puts( const char *ptr )
{
  while ( *ptr )          // loop until we see a 0-valued byte
    putchar( *ptr++ );    // write the current character, advance the pointer
                          // to point to the next character in the string.
  putchar( '\n' );
}
Ignoring the why, how is it that puts()'s prototype, which explicitly takes a pointer to a string constant, accepts a string literal, not a pointer to one? That is, why does puts("hello world"); work when puts()'s prototype would indicate that puts() needs something more like char hello[] = "hello world"; puts(&hello);?
Except when it is the operand of the sizeof or unary & operator, or is a string literal being used to initialize another array in a declaration, an expression of type "N-element array of T" will be converted ("decay") to an expression of type "pointer to T", and the value of the expression will be the address of the first element of the array.
String literals are stored as arrays of char (const char in C++); thus, the string literal "hello world" is an expression of type "12-element array of char". When you call puts( "hello world" );, the string literal is not the operand of the sizeof or unary & operators, so the type of the expression is converted to "pointer to char", and the value of the expression is the address of the first character in the string.
If you give, for instance, printf() a pointer to a string constant, which is apparently what it wants, GCC will complain and your program will segfault, because:
error: format ‘%s’ expects argument of type ‘char *’, but argument 2 has type ‘char (*)[6]’
Remember above where I said an array expression is converted to a pointer type except when it is the operand of the sizeof or unary & operators or used to initialize another array in a declaration. Assume the declaration
char hello[] = "hello world";
Like above, the expression "hello world" has type 12-element array of char; however, because it is being used to initialize another array of char in a declaration, it is not converted to a pointer expression; instead, the contents of the string literal are copied to the hello array.
Similarly, if you call printf as follows:
printf( "%s", &hello );
then the expression hello is not converted to a pointer to char; instead, the type of the expression &hello is "pointer to 12-element array of char", or char (*)[12]. Since the %s conversion specifier expects a char *, you should just pass the array expression as
printf( "%s", hello );
and with string literals, just use the literal:
printf( "%s", "hello world" );
Going off that question's answers, pointers are just numbers which represent a position in memory. Numbers for memory addresses are unsigned ints, and C is written in (native) C and assembly, so pointers are simply architecture-defined uints.
But this is not the case, since the compiler is very clear in its errors about how int, int * and int ** are not the same. They are a pathway that eventually points to something in memory.
C is a (more or less) strongly-typed language; types matter. Even though an int, int *, and int ** may take up the same amount of space in memory [1], semantically they are very different things and are (usually) not interchangeable. A pointer to an int is a distinct type from a pointer to float, which is a distinct type from a pointer to an array of char, etc. This matters for things like pointer arithmetic; when you write
T *p = some_address();
p++;
the expression p++ advances p to point to the next object of type T. If sizeof (T) is 1, then p++ advances a single byte; if sizeof (T) is 4, then p++ advances 4 bytes (assuming a byte-addressed architecture, which most of us work on).
[1] Or not. There is no guarantee that pointers to different types have the same size or representation as each other, nor is it guaranteed that they're just unsigned integers; on a segmented architecture, they may have a more complicated segment:offset representation.

Several questions in there, but hopefully I can illustrate how pointers to pointers work.
The reason puts needs a pointer is that C really does not have a built-in type for a string. A string is just a bunch of chars one after another. Hence, puts needs a pointer to the first of the chars.
The string literal "decays" to a pointer. This is compiler speak meaning that a string literal actually is an array of chars and, in most expressions, is represented by a pointer to the first of the chars.
You need a pointer to a pointer to a type, for instance, if you want to "return" an array from a function, like so:
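#include <stdbool.h>
#include <stdlib.h>

#define IMAGE_DEPTH 3 /* assumed value for illustration; the original does not define it */

bool magic_super_function(int frob, int niz, char** imageptr /* pointer to pointer */)
{
    char* img = malloc(frob * niz * IMAGE_DEPTH);
    if (NULL == img) { /* check the pointer that malloc returned */
        return false;
    }
    *imageptr = img; /* hand the allocation back through the caller's pointer */
    return true;
}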
Sometimes an example (even contrived) can illustrate a point. You would call this function like so:
char* img; /* pointer to char */
if (true == magic_super_function(12, 8, &img /* pointer to pointer (to char)*/ )) {
    /* Here img is valid and allocated */
    /* Do something with img */
} else {
    /* img has no valid value here. Do not use it. */
    /* Something failed */
}

An int is different from int* because of how it will be used in the code. You can expect to access the memory location that int* points to and find an integer value. This is called 'strong typing' and the language does this so that there are strict rules for how you use your variables. So even though an int and int* might both be the same size, an int cannot be used as a pointer. Similarly an int** is a pointer to a pointer, so would have to be dereferenced twice to find the actual integer value it refers to.
In the example of puts(const char*) the definition of the function tells you that the function expects a memory location (pointer) to a null-terminated set of char values. When doing the operation, puts will dereference the location you give it, and print the characters found there. The const part tells you it won't be changing the values either so that it's safe to send a const array of char to it. When you send a literal string like puts("hello"), the compiler turns that into a pointer to "hello" for you as a convenience, so a pointer is still sent (not a copy of the string).
Regarding your question about printf, note that char * and char (*)[6] are different. The first is a pointer to a single char, which by convention is the first character of a null-terminated string; the second is a pointer to an array of exactly six char values, which may or may not be null-terminated. The compiler complains because the types do not match: if the callee treated a pointer to an array that is not null-terminated as a string, it would not stop at the end of the array and would access memory that it should not.

int **r = 90; r is a double pointer and you are assigning 90 to the pointer. When you dereference it, the program will try to read from address 90 (0x5A).

Why does puts() need a pointer to a string constant?
puts() is defined to take a pointer so that it can read the caller's characters in place instead of copying the whole string, which improves performance. Moreover, the parameter is a pointer to const char, so puts() cannot change the content the pointer points to. The pointer itself is still passed by value; passing it gives the effect of call by reference.
How is it that puts()'s prototype, which explicitly takes a pointer to
a string constant, accepts a string literal, not a pointer to one?
When you pass a string literal, first the string literal is stored in read only memory and then a pointer to that memory is actually passed. So you can call puts() with any literal like puts("abcd"), puts("xyz"). It will work.
error: format ‘%s’ expects argument of type ‘char *’, but argument 2 has type ‘char (*)[6]’
Here you are actually passing a pointer to an array of 6 chars, not a char *, so the compiler reports this error.

Related

Dereferencing char pointer returns int ? why? [duplicate]

For example, the following code returns an error and a warning when compiled and an int when changed to %d
Warning:
format %s expects argument of type char *, but argument 2 has type int
void stringd() {
    char *s = "Hello";
    printf("dereferenced s is %s", *s);
}
*s is an expression of type char since it's the dereference operator applied to a pointer-to-char [1]. As a result, it gets promoted to an int when passed to printf; in order to print a null-terminated string, you need to pass the pointer to the first character (i.e. just s).
[1] Even though s is not a pointer to const, you should not try to modify the characters it points to, as they may be placed in read-only memory where string literals are stored on some architectures/environments; see this discussion for more details.
The variable s is a pointer to the first out of a series of characters which are consecutive in memory (colloquially referred to as a "string", though it's not quite the same). It's a pointer to a character, thus char *.
Dereferencing s (by doing *s) gives you the first of those characters, 'H', whose type is now just char. One layer of indirection was stripped away.
Thus, the issue is that you're trying to pass a character (char) where a string (char *) was expected. char * was expected because you used the %s conversion specifier in your format string to printf. Instead, you should use %c, which expects a single char.
The mistake here is actually quite grave. If you were allowed to pass this 'H' where a char * was expected, you would end up with the ASCII code of 'H' (0x48) being passed where a pointer was expected. printf would be none the wiser, and would try to dereference that value, treating 0x48 like a pointer to the beginning of a string. Of course, that's probably not a valid memory location in your program, so that should seg-fault pretty reliably, if it were allowed to happen.

Why does passing a string directly to printf work correctly?

I know that in C both of these work:
char* string = "foo";
printf("string value: %s", string);
and more simply:
printf("string value: %s", "foo");
But I was asking myself why.
I know that the %s specifier expects the argument to be a char*, and string actually is one (and it would be the same with an array of characters, because these two datatypes are pretty much the same in C),
but when I pass a string directly to printf, shouldn't it be different? I mean, "foo" is not a pointer anymore... right?
The string constant "foo" has type char[4]. When passed to a function, the array decays to a pointer, i.e. char *. So you can pass it to a function that expects a char *.
For the same reason, you can also pass a variable of this type:
char string[4] = "foo";
printf("string value: %s", string);
The "foo" is a string literal. It represents an unnamed array object with static storage duration, of type char[4] (that is, without a const qualifier). It is passed to a function just as any "normal" array would be.
Even though the array is not const, you are not allowed to modify its values. Such modification results in undefined behavior:
char* string = "foo";
string[0] = 'b'; // wrong, this invokes UB
The array has four elements because of the trailing null character '\0', sometimes referred to as the NUL character. Please don't confuse it with NULL, which is a different thing. The purpose of that character is to terminate the given string literal.
The function's parameter receives a pointer to char, because the array object is converted into a pointer to the array's first element (i.e. a pointer to the first character in the array). To be precise, what is passed is the value of that pointer, i.e. the address it holds.
In C, all strings are null-terminated char arrays, so both of your examples behave in exactly the same way.
The ISO C standard, section 7.1.1, defines a string this way:
A string is a contiguous sequence of characters terminated by and
including the first null character.
What printf() gets, is a pointer:
ISO/IEC 9899:TC3, 6.5.2.2 – 4:
An argument may be an expression of any object type. In preparing for the call to a function, the arguments are evaluated, and each parameter is assigned the value of the corresponding argument.81)
81) A parameter declared to have array or function type is adjusted to have a pointer type as described in 6.9.1.
ISO/IEC 9899:TC3, 6.9.1 – 10:
On entry to the function, the size expressions of each variably modified parameter are evaluated and the value of each argument expression is converted to the type of the corresponding parameter as if by assignment. (Array expressions and function designators as arguments were converted to pointers before the call.)
"foo", in the end, is passed as a pointer to a statically allocated 4-byte memory region (likely marked read-only) that is initialized with the content 'f', 'o', 'o', '\0'.

Assign pointer type to string type

I'm expecting a compile error, taking into account that a pointer has to be assigned to %p, but the code below doesn't give me an error when I intentionally assign a pointer to %s. By adding an ampersand (&), it should by rights produce the address of the array and assign that memory address to %p instead of giving the value of the string, unless I dereference the pointer. But I don't dereference the pointer at all: I never put an asterisk (*) in front of my_pointer in printf.
#include <stdio.h>
int main()
{
    char words[] = "Daddy\0Mommy\0Me\0";
    char *my_pointer;
    my_pointer = &words[0];
    printf("%s \n", my_pointer);
    return 0;
}
please look at this :
printf("%s \n", my_pointer);
My understanding is that *my_pointer (with an asterisk) should give me the value of the string.
But my_pointer (without an asterisk) shouldn't give me the value of the string; it should give me only the memory address. Yet when I run this code, I get the value of the string even though I didn't put an asterisk in front. I hope I'm making myself clear this time.
Here:
printf("%s \n", my_pointer);
%s, expects a char* and since my_pointer is a char* which points to an array holding a NUL-terminated string, the printf has no problems and is perfectly valid. Relevant quote from the C11 standard (emphasis mine):
7.21.6.1 The fprintf function
[...]
The conversion specifiers and their meanings are:
[...]
s - If no l length modifier is present, the argument shall be a pointer to the initial
element of an array of character type. 280) Characters from the array are
written up to (but not including) the terminating null character. If the
precision is specified, no more than that many bytes are written. If the
precision is not specified or is greater than the size of the array, the array shall
contain a null character.
[...]
IMO, You are being confused here:
taking into account that a pointer has to be assigned to %p, but the code below doesn't give me an error when I intentionally assign a pointer to %s
First of all, %s, %p etc are conversion specifiers. They are used in some functions like printf, scanf etc.
Next, you are the one specifying the type of the pointers. So here:
my_pointer = &words[0];
&words[0] as well as my_pointer is of type char*. Assigning these two is therefore perfectly valid as both are of the same type.
The compiler is treating your code exactly as it is required to.
The %s format specifier tells printf() to expect a const char * as the corresponding argument. It then deems that pointer to be the address of the first element of an array of char and prints every char it finds until it encounters one with value zero ('\0').
Strictly speaking, the compiler is not even required to check that my_pointer is, or can be implicitly converted to, a const char *. However, most modern compilers (assuming the format string is supplied at compile time) do that.
In C, an array name used in an expression is usually converted to a pointer to the first element, which means that in your case words and &words[0], when used as pointers, have the same value.
You assign it to another pointer of the same type, so this is legal.
As for strings in C, a string is just an array of chars ending with '\0', and the array's name converts to a pointer to the first char.

Sizeof doesn't return the true size of variable in C

Consider the following code
#include <stdio.h>
void print(char string[]){
    printf("%s:%zu\n",string,sizeof(string)); /* %zu is the specifier for size_t */
}
int main(){
    char string[] = "Hello World";
    print(string);
}
and the output is
Hello World:4
So what's wrong with that ?
It does return the true size of the "variable" (really, the parameter to the function). The problem is that this is not of the type you think it is.
char string[], as a parameter to a function, is equivalent to char* string. You get a result of 4 because that is the size, on your system, of a char*.
Please read more here: http://c-faq.com/aryptr/index.html
It is the size of the char pointer, not the length of the string.
Use strlen from string.h to get the string length.
string is a pointer and its size is 4. You need strlen probably.
An array is converted to a pointer when it is passed as a function parameter in ANSI C.
Except when it is an operand of the sizeof or unary & operators, or is a string literal being used to initialize another array in a declaration, an array expression will have its type implicitly converted ("decay") from "N-element array of T" to "pointer to T" and its value will be the address of the first element in the array (n1256, 6.3.2.1/3).
The object string in main is a 12-element array of char. In the call to print in main, the type of the expression string is converted from char [12] to char *. Therefore, the print function receives a pointer value, not an array. In the context of a function parameter declaration, T a[] and T a[N] are both synonymous with T *; note that this is only true for function parameter declarations (this is one of C's bigger misfeatures IMO).
Thus, the print function is working with a pointer type, not an array type, so sizeof string returns the size of a char *, not the size of the array.
A string in C is stored in an array of characters. An array of char isn't necessarily NUL-terminated (although in your case it is). The function cannot tell from the pointer alone how large the array is; it is just given the address of the first character.
string is that pointer, and on your machine (a 32-bit machine) it takes 4 bytes to store a pointer. So sizeof(string) is 4.
You asked the system for sizeof(the address of the beginning of a character array). string is an object; to get information about its length you have to ask it through the correct OO interface.
In the case of std::string, the member function string.length() will return the number of characters stored by the string object.
http://www.java2s.com/Code/Cpp/Data-Type/StringSizeOf.htm
See the link above: it has the same output as yours, and it should help you find what you're doing wrong.

c string basics, why unassigned?

I am trying to learn the basics, I would think that declaring a char[] and assigning a string to it would work.
thanks
int size = 100;
char str[size];
str = "\x80\xbb\x00\xcd";
gives error "incompatible types in assignment". what's wrong?
thanks
You can use a string literal to initialize an array of char, but you can't assign to an array of char (any more than you can assign to any other array). OTOH, you can assign to a pointer, so the following would be allowed:
char *str;
str = "\x80\xbb\x00\xcd";
This is actually one of the most difficult parts of learning a programming language... str is an array, that is, a region of memory (size times the size of a char, so size bytes) that has been reserved and labeled str. str[0] is the first character, str[1] the second, and str[size-1] the last one. str itself, without specifying any element, is converted to a pointer to the memory zone that was created when you did
char str[size]
As Jerry so clearly said, in C you cannot assign to arrays that way. You need to copy from one array to the other, so you can do something like this
strncpy(str, "\x80\xbb\x00\xcd", size); /* Copy up to size characters */
str[size-1]='\0'; /* Make sure that the string is null terminated for small values of size */
Summarizing: It's very important to make a difference between pointers, memory areas and array.
Good luck - I am pretty sure that in less time than you imagine you will be mastering these concepts :)
A char array is implicitly converted to a char* when used as an rvalue, but not when used as an lvalue; that's why the assignment won't work.
You cannot assign array contents using the =operator. That's just a fact of the C language design. You can initialize an array in the declaration, such as
char str[size] = "\x80\xbb\x00\xcd";
but that's a different operation from an assignment (and it requires size to be a constant, because a variable-length array cannot be initialized this way). Note that in this case an extra '\0' will be added to the end of the string.
The "incompatible types" error comes from how array expressions are treated by the language. First of all, string literals are stored as arrays of char with static extent (meaning they exist over the lifetime of the program). So the type of the string literal "\x80\xbb\x00\xcd" is "5-element array of char" (four characters plus the terminating 0). However, in most circumstances, an expression of array type will implicitly be converted ("decay") from type "N-element array of T" to "pointer to T", and the value of the expression will be the address of the first element in the array. So, when you wrote the statement
str = "\x80\xbb\x00\xcd";
the type of the literal was implicitly converted from "5-element array of char" to "pointer to char", but the target of the assignment is type "100-element array of char", and the types are not compatible (above and beyond the fact that an array expression cannot be the target of the = operator).
To copy the contents of one array to another you would have to use a library function like memcpy, memmove, strcpy, etc. Also, for strcpy to function properly, the source string must be 0-terminated, and strcpy stops at the first 0 byte, so it would copy only the first two characters of this particular literal.
Edit: per R's comment, I've corrected the mistaken parts of my answer (the literal has 5 elements, not 4).
To assign a string literal to the str array you can use the string copy function strcpy.
char a[100] = "\x80\xbb\x00\xcd"; OR char a[] = "\x80\xbb\x00\xcd";
str is the name of an array. The name of an array is the address of the 0th element. Therefore, str is a pointer constant. You cannot change the value of a pointer constant, just like you cannot change a constant (you can't do 6 = 5, for example).
