Difficult time understanding string pointers in C

I am looking to learn C and having a really hard time grasping the concepts of string pointers (and just pointers in general).
I have the following program:
#include<stdio.h>
#include<string.h>
int main()
{
// char test[6] = "hello";
char *test = "hello"; // test is a pointer to 'h'
int size_of_test_pt = sizeof(test); // this is the size of a pointer? which is 8 bytes.
int size_of_test = sizeof(*test); // this is the size of h, which is 1.
printf("\nSize of pointer to test: %d\n", size_of_test_pt); // prints 8
printf("\nSize of test: %d\n", size_of_test); // prints 1
printf("\nPrint %s\n", test); // why does this print 'hello', I thought test was a pointer?
printf("\nPrint %c\n", *test); // this is printing the first character of hello, I thought this would print hello.
printf("\nPrint %i\n", *test); // this prints 104...is this ASCII for h
return 0;
}
Everything makes sense until the last 3 print statements. If test is a pointer variable, why does printf print out the word "hello" rather than an address?
For the printf("\nPrint %c\n", *test) call is it the right understanding that I am dereferencing test, which is an address and accessing the first element, then printing it to the screen?

The conversion specifier %s is designed to output strings. It expects as its argument a pointer expression of type char * or const char * that points to the first character of the string to be output.
In this case the function outputs all characters starting from the address pointed to by the supplied pointer until the terminating zero character '\0' is encountered.
If you want to output the value of such a pointer you need to write
printf("\nPrint %p\n", ( void * )test);
The conversion specifier %c is designed to output a single object of the type char.
Note that the pointer test points to the first character of the string literal "hello". So dereferencing the pointer gives you the first character of the string literal.
Here is a demonstration program.
#include <stdio.h>
int main( void )
{
char *test = "hello";
for ( ; *test != '\0'; ++test )
{
printf( "%c ", *test );
}
putchar( '\n' );
}
The program output is
h e l l o
The string literal is stored in memory as a character array containing the following elements
{ 'h', 'e', 'l', 'l', 'o', '\0' }

If test is a pointer variable, why does printf print out the word "hello" rather than an address?
In short, it's because the %s expects its parameter to be of type char *.
For the printf("\nPrint %c\n", *test) call is it the right understanding that I am dereferencing test, which is an address and accessing the first element, then printing it to the screen?
Yes. The %c format specifier expects its parameter to be a char, and that's exactly what you get when you dereference a pointer to char. Furthermore, it knows to print that value as a character. In the following line, the %i specifier expects a value of type int. A char argument passed to a variadic function such as printf is automatically promoted to int, so printf receives the numeric value of the character (h) that you provided and prints it as a number.
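When a char is passed to printf it is promoted to int, which is why the same value can be printed either as a character or as a number. A minimal sketch follows; the helper names char_as_int and show_both are made up for illustration, and the numeric values assume an ASCII-compatible character set.

```c
#include <stdio.h>

/* Hypothetical helper: returns the numeric code of the first character.
   Dereferencing a char * yields a char; converting it to int mirrors the
   promotion that happens when a char is passed to printf. */
int char_as_int(const char *s)
{
    return (int)*s;
}

/* Prints the first character twice: once with %c, once with %d. */
void show_both(const char *s)
{
    printf("%c %d\n", *s, char_as_int(s));
}
```

Calling show_both("hello") from main would print the letter h followed by its code, which is 104 on ASCII systems.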

In C a "string" is a sequence of characters, e.g. h, e, l, l, o. To indicate that the sequence is ending, the special value 0 is appended to the sequence.
If there were no such things as pointers, passing a string to a function would involve copying the entire sequence of characters. This is normally not practical, so instead we just pass on a reference (i.e. a pointer) telling where to find the first character of the string.
This is so commonly used that the pointer itself is sometimes referred to as a string even though it is really a pointer to a string. The string itself is the sequence of characters.
Think of a pointer like a bibliographic reference identifying a particular book. Including such a reference in a text is much more practical than inserting a copy of the entire book.
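The idea that a pointer only refers to the characters, rather than containing them, can be sketched like this. The function names count_chars and refers_to_same_string are hypothetical, chosen only for this example.

```c
#include <stddef.h>

/* Hypothetical helper: walks from the first character to the terminating
   '\0' and counts the characters in between (the job strlen does). */
size_t count_chars(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0') {
        ++n;
    }
    return n;
}

/* Returns 1 when both pointers hold the same address, that is, when they
   refer to the very same stored characters. */
int refers_to_same_string(const char *a, const char *b)
{
    return a == b;
}
```

Given const char *text = "hello"; and const char *alias = text;, count_chars(text) is 5 and refers_to_same_string(text, alias) is 1. Assigning the pointer copied only the address, not the five characters.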

Related

Why *p returning an integer for (char*p = "hello world")

Why *pointer is an integer but not the string content "Hello from pointer" Thanks!
int main(void) {
char *pointer;
pointer = "Hello from pointer";
printf("*pointer is %d\n", *pointer);
printf("\n");
}
the output is *pointer is 72
It's because the ASCII code for 'H' (which is the first element of the array) is 72. It is completely normal.
Here is the ASCII Code table
pointer = "Hello from pointer"; is pointing to the first letter of this string literal which is H and ASCII Value of 'H' is 72, that's why the output is 72.
In C, a string literal is an array of characters, and in most expressions it is converted to a pointer to its first character. Therefore this statement is valid:
char *pointer = "Hello...";
This statement declares pointer as a pointer to char and makes it point to the first character of the string literal "Hello..."
That's why printf("%d", *pointer); outputs 72. pointer is pointing to the first character of that string literal, which is 'H', and because of the %d conversion specifier in the printf() call, it prints the ASCII value of 'H', which is 72.
Here printf("*pointer is %d\n", *pointer); in this line you've used %d format specifier, not %s to print out the string pointer.
Again you should not de-reference the pointer variable when you print string from the string pointer. Try to search and find about String array vs String pointer.
So, the line should be printf("*pointer is %s\n", pointer);
More explanation:
char *pointer = "Hello from pointer";
After compiling this line, "Hello from pointer" will be stored in the memory. And like array variable, the pointer variable will hold the base address of this character array. So, the variable pointer will hold the address of H here.
Thus when you de-reference the pointer variable, it will show the value H. As the format specifier you used is %d, it's printing the integer value (ASCII Value) of the character H (72).
This is not caused by a buffer overflow; the output is the expected behavior.
The direct answer is because 72 is the integer representation of the character "H" from "Hello from pointer".
Basically, there's two things going on.
1) That variable only expects you to give it one char
2) A pointer only points to a single part in memory, so since strings are treated sort of like "arrays" (per se), you're getting only the first value in the "array"
So if you were to print pointer[1] (equivalently *(pointer + 1)) with %c you would get e. So the idea is to know the length of your string so you can safely determine how much space you need.
But remember that a fixed-size buffer, such as a char array on the stack, only has the space you gave it, so copying a longer string into it overflows it.
If you don't know the length of your string and/or your string is very long, you can dynamically allocate it with malloc() and free().
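A minimal sketch of such a dynamic allocation is shown here. The helper name copy_string is an assumption made for this example; it is not a standard library function.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap-allocated copy of src, or NULL if
   the allocation fails. The caller must release the copy with free(). */
char *copy_string(const char *src)
{
    size_t len = strlen(src);      /* characters before the '\0' */
    char *dst = malloc(len + 1);   /* one extra byte for the '\0' */
    if (dst == NULL) {
        return NULL;
    }
    memcpy(dst, src, len + 1);     /* copies the terminator as well */
    return dst;
}
```

After char *s = copy_string("Hello from pointer");, s[0] is 'H' and s[1] is 'e'. Call free(s) once the copy is no longer needed.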

Pointer to a string and the difference between assignment and dereferencing

In the following code:
#include <stdio.h>
int main(void) {
char* message = "Hello C Programmer!";
printf("%s", message);
return 0;
}
I don't fully understand why it wasn't necessary to prepend an '*' to message in the printf call. I was under the assumption that message, since it is a pointer to a char, the first letter in the double quoted string, would display the address of the 'H'.
The %s format operator requires its corresponding argument to be a char * pointer. It prints the entire string that begins at that address. A string is a sequence of characters ending with a null byte. That's why this prints the entire message.
If you supply an array as the corresponding argument, it's automatically converted to a pointer to the first character of the array. In general, whenever an array is used as an r-value, it undergoes this conversion.
You don't need to use the * operator because the argument is supposed to be a pointer. If you used *message you would only pass the 'H' character to printf(). You would do this if you were using the %c format instead of %s -- its corresponding argument should be a char.
You're right that message is a pointer, of type "pointer to char", or char *. So if you were trying to print a character (an individual character), you would certainly have needed a *:
printf("first character: %c\n", *message);
In a printf format specifier, %c expects a character, which is what *message gives you.
But you weren't trying to print a single character, you were trying to print the whole string. And in a printf format specifier, %s expects a pointer to a character, the first of usually several characters to print. So
printf("entire string: %s\n", message);
is correct.
The conversion specifier %s is used to output strings (sequences of characters terminated with the zero character '\0'). A pointer to the first character of the string is passed to printf as the argument.
You can imagine that the function internally executes the following loop
for ( ; *message != '\0'; ++message )
{
printf( "%c", *message );
}
If you supply the expression *message instead, it has type char, and printf will try to interpret its value, the character 'H', as a pointer. As a result the function call has undefined behavior.
To output the value of the pointer message you should use the conversion specifier %p, which expects a void * argument, like
printf( "%p", ( void * )message );
Or print it as an integer value, for example
#include <stdio.h>
#include <inttypes.h>
int main(void)
{
char *message = "Hello C Programmer!";
printf( "The value of the pointer message is %" PRIiPTR "\n", ( intptr_t )message );
return 0;
}
In addition, a pointer must be used here.
If you passed a character, it would be passed by value, and printf would lose the original address of the character. That means it would no longer be possible to access the characters that come after the first.
You need to pass a pointer by value to ensure you're retaining the original address of the head of the string.

Char pointers and the printf function

I was trying to learn pointers and I wrote the following code to print the value of the pointer:
#include <stdio.h>
int main(void) {
char *p = "abc";
printf("%c",*p);
return 0;
}
The output is:
a
however, if I change the above code to:
#include <stdio.h>
int main(void) {
char *p = "abc";
printf(p);
return 0;
}
I get the output:
abc
I don't understand the following 2 things:
why did printf not require a format specifier in the second case? Is printf(pointer_name) enough to print the value of the pointer?
as per my understanding (which is very little), *p points to a contiguous block of memory that contains abc. I expected both outputs to be the same, i.e.
abc
are the different outputs because of the different ways of printing?
Edit 1
Additionally, the following code produces a runtime error. Why so?
#include <stdio.h>
int main(void) {
char *p = "abc";
printf(*p);
return 0;
}
For your first question, the printf function (and family) takes a string as first argument (i.e. a const char *). That string could contain format codes that the printf function will replace with the corresponding argument. The rest of the text is printed as-is, verbatim. And that's what is happening when you pass p as the first argument.
Do note that using printf this way is strongly discouraged, especially if the string contains input from a user. If the user adds formatting codes to the string and you don't provide the matching arguments, you will have undefined behavior. It could even lead to security holes.
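One common way to avoid the problem is to keep the format string a fixed literal and pass the untrusted text only as data for %s. The sketch below writes into a buffer with snprintf so the result can be inspected; the helper name format_untrusted is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: writes untrusted text into out without letting it
   act as a format string. Returns the value reported by snprintf. */
int format_untrusted(char *out, size_t size, const char *untrusted)
{
    /* The format is a fixed literal; untrusted is only data for %s. */
    return snprintf(out, size, "%s", untrusted);
}
```

Passing the text "100%d" leaves "100%d" unchanged in out, whereas printf("100%d") would try to read an int argument that was never supplied.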
For your second question, the variable p points to some memory. The expression *p dereferences the pointer to give you a single character, namely the one that p is actually pointing to, which is p[0].
Think of p like this:
+---+ +-----+-----+-----+------+
| p | ---> | 'a' | 'b' | 'c' | '\0' |
+---+ +-----+-----+-----+------+
The variable p doesn't really point to a "string", it only points to some single location in memory, namely the first character in the string "abc". It's the functions using p that treat that memory as a sequence of characters.
Furthermore, string literals are actually stored as (read-only) arrays whose length is the number of characters in the string plus one for the string terminator.
Also, to help you understand why *p is the same as p[0], you need to know that for any pointer or array p and valid index i, the expression p[i] is equal to *(p + i). To get the first character, you use index 0, which means you have p[0], which is equal to *(p + 0). Adding zero to anything is a no-op, so *(p + 0) is the same as *(p), which is the same as *p. Therefore p[0] is equal to *p.
Regarding your edit (where you do printf(*p)): since *p yields the value of the first element pointed to by p (i.e. p[0]), you are passing a single character where a pointer to the format string is expected. The compiler, usually after a warning, converts that character value to a pointer holding the same numeric value; it does not convert the character to a pointer to the character. That is not a valid address (in ASCII 'a' has the value 97, so the program looks for the format string at address 97), and you will have undefined behavior.
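The equivalence of p[i] and *(p + i) can be checked with a tiny sketch. The helper name char_at is made up for this example.

```c
#include <stddef.h>

/* Hypothetical helper: reads the character at index i using pointer
   arithmetic instead of the subscript operator. */
char char_at(const char *p, size_t i)
{
    return *(p + i);   /* by definition the same as p[i] */
}
```

With const char *p = "abc";, char_at(p, 0) equals *p, char_at(p, 2) is 'c', and char_at(p, 3) is the terminating '\0'.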
p is the format string.
char *p = "abc";
printf(p);
is the same as
printf("abc");
Doing this is very bad practice because you don't know what the variable
will contain, and if it contains format specifiers, calling printf may do very bad things.
The reason why the first case (with "%c") only printed the first character
is that %c means a byte and *p means the (first) value which p is pointing at.
%s would print the entire string.
char *p = "abc";
printf(p); /* If p is untrusted, bad things will happen, otherwise the string p is written. */
printf("%c", *p); /* print the first byte in the string p */
printf("%s", p); /* print the string p */
why did printf not require a format specifier in the second case? Is printf(pointer_name) enough to print the value of the pointer?
With your code you told printf to use your string as the format string. This means your code became equivalent to printf("abc").
as per my understanding (which is very little), *p points to a contiguous block of memory that contains abc. I expected both outputs to be the same
If you use %c you get a character printed, if you use %s you get a whole string. But if you tell printf to use the string as the format string, then it will do that too.
char *p = "abc";
printf(*p);
This code crashes because *p, the character 'a', is not a pointer to a format string; it is not a pointer at all. That code should not even compile without warnings.
You are misunderstanding, indeed when you do
char *p = "Hello";
p points to the starting address where literal "Hello" is stored. This is how you declare pointers. However, when afterwards, you do
*p
it means dereference p and obtain object where p points. In our above example this would yield 'H'. This should clarify your second question.
In case of printf just try
printf("Hello");
which is also fine; this answers your first question because it is effectively the same as what you did when you passed just p to printf.
Finally to your edit, indeed
printf(*p);
above line is not correct since printf expects const char * and by using *p you are passing it a char - or in other words 'H' assuming our example. Read more what dereferencing means.
why did printf not require a format specifier in the second case? Is printf(pointer_name) enough to print the value of the pointer?
"abc" is your format specifier. That's why it's printing "abc". If the string had contained %, then things would have behaved strangely, but they didn't.
printf("abc"); // Perfectly fine!
why did printf not require a format specifier in the second case? Is printf(pointer_name) enough to print the value of the pointer?
%c is the character conversion specifier. It instructs printf to only print the first byte. If you want it to print the string, use...
printf ("%s", p);
The %s seems redundant, but it can be useful for printing control characters or if you use width specifiers.
The best way to understand this really is to try and print the string abc%def using printf.
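The remark about width specifiers can be sketched with snprintf, so the formatted text lands in a buffer that can be inspected. The helper name format_field and its parameters are assumptions made for this example.

```c
#include <stdio.h>

/* Hypothetical helper: formats s right-aligned in a field of at least
   width characters, printing at most max_chars characters of s. */
int format_field(char *out, size_t size, const char *s, int width, int max_chars)
{
    /* Each '*' takes its value from the next int argument:
       first the field width, then the precision. */
    return snprintf(out, size, "%*.*s", width, max_chars, s);
}
```

For the string "abc", a width of 5 with a precision of 3 gives two spaces followed by abc, and a width of 0 with a precision of 2 gives ab. None of this is possible when the string itself is used as the format.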
The %c format specifier expects a char type, and will output a single char value.
The first parameter to printf must be a const char* (a char* can convert implicitly to a const char*) and points to the start of a string of characters. It stops printing when it encounters a \0 in that string. If there is not a \0 present in that string then the behaviour of that function is undefined. Because "abc" doesn't contain any format specifiers, you don't pass any additional arguments to printf in that case.

C programming strchr returns pointer

#include <stdio.h>
#include <string.h>
int main ()
{
const char str[] = "http://www.tutorialspoint.com";
const char ch = '.';
char *ret;
ret = strchr(str, ch);
printf("String after |%c| is - |%s|\n", ch, ret);
return(0);
}
This code is copied from tutorialspoint.
From what I understand, ret is a pointer to a character. To use the value/ what the pointer is pointed to, I do *ret.
However, in this example, just by passing ret, printf() prints out .tutorialspoint.com. Why don't we use *ret to get .tutorialspoint.com since the string is the value in ret, which is accessed by *ret?
Have a look at the properties of the %s conversion specifier. When used in printf(), it expects an argument of type char *, and that is what it gets. All is well.
To quote the C11 standard, chapter §7.21.6.1
s If no l length modifier is present, the argument shall be a pointer to the initial
element of an array of character type. Characters from the array are
written up to (but not including) the terminating null character. [...]
In other words, to print out a string using %s, you have to supply the pointer to the start of the string, that is ret, not *ret.
To use the value/ what the pointer is pointed to, I do *ret.
Yes this is useful if code needs to use that value, a single character only. But code wants to print the contents of a string. A string is a sequence (or array) of characters up to and including the null character.
Why don't we use *ret to get ".tutorialspoint.com" since the string is the value in ret, which is accessed by *ret?
*ret resolves to a single character. printf("%s", ... expects a pointer to a sequence or array of characters. By using printf("|%s|\n", ret);, printf() knows the address of the beginning of the string and can then print the initial character '.' and then subsequent ones 't', 'u', 't', ... until encountering a null character.
Better code would have checked the ret != NULL before attempting the printf()
ret = strchr(str, ch);
if (ret != NULL) {
printf("String after |%c| is - |%s|\n", ch, ret);
}
"since the string is the value in ret, which is accessed by *ret"
No. A string in C is a number of chars ending with \0. So ret points to the first char of the many, and *ret is just that first char. There is no unitary value of a String as e.g. in Java.
* is the dereference operator. *ret will give the . since you use * to access the pointed-to element. The %s format of printf takes a pointer to chars (the pointer is denoted by the name ret alone). It goes on until it encounters \0, the terminating character.
The function strchr from Linux man(3):
The strchr() function returns a pointer to the first occurrence of
the character c in the string s.
You could implement it manually with a while loop and simply incrementing the pointer. However, you would still need the %s-specifier to print it. Or the tedious for loop and a function call on every char. Inefficient.
Note: Every for loop can be rewritten as a while loop, so the kinds of loops mentioned above are interchangeable.
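The manual version with a while loop could look roughly like this. The name find_char is hypothetical, and the sketch is simplified: unlike the real strchr, it does not let you search for the terminating '\0' itself.

```c
#include <stddef.h>

/* Hypothetical re-implementation for illustration: advances the pointer
   until it finds c or reaches the terminating '\0'. */
const char *find_char(const char *s, char c)
{
    while (*s != '\0') {
        if (*s == c) {
            return s;   /* pointer to the first match */
        }
        ++s;
    }
    return NULL;        /* c does not occur in the string */
}
```

The returned pointer can be handed to %s directly, as in printf("|%s|\n", ret);, after checking that it is not NULL.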

Can a pointer to a string be used in a printf?

I am thinking of something like:
#include <stdio.h>
#include <conio.h>
#include <stdlib.h>
int main(void) {
//test pointer to string
char s[50];
char *ptr=s;
printf("\nEnter string (s): ");
fgets(s, 50, stdin);
printf("S: %s\nPTR: %s\n", s, *ptr);
system("PAUSE");
return 0;
}
Or should I use a for loop with *(s+i) and the format specifier %c?
Is that the only possible way to print a string through a pointer and a simple printf?
Update: printf operates on the address of the first element of the array, so when I use *ptr I actually pass the first element and not its address. Thanks.
The "%s" format specifier for printf always expects a char* argument.
Given:
char s[] = "hello";
char *p = "world";
printf("%s, %s\n", s, p);
it looks like you're passing an array for the first %s and a pointer for the second, but in fact you're (correctly) passing pointers for both.
In C, any expression of array type is implicitly converted to a pointer to the array's first element unless it's in one of the following three contexts:
It's an argument to the unary "&" (address-of) operator
It's an argument to the unary "sizeof" operator
It's a string literal in an initializer used to initialize an array object.
(I think C++ has one or two other exceptions.)
The implementation of printf() sees the "%s", assumes that the corresponding argument is a pointer to char, and uses that pointer to traverse the string and print it.
Section 6 of the comp.lang.c FAQ has an excellent discussion of this.
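The sizeof exception from the list above is easy to observe. In this sketch the function names size_of_array and size_of_converted are made up for illustration.

```c
#include <stddef.h>

/* sizeof applied to the array itself: no conversion happens, so the
   result is the full array size, including the '\0'. */
size_t size_of_array(void)
{
    char s[] = "hello";
    return sizeof s;
}

/* Assigning the array to a pointer converts it to &s[0]; sizeof then
   reports the size of the pointer, not of the array. */
size_t size_of_converted(void)
{
    char s[] = "hello";
    const char *p = s;
    return sizeof p;
}
```

size_of_array() returns 6, while size_of_converted() returns sizeof(char *), whatever that is on the platform (commonly 8 on 64-bit systems).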
printf("%s\n", ptr);
Is this what you want?
By the way, from printf(3), here's the documentation for the s conversion specifier (i.e %s):
If no l modifier is present: The const char * argument is expected to
be a pointer to an array of character type (pointer to a string).
Characters from the array are written up to (but not including) a
terminating null byte ('\0'); if a precision is specified, no more
than the number specified are written. If a precision is given, no
null byte need be present; if the precision is not specified, or is
greater than the size of the array, the array must contain a
terminating null byte.
You should write printf("S: %s\nPTR: %s\n", s, ptr); instead of printf("S: %s\nPTR: %s\n", s, *ptr);
The difference between ptr and *ptr is this: ptr gives you the address in memory of the variable you are pointing to, while *ptr gives the value of the pointed-to variable. In this case *ptr is the same as ptr[0].
this code will show what i mean:
printf("\tS: %s\n\tPTR: %s\n\tAddress of the pointed Value: %p\n\tValue of the whole String: %s\n\tValue of the first character of the String: %c\n", s, ptr, (void *)ptr, ptr, *ptr);
In my experience you should get segmentation fault when you try to use %s directive with *p.

Resources