C programming strchr returns pointer - c

#include <stdio.h>
#include <string.h>
int main ()
{
const char str[] = "http://www.tutorialspoint.com";
const char ch = '.';
char *ret;
ret = strchr(str, ch);
printf("String after |%c| is - |%s|\n", ch, ret);
return(0);
}
This code is copied from tutorialspoint.
From what I understand, ret is a pointer to a character. To use the value/ what the pointer is pointed to, I do *ret.
However, in this example, just by passing ret, printf() prints out .tutorialspoint.com. Why don't we use *ret to get .tutorialspoint.com since the string is the value in ret, which is accessed by *ret?

Have a look at the properties of the %s conversion specifier. When used in printf(), it expects an argument of type char *, and that is exactly what it gets. All is well.
To quote the C11 standard, chapter §7.21.6.1:
s If no l length modifier is present, the argument shall be a pointer to the initial
element of an array of character type. Characters from the array are
written up to (but not including) the terminating null character. [...]
In other words, to print out a string using %s, you have to supply the pointer to the start of the string, that is ret, not *ret.

To use the value/ what the pointer is pointed to, I do *ret.
Yes this is useful if code needs to use that value, a single character only. But code wants to print the contents of a string. A string is a sequence (or array) of characters up to and including the null character.
Why don't we use *ret to get ".tutorialspoint.com" since the string is the value in ret, which is accessed by *ret?
*ret resolves to a single character. printf("%s", ...) expects a pointer to a sequence or array of characters. By using printf("|%s|\n", ret);, printf() knows the address of the beginning of the string and can then print the initial character '.' and then subsequent ones 't', 'u', 't', ... until encountering a null character.
Better code would have checked that ret != NULL before attempting the printf():
ret = strchr(str, ch);
if (ret != NULL) {
printf("String after |%c| is - |%s|\n", ch, ret);
}
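To make the difference between ret and *ret concrete, here is a minimal sketch. The helper name describe_match() and its buffer parameters are assumptions for illustration and are not part of the original program. It formats *ret with %c and ret with %s, and refuses to format anything when strchr() finds no match.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: formats what strchr() found into out.
 * *ret is one char (use %c); ret is a pointer to the rest of
 * the string (use %s). Returns 0 on success, -1 if ch is absent. */
int describe_match(const char *str, int ch, char *out, size_t outsize)
{
    const char *ret = strchr(str, ch);
    if (ret == NULL) {
        return -1;              /* never pass NULL to %s */
    }
    snprintf(out, outsize, "char=%c rest=%s", *ret, ret);
    return 0;
}
```

Called with the URL from the question and '.', the buffer should hold the single character '.' followed by the whole tail of the string that starts at that character.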

"since the string is the value in ret, which is accessed by *ret"
No. A string in C is a sequence of chars ending with \0. So ret points to the first char of the many, and *ret is only that one char. There is no single string value as there is in, e.g., Java.
* is the dereference operator. *ret gives the '.', since you use * to access the pointed-to element. The %s format of printf takes a pointer to char (the pointer itself is written as just ret) and keeps printing until it encounters \0, the terminating character.
The function strchr from Linux man(3):
The strchr() function returns a pointer to the first occurrence of
the character c in the string s.
You could implement it manually with a while loop that simply increments the pointer. However, you would still need the %s specifier to print the result, or else a tedious loop with a function call on every char, which is inefficient.
Note: Every for loop can be rewritten as a while loop, so the kind of loop mentioned above is not fixed.
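For comparison, the "tedious loop" could look roughly like the sketch below. The function name print_chars() is made up for this illustration. It does by hand what %s does for you: start at the pointer, print one char at a time, and stop at the null terminator.

```c
#include <stdio.h>

/* Hypothetical helper: prints the string one char at a time,
 * which is roughly what printf("%s", p) does for you.
 * Returns the number of characters written. */
size_t print_chars(const char *p)
{
    size_t n = 0;
    while (*p != '\0') {   /* stop at the null terminator */
        putchar(*p);       /* *p is the current single char */
        p++;               /* move to the next char */
        n++;
    }
    return n;
}
```

Passing the pointer returned by strchr() to this function would print the same text as printf("%s", ret), provided the pointer is not NULL.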

Related

How to write my_strchr() in C

Right now I hope to write my own my_strchr() in the C language.
I checked that the answer should be like this:
char *my_strchr(const char *src1, int c) {
while (*src1 != '\0') { //when the string goes to the last, jump out
if (c == *src1) {
return src1;
}
src1++;
}
return NULL;
}
I'm quite confused by:
1. Why do we use *src1 in the while loop condition (*src1 != '\0')? Is *src1 a pointer to the const char*? Can we just use src1 instead?
2. When we return the value and do src1++, we do not write *src1. Why?
3. This function in fact prints the whole string after the required character, but why is that? Why does it not print only the character that we need to find?
1. src1 is the pointer to the character, but we need the character itself. It's the same reason as in point 2, but the other way round.
2. If you write return *src1; you simply return the character you've found, which is always c, so your function would be pointless. You want to return the pointer to that char.
3. Because that's what the function is supposed to do. It returns the pointer to the first character c found. So printing the string pointed to by that pointer prints the rest of the string.
It's important here to remember that in C a string is a series of characters that ends with a null ('\0') character. We reference the string in our code using a character pointer that points to the beginning of the string. When we pass a string as a parameter to a function what we're really getting is a pointer to the first character in the string.
Because of this fact, we can use pointer math to increment through a string. The pattern:
while (*src1 != '\0') {
//do stuff
src1++;
}
is a very common idiom in C. We might phrase it in English as:
While the value of the character in the string we are looking at (dereference src1 with the * operator) is not (inequality operator !=) the end of string indicator (null byte, 0 or '\0'), do some stuff, then move the pointer to point to the next character in the string (increment operator ++).
We often use the same kind of code structure to process other arrays or linked lists of things, comparing pointers to the NULL pointer.
To question #2, we're returning the value of the pointer src1 from this function and not the value of what it points to, *src1, because the question that this function should answer is "Where is the first instance of character c in the string that starts at the location pointed to by src1?"
Question #3 implies that the code that calls this function is printing a string that starts from the pointer returned from this function. My guess is that the code looks something like this:
printf("%s", my_strchr(string, 'a'));
printf() and friends will print from the location provided in the argument list that matches up with the %s format specifier and then keep printing until it gets to the end of string character ('\0', the null terminator).
In C, a string is basically an array of char. In most expressions an array is converted to a pointer to its first element, and a pointer holds the memory address of a value. Therefore:
You can use *src1 to get the value that it is pointing to.
src1++ means to +1 on the address, so you are basically moving where the pointer is pointing at.
Since you are returning a pointer to a char inside the string, the caller can treat it as the string (char array) that starts at that char.
In addition to Jabberwocky's answer, please note that the code in the question has 2 bugs:
c must be converted to char for the comparison with *src1: strchr("ABC", 'A' + 256) returns a pointer to the string literal unless char has more than 8 bits.
Furthermore, if c converted to a char has the value 0, a pointer to the null terminator should be returned, not NULL as in the posted code.
Here is a modified version:
char *my_strchr(const char *s, int c) {
for (;;) {
if ((char)c == *s) {
return (char *)s;
}
if (*s++ == '\0') {
return NULL;
}
}
}
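The two edge cases can be checked against the library function. The sketch below repeats the same loop, with the return written as (char *)s so that it compiles with the parameter name s, and adds a checker called agrees_with_strchr(), a name invented here for illustration.

```c
#include <string.h>

/* Same loop as above, returning the position through s. */
char *my_strchr(const char *s, int c)
{
    for (;;) {
        if ((char)c == *s) {
            return (char *)s;   /* cast away const, as strchr() does */
        }
        if (*s++ == '\0') {
            return NULL;        /* reached the end without a match */
        }
    }
}

/* Hypothetical check: 1 if my_strchr() and the library strchr()
 * return the same pointer for this input, 0 otherwise. */
int agrees_with_strchr(const char *s, int c)
{
    return my_strchr(s, c) == strchr(s, c);
}
```

Useful inputs to try are an ordinary character, a character that is absent, and c equal to '\0', for which both functions should return a pointer to the terminator rather than NULL.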

Difficult time understanding string pointers in C

I am looking to learn C and having a really hard time grasping the concepts of string pointers (and just pointers in general).
I have the following program:
#include<stdio.h>
#include<string.h>
int main()
{
// char test[6] = "hello";
char *test = "hello"; // test is a pointer to 'h'
int size_of_test_pt = sizeof(test); // this is the size of a pointer? which is 8 bytes.
int size_of_test = sizeof(*test); // this is the size of h, which is 1.
printf("\nSize of pointer to test: %d\n", size_of_test_pt); // prints 8
printf("\nSize of test: %d\n", size_of_test); // prints 1
printf("\nPrint %s\n", test); // why does this print 'hello', I thought test was a pointer?
printf("\nPrint %c\n", *test); // this is printing the first character of hello, I thought this would print hello.
printf("\nPrint %i\n", *test); // this prints 104...is this ASCII for h
return 0;
}
Everything makes sense until the last 3 print statements. If test is a pointer variable. Why does printf print out the word "hello" rather than an address?
For the printf("\nPrint %c\n", *test) call is it the right understanding that I am dereferencing test, which is an address and accessing the first element, then printing it to the screen?
The conversion specifier %s is designed to output strings. It expects as its argument a pointer expression of type char * or const char * that points to the first character of the string to be output.
In this case the function outputs all characters starting from the address pointed to by the supplied pointer until the terminating zero character '\0' is encountered.
If you want to output the value of such a pointer you need to write
printf("\nPrint %p\n", ( void * )test);
The conversion specifier %c is designed to output a single object of the type char.
Note that the pointer test points to the first character of the string literal "hello". So by dereferencing the pointer you get the first character of the string literal.
Here is a demonstration program.
#include <stdio.h>
int main( void )
{
char *test = "hello";
for ( ; *test != '\0'; ++test )
{
printf( "%c ", *test );
}
putchar( '\n' );
}
The program output is
h e l l o
The string literal is stored in memory as a character array containing the following elements
{ 'h', 'e', 'l', 'l', 'o', '\0' }
If test is a pointer variable. Why does printf print out the word "hello" rather than an address?
In short, it's because the %s expects its parameter to be of type char *.
For the printf("\nPrint %c\n", *test) call is it the right understanding that I am dereferencing test, which is an address and accessing the first element, then printing it to the screen?
Yes. The %c format specifier expects its argument to be a character, and that's exactly what you get when you dereference a pointer to char. Furthermore, it knows to print that value as a character. In the following line, the %i specifier expects a value of type int; since a char argument is promoted to int when it is passed to a variadic function such as printf, it accepts that and prints the numeric value of the character (h) that you provided.
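A small sketch of the %c versus %i point, assuming an ASCII system (where 'h' is 104, as in the question's output). The helper show_first_char() is a made-up name; it writes into a buffer instead of printing so the result is easy to inspect.

```c
#include <stdio.h>

/* Hypothetical helper: writes the first char of s both as a
 * character (%c) and as its numeric code (%i) into out.
 * *s has type char and is promoted to int when passed to snprintf(). */
void show_first_char(const char *s, char *out, size_t outsize)
{
    snprintf(out, outsize, "%c %i", *s, *s);
}
```

The same expression *s is passed twice; only the conversion specifier decides whether it is shown as a letter or as a number.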
In C a "string" is a sequence of characters, e.g. h, e, l, l, o. To indicate that the sequence is ending, the special value 0 is appended to the sequence.
If there were no such things as pointers, passing a string to a function would involve copying the entire sequence of characters. This is normally not practical, so instead we just pass on a reference (i.e. a pointer) telling where to find the first character of the string.
This is so commonly used that the pointer itself is sometimes referred to as a string even though it is really a pointer to a string. The string itself is the sequence of characters.
Think of a pointer like a bibliographic reference identifying a particular book. Including such a reference in a text is much more practical than inserting a copy of the entire book.

Why C function strlen() returns a wrong length of a char?

My C codes are listed below:
char s="MIDSH"[3];
printf("%d\n",strlen(&s));
The result of running is 2, which is wrong because char s is just an 'S'.
Does anybody know why and how to solve this problem?
That's actually quite an interesting question. Let's break it up:
"MIDSH"[3]
String literals have array types. So the above applies the subscript operator to the array and evaluates to the 4th character 'S'. It then assigns it to the single character variable s.
printf("%d\n",strlen(&s));
Since s is a single character, and not part of an actual string, the behavior is undefined for the above code.
Signature of strlen is:
size_t strlen(const char *s);
/* The strlen() function calculates the
length of the string s, excluding the
terminating null byte ('\0'). */
strlen expects its argument to point to a null-terminated array of char. When you pass the address of a single automatic variable, you can't guarantee this, and thus your program has undefined behavior.
Does anybody know why and how to solve this problem?
sizeof(char) is guaranteed to be 1. So use sizeof or 1.
The statement
printf("%d\n",strlen(&s));
makes no sense for the given case. strlen expects a null-terminated string; s is of type char, and &s does not necessarily point to a string. What you are getting is one possible result of the undefined behavior of the program.
To get the size of s you can use sizeof operator
printf("%zu\n", sizeof(s));
The strlen function treats its argument as a pointer to a sequence of characters, where the sequence is terminated by the '\0' character.
By passing a pointer to the single character variable s you effectively say that &s points to the first character in such a sequence, but it doesn't. That means strlen will continue to search in memory under false premises and you will have undefined behavior.
When you write
"char s = ..." you create a single char object on the stack. Its address can't meaningfully be incremented or decremented, because there is no array around it. So although you give strlen a char *, it can't reliably find a '\0' by advancing that address.
You should call strlen with the address of a char that belongs to a null-terminated array, like:
char* s = "MIDSH";
printf("%zu\n", strlen(s)); //print 5
s++;
printf("%zu\n", strlen(s)); //print 4
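If the goal really is to call strlen() on something built from the single character, one option is to put the character into a two-element array together with a terminator. This is only a sketch; the function name one_char_length() is invented for the example.

```c
#include <string.h>

/* Hypothetical helper: length of a string built from one char.
 * The two-element array holds the char plus the null terminator,
 * so strlen() has a well-defined place to stop. */
size_t one_char_length(char c)
{
    char s[2] = { c, '\0' };
    return strlen(s);
}
```

For the character taken from "MIDSH"[3] this should give 1, which is the answer the question expected.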

Char pointers and the printf function

I was trying to learn pointers and I wrote the following code to print the value of the pointer:
#include <stdio.h>
int main(void) {
char *p = "abc";
printf("%c",*p);
return 0;
}
The output is:
a
however, if I change the above code to:
#include <stdio.h>
int main(void) {
char *p = "abc";
printf(p);
return 0;
}
I get the output:
abc
I don't understand the following 2 things:
why did printf not require a format specifier in the second case? Is printf(pointer_name) enough to print the value of the pointer?
as per my understanding (which is very little), *p points to a contiguous block of memory that contains abc. I expected both outputs to be the same, i.e.
abc
are the different outputs because of the different ways of printing?
Edit 1
Additionally, the following code produces a runtime error. Why so?
#include <stdio.h>
int main(void) {
char *p = "abc";
printf(*p);
return 0;
}
For your first question, the printf function (and family) takes a string as first argument (i.e. a const char *). That string could contain format codes that the printf function will replace with the corresponding argument. The rest of the text is printed as-is, verbatim. And that's what is happening when you pass p as the first argument.
Do note that using printf this way is strongly discouraged, especially if the string contains input from a user. If the user adds formatting codes to the string and you don't provide the correct arguments, then you will have undefined behavior. It could even lead to security holes.
For your second question, the variable p points to some memory. The expression *p dereferences the pointer to give you a single character, namely the one that p is actually pointing to, which is p[0].
Think of p like this:
+---+ +-----+-----+-----+------+
| p | ---> | 'a' | 'b' | 'c' | '\0' |
+---+ +-----+-----+-----+------+
The variable p doesn't really point to a "string", it only points to some single location in memory, namely the first character in the string "abc". It's the functions using p that treat that memory as a sequence of characters.
Furthermore, string literals are actually stored as (read-only) arrays whose length is the number of characters in the string plus one for the string terminator.
Also, to help you understand why *p is the same as p[0], you need to know that for any pointer or array p and valid index i, the expression p[i] is equal to *(p + i). To get the first character, you have index 0, which means you have p[0], which then should be equal to *(p + 0). Adding zero to anything is a no-op, so *(p + 0) is the same as *(p), which is the same as *p. Therefore p[0] is equal to *p.
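The identity between p[i] and *(p + i) can be sketched as a tiny check. The function name index_equals_deref() is hypothetical, and the caller is assumed to pass an index that is valid for the array p points into.

```c
#include <stddef.h>

/* Hypothetical helper: 1 if p[i] and *(p + i) name the same object
 * and hold the same value; i must be a valid index into the array. */
int index_equals_deref(const char *p, size_t i)
{
    return &p[i] == p + i && p[i] == *(p + i);
}
```

With p pointing at "abc", the check should hold for indices 0 through 3, where index 3 is the terminating '\0'.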
Regarding your edit (where you do printf(*p)), since *p yields the value of the first "element" pointed to by p (i.e. p[0]), you are passing a single character where a pointer to the format string is expected. A compiler that accepts this converts the character's value to a pointer that points to whatever address has that numeric value (it doesn't convert the character to a pointer to the character). That address is almost certainly not valid (in ASCII, 'a' has the value 97, which is the address where the program will look for the string to print), and you will have undefined behavior.
p is the format string.
char *p = "abc";
printf(p);
is the same as
printf("abc");
Doing this is very bad practice because you don't know what the variable
will contain, and if it contains format specifiers, calling printf may do very bad things.
The reason why the first case (with "%c") only printed the first character
is that %c means a byte and *p means the (first) value which p is pointing at.
%s would print the entire string.
char *p = "abc";
printf(p); /* If p is untrusted, bad things will happen, otherwise the string p is written. */
printf("%c", *p); /* print the first byte in the string p */
printf("%s", p); /* print the string p */
why did printf not require a format specifier in the second case? Is printf(pointer_name) enough to print the value of the pointer?
With your code you told printf to use your string as the format string. Meaning your code turned equivalent to printf("abc").
as per my understanding (which is very little), *p points to a contiguous block of memory that contains abc. I expected both outputs to be the same
If you use %c you get a character printed, if you use %s you get a whole string. But if you tell printf to use the string as the format string, then it will do that too.
char *p = "abc";
printf(*p);
This code crashes because the value of *p, the character 'a', is not a pointer to a format string; it is not even a pointer. That code should not even compile without warnings.
You are misunderstanding, indeed when you do
char *p = "Hello";
p points to the starting address where literal "Hello" is stored. This is how you declare pointers. However, when afterwards, you do
*p
it means dereference p and obtain object where p points. In our above example this would yield 'H'. This should clarify your second question.
In case of printf just try
printf("Hello");
which is also fine; this answers your first question because it is effectively the same what you did when passed just p to printf.
Finally to your edit, indeed
printf(*p);
above line is not correct since printf expects const char * and by using *p you are passing it a char - or in other words 'H' assuming our example. Read more what dereferencing means.
why did printf not require a format specifier in the second case? Is printf(pointer_name) enough to print the value of the pointer?
"abc" is your format string. That's why it's printing "abc". If the string had contained %, then things would have behaved strangely, but it didn't.
printf("abc"); // Perfectly fine!
why did printf not require a format specifier in the second case? Is printf(pointer_name) enough to print the value of the pointer?
%c is the character conversion specifier. It instructs printf to print a single character, here *p, the first byte of the string. If you want it to print the whole string, use...
printf ("%s", p);
The %s seems redundant, but it can be useful for printing control characters or if you use width specifiers.
The best way to understand this really is to try and print the string abc%def using printf.
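As a sketch of that experiment, the helper below (format_literally() is a made-up name) passes the text as an argument to "%s" instead of using it as the format string. It writes into a buffer with snprintf() so the result can be inspected; the same reasoning applies to printf("%s", p).

```c
#include <stdio.h>

/* Hypothetical helper: copies text into out exactly as written.
 * Passing text as an argument to "%s" means any % inside it is
 * treated as ordinary data, not as a conversion specifier. */
int format_literally(const char *text, char *out, size_t outsize)
{
    return snprintf(out, outsize, "%s", text);
}
```

With the text abc%def the buffer should contain the seven characters unchanged, whereas using that text directly as the format string would make printf look for an int argument for %d that was never passed.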
The %c format specifier expects a char type, and will output a single char value.
The first parameter to printf must be a const char* (a char* can convert implicitly to a const char*) and points to the start of a string of characters. It stops printing when it encounters a \0 in that string. If there is not a \0 present in that string then the behaviour of that function is undefined. Because "abc" doesn't contain any format specifiers, you don't pass any additional arguments to printf in that case.

Passing string through a function (C programming)

I have just started learning pointers, and after much adding and removing *s my code for converting an entered string to uppercase finally works..
#include <stdio.h>
char* upper(char *word);
int main()
{
char word[100];
printf("Enter a string: ");
gets(word);
printf("\nThe uppercase equivalent is: %s\n",upper(word));
return 0;
}
char* upper(char *word)
{
int i;
for (i=0;i<strlen(word);i++) word[i]=(word[i]>96&&word[i]<123)?word[i]-32:word[i];
return word;
}
My question is, while calling the function I sent word which is a pointer itself, so in char* upper(char *word) why do I need to use *word?
Is it a pointer to a pointer? Also, is there a char* there because it returns a pointer to a character/string right?
Please clarify me regarding how this works.
That's because the type you need here simply is "pointer to char", which is denoted as char *, the asterisk (*) is part of the type specification of the parameter. It's not a "pointer to pointer to char", that would be written as char **
Some additional remarks:
It seems you're confusing the dereference operator * (used to access the place a pointer points to) with the asterisk as a pointer sign in type specifications; you're not using a dereference operator anywhere in your code; you're only using the asterisk as part of the type specification! See these examples: to declare a variable as a pointer to char, you'd write:
char * a;
To assign a value to the space where a is pointing to (by using the dereference operator), you'd write:
*a = 'c';
An array (of char) is not exactly equal to a pointer (to char). However, in most cases, an array (of char) can be converted to a (char) pointer.
Your function actually changes the outer char array (and passes back a pointer to it); not only will the uppercase of what was entered be printed by printf, but the variable word of the main function will also be modified so that it holds the uppercase of the entered word. Take good care that such a side effect is actually what you want. If you don't want the function to be able to modify the outside variable, you could write char* upper(char const *word), but then you'd have to change your function definition as well, so that it doesn't directly modify the word variable; otherwise the compiler will complain.
char upper(char c) would be a function that takes a character and returns a character. If you want to work with strings the convention is that strings are a sequence of characters terminated by a null character. You cannot pass the complete string to a function so you pass the pointer to the first character, therefore char *upper(char *s). A pointer to a pointer would have two * like in char **pp:
char *str = "my string";
char **ptr_to_ptr = &str;
char c = **ptr_to_ptr; // same as *str, same as str[0], 'm'
upper could also be implemented as void upper(char *str), but it is more convenient to have upper return the passed string. You made use of that in your sample when you printf the string that is returned by upper.
Just as a comment, you can optimize your upper function. You are calling strlen for every i. C strings are always null terminated, so you can replace your i < strlen(word) with word[i] != '\0' (or word[i] != 0). Also the code is better to read if you do not compare against 96 and 123 and subtract 32 but if you check against and calculate with 'a', 'z', 'A', 'Z' or whatever character you have in mind.
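Those two suggestions might be combined as in the sketch below. It keeps the name upper() from the question, walks the string with a pointer until the terminator instead of calling strlen() on every pass, and uses character constants instead of 96, 123 and 32. Like the original, it assumes a character set such as ASCII in which 'a' to 'z' are contiguous.

```c
/* Sketch of upper() without strlen() or magic numbers.
 * Modifies word in place and returns it. */
char *upper(char *word)
{
    for (char *p = word; *p != '\0'; p++) {
        if (*p >= 'a' && *p <= 'z') {
            /* shift a lowercase letter to its uppercase counterpart */
            *p = (char)(*p - 'a' + 'A');
        }
    }
    return word;
}
```

The standard toupper() from <ctype.h> would be another option for the conversion itself; the loop structure stays the same.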
The parameter word is a pointer, but the array word in main and the pointer word in the function refer to one and the same memory. When the argument is passed, only a copy of the pointer is handed over, not a copy of the entered word, so whatever operation the function performs is done on the original characters through that pointer. In the end we return a pointer, which is why the return type is written with a *.

Resources