Passing string through a function (C programming) - c

I have just started learning pointers, and after much adding and removing of *s, my code for converting an entered string to uppercase finally works.
#include <stdio.h>
#include <string.h>
char* upper(char *word);
int main()
{
char word[100];
printf("Enter a string: ");
gets(word);
printf("\nThe uppercase equivalent is: %s\n",upper(word));
return 0;
}
char* upper(char *word)
{
int i;
for (i=0;i<strlen(word);i++) word[i]=(word[i]>96&&word[i]<123)?word[i]-32:word[i];
return word;
}
My question is, while calling the function I sent word which is a pointer itself, so in char* upper(char *word) why do I need to use *word?
Is it a pointer to a pointer? Also, is there a char* there because it returns a pointer to a character/string right?
Please clarify me regarding how this works.

That's because the type you need here is simply "pointer to char", which is written char *. The asterisk (*) is part of the type specification of the parameter. It's not a "pointer to pointer to char"; that would be written char **.
Some additional remarks:
It seems you're confusing the dereference operator * (used to access the place a pointer points to) with the asterisk as a pointer sign in type specifications. You're not using the dereference operator anywhere in your code; you're only using the asterisk as part of the type specification. See these examples: to declare a variable as a pointer to char, you'd write:
char * a;
To assign a value to the space where a is pointing to (by using the dereference operator), you'd write:
*a = 'c';
An array (of char) is not exactly equal to a pointer (to char) (see also the question here). However, in most cases, an array (of char) can be converted to a (char) pointer.
Your function actually changes the outer char array (and passes back a pointer to it): not only will the uppercase version of the input be printed by printf, but the variable word in main will also be modified so that it holds the uppercase version of the entered word. Make sure that such a side effect is actually what you want. If you don't want the function to be able to modify the outside variable, you could write char* upper(char const *word), but then you'd have to change your function definition as well so that it doesn't directly modify the characters word points to; otherwise the compiler will complain.

char upper(char c) would be a function that takes a character and returns a character. If you want to work with strings the convention is that strings are a sequence of characters terminated by a null character. You cannot pass the complete string to a function so you pass the pointer to the first character, therefore char *upper(char *s). A pointer to a pointer would have two * like in char **pp:
char *str = "my string";
char **ptr_to_ptr = &str;
char c = **ptr_to_ptr; // same as *str, same as str[0], 'm'
upper could also be implemented as void upper(char *str), but it is more convenient to have upper return the passed string. You made use of that in your sample when you printf the string that is returned by upper.
Just as a comment, you can optimize your upper function. You are calling strlen for every i. C strings are always null terminated, so you can replace your i < strlen(word) with word[i] != '\0' (or word[i] != 0). Also the code is better to read if you do not compare against 96 and 123 and subtract 32 but if you check against and calculate with 'a', 'z', 'A', 'Z' or whatever character you have in mind.

Although word in the function is a pointer, the array word in main and the pointer word in the function refer to one and the same memory. When the argument is passed, only a copy of the address is passed (not a copy of the characters), so every operation done through the pointer word acts on the original array. In the end the function returns that pointer, which is why the return type is written as char *.

Related

c function definition calls for pointer but example does not use pointers

I am relatively new to low-level programming such as C. I am reviewing the strstr() function here. From the function definition char *strstr(const char *str1, const char *str2); I understand that the function will return a pointer or NULL depending on whether str2 was found in str1.
What I can't understand, though, is this: if the function requires the two inputs to be pointers, why does the example not use pointers?
#include <stdio.h>
#include <string.h>
int main ()
{
char string[55] ="This is a test string for testing";
char *p;
p = strstr (string,"test");
if(p)
{
printf("string found\n" );
printf ("First occurrence of string \"test\" in \"%s\" is"\
" \"%s\"",string, p);
}
else printf("string not found\n" );
return 0;
}
In strstr(string,"test");, string is an array of 55 char. What strstr needs here is a pointer to the first element of string, which we can write as &string[0]. However, as a convenience, C automatically converts string to a pointer to its first element. So the desired value is passed to strstr, and it is a pointer.
This automatic conversion happens whenever an array is used in an expression and is not the operand of sizeof, is not the operand of unary &, and is not a string literal used to initialize an array.
"test" is a string literal. It causes the creation of an array of char initialized with the characters in the string, followed by a terminating null character. The string literal in source code represents that array. Since it is an array used in an expression, it too is converted to a pointer to its first element. So, again, the desired pointer is passed to strstr.
You could instead write &"test"[0], but that would confuse people who are not used to it.

Explanation of integer and register constant character in function

char *firstmatch(char *s1, char *s2) {
char *temp;
temp = s1;
do {
if (strchr(s2, *temp) != 0)
return temp;
temp++;
} while (*temp != 0);
return 0;
}
char *strchr(register const char *s, int c) {
do {
if (*s == c) {
return (char*)s;
}
} while (*s++);
return (0);
}
I am new to programming and I have been given this code, which finds the first character in a string s1 that is also in string s2. The task is to understand the C code and convert it into assembly code. Right now my focus is just to understand what the C code is doing, and I am having difficulty with pointers. I can work through the firstmatch() function, but I am confused by the char *strchr() function. I am unable to understand what the point of int c is in relation to a constant character pointer. I'd appreciate it if somebody could help explain it.
The function strchr() in your code sample is an incomplete implementation of the Standard C library function that locates the first occurrence of a character in a C string, if any.
The argument has type int for historical reasons: in early versions of the language, function arguments were typed only if the implicit type int did not suffice. Character arguments were passed as int values, so typing the argument differently was unnecessary.
The register keyword is obsolete: early C compilers were not as advanced as current ones and the programmer could help code generators determine which variables to store in CPU registers by adorning their definitions with the register keyword. Modern compilers are more efficient and usually beat programmers at this game, hence this keyword is mostly ignored nowadays.
Note however that this implementation behaves differently from the Standard function: the value of c must be converted to char before the comparison. As noted by chux, all functions in <string.h> treat bytes in C strings and memory blocks as unsigned chars for comparison purposes.
Here is a more readable version with the correct behavior:
#include <string.h>
char *strchr(const char *str, int c) {
const unsigned char *s = (const unsigned char *)str;
do {
if (*s == (unsigned char)c) {
return (char *)s;
}
} while (*s++ != '\0');
return NULL;
}
The int c argument might as well be char c. The type of *temp is char.
The strchr function takes a pointer into a nul terminated string and a char and either returns the pointer to the next occurrence of the char or null if it reached the nul at the end of the string.
strchr() receives a pointer to (think, memory address of) the first (or the only) character in a sequence.
The function extracts a character from memory using that pointer s and sees if its value matches the value of c. If there's a match, it returns the pointer.
If there's no match, it advances the pointer to the next character in the sequence (that is, increments the memory address by 1) and repeats.
If there's no match and the value of the character from memory is 0, NULL is returned.
The pointer being to a const char implies that memory isn't going to be written to, but may be read from. Indeed, the function never tries to write using the pointer.
So, you read chars from memory and compare them to an int. In most expressions chars implicitly convert to signed int (if such a conversion is generally possible without loss of any value of type char) or unsigned int (otherwise). See integer promotions on this. If after this both sides of the == operator are signed ints, everything is trivial, just compare those. If one is unsigned int (the promoted *s character) while the other one is signed int (c), the signed one is converted to unsigned (see the same linked article for the logic/rules), after which both sides of == have the same type (unsigned int) and are comparable (this is one of the key ideas of C, most binary operators convert their inputs to a common type and produce the result of that common type).
Simply put, in C you can compare different arithmetic types and the compiler will insert necessary (per the language rules) conversions. That said, not all conversions preserve value (e.g. a conversion from signed int to unsigned int doesn't preserve negative values, however they are converted in a well-defined manner) and that may be surprising (e.g. -1 > 1u evaluates to 1, which seems absurd to anyone knowing a bit of math), especially to the ones new to the language.
The real question here seems "Why isn't c defined as char?".
If one inspects the standard C library functions, they'll find that values of type char are (almost?) never passed or returned, although passing or returning pointers to char is quite common. Individual characters are typically passed by means of the int type. The reason for this is probably that, like mentioned above, char would convert to int or unsigned int in an expression anyway, so some additional conversions (back to char and then again to int) may be avoided.
The char *s1 represents a string in C.
The 0 represents the ASCII value of '\0', which terminates a string in C. Chars and integers are interchangeable, but you need to know the ASCII value of each char. The letter 'A' is equivalent to the integer 65 in ASCII. This should answer your question about int c: it doesn't make any behavioral difference for the code.
Now suppose you had the strings hello and meh; you would have:
char s1[] = {'h', 'e', 'l', 'l', 'o', '\0'};
char s2[] = {'m', 'e', 'h', '\0'};
So you call:
firstmatch(s1, s2)
temp is assigned the value of s1, so it points to the 'h' of "hello".
Now you call
strchr(s2, *temp)
*temp in this scenario is equivalent to temp[0], which is 'h'.
In strchr, it loops through each letter of "meh", starting from 'm'.
First iteration:
'm' == 'h' -> false therefore proceed to next letter (*s++)
Second iteration:
'e' == 'h' -> false therefore proceed to next letter (*s++)
Third iteration:
'h' == 'h' -> true, therefore strchr returns a pointer that is not null (0).
This returns us to the firstmatch function, inside the if condition.
Since the if condition passes, firstmatch returns temp, which points to the 'h' in "hello".
If strchr had found no match, temp would be incremented to the next letter in s1, which would be 'e', and the same procedure described above would be followed.
Finally, the (*temp != 0) means that if we encounter the '\0' at the end of the s1 string "hello" we defined above, the loop stops and the function returns 0, indicating that the strings have no letter in common.
Read about pointer arithmetic in C/C++ if you don't understand why *temp == temp[0]. Likewise, *(temp + n) == temp[n] for any index n within the string.

Using a string in CS50 library

Hi all, I have a question regarding passing a string to a function in C. I am using the CS50 library, and I know it passes a string as a char array (a char pointer to the start of the array), so passing is done by reference. My function receives an array as its argument and returns an array. When I change, for example, one of the elements of the array in the function, the change is reflected in the original string, as I expect. But if I assign a new string to the argument, the function returns another string and the original string is not changed. Can you explain the mechanics behind this behaviour?
#include <stdlib.h>
#include <cs50.h>
#include <stdio.h>
string test(string s);
int main(void)
{
string text = get_string("Text: ");
string new_text = test(text);
printf("newtext: %s\n %s\n", text, new_text);
printf("\n");
return 0;
}
string test(string s)
{
//s[0] = 'A';
s = "Bla";
return s;
}
First example reflects change in the first letter on both text and newtext strings, but second example prints out text unchanged and newtext as "Bla"
Thanks!
This is going to take a while.
Let's start with the basics. In C, a string is a sequence of character values including a 0-valued terminator. In other words, the string "hello" is represented as the sequence {'h', 'e', 'l', 'l', 'o', 0}. Strings are stored in arrays of char (or wchar_t for "wide" strings, which we won't talk about here). This includes string literals like "Bla": they're stored in arrays of char such that they are available over the lifetime of the program.
Under most circumstances, an expression of type "N-element array of T" will be converted ("decay") to an expression of type "pointer to T", so most of the time when we're dealing with strings we're actually dealing with expressions of type char *. However, this does not mean that an expression of type char * is a string - a char * may point to the first character of a string, or it may point to the first character in a sequence that isn't a string (no terminator), or it may point to a single character that isn't part of a larger sequence.
A char * may also point to the beginning of a dynamically allocated buffer that has been allocated by malloc, calloc, or realloc.
Another thing to note is that the [] subscript operator is defined in terms of pointer arithmetic - the expression a[i] is defined as *(a + i) - given an address value a (converted from an array type as described above), offset i elements (not bytes) from that address and dereference the result.
Another important thing to note is that the = is not defined to copy the contents of one array to another. In fact, an array expression cannot be the target of an = operator.
The CS50 string type is actually a typedef (alias) for the type char *. The get_string() function performs a lot of magic behind the scenes to dynamically allocate and manage the memory for the string contents, and makes string processing in C look much higher level than it really is. I and several other people consider this a bad way to teach C, at least with respect to strings. Don't get me wrong, it's an extremely useful utility, it's just that once you don't have cs50.h available and have to start doing your own string processing, you're going to be at sea for a while.
So, what does all that nonsense have to do with your code? Specifically, the line
s = "Bla";
What's happening is that instead of copying the contents of the string literal "Bla" to the memory that s points to, the address of the string literal is being written to s, overwriting the previous pointer value. You cannot use the = operator to copy the contents of one string to another; instead, you'll have to use a library function like strcpy:
strcpy( s, "Bla" );
The reason s[0] = 'A' worked as you expected is that the subscript operator [] is defined in terms of pointer arithmetic. The expression a[i] is evaluated as *(a + i): given an address a (either a pointer, or an array expression that has "decayed" to a pointer as described above), offset i elements (not bytes!) from that address and dereference the result. So s[0] designates the first element of the string you read in.
This is difficult to answer correctly without a code example. I will make one but it might not match what you are doing.
Let's take this C function:
char* edit_string(char *s) {
if(s) {
size_t len = strlen(s);
if(len > 4) {
s[4] = 'X';
}
}
return s;
}
That function accepts a pointer to a character array; if the pointer is not NULL and the zero-terminated array is longer than 4 characters, it replaces the fifth character (index 4) with an 'X'. C has no references; what other languages call passing by reference is done with pointers. You access the pointed-at value with the dereference operator, *p, or with array syntax such as p[0].
Now, this function:
char* edit_string(char *s) {
if(s) {
size_t len = strlen(s);
if(len > 4) {
char *new_s = malloc(len+1);
strcpy(new_s, s);
new_s[4] = 'X';
return new_s;
}
}
s = malloc(1);
s[0] = '\0';
return s;
}
That function returns a pointer to a newly allocated copy of the original character array, or a newly allocated empty string. (By doing that, the caller can always print it out and call free on the result.)
It does not change the original character array because new_s does not point to the original character array.
Now you could also do this:
const char* edit_string(char *s) {
if(s) {
size_t len = strlen(s);
if(len > 4) {
return "string was longer than 4";
}
}
s = "string was not longer than 4";
return s;
}
Notice that I changed the return type to const char* because a string literal like "string was longer than 4" must not be modified. Trying to modify it is undefined behavior and will typically crash the program.
Doing an assignment to s inside the function does not change the character array that s used to point to. The pointer s points to or references the original character array and then after s = "string" it points to the character array "string".

Integer warning for char pointer

Can someone help me understand why I would be getting "warning: cast to pointer from integer of different size" for the following two lines of code?
So I have a pointer to a string (char *string) and a double pointer (char **final) that needs to store the address of the last char in string. I thought the following lines of code would work, but I keep getting the warning. How do I fix it?
char last = *string;
*final = (char *)last;
(char *)last
last is of type char. Casting it to a pointer means the numeric code of the character stored in last will be interpreted as an address. So if last contains 'A', the value 65 will be interpreted as an address (assuming ASCII). The compiler is smart and indicates that this is probably not the behavior you intend.
If string is a pointer to the last character in the string, last is a copy of that character. Since it's just a copy of the value, it bears no relationship to the location in the original string. To save that pointer into what final points to, you should do:
*final = string;
To declare a variable you have to specify what type you want the variable to be, and then what you want to call the variable. If you want a variable of type "char", called "last", it can be achieved by the following syntax:
char last;
If you want a pointer to a variable of a certain data type, you add the asterisk symbol like so:
char *last;
Now you have a pointer that you can use to point at a place in memory that must contain a char. A "string" in C is nothing more than a series of chars stored consecutively in memory. You can use a char pointer to point at the first char in this series, and then use functions that work on strings (for example strcpy or strlen) by passing this char pointer as the argument.
Now to your problem. Let's say you create a string like this:
char *str = "example";
what you have done is create a series of char's, namely
'e', 'x', 'a', 'm', 'p', 'l', 'e', '\0'
(where '\0' is the null character that marks the end of the string; functions working on strings need it to recognize where the string ends). The char pointer you have created, called "str", points at the first char, that is 'e'. Remember, the pointer holds the address of this char, and the rest of the chars are stored at the addresses following this first char.
To access a particular char in this string, you have to dereference the pointer "str". If you want the first char in the string, you do this:
char first = *str;
This will save the first char in a variable of type char called "first", that is in this case the letter 'e'. To get the second char you do this:
char second = *(str+1);
What you're actually doing is "reading" (dereferencing) the value that your char pointer "str" is pointing to + 1 step of size "char" in memory. In this example, this means that the variable of type char called "second" now contains (the ASCII-value representing) the second letter in the string, that is 'x'.
If you want the size of a string you can use the function strlen. The syntax is this:
int length = strlen(str);
where "str" is our char pointer that is pointing at the first char in our string (that is 'e'). strlen will return the length of the string, not including the null character '\0' that simply marks the end of the string. That means in our example, length will equal 7, since there are 7 letters in the word "example". If you want to extract the last letter of this string, all you have to do is what we did before, but remember that indexing in C starts at 0. This means that if you have a string of length 7, the last element of this string will be located at "index" 6. Thus, to get the last char of a string you have to do this:
char last = *(str+length-1);
or if you have not saved length to a variable of type int, you can do it like this instead:
char last = *(str+strlen(str)-1);
If you want a pointer, pointing to the last char of the string, you have to initialize a new char pointer and make it point to place (memory address) where the last char of "str" is located. By the same logic as before, this is given by the memory address of the char at "index" 6 of our original string "str". So you create a new pointer, and let that pointer point to this memory address like this:
char *last = str+strlen(str)-1;
Remember that you need to include the header file string.h at the top of your file like so:
#include <string.h>

Arguments passed to puts function in C

I have only recently started learning C. I was going through the concept of arrays and pointers, when I came across a stumbling block in my understanding of it.
Consider this code -
#include<stdio.h>
int main()
{
char string[]="Hello";
char *ptr;
ptr=string;
puts(*ptr);
return(0);
}
It compiles, but runs into segmentation fault on execution.
The warning that I get is:
type error in argument 1 to `puts'; found 'char' expected 'pointer to char'
Now *ptr does return a character "H" and my initial impression was that it would just accept a char as an input.
Later, I came to understand that puts() expects a pointer to a character array as its input, but my question is: when I pass something like puts("H"), isn't that the same thing as puts(*ptr), given that *ptr does contain the character "H"?
"H" is a string literal that consists of 2 bytes 'H' and '\0'. Whenever you have "H" in your code, a pointer to the memory region with 2 bytes is meant. *ptr simply returns a single char variable.
By doing puts(*ptr), you're dereferencing the ptr variable. This would then try to use the 'H' character as a memory address (since that's what ptr points to), then segfault since it will be an invalid pointer (it will probably fall outside your process's memory). This is because the puts function accepts a pointer as an argument.
What you really want is puts(ptr).
As an aside, the latter example puts("h") populates the string table with "h" at compile time and replaces the definition there with an implicit pointer.
The puts() function takes a pointer to a string and what you are doing is specifying a single character.
Take a look at this Lesson 9: C Strings.
So rather than doing
#include<stdio.h>
int main()
{
char string[]="Hello";
char *ptr;
ptr=string; // store address of first character of the char array into char pointer variable ptr
// ptr=string is same as ptr=&string[0] since string is an array and an
// array variable name is treated like a constant pointer to the first
// element of the array in C.
puts(*ptr); // get character pointed to by pointer ptr and pass to function puts
// *ptr is the same as ptr[0] or string[0] since ptr = &string[0].
return(0);
}
You should instead be doing
#include<stdio.h>
int main()
{
char string[]="Hello";
char *ptr;
ptr=string; // store address of first character of the char array into char pointer variable ptr
puts(ptr); // pass pointer to the string rather than first character of string.
return(0);
}
Whenever you read a string with gets or display it with puts, you have to pass the address of the string.
For example:
char name[] = "Something";
If you want to print it, you have to write printf("%s",name); where name is converted to the address of the first character of the string "Something".
If you want to display it with puts, write puts(name); here, too, the address is what goes in the argument.
No.
'H' is the character literal.
"H" is, in effect, a character array with two elements, those being 'H' and the terminating '\0' null byte.
puts expects a pointer to a string as its input, so it expects a memory address. But in your example you provided the content of the memory, which is *ptr. *ptr is the content of the memory at address ptr, which is 'H'.
ptr is the memory address.
*ptr is the content of that memory.
The parameter of puts has an address type, but you provided a char (the content at the address).
puts prints character by character, starting at the address you give it, until it reaches a byte containing 0, and then it stops.

Resources