Strings in C (strstr function) - c

I'm trying to reproduce the behavior of the strstr() function, which finds a substring in a string. For that I created the following function and compared it to the original. I did all the iterations on paper to understand what was going on in the function, but I don't understand why the statement return (&str[i]); prints ab and not just a. When the function enters if (to_find[j] == '\0'), the values of i and j are 2 and 2, so it should just print &str[2], which is a. Why is it printing ab instead of just a?
#include <stdio.h>
#include <unistd.h>
#include <string.h>
char *ft_strstr(char *str, char *to_find)
{
int i;
int j;
i = 0;
if (*to_find == '\0')
return (str);
while (str[i] != '\0')
{
j = 0;
while (str[i + j] == to_find[j])
{
//printf("%d", i);
//printf("%d\n", j);
j++;
if (to_find[j] == '\0')
return (&str[i]);
}
i++;
}
return (0);
}
int main()
{
char i[] = "ab";
char dest[] = "a ab";
printf("%s", ft_strstr(dest, i));
//printf("%s", strstr(dest, i));
}

return (&str[i]); What this does is:
str[i] this is the same as *(str+i). It means take the address where str points to, add i to it and get its value.
&x means get the address of x
(&str[i]) Is a combination of the 2 above. It means take the address where str points to, add i to it and get its value and then get the address of that value. Which means the last 2 steps cancel each other out and you get: take the address where str points to, add i to it. It is the same as (str+i).
The return statement just means return this pointer.
Now after you called ft_strstr(dest, i) you get this pointer. In your case this pointer points to the second a in the string "a ab". You give this pointer to printf() and with the "%s" you told printf() to print the string where this pointer points to till there is a '\0'-byte. 'b' is not a '\0'-byte so it will also be printed.
The returned pointer points to here when you call printf():
          V
+---+---+---+---+---+
| a |   | a | b |\0 |
+---+---+---+---+---+
printf() will then check if it points to a '\0'-byte, which is false so the byte ('a') is printed and the pointer is incremented. Then the same is done for the byte ('b') and then it points to '\0', which means printf() stops here.

return (&str[i]); does not print anything. It just returns a value, and that value is the address of the i-th element of str.
The printing happens in printf("%s", ft_strstr(dest, i)); and what happens here is that you start with a format string, and the %s is a specifier that basically says "print characters until hitting the zero terminator, and start with the address specified".

When you print the return value from strstr using the specifier %s it will print "the needle" (aka the string to find) and all of the original string after the needle.
In your case there is nothing after the needle so your output is just "ab". Had your original string been "a abHelloWorld" your output would have been "abHelloWorld".
If you just want the first character of the needle do:
printf("%c", *ft_strstr(dest, i));
Also you can try this to get a better understanding:
char str[] = "Hello World";
printf("%s\n", str);
printf("%s\n", &str[0]);
printf("%s\n", &str[1]);
printf("%s\n", &str[2]);
printf("%s\n", &str[3]);
...
which will give you output
Hello World
Hello World
ello World
llo World
lo World
...
As you can see, printing from &str[n] will skip the first n characters and print the rest of the original string.
BTW:
char *ft_strstr(char *str, char *to_find)
should be
char *ft_strstr(const char *str, const char *to_find)

printf("%s", ...) prints a string, not a single character. To print a character, use the %c format specifier and provide the first element of the array as the argument:
printf("%c", *ft_strstr(dest, i));
Another option is to use putchar(...) for printing a single character.

Within this while loop
while (str[i + j] == to_find[j])
{
//printf("%d", i);
//printf("%d\n", j);
j++;
if (to_find[j] == '\0')
return (&str[i]);
}
the variable i is not being changed. It stays unchanged. It is the variable j that is being changed.
On the other hand the return statement
return (&str[i]);
returns a pointer to the symbol at the position i that is to the beginning of the substring equal to the string to_find.
Pay attention to the fact that the array dest declared like
char dest[]="a ab";
in fact has the following content
char dest[] = { 'a', ' ', 'a', 'b', '\0' };
and the function returns a pointer to its substring
char dest[] = { 'a', ' ', 'a', 'b', '\0' };
                           ^
                           |
                        { 'a', 'b', '\0' }
that is a pointer to the first character of the string "ab" that is a substring of the string stored in the array dest.
Bear in mind that when you write, for example,
printf( "%s\n", dest );
then this call is equivalent to
int i = 0;
printf( "%s\n", &dest[i] );

Related

delete leading characters before a string in C (concept question)

I’m learning C, dealing with strings and pointers. An exercise calls for deleting all leading characters (‘X’ in this case) before a string. The called function must accept a string, i.e. a pointer to a char. I have found multiple ways of doing this by searching, but I do not understand why the following code does not work...what concept am I missing?
//delete all leading X characters by passing pointer to a string
#include <stdio.h>
#include <string.h>
void delChar(char* str)
{
char* walker; //declare pointer
walker = str; //point to beginning of passed string
while(*walker == 'X') walker++; //move pointer past any leading 'X'
printf("\nwalker string is now: %s", walker); //prints "Test" as expected
str = walker; //set str pointer to walker
printf("\nstr string is now: %s", str); //prints "Test" as expected
return;
}
int main()
{
char* myStr = "XXXXXXXXXXXXTest";
printf("Before function call: %s", myStr); //prints "XXXXXXXXXXXXTest" as expected
delChar(myStr); //pass pointer to the string
printf("\nAfter function call: %s", myStr); //WHY DOES THIS print "XXXXXXXXXXXXTest" ?
return 0;
}
There are multiple ways in which characters can be deleted from a string, and it is not clear which you want.
In C, memory contents cannot be “deleted.” Memory is formed of bytes, and bytes hold values. When we have a string, it exists in a certain place in memory, and the bytes of the string cannot be made to go away.
Three ways to delete characters from a string are:
Given the start address of the string, return the address of the desired part of the string.
Given a pointer to the start of the string, update the pointer to point to the desired part of the string.
Move characters from later in the string to earlier.
Here are sample implementations:
#include <stdio.h>
/* Deletion method 0: Find the first character that is not an "X" and
return its address.
*/
static char *DeleteMethod0(char *string)
{
for (char *p = string; ; ++p)
if (*p != 'X')
return p;
}
// Deletion method 1: Update the pointer to the start of the string.
static void DeleteMethod1(char **string)
{
while (**string == 'X')
++*string;
}
// Deletion method 2: Move characters.
static void DeleteMethod2(char *string)
{
// Find the point where we stop deleting.
char *source = string;
while (*source == 'X')
++source;
// Copy the undeleted part of the string to the start.
while (*source)
*string++ = *source++;
*string = '\0';
}
int main(void)
{
char *string = "XXXXXXXXXXXXTest";
char buffer[] = "XXXXXXXXXXXXTest";
printf("The string is %s.\n", string);
printf("The buffer contains %s.\n", buffer);
char *after = DeleteMethod0(string);
printf("The string after deletion by getting new address is %s.\n", after);
DeleteMethod1(&string);
printf("The string after deletion by updating the pointer is %s.\n", string);
DeleteMethod2(buffer);
printf("The buffer after deletion by moving characters is %s.\n", buffer);
}
Another option would be to make a new copy of the desired part of the string, in memory either supplied by the caller or allocated by the deletion routine.
For starters the function should be declared like
char * delChar( char *str );
The function parameter str is a local variable of the function. So this assignment
str = walker;
does not change the pointer myStr declared in main. This pointer is passed to the function by value. That is the function deals with a copy of the pointer. And the assignment does not change the original pointer myStr. It changes only its local variable str that was initialized by a copy of the value of the pointer myStr.
Also, you may not change a string literal. Any attempt to change a string literal results in undefined behavior. But you do need to change the passed string, at least as follows from your assignment:
delete leading characters before a string in C
That is the task is not to find the pointer that points to the first character that is not equal to 'X'. You need to remove leading characters equal to 'X' from a string.
In main you need to declare a character array instead of a pointer to a string literal as for example
char myStr[] = "XXXXXXXXXXXXTest";
The function itself can be defined the following way
char * delChar( char *str )
{
char *walker = str;
while ( *walker == 'X' ) ++walker;
if ( walker != str ) memmove( str, walker, strlen( str ) + 1 - ( walker - str ) );
return str;
}
And in main it is enough to write
printf("After function call: %s\n", delChar( myStr ) );
Here is a demonstration program
#include <stdio.h>
#include <string.h>
char * delChar( char *str )
{
char *walker = str;
while (*walker == 'X') ++walker;
if (walker != str) memmove( str, walker, strlen( str ) + 1 - ( walker - str ) );
return str;
}
int main( void )
{
char myStr[] = "XXXXXXXXXXXXTest";
printf( "Before function call: %s\n", myStr );
printf( "After function call: %s\n", delChar( myStr ) );
}
The program output is
Before function call: XXXXXXXXXXXXTest
After function call: Test
The function will be more flexible if you declare a second parameter that specifies the character that should be deleted from the beginning of a string. For example
char * delChar( char *str, char c )
{
if ( c != '\0' )
{
char *walker = str;
while (*walker == c) ++walker;
if (walker != str) memmove( str, walker, strlen( str ) + 1 - ( walker - str ) );
}
return str;
}
In this case the function is called like
printf( "After function call: %s\n", delChar( myStr, 'X'));
The str variable in the function delChar is created on the stack, stores the address you passed, and is destroyed when the function returns.
void delChar(char* str) // str is created on the stack, stores the address you passed, and is destroyed when the function returns
{
char* walker; //declare pointer
walker = str; //point to beginning of passed string
while(*walker == 'X') walker++; //move pointer past any leading 'X'
printf("\nwalker string is now: %s", walker); //prints "Test" as expected
str = walker; //set str pointer to walker
printf("\nstr string is now: %s", str); //prints "Test" as expected
return;
}
After the function returns, myStr in main will still point to the start of the string.
You need to return the new address and store it,
or you can track the number of skipped characters with a counter and return that count, like below:
#include<stdio.h>
int delChar(char* str)
{
int count = 0;
while(str[count] == 'X') // index with count so str itself is not advanced
count++; // increment the count when X found
printf("\nwalker string is now: %s", str+count);
return count;
}
int main()
{
char* myStr = "XXXXXXXXXXXXTest";
int count;
printf("Before function call: %s", myStr);
count = delChar(myStr);
printf("\nAfter function call: %s", myStr+count);
return 0;
}
Thank you for the thoughtful replies and comments. I infer that this has basically been a question about pointers; that is, modifying a string without needing to return a pointer from a function call. I simplified the example code for the benefit of those who might be learning as well, with comments (see below). Let me know if I’m off base here...
#include <stdio.h>
#include <stdlib.h>
void func(char** str) //note POINTER-TO-POINTER parameter!
{
printf("\ninitial func str: %s", *str); //note dereference before str; prints "FULL TEXT"
*str += 5; //increment pointer five spaces to right to modify the string
printf("\nafter func str: %s", *str); //note dereference before str; prints "TEXT"
}
int main()
{
char* myStr = "FULL TEXT"; //need to initialize string with pointer variable
//char myStr[] = "FULL TEXT"; //initializing string as array will not work for this example!
printf("\n...before MAIN func call: %s", myStr); //prints "FULL TEXT"
/*pass ADDRESS of pointer variable instead of pointer variable itself, i.e. func
parameter needs to be a pointer-to-a-pointer...this is essentially passing by REFERENCE
instead of by VALUE (where a copy would get clobbered when returning to main)*/
func(&myStr); //note ADDRESS symbol, i.e. address of pointer variable
printf("\n...after MAIN func call: %s", myStr); //prints "TEXT", modified string remains after func call
return 0;
}

Why does the following code snippet's assignment give confusing output?

I'm studying C and came across string arrays. I'm a bit confused about the following code. I was anticipating one kind of output; however, I'm getting a completely different kind of output, or a program crash due to a read access violation.
I've run this code on visual studio 2017, with _CRT_SECURE_NO_WARNINGS
// case 1
char* name[2];
//name[0] = (char*)malloc(sizeof(char*) * 10);
//name[1] = (char*)malloc(sizeof(char*) * 10);
name[0] = "john";
name[1] = 'doe';
printf("%s\n", name[0]); // prints john
//printf("%s\n", name[1]); // gives read access violation exception, why??? even with dynamically allocated memory
// case 2
char* name2[2] = { "emma", "olsson" };
printf("%s\n", name2[0]); // prints emma
printf("%s\n", name2[1]); // prints olsson, why no error???
// case 3
for (int i = 0; i < 2; i++)
{
name[i] = name2[i];
}
printf("%s\n", name[0]); // prints emma
printf("%s\n", name[1]); // prints olsson, why no error???
// case 4
char inputName[10];
int i = 0;
while (i < 2)
{
fgets(inputName, sizeof(inputName), stdin); // first input: Max second input: Payne
char* pos = strchr(inputName, '\n');
if (pos != NULL)
*pos = '\0';
name[i++] = inputName;
}
printf("%s\n", name[0]); // prints Payne, why not Max???
printf("%s\n", name[1]); // prints Payne
For case 1, 'doe' is not a string. It is a multi-character constant of type int, so name[1] ends up holding an integer converted to a pointer rather than the address of a string, and printf("%s") then reads from an invalid address. Use double quotes instead: "doe".
Case 2 works because you are initializing your pointers with string literals.
Case 3 works too because you assign the pointers initialized in case 2 to the pointers from case 1. Your name array pointers are simply set to point to where the name2 ones are pointing.
In case 4, you declared inputName, which is an array of 10 chars. Each time you get a new input, you write it to the same memory. And by doing this:
name[i++] = inputName;
you are not copying a new char array to name[i] as you might think. Instead, you are telling the name[i] char pointer to point to inputName. So it is normal that name prints the last input twice, because that is what inputName holds, and both name char pointers point to it.
It is unclear whether OP's code runs within main() or a user-defined function and what kind of value returns. That said, after removing superfluous variable redeclarations, here's how I achieved working code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void) {
char * name[2];
char * name2[2]={ "emma", "olsson" };
char inputName[10];
char names[BUFSIZ] = ""; // start with an empty string so strcat() finds a terminator
int i = 0;
// case 1
name[0] = "john";
name[1] = "doe";
printf("%s %s\n", name[0],name[1]); //john doe
// case 2
printf("%s %s\n", name2[0],name2[1]);//emma olsson
// case 3
for (i = 0; i < 2; i++){
name[i] = name2[i];
}
printf("%s %s\n", name[0],name[1]);//emma olsson
// case 4
i=0;
while ((i < 2) && fgets(inputName, sizeof(inputName), stdin) != NULL){ // check i first so no third line is read
strcat(names,inputName);
i++;
}
printf("\n%s\n",names);
return 0;
}
OP should replace the single quotes around doe with double quotes which denote a null-terminated string. Single quotes are meant for a single character, i.e. 'a' refers to a byte value while "a" signifies a string containing two characters, an 'a' and a '\0'.
Also, OP should include two more headers. In particular, string.h is needed so that the string functions used here (strchr, strcat) are properly declared.
Case 2 and Case 3 work because strings are encompassed by double quotes instead of single quotes. Note that in each case the "%s" format specifier for the printf() indicates that a string needs to be displayed.
In the last case, fgets() reading from stdin returns the user input as a string on success. But that input will be overwritten in the while loop, unless in each iteration you concatenate the old input with the new. Otherwise, because the address of inputName remains constant while its contents change, only the latest input string is displayed. Here's some code that illustrates this point:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void) {
char * name[2];
char inputName[10];
int i = 0;
while ((i < 2) && fgets(inputName, sizeof(inputName), stdin) != NULL){
printf("inputName: %p points to: %s",(void *)inputName,inputName); // %p expects a void *
name[i++] = inputName;
}
printf("\n name[0]: %p points to: %s\n name[1]: %p points to: %s",(void *)name[0],name[0],(void *)name[1],name[1]);
return 0;
}
Output:
inputName: 0x7fff8a511a50 points to: Maxine
inputName: 0x7fff8a511a50 points to: Pauline
name[0]: 0x7fff8a511a50 points to: Pauline
name[1]: 0x7fff8a511a50 points to: Pauline
Incidentally, you don't need an array to display the names and indeed one may display the names outside of the loop as long as within the loop the code concatenates user input.

string not printing properly in C

I am encountering a problem while printing out a string using a while loop in a standalone function.
I have the following code:
#include <stdio.h>
int pword(char *);
int main() {
char s[] = "Alice";
pword(s);
return 0;
}
int pword(char *s) {
while(*s!='\0') {
printf("%s", s);
s++;
}
printf("\n");
return 0;
}
This is printing: Aliceliceicecee.
You're printing the offset word each time, instead of the character.
Try replacing (for instance)
printf("%s", s);
with
printf("%c", *s);
or since you don't really need formatting, use
putchar(*s);
(all this means that you're basically rewriting puts with a loop. So if no further processing is required on the characters, maybe you should just stick with standard functions)
%s means expect a const char * argument
%c means expect a character argument, which is printed.
You are looking for the latter one.
More info on %s: The argument is taken to be a string (character pointer), and characters from the string are printed until a null character is reached or until the number of characters indicated by the precision specification has been printed; if the precision is missing, all characters up to a null are printed.
Seeing no answer explained what exactly was going on, here is what you are actually doing:
int pword(char *s) { /* s = "Alice" (s is a char* that holds the address of "Alice" string)*/
while(*s!='\0') { /* check if the first char pointed to by s != '\0' */
printf("%s", s); /* print the string that start at s*/
s++; /* move s (the char pointer) 1 step forward*/
} /* s points to "lice" -> "ice" -> "ce" -> "e" */
printf("\n");
return 0;
}
In order to print the string "Alice" you could have just used printf("%s", s); as it would take the address pointed to by s, where "Alice" is stored, and print it until reaching null-terminator ('\0').
If you want to use a loop and print char by char, you should have used printf("%c", *s);. %c is meant for printing a single char, whereas %s is for printing strings. Another thing to note is s vs *s: the former is a char * (pointer to char) that can point to a number of consecutive chars, and the latter (*s) is a dereferenced char *, that is, a single char.
To sum up:
print char by char
int pword(char *s) {
while(*s!='\0') {
printf("%c", *s);
s++;
}
printf("\n");
return 0;
}
print the whole string at once
int pword(char *s) {
printf("%s\n", s);
return 0;
}
If you want to print character by character, you should use *s in the printf statement like below.
#include <stdio.h>
int pword(char *);
int main() {
char s[] = "Alice";
pword(s);
return 0;
}
int pword(char *s) {
while(*s!='\0') {
printf("%c", *s);
s++;
}
printf("\n");
return 0;
}

Format "%s" expects an argument of type char* etc, I just want to print the alphabet

Why can't I print the alphabet using this code?
void ft_putchar(char c)
{
write(1, &c, 1);
}
int print_alf(char *str)
{
int i;
i = 0;
while (str[i])
{
if (i >= 'A' && i <= 'Z')
ft_putchar(str[i]);
else
ft_putchar('\n');
i++;
}
return (str);
}
int main ()
{
char a[26];
printf("%s", print_alf(a));
return (0);
}
I get this warning
format ' %s ' expects type 'char*' but argument 2 has type 'int'
How do I print the alphabet using a string, and write function?
Your entire print_alf function looks suspicious.
You are returning str which is of type char *. Therefore the return type of print_alf should be char * instead of int.
Your while (str[i]) loop makes no sense at all since you are passing uninitialized memory to it. So your code will very likely corrupt the memory since the while loop will continue to run until a '\0' is found within the memory which does not need to be the case within the boundaries of the passed memory (a).
You are not adding a zero termination character ('\0') at the end of the string. This will result in printf("%s", print_alf(a)); printing as many characters beginning at the address of a until a '\0' is found within the memory.
Here is a suggestion how to fix all those problems:
char *print_alf(char *str, size_t len)
{
char *start = str; // remember the beginning of the buffer
char letter;
if ((str) && (len >= 27)) // is str a valid pointer and length is big enough?
{
for (letter = 'A'; letter <= 'Z'; letter++) // iterate all characters of the alphabet
{
*str = letter;
str++;
}
*str = '\0'; // add zero termination!!!
}
else
{
start = NULL; // indicate an error!
}
return (start); // return the start of the string, not the position of the '\0'
}
int main()
{
char a[26 + 1]; // ensure '\0' fits into buffer!
printf("%s", print_alf(a, sizeof(a)));
return (0);
}
Make up your mind whether print_alf should return a string which you then print with printf or whether print_alf should be a void function that does the printing, which you should then just call without printf. At the moment, your code tries to be a mixture of both.
The easiest way is to just print the alphabet:
void print_alf(void)
{
int c;
for (c = 'A'; c <= 'Z'; c++) putchar(c);
}
Call this function like so:
print_alf(); // print whole alphabet to terminal
A more complicated variant is to fill a string with the alphabet and then print that string. That's what you tried to achieve, I think. In that case, you must pass a sufficiently big buffer to the function and return it. Note that if you want to use the string functions and features of the standard lib (of which printf("%s", ...) is one) you must null-terminate your string.
char *fill_alf(char *str)
{
int i;
for (i = 0; i < 26; i++) str[i] = 'A' + i;
str[26] = '\0';
return str;
}
It is okay to return the buffer that was passed into the function, but beware of cases where you return local character buffers, which will lead to undefined behaviour.
You can call it as you intended in your original code, but note that you must make your buffer at least 27 characters big to hold the 26 letters and the null terminator:
char a[27];
printf("%s\n", fill_alf(a));
Alternatively, you could do the filling and printing in two separate steps:
char a[27];
fill_alf(a); // ignore return value, because it's 'a'
printf("%s\n", a); // print filled buffer
If you just want to print the alphabet, the print_alf variant is much simpler and straightforward. If you want to operate further on the alphabet, eg do a shuffle, consider using fill_alf.
Your print_alf(char *str) function is declared to return int, which is what causes the warning. When you specify %s, printf expects a pointer to char (char *), not a number.
You can fix this by changing the return type of your function to char *, and if everything else in your code works you'll be good to go.

String slicing in array prints two characters?

The C program below prints the first and last characters of a 16-character string:
#include<stdio.h>
#include<string.h>
void main()
{
char first, last;
char *str = "abcdefghijklmnop";
first = str[0];
last = str[15];
printf("%s", &first);
printf("%s", &last);
}
The output I am seeking is a and p. But, when I run this code I get the output:
apa
What am I doing wrong?
You're missing understanding of pointers. When you assign a character to first and last, then those characters will essentially be copied to first and last. Since first and last are distinct variables, their addresses have no relation to the char *str pointer. Also, printf("%s", &first); (and the same with last) invokes undefined behaviour, since printf expects a 0-terminated string, but you pass one character only, after which there's no zero terminator.
What you can do is either use pointers:
char *first = str + 0;
char *last = str + 15;
printf("%s %s", first, last);
This will print abcdefghijklmnop p
or to print the two chars only:
char first = str[0];
char last = str[15];
printf("%c %c", first, last);
This will print a p.
The lines below will give the correct result:
printf("%c", first );
printf("%c", last );
Your variables first and last are actual character values, not strings/pointers. You need to use %c instead. Try:
#include<stdio.h>
#include<string.h>
int main(void)
{
char first, last;
char *str = "abcdefghijklmnop";
first = str[0];
last = str[15];
printf("%c", first);
printf("%c", last);
}
The %s specifier expects a pointer to an array of characters and keeps reading until it reaches a null character ('\0').
You can read more about this here http://pw1.netcom.com/~tjensen/ptr/ch3x.htm and here http://www.codingunit.com/printf-format-specifiers-format-conversions-and-formatted-output.
I find that the following C++ page has better diagrams for visualizing pointers http://cplusplus.com/doc/tutorial/pointers/.

Resources