I am trying to search two different strings in a line using strstr.
sBuffer = "This is app test";
s1= strstr (sBuffer, "This");
s2= strstr (sBuffer, "test");
printf("%s\n", s1); //prints - This is app test
printf("%s\n", s2); //prints - test
if (s1 && s2)
//do something
Expected output for s1 should be the string "This" but it is printing the entire string for s1.
s2 however is printed correctly.
Any help appreciated.
EDIT: Although all the answers are correct (upvoted all answers), I am accepting dasblinkenlight's answer. This is because I realize checking the boolean condition as shown below would suffice my requirement. Thanks for all the answers.
if ( (strstr (sBuffer, "This")) && (strstr (sBuffer, "test")) )
//do something
You're not understanding what the function does.
It gives you the address in sBuffer ("the haystack") where the search string ("the needle") has been found. It doesn't modify the haystack string, so it won't terminate the sub-string.
You have:
         +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+----+
sBuffer: | T | h | i | s |   | i | s |   | a | p | p |   | t | e | s | t | \0 |
         +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+----+
           ^                                               ^
           |                                               |
           |                                               |
strstr(sBuffer, "This")                         strstr(sBuffer, "test")
As you can see, strstr(sBuffer, "This") will simply return sBuffer, which of course still contains the rest of the characters; it's the same memory buffer.
If you need to extract the sub-string that you found, you must do so yourself. A suitable function to use is strlcpy() if you have it; otherwise strncpy() will work since you know the exact length of the data to copy, but it will not add the terminating '\0' in that case, so you must write it yourself.
The return value of strstr is the pointer to the original, unmodified, string at the point of the match. The reason why the second call displays test is a coincidence: test simply happens to be at the end of the searched string. Had the sBuffer been "This is app test of strstr", the output for the second call would be test of strstr, not simply test.
To fix this, you can change your program like this:
printf("%s\n", s1 ? "This" : "");
printf("%s\n", s2 ? "test" : "");
The reason this works is that you know that the only case when strstr would return a non-null pointer is when it finds the exact match to what you've been searching for. If all you need is a boolean "found/not found" flag, you can simply test s1 and s2 for NULL. You are using this trick already in your final if statement.
strstr() returns a pointer to the first character of the substring it found. It doesn't NUL-terminate the string after the searched substring; this is the expected and correct behavior.
As to the solution: if you have a non-const string, you can simply modify it so that it's NUL-terminated at the correct position (but then beware of the modifications you made). If not, then make a copy of the substring.
const char *haystack = "abcd efgh ijkl";
const char *needle = "efgh";
const char *p = strstr(haystack, needle);
if (p) {
    size_t l = strlen(needle);
    char buf[l + 1];
    memcpy(buf, p, l);
    buf[l] = 0;
    printf("%s\n", buf);
}
Related
I'm looking to split a string based on a specific sequence of characters but only if they are in order.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int main()
{
int i = 0;
char **split;
char *tmp;
split = malloc(20 * sizeof(char *));
tmp = malloc(20 * 12 * sizeof(char));
for(i=0;i<20;i++)
{
split[i] = &tmp[12*i];
}
char *line;
line = malloc(50 * sizeof(char));
strcpy(line, "Test - Number -> <10.0>");
printf("%s\n", line);
i = 0;
while( (split[i] = strsep(&line, " ->")) != NULL)
{
printf("%s\n", split[i]);
i++;
}
}
This will print out:
Test
Number
<10.0
However I just want to split around the -> so it could give the output:
Test - Number
<10.0>
I think the best way to do the splits with an ordered sequence of delimiters is
to replicate strtok_r behaviour using strstr, like this:
#include <stdio.h>
#include <string.h>
char *substrtok_r(char *str, const char *substrdelim, char **saveptr)
{
    char *haystack;
    if(str)
        haystack = str;
    else
        haystack = *saveptr;
    char *found = strstr(haystack, substrdelim);
    if(found == NULL)
    {
        *saveptr = haystack + strlen(haystack);
        return *haystack ? haystack : NULL;
    }
    *found = 0;
    *saveptr = found + strlen(substrdelim);
    return haystack;
}
int main(void)
{
    char line[] = "a -> b -> c -> d; Test - Number -> <10.0> ->No->split->here";
    char *input = line;
    char *token;
    char *save;
    while((token = substrtok_r(input, " ->", &save)) != NULL)
    {
        input = NULL;
        printf("token: '%s'\n", token);
    }
    return 0;
}
This behaves like strtok_r but only splits when the substring is found. The
output of this is:
$ ./a
token: 'a'
token: ' b'
token: ' c'
token: ' d; Test - Number'
token: ' <10.0>'
token: 'No->split->here'
And like strtok and strtok_r, it requires that the source string is
modifiable, as it writes the '\0'-terminating byte for creating and returning
the tokens.
EDIT
Hi, would you mind explaining why '*found = 0' means the return value is only the string in-between delimiters. I don't really understand what is going on here or why it works. Thanks
The first thing you've got to understand is how strings work in C. A string is
just a sequence of bytes (characters) that ends with the '\0'-terminating
byte. I wrote bytes and characters in parentheses, because a character in C is
just a 1-byte value (on most systems a byte is 8 bits long) and the integer
values representing the characters are those defined in the ASCII code
table, which are 7-bit values. As you can see from the table the
value 97 represents the character 'a', 98 represents 'b', etc. Writing
char x = 'a';
is the same as doing
char x = 97;
The value 0 is a special value for strings: it is called NUL (null character)
or '\0'-terminating byte. This value is used to tell the functions where a
string ends. A function like strlen that returns the length of a string, does
it by counting how many bytes it encounters until it encounters a byte with
the value 0.
That's why strings are stored using char arrays, because a pointer to an array
gives you the start of the memory block where the sequence of chars is stored.
Let's look at this:
char string[] = { 'H', 'e', 'l', 'l', 'o', 0, 48, 49, 50, 0 };
The memory layout for this array would be
   0     1     2     3     4    5     6     7     8    9
+-----+-----+-----+-----+-----+----+-----+-----+-----+----+
| 'H' | 'e' | 'l' | 'l' | 'o' | \0 | '0' | '1' | '2' | \0 |
+-----+-----+-----+-----+-----+----+-----+-----+-----+----+
or to be more precise with the integer values
  0     1     2     3     4    5   6    7    8    9
+----+-----+-----+-----+-----+---+----+----+----+---+
| 72 | 101 | 108 | 108 | 111 | 0 | 48 | 49 | 50 | 0 |
+----+-----+-----+-----+-----+---+----+----+----+---+
Note that the value 0 represents '\0', 48 represents '0', 49 represents
'1' and 50 represents '2'. If you do
printf("%zu\n", strlen(string));
the output will be 5. strlen will find the value 0 at index 5 and
stop counting; however, string stores two strings, because from index 6
on, a new sequence of characters starts that also terminates with 0, thus making it a
second valid string in the array. To access it, you would need to have a pointer
that points past the first 0 value.
printf("1. %s\n", string);
printf("2. %s\n", string + strlen(string) + 1);
The output would be
1. Hello
2. 012
This property is used in functions like strtok (and mine above) to return you
a substring from a larger string, without the need of creating a copy (that would be
creating a new array, dynamically allocating memory, using strcpy to create
the copy).
Assume you have this string:
char line[] = "This is a sentence;This is another one";
Here you have one string only, because the '\0'-terminating byte comes after
the last 'e' in the string. If I however do:
line[18] = 0; // same as line[18] = '\0';
then I created two strings in the same array:
"This is a sentence\0This is another one"
because I replaced the semicolon ';' with '\0', thus creating a new string
from position 0 to 18 and a second one from position 19 to 38. If I do now
printf("string: %s\n", line);
the output will be
string: This is a sentence
Now let us take a look at the function itself:
char *substrtok_r(char *str, const char *substrdelim, char **saveptr);
The first argument is the source string, the second argument is the delimiter
string and the third one is a double pointer to char. You have to pass a pointer
to a pointer to char. This will be used to remember where the function should
resume scanning next, more on that later.
This is the algorithm:
if str is not NULL:
    start a new scan sequence from str
otherwise
    resume scanning from the string pointed to by *saveptr
find the position of substring_d pointed to by 'substrdelim'
if no such substring_d is found
    if the current character of the scanned text is \0
        no more substrings to return --> return NULL
    otherwise
        return the scanned text and set *saveptr to
        point to the \0 character of the scanned text,
        so that the next iteration ends the scanning
        by returning NULL
otherwise (a substring_d was found)
    create a new substring_a up to the found one
    by setting the first character of the found
    substring_d to 0.
    update *saveptr to the start of the found substring_d
    plus its length, so that *saveptr points past
    the delimiter sequence found in substring_d.
    return the newly created substring_a
This first part is easy to understand:
if(str)
haystack = str;
else
haystack = *saveptr;
Here if str is not NULL, you want to start a new scan sequence. That's why
in main the input pointer is set to point to the start of the string saved
in line. Every other iteration must be called with str == NULL; that's
why the first thing done in the while loop is to set input = NULL; so
that substrtok_r resumes scanning using *saveptr. This is the standard
behaviour of strtok.
The next step is to look for a delimiting substring:
char *found = strstr(haystack, substrdelim);
The next part handles the case where no delimiting substring is
found [2]:
if(found == NULL)
{
*saveptr = haystack + strlen(haystack);
return *haystack ? haystack : NULL;
}
*saveptr is updated to point past the whole source, so that it points to the
'\0'-terminating byte. The return line can be rewritten as
if(*haystack == '\0')
    return NULL;
else
    return haystack;
which says: if the source already is an empty string [1], then return
NULL. This means no more substrings were found and you should stop calling the
function. This is also standard behaviour of strtok.
The last part
*found = 0;
*saveptr = found + strlen(substrdelim);
return haystack;
handles the case when a delimiting substring is found. Here
*found = 0;
is basically doing
found[0] = '\0';
which creates substrings as explained above. To make it clear once again:
Before
*found = 0;
*saveptr = found + strlen(substrdelim);
return haystack;
the memory looks like this:
+-----+-----+-----+-----+-----+-----+
| 'a' | ' ' | '-' | '>' | ' ' | 'b' | ...
+-----+-----+-----+-----+-----+-----+
   ^     ^
   |     |
haystack found
*saveptr
After
*found = 0;
*saveptr = found + strlen(substrdelim);
the memory looks like this:
+-----+------+-----+-----+-----+-----+
| 'a' | '\0' | '-' | '>' | ' ' | 'b' | ...
+-----+------+-----+-----+-----+-----+
   ^     ^                  ^
   |     |                  |
haystack found           *saveptr
                         because strlen(substrdelim)
                         is 3
Remember if I do printf("%s\n", haystack); at this point, because the ' ' that
found points to has been set to 0, it will print a. *found = 0 created two strings out
of one like explained above. strtok (and my function which is based on
strtok) uses the same technique. So when the function does
return haystack;
the string that ends up in token will be the text before the split. Eventually
substrtok_r returns NULL and the loop exits, because substrtok_r returns
NULL when no more splits can be created, just like strtok.
Footnotes
[1] An empty string is a string where the first character is already the
'\0'-terminating byte.
[2] This is a very important part. Most of the standard functions in the C
library like strstr will not return you a new string in memory, will
not create a copy and return a copy (unless the documentation says so). They
will return you a pointer pointing to the original plus an offset.
On success strstr will return you a pointer to the start of the substring,
this pointer will be at an offset to the source.
const char *txt = "abcdef";
char *p = strstr(txt, "cd");
Here strstr will return a pointer to the start of the substring "cd" in
"abcdef". To get the offset you do p - txt, which returns how many bytes
apart they are:
b = base address where txt is pointing to
   b    b+1   b+2   b+3   b+4   b+5   b+6
+-----+-----+-----+-----+-----+-----+------+
| 'a' | 'b' | 'c' | 'd' | 'e' | 'f' | '\0' |
+-----+-----+-----+-----+-----+-----+------+
   ^           ^
   |           |
  txt          p
So txt points to address b, p points to address b+2. That's why you get
the offset by doing p-txt which would be (b+2) - b => 2. So p points to
the original address plus the offset of 2 bytes. Because of this behaviour
things like *found = 0; work in the first place.
Note that doing things like txt + 2 will return you a new pointer pointing to
where txt points plus the offset of 2. This is called pointer arithmetic.
It's like regular arithmetic but here the compiler takes the size of an object
into consideration. char is a type that is defined to have the size of 1,
hence sizeof(char) returns 1. But let's say you have an array of integers:
int arr[] = { 7, 2, 1, 5 };
On my system an int has size of 4, so an int object needs 4 bytes in memory.
This array looks like this in memory:
b = base address where arr is stored
address   base        base + 4    base + 8    base + 12
in bytes  +-----------+-----------+-----------+-----------+
          |     7     |     2     |     1     |     5     |
          +-----------+-----------+-----------+-----------+
pointer   arr         arr + 1     arr + 2     arr + 3
arithmetic
Here arr + 1 returns you a pointer pointing to where arr is stored plus an
offset of 4 bytes.
I'm trying to compare a string with another string and if they match I want the text "That is correct" to output but I can't seem to get it working.
Here is the code:
int main ()
{
char * password = "Torroc";
char * userInput;
printf("Please enter your password: ");
scanf("%s", userInput);
if (strcmp(password, userInput) == 0) {
printf("That is correct!");
}
}
In your code, userInput pointer does not have provision to hold the string that you are about to pass using the scanf call. You need to allocate space for the cstring userInput in your stack, before you try to save/assign any string to it. So...
You need to change the following code:
char * userInput;
to:
char userInput[200];
Here, 200 is just an arbitrary value. In your case, please select the max. length of the string + 1 for the (\0).
When you enter characters you need to store the characters somewhere.
char* userInput;
is an uninitialized pointer.
So first you declare an array for your input
char userInput[128];
Now when reading from the keyboard you need to make sure the user cannot enter more than 127 characters plus one for the \0, because anything longer would overwrite the stack. The best way to read from the keyboard is fgets, which never stores more than the size you give it. It is also good to check the return value: fgets returns NULL on end of input or on a read error. If the user simply presses ENTER, fgets succeeds and stores just the newline.
if (fgets(userInput, sizeof(userInput), stdin) != NULL) {
Now you have the string that the user entered plus the end of line character. To remove it you can do something like
char* p = strchr(userInput,'\n');
if ( p != NULL ) *p = '\0';
Now you can compare the strings
if (strcmp(password, userInput) == 0) {
puts("That is correct!");
}
When you think about "a string" in C, you should see it as an array of chars.
Let me use the identifier s instead of userInput for brevity:
       0   1   2   3   4   5         9
     +---+---+---+---+---+---+-- --+---+
s -> | p | i | p | p | o | \0| ... |   |
     +---+---+---+---+---+---+-- --+---+
is what
char s[10] = "pippo";
would create.
In other words, it's a block of memory where the first 6 bytes have been initialized as shown. There is no s variable anywhere.
Instead, declaring a char * like in
char *s;
would create a variable that can hold a pointer to char:
  +------------+
 s| 0xCF024408 |  <-- represents an address
  +------------+
If you think this way, you notice immediately that doing:
scanf("%s",s);
only makes sense in the first case, where there is (hopefully) enough memory to hold the string.
In the second case, the variable s points to some random address and you will end up writing something into an unknown memory area.
For completeness, in cases like:
char *s = "pippo";
you have the following situation in memory:
             0   1   2   3   4   5
           +---+---+---+---+---+---+   Somewhere in the
0x0B320080 | p | i | p | p | o | \0|   <-- readonly portion
           +---+---+---+---+---+---+   of memory
           +------------+              a variable pointing
          s| 0x0B320080 |              <-- to the address where
           +------------+              the string is
You can make s point somewhere else but you can't change the content of the string pointed to by s.
how to store two strings one after other without concatenation (we can increment the address)
char str[10];
scanf("%s",str);
str=str+9;
scanf("%s",str);
NOTE: Here if I give first string as BALA and 2nd as HI, it should print as HI after BALA . But HI should not replace BALA.
You cannot increment (or change in any other way) an array like that; the array variable (str) is a constant which cannot be changed.
You can do it like so:
char str[64];
scanf("%s", str);
scanf("%s", str + strlen(str));
This will first scan into str, then immediately scan once more, starting the new string right on top of the terminating '\0' of the first string.
If you enter "BALA" first, the beginning of str will look like this:
     +---+---+---+---+----+
str: | B | A | L | A | \0 |
     +---+---+---+---+----+
and since strlen("BALA") is four, the next string will be scanned into the buffer starting right on top of the '\0' visible above. If you then enter "HI", str will start like so:
     +---+---+---+---+---+---+----+
str: | B | A | L | A | H | I | \0 |
     +---+---+---+---+---+---+----+
At this point, if you print str it will print as "BALAHI".
Of course, this is very dangerous and likely to introduce buffer overrun, but that's what you wanted.
If I understand what you want to do correctly perhaps you want to put the strings in an array.
So a modified version of your code would look something like
char strings[ARRAY_LENGTH][MAX_STRING_LENGTH];
char* str = strings[0];
scanf("%s",str);
str=strings[1];
scanf("%s",str);
Then to print all the strings you would have to loop over the array like this
int i;
for(i = 0; i < ARRAY_LENGTH; i++)
{
printf("%s\n", strings[i]);
}
(you would have to define ARRAY_LENGTH and MAX_STRING_LENGTH)
Moving in a similar direction to unwind, you can use the %n directive to determine how many bytes have been read. Don't forget to subtract any leading whitespace. You may also want to read your manual regarding scanf very carefully, but paying particular care to the "RETURN VALUE" section. Handling the return value is necessary to ensure a string was actually read and avoid undefined behaviour.
char str[64];
int whitespace_length, str_length, total_length;
/* NOTE: Don't handle errors with assert!
* You should replace this with proper error handling */
assert(scanf(" %n%s%n", &whitespace_length, str, &total_length) == 1);
str_length = total_length - whitespace_length;
assert(scanf(" %n%s%n", &whitespace_length, str + str_length, &total_length) == 1);
str_length += total_length - whitespace_length;
Maybe you are looking for something like this
char arr[100] = {0}, *str = NULL;
/** the first string is scanned from the beginning **/
str = arr;
/** scan first string **/
fgets(str, sizeof arr, stdin);
/** fgets keeps the newline; replace it with a space if space is the delimiter **/
str += strlen(str) - 1;
*str = ' ';
/** scan the second string into the address just after the space,
without running past the end of arr **/
str++;
fgets(str, (int)(sizeof arr - (size_t)(str - arr)), stdin);
printf("%s", arr);
In case you need the same with scanf:
char arr[100] = {0,}, *str = NULL;
/** initial String will be scanned from beginning **/
str = arr;
/** scan first string **/
scanf("%s",str);
/** We need to replace the NUL terminator with a space if space is the delimiter **/
str += strlen(str);
*str = ' ' ;
/** scan second string from address just after space,
* we can not over ride the memory though **/
str++;
scanf("%s",str);
printf("%s",arr);
I'm reading a file in my C program and comparing every word in it with my word, which is entered via command line argument. But I get crashes, and I can't understand what's wrong. How do I track such errors? What is wrong in my case?
My compiler is clang. The code compiles fine. When running it says 'segmentation fault'.
Here is the code.
#include <stdio.h>
#include <string.h>
int main(int argc, char* argv[])
{
char* temp = argv[1];
char* word = strcat(temp, "\n");
char* c = "abc";
FILE *input = fopen("/usr/share/dict/words", "r");
while (strcmp(word, c))
{
char* duh = fgets(c, 20, input);
printf("%s", duh);
}
if (!strcmp (word, c))
{
printf("FOUND IT!\n");
printf("%s\n%s", word, c);
}
fclose(input);
}
The issue here is that you are trying to treat strings in C as you might in another language (like C++ or Java), in which they are resizable vectors that you can easily append or read an arbitrary amount of data into.
C strings are much lower level. They are simply an array of characters (or a pointer to such an array; arrays can be treated like pointers to their first element in C anyhow), and the string is treated as all of the characters within that array up to the first null character. These arrays are fixed size; if you want a string of an arbitrary size, you need to allocate it yourself using malloc(), or allocate it on the stack with the size that you would like.
One thing here that is a little confusing is you are using a non-standard type string. Given the context, I'm assuming that's coming from your cs50.h, and is just a typedef to char *. It will probably reduce confusion if you actually use char * instead of string; using a typedef obscures what's really going on.
Let's start with the first problem.
string word = strcat(argv[1], "\n");
strcat() appends the second string onto the first; it starts from the null terminator of the first string, and replaces that with the first character of the second string, and so on, until it reaches a null in the second string. In order for this to work, the buffer containing the first string needs to have enough room to fit the second one. If it does not, you may overwrite arbitrary other memory, which could cause your program to crash or have all kinds of other unexpected behavior.
Here's an illustration. Let's say that argv[1] contains the word hello, and the buffer has exactly as much space as it needs for this. After it is some other data; I've filled in other for the sake of example, though it won't actually be that, it could be anything, and it may or may not be important:
+---+---+---+---+---+---+---+---+---+---+---+---+
| h | e | l | l | o | \0| o | t | h | e | r | \0|
+---+---+---+---+---+---+---+---+---+---+---+---+
Now if you use strcat() to append "\n", you will get:
+---+---+---+---+---+---+---+---+---+---+---+---+
| h | e | l | l | o | \n| \0| t | h | e | r | \0|
+---+---+---+---+---+---+---+---+---+---+---+---+
You can see that we've overwritten the other data that was after hello. This may cause all kinds of problems. To fix this, you need to copy your argv[1] into a new string, that has enough room for it plus one more character (and don't forget the trailing null). You can call strlen() to get the length of the string, then add 1 for the \n, and one for the trailing null, to get the length that you need.
Actually, instead of trying to add a \n to the word you get in from the command line, I would recommend stripping off the \n from your input words, or using strncmp() to compare all but the last character (the \n). In general, it's best in C to avoid appending strings, as appending strings means you need to allocate memory and copy things around, and it can be easy to make mistakes doing so, as well as being inefficient. Higher level languages usually take care of the details for you, making it easier to append strings, though still just as inefficient.
After your edit, you changed this to:
char* temp = argv[1];
char* word = strcat(temp, "\n");
However, this has the same problem. A char * is a pointer to a character array. Your temp variable is just copying the pointer, not the actual value; it is still pointing to the same buffer. Here's an illustration; I'm making up addresses for the purposes of demonstration, in the real machine there will be more objects in between these things, but this should suffice for the purpose of demonstration.
+------------+---------+-------+
| name | address | value |
+------------+---------+-------+
| argv | 1000 | 1004 |-------+
| argv[0] | 1004 | 1008 | --+ <-+
| argv[1] | 1006 | 1016 | --|---+
| argv[0][0] | 1008 | 'm' | <-+ |
| argv[0][1] | 1009 | 'y' | |
| argv[0][2] | 1010 | 'p' | |
| argv[0][3] | 1011 | 'r' | |
| argv[0][4] | 1012 | 'o' | |
| argv[0][5] | 1013 | 'g' | |
| argv[0][6] | 1014 | 0 | |
| argv[1][0] | 1016 | 'w' | <-+ <-+
| argv[1][1] | 1017 | 'o' | |
| argv[1][2] | 1018 | 'r' | |
| argv[1][3] | 1019 | 'd' | |
| argv[1][4] | 1020 | 0 | |
+------------+---------+-------+ |
Now when you create your temp variable, all you are doing is copying argv[1] into a new char *:
+------------+---------+-------+ |
| name | address | value | |
+------------+---------+-------+ |
| temp | 1024 | 1016 | --+
+------------+---------+-------+
As a side note, you also shouldn't ever try to access argv[1] without checking that argc is greater than 1. If someone doesn't pass any arguments in, then argv[1] itself is invalid to access.
I'll move on to the next problem.
string c = "abc";
// ...
char* duh = fgets(c, 20, input);
Here, you are referring to the static string "abc". A string that appears literally in the source, like "abc", goes into a special, read-only part of the memory of the program. Remember what I said; string here is just a way of saying char *. So c is actually just a pointer into this read-only section of memory; and it has only enough room to store the characters that you provided in the text (4, for abc and the null character terminating the string). fgets() takes as its first argument a place to store the string that it is reading, and its second the amount of space that it has. So you are trying to read up to 20 bytes, into a read-only buffer that only has room for 4.
You need to either allocate space for reading on the stack, using, for example:
char c[20];
Or dynamically, using malloc():
char *c = malloc(20);
First problem I see is this:
string word = strcat(argv[1], "\n");
You are adding characters to the end of a buffer here.
A buffer allocated for you by the runtime environment, that you should consider read only.
EDIT
I'm afraid your change to the code still has the same effect.
char* temp = argv[1];
Has temp pointing to the same buffer as argv[1].
You need to allocate a buffer the proper size, and use it.
char* temp = (char*)malloc(sizeof(char) * (strlen(argv[1]) + 2));
The +2 is for the adding \n and \0 at the end.
Then you do this:
strcpy(temp, argv[1]);
strcat(temp,"\n");
The code is rather flawed. Another one:
char* duh = fgets(c, 20, input);
Here you define a pointer to char, do not initialize it (hence it contains a random value) and then you write up to 20 bytes to the address pointed to by the random data. If you're lucky you just get a crash. If not, you overwrite some other important data. Fortunately most of the systems in use today won't let you access address space of another program, so the code wreaks havoc only on itself.
The line in question could look like:
#define BUFFERSIZE 1024
...
while (reasonable condition) {
    char *duh = malloc(BUFFERSIZE);
    if (NULL == duh) { /* not enough memory - handle error, and exit */
    }
    /* test the return value without overwriting duh, so free() below still works */
    if (NULL == fgets(duh, BUFFERSIZE, input)) { /* handle error or EOF condition */
    } else { /* check that the line is read completely,
                i.e. including end-of-line mark,
                then do your stuff with the data */
    }
    free(duh);
}
Of course, you can allocate the buffer only once (outside of the loop) and reuse it. The #define makes it easy to adjust the maximum buffer size.
Alternatively, on recent systems, you can use getline(), which is able to allocate a buffer of appropriate size for you. You must free() that buffer once you are done with it (typically after the loop, since getline() reuses the buffer between calls).
If you are on Linux/BSD, use man (e.g. man fgets) to get information on the functions, otherwise resort to internet or a decent book on C for documentation.
First, my C knowledge is old, so I'm not sure what a string is. Either way, it's helpful, but not absolutely required, to have a nice pre-zeroed buffer into which to read the contents of the file. So whether you zero word or do something like the following, zero the input first.
#define IN_BUF_LEN 120
char in_buf[IN_BUF_LEN] = {0};
120 characters is a safe size, assuming most of your text lines are around 80 characters or less long.
Second, you're basing your loop on the value of a strcmp rather than actually reading the file. It might accomplish the same thing, but I'd base my while on reaching end of file.
Third, you've declared duh a pointer, not a place to store what fgets returns. That's a problem, too. So, duh should be declared similarly to in_buf above.
Finally, assigning argv[1] to temp does not copy the string. If you declare temp as a pointer and then assign argv[1] to it, you'll just have another pointer to argv[1], but not actually have copied the value of argv[1] to a local variable. Why not just use argv[1]?
Write a program that takes nouns and forms their plurals on the basis of these rules:
a. If noun ends in “y” remove the “y” and add “ies”
b. If noun ends in “s”, “ch”, or “sh”, add “es”
c. In all other cases, just add “s”
Print each noun and its plural. Try the following data:
chair dairy boss circus fly dog church clue dish
This is what I've got so far but it just isn't quite functioning like it's supposed to:
#include<stdlib.h>
#include <Windows.h>
#include <stdio.h>
#include <string.h>
#define SIZE 8
char *replace_str(char *str, char *orig, char *rep)
{
static char buffer[4096];
char *p;
if(!(p = strstr(str, orig)))
return str;
strncpy(buffer, str, p-str);
buffer[p-str] = '\0';
sprintf(buffer+(p-str), "%s%s", rep, p+strlen(orig));
return buffer;
}
int main(void)
{
char plural[SIZE];
printf("Enter a noun: ");
scanf("%c",&plural);
bool noreplace = false;
puts(replace_str(plural, "s","es"));
puts(replace_str(plural, "sh","es"));
puts(replace_str(plural, "ch","es"));
puts(replace_str(plural, "y", "ies"));
if(noreplace) {
puts(replace_str(plural, "","s"));
}
system("pause");
return 0;
}
I haven't taken a C class in a while can anyone help me out?
Thanks.
For a start, scanf("%c") gets a single character, not a string. You should use fgets for that, along the lines of:
fgets (buffer, SIZE, stdin);
// Remove newline if there.
size_t sz = strlen(buffer);
if (sz > 0 && buffer[sz-1] == '\n') buffer[sz-1] = '\0';
Once you've fixed that, we can turn to the function which pluralises the words along with a decent test harness. Make sure you keep your own main (with a fixed input method) since there's a couple of things in this harness which will probably make your educator suspect it's not your code. I'm just including it for our testing purposes here.
Start with something like:
#include <stdio.h>
#include <string.h>
char *pluralise(char *str) {
static char buffer[4096];
strcpy (buffer, str);
return buffer;
}
int main(void) {
char *test[] = {
"chair", "dairy", "boss", "circus", "fly",
"dog", "church", "clue", "dish"
};
for (size_t i = 0; i < sizeof(test)/sizeof(*test); i++)
printf ("%-8s -> %s\n", test[i], pluralise(test[i]));
return 0;
}
This basically just gives you back exactly what you passed in but it's a good start:
chair -> chair
dairy -> dairy
boss -> boss
circus -> circus
fly -> fly
dog -> dog
church -> church
clue -> clue
dish -> dish
The next step is to understand how to detect a specific ending and how to copy and modify the string to suit. The string is an array of characters of the form:
  0   1   2   3   4   5
+---+---+---+---+---+---+
| c | h | a | i | r | $ |
+---+---+---+---+---+---+
where $ represents the null terminator \0. The numbers above give the offset from the start or the index that you can use to get a character from a particular position in that array. So str[3] will give you i.
Using that and the length of the string (strlen(str) will give you 5), you can check specific characters. You can also copy the characters to your target buffer and use a similar method to modify the end.
Like any good drug pusher, I'm going to give you the first hit for free :-)
char *pluralise(char *str) {
static char buffer[4096]; // Risky, see below.
size_t sz = strlen(str); // Get length.
if (sz >= 1 && str[sz-1] == 'y') { // Ends with 'y'?
strcpy(buffer, str); // Yes, copy whole buffer,
strcpy(&(buffer[sz-1]), "ies"); // overwrite final bit,
return buffer; // and return it.
}
strcpy(buffer, str); // If no rules matched,
strcat(buffer, "s"); // just add "s",
return buffer; // and return it.
}
Of particular interest there is the sequence:
strcpy(buffer, str);
strcpy(&(buffer[sz-1]), "ies");
The first line makes an exact copy of the string like:
  0   1   2   3   4   5
+---+---+---+---+---+---+
| d | a | i | r | y | $ |
+---+---+---+---+---+---+
The second line copies the "ies" string into the memory location of buffer[sz-1]. Since sz is 5, that would be offset 4, resulting in the following change:
  0   1   2   3   4   5   6   7
+---+---+---+---+---+---+
| d | a | i | r | y | $ |
+---+---+---+---+---+---+---+---+
                | i | e | s | $ |
                +---+---+---+---+
so that you end up with dairies.
From that, you should be able to use the same methods to detect the other string endings, and do similar copy/modify operations to correctly pluralise the strings.
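The ending check itself can be factored into a small helper so each rule reads the same way. This is only a sketch: ends_with is a hypothetical name (it is not a standard library function and not part of the code above), and it only detects the ending, it does not do the copy/modify step.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: returns true when str ends with suffix. */
bool ends_with(const char *str, const char *suffix)
{
    size_t len = strlen(str);
    size_t suffix_len = strlen(suffix);

    if (suffix_len > len)
        return false;               /* suffix is longer than the word */

    /* Compare the tail of str against the whole suffix. */
    return strcmp(str + (len - suffix_len), suffix) == 0;
}
```

With that, a test such as ends_with(str, "ch") plays the same role as the str[sz-1] == 'y' check in pluralise.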
Keep in mind that this is basic code meant to illustrate the concept, not necessarily hardened code that I would use in a production environment. For example, the declaration static char buffer[4096] has at least two problems that will occur under certain circumstances:
If your words are longer than about 4K in length, you'll get buffer overflow. However, even German, with its penchant for stringing basic words together in long sequences(a), doesn't have this problem :-) Still, it's something that should be catered for, if only to handle "invalid" input data.
Being static, the buffer will be shared amongst all threads calling this function if used in a multi-threaded environment. That's unlikely to end well as threads may corrupt the data of each other.
A relatively easy fix would be for the caller to also provide a buffer for the result, at least long enough to handle the largest possible expansion of a word to its plural form. But I've left that as a separate exercise since it's not really relevant to the question.
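As a rough sketch of that caller-supplied-buffer idea (the name pluralise_into, its parameters and its return convention are assumptions made for illustration, and only the 'y' rule and the default "s" rule are handled):

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical variant: writes the plural of str into out, which has room
 * for outsz bytes. Returns 0 on success, -1 if out is too small. */
int pluralise_into(const char *str, char *out, size_t outsz)
{
    size_t sz = strlen(str);

    if (sz >= 1 && str[sz - 1] == 'y') {
        /* Drop 'y', add "ies": (sz - 1) + 3 characters plus terminator. */
        if (sz + 3 > outsz)
            return -1;
        memcpy(out, str, sz - 1);       /* everything before the 'y' */
        strcpy(out + sz - 1, "ies");    /* new ending, with terminator */
        return 0;
    }

    /* Default rule: sz + 1 characters plus terminator. */
    if (sz + 2 > outsz)
        return -1;
    memcpy(out, str, sz);
    strcpy(out + sz, "s");
    return 0;
}
```

Because the caller owns the buffer, nothing is shared between threads, and an oversized word is reported through the return value instead of overflowing.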
(a) Such as Donaudampfschiffahrtselektrizitätenhauptbetriebswerkbauunterbeamtengesellschaft :-)
You are reading a word as:
scanf("%c",&plural);
which is incorrect as it only reads one character.
Change it to:
scanf("%s",plural);
or even better use fgets as:
fgets (plural,SIZE, stdin);
But note that fgets keeps the trailing newline in the buffer when the whole line fits. If it is there, you need to remove it before you do the replacement, because your replacement depends on the last character in the word.
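One compact way to drop that newline, if it is present, is strcspn from <string.h>. The wrapper name strip_newline below is hypothetical and only there to keep the sketch self-contained:

```c
#include <string.h>

/* Hypothetical helper: cut the string at the first '\n', if any. */
void strip_newline(char *s)
{
    /* strcspn returns the index of the first '\n', or the length of s
     * when there is none, so this assignment is always in bounds. */
    s[strcspn(s, "\n")] = '\0';
}
```

You would call strip_newline(plural) right after a successful fgets.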
Also, your replacement logic is incorrect. strstr finds the first s anywhere in the word, so that is the one that gets replaced with es (same with the other replacements). You need to replace the s only when it is the last character.
puts(replace_str(plural, "ch","es"));
Consider the input: church
strstr(3) will find the first ch, not the last ch. Oops.
Furthermore, once you modify replace_str() to find the last ch, you're still ripping it off and not putting it back on: chures. (Assuming your replace_str() functions as I think it does; that's some hairy code. :) So add the ch back on:
puts(replace_str(plural, "ch","ches"));
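Finding the last occurrence can be done by calling strstr repeatedly and remembering the most recent hit. A minimal sketch, where find_last is a hypothetical helper name and treating an empty needle as "not found" is a choice made here for simplicity:

```c
#include <string.h>

/* Hypothetical helper: pointer to the last occurrence of needle in
 * haystack, or NULL if it does not occur. */
char *find_last(const char *haystack, const char *needle)
{
    const char *last = NULL;
    const char *p = haystack;

    if (*needle == '\0')
        return NULL;                /* treat an empty needle as "not found" */

    /* Keep searching one character past each hit until strstr fails. */
    while ((p = strstr(p, needle)) != NULL) {
        last = p;
        p++;
    }
    return (char *)last;
}
```

For pluralising, the last occurrence still has to sit at the very end of the word, so the caller would also check that the character right after the match is the terminating '\0'.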
First of all, you need to find the position of the last occurrence and then call the replace_str() function.
Secondly, read the word with scanf("%s", plural); or use fgets().
Maybe this will help you:
str_replace
It's nicely done!