Issue With Comparing Strings In C - c

I'm trying to compare a string with another string and if they match I want the text "That is correct" to output but I can't seem to get it working.
Here is the code:
int main ()
{
char * password = "Torroc";
char * userInput;
printf("Please enter your password: ");
scanf("%s", userInput);
if (strcmp(password, userInput) == 0) {
printf("That is correct!");
}
}

In your code, the userInput pointer does not point to any storage that can hold the string scanf is about to write. You need to reserve space for the string (for example, an array on the stack) before you try to store anything in it. So...
You need to change the following code:
char * userInput;
to:
char userInput[200];
Here, 200 is just an arbitrary value. In your case, choose the maximum length of the string + 1 for the terminating \0.

When you enter characters you need to store the characters somewhere.
char* userInput;
is an uninitialized pointer.
So first you declare an array for your input
char userInput[128];
Now, when reading from the keyboard, you need to make sure the user cannot enter more than 127 characters (plus one for the \0), because anything longer would overwrite the stack. The best way to read from the keyboard is therefore fgets, which never stores more than the size you give it. It is also good to check the return value: fgets returns NULL on end-of-file or on a read error. (If the user simply presses ENTER without writing anything, fgets succeeds and stores just the newline.)
if (fgets(userInput, sizeof(userInput), stdin) != NULL) {
Now you have the string that the user entered plus the end of line character. To remove it you can do something like
char* p = strchr(userInput,'\n');
if ( p != NULL ) *p = '\0';
Now you can compare the strings
if (strcmp(password, userInput) == 0) {
puts("That is correct!");
}

When you think about "a string" in C, you should see it as an array of char's.
Let me use the identifier s instead of userInput for brevity:
0 1 2 3 4 5 9
+---+---+---+---+---+---+-- --+---+
s -> | p | i | p | p | o | \0| ... | |
+---+---+---+---+---+---+-- --+---+
is what
char s[10] = "pippo";
would create.
In other words, it's a block of memory where the first 6 bytes have been initialized as shown (the remaining bytes are set to 0). There is no separate pointer variable s stored anywhere; s is simply the name of the block.
Instead, declaring a char * like in
char *s;
would create a variable that can hold a pointer to char:
+------------+
s| 0xCF024408 | <-- represent an address
+------------+
If you think this way, you notice immediately that doing:
scanf("%s",s);
only makes sense in the first case, where there is (hopefully) enough memory to hold the string.
In the second case, the variable s holds an indeterminate address and you will end up writing something into an unknown memory area.
For completeness, in cases like:
char *s = "pippo";
you have the following situation in memory:
0 1 2 3 4 5
+---+---+---+---+---+---+ Somewhere in the
0x0B320080 | p | i | p | p | o | \0| <-- readonly portion
+---+---+---+---+---+---+ of memory
+------------+ a variable pointing
s| 0x0B320080 | <-- to the address where
+------------+ the string is
You can make s point somewhere else, but you can't change the content of the string pointed to by s.

Related

Why can't I copy the terminating null this way?

I wanted to copy string with the following code and it didn't copy the '\0'.
void copyString(char *to, char *from)
{
do{
*to++ = *from++;
}while(*from);
}
int main(void)
{
char to[50];
char from[] = "text2copy";
copyString(to, from);
printf("%s", to);
}
This is output to the code:
text2copyÇ■   ║kvu¡lvu
And every time I rerun the code, the character after text2copy changes, so while(*from) works fine but something random is copied instead of '\0'.
text2copyÖ■   ║kvu¡lvu
text2copy╨■   ║kvu¡lvu
text2copy╡■   ║kvu¡lvu
//etc
Why is this happening?
The problem is that you never copy the '\0' character at the end of the string. To see why consider this:
The string passed in is an array sized exactly to fit the data:
char from[] = "text2copy";
It looks like this in memory:
----+----+----+----+----+----+----+----+----+----+----+----
other memory | t | e | x | t | 2 | c | o | p | y | \0 | other memory
----+----+----+----+----+----+----+----+----+----+----+----
^
from
Now let's imagine that you have done the loop several times already and you are at the top of the loop and from is pointing to the 'y' character in text2copy:
----+----+----+----+----+----+----+----+----+----+----+----
other memory | t | e | x | t | 2 | c | o | p | y | \0 | other memory
----+----+----+----+----+----+----+----+----+----+----+----
^
from
The computer executes *to++ = *from++; which copies the 'y' character to to and then increments both to and from. Now the memory looks like this:
----+----+----+----+----+----+----+----+----+----+----+----
other memory | t | e | x | t | 2 | c | o | p | y | \0 | other memory
----+----+----+----+----+----+----+----+----+----+----+----
^
from
The computer executes } while(*from); and realizes that *from is false because it points to the '\0' character at the end of the string so the loop ends and the '\0' character is never copied.
Now you might think this would fix it:
void copyString(char *to, char *from)
{
do{
*to++ = *from++;
} while(*from);
*to = *from; // copy the \0 character
}
And it does copy the '\0' character, but there are still problems. The code is even more fundamentally flawed because, as @JonathanLeffler said in the comments, for the empty string you peek at the contents of memory that is after the end of the string, and because that memory was not allocated to you, accessing it causes undefined behaviour:
----+----+----
other memory | \0 | other memory
----+----+----
^
from
The computer executes *to++ = *from++; which copies the '\0' character to to and then increments both to and from which makes from point to memory you don't own:
----+----+----
other memory | \0 | other memory
----+----+----
^
from
Now the computer executes }while(*from); and accesses memory that isn't yours. You can point from anywhere with no problem, but dereferencing from when it points to memory that isn't yours is undefined behaviour.
The example I made in the comments suggests saving the value copied into a temporary variable:
void copyString(char *to, char *from)
{
int test;
do{
test = (*to++ = *from++); // save the value copied
} while(test);
}
The reason I suggested that particular way was to show you that the problem was WHAT you were testing and not about testing the loop condition afterwards. If you save the value copied and then test that saved value later the character gets copied before it is tested (so the \0 gets copied) and you don't read from the incremented pointer (so there is no undefined behaviour)
But the example @JonathanLeffler had in his comments is shorter, easier to understand, and more idiomatic. It does exactly the same thing without declaring a named temporary variable:
void copyString(char *to, char *from)
{
while ((*to++ = *from++) != '\0')
;
}
The code first copies the character and then tests the value that was copied (so the '\0' will be copied) but the incremented pointer is never dereferenced (so there is no undefined behaviour).
The posted code stops looping when it encounters the NUL byte, before copying it, rather than afterwards.
Regarding:
}while(*from);
Suggest following that line with:
*to = '\0';

Split a string based on contiguous delimiters

I'm looking to split a string based on a specific sequence of characters but only if they are in order.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int main()
{
int i = 0;
char **split;
char *tmp;
split = malloc(20 * sizeof(char *));
tmp = malloc(20 * 12 * sizeof(char));
for(i=0;i<20;i++)
{
split[i] = &tmp[12*i];
}
char *line;
line = malloc(50 * sizeof(char));
strcpy(line, "Test - Number -> <10.0>");
printf("%s\n", line);
i = 0;
while( (split[i] = strsep(&line, " ->")) != NULL)
{
printf("%s\n", split[i]);
i++;
}
}
This will print out:
Test
Number
<10.0
However I just want to split around the -> so it could give the output:
Test - Number
<10.0>
I think the best way to do the splits with an ordered sequence of delimiters is
to replicate strtok_r behaviour using strstr, like this:
#include <stdio.h>
#include <string.h>
char *substrtok_r(char *str, const char *substrdelim, char **saveptr)
{
char *haystack;
if(str)
haystack = str;
else
haystack = *saveptr;
char *found = strstr(haystack, substrdelim);
if(found == NULL)
{
*saveptr = haystack + strlen(haystack);
return *haystack ? haystack : NULL;
}
*found = 0;
*saveptr = found + strlen(substrdelim);
return haystack;
}
int main(void)
{
char line[] = "a -> b -> c -> d; Test - Number -> <10.0> ->No->split->here";
char *input = line;
char *token;
char *save;
while(token = substrtok_r(input, " ->", &save))
{
input = NULL;
printf("token: '%s'\n", token);
}
return 0;
}
This behaves like strtok_r but only splits when the substring is found. The
output of this is:
$ ./a
token: 'a'
token: ' b'
token: ' c'
token: ' d; Test - Number'
token: ' <10.0>'
token: 'No->split->here'
And like strtok and strtok_r, it requires that the source string is
modifiable, as it writes the '\0'-terminating byte for creating and returning
the tokens.
EDIT
Hi, would you mind explaining why '*found = 0' means the return value is only the string in-between delimiters. I don't really understand what is going on here or why it works. Thanks
The first thing you've got to understand is how strings work in C. A string is
just a sequence of bytes (characters) that ends with the '\0'-terminating
byte. I wrote bytes and characters in parentheses, because a character in C is
just a 1-byte value (on most systems a byte is 8 bits long) and the integer
values representing the characters are those defined in the ASCII code
table, which are 7-bit values. As you can see from the table, the
value 97 represents the character 'a', 98 represents 'b', etc. Writing
char x = 'a';
is the same as doing
char x = 97;
The value 0 is a special value for strings, it is called NUL (null character)
or '\0'-terminating byte. This value is used to tell the functions where a
string ends. A function like strlen that returns the length of a string, does
it by counting how many bytes it encounters until it encounters a byte with
the value 0.
That's why strings are stored using char arrays, because a pointer to an array
gives you the start of the memory block where the sequence of chars is stored.
Let's look at this:
char string[] = { 'H', 'e', 'l', 'l', 'o', 0, 48, 49, 50, 0 };
The memory layout for this array would be
0 1 2 3 4 5 6 7 8 9
+-----+-----+-----+-----+-----+----+-----+-----+-----+----+
| 'H' | 'e' | 'l' | 'l' | 'o' | \0 | '0' | '1' | '2' | \0 |
+-----+-----+-----+-----+-----+----+-----+-----+-----+----+
or to be more precise with the integer values
0 1 2 3 4 5 6 7 8 9
+----+-----+-----+-----+-----+---+----+----+----+---+
| 72 | 101 | 108 | 108 | 111 | 0 | 48 | 49 | 50 | 0 |
+----+-----+-----+-----+-----+---+----+----+----+---+
Note that the value 0 represents '\0', 48 represents '0', 49 represents
'1' and 50 represents '2'. If you do
printf("%zu\n", strlen(string));
the output will be 5. strlen will find the value 0 at index 5 and
stop counting. However, string stores two strings, because from index 6
on, a new sequence of characters starts that also terminates with 0, thus making it a
second valid string in the array. To access it, you would need a pointer
that points past the first 0 value.
printf("1. %s\n", string);
printf("2. %s\n", string + strlen(string) + 1);
The output would be
1. Hello
2. 012
This property is used in functions like strtok (and mine above) to return you
a substring from a larger string, without the need of creating a copy (that would be
creating a new array, dynamically allocating memory, using strcpy to create
the copy).
Assume you have this string:
char line[] = "This is a sentence;This is another one";
Here you have one string only, because the '\0'-terminating byte comes after
the last 'e' in the string. If I however do:
line[18] = 0; // same as line[18] = '\0';
then I created two strings in the same array:
"This is a sentence\0This is another one"
because I replaced the semicolon ';' with '\0', thus creating a new string
from position 0 to 18 and a second one from position 19 to 38. If I do now
printf("string: %s\n", line);
the output will be
string: This is a sentence
Now let's take a look at the function itself:
char *substrtok_r(char *str, const char *substrdelim, char **saveptr);
The first argument is the source string, the second argument is the delimiter
string and the third one is a double pointer to char. You have to pass a pointer
to a pointer to char. This will be used to remember where the function should
resume scanning next, more on that later.
This is the algorithm:
if str is not NULL:
    start a new scan sequence from str
otherwise:
    resume scanning from the string pointed to by *saveptr
find the position of the delimiter substring given by 'substrdelim'
if no delimiter substring is found:
    if the current character of the scanned text is \0:
        no more substrings to return --> return NULL
    otherwise:
        return the scanned text and set *saveptr to
        point to the \0 character of the scanned text,
        so that the next iteration ends the scanning
        by returning NULL
otherwise (a delimiter substring was found):
    create a new token that ends where the delimiter begins
    by setting the first character of the found
    delimiter to 0.
    update *saveptr to the start of the found delimiter
    plus its length, so that *saveptr
    points past the delimiter sequence.
    return the newly created token
This first part is easy to understand:
if(str)
haystack = str;
else
haystack = *saveptr;
Here if str is not NULL, you want to start a new scan sequence. That's why
in main the input pointer is set to point to the start of the string saved
in line. Every other iteration must be called with str == NULL, that's
why the first thing done in the while loop is to set input = NULL; so
that substrtok_r resumes scanning using *saveptr. This is the standard
behaviour of strtok.
The next step is to look for a delimiting substring:
char *found = strstr(haystack, substrdelim);
The next part handles the case where no delimiting substring is
found (see footnote 2):
if(found == NULL)
{
*saveptr = haystack + strlen(haystack);
return *haystack ? haystack : NULL;
}
*saveptr is updated to point past the whole source, so that it points to the
'\0'-terminating byte. The return line can be rewritten as
if(*haystack == '\0')
return NULL;
else
return haystack;
which says: if the source is already an empty string (see footnote 1), then return
NULL. This means no more substrings are found, so stop calling the function. This
is also standard behaviour of strtok.
The last part
*found = 0;
*saveptr = found + strlen(substrdelim);
return haystack;
handles the case when a delimiting substring is found. Here
*found = 0;
is basically doing
found[0] = '\0';
which creates substrings as explained above. To make it clear once again:
Before
*found = 0;
*saveptr = found + strlen(substrdelim);
return haystack;
the memory looks like this:
+-----+-----+-----+-----+-----+-----+
| 'a' | ' ' | '-' | '>' | ' ' | 'b' | ...
+-----+-----+-----+-----+-----+-----+
^ ^
| |
haystack found
*saveptr
After
*found = 0;
*saveptr = found + strlen(substrdelim);
the memory looks like this:
+-----+------+-----+-----+-----+-----+
| 'a' | '\0' | '-' | '>' | ' ' | 'b' | ...
+-----+------+-----+-----+-----+-----+
^ ^ ^
| | |
haystack found *saveptr
because strlen(substrdelim)
is 3
Remember, if I do printf("%s\n", haystack); at this point, because the ' ' at
found has been set to 0, it will print a. *found = 0 created two strings out
of one, as explained above. strtok (and my function, which is based on
strtok) uses the same technique. So when the function does
return haystack;
the string returned in token will be the text before the split. Eventually
substrtok_r returns NULL and the loop exits, because substrtok_r returns
NULL when no more splits can be created, just like strtok.
Footnotes
1. An empty string is a string where the first character is already the
'\0'-terminating byte.
2. This is a very important part. Most of the standard functions in the C
library, like strstr, will not return you a new string in memory and will
not create a copy and return a copy (unless the documentation says so). They
will return you a pointer pointing into the original, at some offset.
On success strstr will return you a pointer to the start of the substring,
this pointer will be at an offset to the source.
const char *txt = "abcdef";
char *p = strstr(txt, "cd");
Here strstr will return a pointer to the start of the substring "cd" in
"abcdef". To get the offset you do p - txt which returns how many bytes
there are between them
b = base address where txt is pointing to
b b+1 b+2 b+3 b+4 b+5 b+6
+-----+-----+-----+-----+-----+-----+------+
| 'a' | 'b' | 'c' | 'd' | 'e' | 'f' | '\0' |
+-----+-----+-----+-----+-----+-----+------+
^ ^
| |
txt p
So txt points to address b, p points to address b+2. That's why you get
the offset by doing p-txt which would be (b+2) - b => 2. So p points to
the original address plus the offset of 2 bytes. Because of this behaviour
things like *found = 0; work in the first place.
Note that doing things like txt + 2 will return you a new pointer pointing to
where txt points plus the offset of 2. This is called pointer arithmetic.
It's like regular arithmetic but here the compiler takes the size of an object
into consideration. char is a type that is defined to have the size of 1,
hence sizeof(char) returns 1. But let's say you have an array of integers:
int arr[] = { 7, 2, 1, 5 };
On my system an int has size of 4, so an int object needs 4 bytes in memory.
This array looks like this in memory:
b = base address where arr is stored
address base base + 4 base + 8 base + 12
in bytes +-----------+-----------+-----------+-----------+
| 7 | 2 | 1 | 5 |
+-----------+-----------+-----------+-----------+
pointer arr arr + 1 arr + 2 arr + 3
arithmetic
Here arr + 1 returns you a pointer pointing to where arr is stored plus an
offset of 4 bytes.

Recursive addition code in C

The following code works but I don't quite understand how if (*s == 0) works.
It checks if the string is 0?
Also for return(isnumber(s+1)) what is the logic behind that?
I know s is a string but I can just pass s+1 into a function? How does it even know what character I'm looking for?
int isnumber(char *s) {
if (*s == 0) {
return 1; /* Reached end, we've only seen digits so far! */
}
if(!isdigit(*s)) {
printf("The number is invalid\n");
return 0; /* first character is not a digit, so no go */
}
return(isnumber(s+1));
}
int main () {
char inbuf[LENGTH];
int i, j;
printf("Enter a string > ");
fgets(inbuf, LENGTH-1, stdin); // ignore carriage return
inbuf[strlen(inbuf)-1] = 0;
j = isnumber(inbuf);
....
}
This function is a recursive function that checks if a string contains all numbers. To understand how the code works, you must understand how C stores strings. If you have the string "123", C stores this string in memory, like this:
|-----------------------------------|
| 0x8707 | 0x8708 | 0x8709 | 0x870A |
|--------|--------|--------|--------|
| | | | |
| '1' | '2' | '3' | '\0' |
|-----------------------------------|
What C does is it breaks your string up into characters, stores them in some arbitrary location in memory and adds a null character (\0) (ASCII 0) to the end of the string. This null character is how C knows where the string ends.
Your isnumber() function takes a char *s as a parameter. This is called a pointer. Internally, what's going on is your main() function calls isnumber() and it actually passes in the address of your string, not the string itself. This is important:
j = isnumber(inbuf);
How the compiler interprets this is call isnumber() and pass along the address of inbuf and assign the return value to j.
Now back up at the isnumber() function, it's receiving the address of inbuf and assigning it to s. By placing an asterisk (*) in front of s, you are doing something called dereferencing s. Dereferencing means you want the value stored at the address held in s. So the line that says if (*s == 0) is basically saying If the value stored at the address held in s is equal to 0. Remember earlier I told you in memory, strings always have a terminating null (\0) character? This is how your function knows to end and return.
The next thing to understand is pointer arithmetic. By definition, a char always occupies exactly 1 byte (sizeof(char) is 1 on every system). When you refer to (s+1), you are telling the computer to take the memory address pointed to by s and add the size of one char to it. So if s points to 0x8707, then (s+1) is 0x8708. s itself is not changed, but inside the recursive call the parameter s holds 0x8708, and *s there is the '2' in our string (see my memory block diagram above). This is how we iterate through each character in the string.
Hopefully this clears up the confusion!
The statement if (*s == 0) checks to see if the char s points to is zero. In other words, it checks to see if s is a zero-length string and returns 1 if so.
The statement return (isnumber(s+1)) computes s + 1, which points to the second char in the string, and passes that to isnumber(). That call returns true if the rest of the string, starting at s[1], consists only of digits.
In C, strings are terminated with a null character.
(*s == 0) is checking for the null terminator.
This code is a little weirder.
return(isnumber(s+1));
Since the current character is a digit, keep going...call the function again starting at the NEXT character. This is a recursive function call and there is really no need when iteration would be simpler.

strstr() to search two different strings in the same line

I am trying to search two different strings in a line using strstr.
sBuffer = "This is app test"
s1= strstr (sBuffer, "This");
s2= strstr (sBuffer, "test");
printf("%s\n", s1); //prints - This is app test
printf("%s\n", s2); //prints - test
if (s1 && s2)
//do something
Expected output for s1 should be the string "This" but it is printing the entire string for s1.
s2 however is printed correctly.
Any help appreciated.
EDIT: Although all the answers are correct (upvoted all answers), I am accepting dasblinkenlight's answer. This is because I realize checking the boolean condition as shown below would suffice my requirement. Thanks for all the answers.
if ( (strstr (sBuffer, "This")) && (strstr (sBuffer, "test")) )
//do something
You're not understanding what the function does.
It gives you the address in sBuffer ("the haystack") where the search string ("the needle") has been found. It doesn't modify the haystack string, so it won't terminate the sub-string.
You have:
+---+---+---+---+---+---+---+--+---+---+---+---+---+---+---+---+----+
sBuffer: | T | h | i | s | | i | s | | a | p | p | | t | e | s | t | \0 |
+---+---+---+---+---+---+---+--+---+---+---+---+---+---+---+---+----+
^ ^
| |
| |
strstr(sBuffer, "This") strstr(sBuffer, "test")
As you can see, strstr(sBuffer, "This") will simply return sBuffer, which of course still contains the rest of the characters; it's the same memory buffer.
If you need to extract the sub-string that you found, you must do so yourself. A suitable function to use is strlcpy() if you have it; otherwise strncpy() or memcpy() will work since you know the exact length of the data to copy, as long as you write the terminating '\0' yourself.
The return value of strstr is the pointer to the original, unmodified, string at the point of the match. The reason why the second call displays test is a coincidence: test simply happens to be at the end of the searched string. Had the sBuffer been "This is app test of strstr", the output for the second call would be test of strstr, not simply test.
To fix this, you can change your program like this:
printf("%s\n", s1 ? "This" : "");
printf("%s\n", s2 ? "test" : "");
The reason this works is that you know that the only case when strstr would return a non-null pointer is when it finds the exact match to what you've been searching for. If all you need is a boolean "found/not found" flag, you can simply test s1 and s2 for NULL. You are using this trick already in your final if statement.
strstr() returns a pointer to the first character of the substring it found. It doesn't NUL-terminate the string after the searched substring, this is the expected and correct behavior.
As to the solution: if you have a non-const string, you can simply modify it so that it's NUL-terminated at the correct position (but then beware of the modifications you made). If not, then make a copy of the substring.
const char *haystack = "abcd efgh ijkl";
const char *needle = "efgh";
const char *p = strstr(haystack, needle);
if (p) {
size_t l = strlen(needle);
char buf[l + 1];
memcpy(buf, p, l);
buf[l] = 0;
printf("%s\n", buf);
}

Reading file in C

I'm reading a file in my C program and comparing every word in it with my word, which is entered via command line argument. But I get crashes, and I can't understand what's wrong. How do I track such errors? What is wrong in my case?
My compiler is clang. The code compiles fine. When running it says 'segmentation fault'.
Here is the code.
#include <stdio.h>
#include <string.h>
int main(int argc, char* argv[])
{
char* temp = argv[1];
char* word = strcat(temp, "\n");
char* c = "abc";
FILE *input = fopen("/usr/share/dict/words", "r");
while (strcmp(word, c))
{
char* duh = fgets(c, 20, input);
printf("%s", duh);
}
if (!strcmp (word, c))
{
printf("FOUND IT!\n");
printf("%s\n%s", word, c);
}
fclose(input);
}
The issue here is that you are trying to treat strings in C as you might in another language (like C++ or Java), in which they are resizable vectors that you can easily append or read an arbitrary amount of data into.
C strings are much lower level. They are simply an array of characters (or a pointer to such an array; arrays can be treated like pointers to their first element in C anyhow), and the string is treated as all of the characters within that array up to the first null character. These arrays are fixed size; if you want a string of an arbitrary size, you need to allocate it yourself using malloc(), or allocate it on the stack with the size that you would like.
One thing here that is a little confusing is you are using a non-standard type string. Given the context, I'm assuming that's coming from your cs50.h, and is just a typedef to char *. It will probably reduce confusion if you actually use char * instead of string; using a typedef obscures what's really going on.
Let's start with the first problem.
string word = strcat(argv[1], "\n");
strcat() appends the second string onto the first; it starts from the null terminator of the first string, and replaces that with the first character of the second string, and so on, until it reaches a null in the second string. In order for this to work, the buffer containing the first string needs to have enough room to fit the second one. If it does not, you may overwrite arbitrary other memory, which could cause your program to crash or have all kinds of other unexpected behavior.
Here's an illustration. Let's say that argv[1] contains the word hello, and the buffer has exactly as much space as it needs for this. After it is some other data; I've filled in other for the sake of example, though it won't actually be that, it could be anything, and it may or may not be important:
+---+---+---+---+---+---+---+---+---+---+---+---+
| h | e | l | l | o | \0| o | t | h | e | r | \0|
+---+---+---+---+---+---+---+---+---+---+---+---+
Now if you use strcat() to append "\n", you will get:
+---+---+---+---+---+---+---+---+---+---+---+---+
| h | e | l | l | o | \n| \0| t | h | e | r | \0|
+---+---+---+---+---+---+---+---+---+---+---+---+
You can see that we've overwritten the other data that was after hello. This may cause all kinds of problems. To fix this, you need to copy your argv[1] into a new string, that has enough room for it plus one more character (and don't forget the trailing null). You can call strlen() to get the length of the string, then add 1 for the \n, and one for the trailing null, to get the length that you need.
Actually, instead of trying to add a \n to the word you get in from the command line, I would recommend stripping off the \n from your input words, or using strncmp() to compare all but the last character (the \n). In general, it's best in C to avoid appending strings, as appending strings means you need to allocate memory and copy things around, and it can be easy to make mistakes doing so, as well as being inefficient. Higher level languages usually take care of the details for you, making it easier to append strings, though still just as inefficient.
After your edit, you changed this to:
char* temp = argv[1];
char* word = strcat(temp, "\n");
However, this has the same problem. A char * is a pointer to a character array. Your temp variable is just copying the pointer, not the actual value; it is still pointing to the same buffer. Here's an illustration; I'm making up addresses for the purposes of demonstration, in the real machine there will be more objects in between these things, but this should suffice for the purpose of demonstration.
+------------+---------+-------+
| name | address | value |
+------------+---------+-------+
| argv | 1000 | 1004 |-------+
| argv[0] | 1004 | 1008 | --+ <-+
| argv[1] | 1006 | 1016 | --|---+
| argv[0][0] | 1008 | 'm' | <-+ |
| argv[0][1] | 1009 | 'y' | |
| argv[0][2] | 1010 | 'p' | |
| argv[0][3] | 1011 | 'r' | |
| argv[0][4] | 1012 | 'o' | |
| argv[0][5] | 1013 | 'g' | |
| argv[0][6] | 1014 | 0 | |
| argv[1][0] | 1016 | 'w' | <-+ <-+
| argv[1][1] | 1017 | 'o' | |
| argv[1][2] | 1018 | 'r' | |
| argv[1][3] | 1019 | 'd' | |
| argv[1][4] | 1020 | 0 | |
+------------+---------+-------+ |
Now when you create your temp variable, all you are doing is copying argv[1] into a new char *:
+------------+---------+-------+ |
| name | address | value | |
+------------+---------+-------+ |
| temp | 1024 | 1016 | --+
+------------+---------+-------+
As a side note, you also shouldn't ever try to access argv[1] without checking that argc is greater than 1. If someone doesn't pass any arguments in, then argv[1] itself is invalid to access.
I'll move on to the next problem.
string c = "abc";
// ...
char* duh = fgets(c, 20, input);
Here, you are referring to the static string "abc". A string that appears literally in the source, like "abc", goes into a special, read-only part of the memory of the program. Remember what I said; string here is just a way of saying char *. So c is actually just a pointer into this read-only section of memory; and it has only enough room to store the characters that you provided in the text (4, for abc and the null character terminating the string). fgets() takes as its first argument a place to store the string that it is reading, and its second the amount of space that it has. So you are trying to read up to 20 bytes, into a read-only buffer that only has room for 4.
You need to either allocate space for reading on the stack, using, for example:
char c[20];
Or dynamically, using malloc():
char *c = malloc(20);
First problem I see is this:
string word = strcat(argv[1], "\n");
You are adding characters to the end of a buffer here.
A buffer allocated for you by the runtime environment, with no guaranteed spare room at the end, so for this purpose you should consider it read only.
EDIT
I'm afraid your change to the code still has the same effect.
char* temp = argv[1];
Has temp pointing to the same buffer as argv[1].
You need to allocate a buffer the proper size, and use it.
char* temp = (char*)malloc(sizeof(char) * (strlen(argv[1]) + 2));
The +2 is for the adding \n and \0 at the end.
Then you do this:
strcpy(temp, argv[1]);
strcat(temp,"\n");
The code is rather flawed. Another one:
char* duh = fgets(c, 20, input);
Here you define a pointer to char, do not initialize it (hence it contains a random value) and then you write up to 20 bytes to the address pointed to by the random data. If you're lucky you just get a crash. If not, you overwrite some other important data. Fortunately most of the systems in use today won't let you access the address space of another program, so the code wreaks havoc only on itself.
The line in question could look like:
#define BUFFERSIZE 1024
...
while (reasonable condition) {
char *duh = malloc(BUFFERSIZE);
if (NULL == duh) { /* not enough memory - handle error, and exit */
}
/* test the return value directly: assigning it to duh would
lose the buffer address (and leak the buffer) when fgets returns NULL */
if (NULL == fgets(duh, BUFFERSIZE, input)) { /* handle error or EOF condition */
} else { /* check that the line is read completely,
i.e. including end-of-line mark,
then do your stuff with the data */
}
free (duh);
}
Of course, you can allocate the buffer only once (outside of the loop) and reuse it. The #define makes it easy to adjust the maximum buffer size.
Alternatively, on recent systems, you can use getline() (a POSIX function), which is able to allocate a buffer of appropriate size for you. You must free() that buffer once you are done with it.
If you are on Linux/BSD, use man (e.g. man fgets) to get information on the functions, otherwise resort to internet or a decent book on C for documentation.
First, my C knowledge is old, so I'm not sure what a string is. Either way, it's helpful, but not absolutely required, to have a nice pre-zeroed buffer in which to read the contents of the file. So whether you zero word or do something like the following, zero the input first.
#define IN_BUF_LEN 120
char in_buf[IN_BUF_LEN] = {0};
120 characters is a safe size, assuming most of your text lines are around 80 characters or less long.
Second, you're basing your loop on the value of a strcmp rather than on actually reading the file. It might accomplish the same thing, but I'd base my while on reaching end of file, that is, on fgets returning NULL.
Third, duh is only a pointer that receives what fgets returns (the buffer address, or NULL at end of file). The line itself is stored in the buffer you pass as the first argument, so that buffer must be a real array, declared similarly to in_buf above.
Finally, assigning argv[1] to temp does not copy the string. If you declare temp as a pointer and then assign argv[1] to it, you'll just have another pointer to argv[1], but not actually have copied the value of argv[1] to a local variable. Why not just use argv[1]?

Resources