Looping until a specific string is found - c

I have a simple question. I want to write a program in C that scans the lines of a specific file, and if the only phrase on the line is "Atoms", I want it to stop scanning and report which line it was on. This is what I have and is not compiling because apparently I'm comparing an integer to a pointer: (of course "string.h" is included.
char dm;
int test;
test = fscanf(inp,"%s", &dm);
while (test != EOF) {
if (dm=="Amit") {
printf("Found \"Atoms\" on line %d", j);
break;
}
j++;
}
the file was already opened with:
inp = fopen( .. )
And checked to make sure it opens correctly...
I would like to use a different approach though, and was wondering if it could work. Instead of scanning individual strings, could I scan entire lines as such:
// char tt[200];
//
// fgets(tt, 200, inp);
and do something like:
if (tt[] == "Atoms") break;
Thanks!
Amit

Without paying too much attention to your actual code here, the most important mistake your making is that the == operator will NOT compare two strings.
In C, a string is an array of characters, which is simply a pointer. So doing if("abcde" == some_string) will never be true unless they point to the same string!
You want to use a method like "strcmp(char *a, char *b)" which will return 0 if two strings are equal and something else if they're not. "strncmp(char *a, char *b, size_t n)" will compare the first "n" characters in a and b, and return 0 if they're equal, which is good for looking at the beginning of strings (to see if a string starts with a certain set of characters)
You also should NOT be passing a character as the pointer for %s in your fscanf! This will cause it to completely destroy your stack it tries to put many characters into ch, which only has space for a single character! As James says, you want to do something like char ch[BUFSIZE] where BUFSIZE is 1 larger than you ever expect a single line to be, then do "fscanf(inp, "%s", ch);"
Hope that helps!

please be aware that dm is a single char, while you need a char *
more: if (dm=="Amit") is wrong, change it in
if (strcmp(dm, "Amit") == 0)

In the line using fscanf, you are casting a string to the address of a char. Using the %s in fscanf should set the string to a pointer, not an address:
char *dm;
test = fscanf(inp,"%s", dm);
The * symbol declares an indirection, namely, the variable pointed to by dm. The fscanf line will declare dm as a reference to the string captured with the %s delimiter. It will point to the address of the first char in the string.
What kit said is correct too, the strcmp command should be used, not the == compare, as == will just compare the addresses of the strings.
Edit: What kit says below is correct. All pointers should be allocated memory before they are used, or else should be cast to a pre-allocated memory space. You can allocate memory like this:
dm = (char*)malloc(sizeof(char) * STRING_LENGTH);
where STRING_LENGTH is a maximum length of a possible string. This memory allocation only has to be done once.

The problem is you've declared 'dm' as a char, not a malloc'd char* or char[BUFSIZE]
http://www.cplusplus.com/reference/clibrary/cstdio/fscanf/
You'll also probably report incorrect line numbers, you'll need to scan the read-in buffer for '\n' occurences, and handle the case where your desired string lies across buffer boundaries.

Related

What happens with extra memory using fscanf?

I'm new to C and I have a couple questions about fscanf. I wrote a simple program that reads the contents of a file and spits it back out on the command line:
#include <stdio.h>
#include <stdlib.h>
int main (int argc, char* argv[1])
{
if (argc != 2)
{
printf("Usage: fscanf txt\n");
return 1;
}
char* txt = argv[1];
FILE* fp = fopen(txt, "r");
if (fp == NULL)
{
printf("Could not open %s.\n", txt);
return 2;
}
char s[50];
while (fscanf(fp, "%49s", s) == 1)
printf("%s\n", s);
return 0;
}
Let's say the contents of my text file is just "C is cool.", which will output:
C
is
cool.
So I have two questions here:
1) Does fscanf assume that the placeholder "%s" will be a single word (an array of chars only)? According to this program's output, spaces and line breaks seem to prompt the function to return. But what if I wanted to read a whole paragraph? Would I use fread() instead?
2) More importantly I'm wondering what happens with all of the unused space in the array. On the first iteration, I think s[0] = "C" and s[1] = "\0", so are s[2] - s[49] just wasted?
EDIT: while (fscanf(fp, "%**49**s", s) == 1) - thanks to #M Oehm for pointing this out - enforcing strong limit here to prevent dangerous buffer overflows
1) Does fscanf assume that the placeholder "%s" will be a single word
(an array of chars only)? According to this program's output, spaces
and line breaks seem to prompt the function to return. But what if I
wanted to read a whole paragraph? Would I use fread() instead?
The %s specifier reads single words that are delimited by white space. The scanf family of functions are very cerude; they do not normally distinguish between line breaks and spaces, for example.
A line is anything up to the next newline. There is no concept of paragraph, but you might consider anything between blank lines a paragraph. The function to read lines of text is fgets, so you could read lines until you find an empty one. (fgets retains the newline at the end, mind.)
fread is a function for reading binary data. It is not useful for reading structured texts. (But it can be used to read the contents of a whole text file at once.)
2) More importantly I'm wondering what happens with all of the unused
space in the array. On the first iteration, I think c[0] = 'C' and
c[1] = '\0', so are c[2] - c[49] just wasted?
You are right, the data after the null ternimator isn't used. "Wasted" is too negative – with user input you don't know whether you encounter a longer word eventually. Because dynamic allocation requires some care in C, allocating "enogh for most cases" is a goopd practice in C. You should enforce the hard limit when reading, though, to prevent buffer overruns:
fscanf(fp, "%49s", s)
The issue of "wasted" memory becomes more serious if you have an array of arrays of 50 chars. Most of the words will be much shorter than 50 chars. Here, the extra memory might eventually hurt you. 48 extra characters for reading a line are okay, though.
(A strategy to save "compact" arrays of chars is to have a running array of chars that is a concatenation of all strings, including their terminators. The word array is then an array of piointers into that master string.)
You use specifier %s which will read and store data in array s until it encounters a space or newline . As soon as it encounters space fscanf returns.
I think c[0] = "C" and c[1] = "\0", so are c[2] - c[49] just wasted?
Yes , s[0]='C' and s[1]='\0' and you probably can't do anything about the size of array being much more.
If you want complete string "C is cool" stored in array use fgets.
#define len 1000
char s[len];
while(fgets(s,len,fp)!=NULL) {
//your code
}

C - fscanf to read two ints and then strings

I need to store in a text file two integers and then the lines of a text. i've successfully done it by writing each int in a line and each line of the text in a new line too. In order to read it, however, I've found some troubles. I'm doing this:
FILE *f = fopen(arquivo, "r");
char *lna = NULL;
fscanf(f, "%d\n%d\n", &maxCol, &maxLin);
//↑This reads the two ints, works fine in step-by-step
for (;;) {
fscanf(f, "%s\n", &lna);
//↑This sets lna to NULL always, even if there are more lines
if (lna != NULL)
lna[strlen(lna) - 1] = '\0';
if (feof(f))
break;
inserirApos(lista, lna, &atual);
}
fclose(f);
I've tryied a few different ways, but they never worked. I understand I can read everthing like strings, with gets or something, but I think that has a problem if the string contains spaces. I wanted to know if the way I'm doing is the best, and what's wrong with it. I've found one of these methods (that didn't work either) that you have to pass the maximum length of each line. I know this information if necessary, it's the maxCol I read before.
fscanf(f, "%s\n", &lna);
Is the wrong argument type. The %s format expects a char* as argument, but you gave it a char**. And you have not allocated memory to that pointer. fscanf expects a char* pointing to a large enough memory area.
char *lna = malloc(whatever_you_need);
...
fscanf("%s ", lna);
(no difference between the '\n' and ' ' in the fscanf format. both consume the entire whitespace following the string of non-whitespace characters scanned int lna.)
You seem to be expecting fscanf() to dynamically allocate strings for you; that's not at all how it works. This is undefined behavior.
You need to allocate the space for lna first.
char *lna = malloc(MAX_SIZE);//MAX_SIZE is the maximum size the string can be + 1
The additional arguments should point to already allocated objects of the type specified by their corresponding format specifier within the format string.

Proper use of strcat() in C

I have an archive file that looks like this:
!<arch>
file1.txt/ 1350248044 45503 13036 100660 28 `
hello
this is sample file 1
Now in here, the number 28 in the header is the file1.txt size. To get that number, I use:
int curr_char;
char file_size[10];
int int_file_size;
curr_char = fgetc(arch_file);
while(curr_char != ' '){
strcat(file_size, &curr_char);
curr_char = fgetc(arch_file);
}
// Convert the characters to the corresponding integer value using atoi()
int_file_size = atoi(file_size);
However, values in the file_size array change every time I run my code. Sometimes it's correct, but mostly not. Here are some examples of what I get for file_size:
?28`U
2U8U
28 <--- Correct!
pAi?28
I believe the problem is with my strcat() function, but not sure. Any help would be appreciated.
You shouldn't read the file character wise. There are higher level functions doing this. As larsmans already pointed out, you can use fscanf() for this task:
fscanf(arch_file, "%d", &int_file_size);
&curr_char is an int*, so you're copying over the bits of an int as if they represented a string.
You should be using scanf.
The expression &curr_char points to a single character (well, actually an integer as that's how you declared it). strcat looks for a string, and string as you should know are terminated by a '\0' character. So what strcat does in your case is use the &curr_char pointer as the address of a string and looks for the terminator. Since that is not found weird stuff will happen.
One way of solving this is to make curr_char an array, initialized to zero (the string terminator character) and read into the first entry:
char curr_char[2] = { '\0' }; /* Will make all character in array be zero */
...
curr_char[0] = fgetc(...);
There is also another problem, and that is that you are trying to concatenate into a string that is not initialized. When running your program, the array file_size can contain any data, it's not automatically zeroed out. This leads to the weird characters before the number. This is solved partially the same way as the above problem, by initializing the array:
char file_size[10] = { '\0' };

Pointer mystery/noobish issue

I am originally a Java programmer who is now struggling with C and specifically C's pointers.
The idea on my mind is to receive a string, from the user, on a command line, into a character pointer. I then want to access its individual elements. The idea is later to devise a function that will reverse the elements' order. (I want to work with anagrams in texts.)
My code is
#include <stdio.h>
char *string;
int main(void)
{
printf("Enter a string: ");
scanf("%s\n",string);
putchar(*string);
int i;
for (i=0; i<3;i++)
{
string--;
}
putchar(*string);
}
(Sorry, Code marking doesn't work).
What I am trying to do is to have a first shot at accessing individual elements. If the string is "Santillana" and the pointer is set at the very beginning (after scanf()), the content *string ought to be an S. If unbeknownst to me the pointer should happen to be set at the '\0' after scanf(), backing up a few steps (string-- repeated) ought to produce something in the way of a character with *string. Both these putchar()'s, though, produce a Segmentation fault.
I am doing something fundamentally wrong and something fundamental has escaped me. I would be eternally grateful for any advice about my shortcomings, most of all of any tips of books/resources where these particular problems are illuminated. Two thick C books and the reference manual have proved useless as far as this.
You haven't allocated space for the string. You'll need something like:
char string[1024];
You also should not be decrementing the variable string. If it is an array, you can't do that.
You could simply do:
putchar(string[i]);
Or you can use a pointer (to the proposed array):
char *str = string;
for (i = 0; i < 3; i++)
str++;
putchar(*str);
But you could shorten that loop to:
str += 3;
or simply write:
putchar(*(str+3));
Etc.
You should check that scanf() is successful. You should limit the size of the input string to avoid buffer (stack) overflows:
if (scanf("%1023s", string) != 1)
...something went wrong — probably EOF without any data...
Note that %s skips leading white space, and then reads characters up to the next white space (a simple definition of 'word'). Adding the newline to the format string makes little difference. You could consider "%1023[^\n]\n" instead; that looks for up to 1023 non-newlines followed by a newline.
You should start off avoiding global variables. Sometimes, they're necessary, but not in this example.
On a side note, using scanf(3) is bad practice. You may want to look into fgets(3) or similar functions that avoid common pitfalls that are associated with scanf(3).

Checking contents of char variable - C Programming

This might seem like a very simple question, but I am struggling with it. I have been writing iPhone apps with Objective C for a few months now, but decided to learn C Programming to give myself a better grounding.
In Objective-C if I had a UILabel called 'label1' which contained some text, and I wanted to run some instructions based on that text then it might be something like;
if (label1.text == #"Hello, World!")
{
NSLog(#"This statement is true");
}
else {
NSLog(#"Uh Oh, an error has occurred");
}
I have written a VERY simple C Program I have written which uses printf() to ask for some input then uses scanf() to accept some input from the user, so something like this;
int main()
{
char[3] decision;
Printf("Hi, welcome to the introduction program. Are you ready to answer some questions? (Answer yes or no)");
scanf("%s", &decision);
}
What I wanted to do is apply an if statement to say if the user entered yes then continue with more questions, else print out a line of text saying thanks.
After using the scanf() function I am capturing the users input and assigning it to the variable 'decision' so that should now equal yes or no. So I assumed I could do something like this;
if (decision == yes)
{
printf("Ok, let's continue with the questions");
}
else
{
printf("Ok, thank you for your time. Have a nice day.");
}
That brings up an error of "use of undeclared identifier yes". I have also tried;
if (decision == "yes")
Which brings up "result of comparison against a string literal is unspecified"
I have tried seeing if it works by counting the number of characters so have put;
if (decision > 3)
But get "Ordered comparison between pointer and integer 'Char and int'"
And I have also tried this to check the size of the variable, if it is greater than 2 characters it must be a yes;
if (sizeof (decision > 2))
I appreciate this is probably something simple or trivial I am overlooking but any help would be great, thanks.
Daniel Haviv's answer told you what you should do. I wanted to explain why the things you tried didn't work:
if (decision == yes)
There is no identifier 'yes', so this isn't legal.
if (decision == "yes")
Here, "yes" is a string literal which evaluates to a pointer to its first character. This compares 'decision' to a pointer for equivalence. If it were legal, it would be true if they both pointed to the same place, which is not what you want. In fact, if you do this:
if ("yes" == "yes")
The behavior is undefined. They will both point to the same place if the implementation collapses identical string literals to the same memory location, which it may or may not do. So that's definitely not what you want.
if (sizeof (decision > 2))
I assume you meant:
if( sizeof(decision) > 2 )
The 'sizeof' operator evaluates at compile time, not run time. And it's independent of what's stored. The sizeof decision is 3 because you defined it to hold three characters. So this doesn't test anything useful.
As mentioned in the other answer, C has the 'strcmp' operator to compare two strings. You could also write your own code to compare them character by character if you wanted to. C++ has much better ways to do this, including string classes.
Here's an example of how you might do that:
int StringCompare(const char *s1, const char *s2)
{ // returns 0 if the strings are equivalent, 1 if they're not
while( (*s1!=0) && (*s2!=0) )
{ // loop until either string runs out
if(*s1!=*s2) return 1; // check if they match
s1++; // skip to next character
s2++;
}
if( (*s1==0) && (*s2==0) ) // did both strings run out at the same length?
return 0;
return 1; // one is longer than the other
}
You should use strcmp:
if(strcmp(decision, "yes") == 0)
{
/* ... */
}
You should be especially careful with null-terminated string in C programming. It is not object. It is a pointer to a memory address. So you can't compare content of decision directly with a constant string "yes" which is at another address. Use strcmp() instead.
And be careful that "yes" is actually "yes\0" which will take 4 bytes and the "\0" is very important to strcmp() which will be recognized as the termination during the comparison loop.
Ok a few things:
decision needs to be an array of 4 chars in order to fit the string "yes" in it. That's because in C, the end of a string is indicated by the NUL char ('\0'). So your char array will look like: { 'y', 'e', 's', '\0' }.
Strings are compared using functions such as strcmp, which compare the contents of the string (char array), and not the location/pointer. A return value of 0 indicates that the two strings match.
With: scanf("%s", &decision);, you don't need to use the address-of operator, the label of an array is the address of the start of the array.
You use strlen to get the length of a string, which will just increment a counter until it reaches the NUL char, '\0'. You don't use sizeof to check the length of strings, it's a compile-time operation which will return the value 3 * sizeof(char) for a char[3].
scanf is unsafe to use with strings, you should alternatively use fgets(stdin...), or include a width specifier in the format string (such as "3%s") in order to prevent overflowing your buffer. Note that if you use fgets, take into account it'll store the newline char '\n' if it reads a whole line of text.
To compare you could use strcmp like this:
if(strcmp(decision, "yes") == 0) {
// decision is equal to 'yes'
}
Also you should change char decision[3] into char decision[4] so that the buffer has
room for a terminating null character.
char decision[4] = {0}; // initialize to 0
There's several issues here:
You haven't allocated enough storage for the answer:
char[3] decision;
C strings are bytes in the string followed by an ASCII NUL byte: 0x00, \0. You have only allocated enough space for ye\0 at this point. (Well, scanf(3) will give you yes\0 and place that NUL in unrelated memory. C can be cruel.) Amend that to include space for the terminating \0 and amend your scanf(3) call to prevent the buffer overflow:
char[4] decision;
/* ... */
scanf("%3s", decision);
(I've left off the &, because simply giving the name of the array is the same as giving the address of its first element. It doesn't matter, but I believe this is more idiomatic.)
C strings cannot be compared with ==. Use strcmp(3) or strncmp(3) or strcasecmp(3) or strncasecmp(3) to compare your strings:
if(strcasecmp(decision, "yes") == 0) {
/* yes */
}
C has lots of lib functions to handle this but it pays to know what you are declaring.
Declaring
char[3] decision;
is actually declaring a char array of length 3. So therefor attempting a comparison of
if(decision == "yes")
is comparing a literal against and array and therefor will not work. Since there is no defined string type in C you have to use pointers, but not directly, if you don't want to. In C strings are in fact arrays of char so you can declare them both ways eg:
char[3] decision ;
* char decision ;
Both will in point of fact work but you in the first instance the compiler will allocate the memory for you, but it will ONLY allocate 3 bytes. Now since strings in C are null terminated you need to actually allocate 4 bytes since you need room for "yes" and the null. Declaring it the second way simply declares a pointer to someplace in memory but you have no idea really where. You would then have to allocate memory to contain whatever you are going to put there since to do otherwise will more then likely cause a SEGFAULT.
To compare what you get from input you have two options, either use the strcomp() function or do it yourself by iterating through decision and comparing each individual byte against "Y" and "E" and "S" until you hit null aka \0.
There are variations on strcomp() to deal with uppercase and lowercase and they are part of the standard string.h library.

Resources