C strings string comparisons always result in false - c

I am trying to find a string in a file. I wrote the following by modifying the code snippet in the man page of getline.
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void)
{
FILE * fp;
char * line = NULL;
char *fixed_str = "testline4";
size_t len = 0;
ssize_t read;
fp = fopen("test.txt", "r");
if (fp == NULL)
exit(EXIT_FAILURE);
while ((read = getline(&line, &len, fp)) != -1) {
printf("Retrieved line of length %zd:\n", read);
printf("%s", line);
if (strcmp(fixed_str,line)==0)
printf("the match is found\n");
}
//printf("the len of string is %zu\n", strlen(fixed_str));
fclose(fp);
if (line)
free(line);
exit(EXIT_SUCCESS);
}
The problem is that strcmp never reports a match, even though getline successfully and correctly iterates over all lines in the file.
The length of fixed_str is 9 and that of the equal string in the file is 10 due to the newline character (am I right?). But comparing 9 chars with strncmp still produces the wrong result. I also ruled out differences in case and spaces, so I think I am doing something very wrong.
The test.txt is as below
test line1
test line2
test line3
testline4
string1
string2
string3
first name
I tried all entries but no success
NOTE: In my actual program I have to read fixed_str from another file

From the getline() man page (my emphasis):
getline() reads an entire line from stream, storing the address of
the buffer containing the text into *lineptr. The buffer is null-
terminated and includes the newline character, if one was found.
Your fixed_str has no newline.
Strip any newline character thus (for example):
char* nl = strrchr( line, '\n' ) ;
if(nl != NULL) *nl = '\0' ;
Or more efficiently since getline() returns the line length (in read in your case):
if(line[read - 1] == '\n' ) line[read - 1] = '\0' ;
Adding a '\n' to fixed_str may seem simpler, but is not a good idea because the last (or only) line in a file won't have one but may otherwise be a match.
Using strncmp() as described in your question should have worked, but without seeing the attempt it is hard to say what went wrong. It is in any case a flawed solution, since it would match all of the following, for example:
testline4
testline4 and some more
testline4 12345.
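To make the difference concrete, here is a minimal sketch contrasting a strncmp() prefix test with a whole-line test that tolerates one trailing newline. The helper names starts_with and equals_ignoring_newline are made up for this illustration and are not part of the question's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* True if line begins with key: this is all strncmp() can tell us. */
static bool starts_with(const char *line, const char *key)
{
    assert(line != NULL && key != NULL);
    return strncmp(line, key, strlen(key)) == 0;
}

/* True only if line is exactly key, optionally followed by one '\n'. */
static bool equals_ignoring_newline(const char *line, const char *key)
{
    size_t n = strlen(key);

    /* Once the prefix matches, line[n] is safe to read: it is either
       the terminator or the first character after the prefix. */
    return starts_with(line, key) &&
           (line[n] == '\0' || (line[n] == '\n' && line[n + 1] == '\0'));
}
```

With key "testline4", starts_with accepts all three lines listed above, while equals_ignoring_newline accepts only the first.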
Where fixed_str is taken from console or file input rather than a constant, the input method and data source may cause problems, as may the possibility of alternate line-end conventions. To make it more robust you might do:
// Strip any LF or CR+LF line end from fixed_str
char* line_end = strpbrk( fixed_str, "\r\n" ) ;
if( line_end != NULL ) *line_end = '\0' ;
// Strip any LF or CR+LF line end from line
line_end = strpbrk( line, "\r\n" ) ;
if( line_end != NULL ) *line_end = '\0' ;
Or the simpler (i.e. better) solution pointed out by @AndrewHenle:
// Strip any LF or CR+LF line end from fixed_str
fixed_str[strcspn(fixed_str, "\r\n")] = '\0';
// Strip any LF or CR+LF line end from line
line[strcspn(line, "\r\n")] = '\0';
That way either input can be compared regardless of whether its lines end in nothing, LF, CR or CR+LF, and the line end may even differ between the two inputs.
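Put together, the strcspn() approach might be wrapped up as below. This is only a sketch: strip_eol and same_line are hypothetical helper names, and both arguments are assumed to be writable buffers (not string literals), since they are modified in place.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Truncate s at the first CR or LF, if any; returns the new length. */
static size_t strip_eol(char *s)
{
    assert(s != NULL);
    size_t n = strcspn(s, "\r\n");
    s[n] = '\0';
    return n;
}

/* Compare two writable buffers, ignoring their line endings.
   Both buffers are modified (their line ends are stripped). */
static bool same_line(char *a, char *b)
{
    strip_eol(a);
    strip_eol(b);
    return strcmp(a, b) == 0;
}
```

In the question's loop this would be called as same_line(line, key), where key is a char array holding the search string read from the other file.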

Related

Is there a quick way to get the last element that was put in an array?

I use fgets to read a line from stdin and save it in a char array. I would like to get the last letter of the line I wrote, which should be in the array just before the \n and \0.
For example, if I have a char line[10] and write 1stLine on the terminal, is there a fast way to get the letter e rather than just cycling to it?
I saw this post How do I print the last element of an array in c but I think it doesn't work for me, even if I just create the array without filling it with fgets , sizeof line is already 10 because the array already has something in it
I know it's not java and I can't just .giveMeLastItem(), but I wonder if there is a smarter way than to cycle until the char before the \n to get the last letter I wrote
code is something like
char command[6];
fgets(command,6,stdin);
If you know the sentinel value, ex: \0 (or \n ,or any value for that matter), and you want the value of the element immediately preceding to that, you can
use strchr() to find out the position of the sentinel and
get the address of retPtr-1 and dereference to get the value you want.
There are many different ways to inspect the line read by fgets():
first you should check the return value of fgets(): a return value of NULL means either the end of file was reached or some sort of error occurred and the contents of the target array is undefined. It is also advisable to use a longer array.
char command[80];
if (fgets(command, sizeof command, stdin) == NULL) {
// end of file or read error
return -1;
}
you can count the number of characters with len = strlen(command), and if this length is not zero(*), command[len - 1] is the last character read from the file, which should be a '\n' if the whole line fit in the array (fewer than 79 bytes before the newline for command[80]). Stripping the newline requires a test:
size_t len = strlen(command);
if (len > 0 && command[len - 1] == '\n')
command[--len] = '\0';
you can use strchr() to locate the newline, if present, with char *p = strchr(command, '\n'); If a newline is present, you can strip it this way:
char *p = strchr(command, '\n');
if (p != NULL)
*p = '\0';
you can also count the number of characters not in the set "\n" with pos = strcspn(command, "\n"). pos will be the index of the newline or of the null terminator. Hence you can strip the trailing newline with:
command[strcspn(command, "\n")] = '\0'; // strip the newline if any
you can also write a simple loop:
char *p = command;
while (*p && *p != '\n')
p++;
*p = '\0'; // strip the newline if any
(*) strlen(command) can return 0 if the file contains an embedded null character at the beginning of a line. The null byte is treated like an ordinary character by fgets(), which continues reading bytes into the array until either size - 1 bytes have been read or a newline has been read.
Once you have only the array, there is no other way to do this. You could use strlen(line) and then get the last characters position based on this index, but this basically does exactly the same (loop over the array).
char lastChar = line[strlen(line)-1];
This has time-complexity of O(n), where n is the input length.
You can change the input method to character-by-character input and count the length or store the last character read. Every O(1) method like this still spends O(n) time beforehand (n times O(1), once for every character you read). But unless you really have to optimize for speed (and you don't when you work with user input), you should just scan the array with a function like strlen(line) (and store the result if you use it multiple times).
EDIT:
The strchr() function Sourav Ghosh mentioned, does exactly the same, but you can/must specify the termination character.
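For illustration, the character-by-character variant mentioned above could look roughly like this. It is a sketch only; read_last_char is a hypothetical helper, and it still visits every character once, so the total work remains O(n).

```c
#include <assert.h>
#include <stdio.h>

/* Consume one line from fp character by character.
   Returns the last character before the newline, or EOF if the line
   was empty or the stream was already at end-of-file.
   If len_out is not NULL, it receives the line length without '\n'. */
static int read_last_char(FILE *fp, size_t *len_out)
{
    assert(fp != NULL);
    int c;
    int last = EOF;
    size_t len = 0;

    while ((c = getc(fp)) != EOF && c != '\n') {
        last = c;   /* remember the most recent character */
        len++;
    }
    if (len_out != NULL)
        *len_out = len;
    return last;
}
```

Note that an empty line and end-of-file both yield EOF here; a caller that needs to tell them apart can check feof(fp) afterwards.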
A straightforward approach can look like this
char last_letter = command[ strcspn( command, "\n" ) - 1 ];
provided that the string is neither empty nor just the newline character '\n'.
Here is a demonstrative program.
#include <stdio.h>
#include <string.h>
int main(void)
{
enum { N = 10 };
char command[N];
while ( fgets( command, N, stdin ) && command[0] != '\n' )
{
char last_letter = command[ strcspn( command, "\n" ) - 1 ];
printf( "%c ", last_letter );
}
putchar( '\n' );
return 0;
}
If you enter the following sequence of strings
Is
there
a
quick
way
to
get
the
last
element
that
was
put
in
an
array?
then the output will be
s e a k y o t e t t t s t n n ?
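The one-liner above indexes position -1 when the string is empty or starts with '\n'. A guarded variant might look like the sketch below, where last_char is a made-up helper name and -1 is an arbitrary "no character" return value chosen for this example.

```c
#include <assert.h>
#include <string.h>

/* Return the last character before the first '\n' (or before the
   terminator if there is no newline), or -1 if there is none. */
static int last_char(const char *s)
{
    assert(s != NULL);
    size_t n = strcspn(s, "\n");   /* index of '\n' or of '\0' */
    return n == 0 ? -1 : (unsigned char)s[n - 1];
}
```

For the question's input, last_char("1stLine\n") yields 'e'.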
The fastest way is to keep an array of references like this:
long ref[]
and ref[x] to contain the file offset of the last character of the xth line. Having this reference saved at the beginning of the file you will do something like:
fseek(n*sizeof(long))
long ref = read_long()
fseek(ref)
read_char()
I think this is the fastest way to read the last character at the end of the nth line.
I did a quick test of the three mentioned methods of reading a line from a stream and measuring its length. I read /usr/share/dict/words 100 times and measured with clock()/1000:
fgets + strlen = 420
getc = 510
fscanf with " %100[^\n]%n" = 940
This makes sense as fgets and strlen just do 2 calls, getc does a call per character, and fscanf may do one call but has a lot of machinery to set up for processing complex formats, so a lot more overhead. Note the added space in the fscanf format to skip the newline left from the previous line.
Besides the other good examples:
Another way is to use fscanf()/scanf() with the %n format specifier, which stores in its argument the number of characters read so far once the string has been input.
Then you subtract one from this number and use it as an index into command:
char command[6];
int n = 0;
if (fscanf(stdin, "%5[^\n]" "%n", command, &n) != 1)
{
fputs("Error at input!", stderr);
// error routine.
}
getchar();
if (n != 0)
{
char last_letter = command[n-1];
}
#include <stdio.h>
int main (void)
{
char command[6];
int n = 0;
if (fscanf(stdin, "%5[^\n]" "%n", command, &n) != 1)
{
fputs("Error at input!", stderr);
// error routine.
}
getchar();
if (n != 0)
{
char last_letter = command[n-1];
putchar(last_letter);
}
return 0;
}
Execution:
./a.out
hello
o

How can I skip an entire comment line by using fgets even after the set maximum nr of chars

I'm reading lines from a file and I might have a comment anywhere throughout it of any size.
while (fgets(line, 100, myFile))
{
// skip and print comment
if (line[0] == '#') printf("Comment is = %s", line);
else {...}
}
The code does what it is supposed to until it gets a comment which is over 100 characters. In that case it no longer detects the # in the later chunks and does not skip them. How can I solve this?
You could introduce a state variable to tell the program that you are on comment
mode. Like this:
// mode == 0 --> normal
// mode == 1 --> comment, remove/ignore comments
int mode = 0;
char line[100];
while(fgets(line, sizeof line, myFile))
{
char *newline = strchr(line, '\n');
if(mode == 1)
{
if(newline)
mode = 0; // restore normal mode
continue; // ignore read characters
}
char *comment = strchr(line, '#');
if(comment)
{
*comment = '\0';
if(newline == NULL)
mode = 1; // set comment mode
}
// process your line without the comment
}
If a comment is found, strchr returns a pointer to that location. Setting it
to '\0' allows you to process the line without a comment. If the comment is
larger than line can hold, then the newline character is not found. In that case
you have to skip the next read bytes of fgets, until you find a newline.
That's when the mode variable comes in handy, you set it to 1 so the next
iterations can ignore the line if a newline is not found.
fgets itself will tell you when it did not read the entire line:
The fgets() function reads bytes from the stream into the array pointed to by s, until n-1 bytes are read, or a newline character is read and transferred to s, or an end-of-file condition is encountered. The string is then terminated with a null byte.
(https://docs.oracle.com/cd/E36784_01/html/E36874/fgets-3c.html)
So if your just-read line starts with # but does not end with \n (and fgets does not indicate EOF), read and skip all next 'lines' until you either find the end of the current 'line' indicated by a terminating \n, or you encounter an EOF condition.
If you want to store the comment for later display (as you are doing), use malloc and realloc to create and enlarge memory for the comment itself. Do not forget to free it after you're done.
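As a sketch of the "keep skipping until the newline" idea, the function below drops every line whose first character is '#', however long it is, while reading through a deliberately small buffer. The name drop_comment_lines and the 16-byte chunk size are assumptions made for this example; it follows the question's rule (a comment starts in column 0) rather than stripping a '#' that appears mid-line.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Copy in to out, dropping every line whose first character is '#'.
   Returns the number of comment lines dropped. */
static size_t drop_comment_lines(FILE *in, FILE *out)
{
    assert(in != NULL && out != NULL);
    char chunk[16];              /* small on purpose: lines span chunks */
    bool at_line_start = true;   /* does the next chunk begin a line? */
    bool in_comment = false;     /* are we inside a comment line? */
    size_t dropped = 0;

    while (fgets(chunk, sizeof chunk, in) != NULL) {
        if (at_line_start) {
            /* Only the first chunk of a line decides comment or not. */
            in_comment = (chunk[0] == '#');
            if (in_comment)
                dropped++;
        }
        if (!in_comment)
            fputs(chunk, out);
        /* A chunk without '\n' means the line continues in the next one. */
        at_line_start = (strchr(chunk, '\n') != NULL);
    }
    return dropped;
}
```

Because the comment decision is made only at the start of a line, a '#' that happens to land at the start of a later chunk of a long ordinary line is not mistaken for a comment.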

Copying a desired string from a text file in C

I have read all the text from a desired file and it is now stored in buff. I want to copy just the string content after identifier strings such as 'Title'.
Example file below:
"Title: I$_D$-V$_{DS}$ Characteristic Curves (Device 1)
MDate: 2016-03-01
XLabel: Drain voltage V$_{DS}$
YLabel: Drain current I$_D$
CLabel: V$_{GS}$
XUnit: V
... "
for(;;) {
size_t n = fread(buff, 1 , DATAHOLD, inFile);
subString = strstr( buff, "Title");
if( subString != NULL) {
strcpy(graph1.title , (subString + 7));
subString = NULL;
}
....more if statements....
if( n < DATAHOLD) {
break;
}
}
I understand that strstr() returns a pointer to location of the search string, I added 7 to get just the text that comes after the search string and this part works fine. The problem is strcpy() copies the rest of buff character array into graph1.title.
How to instruct strcpy() to only copy the text on the same line as the substring pointer? Using strtok() maybe?
I agree with ChuckCottrill, it would be better if you read and process one line at a time.
Also since the file you are dealing with is a text file, you could be opening it in text mode.
FILE *fin = fopen("filename", "r");
Read a line with fgets() into a string str. Note that fgets() will store the trailing '\n' in str.
fgets(str, sizeof(str), fin);
char *substring;
if( (substring = strstr(str, "Title: ")) != NULL )
{
strcpy(graph1.title, substring+strlen("Title: "));
}
At this point, graph1.title will have I$_D$-V$_{DS}$ Characteristic Curves (Device 1) in it, followed by the trailing '\n' that fgets() stored unless you strip it first.
Read and process a single line at a time.
for( ; fgets(line,...); ) {
do stuff on line
}
You could use another strstr to get the position of the end of the line, and then use strncpy which is like strcpy, but accepts a third argument, the number of chars to copy of the input.
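One possible way to copy only the rest of the line is sketched below, using strcspn() to find the line end and memcpy() for a bounded copy. copy_field is a hypothetical helper, and the destination size is passed in because the size of graph1.title is not shown in the question.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Find key in buff and copy the text that follows it, up to the end
   of that line, into dst (at most dstsize - 1 characters).
   Returns false if key does not occur in buff. */
static bool copy_field(const char *buff, const char *key,
                       char *dst, size_t dstsize)
{
    assert(buff != NULL && key != NULL && dst != NULL && dstsize > 0);
    const char *p = strstr(buff, key);
    if (p == NULL)
        return false;

    p += strlen(key);                 /* skip the identifier itself */
    size_t n = strcspn(p, "\r\n");    /* length up to the line end */
    if (n >= dstsize)
        n = dstsize - 1;              /* truncate rather than overflow */
    memcpy(dst, p, n);
    dst[n] = '\0';
    return true;
}
```

A call such as copy_field(buff, "Title: ", graph1.title, sizeof graph1.title) assumes that buff is null-terminated and that title is a char array; fread() does not terminate the buffer, so that would have to be done by the caller.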

fscanf() how to go in the next line?

So I have a wall of text in a file and I need to recognize some words that are between the $ sign and call them as numbers then print the modified text in another file along with what the numbers correspond to.
Also lines are not defined and columns should be max 80 characters.
Ex:
I $like$ cats.
I [1] cats.
[1] --> like
That's what I did:
#include <stdio.h>
#include <stdlib.h>
#define N 80
#define MAX 9999
int main()
{
FILE *fp;
int i=0,count=0;
char matr[MAX][N];
if((fp = fopen("text.txt","r")) == NULL){
printf("Error.");
exit(EXIT_FAILURE);
}
while((fscanf(fp,"%s",matr[i])) != EOF){
printf("%s ",matr[i]);
if(matr[i] == '\0')
printf("\n");
//I was thinking maybe to find two $ but Idk how to replace the entire word
/*
if(matr[i] == '$')
count++;
if(count == 2){
...code...
}
*/
i++;
}
fclose(fp);
return 0;
}
My problem is that fscanf doesn't recognize '\0' so it doesn't go in the next line when I print the array..also I don't know how to replace $word$ with a number.
Not only will fscanf("%s") read one whitespace-delimited string at a time, it will also eat all whitespace between those strings, including line terminators. If you want to reproduce the input whitespace in the output, as your example suggests you do, then you need a different approach.
Also lines are not defined and columns should be max 80 characters.
I take that to mean the number of lines is not known in advance, and that it is acceptable to assume that no line will contain more than 80 characters (not counting any line terminator).
When you say
My problem is that fscanf doesn't recognize '\0' so it doesn't go in the next line when I print the array
I suppose you're talking about this code:
char matr[MAX][N];
/* ... */
if(matr[i] == '\0')
Given that declaration for matr, the given condition will always evaluate to false, regardless of any other consideration. fscanf() does not factor in at all. The type of matr[i] is char[N], an array of N elements of type char. That evaluates to a pointer to the first element of the array, which pointer will never be NULL. It looks like you're trying to determine when to write a newline, but nothing remotely resembling this approach can do that.
I suggest you start by taking @Barmar's advice to read line-by-line via fgets(). That might look like so:
char line[N+2]; /* N + 2 leaves space for both newline and string terminator */
if (fgets(line, sizeof(line), fp) != NULL) {
/* one line read; handle it ... */
} else {
/* handle end-of-file or I/O error */
}
Then for each line you read, parse out the "$word$" tokens by whatever means you like, and output the needed results (everything but the $-delimited tokens verbatim; the bracket substitution number for each token). Of course, you'll need to memorialize the substitution tokens for later output. Remember to make copies of those, as the buffer will be overwritten on each read (if done as I suggest above).
fscanf() does recognize '\0', under select circumstances, but that is not the issue here.
Code needs to detect '\n'. fscanf(fp,"%s"... will not do that. The first thing "%s" directs is to consume (and not save) any leading white-space including '\n'. Read a line of text with fgets().
Simply read one line at a time. Then march down the buffer looking for words.
Following uses "%n" to track how far in the buffer scanning stopped.
// more room for \n \0
#define BUF_SIZE (N + 1 + 1)
char buffer[BUF_SIZE];
while (fgets(buffer, sizeof buffer, stdin) != NULL) {
char *p = buffer;
char word[sizeof buffer];
int n;
while (sscanf(p, "%s%n", word, &n) == 1) {
// do something with word
if (strcmp(word, "$zero$") == 0) fputs("0", stdout);
else if (strcmp(word, "$one$") == 0) fputs("1", stdout);
else fputs(word, stdout);
fputc(' ', stdout);
p += n;
}
fputc('\n', stdout);
}
Use fread() to read the file contents into a char[] buffer. Then iterate through this buffer, and whenever you find a $, use strncmp to determine which value replaces the word (keep in mind that there is a second $ at the end of the word). To replace $word$ with a number you need to either shrink or extend the buffer at the position of the word, depending on the length of the number in ASCII form (memmove is normally the right tool for this). Then write the number into the gap that this opens up, overwriting $word$ as well.
Then write the buffer to the file, overwriting all its previous contents.
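As an alternative to shifting text inside one buffer, the line can be rebuilt into a second buffer while the $word$ tokens are collected for the legend. The sketch below takes that route; number_tokens, MAX_TOKENS and MAX_WORD are names and limits invented for this example, and the numbering format [k] follows the question's sample output.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_TOKENS 16   /* hypothetical limit on collected words */
#define MAX_WORD   32   /* hypothetical limit on one word's length */

/* Rewrite src into dst, replacing each $word$ with [k], and store the
   words in words[]. count is the number of words collected so far;
   the updated count is returned. */
static int number_tokens(const char *src, char *dst, size_t dstsize,
                         char words[][MAX_WORD], int count)
{
    assert(src != NULL && dst != NULL && dstsize > 0);
    size_t used = 0;
    dst[0] = '\0';

    while (*src != '\0') {
        const char *open = strchr(src, '$');
        const char *close = open ? strchr(open + 1, '$') : NULL;
        if (close == NULL || count >= MAX_TOKENS) {
            /* No complete token left: copy the rest verbatim. */
            snprintf(dst + used, dstsize - used, "%s", src);
            break;
        }
        /* Text before the token, then the bracketed number. */
        int n = snprintf(dst + used, dstsize - used, "%.*s[%d]",
                         (int)(open - src), src, count + 1);
        if (n < 0 || (size_t)n >= dstsize - used)
            break;                      /* dst is full: stop here */
        used += (size_t)n;
        /* Remember the word between the two '$' signs. */
        snprintf(words[count], MAX_WORD, "%.*s",
                 (int)(close - open - 1), open + 1);
        count++;
        src = close + 1;
    }
    return count;
}
```

Calling this once per line read with fgets(), and passing the running count back in, keeps the numbering continuous across lines; the legend is then printed from words[] at the end.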

Why does opendir() work for one string but not another?

I would like to open a directory using opendir but am seeing something unexpected. opendir works for the string returned from getcwd but not the string from my helper function read_cwd, even though the strings appear to be equal.
If I print the strings, both print /Users/gwg/x, which is the current working directory.
Here is my code:
char real_cwd[255];
getcwd(real_cwd, sizeof(real_cwd));
/* This reads a virtual working directory from a file */
char virt_cwd[255];
read_cwd(virt_cwd);
/* This prints "1" */
printf("%d\n", strcmp(real_cwd, virt_cwd) != 0);
/* This works for real_cwd but not virt_cwd */
DIR *d = opendir(/* real_cwd | virt_cwd */);
Here is the code for read_cwd:
char *read_cwd(char *cwd_buff)
{
FILE *f = fopen(X_PATH_FILE, "r");
fgets(cwd_buff, 80, f);
printf("Read cwd %s\n", cwd_buff);
fclose(f);
return cwd_buff;
}
The function fgets includes the final newline in the buffer, so the second string is actually "/Users/gwg/x\n".
The simplest (but not necessarily the cleanest) way to solve this issue is to overwrite the newline with a '\0': add the following at the end of the function read_cwd:
size_t n = strlen(cwd_buff);
if(n > 0 && cwd_buff[n - 1] == '\n')
cwd_buff[n - 1] = '\0';
fgets() includes the newline.
Parsing stops if end-of-file occurs or a newline character is found, in which case str will contain that newline character. (http://en.cppreference.com/w/c/io/fgets)
You should trim the white space on both ends of the string when reading input like this.
From the fgets man page:
fgets() reads in at most one less than size characters from stream and
stores them into the buffer pointed to by s. Reading stops after an
EOF or a newline. If a newline is read, it is stored into the buffer.
A terminating null byte ('\0') is stored after the last character in
the buffer.
You need to remove the newline character from the string you are reading in.
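If you want to trim whitespace on both ends rather than only the newline, an in-place helper might look like the sketch below. trim is a hypothetical name, and the approach assumes the stored path does not legitimately begin or end with whitespace.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Remove leading and trailing whitespace from s in place; returns s. */
static char *trim(char *s)
{
    assert(s != NULL);
    char *start = s;
    while (isspace((unsigned char)*start))
        start++;

    size_t len = strlen(start);
    while (len > 0 && isspace((unsigned char)start[len - 1]))
        len--;

    memmove(s, start, len);   /* regions may overlap, so not memcpy */
    s[len] = '\0';
    return s;
}
```

In read_cwd this would be applied to cwd_buff after a successful fgets(), before the buffer is handed to opendir.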

Resources