skip to next line of file ignoring content - c

Hi, I have a program where, if there is a # at the beginning of the first line of the text file, that line needs to be ignored. How do you jump to the next line of the file, ignoring everything after the #?
for example:
#1234
5
I want to print 5 and the rest to be ignored.
I only managed to skip the # if there is nothing behind it
while (a == '#' || a == '\r'|| a == '\n') {
fscanf(inp, "%c", &a);
}

As for your previous question, if you want to ignore comment lines with an initial #, it is highly recommended to read the file line by line with fgets(), handling non-comment lines directly and ignoring comment lines.
It is actually nontrivial to do this with fscanf() because, depending on your format strings, the linefeed may or may not have been consumed.
If you are at the start of a line and want to read the next char while ignoring the comment lines, do this:
int c; // Must be int to accommodate for EOF.
while ((c = getc(inp)) == '#') {
while ((c = getc(inp)) != EOF && c != '\n')
continue;
}
// Here c contains the first char from a non comment line or EOF.
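The line-by-line approach recommended above could look roughly like this. It is only a sketch: next_data_line is a made-up name, and it assumes no line is longer than the buffer, since an over-long comment line would be split and its tail treated as data.

```c
#include <stdio.h>

/* Hypothetical helper: reads the next line that does not start with '#'
   into buf. Returns buf, or NULL at end of file or on a read error.
   Assumes every line fits in size - 1 characters. */
char *next_data_line(char *buf, int size, FILE *inp)
{
    while (fgets(buf, size, inp) != NULL) {
        if (buf[0] != '#')
            return buf;   /* first non-comment line */
    }
    return NULL;          /* nothing left to read */
}
```

Call it wherever a plain fgets() was used before, and parse the returned line with sscanf() or similar.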

Instead of
while (a == '#' || a == '\r'|| a == '\n') {
fscanf(inp,"%c",&a);
}
Try (pseudo code):
If FirstChar == '#'
Loop/scan until '\n'
On nextline here
If you want to use fscanf().
If better performance is needed, work on buffers directly.
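If you do stay with fscanf(), the pseudo code above might translate into something like the following. Treat it as a sketch: skip_comment_lines is a name made up for illustration, and it assumes the stream is positioned at the start of a line when it is called.

```c
#include <stdio.h>

/* Hypothetical helper: while the next line starts with '#', discard it.
   Leaves the stream at the first character of the first non-comment line. */
void skip_comment_lines(FILE *inp)
{
    int c;  /* int so that EOF can be represented */

    while ((c = getc(inp)) == '#') {
        /* %*[^\n] reads and discards everything up to, but not including, '\n' */
        if (fscanf(inp, "%*[^\n]") == EOF)
            return;          /* the file ended right after the '#' */
        getc(inp);           /* consume the newline itself */
    }
    if (c != EOF)
        ungetc(c, inp);      /* put back the first non-comment character */
}
```

After the call, the usual fscanf(inp, "%d", &value) picks up the 5 in the example input.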

Related

C Code for Deleting a Line

I am a C noob, and I was trying to make a program to delete a specific line. For this, I chose to copy the contents of the source file, skipping the line intended for deletion. In my original code, I wrote:
while(read_char = fgetc(fp) != '\n') //code to move the cursor position to end of line
{
printf("%c",read_char); //temporary code to see the skipped characters
}
which gave me lots of smileys.
In the end I found the code which gave the intended output:
read_char=fgetc(fp);
while(read_char != '\n') //code to move the cursor position to end of line
{
printf("%c",read_char); //temporary code to see the skipped characters
read_char=fgetc(fp);
}
But what is the actual difference between these two codes?
Assignment has lower precedence than not-equal, so:
read_char = fgetc(fp) != '\n'
results in read_char getting a 0 or 1, the result of comparing the result of the fgetc() call against '\n'.
You need parentheses:
while((read_char = fgetc(fp)) != '\n')
which will assign the fgetc() result to read_char before comparing with '\n'.
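For illustration, here is a small sketch of the parenthesised form wrapped in a function. The name skip_rest_of_line is made up, and the extra EOF test is an addition that is not in the question's code: without it, a last line with no trailing newline would make the loop spin forever.

```c
#include <stdio.h>

/* Hypothetical helper: skips to the end of the current line and returns
   the number of characters skipped (not counting the newline). */
long skip_rest_of_line(FILE *fp)
{
    int ch;            /* int, not char, so EOF can be told apart from data */
    long skipped = 0;

    /* The inner parentheses make the assignment happen before the comparison. */
    while ((ch = fgetc(fp)) != '\n' && ch != EOF)
        skipped++;
    return skipped;
}
```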

Reading lines ahead in a file (In C)

I have a file that looks like this:
This is the first line in the file
This is the third line in the file
Where I have a blank line in the file (on line 2). I want to read the file line by line (which I do using fgets), but then I want to read ahead just to check whether there is a blank line in the file.
However, my fgets while loop has a break statement in it, because my function is only supposed to read one line of the file per function call.
so if I call the function:
func(file);
It would read the first line, then break.
If I called it again, it would read the second line then break, etc.
Because I have to implement it this way, it's hard to read ahead, is there any way I can accomplish this?
This is my code:
int main(void) {
FILE * file = fopen("test.txt","r");
if(file == NULL){perror("test.txt"); return EXIT_FAILURE;}
readALine(file);
}
void readALine(FILE * file) {
char buffer[1000];
while(fgets(buffer,sizeof(buffer),file) != NULL) {
//Read lines ahead to check if there is a line
//which is blank
break; //only read a line each FUNCTION CALL
}
}
So to clarify, if I WAS reading the entire file at once (Only one function call) it would go like this (Which is easy to implement).
int main(void) {
FILE * file = fopen("test.txt","r");
if(file == NULL){perror("test.txt"); return EXIT_FAILURE;}
readALine(file);
}
void readALine(FILE * file) {
char buffer[1000];
while(fgets(buffer,sizeof(buffer),file) != NULL) {
if(isspace(buffer[0])) {
printf("Blank line found\n");
}
}
}
But since I'm reading the file in (Line by line, per function call), The second piece of code above wouldn’t work (Since I break per line read, which I can't change).
Is there a way I could use fseek to accomplish this?
A while loop ending in an unconditional break is an if statement, so I don't really see why you are using a while loop. I'm also assuming you are not worried about a single line being longer than 1000 chars.
The continue statement jumps to the next iteration of the loop and checks the condition again.
void readALine(FILE * file) {
char buffer[1000];
while(fgets(buffer,sizeof(buffer),file) != NULL) {
if(!isspace(buffer[0])) { //note the not operator
//isspace() is true for '\n', but also for space and tab, so lines beginning with whitespace are treated as blank too
continue; //run the same loop again
}
break;
}
//buffer contains the next line except for empty ones here...
}
You can "read ahead" by simply storing your position in the file (with position = ftell(your_file)), then read the line, if this is a blank line do whatever you have to do, and finally go back to the position you were (with fseek(your_file, position, SEEK_SET)).
Hope this helps !
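A rough sketch of that ftell()/fseek() idea follows. The function name next_line_is_blank is invented here, and the code assumes a seekable file whose blank lines consist of a single '\n'.

```c
#include <stdio.h>
#include <stdbool.h>

/* Hypothetical helper: reports whether the next unread line is blank,
   without changing the file position. Assumes the file is seekable. */
bool next_line_is_blank(FILE *file)
{
    char buffer[1000];
    long position = ftell(file);         /* remember where we are */
    bool blank = false;

    if (position < 0)
        return false;                    /* ftell failed, cannot peek */
    if (fgets(buffer, sizeof buffer, file) != NULL)
        blank = (buffer[0] == '\n');
    fseek(file, position, SEEK_SET);     /* go back to where we were */
    return blank;
}
```

readALine can then call this right after its own fgets() and still read only one line per call.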
The while loop in readALine reads lines until the end of the file. So it will skip blank lines, and all other lines.
You can return from within the loop if you've found a non-blank line:
while(fgets(buffer,sizeof(buffer),file) != NULL) {
if (buffer[0] != '\n')
return;
}
If you also want to skip lines that consist of nothing but spaces, you can write a function that does that check:
bool isNothingButWhitespace(char *s) {
while (*s == ' ' || *s == '\n')
s++;
return *s == '\0';
}
This will find the first character that's not whitespace. If it's the string terminator '\0' then it will return true (the string was nothing but whitespace), otherwise false (some non-whitespace character was found).
If the while loop in readALine completes due to it reaching the end of file, you need some way to signal that back to the caller. I recommend setting buffer[0] = '\0'.
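Putting those pieces together might look like the sketch below. The name read_nonblank_line and the choice to pass the buffer in from the caller are assumptions made for the example; as above, only ' ' and '\n' count as whitespace.

```c
#include <stdio.h>
#include <stdbool.h>

/* Hypothetical helper: reads lines until one contains something other
   than spaces and newlines. Returns true with that line in buffer, or
   false with buffer set to "" when the end of the file is reached. */
bool read_nonblank_line(FILE *file, char *buffer, int size)
{
    while (fgets(buffer, size, file) != NULL) {
        const char *s = buffer;
        while (*s == ' ' || *s == '\n')
            s++;
        if (*s != '\0')
            return true;     /* found a line with real content */
    }
    buffer[0] = '\0';        /* end of file: hand back an empty string */
    return false;
}
```

The caller can test either the return value or buffer[0] to tell that the file is exhausted.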

Writing One File To Another Produces Incorrect Results

The program opens a file in read mode. It then creates a second file, writes the contents of the first file into the second and deletes the first. It finishes by renaming the second file to the original name.
Here is the output I get.
User:~ ./main
Before
M1
M2
M3
M4
After
1
M2
M4
ÿ User:~
The output should read the same as the first excluding the second line because that is the line I want to delete.
This is the part of the code that copies the characters.
ch = getc(File1);
while(ch != EOF)
{
ch = getc(File1);
if (ch == '\n')
ln++;
if (ln != LineToDelete)
{
putc(ch, File2);
}
}
Here is The Full Code On Pastebin
There are three problems that I see:
You are throwing away your first input character, because you read a character before entering your loop then immediately read another after entering.
You are not initializing ln. It appears that you got lucky and it was already 0, so you ended up omitting the line with "M3" rather than "M2". However, you're dealing with undefined behaviour here; anything could have happened.
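You are writing EOF to the output as if it were a character (the ÿ at the end of your output), because the loop calls putc() before testing for EOF.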
Try something like this:
ln = 1;
while (EOF != (ch = getc(File1))) {
if ('\n' == ch)
++ln;
if (LineToDelete != ln)
putc(ch, File2);
}
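A variant worth considering tests the line number before counting the newline, so that each line's own newline is dropped together with it. With the increment first, deleting line 1 would leave its newline behind as an empty first line. This is a sketch only, and copy_without_line is an invented name.

```c
#include <stdio.h>

/* Hypothetical helper: copies in to out, leaving out line number
   line_to_delete (counting from 1), including its trailing newline. */
void copy_without_line(FILE *in, FILE *out, int line_to_delete)
{
    int ch;        /* int so that EOF can be detected */
    int ln = 1;    /* number of the line the current character belongs to */

    while ((ch = getc(in)) != EOF) {
        if (ln != line_to_delete)
            putc(ch, out);
        if (ch == '\n')
            ++ln;  /* the next character starts a new line */
    }
}
```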

How to handle the Unix newline character '\n' versus the Windows newline characters \r\n in any text file?

I have a C code which reads 1 line at a time, from a file opened in text mode using
fgets(buf,200,fin);
The input file that fgets() reads lines from is a command-line argument to the program.
Now fgets leaves the newline character included in the string copied to buf.
Somewhere down the line in the code I check
length = strlen(buf);
For some input files, which I guess were edited in a *nix environment, the newline character is just '\n'.
But some other test-case input files (which I guess were edited or created under Windows) have two characters indicating a newline: '\r' '\n'.
I want to remove the newline character and want to put a '\0' as the string terminator character. So I have to either do -
if(len == (N+1))
{
if(buf[length-1] == '\n')
{
buf[length-2] = '\0'; //for a `\r\n` newline
}
}
or
if(len == (N))
{
if(buf[length-1] == '\n')
{
buf[length-1] = '\0'; //for a `\n` newline
}
}
Since the text files are passed as commandline argument to the program I have no control of how it is edited/composed and hence cannot filter it using some tool to make newlines consistent.
How can I handle this situation?
Is there any fgets equivalent function in standard C library(no extensions) which can handle these inconsistent newline characters and return a string without them?
I like to update length at the same time
if (length > 0 && buf[length - 1] == '\n') buf[--length] = 0;
if (length > 0 && buf[length - 1] == '\r') buf[--length] = 0;
or, to remove all trailing whitespace
/* remember to #include <ctype.h> */
while ((length > 0) && isspace((unsigned char)buf[length - 1])) {
buf[--length] = 0;
}
I think your best (and easiest) option is to write your own strlen function:
size_t zstrlen(char *line)
{
char *s = line;
while (*s && *s != '\r' && *s != '\n') s++;
*s = '\0';
return (s - line);
}
Now, to calculate the length of the string excluding the newline character(s) and eliminating it(/them) you simply do:
fgets(buf,200,fin);
length = zstrlen(buf);
It works for Unix style ('\n'), Windows style ('\r\n') and old Mac style ('\r').
Note that there are faster (but non-portable) implementations of strlen that you can adapt to your needs.
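If you would rather lean on the standard library, strcspn() from <string.h> can do the same scan. The wrapper name chomp below is just for illustration.

```c
#include <string.h>

/* Hypothetical helper: cuts buf at the first '\r' or '\n' and returns
   the length of what is left. */
size_t chomp(char *buf)
{
    /* strcspn returns the number of leading characters that are
       neither '\r' nor '\n' */
    size_t length = strcspn(buf, "\r\n");
    buf[length] = '\0';
    return length;
}
```

Usage is the same as with zstrlen: call fgets(buf, 200, fin) and then length = chomp(buf).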
If you are troubled by the different line endings (\n and \r\n) on different machines, one way to neutralize them is the dos2unix command (assuming you are working on Linux and have files edited in a Windows environment). That command replaces all Windows-style line endings with Linux-style line endings. The reverse, unix2dos, also exists. You can call these utilities from within the C program (via system(), for example) and then process each line as you currently do. This would reduce the burden on your program.

Reading a text file up to a certain character

Here's my dilemma. I have a file, and wish to read in all characters up until the program hits a '#', and ignore everything on that line after the '#'. For example
0 4001232 0 #comment, discard
This is frustrating, as it feels like there is a very simple solution. Thanks!
FILE *f = fopen("file.txt", "r");
int c;
while ((c = getc(f)) != '#' && c != EOF)
putchar(c);
Read a line using fgets, then scan through this line until you reach a '#' character.
Read another line...
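One way to do the scan is with strchr(). This is a sketch, and strip_comment is a made-up name.

```c
#include <string.h>

/* Hypothetical helper: cuts the line off at the first '#', if there is one.
   Note that the newline after the comment is cut off as well. */
void strip_comment(char *line)
{
    char *hash = strchr(line, '#');   /* NULL when the line has no '#' */
    if (hash != NULL)
        *hash = '\0';
}
```

Call it on each buffer that fgets() fills, before parsing the line.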
There are plenty of ways to do it. Usually, the idea is to have a variable that holds the state (before #, after #, after \n, etc.) and to run a while loop until EOF. A program that removes C comments works on the same idea.
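A minimal sketch of the state-variable idea, with a single flag as the state. The name copy_without_comments is invented, and this version keeps the newline that ends each comment.

```c
#include <stdio.h>
#include <stdbool.h>

/* Hypothetical helper: copies in to out, dropping everything from a '#'
   up to (but not including) the end of that line. */
void copy_without_comments(FILE *in, FILE *out)
{
    bool in_comment = false;   /* the state: are we after a '#'? */
    int c;

    while ((c = getc(in)) != EOF) {
        if (c == '\n')
            in_comment = false;    /* a newline always ends the comment */
        else if (c == '#')
            in_comment = true;     /* a '#' starts one */
        if (!in_comment)
            putc(c, out);
    }
}
```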
filePointer = fopen("person.txt", "r");
do
{
read = fgetc(filePointer);
//stop when '#' read or when file ends
if (feof(filePointer) || read == '#')
{
break;
}
printf("%c", read);
} while (1);
fclose(filePointer);
Also, you had better check whether the file opened successfully:
if (filePointer == NULL)
{
printf("person.txt file failed to open.");
}
else
{
/* file operations go here */
}
The solution depends on how you are "reading" that.
I could, for example, just remove all of those comments with sed 's/#.*//' <infile >outfile in bash.
EDIT: However, if I was parsing it manually, I could simply (in my loop for parsing it) have
if(line[i]=='#') {
break;
}
which would stop parsing that line by exiting the loop.
