How to delete a string in a file with ANSI C? - c

I'm trying to delete a string from a data file.
The data looks like the following records:
4253 1
3119 1
5709 1
576 1
857 1
5859 1
5896 1
116 1
2396 1
1088 1
4180 1
That is only part of the file; I am not allowed to post images.
Each record consists of two numbers separated by a space, and records are separated by the invisible character '\n'.
There are thousands of records in the file. I just want to delete the records that turn out to be useless while I scan the file, and it has to be implemented in C.
Sorry for not providing the detailed format of the data earlier.

Files in C are sequential entities. Unless you impose your own structure on them (such as treating NUL characters as non-existent ones), the only real way to delete characters or lines is to overwrite them, shifting the part of the file following them a little towards the front.
You can either do this in-place with things like fseek and truncate (that last is not ISO C) or by reading from one file and writing to another.
For example, the following program copies standard input to standard output, deleting any line that consists solely of 11:
#include <stdio.h>
#include <string.h>   /* for strcmp */
int main (void) {
    char buff[1024];
    /* print every line except the one that is exactly "11" */
    while (fgets (buff, sizeof(buff), stdin) != NULL)
        if (strcmp (buff, "11\n") != 0)
            printf ("%s", buff);
    return 0;
}
Beware the usual caveats about lines that are too long for the input buffer.
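For the two-number records in the question, the same copy-and-skip idea can be keyed on the first number of each record. This is only a sketch: the function name copy_without_key, the 256-byte line buffer, and the choice to match on the first field are assumptions, not something the question specifies.

```c
#include <assert.h>
#include <stdio.h>

/* Copy every line from `in` to `out`, except records whose first number
 * equals `key`. Returns the number of records dropped.
 * Lines that do not parse as "<number> <number>" are copied unchanged. */
int copy_without_key(FILE *in, FILE *out, long key)
{
    char line[256];   /* assumed to be longer than any record */
    int dropped = 0;

    assert(in != NULL && out != NULL);
    while (fgets(line, sizeof line, in) != NULL) {
        long first, second;
        if (sscanf(line, "%ld %ld", &first, &second) == 2 && first == key) {
            dropped++;          /* skip this record */
            continue;
        }
        fputs(line, out);
    }
    return dropped;
}
```

To change the real file, one option is to write to a temporary file and then replace the original with rename() once the copy has succeeded.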

Related

How to slice/index data in c

I am trying to learn C and have received a homework assignment to write code that reads data from a .txt file and prints out particular lines.
I wrote the following:
#include <stdio.h>
void main() {
char str[5];
FILE *fp;
fp=fopen("data.txt","r");
int i;
for (i=1;i<=5;i++){
fgets(str,5,fp);
printf("%d \n",i);
if (i==1||i==3||i==5) {
printf("%s \n \n",str);
}
}
}
The file data.txt is just the following:
3.21
5.22
4.67
2.31
2.51
1.11
I had read that each time fgets is run, the pointer is updated to point to the next line. I thought I could keep running fgets and then only print the string str when at the correct value for i (the line I want output on the console).
It partially worked, here is the output:
1
3.21
2
3
5.22
4
5
4.67
Process returned 8 (0x8) execution time : 0.024 s
Press any key to continue.
It did only print when i had the correct values, but for some reason it only printed the first 3 lines, even though fgets was supposed to have been run 5 times by the last iteration, and so the pointer should have been reading the last line.
Can someone explain why the pointer did not update as expected, and whether there is an easier way to slice or index through a file in C?
You need to account for (at least) two additional characters beyond the digits you have in the file. There is the end-of-line delimiter (\n on UNIX/Mac, or possibly \r\n on Windows, so maybe 3 additional characters), plus (from the fgets documentation):
A terminating null character is automatically appended after the characters copied to str.
A lot of the C functions that manipulate character arrays (i.e. strings) will give you this extra null "for free", and it can be tricky if you forget about it. With a 5-byte buffer, each fgets call can store only 4 characters plus the null, so one call reads 3.21 and the next call reads just the leftover \n. That is why five calls only got through three lines of data.
Also, a better way to loop over the lines might be:
#define MAX_CHARS 7
char buf[MAX_CHARS];
while((fgets(buf, MAX_CHARS, fp)) != NULL) {
printf("%s\n", buf);
}
It's still not the best way to do it (no error checking) but a little more compact/readable and idiomatic C, IMO.
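To see the effect of the buffer size directly, a small helper can count how many fgets calls it takes to consume a stream. This is an illustrative sketch; count_fgets_calls is a made-up name and the 64-byte upper bound is arbitrary.

```c
#include <assert.h>
#include <stdio.h>

/* Count how many fgets calls it takes to consume the stream when fgets is
 * told the buffer holds `bufsize` bytes (between 2 and 64 in this sketch). */
int count_fgets_calls(FILE *fp, int bufsize)
{
    char buf[64];
    int calls = 0;

    assert(fp != NULL && bufsize >= 2 && bufsize <= (int)sizeof buf);
    while (fgets(buf, bufsize, fp) != NULL)
        calls++;
    return calls;
}
```

On a file whose lines look like 3.21 followed by \n, a size of 5 needs two calls per line (the digits, then the newline), while a size of 7 needs one call per line.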

C : Best way to go to a known line of a file

I have a file that I'd like to iterate over without processing the current line in any way. What I am looking for is the best way to go to a given line of a text file. For example, storing the current line in a variable seems useless until I get to the predetermined line.
Example :
file.txt
foo
fooo
fo
here
Normally, in order to get here, I would have done something like :
FILE* file = fopen("file.txt", "r");
if (file == NULL)
perror("Error when opening file ");
char currentLine[100];
while(fgets(currentLine, 100, file))
{
if(strstr(currentLine, "here") != NULL)
return currentLine;
}
But fgets will have to read three full lines uselessly, and currentLine will have to store foo, fooo and fo.
Is there a better way to do this, knowing that here is on line 4? Something like a goto, but for files?
Since you do not know the length of every line, no, you will have to go through the previous lines.
If you knew the length of every line, you could probably play with how many bytes to move the file pointer. You could do that with fseek().
You cannot directly access a given line of a text file (unless all lines have the same size in bytes; with UTF-8 everywhere, a Unicode character takes a variable number of bytes, 1 to 4, and in most cases lines differ in length from one to the next). So you cannot use fseek, because you don't know the file offset in advance.
However (at least on Linux systems), lines are ending with \n (the newline character). So you could read byte by byte and count them:
int c= EOF;
int linecount=1;
while ((c=fgetc(file)) != EOF) {
if (c=='\n')
linecount++;
}
You then don't need to store the entire line.
So you could reach line 45 this way (using while (linecount < 45 && (c = fgetc(file)) != EOF) ...; testing the counter first avoids consuming the first character of line 45) and only then read entire lines with fgets or, better yet, getline(3) on POSIX systems. Notice that the implementation of fgets or of getline is likely to be built on top of fgetc, or at least to share some code with it. Remember that <stdio.h> provides buffered I/O; see setvbuf(3) and related functions.
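The counting loop can be wrapped in a small function. The following is a sketch under the assumption that the stream is positioned at the start of line 1 and that lines end with '\n'; skip_to_line is a hypothetical name.

```c
#include <assert.h>
#include <stdio.h>

/* Advance `f` (assumed to be positioned at the start of line 1) to the
 * start of line `target`, counting '\n' characters without storing lines.
 * Returns 1 on success, 0 if the stream ends first. */
int skip_to_line(FILE *f, int target)
{
    int c;
    int lineno = 1;

    assert(f != NULL && target >= 1);
    /* check the counter first so no character of the target line is consumed */
    while (lineno < target && (c = fgetc(f)) != EOF) {
        if (c == '\n')
            lineno++;
    }
    return lineno == target;
}
```

After a successful call, a single fgets (or getline) reads the wanted line.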
Another way would be to read the file in two passes. A first pass stores the offset (using ftell(3)...) of every line start in some efficient data structure (a vector, a hash table, a tree...). A second pass uses that data structure to retrieve the offset of the line start, then calls fseek(3) with that offset.
A third way, POSIX specific, would be to memory-map the file using mmap(2) into your virtual address space (this works well for not too huge files, e.g. of less than a few gigabytes). With care (you might need to mmap an extra ending page, to ensure the data is zero-byte terminated) you would then be able to use strchr(3) with '\n'.
In some cases, you might consider parsing your text file line by line (using fgets, or on POSIX systems getline, or generating your parser with flex and bison) and storing each line in a relational database (such as PostgreSQL or SQLite).
PS. BTW, the notion of lines (and the end-of-line mark) varies from one OS to the next. On Linux the end-of-line is a \n character. On Windows, lines end with \r\n, etc.
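A rough sketch of the two-pass idea, using a growable array of long offsets as the data structure. The names build_line_index and read_indexed_line are invented for this example, and calling ftell before every character is chosen for simplicity rather than speed.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* First pass: record the offset at which every line starts.
 * Returns a malloc'd array of *count offsets (caller frees), or NULL. */
long *build_line_index(FILE *f, size_t *count)
{
    size_t cap = 16, n = 0;
    long *offsets = malloc(cap * sizeof *offsets);
    int c, at_line_start = 1;

    assert(f != NULL && count != NULL);
    if (offsets == NULL)
        return NULL;
    rewind(f);
    for (;;) {
        long pos = ftell(f);            /* offset of the next character */
        if ((c = fgetc(f)) == EOF)
            break;
        if (at_line_start) {
            if (n == cap) {             /* grow the array when it is full */
                long *tmp = realloc(offsets, (cap *= 2) * sizeof *offsets);
                if (tmp == NULL) { free(offsets); return NULL; }
                offsets = tmp;
            }
            offsets[n++] = pos;
        }
        at_line_start = (c == '\n');
    }
    *count = n;
    return offsets;
}

/* Second pass: jump straight to 1-based line `lineno` and read it. */
char *read_indexed_line(FILE *f, const long *offsets, size_t count,
                        size_t lineno, char *buf, int size)
{
    if (lineno < 1 || lineno > count)
        return NULL;
    if (fseek(f, offsets[lineno - 1], SEEK_SET) != 0)
        return NULL;
    return fgets(buf, size, f);
}
```

The index only pays off when several lines are looked up; for a single lookup, the plain counting loop does the same amount of reading.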
A FILE * in C is a stream of chars. In a seekable file, you can address these chars using the file pointer with fseek(). But apart from that, there are no "special characters" in files, a newline is just another normal character.
So in short, no, you can't jump directly to a line of a text file, as long as you don't know the lengths of the lines in advance.
This model in C corresponds to the files provided by typical operating systems. If you think about it, to know the starting points of individual lines, your file system would have to store this information somewhere. This would mean treating text files specially.
What you can do however is just count the lines instead of pattern matching, something like this:
#include <stdio.h>
int main(void)
{
    char linebuf[1024];
    FILE *input = fopen("seekline.c", "r");
    int lineno = 0;
    char *line;
    if (input == NULL)
    {
        perror("seekline.c");
        return 1;
    }
    /* count lines as they are read and stop at line 4 */
    while ((line = fgets(linebuf, sizeof linebuf, input)) != NULL)
    {
        ++lineno;
        if (lineno == 4)
        {
            fputs("4: ", stdout);
            fputs(line, stdout);
            break;
        }
    }
    fclose(input);
    return 0;
}
If you don't know the length of each line, you have to go through all of them. But if you know the line you want to stop at, you can do this:
while (!found && fgets(line, sizeof line, file) != NULL) /* read a line */
{
if (count == lineNumber)
{
//you arrived at the line
//in case of a return first close the file with "fclose(file);"
found = true;
}
else
{
count++;
}
}
At least this way you avoid all those calls to strstr.

How to read numbers from a text file properly?

I would like to write a lottery program in C, that reads the chosen numbers of former weeks into an array. I have got a text file in which there are 5 columns that are separated with tabulators. My questions would be the following:
What should I separate the columns with? (e.g. a comma, a semicolon, a tabulator or something else)
Should I include a kind of EOF in the last row? (e.g. -1, "EOF") Is there any accepted or "official" convention to do this?
Which function should I use for reading the numbers? Is there any proper or "accepted" way of reading data from text files?
I once wrote a C program for a "Who Wants to Be a Billionaire" game. In that one I used a function that read each line into an array big enough to hold a whole line. After that I separated its data into variables like this:
line: "text1";"text2";"text3";"text4"endline (-> line loaded into a buffer array)
text1 -> answer1 (until reaching the semicolon)
text2 -> answer2 (until reaching the semicolon)
text3 -> answer3 (until reaching the semicolon)
text4 -> answer4 (until reaching the end of the line)
endline -> start over, that is read a new line and separate its contents into variables.
It worked properly, but I don't know if it was good enough for a programmer. (btw I'm not a programmer yet, I study Computer Science at a university)
Any answers and advice are welcome. Thanks in advance for your kind help!
The scanf() family of functions don't care about newlines, so if you want to process lines, you need to read the lines first and then process the lines with sscanf(). The scanf() family of functions also treats white space — blanks, tabs, newlines, etc. — interchangeably. Using tabs as separators is fine, but blanks will work too. Clearly, if you're reading and processing a line at a time, newlines won't really factor into the scanning.
int lottery[100][5];
int line;
char buffer[4096];
/* test the line limit first so that no input line is read and then discarded */
for (line = 0; line < 100 && fgets(buffer, sizeof(buffer), stdin) != NULL; line++)
{
    if (sscanf(buffer, "%d %d %d %d %d", &lottery[line][0], &lottery[line][1],
               &lottery[line][2], &lottery[line][3], &lottery[line][4]) != 5)
    {
        /* report the offending text (buffer), not the integer line counter */
        fprintf(stderr, "Faulty line: [%s]\n", buffer);
        break;
    }
}
This stops on EOF, too many lines, and a faulty line (one which doesn't start with 5 numbers; you can check their values etc in the loop if you want to — but what are the tests you need to run?). If you want to validate the white space separators, you have to work harder.
Maybe you want to test for nothing but spaces and newlines after the 5 numbers; that's a bit trickier (it can be done; look up the %n conversion specification in sscanf()).
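One possible way to use %n for that check is sketched below. The function name parse_five is made up for illustration; the approach relies on %n storing the number of characters consumed so far without being counted in the return value of sscanf.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/* Parse exactly five integers from `line` and reject trailing junk.
 * Returns 1 on success, 0 otherwise. */
int parse_five(const char *line, int out[5])
{
    int used = 0;

    assert(line != NULL && out != NULL);
    if (sscanf(line, "%d %d %d %d %d%n",
               &out[0], &out[1], &out[2], &out[3], &out[4], &used) != 5)
        return 0;
    /* %n stored how many characters were consumed;
     * only white space may follow the fifth number */
    for (const char *p = line + used; *p != '\0'; p++) {
        if (!isspace((unsigned char)*p))
            return 0;
    }
    return 1;
}
```

Because sscanf treats tabs and blanks alike, this accepts either separator between the numbers.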

How do you scan redirected files in C (STDIN)?

Say I'm calling a program:
$ ./dataset < filename
where filename is any file with x amount of line pairs where the first line contains a string and second line contains 10 numbers separated by spaces. The last line ends with "END"
How can I then start putting the first lines of pairs (string) into:
char *experiments[20] // max of 20 pairs
and the second lines of the pairs (numbers) into:
int data[10][20] // max of 20, 10 integers each
Any guidance? I don't even understand how I'm supposed to scan the file into my arrays.
Update:
So say this is my file:
Test One
0 1 2 3 4 5 6 7 8 9
END
Then redirecting this file would mean if I want to put the first line into my *experiments, that I would need to scan it as such?
scanf("%s", *experiments[0]);
Doing so gives me an error: Segmentation fault (core dumped)
What is incorrect about this?
Say my file is simply numbers, for ex:
0 1 2 3 4 5 6 7 8 9
Then,
scanf("%d", data[0][0]); works, and will hold value of '1'. Is there an easier way to do this for the whole line of data? i.e. data[0-9][0].
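The following skeleton (partly pseudo-code) shows how to read the input. It uses fgets rather than gets, which cannot limit how much it reads and was removed from the language in C11:
#include <stdio.h>
#include <string.h>
int main(void)
{
    char str[100]; // make sure that this size is enough to hold a single line
    int no_line = 1;
    while (fgets(str, sizeof str, stdin) != NULL)
    {
        str[strcspn(str, "\r\n")] = '\0'; // strip the trailing newline
        if (strcmp(str, "END") == 0)
            break;
        if (no_line % 2 == 0)
        {
            /* read integer values from the string "str" using sscanf; sscanf can be called in a loop with %d until it fails */
        }
        else
        {
            /* store the string in your variable "experiments"; allocate memory for each entry before copying */
        }
        no_line++;
    }
    return 0;
}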
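The "sscanf in a loop" step can be sketched as follows. The name parse_ints is hypothetical; the loop uses %n to learn how far each call got, so the next call can continue from there.

```c
#include <assert.h>
#include <stdio.h>

/* Read up to `max` integers from `s` into `out`; returns how many were read. */
int parse_ints(const char *s, int *out, int max)
{
    int count = 0, used = 0;

    assert(s != NULL && out != NULL);
    while (count < max && sscanf(s, "%d%n", &out[count], &used) == 1) {
        s += used;      /* continue right after the number just parsed */
        count++;
    }
    return count;
}
```

For the question's layout, a row of ten values would be filled with something like parse_ints(str, row, 10), where row is an int array of at least ten elements, and a return value other than 10 signals a malformed line.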
The redirected file is associated with the FILE * stdin, which is already open for you.
Otherwise, you can treat it the same as any other text file, and/or use the functions that are dedicated to standard input, with one exception: you cannot rely on being able to seek in it or to retrieve the size of the input (for example, when the input comes from a pipe).
For the data sizes you're talking about, by far the easiest thing to do is just slurp all of the content into a buffer and work on that: you don't have to be super-stingy, just make sure that you don't overrun.
If you want to be super-stingy with memory, preallocate a 4kB buffer with malloc(), progressively read() into it from stdin, and realloc() another 4kB every time the input exceeds what you've already read. If you don't care so much about being stingy with memory (e.g. on a modern machine with gigabytes of memory), just malloc() something much bigger than the expected input (e.g. a megabyte) and bug out if the input is more than that: this is far simpler to implement but less general/elegant.
You then have all of the input in a buffer and you can do what you like with it, which depends too strongly on the format of the input for me to say how you should approach that part.
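A sketch of the grow-as-you-go approach is shown below. It uses the standard fread on a FILE * instead of POSIX read(), and the name slurp and the 4096-byte step are assumptions made for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Read all of `in` into a malloc'd, NUL-terminated buffer.
 * Stores the byte count in *len; returns NULL on allocation or read error. */
char *slurp(FILE *in, size_t *len)
{
    size_t cap = 4096, used = 0, got;
    char *buf = malloc(cap);

    assert(in != NULL && len != NULL);
    if (buf == NULL)
        return NULL;
    /* always leave one byte free for the terminating NUL */
    while ((got = fread(buf + used, 1, cap - used - 1, in)) > 0) {
        used += got;
        if (used == cap - 1) {          /* buffer full: grow by another 4 kB */
            char *tmp = realloc(buf, cap + 4096);
            if (tmp == NULL) { free(buf); return NULL; }
            buf = tmp;
            cap += 4096;
        }
    }
    if (ferror(in)) { free(buf); return NULL; }
    buf[used] = '\0';
    *len = used;
    return buf;
}
```

Calling slurp(stdin, &len) would then give the whole redirected file as one string, which the caller frees when done.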

C Programming, Reading specific sections from file

my question is how can I read specific sections from a file? For instance, if my file was:
454545454 Joe Brown 70 50 40
656565656 David Smith 80 90 100
383838383 George Williams 95 100 80
How could I read the first string (9-Digit #), skip over the name, and then read the 3 sets of numbers?
I think you can treat white space as your sentinel. You could store the whole file in a char * buffer and look for this sentinel each time.
Another solution could be to use atoi (ASCII to int) to check whether a token is a number or a word. You can also read about fread and fseek.
I think the best way is to mix both solutions: find each sentinel and try to parse the token with atoi.
The main idea is to find some pattern in the file that lets you work out the algorithm.
In C, most of the time you have to work out the logic yourself.
Hope it helps!
Instead of "reading specific sections," read the file line by line, save the information you want, and discard the rest. scanf reads formatted input from an external source into program variables. Since scanf returns the number of successfully converted items, you can use that for some error checking.
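#define STR_LEN 64 /* assumed size; the original snippet does not define it */
char num_string[STR_LEN];
int numbers[3];
char dummy1[STR_LEN], dummy2[STR_LEN];
/* %63s limits each string to STR_LEN - 1 characters plus the terminating null */
int num_read = scanf( "%63s%63s%63s%d%d%d", num_string, dummy1, dummy2, &numbers[0], &numbers[1], &numbers[2] );
if( num_read != 6 )
{
    // error
}
else
{
    // do stuff with num_string, and numbers[0]-numbers[2]
}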
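If each line has already been read with fgets, the name fields can be skipped without dummy buffers by using assignment suppression (%*s). This is a sketch that assumes every name is exactly two words, as in the sample data; parse_record is a hypothetical name.

```c
#include <assert.h>
#include <stdio.h>

/* Parse "<9-digit id> <first> <last> <n1> <n2> <n3>" from one line.
 * The two name fields are matched but discarded with %*s.
 * Returns 1 on success, 0 if the line does not fit the layout. */
int parse_record(const char *line, char id[10], int scores[3])
{
    assert(line != NULL && id != NULL && scores != NULL);
    /* suppressed conversions (%*s) are not counted in the return value,
     * so a fully matched line yields 4 */
    return sscanf(line, "%9s %*s %*s %d %d %d",
                  id, &scores[0], &scores[1], &scores[2]) == 4;
}
```

A name with a middle name or a single word would make the conversion count come up short, which the return value reports.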
