Read from file fast - c

I have a txt file that contains 2 graphs and the number of vertices in the following format:
6
0 1 0 1 0 0
1 0 1 0 0 1
0 1 0 1 0 0
1 0 1 0 1 0
0 0 0 1 0 1
0 1 0 0 1 0
0 1 0 0 1 0
1 0 1 0 0 0
0 1 0 1 0 1
0 0 1 0 1 0
1 0 0 1 0 1
0 0 1 0 1 0
The matrices represent vertex adjacency: if two vertices are adjacent, the corresponding entry is 1.
Although the graphs are not separated visually, the second graph starts after the 6th row of the first.
Each graph can have many vertices (for example, 5000), and both graphs are the same size.
I wrote an algorithm that checks whether the two graphs are isomorphic, and I noticed that reading the graphs takes 8 seconds while the actual algorithm takes 2.5 seconds (for 5000 vertices).
Since my goal is to optimize the overall speed of my program, I want to know whether I can speed up my current code for reading from the file:
FILE* file = fopen ("input.txt", "r");
fscanf (file, "%d", &i);
int n = i;
while (!feof (file))
{
fscanf (file, "%d", &i);
if (j < (n*n)) { // first graph
if (i==1) {
adj_1[j/n][v_rank_1[j/n]] = j - (j/n)*n; // add the vertice to the adjacents of the current vertice
v_rank_1[j/n] += 1;
}
}
else if (j>=(n*n)) { // second graph
if (i==1) {
adj_2[(j-(n*n))/n][v_rank_2[(j-(n*n))/n]] = (j-(n*n)) - ((j-(n*n))/n)*n; // add the vertice to the adjacents of the current vertice
v_rank_2[(j-(n*n))/n] += 1;
}
}
j++;
}
fclose (file);
The adj_* table holds the indexes of the vertices adjacent to each vertex.
The v_rank_* table holds the number of vertices adjacent to each vertex.
It is important that I acquire this and only this information from the graph.

The first optimization is to read the whole file into memory in one shot. Accessing memory in the loops will be faster than calling fscanf for every value.
The second optimization is to perform fewer arithmetic operations, even if it means more code.
The third optimization is to treat the data from the file as characters to avoid integer conversion.
The result could be:
// bulk read file into memory
fseek(file, 0, SEEK_END);
long fsize = ftell(file);
fseek(file, 0, SEEK_SET);
char *memFile = malloc(fsize + 1);
if (memFile == NULL) return; // not enough memory !! Handle it as you wish
fscanf(file, "%d", &n); // read the vertex count first
// fscanf already consumed the count, so fewer than fsize bytes remain:
// request single bytes and keep the number actually read
size_t len = fread(memFile, 1, fsize, file);
fclose(file);
memFile[len] = 0; // terminate the data so the loops below know where to stop
// more code but fewer arithmetic operations
char *mem = memFile, c;
for (int lig = 0; lig < n; lig++) { // first graph
    for (int col = 0; col < n; col++) {
        for (;;) // skip separators until the next '0' or '1'
        {
            c = *mem;
            if (c == 0) break; // end of data
            mem++;
            if (c == '1') {
                adj_1[lig][v_rank_1[lig]++] = col; // add the vertex to the adjacents of the current vertex
                k++; // ??
                break;
            }
            if (c == '0') break;
        }
    }
}
for (int lig = 0; lig < n; lig++) { // second graph
    for (int col = 0; col < n; col++) {
        for (;;) // skip separators until the next '0' or '1'
        {
            c = *mem;
            if (c == 0) break; // end of data
            mem++;
            if (c == '1') {
                adj_2[lig][v_rank_2[lig]++] = col; // add the vertex to the adjacents of the current vertex
                l++; // ??
                break;
            }
            if (c == '0') break;
        }
    }
}
free(memFile);
Remarks: you said nothing about variables k and l.

You could speed it up by accessing the file system less often. You are reading one integer at a time from the file thus accessing the file every time through the loop.
Instead, try reading the whole file or a large chunk of the file at once. (This is called block reading). You can buffer it into an array. Inside your loop, read from the memory buffer instead of the file. Refresh your memory buffer as needed inside the loop if you don't read in the entire file.
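As a rough sketch of the block-reading idea, the function below refills a fixed-size buffer with fread() and picks the '0' and '1' characters out of it. The name read_cells_chunked, the 64 KiB CHUNK_SIZE, and the flat cells output array are assumptions made for illustration; they do not come from the question. It also assumes the vertex count has already been read with fscanf(), because the digits of a count such as 5000 would otherwise be mistaken for matrix cells.

```c
#include <assert.h>
#include <stdio.h>

#define CHUNK_SIZE 65536 /* assumed buffer size; tune as needed */

/* Reads up to `count` '0'/'1' cells from `fp` into `cells` (one 0 or 1 each),
 * refilling a fixed-size buffer with fread() whenever it runs empty.
 * Every other character (space, newline) is treated as a separator.
 * Returns the number of cells actually stored.
 * Note: the static buffer makes this function non-reentrant, and bytes left
 * in the buffer after the last cell are discarded. */
size_t read_cells_chunked(FILE *fp, unsigned char *cells, size_t count)
{
    static char buf[CHUNK_SIZE];
    size_t have = 0, pos = 0, stored = 0;

    assert(fp != NULL && cells != NULL);
    while (stored < count) {
        if (pos == have) {                 /* buffer exhausted: refill it */
            have = fread(buf, 1, sizeof buf, fp);
            pos = 0;
            if (have == 0)
                break;                     /* end of file or read error */
        }
        char c = buf[pos++];
        if (c == '0' || c == '1')
            cells[stored++] = (unsigned char)(c - '0');
    }
    return stored;
}
```

For the file in the question, one call with count set to 2 * n * n would cover both graphs; cell j of the first graph then belongs to row j / n and column j % n.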

Use fgets() to read a line at a time into a line buffer. Parse the line buffer into integer values.
This function reduces the number of times you read from the file, because behind the scenes, fgets() reads a large chunk of data from the file and returns a line at a time. It only attempts to read another chunk when there are no more lines left in its internal buffer.
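A minimal sketch of the per-line parsing step, assuming each row of the matrix sits on its own line as in the question. The helper name parse_adjacency_row and its return convention are hypothetical. It scans characters directly instead of converting integers, and fills one row of an adj_* style table.

```c
#include <assert.h>
#include <stddef.h>

/* Scans one text row such as "0 1 0 1 0 0\n" and stores the column index of
 * every '1' in adj_row. Returns the number of neighbours found (the rank of
 * the vertex), or -1 if the row holds fewer than n cells. */
int parse_adjacency_row(const char *line, int n, int *adj_row)
{
    int col = 0, rank = 0;

    assert(line != NULL && adj_row != NULL);
    for (const char *p = line; *p != '\0' && col < n; p++) {
        if (*p == '1')
            adj_row[rank++] = col++;
        else if (*p == '0')
            col++;
        /* anything else (space, '\n', '\r') is a separator */
    }
    return col == n ? rank : -1;
}
```

The caller would read each row with fgets(line, cap, file), where cap must be at least 2 * n + 2 to hold n digits, the separators, the newline, and the terminator; for n = 5000 that is a little over 10 KB per line.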

Related

How to read a file, convert letters, and print string and integers to an array in c?

struct reviewStruct {
char reviewer[50];
int feedback[3];
};
int readReviews(FILE *file, struct reviewStruct reviews[10]) {
int i;
file = fopen("Names.txt", "r");
if(file == NULL) {
printf("Error");
exit(-1);
}
for(i = 0; i < 10; i++) {
fgets(reviews[i].reviewer, 50, file);
}
fclose(file);
for(i = 0; i < 10; i++) {
printf("%s", reviews[i].reviewer);
}
return 0;
}
Hello, I'm trying to read a file line by line and print it to an array, with a catch. Whenever a 'Y' or 'y' appears, it converts that letter into a 1, and if an 'N' or 'n' appears, it is converted into a 0 (zero), excluding the first word of every line. For example, I have a file with the following information:
charlie Y n N
priya N n Y
lance y y Y
stan N y n
arin N n N
This is the text file called Names.txt, I want to save the integer information to the array called "feedback", so that it looks like this when printed using a for loop:
1 0 0
0 0 1
1 1 1
0 1 0
0 0 0
How do I populate the feedback array so that it can be printed along with the names using a for loop, like this?
charlie 1 0 0
priya 0 0 1
lance 1 1 1
stan 0 1 0
arin 0 0 0
Thanks.
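One possible approach, sketched under the assumption that every line holds a single-word name followed by exactly three Y/N answers: read the line with fgets() as the question already does, then split it with sscanf() and map each letter to 0 or 1. The helper name parseReview is hypothetical; the struct is the one from the question.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

struct reviewStruct {
    char reviewer[50];
    int feedback[3];
};

/* Parses a line such as "charlie Y n N" into *out.
 * Returns 1 on success, 0 if the line is malformed. */
int parseReview(const char *line, struct reviewStruct *out)
{
    char answers[3];

    assert(line != NULL && out != NULL);
    /* %49s stops at whitespace and leaves room for the terminator;
     * " %c" skips whitespace, then reads exactly one character. */
    if (sscanf(line, "%49s %c %c %c", out->reviewer,
               &answers[0], &answers[1], &answers[2]) != 4)
        return 0;

    for (int i = 0; i < 3; i++) {
        int c = tolower((unsigned char)answers[i]);
        if (c == 'y')
            out->feedback[i] = 1;
        else if (c == 'n')
            out->feedback[i] = 0;
        else
            return 0;               /* neither Y/y nor N/n */
    }
    return 1;
}
```

Inside the reading loop, each line from fgets() would be passed to parseReview(), and the printing loop could then use printf("%s %d %d %d\n", ...) with the name and the three feedback values.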

How to read a variable number of ints from a string

I have the following text file
0 0 0 debut
1 120 0 permis exploitation
2 180 1 1 piste 6km
3 3 1 2 installation sondeuses
4 30 1 2 batiments provisoires
5 60 1 2 groudronnage piste
6 90 1 4 adduction eau
7 240 2 3 4 campagne sondage
8 180 3 5 6 7 forage 3 puits
9 240 3 5 6 7 construction bureaux logements
10 30 2 8 9 transport installation matériel
11 360 2 8 9 traçage du fond
12 240 2 8 9 construction laverie
13 0 3 10 11 12 fin des travaux
Each line represents a task and is laid out as follows: the first number is an ID, the second is the duration, the third is the number of previous tasks that are required, and all the numbers afterward are the IDs of the required tasks. Finally, the string at the end is the title of the task.
I'm trying to fill an array of those struct by reading this file. Here is the struct:
typedef struct{
int id;
int duration;
int nbPrev; /* number of required previous tasks */
int prev[NMAXPREV]; /*array of required previous tasks*/
char title[LGMAX];
}Task ;
Here is my code to read the file
int readTasksFile(char* file_name, Task t[])
{
FILE* f;
char line[256] = {'\0'};
int i = 0;
char c[1] = {0};
if((f = fopen(file_name, "r")) == NULL)
{
perror("The file couldn't be opened");
exit(EXIT_FAILURE);
}
while (fgets(line, 256, f) != EOF)
{
sscanf_s(line, "&d &d &d", &(t[i].id), &(t[i].duration), &(t[i].nbPrev));
i++;
}
fclose(f);
return 0;
}
How can I read all the previous tasks number in a line considering it is variable and still be able to read the title afterward ?
The 3rd int should be the number of following ints.
Use "%n" to record scan offset.
After reading the .prev[], copy the rest of the line to .title.
Add error checking. This is very important, especially for complex input.
// Untested code to get OP started
// while (fgets(line, 256, f) != EOF) Comparing against EOF is incorrect
while (fgets(line, sizeof line, f)) {
    int offset = 0;
    // Use %d, not &d
    if (sscanf(line, "%d %d %d %n",
            &t[i].id, &t[i].duration, &t[i].nbPrev, &offset) != 3) {
        // Handle bad input, for now, exit loop
        break;
    }
    if (t[i].nbPrev < 0 || t[i].nbPrev > NMAXPREV) {
        // Handle bad input, for now, exit loop
        break;
    }
    char *p = line + offset;
    int prev;
    // Populate t[i].prev[]
    for (prev = 0; prev < t[i].nbPrev; prev++) {
        if (sscanf(p, "%d %n", &t[i].prev[prev], &offset) != 1) {
            break;
        }
        p += offset;
    }
    if (prev != t[i].nbPrev) {
        // Handle bad input, for now, exit loop
        break;
    }
    // remaining text
    int len = strlen(p);
    if (len > 0 && p[len-1] == '\n') p[--len] = '\0'; // consume potential trailing \n
    if (len >= LGMAX) {
        // Handle bad input, for now, exit loop
        break;
    }
    strcpy(t[i].title, p);
    i++;
}
return i; // Let caller know of successful lines parsed.
Advanced: robust code would use strtol() instead of "%d" and sscanf().
readTasksFile() should also pass in the max number of Task t[] that can be read.
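To illustrate the strtol() remark, here is a minimal sketch of a helper that could stand in for each "%d" conversion. The name parseInt and its return convention are assumptions made for this example. Unlike sscanf() with "%d", it reports overflow and tells the caller exactly where parsing stopped.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Reads one int from *pp and advances *pp past it.
 * Returns 1 on success, 0 if no digits were found or the value does not
 * fit in an int. On failure *pp and *out are left unchanged. */
int parseInt(const char **pp, int *out)
{
    char *end;
    long v;

    assert(pp != NULL && *pp != NULL && out != NULL);
    errno = 0;
    v = strtol(*pp, &end, 10);      /* skips leading whitespace itself */
    if (end == *pp)
        return 0;                   /* no conversion: next token is not a number */
    if (errno == ERANGE || v < INT_MIN || v > INT_MAX)
        return 0;                   /* out of range for int */
    *out = (int)v;
    *pp = end;
    return 1;
}
```

Calling it three times yields id, duration, and nbPrev; calling it nbPrev more times fills prev[]; whatever *pp points at afterwards (after skipping leading spaces) is the title, even when the title itself contains digits, as in "forage 3 puits".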
You could also scan by line and assign the two first numbers to id and duration, then do an int analysis and add the rest of the elements to nbPrev until you encounter a letter.
I don't know if this would be the best way to do it, but it's how I would do it.
Why not also store a list in the struct each time you record nbPrev?
That is, instead of nbPrev being of type int, make it a list type.

How to get the third column of a text file in C programming

I have a text file that contains:
1 1 1
1 2 2
1 3 2
1 7 5
1 8 4
1 9 4
1 10 2
...
and this is my function:
void addRatings()
{
int n,m,l;
int a[50][100];
MovieR = fopen("d://ratings.txt","r");
l = LineNum(MovieR);
MovieR = fopen("d://ratings.txt","r");
for(int i=0;i<l;i++)
{
fscanf(MovieR,"%[^\t]\t%[^\t]\t%[^\t]\n",&n,&m,&a[n][m]);
}
}
Now I want to get the first and second column for n and m
then I want to give third column to the a[n][m].
How can I do that?
You need to read the third value into a temporary variable, and then store that value into the array if and only if the following conditions are met:
fscanf returned 3, meaning that it actually found three numbers
the value for n is between 0 and 49 inclusive
the value for m is between 0 and 99 inclusive
And the code doesn't need to count the number of lines (using LineNum()). The loop should end when fscanf runs out of numbers to read, i.e. returns something other than 3.
The resulting code looks something like this:
void addRatings(void)
{
    int a[50][100] = {{0}}; // initialize all ratings to 0
    FILE *MovieR = fopen("d://ratings.txt", "r");
    if (MovieR != NULL)
    {
        int n, m, rating;
        while (fscanf(MovieR, "%d%d%d", &n, &m, &rating) == 3) // loop until end-of-file
        {
            if (n < 0 || n > 49 || m < 0 || m > 99) // check for valid indexes
                break;
            a[n][m] = rating;
        }
        fclose(MovieR);
    }
}

Looking for faults in algorithm for organizing a one-way railway station traffic

I wrote the code below for solving this railway station traffic programming contest question. ( You may read comments and proposed solutions here). However, there are a few exceptional cases for which this code won't work. What are they?
#include <stdio.h>
#include <stdlib.h>
int main(){
int n, i,j;
int * array;
scanf("%i",&n);
array = malloc(sizeof(int) * n);
for(i=0;i<n;++i) scanf("%i",&array[i]);
for(i=0;i<n;++i){
if(i+1 == array[i]) array[i] = -1;
else{
if(array[i] < i+1){
for(j=0;j<i;++j){
if(array[i] == j+1){
if(array[j] == -1){
printf("No\n");
return 0;
}
else array[i] = array[j] = -1;
}
}
}
}
}
for(i=0;i<n;++i) if(array[i] != -1) break;
if(i == n) printf("Yes\n");
else printf("No\n");
return 0;
}
P.S.: I'm assuming this program takes one entry at each time ( rather than waiting for an 0 for signaling the end of input ).
What this code is supposed to do:
1) I'm assuming you've already read what's in this link.
2) After copying a sequence into an array, we must verify whether or not this sequence is valid.
So we use the following algorithm:
Iterate over the sequence, starting from the first element.
If element = element's index + 1 (because C arrays are zero-indexed), then element = -1.
Otherwise, if and only if element < element's index + 1: we look for a previous element for which (current element == previous element's index + 1) holds. If this element is found, both the current element and the previous element are changed to -1. If the previous element has already been changed (that is, it is already -1), then this is not a valid sequence.
If after iterating over the list like this any elements are still left, this is not a valid sequence.
Examples:
Example 1
Array: 5 4 3 2 1
5 : 5 > 0 + 1, skip. 4: 4 > 1 + 1, skip. 3: 3 == 2 + 1. Then 3 -> -1.
Array: 5 4 -1 2 1
2 : 2 < 3 + 1. 4 has an index of 1 and 1 + 1 = 2.
Array: 5 -1 -1 -1 1
1: 1 < 4 + 1. 5 has an index of 0 and 0 + 1 = 1.
Array: -1 -1 -1 -1 -1
Therefore this sequence is valid.
Example 2
Array: 5 4 1 2 3
5: 5 > 0 + 1, skip. 4: 4 > 1 + 1, skip. 1: 1 < 2 + 1. 5 has an index of 0.
Array: -1 4 -1 2 3
2: 2 < 3 + 1. 4 has an index of 1.
Array: -1 -1 -1 -1 3
3: 3 < 4 + 1. -1 ( at position 2 ) has an index of 2. 2 + 1 = 3.
Therefore the sequence is not valid.
Here is an example of an input where your code will give the wrong output:
5
3 4 2 5 1
Your description gave a translation of the code in English, but did not give insight into why that algorithm would solve the problem. So, I just went for a solution where an extra array is used for keeping track of the carriages that are in the station, which will have to function like a stack (First-in-last-out):
#include <stdio.h>
#include <stdlib.h>
int main(){
    int n, i;
    int carriageAtA = 1;
    int * array;
    int * station;
    int stationSize = 0;
    // Read input
    scanf("%i",&n);
    array = malloc(sizeof(int) * n);
    station = malloc(sizeof(int) * n);
    for(i=0;i<n;++i) scanf("%i",&array[i]);
    // Iterate the desired carriages in sequence
    for(i=0;i<n;++i) {
        // While the last one in the station is not what we need:
        while ((!stationSize || station[stationSize-1] != array[i]) && carriageAtA <= n) {
            printf("Move %i from A to station\n", carriageAtA);
            // Last carriage in station B is not what we need, so pull one in from A:
            station[stationSize] = carriageAtA;
            stationSize++; // There is now one more carriage in the station
            carriageAtA++; // This is the next carriage at A
        }
        if (!stationSize || station[stationSize-1] != array[i]) {
            // Could not find desired carriage at A nor at station. Give up.
            printf("No\n");
            return 0;
        }
        // Drive last carriage at station to B:
        printf("Move %i from station to B\n", array[i]);
        stationSize--;
    }
    printf("Yes\n");
    return 0;
}
The additional printf calls are just for getting a view of the process. Remove them when you are satisfied.
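For checking individual sequences without going through standard input, the same stack simulation can be wrapped in a function. This is only a sketch: the name isValidOrder and its return convention are assumptions, and it presumes the carriages arrive at A numbered 1 to n in increasing order, as the program above does.

```c
#include <assert.h>
#include <stdlib.h>

/* Returns 1 if carriages 1..n, arriving in increasing order, can leave in the
 * order given by `wanted` using a single stack (the station), 0 if they
 * cannot, and -1 if memory allocation fails. */
int isValidOrder(const int *wanted, int n)
{
    assert(wanted != NULL && n > 0);
    int *station = malloc(sizeof *station * (size_t)n);
    if (station == NULL)
        return -1;

    int stationSize = 0, carriageAtA = 1, ok = 1;
    for (int i = 0; i < n && ok; i++) {
        /* Pull carriages in from A until the wanted one is on top. */
        while ((stationSize == 0 || station[stationSize - 1] != wanted[i])
               && carriageAtA <= n)
            station[stationSize++] = carriageAtA++;
        if (stationSize == 0 || station[stationSize - 1] != wanted[i])
            ok = 0;             /* wanted carriage is buried or does not exist */
        else
            stationSize--;      /* drive it out to B */
    }
    free(station);
    return ok;
}
```

With this helper, the two examples from the question (5 4 3 2 1 and 5 4 1 2 3) give 1 and 0 respectively, and the counterexample 3 4 2 5 1 gives 1, which is the case the original code gets wrong.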

Generating Binary Inputs in C

Essentially, I have a program that has eight variables. I'm attempting to check every combination of truth values for these eight variables, and so I need a truth table using 0s and 1s that demonstrate every combination of them. These inputs will be read into the program.
It should look something like:
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 1
0 0 0 0 0 0 1 0
0 0 0 0 0 0 1 1
And so on...
How would I accomplish this in C?
I've opened up a file for writing, but I'm not sure how to... Logically do this.
Convert every integer from 0 to 2^8 - 1 into its 8-bit binary representation, and you have the required pattern.
Two loops and the character '0' are more than enough...
FILE *f = fopen("truth.txt", "w"); // we'll alter the truth...
assert(f != NULL);
unsigned i;
int j;
for (i = 0; i < 256; i++) {
    for (j = 0; j < 8; j++) {
        fprintf(f, "%c ", '0' + ((i >> (7 - j)) & 1));
    }
    fprintf(f, "\n");
}
fclose(f);

Resources