I'm trying to do the following:
Open a file with up to 1000 ints in a text file, 1 int per line.
Read the file line by line
Store the ints in a dynamically allocated array of 1000
Print the contents of the array on a single line
Yes this is homework, I'm stuck on this.
Output should be:
$ 1 2 3 4 5 6 7 8 9 ....
What I have so far prints the integers on a new line after each iteration.
int x = 0;
char buf[1000];
int *array = (int *) malloc(1000 * sizeof(int));
FILE *fp = fopen("test.txt", "r");
while(fgets(buf, 1000, fp) != NULL) {
array[x] = buf;
printf("%s ", array[x]);
x++;
}
fclose(fp);
return 0;
}
Your code puts the buffer address in the integer array and then uses that as a string pointer for printf(), but you'll find it won't work if you try printing the array in a loop separate from the reading loop, because every element of the array holds this same buffer address. You should have been given a compiler warning about this.
This answer uses two loops, as you want to print the numbers after reading them.
int y;
while (x < 1000 && fgets(buf, sizeof buf, fp) != NULL)   // check capacity before reading a line
    array[x++] = atoi(buf);                               // convert the text to an int
for (y = 0; y < x; y++)
    printf("%d ", array[y]);
printf("\n");
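The two-loop idea above can also be packaged as a pair of small functions. This is only a sketch: the names read_ints and print_ints, the 64-byte line buffer, and the choice of strtol (so that blank or non-numeric lines can be skipped, which atoi cannot detect) are assumptions, not part of the original answer.

```c
#include <stdio.h>
#include <stdlib.h>

/* Reads one integer per line from fp into array (at most cap values).
   Lines that do not start with a number are skipped.
   Returns the number of integers stored. */
size_t read_ints(FILE *fp, int *array, size_t cap)
{
    char buf[64];   /* assumed to be long enough for one line */
    size_t n = 0;

    while (n < cap && fgets(buf, sizeof buf, fp) != NULL) {
        char *end;
        long v = strtol(buf, &end, 10);
        if (end == buf)          /* no digits on this line */
            continue;
        array[n++] = (int)v;
    }
    return n;
}

/* Prints the values on one line, each followed by a single space. */
void print_ints(const int *array, size_t n)
{
    for (size_t i = 0; i < n; i++)
        printf("%d ", array[i]);
    printf("\n");
}
```

With the array from the question allocated as malloc(1000 * sizeof *array), a call such as n = read_ints(fp, array, 1000) followed by print_ints(array, n) should give the single-line output asked for.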
I am trying to read arrays from multiple files, and the array size in each file is different. So what I do is, I try to count the number of lines in the file and then store that as the array size.
For example, I have two .txt files, File_1.txt and File_2.txt which contain the following data:
0.000 300.00
0.054 2623.3
1.000 300.00
0.000 300.00
0.054 2623.3
0.500 1500.0
1.000 300.00
respectively.
Here is the code that I use:
int main()
{
char filter[1024];
char filename[60];
FILE *fp;
double *T_SR, Z_SR;
for (int i = 1; i < 3; i++)
{
sprintf(filename, "File_%d.txt", i);
fp = fopen(filename, "r");
if (fp == NULL)
{
exit(1);
}
int count = 0;
for (int j = getc(fp); j != EOF; j = getc(fp))
{
if (j == '\n')
{
count = count + 1;
}
}
T_SR = (double *)malloc(count * sizeof(double));
Z_SR = (double *)malloc(count * sizeof(double));
for (int rows = 0; rows < count; rows++)
{
fscanf(fp, "%lf %lf", &Z_SR[rows], &T_SR[rows]);
printf("%lf %lf\n", Z_SR[rows], T_SR[rows]);
if (feof(fp))
{
break;
}
}
}
}
But instead of printing the given array as output, it prints this:
0.0000 0.0000
0.0000 0.0000
I checked the value of count, it's good. Maybe the problem is simple, but I am not able to find it. Can someone please help?
After you run through the whole file with getc, the file position indicator is at the end of the file. You must set it back to the beginning before you use fscanf; you can use rewind for that.
rewind(fp); //<--
for (int rows = 0; rows < count; rows++)
{
//...
}
Aside from that, other problems exist, as Jaberwocky pointed out, among them a memory leak and the fact that you don't close your files or check the return value of malloc. Here's how your code could look (with comments):
double *T_SR, *Z_SR; // fix the pointer issue
//...
char line[1024]; // make sure it's larger than the largest line in the file
while (fgets(line, sizeof line, fp)) // fixes the count issue
{
// doesn't count empty lines, if there are any
if (line[0] != '\n')
{
count++;
}
}
if(count > 0)
{
T_SR = malloc(count * sizeof *T_SR);
Z_SR = malloc(count * sizeof *Z_SR);
if(T_SR == NULL || Z_SR == NULL) // check memory allocation
{
perror("malloc");
return EXIT_FAILURE;
}
rewind(fp);
for(int rows = 0; rows < count && fscanf(fp, "%lf%lf", &Z_SR[rows], &T_SR[rows]) == 2; rows++) // never write past the allocated rows
{
printf("%lf %lf\n", Z_SR[rows], T_SR[rows]);
}
free(T_SR); // free the memory, avoids memory leaks
free(Z_SR);
}
fclose(fp); // and close the file
//...
There are several bugs:
The most important one is the rewind issue that has been addressed in anastaciu's answer.
double * T_SR, Z_SR is wrong, it should be double * T_SR, *Z_SR. I wonder actually if the code you posted is the code you compile.
Your line-counting method is flawed. If the last line of the file does not end with a \n, count comes out one too low (2 instead of 3 for File_1.txt) and you'll miss the last line.
fscanf returns the number of items read or EOF. If you had checked that, you might have found the problem in your code yourself.
The feof check is done too late: if fscanf encounters an EOF, you still print values that have not been read due to the EOF condition.
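The line-counting point can be illustrated with a counter that also handles a final line with no trailing newline. This is a sketch only; the function name count_lines and the decision to rewind the stream before and after counting are assumptions made for the example.

```c
#include <stdio.h>

/* Counts the lines in fp, including a final line that lacks a trailing '\n'.
   The stream is rewound before counting and again before returning, so the
   caller can start reading the data immediately afterwards. */
size_t count_lines(FILE *fp)
{
    size_t count = 0;
    int c, last = '\n';

    rewind(fp);
    while ((c = getc(fp)) != EOF) {
        if (c == '\n')
            count++;
        last = c;
    }
    if (last != '\n')   /* unterminated final line */
        count++;
    rewind(fp);
    return count;
}
```

An empty file yields 0, because last keeps its initial value and no extra line is added.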
I try to count the number of lines in the file and then store that as the array size.
Aside from the key rewind() issue, avoid using one method to count the lines and a different one to read the doubles. It is far too easy to end up with a line count that does not match the number of lines that actually parse as two doubles.
Use one approach to find both.
size_t read_SR(size_t count, double *Z_SR, double *T_SR, FILE *inf) {
char line[100];
rewind(inf);
size_t rows = 0; // must start at zero; it is read before being assigned otherwise
while (fgets(line, sizeof line, inf)) {
double Z, T;
if (sscanf(line, "%lf %lf", &Z, &T) != 2) return rows;
if (rows < count) {
if (Z_SR) Z_SR[rows] = Z;
if (T_SR) T_SR[rows] = T;
}
rows++;
}
return rows;
}
Usage
// First pass, find size
size_t count = read_SR(0, NULL, NULL, inf);
double *T_SR = malloc(sizeof *T_SR * count);
double *Z_SR = malloc(sizeof *Z_SR * count);
// 2nd pass, save data
read_SR(count, Z_SR, T_SR, inf);
I need to read in a file called "data.txt" and store the first input as a value and the second corresponding input as a weight. I'm having issues reading them in and storing the values.
data.txt (example)
3 25
2 20
1 15
4 40
5 50
This is what I've started with:
FILE *myFile;
myFile=fopen("data.txt", "r");
int val[20]={0}; //initialize value array to zero
int wt[20]={0};
int W=80; //Set capacity to 80
int i;
int n;
while(!feof(myFile)){
fscanf(myFile, "%1d%1d", &val[i], &wt[i]);
}
n = sizeof(val)/sizeof(val[0]);
printf("%d", knapSack(W, wt, val, n));//prints out the maximum value
fclose(myFile);
return 0;
I've edited the above code to the following:
FILE *myFile;
myFile=fopen("data.txt", "r");
int val[20]={0};
int wt[20]={0};
int W=80; //Set capacity to 80
int i;
int n;
for(i=0;i<sizeof(val);i++){
fscanf(myFile, "%1d%1d", &wt[i],&val[i]);
}
n = sizeof(val)/sizeof(val[0]);
printf("%d", knapSack(W, wt, val, n));//prints out the maximum value
fclose(myFile);
return 0;
It keeps outputting 55 when I use the inputs from the data.txt example.
The biggest problem you are having is you are not controlling your read-loop with the return of the read itself. For example, in your case you would want:
int i = 0;
while (i < 20 && fscanf(myFile, "%d%d", &wt[i], &val[i]) == 2) /* %d, not %1d (which reads one digit); 20 is the array size */
i++;
At the end of your read, i would hold the number of elements read into your arrays.
(note: you cannot use any input function correctly unless you check the return...)
Instead of reading the values into separate arrays, whenever you are coordinating multiple values as a single object (e.g. each val and wt pair), you should be thinking struct. That allows you to coordinate both values as a single object.
A simple example in your case could be:
#include <stdio.h>
#define MAXVAL 20 /* if you need a constant, #define one (or more) */
typedef struct { /* struct with int val, wt + typedef for convenience */
int val, wt;
} mydata;
int main (int argc, char **argv) {
size_t n = 0; /* number of elements read */
mydata arr[MAXVAL] = {{ .val = 0 }}; /* array of mydata */
/* use filename provided as 1st argument (stdin by default) */
FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;
if (!fp) { /* validate file open for reading */
perror ("file open failed");
return 1;
}
/* read all pairs of values in file into array */
while (n < MAXVAL && fscanf (fp, "%d %d", &arr[n].val, &arr[n].wt) == 2)
n++;
if (fp != stdin) /* close file if not stdin */
fclose (fp);
for (size_t i = 0; i < n; i++) /* output values */
printf ("arr[%zu] %2d %2d\n", i, arr[i].val, arr[i].wt);
}
Above, the code does the same as I suggested in conditioning the read-loop on successfully reading a pair of values from the file. The only difference is that it coordinates the val and wt values in a struct.
Example Use/Output
With your data in the file dat/val_wt.txt, you would receive the following output:
$ ./bin/read_val_wt dat/val_wt.txt
arr[0] 3 25
arr[1] 2 20
arr[2] 1 15
arr[3] 4 40
arr[4] 5 50
While above we read directly with fscanf, you can make your read a bit more robust by reading each line into a character array first, and then parsing the wanted values from the character array with sscanf. You are essentially doing the same thing, but by using fgets/sscanf you can make an independent validation of (1) the read of the line; and (2) the parse of the wanted information from the line. If you have a malformed-line, it prevents the matching-failure from impacting the read of the remaining lines in the input file.
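A minimal sketch of that fgets/sscanf approach follows. To stay close to the question it fills two plain int arrays rather than the struct used above; the function name read_pairs, the 128-byte line buffer, and the policy of silently skipping malformed lines are assumptions.

```c
#include <stdio.h>

/* Reads "val wt" pairs, one pair per line, into the two arrays
   (at most cap pairs). Lines that do not parse as two integers are
   skipped, so one bad line does not stop the rest of the file from
   being read. Returns the number of pairs stored. */
size_t read_pairs(FILE *fp, int *val, int *wt, size_t cap)
{
    char line[128];   /* assumed to be longer than any input line */
    size_t n = 0;

    while (n < cap && fgets(line, sizeof line, fp)) {
        int v, w;
        if (sscanf(line, "%d %d", &v, &w) != 2)   /* validate the parse */
            continue;
        val[n] = v;
        wt[n] = w;
        n++;
    }
    return n;
}
```

The returned count can then be used as n for the knapSack call instead of sizeof(val)/sizeof(val[0]), which is always 20.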
Look things over and let me know if you have further questions.
Oops, many little problems here...
First, even if unrelated, you consistently fail to check the result of input functions. That can hide problems...
Next, the rule is: when you do not get what you expect, trace intermediate values.
Had you added these lines:
// uncomment next block for debugging
printf("n=%d\n", n);
for (i = 0; i < n; i++) {
printf("%d %d\n", wt[i], val[i]);
}
You would have seen
n = 20
3 2
5 2
2 0
1 1
5 4
4 0
5 5
0
showing that:
n was 20 (unsure whether you expected it)
you had read your values one digit at a time instead of one integer value (because of the %1d formats)
My advice:
for (i = 0; i < (int)(sizeof(val) / sizeof(val[0])); i++) { // do not try to read more than array capacity (element count, not byte count)
if (2 != fscanf(myFile, "%d%d", &wt[i], &val[i])) break; // stop when no more data
}
n = i; // number of actual values
// uncomment next block for debugging
/*
printf("n=%d\n", n);
for (i = 0; i < n; i++) {
printf("%d %d\n", wt[i], val[i]);
}
*/
I want to read lines of a text file, and the content of it is as below.
first
second
third
I've already written some code, but the result was different from what I expected. I hope you can help me a little. (code below)
/*
content of test.txt
first
second
third
*/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main() {
// double pointer for two string lines
// and the two string lines will be pointed by (char *) pointer
FILE *fp = fopen("test.txt", "r");
char **lst = malloc(3 * sizeof(char *)); // corrected 2 -> 3
// read lines until feof(fp) is not Null
// Only 2 lines will be read by this loop
int i = 0;
while (!feof(fp)) {
char line[10];
fgets(line, 10, fp);
*(lst + i) = line;
printf("%d : %s", i, *(lst + i));
i++;
}
/*
Result:
0 : first
1 : second
2 : third // Why was this line printed here?
There are 3 lines, not 2 lines!
*/
printf("\n\n");
int j;
for (j = 0; j < i; j++) {
printf("%d : %s\n", j, *(lst + j));
}
/*
result:
0 : third
1 : third
2 : third
The whole lines were not printed correctly!
*/
free(lst);
return 0;
}
Expected output:
0 : first
1 : second
2 : third
Many thanks.
First and foremost, you are allocating space for an array of two char*s and you have a single statically sized buffer for a string. But you’re attempting to read three strings. Where do you think the space for the strings is coming from? You’re not allocating it.
You need to make your various numbers match up: allocate an array of three strings, and then allocate three string buffers:
char **lst = malloc(3 * sizeof *lst);
for (int i = 0; i < 3; i++) {
lst[i] = malloc(10);
fgets(lst[i], 10, fp);
}
And don’t forget to free all allocated buffers subsequently:
for (int i = 0; i < 3; i++) {
free(lst[i]);
}
free(lst);
… of course this code isn’t terribly great either since it hard-codes the number of lines you can read, and the maximum line length. But it should get you started.
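One way to lift the hard-coded line count is to grow the pointer array with realloc as lines arrive. The following is a sketch under stated assumptions: the names read_lines and free_lines are made up for illustration, the 256-byte buffer still caps the line length (longer lines would be split), and the trailing newline is stripped from each stored line.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Reads every line of fp into its own heap buffer.
   Returns an array of *out_n strings (newline stripped), or NULL on
   allocation failure or when no line was read.
   Assumption: lines longer than sizeof line - 1 characters are split. */
char **read_lines(FILE *fp, size_t *out_n)
{
    char line[256];
    char **lst = NULL;
    size_t n = 0, cap = 0;

    while (fgets(line, sizeof line, fp)) {
        line[strcspn(line, "\n")] = '\0';      /* drop the newline */

        if (n == cap) {                         /* grow the pointer array */
            size_t new_cap = cap ? cap * 2 : 4;
            char **tmp = realloc(lst, new_cap * sizeof *tmp);
            if (!tmp)
                goto fail;
            lst = tmp;
            cap = new_cap;
        }
        size_t len = strlen(line) + 1;
        lst[n] = malloc(len);                   /* one buffer per line */
        if (!lst[n])
            goto fail;
        memcpy(lst[n], line, len);
        n++;
    }
    *out_n = n;
    return lst;

fail:
    while (n > 0)
        free(lst[--n]);
    free(lst);
    *out_n = 0;
    return NULL;
}

/* Releases everything read_lines allocated. */
void free_lines(char **lst, size_t n)
{
    for (size_t i = 0; i < n; i++)
        free(lst[i]);
    free(lst);
}
```

Because each line gets its own buffer, printing lst[0] through lst[n - 1] after the read loop shows three different strings rather than the last line three times.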
I am trying to read strings from file and insert them into matrix. Every line is one word.
FILE *fp = fopen("zadanie4.txt","r");
if( fp == NULL)
{
perror("Error while opening the file.\n");
exit(EXIT_FAILURE);
}
int symbol, num_of_lines = 0, len_of_string = 0, max_len = 0;
do {
symbol = fgetc(fp);
len_of_string++;
if (symbol == '\n' || feof(fp)) {
num_of_lines++;
if(len_of_string > max_len){
max_len = len_of_string;
}
len_of_string = 0;
}
} while (symbol != EOF);
fclose(fp);
printf("Number of words: %d\n", num_of_lines);
printf("Longest word: %d\n", max_len);
fp = fopen("zadanie4.txt","r");
char (*arr)[num_of_lines] = calloc(num_of_lines, sizeof(char*) * max_len);
int index = 0;
while(fscanf(fp, "%s", arr[index++]) == 1) {
printf("%s\n", arr[index - 1]); //first check to see what is written into array
}
close(fp);
printf("--------------------------\n");
int i;
for(i = 0; i < num_of_lines; i++){
printf("%s\n", arr[i]); //second check
}
I find out size of longest string and allocate memory for number of strings * longest string.
Here is how output looks like if longest word is 5 (+1 for empty '\0'):
Number of words: 6
Longest word: 6
AAAAA
BBBBB
CCCCC
DDDDD
EEEEE
FFFFF
--------------------------
AAAAA
BBBBB
CCCCC
DDDDD
EEEEE
FFFFF
If I add another char to every line:
Number of words: 6
Longest word: 7
AAAAAa
BBBBBb
CCCCCc
DDDDDd
EEEEEe
FFFFFf
--------------------------
AAAAAaBBBBBbCCCCCcDDDDDdEEEEEeFFFFFf
BBBBBbCCCCCcDDDDDdEEEEEeFFFFFf
CCCCCcDDDDDdEEEEEeFFFFFf
DDDDDdEEEEEeFFFFFf
EEEEEeFFFFFf
FFFFFf
Note: Every string is same size in this example, but I want it to work for various sizes.
Can anybody aid me how to properly allocate memory for this array?
Your declaration and allocation of arr are wrong and don't match.
You declare arr as a pointer to an array of num_of_lines characters. Then you allocate max_len number of pointers to character.
You can use variable-length arrays for both the outer and inner arrays:
char arr[num_of_lines][max_len + 1]; // +1 for string terminator
No need for dynamic allocation at all.
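If the word list could be too large for the stack, the same matrix layout can also be placed on the heap through a pointer to an array of width characters, which is presumably what the original declaration was aiming for. A sketch, assuming a compiler that supports variably modified types (GCC and Clang do); the function name load_words, the 256-byte line buffer, and the choice to skip words that do not fit are illustrative assumptions.

```c
#include <stdio.h>
#include <stdlib.h>   /* calloc/free, used by the caller shown in the note */
#include <string.h>

/* Fills an n-by-width character matrix with one line of fp per row.
   width must include room for the '\0' terminator.
   Lines that would not fit in a row are skipped.
   Returns the number of rows filled. */
size_t load_words(FILE *fp, size_t n, size_t width, char (*arr)[width])
{
    size_t i = 0;
    char line[256];

    while (i < n && fgets(line, sizeof line, fp)) {
        line[strcspn(line, "\r\n")] = '\0';   /* strip the line ending */
        if (strlen(line) >= width)            /* would overflow the row */
            continue;
        strcpy(arr[i++], line);
    }
    return i;
}
```

The caller would allocate the matrix in one block with char (*arr)[width] = calloc(n, sizeof *arr); and release it with a single free(arr). Here the row length is width, not num_of_lines, so each row has space for the longest word plus its terminator.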
I believe you don't really understand how calloc works. You can't assign memory to the members of an array using
char (*arr)[num_of_lines] = calloc(num_of_lines, sizeof(char*) * max_len);
The first parameter of calloc will be the amount of blocks to allocate for an array and the second parameter will be the size of each block in bytes.
Instead you want to define the array like this
char *arr[num_of_lines];
Now you need to loop through each member and allocate memory for it
for(int i = 0; i < num_of_lines; i++) {
arr[i] = malloc(sizeof(char) * max_len);
}
Alternatively, since num_of_lines and max_len are already known at that point, you can declare the whole thing as a two-dimensional array (a variable-length array) instead of allocating each row.
You're seeing that behavior because you are over-writing the string terminator (null character == 0) at the end of each string when you add the extra character.
Instead of using one contiguous block of memory, you'll want to keep an array of string pointers, and allocate each string separately:
char **strings = calloc(NUMBER_OF_STRINGS, sizeof *strings);   // array of string pointers
for (int i = 0; i < NUMBER_OF_STRINGS; i++) {
    strings[i] = calloc(MAX_STRING_LENGTH + 1, sizeof(char));  // +1 for the terminator
}
Then, when you want to add a character to a string, use strcat if the new total is less than the max string length:
strcat(strings[i],"-suffix");
Or, if it is longer, you'll need to reallocate the storage:
strings[i] = realloc(strings[i], MAX_STRING_LENGTH+EXTRA_BYTES+1); // in real code, check the result: realloc returns NULL on failure
strcat(strings[i],"-suffix");
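The realloc-then-append step can be wrapped so that a failed reallocation does not lose the original string. A minimal sketch; the helper name append_suffix is hypothetical, and it sizes the new buffer from the actual string lengths rather than from fixed constants.

```c
#include <stdlib.h>
#include <string.h>

/* Appends suffix to the heap-allocated string *s, growing the buffer.
   Returns 0 on success; on failure returns -1 and leaves *s untouched. */
int append_suffix(char **s, const char *suffix)
{
    size_t old_len = strlen(*s);
    size_t add_len = strlen(suffix);
    char *tmp = realloc(*s, old_len + add_len + 1);   /* +1 for '\0' */

    if (tmp == NULL)
        return -1;
    memcpy(tmp + old_len, suffix, add_len + 1);       /* copies the '\0' too */
    *s = tmp;
    return 0;
}
```

A call such as append_suffix(&strings[i], "-suffix") would then replace the realloc/strcat pair.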
I have a txt file consisting of tab-separated data of type double. The data file is over 10 GB, so I just wish to read the data line by line and then do some processing. In particular, the data is laid out as a matrix with, say, 1001 columns and millions of rows. Below is just a fake sample to show the layout.
10.2 30.4 42.9 ... 3232.000 23232.45
...
...
7.234 824.23232 ... 4009.23 230.01
...
For each line I'd like to store the first 1000 values in an array, and the last value in a separate variable. I am new to C, so it would be nice if you could kindly point out major steps.
Update:
Thanks for all valuable suggestions and solutions. I just figured out one simple example where I just read a 3-by-4 matrix row by row from a txt file. For each row, the first 3 elements are stored in x, and the last element is stored in vector y. So x is a n-by-p matrix with n=p=3, y is a 1-by-3 vector.
Below is my data file and my code.
Data file:
1.112272 -0.345324 0.608056 0.641006
-0.358203 0.300349 -1.113812 -0.321359
0.155588 2.081781 0.038588 -0.562489
My code:
#include<math.h>
#include <stdlib.h>
#include<stdio.h>
#include <string.h>
#define n 3
#define p 3
void main() {
FILE *fpt;
fpt = fopen("./data_temp.txt", "r");
char line[n*(p+1)*sizeof(double)];
char *token;
double *x;
x = malloc(n*p*sizeof(double));
double y[n];
int index = 0;
int xind = 0;
int yind = 0;
while(fgets(line, sizeof(line), fpt)) {
//printf("%d\n", sizeof(line));
//printf("%s\n", line);
token = strtok(line, "\t");
while(token != NULL) {
printf("%s\n", token);
if((index+1) % (p+1) == 0) { // the last element in each line;
yind = (index + 1) / (p+1) - 1; // get index for y vector;
sscanf(token, "%lf", &(y[yind]));
} else {
sscanf(token, "%lf", &(x[xind]));
xind++;
}
//sscanf(token, "%lf", &(x[index]));
index++;
token = strtok(NULL, "\t");
}
}
int i = 0;
int j = 0;
puts("Print x matrix:");
for(i = 0; i < n*p; i++) {
printf("%f\n", x[i]);
}
printf("\n");
puts("Print y vector:");
for(j = 0; j < n; j++) {
printf("%f\t", y[j]);
}
printf("\n");
free(x);
fclose(fpt);
}
With above, hopefully things will work if I replace data_temp.txt with my raw 10 GB data file (of course change values of n,p, and some other code wherever necessary.)
I have additional questions that I wish if you could help me.
I first initialized char line[] as char line[(p+1)*sizeof(double)] (note: not multiplied by n). But the line cannot be read completely. How could I assign memory JUST for one single line? What's the length? I assume it's (p+1)*sizeof(double) since there are (p+1) doubles in each line. Should I also assign memory for \t and \n? If so, how?
Does the code look reasonable to you? How could I make it more efficient since this code will be executed over millions of rows?
If I don't know the number of columns or rows in the raw 10 GB file, how could I quickly count rows and columns?
Again I am new to C, any comments are very appreciated. Thanks a lot!
1st way
Read file in chunks into preallocated buffer using fread.
2nd way
Map the file into your process's address space using mmap, then walk a pointer over the mapped data.
3rd way
Since your file is delimited by lines, open the file with fopen, use setvbuf or similar to set a buffer size greater than about 10 lines or so, then read the file line-by-line using fgets.
To potentially read the file even faster, use open with O_DIRECT (assuming Linux), then use fdopen to get a FILE * for the open file, then use setvbuf to set a page-aligned buffer. Doing that will allow you to bypass the kernel page cache - if your system's implementation works successfully using direct IO that way. (There can be many restrictions to direct IO)
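For the fgets-based third way, each line still has to be split into doubles. One point worth noting is that the line buffer must be sized for the text of the numbers plus the separators, not for sizeof(double) per value, since a double written as text can take far more than 8 characters. The sketch below parses one line with strtod; the function name parse_row and the rule that extra numeric data makes the line invalid are assumptions.

```c
#include <stdio.h>
#include <stdlib.h>

/* Parses one text line holding p + 1 whitespace-separated doubles.
   Stores the first p values in x and the last one in *y.
   Returns 0 on success, -1 if the line holds too few values or has
   further numeric data after the last expected value. */
int parse_row(const char *line, double *x, size_t p, double *y)
{
    const char *s = line;
    char *end;

    for (size_t i = 0; i <= p; i++) {
        double v = strtod(s, &end);   /* skips leading tabs and spaces */
        if (end == s)                 /* no number found */
            return -1;
        if (i < p)
            x[i] = v;
        else
            *y = v;
        s = end;
    }
    strtod(s, &end);                  /* another number here is extra data */
    return end == s ? 0 : -1;
}
```

A caller would read each line with fgets into a buffer sized generously for p + 1 printed values (how many characters to allow per value depends on the data and is left as an assumption) and pass it to parse_row, processing x and y before reading the next line so that only one row is held in memory at a time.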
Something to get you started: Reading 1 line
#define COLUMN (1000+1)
double data[COLUMN]; // same macro name as the #define above
for (int i = 0; i< COLUMN; i++) {
char delim = '\n';
int cnt = fscanf(in_stream, "%lf%c", &data[i], &delim);
if (cnt < 1) {
if (cnt == EOF && i == 0) return 0; // None read, OK as end of file
puts("Missing or bad data");
return -1; // problem
}
if (delim != '\t') {
// If tab not found, should be at end of line
if (delim == '\n' && i == COLUMN-1) {
return COLUMN; // Success
}
puts("Bad delimiter");
return -1;
}
}
puts("Extra data");
return -1;