I have a binary file with variable-length records that looks roughly like this:
12 economic10
13 science5
14 music1
15 physics9
16 chemistry9
17 history2
18 anatomy7
19 physiology7
20 literature3
21 fiction3
16 chemistry7
14 music10
20 literature1
The course name is the only variable-length field in the file. The first number is the course code, which can be between 1 and 9999, and the second number is the department, which can be between 1 and 10.
As you can see, there is no space in the file between the course name and the department number.
The question is: how can I read this from the binary file? There is no field in the file that tells me the size of the string holding the course name.
I can read the first int (course id) fine, but how do I know the size of the course name?
Use fscanf() with the format string "%u %[a-z]%u", adding a field width to the %[a-z] conversion so that a long name cannot overflow the buffer.
Here's a complete example program:
#include <stdio.h>
#define NAME_LEN 64
int main(void)
{
    FILE * file = fopen("foo.txt", "rb");
    unsigned int course, department;
    char name[NAME_LEN];
    if (file == NULL)
        return 1;
    // %63[a-z] stores at most NAME_LEN - 1 letters; stop as soon as a record does not match
    while(fscanf(file, "%u %63[a-z]%u", &course, name, &department) == 3)
    {
        // do stuff with records
        printf("%u-%u %s\n", department, course, name);
    }
    fclose(file);
    return 0;
}
You'd need to know how the file was written out in the first place.
To read variable-length records, you should use some sort of convention, for example a special character that indicates the end of a record. Inside every record, you could use another special character to indicate the end of a field.
DO_READ read from file
is END_OF_RECORD char present?
yes: GOTO DO_PROCESS
no : GOTO DO_READ
DO_PROCESS read into buffer
is END_OF_FILE mark present?
yes: GOTO DOSOMETHINGWITHIT
no: GOTO DO_PROCESS
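As a rough illustration of that convention, the sketch below reads one record at a time up to an end-of-record character. The function name read_record and the choice of delimiter are assumptions made for the example; the file format in the question does not define either.

```c
#include <stdio.h>
#include <stddef.h>

/* Reads bytes from fp into buf until the character end_of_record or the
 * end of the file is reached. The delimiter is consumed but not stored.
 * Returns the number of bytes stored, or -1 if nothing could be read.
 * Bytes that do not fit in buf are discarded.
 * Hypothetical helper, for illustration only. */
long read_record(FILE *fp, char *buf, size_t cap, int end_of_record)
{
    size_t len = 0;
    int got_any = 0;
    int c;

    if (cap == 0)
        return -1;
    while ((c = fgetc(fp)) != EOF) {
        got_any = 1;
        if (c == end_of_record)
            break;
        if (len + 1 < cap)
            buf[len++] = (char)c;
    }
    buf[len] = '\0';
    return got_any ? (long)len : -1;
}
```

Calling read_record(fp, buf, sizeof buf, ';') in a loop until it returns -1 walks through the records; the fields inside each record could then be split on a second delimiter in the same way.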
As others have said this looks a lot like text, so a text parsing approach is likely to be the right way to go. Since this is homework, I'm not going to code it out for you, but here's the general approach I'd take:
Using fscanf, read the course code, and a combined string with the name and department code.
Starting from the end of the combined string, go backwards until you find the first non-digit. This is the end of the course name.
Read the integer that starts just beyond the end of the course name (i.e. at the last digit found while scanning backwards).
Replace the first character of that integer's part of the string with a NUL ('\0') - this terminates the combined string immediately after the course name. So all we have left in the combined string is the course name, and we have the course code and department code in integer variables.
Repeat for the next line.
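The backwards scan in the steps above might look roughly like this. It is only a sketch: the function name split_name_department is made up for the example, and it assumes the course name itself never ends in a digit.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Splits a combined token such as "music10" in place: on success the
 * string is cut down to "music" and *department receives 10.
 * Returns 0 on success, -1 if there are no trailing digits or no name.
 * Hypothetical helper, for illustration only. */
int split_name_department(char *combined, unsigned *department)
{
    size_t end = strlen(combined);
    size_t start = end;

    /* Walk backwards over the trailing digits. */
    while (start > 0 && isdigit((unsigned char)combined[start - 1]))
        start--;
    if (start == end || start == 0)
        return -1;

    *department = (unsigned)strtoul(combined + start, NULL, 10);
    combined[start] = '\0';   /* terminate the name where the digits began */
    return 0;
}
```

Because the whole run of trailing digits is taken as the department, "music1" and "music10" come out as departments 1 and 10 respectively.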
If there is a one to one correspondence between course code and course name (including department code), you can deduce the size of the course name from its code, with a predefined table somewhere in the code or in a configuration file.
If not, the main problem I see is to discriminate things like music1 and music10.
Assuming there are no carriage returns and each string is null terminated.
I have written a little program to create a binary file and then read it back, producing similar output.
// binaryFile.cpp
#include "stdafx.h"
#include <stdio.h>
#include <string.h>
#define BUFSIZE 64
int _tmain(int argc, _TCHAR* argv[])
{
FILE *f;
char buf[BUFSIZE+1];
// create dummy bin file
f = fopen("temp.bin","wb");
if (f)
{ // not writing all the data, just a few examples
sprintf(buf,"%04d%s\00",12,"economic10"); fwrite(buf,sizeof(char),strlen(buf)+1,f);
sprintf(buf,"%04d%s\00",13,"science5"); fwrite(buf,sizeof(char),strlen(buf)+1,f);
sprintf(buf,"%04d%s\00",14,"music1"); fwrite(buf,sizeof(char),strlen(buf)+1,f);
sprintf(buf,"%04d%s\00",15,"physics9"); fwrite(buf,sizeof(char),strlen(buf)+1,f);
fclose(f);
}
// read dummy bin file
f = fopen("temp.bin","rb");
if (f)
{
int classID;
char str[64];
char *pData;
long offset = 0;
do
{
fseek(f,offset,SEEK_SET);
pData = fgets(buf,BUFSIZE,f);
if (pData)
{ sscanf(buf,"%4d%63s",&classID,str); // str is already a pointer; %63s keeps the name within str[64]
printf("%d\t%s\r\n",classID,str);
offset +=strlen(pData)+1; // record + 1 null character
}
} while(pData);
fclose(f);
}
getchar();
return 0;
}
I have a file storing data of students in the following order:
id (space) name (space) address
Below is the content of the file:
10 john manchester
11 sam springfield
12 samuel glasgow
Each data is stored in a newline.
I want to search for the student with id 10 and display his/her details using the lseek command, however I'm not able to complete the task. Any help is appreciated.
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>
#include <string.h>
void main() {
char line[50] = "";
char id[2] = "";
ssize_t fd = open("file.dat", O_RDONLY);
while(read(fd,line,sizeof(line))>0){
if (id[0] == '1' && id[1] == '0'){
printf("%s\n",line);
}
lseek(fd, 1 ,SEEK_CUR);
}
close(fd);
}
Use the right tools for the task. Hammer for nails, Screwdriver for screws.
lseek is not the right tool here. lseek repositions the file offset, but you do not know the target offset yet: you are still searching for it, and once you have found the record you are already there, so there is nothing left to reposition.
Ask yourself,
What is your task:
search for a specific id
print the line if match
What do you have:
a dataset (textfile) with a fixed format (id <space> name <space> address <newline>)
Your dataset is separated by a newline, and the id is the first field of that row.
The keywords here are 'newline' and 'first field'.
The right procedure here would be:
read a whole line (fgets)
compare the first field (start of line) with the desired id (strcmp)
Example:
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
//maximum line length, including the terminating '\0'
#define MAX_LEN 50
int main(void) {
    //return value of main
    int ret = EXIT_FAILURE;
    //open filestream in read mode
    FILE *f = fopen("file.dat", "r");
    if (f == NULL) {
        perror("file.dat");
        return ret;
    }
    //string buffers (not const: fgets and sscanf write into them)
    char line[MAX_LEN];
    char field[MAX_LEN];
    //the id to search for
    const char *id = "10";
    //for each line
    while (fgets(line, MAX_LEN, f)) {
        //extract the first field ('%49s' matches up to 49 non-white-space characters)
        if (sscanf(line, "%49s", field) != 1)
            continue; //skip blank lines
        //compare the field with the desired id
        if (strcmp(field, id) == 0) {
            //if found print line
            printf("%s", line);
            //set result to success
            ret = EXIT_SUCCESS;
            //and exit
            break;
        }
    }
    //cleanup
    fclose(f);
    //return the result
    return ret;
}
Your file has a first line of 18 characters, a second line with the same number of characters, and a third with one fewer (17).
If you add a fourth line whose name, for example, gives it yet another length, it is simply appended to the file with no other structure.
Lines are delimited by \n characters, which can appear at any position, so the second line starts immediately after the first \n.
For this reason you cannot know in advance where each line begins: each line starts n + 1 bytes after the start of the previous one, where n is the number of characters in the previous line and the extra byte is its newline character.
You need an index: a file of fixed-length records that stores the starting position of each line in the data file. To read line i, you access the index at position (record_index_size * i) and get the starting position of line i. Then you go to the data file, position the file pointer at the value obtained from that lookup, and read the line with, for example, fgets(3).
To build the index, call ftell() right before each call to fgets(), because fgets() moves the file pointer and the position obtained afterwards would no longer be the start of the line. Write the position in a fixed-length format, e.g. binary form, with:
write(ix_fd, &position, sizeof position);
so the position of line i can be found by reading from the index at offset i * sizeof position.
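A minimal sketch of such an index follows. It uses stdio streams and fwrite() for the index instead of a raw descriptor and write(), and the function names build_index and read_line are invented for the example. It also assumes every line is shorter than 256 bytes.

```c
#include <stdio.h>

/* Writes the start offset of every line of data to index as fixed-size
 * long values. Returns the number of lines indexed, or -1 on error.
 * Assumes each line is shorter than 256 bytes. */
long build_index(FILE *data, FILE *index)
{
    char line[256];
    long count = 0;
    long position;

    rewind(data);
    for (;;) {
        position = ftell(data);            /* offset BEFORE reading the line */
        if (position < 0)
            return -1;
        if (fgets(line, sizeof line, data) == NULL)
            break;
        if (fwrite(&position, sizeof position, 1, index) != 1)
            return -1;
        count++;
    }
    return count;
}

/* Reads line number i (0-based) into buf by looking up its offset in the
 * index. Returns 0 on success, -1 on failure. */
int read_line(FILE *data, FILE *index, long i, char *buf, int cap)
{
    long position;

    if (fseek(index, i * (long)sizeof position, SEEK_SET) != 0)
        return -1;
    if (fread(&position, sizeof position, 1, index) != 1)
        return -1;
    if (fseek(data, position, SEEK_SET) != 0)
        return -1;
    return fgets(buf, cap, data) ? 0 : -1;
}
```

After one call to build_index, read_line(data, index, 2, buf, sizeof buf) jumps straight to the third line without scanning the two before it.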
I'm trying to read different data types on the same line of a text file, and currently trying to store them in their own arrays via a structure. I'm not sure if this is the best course of action to begin with, but the point is to read data from a file and manipulate it using different functions. I thought that if I could extract the data from the file and store it in arrays, I could send the arrays into functions with the arrays as their parameters. Here's what I have, and the problem explained within the main function:
Driver File:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "student_struct.c"
struct Student{
char name[50];
int id;
float gpa;
int age;
};
int main(){
FILE *fptr;
fptr = fopen("student_records.txt", "r");
if (fptr == NULL){
printf("Error opening file!\n");
exit(1);
}
struct Student students[100];
int i = 0;
while(!feof(fptr)){
//PROBLEM HERE. Data for what is expected to be in the "gpa" array is always 0.
fscanf(fptr, "%c %d %f %d", &students[i].name[i], &students[i].id, &students[i].gpa, &students[i].age);
i++;
}
fclose(fptr);
//Always prints "0.0000"
printf("GPA of student #2: %f\n", students[1].gpa);
//avgGPA(students.gpa);
return 0;
}
Function:
#include <stdio.h>
float avgGPA(float gpa[]){
int i;
float avgGPA = 0;
for(i = 0; i < sizeof(*gpa); i++){
avgGPA += gpa[i];
}
avgGPA = avgGPA / sizeof(*gpa);
printf("Average GPA: %f", avgGPA);
}
Text file:
David 1234 4.0 44
Sally 4321 3.6 21
Bob 1111 2.5 20
Greg 9999 1.8 28
Heather 0000 3.2 22
Keith 3434 2.7 40
Pat 1122 1.0 31
Ann 6565 3.0 15
Mike 9898 2.0 29
Steve 1010 2.2 24
Kristie 2222 3.9 46
My question is, how do I properly pull the data from the file and use it in different functions? Thank you for your help.
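The %c in fscanf needs to be changed to %s, and the matching argument should then be the array itself, students[i].name, rather than &students[i].name[i]. Refer to the fscanf man page for what each of the conversion specifiers means. Specifically: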
s
Matches a sequence of non-white-space characters; the next pointer must be a pointer to character array that is long enough to hold the input sequence and the terminating null byte ('\0'), which is added automatically. The input string stops at white space or at the maximum field width, whichever occurs first.
c
Matches a sequence of characters whose length is specified by the maximum field width (default 1); the next pointer must be a pointer to char, and there must be enough room for all the characters (no terminating null byte is added). The usual skip of leading white space is suppressed. To skip white space first, use an explicit space in the format.
In other words, %c by default only matches a single character. %s matches multiple non-white space characters (ie, colloquially a "word").
Other follow on questions you had:
but if the array in the structure "Student" is made of characters, why does it properly function with a string?
In C a string is defined as an array of characters terminated by a NUL (0).
At that end, why does that influence the rest of the operation?
%c will consume just one character, which means the next conversion specifier (%d in this case) will try to match the remaining part of the first word and fail.
Other best practices that are relevant should be applied. Specifically:
Always check the return value of function calls, fscanf in particular in this case. Had that been done, you would have seen that fscanf failed to match most of the conversion specifiers.
while !feof is always wrong. A full explanation of that is not provided here but please refer to other SO answers such as this.
Use a debugger to step through your code to help you examine the state of variables to better understand what the program is doing and where things go wrong.
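Putting those points together, a read loop driven by fscanf's return value could be sketched as follows. The struct matches the one in the question; the helper names read_students and average_gpa are assumptions made for the example.

```c
#include <stdio.h>
#include <stddef.h>

struct Student {
    char name[50];
    int id;
    float gpa;
    int age;
};

/* Reads up to max records of the form "name id gpa age" from fp.
 * The loop is driven by fscanf's return value rather than feof().
 * Returns the number of complete records read. */
size_t read_students(FILE *fp, struct Student *out, size_t max)
{
    size_t n = 0;

    while (n < max &&
           fscanf(fp, "%49s %d %f %d",
                  out[n].name, &out[n].id, &out[n].gpa, &out[n].age) == 4)
        n++;
    return n;
}

/* Averages the gpa field over the first n students; returns 0 if n == 0. */
float average_gpa(const struct Student *students, size_t n)
{
    float sum = 0.0f;
    size_t i;

    for (i = 0; i < n; i++)
        sum += students[i].gpa;
    return n ? sum / (float)n : 0.0f;
}
```

Passing the record count alongside the array avoids relying on sizeof inside the function, which cannot recover the number of elements from a pointer parameter.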
I understand how to read in a text file and scan/print the entire file, but how can a line be split into several strings? Also, can variables be assigned to those strings to be called later?
My code so far:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main()
{
FILE *fPointer;
fPointer = fopen("p1customer.txt", "r");
char singleLine[150];
int id;
while (!feof(fPointer)){
fscanf(fPointer, "%d", &id);
printf("%d",id);
}
fclose(fPointer);
return 0;
}
Example Text File to be read:
99999 John Doe Basketball
Example Output:
John Doe has ID number 99999 and plays Basketball
I am attempting to split/tokenize those strings and assign them variables (IDnumber, Name, Sport) and print the output in a new file.
You can use the library function strtok(str, chrs).
A sequence of calls of strtok(str, chrs) splits str into tokens, each delimited by a character from chrs.
The first call in a sequence has a non-NULL str. It finds the first token in str consisting of chars not in chrs; it terminates that token by overwriting the next character of str with \0 and returns a pointer to the token. Each subsequent call, indicated by a NULL value of str, returns a pointer to the next such token, searching from just past the end of the previous one. It returns NULL when no further token is found.
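Applied to the sample line from the question, that sequence of calls might look like the sketch below. The function name describe_customer is made up, and the layout of four white-space-separated fields (id, first name, last name, sport) is an assumption taken from the single sample line.

```c
#include <stdio.h>
#include <string.h>

/* Splits a line such as "99999 John Doe Basketball" with strtok and writes
 * "John Doe has ID number 99999 and plays Basketball" to out.
 * The line is modified in place. Returns 0 on success, -1 if a field is
 * missing. Hypothetical helper, for illustration only. */
int describe_customer(char *line, char *out, size_t cap)
{
    const char *delims = " \t\r\n";
    char *id    = strtok(line, delims);   /* first call: pass the string */
    char *first = strtok(NULL, delims);   /* later calls: pass NULL */
    char *last  = strtok(NULL, delims);
    char *sport = strtok(NULL, delims);

    if (!id || !first || !last || !sport)
        return -1;
    snprintf(out, cap, "%s %s has ID number %s and plays %s",
             first, last, id, sport);
    return 0;
}
```

Each line read with fgets can be passed to this function and the resulting string written to the output file with fputs or fprintf.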
You should post an example of the input file so that we can help in more detail.
I see you have also declared a string; I guess you want to fill it with something, but you did not specify what.
If you wanted to treat the file as a list of numbers, the sample of the code might be the following.
#include <stdio.h>
int main() {
FILE *infile;
char buf[100];
int len_file=0;
if(!(infile = fopen("p1customer.txt", "r"))) { /*checks the correct opening of the file*/
printf("Error in open p1customer.txt\n");
return 1;
}
while(fgets(buf,sizeof(buf),infile)!=NULL) /*count the number of rows in the file*/
len_file++;
int id[len_file];
int i=0;
rewind(infile);
while(fgets(buf,sizeof(buf),infile)!=NULL) {
sscanf(buf,"%d",&id[i]); /* %d rather than %i, so an id with a leading zero is not read as octal */
i++;
}
for(i=0;i<len_file;i++)
printf("%i\n",id[i]);
fclose(infile);
return 0;
}
If you want to treat each row as a list of an unknown number of values separated by spaces, you can parse the string with sscanf using the conversion %31[^ ], which reads characters until it encounters a space (at most 31 of them). You can also add a counter that is incremented for each value read.
Then you can refine the code by checking whether a line contains letters, using the isalpha function from ctype.h, and copying them into a string until you reach the terminating character '\0'.
The possibilities are endless, so it would be useful to have the input file. When you provide it, I'll update the answer.
Here is a minimal "working" example:
#include <stdio.h>
#include <stdlib.h>
int main (int argc, char* argv[])
{
int num = 10;
FILE* fp = fopen("test.txt", "r"); // test.txt contains character sequence
char* ptr = (char*) malloc(sizeof (char)*(num+1)); // +1 for '\0'
fread(ptr, sizeof(char), num, fp); // read bytes from file
ptr[num] = '\0';
printf("%s\n", ptr); // output: ´╗┐abcdefg
free(ptr);
fclose(fp);
return 0;
}
I would like to read some letters from a text file, containing all letters from the alphabet in a single line. I want my array to store the first 10 letters, but the first 3 shown in the output are weird symbols (see the comment at the printf statement).
What am I doing wrong?
The issue is that your file is encoded using UTF-8. While UTF-8 is backwards-compatible with ASCII (which is what your code will be using) there are many differences.
In particular, many programs put a BOM (byte order mark) at the start of the file. In UTF-16 it indicates the byte order; in UTF-8 it only serves as a signature marking the file as UTF-8, and it consists of the three bytes EF BB BF. If you print those bytes using the default Windows console code page, you get the three symbols you saw.
Whatever program you used to create your text file was automatically inserting that BOM at the start of the file. Notepad++ is notorious for doing this. Check the save options and make sure to save either as plain ASCII or as UTF-8 without BOM. That will solve your problem.
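If changing how the file is saved is not an option, the program can also check for the mark itself. A minimal sketch, assuming the file was just opened and nothing has been read yet (the function name skip_utf8_bom is invented for the example):

```c
#include <stdio.h>

/* If the stream starts with the UTF-8 byte order mark (EF BB BF), leaves
 * the read position just after it and returns 1. Otherwise rewinds to the
 * start of the stream and returns 0. */
int skip_utf8_bom(FILE *fp)
{
    unsigned char bom[3];
    size_t got = fread(bom, 1, sizeof bom, fp);

    if (got == 3 && bom[0] == 0xEF && bom[1] == 0xBB && bom[2] == 0xBF)
        return 1;
    rewind(fp);
    return 0;
}
```

Calling skip_utf8_bom(fp) right after fopen and before the fread in the question would make the first bytes read be the letters themselves.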
Alright I've been at this all day and can't for the life of me get this down, maybe you chaps can help. I have a file that reads as follows
1301,105515018,"Boatswain","Michael R.",ABC, 123,="R01"
1301,103993269,"Castille","Michael Jr",ABC, 123,="R03"
1301,103993267,"Castille","Janice",ABC, 123,="R03"
1301,104727546,"Bonczek","Claude",ABC, 123,="R01"
1301,104731479,"Cruz","Akeem Mike",ABC, 123,="R01"
1301,105415888,"Digiacomo","Stephen",ABC, 123,="R02"
1301,106034479,"Annitto Grassis","Susan",ABC, 123,="R04"
1301,106034459,"Als","Christian",ABC, 123,="R01"
And here is my code...
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAX_NAME 15
#define MAX_SUBSEC 3
#define N 128
//void printArr(struct *students);
struct student{
int term;
int id;
char lastname[MAX_NAME];
char firstname[MAX_NAME];
char subjectname[MAX_SUBSEC];
int catalog;
char section[MAX_SUBSEC];
}students[10];
int main(){
int term;
int id;
char lastname[MAX_NAME];
char firstname[MAX_NAME];
char sub[MAX_SUBSEC];
int cat;
char sec[MAX_SUBSEC];
char fname[N];
FILE *inputf;
printf("Enter the name of the text file: ");
scanf("%123s",fname);
strcat(fname,".txt");
inputf = fopen(fname,"r");
if (inputf == NULL){
printf("I couldn't open the file for reading.\n");
exit(0);
}
//TROUBLE HERE!
fscanf(inputf, "%d,%d,%[^,]s", &students[0].term, &students[0].id,students[0].lastname);
printf("%d\n", students[0].term);
printf("%d\n", students[0].id);
printf("%s\n", students[0].lastname);
/*for (j = 1 ; j <= 10-1 ; j++){
for(k = 0 ; k <= 10-2 ; k++){
if(students[k] > students[k+1]){
temp = students[k];
students[k] = students[k+1];
students[k+1] = temp;
}
}
}*/
fclose(inputf);
system("pause");
return 0;
}
void printArr(int a[], int tally){
int i;
for(i = 0 ; i < tally ; i++){
printf("%d ", a[i]);
}
printf("\n");
}
My objective is to take each one of those values in the text file and put it where it belongs in the struct, and subsequently the struct array, but I can't get past the first 2 ints.
When reading the lastname string, because it has a max of 15 characters, it spills over into the first name string right after it and takes whatever remaining characters it needs in order to fill up the lastname char array. Obviously I do not want this. As you can see I have tried strtok but it doesn't do anything; I'm not sure what I have to do, as I have never used it before. I have also tried just including all the variables in the fscanf statement, but I either get the same output or it becomes a mess. As it is, I am extremely lost: how do I get these values into the variables they belong to?!
EDIT: updated my code, I have gotten a little farther but not much. I can now print out just the last name but cannot move farther from there; I can't get to the firstname string or any of the variables beyond it.
What you have there is a CSV file with quoted strings, and so I would recommend you use a CSV parser (or roll your own) rather than trying to do it all with scanf (since scanf cannot deal with quotes, e.g. commas within quoted strings). A quick Google search turns up libcsv.c which you may be able to use in your project.
With the fscanf format string "%d,%d,\"%[^\"]\",\"%[^\"]\",%[^,],%d,=\"%[^\"]\"" we can read a whole line's data. Besides, you have to define
char lastname[MAX_NAME+1];
char firstname[MAX_NAME+1];
char subjectname[MAX_SUBSEC+1];
int catalog;
char section[MAX_SUBSEC+1];
The +1 accounts for the terminating '\0' character.
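A sketch of that format string in use is shown below. It differs from the suggestion above in two ways that are choices made for the example: it uses sscanf on one line at a time (as read by fgets), and it adds field widths so an over-long field cannot overflow the arrays. The function name parse_student is hypothetical.

```c
#include <stdio.h>

#define MAX_NAME 15
#define MAX_SUBSEC 3

struct student {
    int term;
    int id;
    char lastname[MAX_NAME + 1];
    char firstname[MAX_NAME + 1];
    char subjectname[MAX_SUBSEC + 1];
    int catalog;
    char section[MAX_SUBSEC + 1];
};

/* Parses one line of the sample file into *s.
 * Returns 1 if all seven fields were converted, 0 otherwise. */
int parse_student(const char *line, struct student *s)
{
    int n = sscanf(line,
                   "%d,%d,\"%15[^\"]\",\"%15[^\"]\",%3[^,],%d,=\"%3[^\"]\"",
                   &s->term, &s->id, s->lastname, s->firstname,
                   s->subjectname, &s->catalog, s->section);
    return n == 7;
}
```

A quoted field that itself contains a double quote would still defeat this format, which is where a real CSV parser becomes the better option.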
I have a question for you... If you want to know how to use a diamond cutter, do you try it and see, or do you consult the manual? The problem here isn't the result of your choice, but your choice itself. Believe it or not, I have answered these questions so often that I'm tired of repeating myself. The answer is all in the manual.
Read the POSIX 2004 scanf manual (or the POSIX 2008/2013 version), then answer this question, and you'll have some idea of what you're not doing that you should be. Even fscanf code should use assert as a debugging aid to ensure the number of items read was correct.
%[^,]s It seems as though there's a mistake here. Perhaps you meant %[^,]. The %[ format specifier is a different format specifier to the %s format specifier, hence in the presumably mistaken code there are two directives: %[^,] and s. The s directive tells scanf to read an 's' and discard it.
1. There is a syntax error in
while(result != NULL){
printf(".....);
......
}
}//error
2. fscanf(inputf, "%s", lastname); can't read a whole line; fscanf with %s stops when it comes across a space.
fscanf does not read one line at a time; it consumes input as directed by the format string and treats a newline like any other white space. Your file is formatted regularly enough, thanks to the comma separation, that a single format string can still capture the contents of each line (as long as none of the separated values contains a comma).
You can pass fscanf a format like you're doing with "%d" to capture an int, "%s" to capture a string (it ends at white space, so be wary of this when, for example, reading a name like "Annitto Grassis", which would require two %s conversions), and so on. You can be more precise and use a scanset, which looks like a regex character class but is not a regular expression, to define the characters you want captured, such as "Boatswain", a sequence made up of characters from the sets {A-Z}, {a-z}, and {"}. You'll want to scan the file until a conversion fails or you reach the end (signified by EOF in C), so compare the return value of fscanf with the number of fields you expect, and make sure the destination arrays are large enough for the longest field including its quotes:
while( fscanf(inputf, "%d,%d,%[\"A-Za-z ],%[\"A-Za-z .]%*[^\n]", &term, &id, lastname, firstname) == 4) {
.... //do something with term, id, lastname, firstname - put them in a student struct
//%*[^\n] discards the rest of the line so the next call starts at the next record
}
For more about regex, Mastering Regular Expressions by Jeffrey Friedl is a good book for learning about the topic.