Finding a string in a file using the lseek command in C

I have a file storing data of students in the following order:
id (space) name (space) address
Below is the content of the file:
10 john manchester
11 sam springfield
12 samuel glasgow
Each data is stored in a newline.
I want to search for the student with id 10 and display his/her details using the lseek command, but I'm not able to complete the task. Any help is appreciated.
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>
#include <string.h>
void main() {
char line[50] = "";
char id[2] = "";
ssize_t fd = open("file.dat", O_RDONLY);
while(read(fd,line,sizeof(line))>0){
if (id[0] == '1' && id[1] == '0'){
printf("%s\n",line);
}
lseek(fd, 1 ,SEEK_CUR);
}
close(fd);

Use the right tools for the task. Hammer for nails, Screwdriver for screws.
lseek is not the right tool here: it repositions the file offset, and you do not yet know which offset you want. You are searching for a position, and once you have found it you are already there, so there is nothing left to reposition.
Ask yourself,
What is your task:
search for a specific id
print the line if match
What do you have:
a dataset (textfile) with a fixed format (id <space> name <space> address <newline>)
Your dataset is separated by a newline, and the id is the first field of that row.
The keywords here are 'newline' and 'first field'.
The right procedure here would be:
read a whole line (fgets)
compare the first field (start of line) with the desired id (strcmp)
Example:
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
int main() {
//return value of main
int ret = EXIT_FAILURE;
//open filestream in read mode
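FILE *f = fopen("file.dat", "r");
//bail out early if the file could not be opened
if (f == NULL) {
perror("file.dat");
return ret;
}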
//string buffer
#define MAX_LEN 50
char line[MAX_LEN];
char field[MAX_LEN];
//the id to search for
const char *id = "10";
//for each line
while (fgets(line, MAX_LEN, f)) {
//extract the first field ('%s' matches a sequence of non-white-space characters)
sscanf(line, "%s", field);
//compare the field with the desired id
if (strcmp(field, id) == 0) {
//if found print line
printf("%s", line);
//set result to success
ret = EXIT_SUCCESS;
//and exit
break;
}
}
//cleanup
fclose(f);
//return the result
return ret;
}

Your file has a first line with 18 characters, a second line with the same number of characters, and a third line with one fewer (17).
If you add a fourth line whose name, for example, has a different length, it is simply appended to the file without any other structure.
Lines are delimited by \n characters, which can appear at any point, so the second line starts immediately after the first \n.
For this reason you cannot compute where a given line begins without reading everything before it: each line starts n + 1 bytes after the start of the previous line, where n is the number of characters in the previous line and the extra byte is its newline character.
You need an index: a second file of fixed-length records that stores the starting position of each line in the data file. To read line i, you seek to position (index_record_size * i) in the index and read the starting position of line i. Then you seek to that position in the data file and read the line with, for example, fgets(3).
To build the index, call ftell() right before each call to fgets(): fgets() moves the file position, so a value taken afterwards would be the start of the next line instead. Write each position in a fixed-length format, e.g. in binary form, with:
write(ix_fd, &position, sizeof position);
so the position of line i can be found by reading from the index at offset i * sizeof position.
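The index idea above could be sketched roughly as follows. This is only an illustration under some assumptions: the helper names build_index and read_indexed_line are made up for this example, the index is written with fwrite() on a FILE * rather than write() on a descriptor, both streams are opened in binary mode so that ftell() offsets are plain byte counts, and no line is longer than 255 bytes.

```c
#include <stdio.h>

/* Hypothetical helper: store the byte offset of every line of `data`
   in `index`, one fixed-size binary record (a long) per line.
   Returns the number of lines indexed, or -1 on error. */
long build_index(FILE *data, FILE *index)
{
    char line[256];              /* assumption: lines fit in 255 bytes */
    long count = 0;

    for (;;) {
        long pos = ftell(data);  /* offset BEFORE fgets() moves it */
        if (pos < 0)
            return -1;
        if (fgets(line, sizeof line, data) == NULL)
            break;               /* end of file (or read error) */
        if (fwrite(&pos, sizeof pos, 1, index) != 1)
            return -1;
        count++;
    }
    return count;
}

/* Hypothetical helper: read line number i (0-based) into buf by
   looking up its offset in the index. Returns 0 on success, -1 otherwise. */
int read_indexed_line(FILE *data, FILE *index, long i, char *buf, int size)
{
    long pos;

    if (fseek(index, i * (long)sizeof pos, SEEK_SET) != 0)
        return -1;
    if (fread(&pos, sizeof pos, 1, index) != 1)
        return -1;               /* no such record in the index */
    if (fseek(data, pos, SEEK_SET) != 0)
        return -1;
    return fgets(buf, size, data) != NULL ? 0 : -1;
}
```

With the three-line file from the question, build_index() would write three records, and read_indexed_line(data, index, 2, buf, sizeof buf) would fetch the "12 samuel glasgow" line without reading the first two. Note that this finds a line by its number, not by its id; looking up id 10 still needs either a search or an index keyed by id.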

Related

`C`: String not being updated properly when reading from a CSV file

I am given a text file of movie showtime information. I have to format the information in a clean way. Right now I'm just trying to get each line's information saved into strings. However, when getting the movie's rating, the array won't save the rating properly.
This is the main code.
#include <string.h>
#include <stdio.h>
#include <stdlib.h>
int main(void) {
const int MAX_TITLE_CHARS = 44; // Maximum length of movie titles
const int LINE_LIMIT = 100; // Maximum length of each line in the text file
char line[LINE_LIMIT];
char inputFileName[25];
FILE *file;
file = fopen("D:\\movies.txt", "r");
char currentLine[LINE_LIMIT];
char movieTitle[MAX_TITLE_CHARS];
char movieTime[10];
char movieRating[10];
fgets(currentLine, LINE_LIMIT, file); // Get first file
while(!feof(file)){
sscanf(currentLine, "%[^,],%44[^,],%s", movieTime, movieTitle, movieRating);
printf("%s\n", movieRating);
fgets(currentLine, LINE_LIMIT, file); // Get next file
}
return 0;
}
This is the CSV file
16:40,Wonders of the World,G
20:00,Wonders of the World,G
19:00,Journey to Space ,PG-13
12:45,Buffalo Bill And The Indians or Sitting Bull's History Lesson,PG
15:00,Buffalo Bill And The Indians or Sitting Bull's History Lesson,PG
19:30,Buffalo Bill And The Indians or Sitting Bull's History Lesson,PG
10:00,Adventure of Lewis and Clark,PG-13
14:30,Adventure of Lewis and Clark,PG-13
19:00,Halloween,R
This prints out
G
G
PG-13
PG-13
PG-13
PG-13
PG-13
PG-13
R
I need it to be
G
G
PG-13
PG
PG
PG
PG-13
PG-13
R
I use Eclipse and when in the debugger, I see that when it encounters the first PG-13, it doesn't update at all until the R. I'm thinking maybe since PG and PG-13 have the same two starting characters perhaps it gets confused? I'm not sure. Any help is appreciated.
You are converting the line using the following line:
sscanf(currentLine, "%[^,],%44[^,],%s", movieTime, movieTitle, movieRating);
The function will read a string into movieTime until a ',' appears in the input, then it will read another string until either a ',' appears or 44 characters are read. This behavior is explained in the manual for sscanf:
...
An optional decimal integer which specifies the maximum field width.
Reading of characters stops either when this maximum is reached or when
a nonmatching character is found, whichever happens first...
The lines with PG ratings have titles far longer than 44 characters. Thus, sscanf does not read the entire title, does not find the comma, and stops before assigning movieRating. To fix this issue, you can either set MAX_TITLE_CHARS to a greater value or use the %m modifier (a POSIX extension, not standard C) to have sscanf dynamically allocate the string for you.
OP's code had undefined behavior (UB), as movieTitle[] was only big enough for 43 characters + the terminating null character and OP used "%44[^,]" rather than the correct width limit of 43.
const int MAX_TITLE_CHARS = 44; // Maximum length of movie
...
char movieTitle[MAX_TITLE_CHARS];
Other problems too that followed this UB.
Account for the '\n' of the line and a '\0' to form a string.
Never use while(!feof(...)).
Test sscanf() results.
Limit printed title width with a precision.
const int LINE_LIMIT = 100; // Maximum length of each line in the text file
char currentLine[LINE_LIMIT + 2 /* room for \n and \0 */];
while (fgets(currentLine, sizeof currentLine, file)) {
// Either use a _width limit_ with `"%s"`, `"%[]"` or use a worst case size.
char movieTime[10 + 1]; // 10 characters + \0
char movieTitle[sizeof currentLine];
char movieRating[sizeof currentLine];
// Examples:
// Use a 10 width limit for the 11 char movieTime
// Others: use a worst case size.
if (sscanf(currentLine, " %10[^,], %[^,], %[^\n]",
movieTime, movieTitle, movieRating) != 3) {
fprintf(stderr, "Failed to parse <%s>\n", currentLine);
break;
}
// Maximum length of movie titles _to print_
const int MAX_TITLE_CHARS = 44;
printf("Title: %-.*s\n", MAX_TITLE_CHARS, movieTitle);
printf("Rating: %s\n", movieRating);
}
Note that "Maximum length of each line" is unclear if the length includes the ending '\n'. In the C library, a line includes the '\n'.
A text stream is an ordered sequence of characters composed into lines, each line consisting of zero or more characters plus a terminating new-line character. Whether the last line requires a
terminating new-line character is implementation-defined. C17dr § 7.21.2 2
Your string 'Buffalo Bill...' is more than 44 characters. Thus the sscanf call reads up to that limit, then looks for a ',', which is not the next character in the input, and stops.
Because your new movieRating isn't being set, it just prints the previous value.
Hint: If you are looking for a work around, you can parse your string with something like strsep(). You can also just increase the size of your movie title.
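As a sketch of the workaround hinted at above, here is one way to split a line without any fixed title width. Instead of strsep() (which is not part of standard C), it uses strchr() and strrchr() to find the first and last comma. The helper name split_showtime is made up for this example, and it assumes the time and rating fields never contain a comma themselves.

```c
#include <string.h>

/* Hypothetical helper: split "time,title,rating" in place.
   On success the three pointers refer to substrings inside `line`
   and 0 is returned; -1 means the line has fewer than two commas. */
int split_showtime(char *line, char **time, char **title, char **rating)
{
    line[strcspn(line, "\r\n")] = '\0';  /* drop the line ending, if any */

    char *first = strchr(line, ',');     /* end of the time field */
    char *last = strrchr(line, ',');     /* start of the rating field */
    if (first == NULL || first == last)
        return -1;

    *first = '\0';                       /* terminate the time */
    *last = '\0';                        /* terminate the title */
    *time = line;
    *title = first + 1;
    *rating = last + 1;
    return 0;
}
```

Because the title is whatever lies between the first and last comma, its length no longer matters, and a failed parse is reported through the return value instead of silently leaving the previous rating in place.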

C Programming - File io parsing strings using sscanf

I am trying to do the following in the C programming language. Any help, or a finished version of the code, would be greatly appreciated:
I am trying to write a program in C that uses file io, parses words with the sscanf function, and outputs each word of the sentences inside a txt document (bar.txt). Here are the instructions.
Write a program that opens the file bar.txt; name the program "reader". Pass a parameter to indicate lines to read. Read all the lines in the file based on the parameter 'lines' into a buffer and, using sscanf, parse all the words of the sentences into different string* variables. Print each of the words to the screen followed by a carriage return. You can hardwire the filename (path of bar.txt) or use an option to enter the filename.
This is the txt file (bar.txt) i am working with:
bar.txt
this is the first sentence
this is the 2nd sentence
this is the 3rd sentence
this is the 4th sentence
this is the 5th sentence
end of file: bar.txt
usage of argv: Usage: updater [-f "filename"] 'lines'
-f is optional (if not provided have a hardwired name from previous program 2 (bar.txt))
'lines' integer from 1 to 10 (remember the files has 5-10 strings from previous program)
a sample input example for the input into the program is:
./reader -f bar.txt 1
OUTPUT:
Opening file "bar.txt"
File Sentence 1 word 1 = this
File Sentence 1 word 2 = is
File Sentence 1 word 3 = the
File Sentence 1 word 4 = first
File Sentence 1 word 5 = sentence
another example
./reader -f bar.txt 5
OUTPUT:
File Sentence 5 word 1 = this
File Sentence 5 word 2 = is
File Sentence 5 word 3 = the
File Sentence 5 word 4 = 5th
File Sentence 5 word 5 = sentence
Examples of commands:
./reader -f bar.txt 5
./reader -f bar.txt 2
./reader -f bar.txt 7
./reader 2
./reader 5
./reader 8
./reader 11
this is the code that I have so far please fix the code to show the desired output:
#include <stdlib.h>
#include <stdio.h>
#define MAXCHAR 1000
int main(int argc, char *argv[]) {
FILE *file;
char string[MAXCHAR];
char* filename = "c:\\cprogram\\fileio-activity\\bar.txt";
int integer = argv[3][0] - 48;
int i; //for loops
if (argv[1][0] == '-' && argv[1][1] == 'f')
{
file = fopen(filename, "r");
if (file == NULL){
printf("Could not open file %s",filename);
return 1;
}
while (fgets(string, MAXCHAR, file) != NULL)
printf("%s", string);
fclose(file);
return 0;
}
}
You need to get the filename from argv if they use the -f option. And you need to get the number of lines from a different argument depending on whether this option was supplied.
Use strcmp() to compare strings, rather than testing each character separately. And use atoi() to convert the lines argument to an integer, since your method only works for single-digit numbers.
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#define MAXCHAR 1000
static void usage(void) {
fprintf(stderr, "Usage: reader [-f filename] lines\n");
exit(1);
}
int main(int argc, char *argv[]) {
FILE *file;
char string[MAXCHAR];
char* filename = "c:\\cprogram\\fileio-activity\\bar.txt";
int integer;
int i; //for loops
if (argc < 2) {
usage();
}
// Process arguments
if (strcmp(argv[1], "-f") == 0)
{
if (argc < 4) {
usage();
}
filename = argv[2];
integer = atoi(argv[3]);
} else {
integer = atoi(argv[1]);
}
file = fopen(filename, "r");
if (file == NULL){
fprintf(stderr, "Could not open file %s\n",filename);
return 1;
}
while (fgets(string, MAXCHAR, file) != NULL)
printf("%s", string);
fclose(file);
return 0;
}
To add to what Barmar already answered, for the further steps in completing the assignment:
Splitting a string into separate words is usually called tokenization, and we normally use strtok() for this. There are several ways how one can use sscanf() to do it. For example:
Use sscanf(string, "%s %s %s", word1, word2, word3) with however many word buffers you might need. (If you use e.g. char word1[100], then use %99s, to avoid buffer overrun bugs. One character must be reserved for the end-of-string character \0.)
The return value of sscanf() tells you how many words it copied to the word buffers. However, if string contains more than the number of words you specified, the extra ones are lost.
If the exercise specifies the maximum length of strings, say N, then you know there can be at most N/2+1 words, each of maximum length N, because each consecutive pair of words must be separated by at least one space or other whitespace character.
  
Use sscanf(string + off, " %s%n", word, &len) to obtain each word in a loop. It will return 1 (with int len set to a positive number) for each new word, and 0 or EOF when string starting at off does not contain any more words.
The idea is that for each new word, you increment off by len, thus examining the rest of string in each iteration.
  
Use sscanf(string + off, " %n%*s%n", &start, &end) with int start, end to obtain the range of positions containing the next word. Set start = -1 and end = -1 before the call, and repeat as long as end > start after the call. Advance to next word by adding end to off.
The beginning of the next word (when start >= 0) is then string + off + start, and it has end - start characters.
To "emulate" strtok() behaviour, one can temporarily save the terminating character (which can be whitespace or the end-of-string character) by using e.g. char saved = string[off + end];, then replace it with an end-of-string character, string[off + end] = '\0';, so that (string + off + start) is a pointer to the word, just like strtok() returns. Before the next scan, one does string[off + end] = saved; to restore the saved character, and off += end; to advance to the next word.
The first one is the easiest, but is the least useful in practical programs. (It works fine, but we do not usually know beforehand the string length and word count limitations.)
The second one is very useful when you have alternate patterns you can try for the next "word" or item; for example, when reading 2D or 3D vectors (points in a plane, or in three-dimensional space), you can support multiple different formats from <1 2 3> to [1,2,3] to 1 2 3, by trying to parse the most complicated/longest first, and trying the next one, until one of them works. (If none of them work, then the input is in error, of course.)
The third one is most useful in that it describes essentially how strtok() works, and what its side effects are. (strtok() itself does not restore the overwritten character, and its saved position is hidden internally as a static variable.)
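A minimal sketch of the second approach (the " %s%n" loop), printing in the format the assignment asks for. The function name print_words is made up for this example, and the 99-character word limit is an assumption; a longer word would be split across two iterations.

```c
#include <stdio.h>

/* Hypothetical helper: print every whitespace-separated word of
   `string` as "File Sentence S word W = ..." and return the word count. */
int print_words(const char *string, int sentence)
{
    char word[100];
    int off = 0;    /* offset of the unscanned rest of the string */
    int len = 0;    /* characters consumed by the last sscanf() call */
    int count = 0;

    /* " %99s" skips leading whitespace and copies at most 99 characters;
       "%n" stores how many characters were consumed so far. */
    while (sscanf(string + off, " %99s%n", word, &len) == 1) {
        count++;
        printf("File Sentence %d word %d = %s\n", sentence, count, word);
        off += len;  /* continue right after the word just read */
    }
    return count;
}
```

Called as print_words(string, integer) on the line that fgets() read for the requested sentence, this walks the buffer once and never modifies it, unlike strtok().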

How to Split strings with fgets or fscanf in C?

I understand how to read in a text file and scan/print the entire file, but how can a line be split into several strings? Also, can variables be assigned to those strings to be called later?
My code so far:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main()
{
FILE *fPointer;
fPointer = fopen("p1customer.txt", "r");
char singleLine[150];
int id;
while (!feof(fPointer)){
fscanf(fPointer, "%d", &id);
printf("%d",id);
}
fclose(fPointer);
return 0;
}
Example Text File to be read:
99999 John Doe Basketball
Example Output:
John Doe has ID number 99999 and plays Basketball
I am attempting to split/tokenize those strings and assign them variables (IDnumber, Name, Sport) and print the output in a new file.
You can use the library function strtok(str, chrs).
A sequence of calls of strtok(str, chrs) splits str into tokens, each delimited by a character from chrs.
The first call in a sequence has a non-NULL str. It finds the first token in str consisting of characters not in chrs; it terminates that token by overwriting the next character of str with \0 and returns a pointer to the token. Each subsequent call, indicated by a NULL value of str, returns a pointer to the next such token, searching from just past the end of the previous one.
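Applied to the sample line from the question, a strtok() sequence might look like the sketch below. The helper name format_customer is made up for this example, and it assumes every record has exactly four whitespace-separated fields (id, first name, last name, sport); a name with more or fewer words would need different handling.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: turn "99999 John Doe Basketball" into
   "John Doe has ID number 99999 and plays Basketball".
   `line` is modified by strtok(). Returns 0 on success, -1 if the
   line has fewer than four fields. */
int format_customer(char *line, char *out, size_t outsize)
{
    const char *delims = " \t\r\n";
    char *id = strtok(line, delims);     /* first call: pass the string */
    char *first = strtok(NULL, delims);  /* later calls: pass NULL */
    char *last = strtok(NULL, delims);
    char *sport = strtok(NULL, delims);

    if (sport == NULL)                   /* some field was missing */
        return -1;

    snprintf(out, outsize, "%s %s has ID number %ld and plays %s",
             first, last, strtol(id, NULL, 10), sport);
    return 0;
}
```

Each line read with fgets() can be passed to this function, and the resulting string written to the output file with fprintf() or fputs().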
You should post an example of the input file so that we can help in more detail.
I see you have also declared a string; I guess you want to fill it with something, but you did not specify what.
If you wanted to treat the file as a list of numbers, the sample of the code might be the following.
#include <stdio.h>
int main() {
FILE *infile;
char buf[100];
int len_file=0;
if(!(infile = fopen("p1customer.txt", "r"))) { /*checks the correct opening of the file*/
printf("Error in open p1customer.txt\n");
return 1;
}
while(fgets(buf,sizeof(buf),infile)!=NULL) /*count the length of the file (number of rows) */
len_file++;
int id[len_file];
int i=0;
rewind(infile);
while(fgets(buf,sizeof(buf),infile)!=NULL) {
sscanf(buf,"%i",&id[i]);
i++;
}
for(i=0;i<len_file;i++)
printf("%i\n",id[i]);
fclose(infile);
return 0;
}
If you want to treat the file as a list of numbers on each row separated by spaces, you can parse each line with the sscanf format %31[^ ], which reads characters until it encounters a space; you can also add a variable that is incremented for each number read.
Then you can refine the code by using the isalpha function from ctype.h to check whether the line contains letters, and copy them into a string until you reach the terminating character '\0'.
The possibilities are endless, so it would be useful to have the input file; when you provide it, I'll update the answer.

Reading first 5 characters from a file using fread function in C

How do I read some 5 to 10 characters from a sample txt file using the fread function?
I have the following code:
#include <stdio.h>
main()
{
char ch,fname[20];
FILE *fp;
printf("enter the name of the file:\t");
gets(fname);
fp=fopen(fname,"r");
while(fread(&ch,1,1,fp)!=0)
fwrite(&ch,1,1,stdout);
fclose(fp);
}
when i enter any sample filename..it prints all the data of the file.
my question is how to print only the first 5 to 10 characters from the sample file.
Your while loop runs until read reaches the end of the file (reads 0 bytes for the first time).
You will want to change the condition by using a for loop or a counter.
i.e. (these are suggestions, not the full working code):
int counter = 10;
while(counter-- > 0 && fread(&ch,1,1,fp)!=0)
fwrite(&ch,1,1,stdout);
or
int i;
for(i=0; i < 10 && fread(&ch,1,1,fp) > 0 ; i++)
fwrite(&ch,1,1,stdout);
Good luck!
P.S.
To answer your question in the comments, fread reads data in whole units (elements): it returns the number of complete elements read, so a trailing partial element is not counted.
A single byte is the smallest unit (1), and you are reading one unit (of a single byte); this is the 1,1 part in fread(&ch,1,1,fp).
You could read 10 units using fread(buf,1,10,fp), provided buf is an array of at least 10 chars (a single char such as ch is too small), or read all the bytes required for a single binary int (this won't be portable - it's just a demo) using int i; fread(&i,sizeof(int),1,fp);
Here is a modified version of your code. Check the comments at the lines that are modified
#include <stdio.h>
#define N_CHARS 10 // define the desired buffer size once for code maintainability
int main() // main function should return int
{
char ch[N_CHARS + 1], fname[20]; // create a buffer with enough size for N_CHARS chars and the null terminating char
FILE *fp;
printf("enter the name of the file:\t");
scanf("%19s", fname); // get a string with max 19 chars from stdin, leaving room for the terminating '\0'
fp=fopen(fname,"r");
if (fread(ch,1,N_CHARS,fp)==N_CHARS) { // check that the desired number of chars was read
ch[N_CHARS] = '\0'; // null terminate before printing
puts(ch); // print a string to stdout and a line feed after
}
fclose(fp);
}
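The version above prints nothing when the file holds fewer than N_CHARS characters. A small variation, sketched here with a made-up helper name read_prefix, uses fread's return value to keep whatever was actually read. It assumes the caller supplies a buffer with room for n characters plus the terminating '\0'.

```c
#include <stdio.h>

/* Hypothetical helper: read up to n bytes from fp into buf and
   null terminate the result. buf must hold at least n + 1 bytes.
   Returns the number of bytes actually read (0 to n). */
size_t read_prefix(FILE *fp, char *buf, size_t n)
{
    size_t got = fread(buf, 1, n, fp);  /* may be less than n near EOF */
    buf[got] = '\0';                    /* terminate after the last byte read */
    return got;
}
```

For example, with char ch[N_CHARS + 1], calling read_prefix(fp, ch, N_CHARS) and then puts(ch) prints the first ten characters of a long file, or the whole content of a shorter one.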

Read from binary file with variable length records

I have a binary file with variable length record that looks about like this:
12 economic10
13 science5
14 music1
15 physics9
16 chemistry9
17 history2
18 anatomy7
19 physiology7
20 literature3
21 fiction3
16 chemistry7
14 music10
20 literature1
The name of the course is the only variable length record in the file, the first number is the code of the course and it can be a number between 1 and 9999 and the 2nd number is the department and it can be between 1 and 10.
As you see in the file there is no space between the name of the course and the department number.
The question is how can I read from the binary file? There is no field in the file to tell me what is the size of the string which is the course name..
I can read the first int (course id) fine, but how do I know what is the size of the name of the course?
Use fscanf() with the format string "%u %[a-z]%u".
Here's a complete example program:
#include <stdio.h>
#define NAME_MAX 64
int main(int argc, char ** argv)
{
FILE * file = fopen("foo.txt", "rb");
unsigned int course, department;
char name[NAME_MAX];
while(fscanf(file, "%u %63[a-z]%u", &course, name, &department) == 3) // stop on EOF or on a malformed record; 63 leaves room for '\0'
{
// do stuff with records
printf("%u-%u %s\n", department, course, name);
}
fclose(file);
return 0;
}
You'd need to know how the file was written out in the first place.
To read variable length records, you should use some sort of convention. For example, a special character that indicates the end of a record. Inside every record, you could use another special character indicating the end of a field.
DO_READ read from file
is END_OF_RECORD char present?
yes: GOTO DO_PROCESS
no : GOTO DO_READ
DO_PROCESS read into buffer
is END_OF_FILE mark present?
yes: GOTO DOSOMETHINGWITHIT
no: GOTO DO_PROCESS
As others have said this looks a lot like text, so a text parsing approach is likely to be the right way to go. Since this is homework, I'm not going to code it out for you, but here's the general approach I'd take:
Using fscanf, read the course code, and a combined string with the name and department code.
Starting from the end of the combined string, go backwards until you find the first non-digit. This is the end of the course name.
Read the integer starting just beyond the end of the course name (ie, the last digit we find scanning backwards).
Replace the first character of that integer's part of the string with a NUL ('\0') - this terminates the combined string immediately after the course name. So all we have left in the combined string is the course name, and we have the course code and department code in integer variables.
Repeat for the next line.
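The backwards scan in the steps above could be sketched like this. The helper name split_department is made up for this example; it assumes the combined string was read separately (for instance with fscanf and a "%u %63s" format into a 64-byte buffer) and that course names never end in a digit.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: split a combined string such as "music10" in
   place into the name ("music") and the department number (10).
   Returns the department, or -1 if there are no trailing digits or
   nothing but digits. */
int split_department(char *combined)
{
    size_t end = strlen(combined);
    size_t i = end;

    /* walk backwards over the trailing digits */
    while (i > 0 && isdigit((unsigned char)combined[i - 1]))
        i--;

    if (i == end || i == 0)
        return -1;               /* no digits at the end, or no name */

    int department = atoi(combined + i);  /* convert before cutting */
    combined[i] = '\0';          /* terminate right after the name */
    return department;
}
```

Scanning from the end is what lets music1 and music10 be told apart: every trailing digit is taken as part of the department code.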
If there is a one to one correspondence between course code and course name (including department code), you can deduce the size of the course name from its code, with a predefined table somewhere in the code or in a configuration file.
If not, the main problem I see is to discriminate things like music1 and music10.
Assuming there are no carriage returns and each string is null terminated.
I have written a little program to create a binary file and then read it back, producing similar output.
// binaryFile.cpp
#include "stdafx.h"
#include <stdio.h>
#include <string.h>
#define BUFSIZE 64
int _tmain(int argc, _TCHAR* argv[])
{
FILE *f;
char buf[BUFSIZE+1];
// create dummy bin file
f = fopen("temp.bin","wb");
if (f)
{ // not writing all the data, just a few examples
sprintf(buf,"%04d%s\00",12,"economic10"); fwrite(buf,sizeof(char),strlen(buf)+1,f);
sprintf(buf,"%04d%s\00",13,"science5"); fwrite(buf,sizeof(char),strlen(buf)+1,f);
sprintf(buf,"%04d%s\00",14,"music1"); fwrite(buf,sizeof(char),strlen(buf)+1,f);
sprintf(buf,"%04d%s\00",15,"physics9"); fwrite(buf,sizeof(char),strlen(buf)+1,f);
fclose(f);
}
// read dummy bin file
f = fopen("temp.bin","rb");
if (f)
{
int classID;
char str[64];
char *pData;
long offset = 0;
do
{
fseek(f,offset,SEEK_SET);
pData = fgets(buf,BUFSIZE,f);
if (pData)
{ sscanf(buf,"%4d%63s",&classID,str);
printf("%d\t%s\r\n",classID,str);
offset +=strlen(pData)+1; // record + 1 null character
}
} while(pData);
fclose(f);
}
getchar();
return 0;
}
