I am given a text file of movie showtime information. I have to format the information in a clean way. Right now I'm just trying to get each line's information saved into strings. However, when getting the movie's rating, the array won't save the rating properly.
This is the main code.
#include <string.h>
#include <stdio.h>
#include <stdlib.h>
int main(void) {
const int MAX_TITLE_CHARS = 44; // Maximum length of movie titles
const int LINE_LIMIT = 100; // Maximum length of each line in the text file
char line[LINE_LIMIT];
char inputFileName[25];
FILE *file;
file = fopen("D:\\movies.txt", "r");
char currentLine[LINE_LIMIT];
char movieTitle[MAX_TITLE_CHARS];
char movieTime[10];
char movieRating[10];
fgets(currentLine, LINE_LIMIT, file); // Get first file
while(!feof(file)){
sscanf(currentLine, "%[^,],%44[^,],%s", movieTime, movieTitle, movieRating);
printf("%s\n", movieRating);
fgets(currentLine, LINE_LIMIT, file); // Get next file
}
return 0;
}
This is the CSV file
16:40,Wonders of the World,G
20:00,Wonders of the World,G
19:00,Journey to Space ,PG-13
12:45,Buffalo Bill And The Indians or Sitting Bull's History Lesson,PG
15:00,Buffalo Bill And The Indians or Sitting Bull's History Lesson,PG
19:30,Buffalo Bill And The Indians or Sitting Bull's History Lesson,PG
10:00,Adventure of Lewis and Clark,PG-13
14:30,Adventure of Lewis and Clark,PG-13
19:00,Halloween,R
This prints out
G
G
PG-13
PG-13
PG-13
PG-13
PG-13
PG-13
R
I need it to be
G
G
PG-13
PG
PG
PG
PG-13
PG-13
R
I use Eclipse and when in the debugger, I see that when it encounters the first PG-13, it doesn't update at all until the R. I'm thinking maybe since PG and PG-13 have the same two starting characters perhaps it gets confused? I'm not sure. Any help is appreciated.
You are converting the line using the following line:
sscanf(currentLine, "%[^,],%44[^,],%s", movieTime, movieTitle, movieRating);
The function reads a string into movieTime until a ',' appears in the input, then reads another string until either a ',' appears or 44 characters have been read. This behavior is explained in the manual for sscanf:
...
An optional decimal integer which specifies the maximum field width.
Reading of characters stops either when this maximum is reached or when
a nonmatching character is found, whichever happens first...
The lines with PG ratings have titles of more than 60 characters. sscanf therefore stops partway through the title, the next input character is not the comma the format expects, and the rating is never assigned. To fix this issue, you can either set MAX_TITLE_CHARS to a greater value or use the %m modifier (a POSIX extension) to have sscanf dynamically allocate the string for you.
OP's code had undefined behavior (UB): movieTitle[] was only big enough for 43 characters plus the terminating null character, yet OP used "%44[^,]" rather than the correct width limit of 43.
const int MAX_TITLE_CHARS = 44; // Maximum length of movie
...
char movieTitle[MAX_TITLE_CHARS];
Other problems followed this UB, too.
Account for the '\n' of the line and a '\0' to form a string.
Never use while(!feof(...)).
Test sscanf() results.
Limit printed title width with a precision.
const int LINE_LIMIT = 100; // Maximum length of each line in the text file
char currentLine[LINE_LIMIT + 2 /* room for \n and \0 */];
while (fgets(currentLine, sizeof currentLine, file)) {
// Either use a _width limit_ with `"%s"`, `"%[]"` or use a worst case size.
char movieTime[10 + 1]; // 10 characters + \0
char movieTitle[sizeof currentLine];
char movieRating[sizeof currentLine];
// Examples:
// Use a 10 width limit for the 11 char movieTime
// Others: use a worst case size.
if (sscanf(currentLine, " %10[^,], %[^,], %[^\n]",
movieTime, movieTitle, movieRating) != 3) {
fprintf(stderr, "Failed to parse <%s>\n", currentLine);
break;
}
// Maximum length of movie titles _to print_
const int MAX_TITLE_CHARS = 44;
printf("Title: %-.*s\n", MAX_TITLE_CHARS, movieTitle);
printf("Rating: %s\n", movieRating);
}
Note that "Maximum length of each line" is unclear if the length includes the ending '\n'. In the C library, a line includes the '\n'.
A text stream is an ordered sequence of characters composed into lines, each line consisting of zero or more characters plus a terminating new-line character. Whether the last line requires a
terminating new-line character is implementation-defined. C17dr § 7.21.2 2
Your string 'Buffalo Bill...' is more than 44 characters. Thus the sscanf statement reads up to that limit, then looks for a ',', which doesn't exist 44 characters into the string, and exits.
Because your new movieRating isn't being set, it just prints the previous value.
Hint: If you are looking for a workaround, you can parse your string with something like strsep(). You can also just increase the size of your movie title.
Related
I have a file storing data of students in the following order:
id (space) name (space) address
Below is the content of the file:
10 john manchester
11 sam springfield
12 samuel glasgow
Each data is stored in a newline.
I want to search for the student with id 10 and display his/her details using the lseek command, however I'm not able to complete the task. Any help is appreciated.
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>
#include <string.h>
void main() {
char line[50] = "";
char id[2] = "";
ssize_t fd = open("file.dat", O_RDONLY);
while(read(fd,line,sizeof(line))>0){
if (id[0] == '1' && id[1] == '0'){
printf("%s\n",line);
}
lseek(fd, 1 ,SEEK_CUR);
}
close(fd);
}
Use the right tools for the task. Hammer for nails, Screwdriver for screws.
lseek is not the right tool here. lseek repositions the file offset, but you do not know the target offset yet: you are searching for a specific position, and once you have found it you are already there, so there is nothing left to reposition.
Ask yourself,
What is your task:
search for a specific id
print the line if match
What do you have:
a dataset (textfile) with a fixed format (id <space> name <space> address <newline>)
Your dataset is separated by a newline, and the id is the first field of that row.
The keywords here are 'newline' and 'first field'.
The right procedure here would be:
read a whole line (fgets)
compare the first field (start of line) with the desired id (strcmp)
Example:
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
int main() {
//return value of main
int ret = EXIT_FAILURE;
//open filestream in read mode
FILE *f = fopen("file.dat", "r");
//bail out if the file could not be opened
if (f == NULL) {
perror("file.dat");
return ret;
}
//string buffer
#define MAX_LEN 50
char line[MAX_LEN];
char field[MAX_LEN];
//the id to search for
const char *id = "10";
//for each line
while (fgets(line, MAX_LEN, f)) {
//extract the first field ('%s' matches a sequence of non-white-space characters)
sscanf(line, "%s", field);
//compare the field with the desired id
if (strcmp(field, id) == 0) {
//if found print line
printf("%s", line);
//set result to success
ret = EXIT_SUCCESS;
//and exit
break;
}
}
//cleanup
fclose(f);
//return the result
return ret;
}
The first line of your file has 18 characters, the second has the same number, and the third has one fewer (17).
If you add a fourth line whose name, for example, has a different length, it is simply appended to the file without any other structure.
Lines are delimited by \n characters, which can appear at any point, so the second line starts immediately after the first \n.
For this reason you cannot know in advance where each line begins: each line starts (n + 1) bytes after the start of the previous one, where n is the number of characters in the previous line and the extra one is the newline character.
You need an index: a file of fixed-length records that stores the starting position of each line in the data file. To read line i, you access the index at offset (index_record_size * i) and read the starting position of line i. Then you move the data file's pointer to that position and read the line with, for example, fgets(3).
To build the index, call ftell() right before each call to fgets(): fgets() moves the file pointer, so a position obtained afterwards would belong to the next line. Write the position in a fixed-length format, e.g. binary form, with:
write(ix_fd, &position, sizeof position);
so the position of line i can be found by reading from the index at offset i * sizeof position.
I'm trying to read different data types on the same line of a text file, and currently trying to store them in their own arrays via a structure. I'm not sure if this is the best course of action to begin with, but the point is to read data from a file and manipulate it using different functions. I thought that if I could extract the data from the file and store it in arrays, I could send the arrays into functions with the arrays as their parameters. Here's what I have, and the problem explained within the main function:
Driver File:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "student_struct.c"
struct Student{
char name[50];
int id;
float gpa;
int age;
};
int main(){
FILE *fptr;
fptr = fopen("student_records.txt", "r");
if (fptr == NULL){
printf("Error opening file!\n");
exit(1);
}
struct Student students[100];
int i = 0;
while(!feof(fptr)){
//PROBLEM HERE. Data for what is expected to be in the "gpa" array is always 0.
fscanf(fptr, "%c %d %f %d", &students[i].name[i], &students[i].id, &students[i].gpa, &students[i].age);
i++;
}
fclose(fptr);
//Always prints "0.0000"
printf("GPA of student #2: %f\n", students[1].gpa);
//avgGPA(students.gpa);
return 0;
}
Function:
#include <stdio.h>
float avgGPA(float gpa[]){
int i;
float avgGPA = 0;
for(i = 0; i < sizeof(*gpa); i++){
avgGPA += gpa[i];
}
avgGPA = avgGPA / sizeof(*gpa);
printf("Average GPA: %f", avgGPA);
}
Text file:
David 1234 4.0 44
Sally 4321 3.6 21
Bob 1111 2.5 20
Greg 9999 1.8 28
Heather 0000 3.2 22
Keith 3434 2.7 40
Pat 1122 1.0 31
Ann 6565 3.0 15
Mike 9898 2.0 29
Steve 1010 2.2 24
Kristie 2222 3.9 46
My question is, how do I properly pull the data from the file and use it in different functions? Thank you for your help.
The %c in fscanf needs to be changed to %s. Refer to the fscanf man page for what each of the conversion specifiers mean. Specifically:
s
Matches a sequence of non-white-space characters; the next pointer must be a pointer to character array that is long enough to hold the input sequence and the terminating null byte ('\0'), which is added automatically. The input string stops at white space or at the maximum field width, whichever occurs first.
c
Matches a sequence of characters whose length is specified by the maximum field width (default 1); the next pointer must be a pointer to char, and there must be enough room for all the characters (no terminating null byte is added). The usual skip of leading white space is suppressed. To skip white space first, use an explicit space in the format.
In other words, %c by default only matches a single character. %s matches multiple non-white space characters (ie, colloquially a "word").
Other follow on questions you had:
but if the array in the structure "Student" is made of characters, why does it properly function with a string?
In C a string is defined as an array of characters terminated by a NUL (0).
At that end, why does that influence the rest of the operation?
%c will consume just one character, which means the next conversion specifier (%d in this case) will try to match the remaining part of the first word and fail.
Other best practices that are relevant should be applied. Specifically:
Always check the return value of function calls, fscanf in particular for this case. If that were done you would be able to see that fscanf had failed to match most of the conversion specifiers.
while (!feof(...)) is always wrong. A full explanation of that is not provided here but please refer to other SO answers such as this.
Use a debugger to step through your code to help you examine the state of variables to better understand what the program is doing and where things go wrong.
How can I split character and variable in 1 line?
Example
INPUT
car1900food2900ram800
OUTPUT
car 1900
food 2900
ram 800
Code
char namax[25];
int hargax;
scanf ("%s%s",&namax,&hargax);
printf ("%s %s",namax,hargax);
If I use code like that, I need to press Enter twice or type a space to get the output. How can I split the input without that?
You should be able to use code like this to read one name and number:
if (scanf("%24[a-zA-Z]%d", namax, &hargax) == 2)
…got name and number OK…
else
…some sort of problem to be reported and handled…
You would need to wrap that in a loop of some sort in order to get three pairs of values. Note that using &namax as an argument to scanf() is technically wrong. The %s, %c and %[…] (scan set) notations all expect a char * argument, but you are passing a char (*)[25] which is quite different. A fortuitous coincidence means you usually get away with the abuse, but it is still not correct and omitting the & is easy (and correct).
You can find details about scan sets etc in the POSIX specification of scanf().
You should consider reading a whole line of input with fgets() or POSIX getline(), and then processing the resulting string with sscanf(). This makes error reporting and error recovery easier. See also How to use sscanf() in loops.
Since you are asking this question which is actually easy, I presume you are somewhat a beginner in C programming. So instead of trying to split the input itself during the input which seems to be a bit too complicated for someone who's new to C programming, I would suggest something simpler(not efficient when you take memory into account).
Just accept the entire input as a string. Then scan the string for digits and letters; I have used their ASCII values for the check. Whenever you find a digit followed by a letter (or the end of the string), print the part of the string from the last such occurrence up to the current point. While printing that part, apply the same check in reverse: where a letter is followed by a digit, print a separator (a tab in the code below) between them.
just so that you know:
ASCII value of digits (0-9) => 48 to 57
ASCII value of uppercase letters (A-Z) => 65 to 90
ASCII value of lowercase letters (a-z) => 97 to 122
Here is the code:
#include<stdio.h>
#include<string.h>
int main() {
char s[100];
int i, len, j, k = 0, x;
printf("\nenter the string:");
scanf("%99s",s); /* width limit keeps the input within s[100] */
len = strlen(s);
for(i = 0; i < len; i++){
if(((int)s[i]>=48)&&((int)s[i]<=57)) {
if((((int)s[i+1]>=65)&&((int)s[i+1]<=90))||(((int)s[i+1]>=97)&&((int)s[i+1]<=122))||(i==len-1)) {
for(j = k; j < i+1; j++) {
if(((int)s[j]>=48)&&((int)s[j]<=57)) {
if(j > 0 && ((((int)s[j-1]>=65)&&((int)s[j-1]<=90))||(((int)s[j-1]>=97)&&((int)s[j-1]<=122)))) {
printf("\t");
}
}
printf("%c",s[j]);
}
printf("\n");
k = i + 1;
}
}
}
return(0);
}
the output:
enter the string: car1900food2900ram800
car 1900
food 2900
ram 800
In addition to using a character class to include the characters to read as a string, you can also use the character class to exclude digits which would allow you to scan forward in the string until the next digit is found, taking all characters as your name and then reading the digits as an integer. You can then determine the number of characters consumed so far using the "%n" format specifier and use the resulting number of characters to offset your next read within the line, e.g.
char namax[MAXNM],
*p = buf;
int hargax,
off = 0;
while (sscanf (p, "%24[^0-9]%d%n", namax, &hargax, &off) == 2) {
printf ("%-24s %d\n", namax, hargax);
p += off;
}
Note how the sscanf format string will read up to 24 characters that are not digits as namax and then the integer that follows as hargax, storing the number of characters consumed in off, which is then applied to the pointer p to advance within the buffer in preparation for your next parse with sscanf.
Putting it all together in a short example, you could do:
#include <stdio.h>
#define MAXNM 25
#define MAXC 1024
int main (void) {
char buf[MAXC] = "";
while (fgets (buf, MAXC, stdin)) {
char namax[MAXNM],
*p = buf;
int hargax,
off = 0;
while (sscanf (p, "%24[^0-9]%d%n", namax, &hargax, &off) == 2) {
printf ("%-24s %d\n", namax, hargax);
p += off;
}
}
}
Example Use/Output
$ echo "car1900food2900ram800" | ./bin/fgetssscanf
car 1900
food 2900
ram 800
I need to get a user-input string with a maximum length of 50 chars. Therefore I defined MAX_STRING_LENGTH as 50 and the string is declared with 51 characters. However, every time the input is longer than 48 characters, the last two characters are cut off. This is a school exercise and I can't use <string.h>.
#include <stdio.h>
#define MAX_STRING_LENGTH 50
int main(void)
{
int j=0;
char stringInput[MAX_STRING_LENGTH+1]; //string initialized.
printf("Please enter a valid string\n");
fgets(stringInput,MAX_STRING_LENGTH,stdin); //string input.
for(j=0;stringInput[j]!='\0';j++);
if(j<MAX_STRING_LENGTH+1)
{
j=j-1;
stringInput[j]='\0'; //remove newline if it exists
}
//...
return 0;
}
I don't understand why the string is losing 2 characters.
I am assuming that a newline(\n) is created always when using fgets (even if a full string of 50 characters is inputted), and I'm losing 1 character always(and therefore I have to increase the string size of the string). However I do not understand how the other character is lost.
I would appreciate your feedback. Thank you
In this call
fgets(stringInput,MAX_STRING_LENGTH,stdin);
you specified that at most MAX_STRING_LENGTH - 1 characters will be read into the array stringInput.
If you want the array to be able to hold a string of 50 characters (excluding the terminating zero), then you have to call fgets like
fgets(stringInput,MAX_STRING_LENGTH + 1,stdin);
But in this case the new line character that corresponds to the entered key Enter will still be in the input buffer if the user entered exactly 50 characters. To extract it you should declare the array at least like
char stringInput[MAX_STRING_LENGTH + 2];
and write the call like
fgets(stringInput,MAX_STRING_LENGTH + 2,stdin);
However it would be better initially to declare MAX_STRING_LENGTH equal to 52.
This little loop
for(j=0;stringInput[j]!='\0';j++);
if(j<MAX_STRING_LENGTH+1)
{
j=j-1;
stringInput[j]='\0'; //remove newline if it exists
}
removes the last character regardless of what it contains.
This is a [c] not a [c++] question.
Documentation for fgets
I have a function that will parse some data coming in. My problem is that after using strncpy I get some garbage when I try to print it. I try using malloc to make the char array the exact size.
Code:
void parse_data(char *unparsed_data)
{
char *temp_str;
char *pos;
char *pos2;
char *key;
char *data;
const char newline = '\n';
int timestamp = 0;
temp_str = (char*)malloc(strlen(unparsed_data));
g_print("\nThe original string is: \n%s\n",unparsed_data);
//Ignore the first two lines
pos = strchr(unparsed_data, newline);
strcpy(temp_str, pos+1);
pos = strchr(temp_str, newline);
strcpy(temp_str, pos+1);
//Split the line in two; The key name and the value
pos = strchr(temp_str, ':'); // ':' divides the name from the value
pos2 = strchr(temp_str, '\n'); //end of the line
key = (char*)malloc((size_t)(pos-temp_str)-1); //allocate enough memory
data = (char*)malloc((size_t)(pos2-pos)-1);
strncpy(key, temp_str, (size_t)(pos-temp_str));
strncpy(data, pos + 2, (size_t)(pos2-pos));
timestamp = atoi(data);
g_print("size of the variable \"key\" = %d or %d\n", (size_t)(pos-temp_str), strlen(key));
g_print("size of the variable \"data\" = %d or %d\n", (size_t)(pos2-pos), strlen(data));
g_print("The key name is %s\n",key);
g_print("The value is %s\n",data);
g_print("End of Parser\n");
}
Output:
The original string is:
NEW_DATAa_PACKET
Local Data Set 16-byte Universal Key
Time Stamp (microsec): 1319639501097446
Frame Number: 0
Version: 3
Angle (deg): 10.228428
size of the variable "key" = 21 or 22
size of the variable "data" = 18 or 21
The key name is Time Stamp (microsec)
The value is 1319639501097446
F32
End of Parser
Run it again:
The original string is:
NEW_DATAa_PACKET
Local Data Set 16-byte Universal Key
Time Stamp (microsec): 1319639501097446
Frame Number: 0
Version: 3
Angle (deg): 10.228428
size of the variable "key" = 21 or 25
size of the variable "data" = 18 or 18
The key name is Time Stamp (microsec)ipe
The value is 1319639501097446
F
End of Parser
Your results are because strncpy does not put a null character at the end of the string.
Your strncpy(data, pos + 2, (size_t)(pos2-pos)); doesn't add a terminating \0 character at the end of the string. Therefore when you try to print it later, printf() prints your whole data string and whatever is in memory right after it, until it reaches a zero byte. That's the garbage you're getting. You need to explicitly append a zero at the end of your data. It's also needed for atoi().
Edit:
You need to allocate one more byte for your data, and write a terminating character there. data[len_of_data] = '\0'. Only after that it becomes a valid C string and you can use it for atoi() and printf().
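You need to malloc() one extra byte for a string so that there is room for the terminating zero that strcpy() appends. strncpy does not append the zero when it copies exactly as many bytes as you asked for, so you need the extra byte there as well and must write the zero yourself.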
One problem: What if there's no newline?
Undefined behaviour:
pos = strchr(temp_str, newline);
strcpy(temp_str, pos+1);
The source and destination of strcpy must not overlap.
When allocating space for a string, you have to remember to add one byte for the terminating '\0' character. You have to be careful with strncpy, especially if you are used to using strcpy, strcat, or sprintf. Those three functions terminate the string with '\0'. strncpy copies at most the number of bytes you specify and does not terminate the result unless the source is shorter than that count.
You assume that responsibility by making sure you place a '\0' at the end of the character buffer to which you copied. That means you have to know the starting position and the length of the copy and put a '\0' one byte past the sum of the starting position and length.
I chose to solve a sample problem slightly differently, but it still involves knowing the length of what I copied.
In this case, I use strncpy to take the first 9 characters from pcszTestStr1 and copy them to szTestBuf. Then, I use strcpy -- which terminates the string with a zero -- to append the new part of the sentence.
#include <stdio.h>
#include <string.h>
int n;
int argv_2;
char szTestBuf[100] = {0};
char * pcszTestStr1 =
"This is a very long, long string to be used in a C example, OK?";
int main(int argc, char *argv[])
{
int rc = 0;
printf("The following sentence is too long.\n%s\n", pcszTestStr1);
strncpy(szTestBuf, pcszTestStr1, 9);
strcpy(szTestBuf + 9, " much shorter sentence.");
printf("%s\n", szTestBuf);
return rc;
}
Here's the output of running test.c compiled with gcc -o test test.c.
cnorton#hiawatha:~/scratch$ ./test
The following sentence is too long.
This is a very long, long string to be used in a C example, OK?
This is a much shorter sentence.
cnorton#hiawatha:~/scratch$