Storing strings using fgets - c

I have a file which contains information about films like this:
Film code
Name
Year of release
Movie length(in minutes)
The film producer
I have to read this info from a file and store that info into pointers. My code so far:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
typedef struct filmiab
{
int koodpk;
char *nimed;
int aasta;
int kestus;
char *rezi;
} filmiab;
int main()
{
filmiab *db;
FILE *f1;
f1 = fopen("filmid.txt", "r");
db->nimed = (char*)malloc(sizeof(db->nimed) * sizeof(char));
db->rezi = (char*)malloc(sizeof(db->rezi) * sizeof(char));
while(1)
{
fscanf(f1, "%d ", &db->koodpk);
fgets(db->nimed, 100, f1);
db->nimed = (char*)realloc(db->nimed, sizeof(char) * sizeof(db->nimed)); //gets more memory to store the strings
fscanf(f1, "%d %d ", &db->aasta, &db->kestus);
fgets(db->rezi, 100, f1);
db->rezi = (char*)realloc(db->rezi, sizeof(char) * sizeof(db->rezi));
printf("Filmi kood: %d\nFilmi nimi: %sAasta: %d\nKestus minutites: %d\nFilmi rezis66r: %s\n",
db->koodpk, db->nimed, db->aasta, db->kestus, db->rezi);
printf("\n");
}
return 0;
}
It just goes into an infinite loop and only prints the last 5 lines. I know that when using fgets it replaces all the strings with the last 5 lines.
But what can I do so it would store all the info and that so I could print them out (or just use them) in another function. And why does it go into an infinite loop ?
EDIT:
I have to use only the pointers that are created in the struct.
EDIT2:
Now both these lines
fgets(db->nimed, 100, f1);
fgets(db->rezi, 100, f1);
store the required info and the blank spaces. What to do so it only stores the names of the films and the producers.

It just goes into an infinite loop
That's because it is an infinite loop. You have a while(1) with no break condition. It should break after it can no longer read any lines.
Every time you work with a file, that means fopen, fgets, and fscanf, you need to check whether the operation succeeded. If it fails, the code will continue with whatever garbage is the result.
This is especially a problem with fscanf because if it fails, it leaves the file pointer where it was, and might continuously rescan the same line over and over again. In general, avoid scanf and fscanf. Instead, fgets the whole line, to ensure it gets read, and scan it with sscanf.
The other problem is that the way you're allocating memory isn't right.
filmiab *db;
This puts a pointer on the stack, but it points to garbage. No memory has been allocated for the actual struct.
db->nimed = (char*)malloc(sizeof(db->nimed) * sizeof(char));
sizeof(db->nimed) is not the length of the string in db->nimed, but the size of the pointer. Probably 4 or 8. So you've only allocated 4 or 8 bytes.
fgets(db->nimed, 100, f1);
Then you read up to 100 bytes into it with fgets, probably causing a buffer overflow.
db->nimed = (char*)realloc(db->nimed, sizeof(char) * sizeof(db->nimed));
Then you reallocate too little, too late. Again, same as before, this is allocating only 4 or 8 bytes. It's probably doing nothing because the memory was already this size.
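To make the sizeof point concrete, here is a minimal sketch. The two helper names are made up for illustration; the only thing it shows is that sizeof applied to a char* measures the pointer, while strlen measures the text it points at.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: size of the pointer itself.
 * Typically 4 or 8, no matter how long the text is. */
size_t pointer_size(const char *s)
{
    return sizeof(s);
}

/* Hypothetical helper: bytes needed to store the text s points at,
 * including the terminating '\0'. */
size_t storage_needed(const char *s)
{
    return strlen(s) + 1;
}
```

So for a buffer that fgets will fill, the size has to come from the length you intend to read (100 in the question), not from sizeof on the pointer.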
To fix this, start by putting the whole struct on the stack.
filmiab db;
Then allocate the necessary space for its strings. Note that since sizeof(char) is always 1 there's no need to include it. There's also no need to cast the result of malloc.
db.nimed = malloc(100);
db.rezi = malloc(100);
Now there's no need to realloc; you've got your 100 bytes of memory and can write to them with fgets.
For future reference, here's how I'd rework this.
// in addition to the includes from the question
#include <assert.h>
#include <errno.h>
int main() {
    filmiab db;
    char file[] = "filmid.txt";
    FILE *f1 = fopen(file, "r");
    if( f1 == NULL ) {
        fprintf( stderr, "Could not open %s for reading: %s\n", file, strerror(errno) );
        return 1; // don't carry on with a NULL file pointer
    }
    char line[1024];
    int state = 0;
    while(fgets(line, 1024, f1) != NULL) {
        switch(state % 5) {
        case 0:
            sscanf(line, "%d", &db.koodpk);
            break;
        case 1:
            db.nimed = strdup(line);
            break;
        case 2:
            sscanf(line, "%d", &db.aasta);
            break;
        case 3:
            sscanf(line, "%d", &db.kestus);
            break;
        case 4:
            db.rezi = strdup(line);
            printf("Filmi kood: %d\nFilmi nimi: %sAasta: %d\nKestus minutites: %d\nFilmi rezis66r: %s\n",
                db.koodpk, db.nimed, db.aasta, db.kestus, db.rezi);
            printf("\n");
            break;
        default:
            // Should never get here
            assert(0);
            break;
        }
        state++;
    }
    fclose(f1);
    return 0;
}
There's a single, large line buffer which gets reused, it's 1K but it's only 1K once. strdup duplicates strings, but only allocates enough memory to hold the string. This eliminates the need to predict how big lines are, and it also avoids fragmenting memory with a lot of reallocs.
In this particular case, since db is being reused, it would be more optimal to just allocate 1024 each for db.nimed and db.rezi, but I wanted to demonstrate the more general case where the stuff read in will stick around.
while(fgets(line, 1024, f1) != NULL) ensures I'll read until the end of the file. Then line is processed using the switch statement depending on what sort of line is coming next. This separates the process of reading from a file, which can be unpredictable and needs a lot of error checking, from processing the data which is a bit easier. Technically I should be checking if those sscanfs succeeded, but I got lazy. :)
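For the sscanf checks that were skipped above, a minimal sketch could look like this. parse_int_line is a hypothetical helper name, not something from the original code; it just wraps the return-value check so each case in the switch can bail out on a malformed line.

```c
#include <stdio.h>

/* Hypothetical helper: returns 1 and stores the value in *out if the line
 * starts with an integer, otherwise returns 0 and leaves *out alone.
 * sscanf returns the number of successful conversions (or EOF on an
 * empty line), so anything other than 1 counts as a failure. */
int parse_int_line(const char *line, int *out)
{
    return sscanf(line, "%d", out) == 1;
}
```

A case in the switch would then read something like if (!parse_int_line(line, &db.aasta)) { /* report the bad line and stop */ }.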

Related

Multiple fscanf

I have written the following program that is intended to read a string from a file into variable "title":
#include <stdio.h>
#include <stdlib.h>
int main()
{
int m, b;
char *title;
FILE *fp;
fp = fopen("input2.txt", "r");
if (fp == NULL)
{
printf ("Error: file cannot be found\n");
return 1;
}
fscanf(fp, "<%d>\n<%d>", &m, &b);
printf("%d\n%d", m, b);
fscanf(fp, "<%s>", title);
fclose(fp);
return 0;
}
The above program crashes at the second call to fscanf. Why does this happen?
Your main problem is that you've not allocated space for the string to be read into. You can do this in multiple ways:
char title[256];
or:
char *title = malloc(256);
if (title == NULL)
{
fprintf(stderr, "Out of memory\n");
exit(1);
}
either of which should then be used with:
if (fscanf(fp, " <%255[^>]>", title) != 1)
{
fprintf(stderr, "Oops: format error\n");
exit(1);
}
or, if you have a system with an implementation of fscanf() that's compliant with POSIX 2008, you can use the m modifier to %s (or with %c, or, in this case, a scanset %[...] — more on that below):
char *title = 0;
if (fscanf(fp, " <%m[^>]>", &title) != 1) // Note the crucial &
{
fprintf(stderr, "Oops: format error\n");
exit(1);
}
This way, if the fscanf() succeeds in its entirety, the function will allocate the memory for the title. If it fails, the memory will have been released (or never assigned).
Note that I changed %s to %m[^>]. This is necessary because the original conversions will never match the >. If there is a > in the input, it will be incorporated into the result string because that reads up to white space, and > is not white space. Further, you won't be able to tell whether the trailing context was ever matched — that's the > in the original format, and it's still a problem (or not) in the revised code I'm suggesting.
I also added a space at the start of the string to match optional white space. Without that, the < at the start of the string must be on the same line as the > after the second number, assuming that the > is present at all. You should also check the return from the first fscanf():
if (fscanf(fp, "<%d>\n<%d>", &m, &b) != 2)
{
fprintf(stderr, "Oops: format error\n");
exit(1);
}
Note that the embedded newline simply looks for white space between the > and the < — that's zero or more blanks, tabs or newlines. Also note that you'll never know whether the second > was matched or not.
You could use exit(EXIT_FAILURE); in place of exit(1); — or, since this code is in main(), you could use either return 1; or return(EXIT_FAILURE); where the parentheses are optional in either case but their presence evokes unwarranted ire in some people.
You could also improve the error messages. And you should consider using fgets() or POSIX's getline() followed by sscanf() because it makes it easier (by far) to do good error reporting, plus you can rescan the data easily if the first attempt at converting it fails.
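As a rough sketch of the fgets() plus sscanf() approach, here is one way to parse a single "<title>" line that has already been read into a buffer. The function name parse_title_line and the 256-byte title buffer are assumptions for illustration. It also uses the standard %n conversion to find out whether the closing > was really matched, which the plain return value cannot tell you.

```c
#include <stdio.h>

/* Hypothetical helper: parses a line of the form "<title>".
 * title must point at a buffer of at least 256 bytes.
 * Returns 1 only if the title AND the closing '>' were both matched. */
int parse_title_line(const char *line, char *title)
{
    int end = -1;   /* stays -1 unless sscanf gets as far as %n */

    /* %n stores the number of characters consumed so far and does not
     * count towards the return value, so success is still exactly 1. */
    if (sscanf(line, " <%255[^>]>%n", title, &end) != 1)
        return 0;
    return end != -1;
}
```

Because the line is already in memory, a failed parse can be reported together with the offending text, or retried with a different format.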
This:
char *title;
is just an uninitialized pointer to a char. It doesn't point at any storage you own, so when fscanf writes through it you corrupt whatever memory it happens to point at, or crash.
You need to do one of two things:
char title[50]; // Holds up to 49 characters, plus termination
Or:
#include <stdlib.h>
// ...
char *title = malloc(50 * sizeof(char)); // Same capacity as above
if (title == NULL) {
// handle out of mem error
}
// ...
free (title);
The first option is obviously much simpler, but requires you to know your array size at compile time.
If you are new to programming, and haven't encountered pointers and dynamic memory allocation yet, stick with the first option for now.

Array Not Filling Properly

I am trying to deconstruct a document into its respective paragraphs, and input each paragraphs, as a string, into an array. However, each time a new value is added, it overwrites all previous values in the array. The last "paragraph" read (as denoted by newline) is the value of each non-null value of the array.
Here is the code:
char buffer[MAX_SIZE];
char **paragraphs = (char**)malloc(MAX_SIZE * sizeof(char*));
int pp = 0;
int i;
FILE *doc;
doc = fopen(argv[1], "r+");
assert(doc);
while((i = fgets(buffer, sizeof(buffer), doc) != NULL)) {
if(strncmp(buffer, "\n", sizeof(buffer))) {
paragraphs[pp++] = (char*)buffer;
}
}
printf("pp: %d\n", pp);
for(i = 0; i < MAX_SIZE && paragraphs[i] != NULL; i++) {
printf("paragraphs[%d]: %s", i, paragraphs[i]);
}
The output I receive is:
pp: 4
paragraphs[0]: paragraph four
paragraphs[1]: paragraph four
paragraphs[2]: paragraph four
paragraphs[3]: paragraph four
when the program is run as follows: ./prog.out doc.txt, where doc.txt is:
paragraph one
paragraph two
paragraph three
paragraph four
The behavior of the program is otherwise desired. The paragraph count works properly, ignoring the line that contains ONLY the newline character (line 4).
I assume the problem occurs in the while loop, however am unsure how to remedy the problem.
Your solution is pretty sound. Your paragraphs array is supposed to hold each paragraph, and since each element is just a small pointer (typically 4 or 8 bytes) you can afford to define a reasonable max number of them. However, since this max number is a constant, it is of little use to allocate the array dynamically.
The only meaningful use of dynamic allocation would be to read the whole text once to count the actual number of paragraphs, allocate the array accordingly and re-read the whole file a second time, but I doubt this is worth the effort.
The downside of using fixed-size paragraph array is that you must stop filling it once you reach the maximal number of elements.
You can then re-allocate a bigger array if you absolutely want to be able to process the whole Bible, but for an educational exercise I think it's reasonable to just stop recording paragraphs (thus producing a code that can store and count paragraphs up to a maximal number).
The real trouble with your code is, you don't store the paragraph contents anywhere. When you read the actual lines, it's always inside the same buffer, so each paragraph will point to the same string, which will contain the last paragraph read.
The solution is to make a unique copy of the buffer and have the current paragraph point to that.
C being already messy enough as it is, I suggest using the strdup() function, which duplicates a string (basically computing string length, allocating sufficient memory, copying the string into it and returning the new block of memory holding the new copy). You just need to remember to free this new copy once you're done using it (in your case at the end of your program).
This is not the most time-efficient solution, since each string will require a strlen and a malloc performed internally by strdup while you could have pre-allocated a big buffer for all paragraphs, but it is certainly simpler and probably more memory-efficient (only the minimal amount of memory will be allocated for each string, though each malloc consumes a few extra bytes for internal allocator housekeeping).
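For reference, the strlen-plus-malloc-plus-copy sequence described above boils down to something like the following. dup_string is a made-up name for this sketch; your library's strdup may be implemented differently, but the observable behaviour is the same.

```c
#include <stdlib.h>
#include <string.h>

/* Illustrative stand-in for strdup (the name dup_string is hypothetical).
 * Returns a freshly allocated copy of src, or NULL if malloc fails.
 * The caller is responsible for calling free() on the result. */
char *dup_string(const char *src)
{
    size_t size = strlen(src) + 1;   /* +1 for the terminating '\0' */
    char *copy = malloc(size);

    if (copy != NULL)
        memcpy(copy, src, size);
    return copy;
}
```

The copy lives independently of the buffer it was made from, which is exactly what stops every paragraph from ending up as the last line read.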
The bloody awkward fgets also stores the trailing \n at the end of the line, so you'll probably want to get rid of that.
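One compact way to drop that trailing newline is the strcspn idiom, sketched here with a hypothetical helper name. Unlike indexing with strlen(s) - 1, it is also safe on an empty string.

```c
#include <string.h>

/* Hypothetical helper: cuts the string at the first '\n', if any.
 * strcspn returns the index of the first character found in "\n",
 * or the string length when there is none, so the assignment either
 * overwrites the newline or rewrites the existing terminator. */
void strip_newline(char *s)
{
    s[strcspn(s, "\n")] = '\0';
}
```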
Your last display loop would be simpler, more robust and more efficient if you simply used pp as a limit, instead of checking uninitialized paragraphs.
Lastly, you'd better define two different constants for max line size and max number of paragraphs. Using the same value for both makes little sense, unless you're processing perfectly square texts :).
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAX_LINE_SIZE 82 // max nr of characters in a line (including trailing \n and \0)
#define MAX_PARAGRAPHS 100 // max number of paragraphs in a file
int main (int argc, char *argv[])
{
    char buffer[MAX_LINE_SIZE];
    char * paragraphs[MAX_PARAGRAPHS];
    int pp = 0;
    int i;
    FILE *doc;
    assert(argc > 1); // the file name is expected as the first argument
    doc = fopen(argv[1], "r");
    assert(doc != NULL);
    while((fgets(buffer, sizeof(buffer), doc) != NULL)) {
        if (pp != MAX_PARAGRAPHS // make sure we don't overflow our paragraphs array
        && strcmp(buffer, "\n")) {
            // fgets awkwardly collects the ending \n, so get rid of it
            if (buffer[strlen(buffer)-1] == '\n') buffer[strlen(buffer)-1] = '\0';
            // current paragraph references a unique copy of the actual text
            paragraphs[pp++] = strdup (buffer);
        }
    }
    fclose(doc);
    printf("pp: %d\n", pp);
    for(i = 0; i != pp; i++) {
        printf("paragraphs[%d]: %s\n", i, paragraphs[i]); // \n was stripped above, so print one here
        free(paragraphs[i]); // release memory allocated by strdup
    }
    return 0;
}
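If you do want the array to grow instead of stopping at a fixed maximum, a sketch of the re-allocation approach mentioned earlier might look like this. The struct and function names are invented for the example, and the doubling policy is just one common choice.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable list of paragraph copies.
 * Start from a zeroed struct: struct para_list list = {0}; */
struct para_list {
    char **items;     /* array of owned string copies */
    size_t count;     /* number of slots in use */
    size_t capacity;  /* number of slots allocated */
};

/* Appends a private copy of text. Returns 1 on success, 0 if out of memory
 * (in which case the list is left unchanged). */
int para_list_add(struct para_list *list, const char *text)
{
    if (list->count == list->capacity) {
        size_t new_cap = list->capacity ? list->capacity * 2 : 8;
        char **tmp = realloc(list->items, new_cap * sizeof *tmp);
        if (tmp == NULL)
            return 0;
        list->items = tmp;
        list->capacity = new_cap;
    }

    size_t size = strlen(text) + 1;
    char *copy = malloc(size);
    if (copy == NULL)
        return 0;
    memcpy(copy, text, size);
    list->items[list->count++] = copy;
    return 1;
}

/* Releases every copy and the array itself, leaving an empty list. */
void para_list_free(struct para_list *list)
{
    for (size_t i = 0; i < list->count; i++)
        free(list->items[i]);
    free(list->items);
    list->items = NULL;
    list->count = list->capacity = 0;
}
```

In the read loop, paragraphs[pp++] = strdup(buffer) would become a call to para_list_add(&list, buffer), and the pp != MAX_PARAGRAPHS test goes away.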
What is the proper way to allocate the necessary memory? Is the malloc on line 2 not enough?
No, you also need to allocate memory for each string in the 2D array you created. On its own, the following line only allocates the array of pointers, so it will not work as coded.
char **paragraphs = (char**)malloc(MAX_SIZE * sizeof(char*));
If you have: (for a simple explanation)
char **array = {0}; //array of C strings, before memory is allocated
Then you can create memory for it like this:
#include <stdlib.h>
char ** allocMemory(char ** a, int numStrings, int maxStrLen);
void freeMemory(char ** a, int numStrings);
int main(void)
{
    int numStrings = 10;// for example, change as necessary
    int maxLen = MAX_SIZE; //for example, change as necessary
    char **array = {0};
    array = allocMemory(array, numStrings, maxLen);
    //use the array, then free it
    freeMemory(array, numStrings);
    return 0;
}
char ** allocMemory(char ** a, int numStrings, int maxStrLen)
{
    int i;
    // calloc takes (number of elements, size of one element)
    a = calloc(numStrings + 1, sizeof(char*));
    if(a == NULL) return NULL;
    for(i=0;i<numStrings; i++)
    {
        a[i] = calloc(maxStrLen + 1, sizeof(char));
    }
    return a;
}
void freeMemory(char ** a, int numStrings)
{
    int i;
    if(a == NULL) return;
    for(i=0;i<numStrings; i++)
        if(a[i]) free(a[i]);
    free(a);
}
Note: you can determine the number of lines in a file several ways. One way, for example, is FILE *fp = fopen(filepath, "r");, then calling ret = fgets(lineBuf, lineLen, fp) in a loop until ret == NULL, incrementing a counter on each iteration. Then call fclose() (which you did not do either). This necessary step is not included in the code example above, but you can add it if that is the approach you want to use.
Once you have memory allocated, Change the following in your code:
paragraphs[pp++] = (char*)buffer;
To:
strcpy(paragraphs[pp++], buffer);//no need to cast buffer, it is already char *
Also, do not forget to call fclose() when you are finished with the open file.

How to copy one line from long string C

I'm looking to copy the FIRST line from a LONG string P into a buffer
I have no idea how to make it.
while (*pros_id != '/n'){
*pros_id_line=*pros_id;
pros_id++;
pros_id_line++;
}
And tried
fgets(pros_id_line, sizeof(pros_id_line), pros_id);
Both are not working. Can I get some help please?
Note, as Adriano Repetti pointed out in a comment and an answer, that the newline character is '\n' and not '/n'.
Your initial code can be fixed up to work, provided that the destination buffer is big enough:
while (*pros_id != '\n' && *pros_id != '\0')
*pros_id_line++ = *pros_id++;
*pros_id_line = '\0';
This code does not include the newline in the copied buffer; it is easy enough to add it if you need it.
One advantage of this code is that it makes a single pass through the data up to the newline (or end of string). An alternative makes two passes through the data, one to find the newline and another to copy to the newline:
if ((end = strchr(pros_id, '\n')) != 0)
{
memmove(pros_id_line, pros_id, end - pros_id);
pros_id_line[end - pros_id] = '\0';
}
This ensures that the string is null-terminated; again, it omits the newline, and assumes there is enough space in the pros_id_line buffer for the data. You have to decide what is the correct behaviour when there is no newline in the buffer. It might be sufficient to copy the buffer without the newline into the target area, or you might prefer to report a problem.
You can use strncpy() instead of memmove() but it has a more complex loop condition than memmove() — it has to check for a null byte as well as the count, whereas memmove() only has to check the count. You can use memcpy() instead of memmove() if you're sure there's no overlap between source and target, but memmove() always works and memcpy() sometimes doesn't (though only when the source and target areas overlap), and I prefer reliability over possible misbehaviour.
Note that setting a buffer to zero before copying a string to it is a waste of energy. The parts that you're about to overwrite with data didn't need to be zeroed. The parts that you aren't going to overwrite with data didn't need to be zeroed either. You should know exactly which byte needs to be zeroed, so why waste the time on zeroing anything except the one byte that needs to be zeroed?
(One exception to this is if you are dealing with sensitive data and are concerned that some function that your code will call may deliberately read beyond the end of the string and come across parts of a password or other sensitive data. Then it may be appropriate to wipe the memory before writing new data to it. On the whole, though, most people aren't writing such code.)
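Both versions above assume the destination is big enough. If you cannot guarantee that, a bounded variant is not much longer. This is only a sketch, and the function name and the truncate-silently policy are my assumptions; you may prefer to report an error instead.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies the first line of src (without the newline)
 * into dst, never writing more than dst_size bytes including the '\0'.
 * Returns the number of characters stored, not counting the '\0'. */
size_t copy_first_line(char *dst, size_t dst_size, const char *src)
{
    if (dst_size == 0)
        return 0;                      /* no room even for the terminator */

    size_t len = strcspn(src, "\n");   /* length up to newline or end of string */
    if (len >= dst_size)
        len = dst_size - 1;            /* truncate to fit */

    memcpy(dst, src, len);
    dst[len] = '\0';
    return len;
}
```

A caller can compare the return value with strcspn(src, "\n") to detect truncation.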
New line is \n, not /n. Anyway, I'd use strchr for this:
char* endOfFirstLine = strchr(inputString, '\n');
if (endOfFirstLine != NULL)
{
strncpy(yourBuffer, inputString,
endOfFirstLine - inputString);
}
else // Input is one single line
{
strcpy(yourBuffer, inputString);
}
With inputString as your char* multiline string and yourBuffer (assuming it's big enough to contain all data from inputString and it has been zeroed, because strncpy does not add a terminator in this case) as your required output (first line of inputString).
If you're going to be doing a lot of reading from long text buffers, you could try using a memory stream, if your system supports them: https://www.gnu.org/software/libc/manual/html_node/String-Streams.html
#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>
static char buffer[] = "foo\nbar";
int
main()
{
char arr[100];
FILE *stream;
stream = fmemopen(buffer, strlen(buffer), "r");
fgets(arr, sizeof arr, stream);
printf("First line: %s\n", arr);
fgets(arr, sizeof arr, stream);
printf("Second line: %s\n", arr);
fclose (stream);
return 0;
}
POSIX 2008 (e.g. most Linux systems) has getline(3) which heap-allocates a buffer for a line.
So you could code
FILE* fil = fopen("something.txt","r");
if (!fil) { perror("fopen"); exit(EXIT_FAILURE); }
char *linebuf=NULL;
size_t linesiz=0;
if (getline(&linebuf, &linesiz, fil) != -1) { // getline returns -1 on failure or end of file
    do_something_with(linebuf);
}
else { perror("getline"); exit(EXIT_FAILURE); }
free(linebuf); // the buffer was heap-allocated by getline
If you want to read an editable line from stdin in a terminal consider GNU readline.
If you are restricted to pure C99 code you have to do the heap allocation yourself (malloc or calloc or perhaps -with care- realloc)
If you just want to copy the first line of some existing buffer char*bigbuf; which is non-NULL, valid, and zero-byte terminated:
char *line = NULL;
char *eol = strchr(bigbuf, '\n');
if (!eol) { // bigbuf is a single line so duplicate it
    line = strdup(bigbuf);
    if (!line) { perror("strdup"); exit(EXIT_FAILURE); }
} else {
    size_t linesize = eol - bigbuf;
    line = malloc(linesize+1);
    if (!line) { perror("malloc"); exit(EXIT_FAILURE); }
    memcpy (line, bigbuf, linesize);
    line[linesize] = '\0';
}

Reading numbers

Huge thanks to everyone that answered. I have realised that I suck a lot at this; I will take every answer into consideration and hopefully I will manage to compile something that works.
Some remarks:
Allocating 500 MB just in case doesn't seem like a good idea. A better approach would be to allocate a small amount of memory first, if it's not enough then allocate 2 times bigger memory, etc (this would work if you read the number on per-character basis).
Important: right after every (re)allocation, you have to check whether your malloc call succeeded (i.e. what it returns is not NULL), otherwise you cannot go any further.
What is the first getchar() for?
Instead of using gets(), you could try to read the characters one-by-one, until you encounter something that is not a digit, at which point you can assume that the number input has finished (that is the simplest way, obviously one can process user input differently).
Adding '\0' to something that was read with gets() is not needed, as far as I know (for something that would be read character-by-character, that would make sense).
Last but not least, you should also take care of actually freeing the allocated memory (i.e. calling free() after you are done with num). Not doing so results in a memory leak.
(Update) printf("%c",num[0]); will only print the first character of the string num. If you want to print out the whole string, you should call printf("%s",num);
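Putting the first few remarks together (start small, double when full, read character by character, check every allocation), a sketch could look like the following. The function name read_digits and the starting capacity of 16 are arbitrary choices for illustration, and it reads from any FILE* so you can pass stdin.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads consecutive decimal digits from 'in' into a
 * heap buffer that doubles whenever it fills up. Stops at the first
 * non-digit (which is pushed back) or at end of file.
 * Returns a '\0'-terminated string the caller must free(), or NULL if
 * memory runs out. */
char *read_digits(FILE *in)
{
    size_t cap = 16, len = 0;
    char *buf = malloc(cap);
    int c;

    if (buf == NULL)
        return NULL;

    while ((c = getc(in)) != EOF && isdigit(c)) {
        if (len + 1 == cap) {              /* keep one byte for the '\0' */
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
        buf[len++] = (char)c;
    }
    if (c != EOF)
        ungetc(c, in);                     /* leave the non-digit for the caller */

    buf[len] = '\0';
    return buf;
}
```

Usage would be along the lines of char *num = read_digits(stdin); followed by a NULL check, printf("%s\n", num); and free(num);.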
Well, there are quite a few problems with this code, none that necessarily have to do with reading big numbers. But you're still learning, so here we go. In order in which they appear in the code:
(Not really an error, but also not recommended): Casting the result of malloc is unnecessary, as outlined in this answer.
As the other answer states: allocating 500MB is probably way overkill, if you really need this much you can always add more, but you may want to start out with less (5KB, for example).
You should add a new-line at the end of your puts, or the output may end up in places where you don't expect it (i.e. much later).
(This is an error) Don't ever use gets: this page explains why.
You're checking if(num == NULL) after you've already used it (presumably to check if gets failed, but it will return NULL on failure, the num pointer itself won't be changed). You want to move this check up to right after the malloc.
After your NULL-check for num your code happily continues after the if, you'll want to add a return or exit inside the if's body.
There is a syntax error with your very last printf: you forgot the closing ].
When you decide to use fgets to get the user input, you can check if the last character in the string is a new-line. If it isn't then that means it couldn't fit the entire input into the string, so you will need to fgets some more. When the last character is a new-line you might want to remove it by overwriting it with '\0', e.g. num[strlen(num) - 1] = '\0'; (a trick that isn't necessary for gets, but is for fgets).
Instead of increasing the size of your buffer by just 1, you should grow it by a bit more than that: a commonly used strategy is to double the current size. malloc, calloc and realloc are fairly expensive library calls (performance-wise, and they may end up making system calls), and since you don't seem too fussed about memory usage it can save a lot of time to keep these calls to a minimum.
An example of these recommendations:
size_t bufferSize = 5000, // start with 5K
       inputLength = 0;
char * buffer = malloc(bufferSize);
if(buffer == NULL){
    perror("No memory!");
    exit(-1);
}
// each fgets appends after what has already been read
while(fgets(buffer + inputLength, bufferSize - inputLength, stdin) != NULL){
    inputLength += strlen(buffer + inputLength);
    if(inputLength > 0 && buffer[inputLength - 1] == '\n'){
        break; // last character was a new-line: we're done reading
    }
    // last character was not a new-line: double the buffer in size
    bufferSize *= 2;
    char * tmp = realloc(buffer, bufferSize);
    if(tmp == NULL){
        perror("No memory!");
        free(buffer);
        exit(-1);
    }
    // reallocating didn't fail: continue with grown buffer
    buffer = tmp;
}
Beware of bugs in the above code; I have only proved it correct, not tried it.
Finally, instead of re-inventing the wheel, you may want to take a look at the GNU Multiple Precision library which is specifically made for handling big numbers. If anything you can use it for inspiration.
This is how you could go about reading some really big numbers in. I have decided on your behalf that a 127 digit number is really big.
#include <stdio.h>
#include <stdlib.h>
#define BUFSIZE 128
int main()
{
int n, number, len;
char *num1 = malloc(BUFSIZE * sizeof (char));
if(num1==NULL){
puts("Not enough memory");
return 1;
}
char *num2 = malloc(BUFSIZE * sizeof (char));
if(num2==NULL){
puts("Not enough memory");
return 1;
}
puts("Please enter your first number");
fgets(num1, BUFSIZE, stdin);
puts("Please enter your second number");
fgets(num2, BUFSIZE, stdin);
printf("Your first number is: %s\n", num1);
printf("Your second number is: %s\n", num2);
free(num1);
free(num2);
return 0;
}
This should serve as a starting point for you.

Reading From A Buffer and Storing the line in an array

I am trying to make a simple client and server. Right now I am able to print the contents of a file out to the screen. I would now like to store every line I read from the buffer into an array. I have attempted this but for some reason it always just adds the last line received from the buffer. Can anyone point out where I have gone wrong?
int getFile (char path[256], int fd)
{
char buffer[256];
char bufferCopy[256];
char arguments[1000][1000];
int total = 0;
char * ptr;
while(read(fd, buffer, 256) != NULL)
{
char * temp;
strcpy(arguments[total], buffer);
total++;
}
for(int i = 0; i < total; i++)
{
printf("\n %s", arguments[i]);
}
}
Your read call doesn't read lines, it reads up to 256 bytes from fd. read also doesn't know anything about null terminators so there is no guarantee that buffer will hold a string (i.e. have a null terminator) and hence no guarantee that strcpy will stop copying at a sensible place. You're almost certainly scribbling all over your stack and once you do that, all bets are off and you can't expect anything sensible to happen.
If you want to read lines then you might want to switch to fgets or keep using read and figure out where the EOLs are yourself.
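If you stay with read, the minimum fix is to use its return value (a byte count, 0 at end of file, -1 on error) to terminate the buffer yourself. The helper below is only a sketch with a made-up name. It gives you a proper string per chunk, but a chunk can still contain several lines or end in the middle of one, so splitting on '\n' remains your job.

```c
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: reads at most size - 1 bytes from fd into buf and
 * adds the '\0' that read() itself never writes.
 * Returns what read() returned: bytes read, 0 at end of file, -1 on error. */
ssize_t read_chunk(int fd, char *buf, size_t size)
{
    if (size == 0)
        return -1;                       /* no room for the terminator */

    ssize_t n = read(fd, buf, size - 1); /* leave one byte for '\0' */
    if (n >= 0)
        buf[n] = '\0';
    return n;
}
```

The loop condition then becomes while (read_chunk(fd, buffer, sizeof buffer) > 0) rather than a comparison with NULL.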

Resources