I am having trouble reading a few lines of text from a file using fgets. The file is some basic user data that is written to a file within the bundle the first time the plugin is launched. Any subsequent launch of the plugin should result in the user data being read and cross referenced to check the users authenticity.
The data is always 3 lines long and is written with frwite exactly as it should be and is opened with fopen.
My original theory was to just call fgets 3 times reading each line into it's own char array which is part of a data struct. The problem is the first line is read correctly, the second line is read as though the position indicator starts on the next line but offset by the number of characters read from line 1. The third line is then not read at all.
fgets is not returning any errors and is behaving as though it has read the data it should have so i'm obviously missing something.
Anyway here's a portion of my code hopefully someone can some shed some light on my mistakes!
int length;
fgets(var.n, 128, regFile);
length = strlen(var.n);
var.n[length-1] = NULL;
fgets(var.em, 128, regFile);
length = strlen(var.em);
var.em[length-1] = NULL;
fgets(var.k, 128, regFile);
length = strlen(var.k);
var.k[length-1] = NULL;
fclose(regFile);
Setting the last character in each string to NULL is just to remove the /n
This sequence of code outputs the whole of line 1, the second half of line 2 and none of line 3.
Thanks to #alvits for the answer to this one:
fwrite() is not compatible with fgets(). Files created using fwrite() should use fread() to read them ?>back in. Both fwrite() and fread() operates on binary streams unless explicitly converted to and from >strings. fgets() is compatible with fputs(), both operates on strings.
I used fputs() to write my data instead and it read back in perfectly.
In POSIX systems, including Linux, there is no differentiation between binary and text files. When opening a file stream, the b flag is ignored. This is described in fopen().
You might ask "how would you differentiate text from binary files?". The contents differentiate them. How the contents are written makes them a binary or text file.
Look at the signature size_t fwrite(const void *ptr, size_t size, size_t nmemb, FILE *stream). You'll notice that it writes the contents of *ptr with size describing the size of each members, nmemb. The written stream is not converted to string. If you were to write 97 it will write the binary 97 which in ascii is A. Binary data does not obey string terminations. Presence of \n and \0 in data is literally written as is.
Now look at the signature int fputs(const char *s, FILE *stream). It writes the string content of *s. If you were to write 97, it will have to be a string "97" which is not A. String termination is obeyed. \n is automatically converted to the O/S supported newline (CRLF or LF).
You can coerce fwrite() to behave like fputs() but not the other way around. For example, if you declare ptr as a pointer to string and calculate the size exactly as the length of the content excluding string terminator, you'll be able to write it out as text instead of binary. You will also need to handle \0 and \n and convert them to O/S supported newline. Writing the entire string buffer will write everything including and past the string terminators.
Related
This would be enough to read the first character 'a' inside fp
file.txt
abcdef
// readchar.c
FILE *fp = fopen("file.txt", "r");
int c = fgetc(fp);
but how can i read (e.g.) the 3rd character?
A text file is a kind of sequential file. To read a specific item, you must read everything preceding it in the file.
Operating systems let you read a file anywhere by moving a so called file pointer. The position in the file where read take place can be changed by seeking into the file. There are several ways to handle file I/O. The one you found invoke fopen to open the file and get a file pointer, fgetcto read the next character where the file pointer points and advance it by one character. You have fgetsto read a complete line. fseek to more the file pointer somewhere else. fcloseto close the file. And other similar function.
Back to the text file. Assuming we have a file containing: two lines containing:
Hello world,
Programming rocks!
If you want to read the 5th character of the first line, it is easy: just position the file pointer with fseek to the 5th position in the file (the first character is at position zero). Then read it with fgetc.
Now if you need to read the 5th character of the second line, whatever the first line is, you cannot use fseek because you don't know the length of the first line without reading the line first.
To read the Nth character on the Mth line, you must read M lines, throwing away data except the last one (You simply read all line in a for/loop into the same buffer). And then access the Nth character in the buffer where you just read the last line. Make that buffer an array of char and you have direct access to the Nth character.
I'm porting some code on an embedded platform that uses a C-like API. The original code uses fscanf() to read and parse data from files. Unfortunately on my API I don't have a fscanf() equivalent, so prior to the actual porting I'm trying to obtain the same behavior of fscanf() using fread() and vsscanf() (which I do have). I also have the equivalent of fseek() and ftell().
EDIT: please keep in mind that the access to the embedded filesystem is very limited (fread - fseek - ftell - fgetc - fgets), so I need a solution that works with strings in memory rather than accessing the file in some other way.
The code looks something like this:
int main()
{
[...] /* variable declarations and definitions */
do
{
read = wrapped_fscanf(pFile, "%d %s", &val, str);
} while (read == 2);
fclose(pFile);
return 0;
}
int wrapped_fscanf(FILE *f, const char *template, ...)
{
va_list args;
va_start(args, template);
char tmpstr[50];
fread(tmpstr, sizeof(char), sizeof(tmpstr), f);
int ret = vsscanf(tmpstr, template, args);
long offset = /* ??? */
fseek(f, offset, SEEK_CUR);
va_end(args);
return ret;
}
The problem is that fscanf() moves the pointer to the position in the file stream at the end of the match, whereas with fread() I'm reading a fixed amount of data (in this case 50 bytes) and I should find a way to move the pointer back to the end of the matched string.
Let's assume that the 50-char string I read from the file is the following:
12 bar 13 foo 56789012345678901234567890123456789
fscanf() would match the int 12 , the string bar and the pointer would point right after the "r" in "bar" so I can call it again and read 13 foo
On the other hand fread() puts the pointer after the last char in the 50-element sequence, which is wrong: I still have to read 13 foo but if I call wrapped_fscanf() again the pointer is in the 51st position.
I have to use fseek() to roll back to the end of the first match, but how do I do that? How do I calculate the value of offset ?
vsscanf() returns the number of matches, not the length of the string and I have no way of knowing how many whitespace charachters separate the elements of the match (or do I?)
I.e. I get the same outputs( {var,str,read} == {9,"xyz",2} ) with
9 xyz
and
9 xyz
Is there some trick that I'm not aware of or do I have to find another solution other than wrapping fscanf() with fread() vsscanf() ftell() and fseek()?
Thank you
Supposing that your vsscanf() implementation supports it, your substitute for fscanf() can append a %n field descriptor to the end of the provided format. As long as there is no failure prior to vsscanf() reaching that field, it will store the number of characters consumed up to that point in the corresponding argument. You could then use that result to reposition the stream appropriately. That would require a bit of varargs wrangling and probably some macro assistance, but I think it could be made to work.
You will need some intermediary buffering code, that will grab chunks of data (using fread), and scan your buffer for the pattern. if the pattern is found, truncate the buffer, if the pattern is not found, append some more data. this is effectively what fscanf will do.
I wanna know the use of putw() and getw() function. As I know, these are used to write and read from file as like as putc and getc but these deals with only integers. But when I use these for writing integers, it just write different symbol in file (like if I write 65 to file using putw(). It writes A in the file). Why does it take the ASCII value? I am using codeblocks 13.12. Code:
#include <stdio.h>
int main() {
FILE *fp;
int num;
fp = fopen("file.txt", "w");
printf("Enter any number:\n");
scanf("%d", &num);
putw(num, fp);
fclose(fp);
printf("%d\n", num);
return 0;
}
Let's read the point to point explanation of getw() and putw() functions.
getw() and putw() are related to FILE handling.
putw() is use to write integer data on the file (text file).
getw() is use to read the integer data from the file.
getw() and putw() are similar to getc() and putc(). The only difference is that getw() and putw() are especially meant for reading and writing the integer data.
int putw(integer, FILE*);
Return type of the function is integer.
Having two argument first "integer", telling the integer you want to write on the file and second argument "FILE*" telling the location of the file in which the data would be get written.
Now let's see an example.
int main()
{
FILE *fp;
fp=fopen("file1.txt","w");
putw(65,fp);
fclose(fp);
}
Here putw() takes the integer number as argument (65 in this case) to write it on the file file1.txt, but if we manually open the text file we find 'A' written on the file. It means that putw() actually take integer argument but write it as character on the file.
So, it means that compiler take the argument as the ASCII code of the particular character and write the character on the text file.
int getw(FILE*);
Return type is integer.
Having one argument that is FILE* that is the location of the file from which the integer data to be read.
In this below example we will read the data that we have written on the file named file1.txt in the example above.
int main()
{
FILE *fp;
int ch;
fp=fopen("file1.txt","r");
ch=getw(fp);
printf("%d",ch);
fclose(fp);
}
output
65
Explanation: Here we read the data we wrote to file1.txt in above program and we will get the output as 65.
So, getw() reads the character 'A' that was already written on the file file1.txt and return the ASCII code of the character 'A' that is 65.
We can also write the above program as:
int main()
{
FILE *fp;
fp=fopen("file1.txt","r");
printf("%d",getw(fp));
fclose(fp);
}
output
65
If num is an int, then putw(num, fp) is equivalent to fwrite(&num, sizeof(int), 1, fp), except for having a different return value. It writes an int to the file in binary format. getw is similar but with fread instead. You can see how glibc implements them: putw,getw.
This means that:
They are not appropriate for writing text. If you want to write a number to a file in human-readable decimal or hexadecimal format, use fprintf instead.
They typically read/write more than one byte (one character) to the file. For instance, on a machine with 32-bit ints, they will read/write four bytes. Attempting to do putw('c') will not simply write the single character 'c'.
They should only be used with files opened in binary mode (if that makes a difference on your system).
You should not expect the contents of the file to be human-readable at all. If you attempt to view the file in an editor, you'll see the representation of whatever bytes are in the file, in your current character set (e.g. ASCII).
You should not expect the file to be successfully read back on another computer that uses a different internal representation for int (e.g. different width, different endianness).
On a typical system with 32-bit little-endian int, putw(65, fp) will result in the four bytes 0x41 0x00 0x00 0x00. The 0x41 (decimal 65) is the ASCII code for the character A, so you'll see that if you view it. The 0x00 bytes may or may not be displayed at all, depending on what you are using to view.
These function are not a good idea to use in new code. Even if you do need to store binary data in files, which has various disadvantages as noted and should usually only be done if there is a very good reason for it, you should simply use fwrite and fread. getw/putw are a worse option because:
They will make your code less portable. fwrite/fread are part of the ISO C standard, which is the most widely supported cross-platform modern standard for the C language. getw/putw were present in the Single Unix Specification v2 version 2, which dates to 1997 and is now obsolete. They were not included in the POSIX/SUSv3 specs which superseded SUSv2, and it would be unwise to count on them being available on new systems.
They will make your code less readable. Since fread/fwrite are far more widely used, another programmer reading your code will recognize immediately what they do. Since getw/putw are more obscure, people are likely to have to go and look them up, and the names don't make it easy to remember that they operate specifically on the type int. Readers may also confuse them with the similarly-named ISO-standard functions getwc/putwc. So using getw/putw makes your code less readable.
They may introduce subtle bugs. getw returns EOF on end-of-file or error, but EOF is a valid integer value (often -1). Therefore, if it returns this value, you cannot easily tell whether the file actually contained the integer -1, or whether the read failed. And since this only happens for one particular value, it may be missed in testing. You can check ferror() and feof() to distinguish the two cases, but this is awkward, easy to forget to do, and negates most of the apparent convenience of the "simpler" interface of getw compared to fread.
I speculate that the only reason these functions existed in the first place is that, like putc (respectively getc), putw could be implemented as a macro that directly wrote the buffer of fp and would thus be a little more efficient than calling fwrite. Such an implementation is no longer feasible on modern systems, since it wouldn't be thread-safe, so putw needs a function call anyway. In fact, with glibc in particular, putw just calls fwrite after all, with the overhead of an additional function call, so it's now less efficient. So there is no longer any reason at all to use these functions.
From the man page of putw() and getw()
getw() reads a word (that is, an int) from stream. It's provided for compatibility with SVr4.
putw() writes the word w (that is, an int) to stream. It is provided for compatibility with SVr4.
You can use the fread and fwrite function for better use.
getw() reads the integer from the given FILE stream.
putw() it write the integer given in the first argument into the file pointer.
getw:
It will read the integer from the file. like getchar() doing the work. Consider the file having the content "hello". It will read the h and return ascii value of h.
putw:
It will place the given integer, integer taken as a ascii value. Corresponding value of the ascii value placed in the file. like putchar()
I am having difficulty with a feature of a segment of code that is designed to illustrate the fgets() function for input. Before I proceed, I would like to make sure that my understanding of I/O and streams is correct and that I'm not completely off base:
Input and Output in C has no specific viable function for working with strings. The one function specific for working with strings is the 'gets()' function, which will accept input beyond the limits of the char array to store the input (thus making it effectively illegal for all but backward compatibility), and create buffer overflows.
This brings up the topic of streams, which to the best of my understanding is a model to explain I/O in a program. A stream is considered 'flowing water' on which the data utilized by programs is conveyed. See links: (also as a conveyor belt)
Can you explain the concept of streams?
What is a stream?
In the C language, there are 3 predefined ANSII streams for standard input and output, and 2 additional streams if using windows or DOS which are as follows:
stdin (keyboard)
stdout (screen)
stderr (screen)
stdprn (printer)
stdaux (serial port)
As I understand, to make things manageable it is okay to think of these as rivers that exist in your operating system, and a program uses I/O functions to put data in them, take data out of them, or change the direction of where the streams are flowing (such as reading or writing a file would require). Never think of the 'beginning' or 'end' of the streams: this is handled by the operating system. What you need to be concerned with is where the water takes your data, and that is mediated by use of specific functions (such as printf(), puts(), gets(), fgets(), etc.).
This is where my questions start to take form. Now I am interested in getting a grasp on the fgets() function and how it ties into streams. fgets() uses the 'stdin' stream (naturally) and has the built in fail safe (see below) that will not allow user input to exceed the array used to store the input. Here is the outline of the fgets() function, rather its prototype (which I don't see why one would ever need to declare it?):
char *fgets(char *str , int n , FILE *fp);
Note the three parameters that the fgets function takes:
p1 is the address of where the input is stored (a pointer, which will likely just be the name of the array you use, e.g., 'buffer')
p2 is the maximum length of characters to be input (I think this is where my question is!)
p3 specifies the input stream, which in this code is 'stdin' (when would it ever be different?)
Now, the code I have below will allow you to type characters until your heart is content. When you hit return, the input is printed on the screen in rows of the length of the second parameter minus 1 (MAXLEN -1). When you enter a return with no other text, the program terminates.
#include <stdio.h>
#define MAXLEN 10
int main(void)
{
char buffer[MAXLEN];
puts("Enter text a line at a time: enter a blank line to exit");
while(1)
{
fgets(buffer, MAXLEN, stdin); //Read comments below. Note 'buffer' is indeed a pointer: just to array's first element.
if(buffer[0] == '\n')
{
break;
}
puts(buffer);
}
return 0;
}
Now, here are my questions:
1) Does this program allow me to input UNLIMITED characters? I fail to see the mechanism that makes fgets() safer than gets(), because my array that I am storing input in is of a limited size (256 in this case). The only thing that I see happening is my long strings of input being parsed into MAXLEN - 1 slices? What am I not seeing with fgets() that stops buffer overflow that gets() does not? I do not see in the parameters of fgets() where that fail-safe exists.
2) Why does the program print out input in rows of MAXLEN-1 instead of MAXLEN?
3) What is the significance of the second parameter of the fgets() function? When I run the program, I am able to type as many characters as I want. What is MAXLEN doing to guard against buffer overflow? From what I can guess, when the user inputs a big long string, once the user hits return, the MAXLEN chops up the string in to MAXLEN sized bites/bytes (both actually work here lol) and sends them to the array. I'm sure I'm missing something important here.
That was a mouthful, but my lack of grasp on this very important subject is making my code weak.
Question 1
You can actually type as much character as your command line tool will allow you per input. However, you call to fgets() will handle only MAXLEN in your example because you tell him to do so.
Moreover, there is no safe check inside fgets(). The second parameter you gave to fgets is the "safety" argument. Try to give to change your call to fgets to fgets(buffer, MAXLEN + 10, stdin); and then type more than MAXLEN characters. Your program will crash because you are accessing unallocated memory.
Question 2
When you make a call to fgets(), it will read MAXLEN - 1 characters because the last one is reserved to the character code \0 which usually means end of string
The second parameter of fgets() is not the number of character you want to store but the maximum capacity of your buffer. And you always have to think about string termination character \0
Question 3
If you undestood the 2 answer before, you will be able to answer to this one by yourself. Try to play with this value. And use a different value than the one used for you buffer size.
Also, you said
p3 specifies the input stream, which in this code is 'stdin' (when would it ever be different?)
You can use fgets to read files stored on your computer. Here is an example :
char buffer[20];
FILE *stream = fopen("myfile.txt", "r"); //Open the file "myfile.txt" in readonly mode
fgets(buffer, 20, stream); //Read the 19 first characters of the file "myfile.txt"
puts(buffer);
When you call fgets(), it lets you type in as much as you want into stdin, so everything stays in stdin. It seems fgets() takes the first 9 characters, attaches a null character, and assigns it to buffer. Then puts() displays buffer then creates a newline.
The key is it's in a while loop -- the code loops again then takes what was remaining in stdin and feeds it into fgets(), which takes the next 9 characters and repeats. Stdin just still had stuff "in queue".
Input and Output in C has no specific viable function for working with strings.
There are several functions for outputting strings, such as printf and puts.
Strings can be input with fgets or scanf; however there is no standard function that both inputs and allocates memory. You need to pre-allocate some memory, and then read some characters into that memory.
Your analogy of a stream as a river is not great. Rivers flow whether or not you are taking items out of them, but streams don't. A better analogy might be a line of people at the gates to a stadium.
C also has the concept of a "line", lines are marked by having a '\n' character at the end. In my analogy let's say the newline character is represented by a short person.
When you do fgets(buf, 20, stdin) it is like "Let the next 19 people in, but if you encounter a short person during this, let him through but not anybody else". Then the fgets function creates a string out of these 0 to 19 characters, by putting the end-of-string marker on the end; and that string is placed in buf.
Note that the second argument to fgets is the buffer size , not the number of characters to read.
When you type in characters, that is like more people joining the queue.
If there were fewer than 19 people and no short people, then fgets waits for more people to arrive. In standard C there's no way to check if people are waiting without blocking to wait for them if they aren't.
By default, C streams are line buffered. In my analogy, this is like there is a "pre-checking" gate earlier on than the main gate, where all people that arrive go into a holding pen until a short person arrives; and then everyone from the holding pen plus that short person get sent onto the main gate. This can be turned off using setvbuf.
Never think of the 'beginning' or 'end' of the streams: this is handled by the operating system.
This is something you do have to worry about. stdin etc. are already begun before you enter main(), but other streams (e.g. if you want to read from a file on your hard drive), you have to begin them.
Streams may end. When a stream is ended, fgets will return NULL. Your program must handle this. In my analogy, the gate is closed.
I have no experience with fscanf() and very little with functions for FILE. I have code that correctly determines if a client requested an existing file (using stat() and it also ensures it is not a directory). I will omit this part because it is working fine.
My goal is to send a string back to the client with a HTTP header (a string) and the correctly read data, which I would imagine has to become a string at some point to be concatenated with the header for sending back. I know that + is not valid C, but for simplicity I would like to send this: headerString+dataString.
The code below does seem to work for text files but not images. I was hoping that reading each character individually would solve the problem but it does not. When I point a browser (Firefox) at my server looking for an image it tells me "The image (the name of the image) cannot be displayed because it contains errors.".
This is the code that is supposed to read a file into httpData:
int i = 0;
FILE* file;
file = fopen(fullPath, "r");
if (file == NULL) errorMessageExit("Failed to open file");
while(!feof(file)) {
fscanf(file, "%c", &httpData[i]);
i++;
}
fclose(file);
printf("httpData = %s\n", httpData);
Edit: This is what I send:
char* httpResponse = malloc((strlen(httpHeader)+strlen(httpData)+1)*sizeof(char));
strcpy(httpResponse, httpHeader);
strcat(httpResponse, httpData);
printf("HTTP response = %s\n", httpResponse);
The data part produces ???? for the image but correct html for an html file.
Images contain binary data. Any of the 256 distinct 8-bit patterns may appear in the image including, in particular, the null byte, 0x00 or '\0'. On some systems (notably Windows), you need to distinguish between text files and binary files, using the letter b in the standard I/O fopen() call (works fine on Unix as well as Windows). Given that binary data can contain null bytes, you can't use strcpy() et al to copy chunks of data around since the str*() functions stop copying at the first null byte. Therefore, you have to use the mem*() functions which take a start position and a length, or an equivalent.
Applied to your code, printing the binary httpData with %s won't work properly; the %s will stop at the first null byte. Since you have used stat() to verify the existence of the file, you also have a size for the file. Assuming you don't have to deal with dynamically changing files, that means you can allocate httpData to be the correct size. You can also pass the size to the reading code. This also means that the reading code can use fread() and the writing code can use fwrite(), saving on character-by-character I/O.
Thus, we might have a function:
int readHTTPData(const char *filename, size_t size, char *httpData)
{
FILE *fp = fopen(filename, "rb");
size_t n;
if (fp == 0)
return E_FILEOPEN;
n = fread(httpData, size, 1, fp);
fclose(fp);
if (n != 1)
return E_SHORTREAD;
fputs("httpData = ", stdout);
fwrite(httpData, size, 1, stdout);
putchar('\n');
return 0;
}
The function returns 0 on success, and some predefined (negative?) error numbers on failure. Since memory allocation is done before the routine is called, it is pretty simple:
Open the file; report error if that fails.
Read the file in a single operation.
Close the file.
Report error if the read did not get all the data that was expected.
Report on the data that was read (debugging only — and printing binary data to standard output raw is not the best idea in the world, but it parallels what the code in the question does).
Report on success.
In the original code, there is a loop:
int i = 0;
...
while(!feof(file)) {
fscanf(file, "%c", &httpData[i]);
i++;
}
This loop has a lot of problems:
You should not use feof() to test whether there is more data to read. It reports whether an EOF indication has been given, not whether it will be given.
Consequently, when the last character has been read, the feof() reports 'false', but the fscanf() tries to read the next (non-existent) character, adds it to the buffer (probably as a letter such as ÿ, y-umlaut, 0xFF, U+00FF, LATIN SMALL LETTER Y WITH DIAERESIS).
The code makes no check on how many characters have been read, so it has no protection against buffer overflow.
Using fscanf() to read a single character is a lot of overhead compared to getc().
Here's a more nearly correct version of the code, assuming that size is the number of bytes allocated to httpData.
int i = 0;
int c;
while ((c = getc(file)) != EOF && i < size)
httpData[i++] = c;
You could check that you get EOF when you expect it. Note that the fread() code does the size checking inside the fread() function. Also, the way I wrote the arguments, it is an all-or-nothing proposition — either all size bytes are read or everything is treated as missing. If you want byte counts and are willing to tolerate or handle short reads, you can reverse the order of the size arguments. You could also check the return from fwrite() if you wanted to be sure it was all written, but people tend to be less careful about checking that output succeeded. (It is almost always crucial to check that you got the input you expected, though — don't skimp on input checking.)
At some point, for plain text data, you need to think about CRLF vs NL line endings. Text files handle that automatically; binary files do not. If the data to be transferred is image/png or something similar, you probably don't need to worry about this. If you're on Unix and dealing with text/plain, you may have to worry about CRLF line endings (but I'm not an expert on this — I've not done low-level HTTP stuff recently (not in this millennium), so the rules may have changed).