Finding the size of a file created by fmemopen - c

I'm using fmemopen to create a FILE* variable, fid, to pass to a function that reads data from an open file.
Somewhere in that function it uses the following code to find out the size of the file:
fseek(fid, 0, SEEK_END);
file_size = ftell(fid);
This works well for regular files, but for streams created by fmemopen I always get file_size = 8192.
Any ideas why this happens?
Is there a method to get the correct file size that works for both regular files and files created with fmemopen?
EDIT:
my call to fmemopen:
fid = fmemopen(ptr, memSize, "r");
where memSize != 8192
EDIT2:
I created a minimal example:
#include <cstdlib>
#include <stdio.h>
#include <string.h>
using namespace std;
int main(int argc, char** argv)
{
const long unsigned int memsize = 1000000;
void * ptr = malloc(memsize);
FILE *fid = fmemopen(ptr, memsize, "r");
fseek(fid, 0, SEEK_END);
long int file_size = ftell(fid);
printf("file_size = %ld\n", file_size);
free(ptr);
return 0;
}
btw, I am currently working on another computer, and here I get file_size=0

With fmemopen, if you open the stream using the b option, then SEEK_END measures the size of the memory buffer. The value you see is probably the default stdio buffer size.

OK, I have got this mystery solved by myself. The documentation says:
If the opentype specifies append mode, then the initial file position is set to the first null character in the buffer
and later:
For a stream open for reading, null characters (zero bytes) in the buffer do not count as "end of file". Read operations indicate end of file only when the file position advances past size bytes.
It seems that fseek(fid, 0, SEEK_END) goes to the first zero byte in the buffer, and not to the end of the buffer.
Still looking for a method that will work on both standard and fmemopen files.

Related

Get the number of bytes in a file [duplicate]

How can I figure out the size of a file, in bytes?
#include <stdio.h>
unsigned int fsize(char* file){
//what goes here?
}
On Unix-like systems, you can use POSIX system calls: stat on a path, or fstat on an already-open file descriptor (POSIX man page, Linux man page).
(Get a file descriptor from open(2), or fileno(FILE*) on a stdio stream).
Based on NilObject's code:
#include <sys/stat.h>
#include <sys/types.h>
off_t fsize(const char *filename) {
struct stat st;
if (stat(filename, &st) == 0)
return st.st_size;
return -1;
}
Changes:
Made the filename argument a const char *.
Corrected the struct stat definition, which was missing the variable name.
Returns -1 on error instead of 0, which would be ambiguous for an empty file. off_t is a signed type so this is possible.
If you want fsize() to print a message on error, you can use this:
#include <sys/stat.h>
#include <sys/types.h>
#include <string.h>
#include <stdio.h>
#include <errno.h>
off_t fsize(const char *filename) {
struct stat st;
if (stat(filename, &st) == 0)
return st.st_size;
fprintf(stderr, "Cannot determine size of %s: %s\n",
filename, strerror(errno));
return -1;
}
On 32-bit systems you should compile this with the option -D_FILE_OFFSET_BITS=64, otherwise off_t will only hold values up to 2 GB. See the "Using LFS" section of Large File Support in Linux for details.
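To see what a given build actually provides, and to print an off_t without guessing its width, something like the following sketch may help. It assumes a POSIX system with C99 or later; off_t_bits and format_size are hypothetical names.

```c
#include <limits.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical helper: number of bits in off_t for this build. */
int off_t_bits(void)
{
    return (int)(sizeof(off_t) * CHAR_BIT);
}

/* Hypothetical helper: writes the size as decimal text into buf and
   returns whatever snprintf returns. */
int format_size(char *buf, size_t buflen, off_t size)
{
    /* intmax_t with %jd can hold any off_t value, whatever its width. */
    return snprintf(buf, buflen, "%jd", (intmax_t)size);
}
```

If off_t_bits() reports 32 on a system that should handle large files, that is a hint the -D_FILE_OFFSET_BITS=64 option is missing from the build.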
Don't use int. Files over 2 gigabytes in size are common as dirt these days
Don't use unsigned int. Files over 4 gigabytes in size are common as some slightly-less-common dirt
IIRC POSIX (not the C standard library) defines off_t as a signed integer type, and on most modern systems it is 64 bits wide, which is what everyone should be using. We can redefine that to be 128 bits in a few years when we start having 8 exabyte files hanging around.
If you're on Windows, you should use GetFileSizeEx - it also uses a signed 64 bit integer, so they'll hit the same problems with 8 exabyte files. :-)
Matt's solution should work, except that it's C++ instead of C, and the initial tell shouldn't be necessary.
unsigned long fsize(char* file)
{
FILE * f = fopen(file, "r");
fseek(f, 0, SEEK_END);
unsigned long len = (unsigned long)ftell(f);
fclose(f);
return len;
}
Fixed your brace for you, too. ;)
Update: This isn't really the best solution. It's limited to 4GB files on Windows and it's likely slower than just using a platform-specific call like GetFileSizeEx or stat64.
Don't do this (why?):
Quoting the C99 standard doc that I found online: "Setting the file position indicator to end-of-file, as with fseek(file, 0, SEEK_END), has undefined behavior for a binary stream (because of possible trailing null characters) or for any stream with state-dependent encoding that does not assuredly end in the initial shift state."
Change the definition to int so that error messages can be transmitted, and then use fseek() and ftell() to determine the file size.
int fsize(char* file) {
int size;
FILE* fh;
fh = fopen(file, "rb"); //binary mode
if(fh != NULL){
if( fseek(fh, 0, SEEK_END) ){
fclose(fh);
return -1;
}
size = ftell(fh);
fclose(fh);
return size;
}
return -1; //error
}
POSIX
The POSIX standard has its own method to get file size.
Include the sys/stat.h header to use the function.
Synopsis
Get file statistics using stat(3).
Obtain the st_size property.
Examples
Note: on systems where off_t is 32 bits, the first version cannot report sizes of 2 GB or more. If your files can be that large, use the 64-bit version!
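#include <stdio.h>
#include <sys/stat.h>
int main(int argc, char** argv)
{
    struct stat info;
    // stat returns 0 on success; bail out if the path is missing or unreadable
    if (argc < 2 || stat(argv[1], &info) != 0)
        return 1;
    // the 'st_' prefix is short for 'stat'
    printf("%s: size=%lld\n", argv[1], (long long)info.st_size);
    return 0;
}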
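#define _LARGEFILE64_SOURCE // exposes stat64 on glibc
#include <stdio.h>
#include <sys/stat.h>
int main(int argc, char** argv)
{
    struct stat64 info;
    // stat64 returns 0 on success, like stat
    if (argc < 2 || stat64(argv[1], &info) != 0)
        return 1;
    // the 'st_' prefix is short for 'stat'
    printf("%s: size=%lld\n", argv[1], (long long)info.st_size);
    return 0;
}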
ANSI C (standard)
ANSI C doesn't directly provide a way to determine the length of a file.
We'll have to use our heads. For now, we'll use the seek approach!
Synopsis
Seek the file to the end using fseek(3).
Get the current position using ftell(3).
Example
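#include <stdio.h>
int main(int argc, char** argv)
{
    FILE* fp;
    long f_size;
    // binary mode, so that ftell counts bytes
    if (argc < 2 || (fp = fopen(argv[1], "rb")) == NULL)
        return 1;
    fseek(fp, 0, SEEK_END);
    f_size = ftell(fp);
    rewind(fp); // go back to the start again
    printf("%s: size=%ld\n", argv[1], f_size);
    fclose(fp);
    return 0;
}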
If the file is stdin or a pipe, neither the POSIX nor the ANSI C method will work.
You will typically get 0 (or an error) instead of a size if the file is a pipe or stdin.
Opinion:
You should use the POSIX method instead, because it has 64-bit support.
And if you're building a Windows app, use the GetFileSizeEx API as CRT file I/O is messy, especially for determining file length, due to peculiarities in file representations on different systems ;)
If you're fine with using POSIX (stat is not part of the standard C library):
#include <sys/stat.h>
off_t fsize(char *file) {
struct stat filestat;
if (stat(file, &filestat) == 0) {
return filestat.st_size;
}
return 0;
}
I used this code to find the file length.
//opens the file as a stdio stream
FILE * i_file;
i_file = fopen(source, "r");
//gets the stream's file descriptor (an int) for fstat
int f_d = fileno(i_file);
struct stat buffer;
fstat(f_d, &buffer);
//stores file size
long file_length = buffer.st_size;
fclose(i_file);
I found a method using fseek and ftell, and a thread on this same question whose answers say it can't be done any other way in plain C.
You could use a portability library like NSPR (the library that powers Firefox).
In plain ISO C, there is only one way to determine the size of a file which is guaranteed to work: To read the entire file from the start, until you encounter end-of-file.
However, this is highly inefficient. If you want a more efficient solution, then you will have to either
rely on platform-specific behavior, or
resort to platform-specific functions, such as stat on Linux or GetFileSize on Microsoft Windows.
In contrast to what other answers have suggested, the following code is not guaranteed to work:
fseek( fp, 0, SEEK_END );
long size = ftell( fp );
Even if we assume that the data type long is large enough to represent the file size (which is questionable on some platforms, most notably Microsoft Windows), the posted code has the following problems:
The posted code is not guaranteed to work on text streams, because according to §7.21.9.4 ¶2 of the ISO C11 standard, the value of the file position indicator returned by ftell contains unspecified information. Only for binary streams is this value guaranteed to be the number of characters from the beginning of the file. There is no such guarantee for text streams.
The posted code is also not guaranteed to work on binary streams, because according to §7.21.9.2 ¶3 of the ISO C11 standard, binary streams are not required to meaningfully support SEEK_END.
That being said, on most common platforms, the posted code will work, if we assume that the data type long is large enough to represent the size of the file.
However, on Microsoft Windows, the characters \r\n (carriage return followed by line feed) will be translated to \n for text streams (but not for binary streams), so that the file size you get will count \r\n as two bytes, although you are only reading a single character (\n) in text mode. Therefore, the results you get will not be consistent.
On POSIX-based platforms (e.g. Linux), this is not an issue, because on those platforms, there is no difference between text mode and binary mode.
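For the POSIX case, where the seek approach does work in practice, a more careful version might look like this sketch. It assumes a POSIX system and a seekable stream; fsize_seek is a hypothetical name. It uses fseeko and ftello so the result is an off_t rather than a long, checks each call, and puts the file position back where it was.

```c
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical helper: size of a seekable stream via fseeko/ftello.
   Restores the original position. Returns -1 on any failure. */
off_t fsize_seek(FILE *fp)
{
    off_t start = ftello(fp);   /* remember where the caller was */
    off_t end;

    if (start == -1)
        return -1;              /* not seekable, e.g. a pipe */
    if (fseeko(fp, 0, SEEK_END) != 0)
        return -1;
    end = ftello(fp);
    if (fseeko(fp, start, SEEK_SET) != 0)
        return -1;
    return end;
}
```

This still relies on platform behavior that ISO C does not promise, so it is a convenience for POSIX targets rather than a portable answer.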
C++ (Windows API): this reads the size from the file's attribute data. I'm not sure whether it performs better than seeking, but since the size comes from metadata I think it is faster, because it doesn't need to read the entire file.
ULONGLONG GetFileSizeAtt(const wchar_t *wFile)
{
WIN32_FILE_ATTRIBUTE_DATA fileInfo;
ULONGLONG FileSize = 0ULL;
//https://learn.microsoft.com/nl-nl/windows/win32/api/fileapi/nf-fileapi-getfileattributesexa?redirectedfrom=MSDN
//https://learn.microsoft.com/nl-nl/windows/win32/api/fileapi/ns-fileapi-win32_file_attribute_data?redirectedfrom=MSDN
if (GetFileAttributesEx(wFile, GetFileExInfoStandard, &fileInfo))
{
ULARGE_INTEGER ul;
ul.HighPart = fileInfo.nFileSizeHigh;
ul.LowPart = fileInfo.nFileSizeLow;
FileSize = ul.QuadPart;
}
return FileSize;
}
Try this --
fseek(fp, 0, SEEK_END);
unsigned long int file_size = ftell(fp);
rewind(fp);
What this does is first, seek to the end of the file; then, report where the file pointer is. Lastly (this is optional) it rewinds back to the beginning of the file. Note that fp should be a binary stream.
file_size contains the number of bytes the file contains. Note that ftell returns a long, and on platforms where long is 32 bits (Windows, for example) that limits you to files smaller than 2 gigabytes; you'll need a different function and variable type if you're likely to deal with files larger than that.
I have a function that works well with only stdio.h. I like it a lot and it works very well and is pretty concise:
size_t fsize(FILE *File) {
    size_t FSZ;
    fseek(File, 0, SEEK_END); // SEEK_END rather than the magic number 2
    FSZ = ftell(File);
    rewind(File);
    return FSZ;
}
Here's a simple and clean function that returns the file size.
long get_file_size(char *path)
{
    FILE *fp;
    long size = -1;
    /* Open file for reading */
    fp = fopen(path, "r");
    if (fp == NULL)
        return -1; /* could not open the file */
    fseek(fp, 0, SEEK_END);
    size = ftell(fp);
    fclose(fp);
    return size;
}
You can open the file, then go to offset 0 relative to the end of the file with
fseek(handle, 0, SEEK_END)
and then call ftell(handle). fseek itself only returns 0 on success; the value returned from ftell is the size of the file.
I didn't code in C for a long time, but I think it should work.

Displaying size of a file [C]

I'm making a simple sockets program to send a text file or a picture file over to another socket connected to a port. However, I want to also send the size of the file over to the client socket so that it knows how many bytes to receive.
I also want to implement something where I can send a certain number of bytes instead of the file itself. For example, if a file I wanted to send was 14,003 bytes and I felt like sending 400 bytes, then only 400 bytes would be sent.
I am implementing something like this:
#include <stdio.h>
int main(int argc, char* argv[]) {
FILE *fp;
char* file = "text.txt";
int offset = 40;
int sendSize = 5;
int fileSize = 0;
if ((fp = fopen(file, "r")) == NULL) {
printf("Error: Cannot open the file!\n");
return 1;
} else {
/* Seek from offset into the file */
//fseek(fp, 0L, SEEK_END);
fseek(fp, offset, sendSize + offset); // seek to sendSize
fileSize = ftell(fp); // get current file pointer
//fseek(fp, 0, SEEK_SET); // seek back to beginning of file
}
printf("The size is: %d", fileSize);
}
offset is pretty much going to go 40 bytes into the file and then send whatever sendSize bytes over to the other program.
I keep getting an output of 0 instead of 5. Any reason behind this?
You can try this.
#include <stdio.h>
int main(int argc, char* argv[]) {
FILE *fp;
char* file = "text.txt";
int offset = 40;
int sendSize = 5;
int fileSize = 0;
if ((fp = fopen(file, "r")) == NULL) {
printf("Error: Cannot open the file!\n");
return 1;
} else {
fseek(fp, 0L, SEEK_END);
fileSize = ftell(fp);
}
printf("The size is: %d", fileSize);
}
The fseek() to the end, then ftell() method is a reasonably portable way of getting the size of a file, but not guaranteed to be correct. It won't transparently handle newline / carriage return conversions, and as a result, the standard doesn't actually guarantee that the return from ftell() is useful for any purpose other than seeking to the same position.
The only portable way is to read the file until data runs out and keep a count of bytes. Or stat() the file using the (non-ANSI) Unix standard function.
You may be opening the file in text mode as Windows can open a file in text mode even without the "t" option.
And you can't use ftell() to get the size of a file opened in text mode. Per 7.21.9.4 The ftell function of the C Standard:
For a text stream, its file position indicator contains unspecified information, usable by the fseek function for returning the file position indicator for the stream to its position at the time of the ftell call; the difference between two such return values is not necessarily a meaningful measure of the number of characters written or read.
Even if it does return the "size" of the file, the translation to "text" may change the actual number of bytes read.
It's also not portable or standard-conforming to use fseek() to find the end of a binary file. Per 7.21.9.2 The fseek function:
A binary stream need not meaningfully support fseek calls with a whence value of SEEK_END.
I think your seek does not work because of the 3rd parameter. Try seeking with
fseek(fp, offset, SEEK_SET);
The call in the question passes the number sendSize + offset (45) as the "origin" constant. That argument has to be one of the three constants listed below (usually 0, 1 and 2). Since 45 matches none of them, the seek typically fails, the position stays at the start of the file, and ftell returns 0 every time.
http://www.cplusplus.com/reference/cstdio/fseek/
Parameters
stream, offset, origin
origin: Position used as reference for the offset. It is specified by one of the following constants defined in <cstdio> exclusively to be used as arguments for this function:
Constant    Reference position
SEEK_SET    Beginning of file
SEEK_CUR    Current position of the file pointer
SEEK_END    End of file
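Applied to the question's goal of sending sendSize bytes starting at offset, a minimal sketch could look like this. The helper name read_range is hypothetical; the important part is that the third argument to fseek is SEEK_SET and that the byte count comes from fread, not from ftell.

```c
#include <stdio.h>

/* Hypothetical helper: reads up to count bytes starting offset bytes from
   the beginning of the stream. Returns the number of bytes actually read. */
size_t read_range(FILE *fp, long offset, void *out, size_t count)
{
    /* The third argument must be SEEK_SET, SEEK_CUR or SEEK_END. */
    if (fseek(fp, offset, SEEK_SET) != 0)
        return 0;
    return fread(out, 1, count, fp);
}
```

The return value is smaller than count when the file ends before count bytes are available, so the caller knows exactly how many bytes to send.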

Proper way to get file size in C

I am working on an assignment in socket programming in which I have to send a file between sparc and linux machine. Before sending the file in char stream I have to get the file size and tell the client. Here are some of the ways I tried to get the size but I am not sure which one is the proper one.
For testing purpose, I created a file with content " test" (space + (string)test)
Method 1 - Using fseeko() and ftello()
This is a method I found on https://www.securecoding.cert.org/confluence/display/c/FIO19-C.+Do+not+use+fseek()+and+ftell()+to+compute+the+size+of+a+regular+file
While fseek() has the problem that "Setting the file position indicator to end-of-file, as with fseek(file, 0, SEEK_END), has undefined behavior for a binary stream", fseeko() is said to have tackled this problem, but it only works on POSIX systems (which is fine because the environments I am using are sparc and linux).
fd = open(file_path, O_RDONLY);
fp = fopen(file_path, "rb");
/* Ensure that the file is a regular file */
if ((fstat(fd, &st) != 0) || (!S_ISREG(st.st_mode))) {
/* Handle error */
}
if (fseeko(fp, 0 , SEEK_END) != 0) {
/* Handle error */
}
file_size = ftello(fp);
fseeko(fp, 0, SEEK_SET);
printf("file size %zu\n", file_size);
This method works fine and gets the size correctly. However, it is limited to regular files only. I tried to google the term "regular file" but I still do not quite understand it. And I do not know if this function is reliable for my project.
Method 2 - Using strlen()
Since the max. size of a file in my project is 4MB, so I can just calloc a 4MB buffer. After that, the file is read into the buffer, and I tried to use the strlen to get the file size (or more correctly the length of content). Since strlen() is portable, can I use this method instead? The code snippet is like this
fp = fopen(file_path, "rb");
fread(file_buffer, 1024*1024*4, 1, fp);
printf("strlen %zu\n", strlen(file_buffer));
This method works too and returns
strlen 8
However, I couldn't see any similar approach on the Internet using this method. So I am thinking maybe I have missed something or there are some limitations of this approach which I haven't realized.
A regular file is a "normal" file, as opposed to something special like a device, socket, pipe, etc.
From your task description, it seems you need to retrieve the size of a normal file before sending it.
So your way is right:
FILE* fp = fopen(...);
if(fp) {
fseek(fp, 0 , SEEK_END);
long fileSize = ftell(fp);
fseek(fp, 0 , SEEK_SET);// needed for next read from beginning of file
...
fclose(fp);
}
but you can do it without opening file:
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
struct stat buffer;
int status;
status = stat("path to file", &buffer);
if(status == 0) {
// size of file is in member buffer.st_size;
}
OP can do it the easy way as "max. size of a file in my project is 4MB".
Rather than using strlen(), use the return value from fread(). strlen() stops on the first null character, so it may report too small a value. @Sami Kuhmonen Also, we do not know that the data read contains any null character at all, so it may not be a string. Append a null character (and allocate +1) if the code needs to use the data as a string. But in that case, I'd expect the file to be opened in text mode.
Note that many OS's do not even use allocated memory until it is written. See: Why is malloc not "using up" the memory on my computer?
fp = fopen(file_path, "rb");
if (fp) {
#define MAX_FILE_SIZE 4194304
    char *buf = malloc(MAX_FILE_SIZE);
    if (buf) {
        size_t numread = fread(buf, sizeof *buf, MAX_FILE_SIZE, fp);
        // shrink if desired; skipped for an empty read, since realloc(buf, 0) is not portable
        if (numread > 0) {
            char *tmp = realloc(buf, numread);
            if (tmp) {
                buf = tmp;
            }
        }
        // Use buf with numread char
        free(buf);
    }
    fclose(fp);
}
Note: Reading the entire file into memory may not be the best idea to begin with.

reading from a binary file in C

I am currently working on a project in which I have to read from a binary file and send it through sockets and I am having a hard time trying to send the whole file.
Here is what I wrote so far:
FILE *f = fopen(line,"rt");
//size = lseek(f, 0, SEEK_END)+1;
fseek(f, 0L, SEEK_END);
int size = ftell(f);
unsigned char buffer[MSGSIZE];
FILE *file = fopen(line,"rb");
while(fgets(buffer,MSGSIZE,file)){
sprintf(r.payload,"%s",buffer);
r.len = strlen(r.payload)+1;
res = send_message(&r);
if (res < 0) {
perror("[RECEIVER] Send ACK error. Exiting.\n");
return -1;
}
}
I think it has something to do with the size of the buffer that I read into, but I don't know what the correct formula for it is.
One more thing, is the sprintf done correctly?
If you are reading binary files, a NUL character may appear anywhere in the file.
Thus, using string functions like sprintf and strlen is a bad idea.
If you really need to use a second buffer (buffer), you could use memcpy.
You could also directly read into r.payload (if r.payload is already allocated with sufficient size).
You are looking for fread for a binary file.
The return value of fread tells you how many bytes were read into your buffer.
You may also consider to call fseek again.
See here How can I get a file's size in C?
Maybe your code could look like this:
#include <stdint.h>
#include <stdio.h>
#define MSGSIZE 512
struct r_t {
uint8_t payload[MSGSIZE];
int len;
};
int send_message(struct r_t *t);
int main() {
struct r_t r;
FILE *f = fopen("test.bin","rb");
if (f == NULL) {
perror("fopen");
return -1;
}
fseek(f, 0L, SEEK_END);
size_t size = ftell(f);
fseek(f, 0L, SEEK_SET);
do {
r.len = fread(r.payload, 1, sizeof(r.payload), f);
if (r.len > 0) {
int res = send_message(&r);
if (res < 0) {
perror("[RECEIVER] Send ACK error. Exiting.\n");
fclose(f);
return -1;
}
}
} while (r.len > 0);
fclose(f);
return 0;
}
No, the sprintf is not done correctly. It is prone to buffer overflow, a very serious security problem.
I would consider sending the file as e.g. 1024-byte chunks instead of as line-by-line, so I would replace the fgets call with an fread call.
Why are you opening the file twice? Apparently to get its size, but you could open it only once and jump back to the beginning of the file. And, you're not using the size you read for anything.
Is it a binary file or a text file? fgets() assumes you are reading a text file -- it stops on a line break -- but you say it's a binary file and open it with "rb" (actually, the first time you opened it with "rt", I assume that was a typo).
IMO you should never ever use sprintf. The number of characters written to the buffer depends on the parameters that are passed in, and in this case if there is no '\0' in buffer then you cannot predict how many bytes will be copied to r.payload, and there is a very good chance you will overflow that buffer.
I think sprintf() would be the first thing to fix. Use memcpy() and you can tell it exactly how many bytes to copy.
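As a rough sketch of the memcpy suggestion: the struct below is a hypothetical stand-in for the question's r (only the payload and len fields are taken from the question), and fill_payload is a made-up helper. The byte count is passed in explicitly, typically the value fread returned, so NUL bytes in the data do not matter.

```c
#include <stddef.h>
#include <string.h>

#define MSGSIZE 512

/* Hypothetical message type, modeled on the r.payload / r.len fields. */
struct msg {
    unsigned char payload[MSGSIZE];
    int len;
};

/* Copies exactly n bytes (capped at MSGSIZE) into the payload and
   records how many bytes were copied. */
void fill_payload(struct msg *m, const unsigned char *src, size_t n)
{
    if (n > MSGSIZE)
        n = MSGSIZE;
    memcpy(m->payload, src, n);   /* NUL bytes are copied like any other byte */
    m->len = (int)n;
}
```

Because the length is stored alongside the data, the receiver never has to call strlen on the payload.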

open_memstream with fseek to end pads buffer with zeros

I am using some C code that writes binary data to a file. In the process, it seeks around to different positions and then finally seeks to the end with fseeko(fp, 0, SEEK_END);.
However, in some cases, I want to work on a stream in memory instead. I use open_memstream for this, but seeking to the end pads the buffer with zeros and it ends up being twice as big as it should be.
An example just to demonstrate the effect of the fseek to the end of the stream is below. In the actual code, we also fseek to different parts of the stream, patching and editing bits of it, etc., as the stream is processed. Note also that writing the file at the end to the filesystem is just for demonstration to show the contents of the buffer – otherwise I wouldn't need the memory stream.
#include <stdio.h>
#include <stdlib.h>
#if (defined(BSD) || __APPLE__)
#include "open_memstream.h"
#endif
int main(void) {
FILE *stream;
FILE *outfile;
char *buf;
size_t buf_len;
int i;
stream = open_memstream(&buf, &buf_len);
for(i = 0; i < 1000; i++) {
fprintf(stream, "%d\n", i);
}
fseeko(stream, 0, SEEK_END);
fclose(stream);
outfile = fopen("out.txt", "w");
fwrite(buf, buf_len, 1, outfile);
fclose(outfile);
return 0;
}
I was testing this out on Mac OS X with this implementation of open_memstream and it worked as I expected, but when I run this on Linux the file is twice the size with zeros at the end.
What's the best way to deal with this? I'm not sure if it's reliable to divide the buffer length by two and truncate it.
I've just run into the same problem on Linux.
// It seems that SEEK_END does not work with open_memstream()
fseek(stream, 0, SEEK_END);
I've ended up doing this:
off_t o = ftell(stream);
/* do some things with the stream */
fseek(stream, o, SEEK_SET);
