Get the number of bytes in a file [duplicate] - c

How can I figure out the size of a file, in bytes?
#include <stdio.h>
unsigned int fsize(char* file){
//what goes here?
}

On Unix-like systems, you can use POSIX system calls: stat on a path, or fstat on an already-open file descriptor (POSIX man page, Linux man page).
(Get a file descriptor from open(2), or fileno(FILE*) on a stdio stream).
Based on NilObject's code:
#include <sys/stat.h>
#include <sys/types.h>
off_t fsize(const char *filename) {
struct stat st;
if (stat(filename, &st) == 0)
return st.st_size;
return -1;
}
Changes:
Made the filename argument a const char *.
Corrected the struct stat definition, which was missing the variable name.
Returns -1 on error instead of 0, which would be ambiguous for an empty file. off_t is a signed type so this is possible.
If you want fsize() to print a message on error, you can use this:
#include <sys/stat.h>
#include <sys/types.h>
#include <string.h>
#include <stdio.h>
#include <errno.h>
off_t fsize(const char *filename) {
struct stat st;
if (stat(filename, &st) == 0)
return st.st_size;
fprintf(stderr, "Cannot determine size of %s: %s\n",
filename, strerror(errno));
return -1;
}
On 32-bit systems you should compile this with the option -D_FILE_OFFSET_BITS=64, otherwise off_t will only hold values up to 2 GB. See the "Using LFS" section of Large File Support in Linux for details.
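A minimal sketch of the large-file point above, assuming a POSIX system and a C11 compiler. Defining _FILE_OFFSET_BITS at the top of the source file should have the same effect as passing -D_FILE_OFFSET_BITS=64, and a static assertion stops the build if off_t is still too narrow. The helper name off_t_bits is made up for this example.

```c
/* Must appear before any #include; same effect as -D_FILE_OFFSET_BITS=64. */
#define _FILE_OFFSET_BITS 64

#include <limits.h>
#include <sys/types.h>

/* Stop the build if off_t still cannot represent sizes of 2 GB and more. */
_Static_assert(sizeof(off_t) * CHAR_BIT >= 64, "off_t is narrower than 64 bits");

/* Hypothetical helper: width of off_t in bits, e.g. for a startup log line. */
int off_t_bits(void)
{
    return (int)(sizeof(off_t) * CHAR_BIT);
}
```

With the check in place, a misconfigured 32-bit build fails at compile time instead of silently mishandling big files at run time.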

Don't use int. Files over 2 gigabytes in size are common as dirt these days.
Don't use unsigned int. Files over 4 gigabytes in size are common as some slightly-less-common dirt.
Use off_t, which POSIX defines as a signed integer type; it is 64 bits wide on most current systems (on 32-bit systems, once large file support is enabled), which is what everyone should be using. We can redefine that to be 128 bits in a few years when we start having 8 exabyte files hanging around.
If you're on Windows, you should use GetFileSizeEx. It also uses a signed 64-bit integer, so it shares the same 8 exabyte limit.

Matt's solution should work, except that it's C++ instead of C, and the initial tell shouldn't be necessary.
unsigned long fsize(char* file)
{
    FILE * f = fopen(file, "rb");
    if (f == NULL)
        return 0; /* could not open; note that 0 is ambiguous with an empty file */
    fseek(f, 0, SEEK_END);
    unsigned long len = (unsigned long)ftell(f);
    fclose(f);
    return len;
}
Fixed your brace for you, too. ;)
Update: This isn't really the best solution. On Windows long is 32 bits, so ftell cannot report sizes of 2 GB or more, and it's likely slower than just using a platform-specific call like GetFileSizeEx or stat64.

Don't do this (why?):
Quoting the C99 standard doc that I found online: "Setting the file position indicator to end-of-file, as with fseek(file, 0, SEEK_END), has undefined behavior for a binary stream (because of possible trailing null characters) or for any stream with state-dependent encoding that does not assuredly end in the initial shift state."
Change the return type to int so that errors can be reported, and then use fseek() and ftell() to determine the file size.
int fsize(char* file) {
int size;
FILE* fh;
fh = fopen(file, "rb"); //binary mode
if(fh != NULL){
if( fseek(fh, 0, SEEK_END) ){
fclose(fh);
return -1;
}
size = ftell(fh);
fclose(fh);
return size;
}
return -1; //error
}

POSIX
The POSIX standard has its own method to get file size.
Include the sys/stat.h header to use the function.
Synopsis
Get file statistics using stat(3).
Obtain the st_size property.
Examples
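Note: where off_t is 32 bits (for example in 32-bit builds without large file support), stat only handles files smaller than 2 GB. In that case use the 64-bit version (the second example)!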
#include <stdio.h>
#include <sys/stat.h>
int main(int argc, char** argv)
{
    struct stat info;
    if (argc < 2 || stat(argv[1], &info) != 0)
        return 1;
    // 'st' is short for 'stat'
    printf("%s: size=%lld\n", argv[1], (long long)info.st_size);
}
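#define _LARGEFILE64_SOURCE // needed on glibc to declare stat64
#include <stdio.h>
#include <sys/stat.h>
int main(int argc, char** argv)
{
    struct stat64 info;
    if (argc < 2 || stat64(argv[1], &info) != 0)
        return 1;
    // 'st' is short for 'stat'
    printf("%s: size=%lld\n", argv[1], (long long)info.st_size);
}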
ANSI C (standard)
ANSI C doesn't directly provide a way to determine the length of a file.
We'll have to be a bit creative. For now, we'll use the seek approach!
Synopsis
Seek the file to the end using fseek(3).
Get the current position using ftell(3).
Example
#include <stdio.h>
int main(int argc, char** argv)
{
    FILE* fp;
    long f_size;
    if (argc < 2 || (fp = fopen(argv[1], "rb")) == NULL)
        return 1;
    fseek(fp, 0, SEEK_END);
    f_size = ftell(fp); // ftell returns a long
    rewind(fp); // go back to the start again
    printf("%s: size=%ld\n", argv[1], f_size);
    fclose(fp);
}
If the input is a pipe (for example stdin fed by another program), neither the POSIX nor the ANSI C approach will work.
You will typically get 0 or an error instead of a size.
Opinion:
You should use the POSIX approach instead, because it has 64-bit support.
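For the pipe/stdin caveat above, one option is to ask fstat what kind of file you have before trusting st_size. This is only a sketch, assuming a POSIX system; the helper name size_if_regular is made up.

```c
#include <sys/types.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: size in bytes of the file behind descriptor fd,
 * or -1 if fstat() fails or fd is not a regular file (pipe, terminal,
 * socket, ...), where st_size carries no useful size.
 */
long long size_if_regular(int fd)
{
    struct stat st;

    if (fstat(fd, &st) != 0)
        return -1;
    if (!S_ISREG(st.st_mode))
        return -1;
    return (long long)st.st_size;
}
```

You would call it as size_if_regular(fileno(fp)) for a stdio stream, or size_if_regular(STDIN_FILENO) for standard input, and fall back to reading the data when it returns -1.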

And if you're building a Windows app, use the GetFileSizeEx API as CRT file I/O is messy, especially for determining file length, due to peculiarities in file representations on different systems ;)
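A rough sketch of how the Windows call could sit next to the POSIX one behind a single function. The name portable_file_size is made up, the Windows branch opens the file with CreateFileA just to query it, and I have only thought this through rather than tested it on every platform, so treat it as a starting point.

```c
#include <stdint.h>

#ifdef _WIN32
#include <windows.h>
#else
#include <sys/types.h>
#include <sys/stat.h>
#endif

/* Hypothetical helper: size of the file at path in bytes, or -1 on error. */
int64_t portable_file_size(const char *path)
{
#ifdef _WIN32
    LARGE_INTEGER size;
    BOOL ok;
    HANDLE h = CreateFileA(path, GENERIC_READ, FILE_SHARE_READ, NULL,
                           OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);

    if (h == INVALID_HANDLE_VALUE)
        return -1;
    ok = GetFileSizeEx(h, &size);   /* fills in a signed 64-bit value */
    CloseHandle(h);
    return ok ? (int64_t)size.QuadPart : -1;
#else
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;
    return (int64_t)st.st_size;
#endif
}
```

Returning int64_t on both sides keeps callers from having to care whether the platform type is off_t or LARGE_INTEGER.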

If you're fine with using POSIX (stat is not part of the standard C library):
#include <sys/stat.h>
off_t fsize(char *file) {
struct stat filestat;
if (stat(file, &filestat) == 0) {
return filestat.st_size;
}
return 0;
}

I used this set of code to find the file length.
//opens the file as a stdio stream
FILE * i_file;
i_file = fopen(source, "r");
//gets the underlying file descriptor (an int) for fstat
int f_d = fileno(i_file);
struct stat buffer;
fstat(f_d, &buffer);
//stores file size
off_t file_length = buffer.st_size;
fclose(i_file);
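The same fileno/fstat idea can be wrapped in a function with error checks. A minimal sketch, assuming a POSIX system; the name stream_fsize is made up.

```c
#define _POSIX_C_SOURCE 200809L  /* exposes fileno() in strict ISO modes */

#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: size in bytes of the file behind an open stream,
 * or -1 on error. The stream's read/write position is left untouched.
 */
off_t stream_fsize(FILE *fp)
{
    struct stat st;
    int fd;

    if (fp == NULL)
        return -1;
    fd = fileno(fp);              /* descriptor underlying the stream */
    if (fd < 0 || fstat(fd, &st) != 0)
        return -1;
    return st.st_size;
}
```

If the stream has buffered output that has not been written yet, call fflush(fp) first, otherwise the size on disk will lag behind what you wrote.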

I found a method that uses fseek and ftell, and a thread on this same question whose answers say it can't be done any other way in plain C.
You could use a portability library like NSPR (the library that powers Firefox).

In plain ISO C, there is only one way to determine the size of a file which is guaranteed to work: To read the entire file from the start, until you encounter end-of-file.
However, this is highly inefficient. If you want a more efficient solution, then you will have to either
rely on platform-specific behavior, or
resort to platform-specific functions, such as stat on Linux or GetFileSize on Microsoft Windows.
In contrast to what other answers have suggested, the following code is not guaranteed to work:
fseek( fp, 0, SEEK_END );
long size = ftell( fp );
Even if we assume that the data type long is large enough to represent the file size (which is questionable on some platforms, most notably Microsoft Windows), the posted code has the following problems:
The posted code is not guaranteed to work on text streams, because according to §7.21.9.4 ¶2 of the ISO C11 standard, the value of the file position indicator returned by ftell contains unspecified information. Only for binary streams is this value guaranteed to be the number of characters from the beginning of the file. There is no such guarantee for text streams.
The posted code is also not guaranteed to work on binary streams, because according to §7.21.9.2 ¶3 of the ISO C11 standard, binary streams are not required to meaningfully support SEEK_END.
That being said, on most common platforms, the posted code will work, if we assume that the data type long is large enough to represent the size of the file.
However, on Microsoft Windows, the characters \r\n (carriage return followed by line feed) will be translated to \n for text streams (but not for binary streams), so that the file size you get will count \r\n as two bytes, although you are only reading a single character (\n) in text mode. Therefore, the results you get will not be consistent.
On POSIX-based platforms (e.g. Linux), this is not an issue, because on those platforms, there is no difference between text mode and binary mode.
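For completeness, the read-until-end-of-file approach mentioned at the top of this answer could look roughly like this. It uses only ISO C functions; the name count_bytes and the 4096-byte chunk size are arbitrary choices for the sketch.

```c
#include <stdio.h>

/*
 * Hypothetical helper: counts the bytes of the file at path by reading it
 * in binary mode until end-of-file. Returns 0 on success and stores the
 * count in *out; returns -1 if the file cannot be opened or a read fails.
 */
int count_bytes(const char *path, unsigned long long *out)
{
    unsigned char chunk[4096];
    unsigned long long total = 0;
    size_t got;
    FILE *fp = fopen(path, "rb");

    if (fp == NULL)
        return -1;
    while ((got = fread(chunk, 1, sizeof chunk, fp)) > 0)
        total += got;
    if (ferror(fp)) {             /* loop ended on an error, not on EOF */
        fclose(fp);
        return -1;
    }
    fclose(fp);
    *out = total;
    return 0;
}
```

As said above, this reads every byte, so it is only sensible when portability matters more than speed.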

C++ (MFC) version that takes the size from the Windows file attributes. I'm not sure whether this performs better than seeking, but since the size comes from metadata I think it is faster, because it doesn't need to read the entire file.
ULONGLONG GetFileSizeAtt(const wchar_t *wFile)
{
WIN32_FILE_ATTRIBUTE_DATA fileInfo;
ULONGLONG FileSize = 0ULL;
//https://learn.microsoft.com/nl-nl/windows/win32/api/fileapi/nf-fileapi-getfileattributesexa?redirectedfrom=MSDN
//https://learn.microsoft.com/nl-nl/windows/win32/api/fileapi/ns-fileapi-win32_file_attribute_data?redirectedfrom=MSDN
if (GetFileAttributesExW(wFile, GetFileExInfoStandard, &fileInfo)) // W variant: wFile is a wide string
{
ULARGE_INTEGER ul;
ul.HighPart = fileInfo.nFileSizeHigh;
ul.LowPart = fileInfo.nFileSizeLow;
FileSize = ul.QuadPart;
}
return FileSize;
}

Try this --
fseek(fp, 0, SEEK_END);
unsigned long int file_size = ftell(fp);
rewind(fp);
What this does is first, seek to the end of the file; then, report where the file pointer is. Lastly (this is optional) it rewinds back to the beginning of the file. Note that fp should be a binary stream.
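file_size contains the number of bytes the file contains. Note that ftell returns a long, so on platforms where long is 32 bits (for example Windows) this only works for files smaller than 2 GB, and even a 32-bit unsigned long tops out at 4294967295 bytes (4 gigabytes, see ULONG_MAX in limits.h); you'll need a different function and variable type if you're likely to deal with files larger than that.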

I have a function that works well with only stdio.h. I like it a lot and it works very well and is pretty concise:
size_t fsize(FILE *File) {
    long FSZ;
    fseek(File, 0, SEEK_END); // SEEK_END instead of the magic number 2
    FSZ = ftell(File);
    rewind(File);
    return FSZ < 0 ? 0 : (size_t)FSZ; // ftell returns -1 on error
}

Here's a simple and clean function that returns the file size.
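long get_file_size(char *path)
{
    FILE *fp;
    long size = -1;
    /* Open file for reading (binary mode, so no bytes get translated) */
    fp = fopen(path, "rb");
    if (fp == NULL)
        return size; /* still -1: the file could not be opened */
    if (fseek(fp, 0, SEEK_END) == 0)
        size = ftell(fp);
    fclose(fp);
    return size;
}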

You can open the file and go to offset 0 relative to the end of the file with
fseek(handle, 0, SEEK_END)
fseek itself only returns 0 on success; call ftell(handle) afterwards, and the position it returns is the size of the file.
I didn't code in C for a long time, but I think it should work.

Related

File get contents in C

What is the best way to get the contents of a file into a single character array?
I have read this question:
Easiest way to get file's contents in C
But from the comments, I've seen that the solution isn't great for large files. I do have access to the stat function. If the file size is over 4 gb, should I just return an error?
The contents of the file is encrypted and since it's supplied by the user it could be as large as anyone would want it to be. I want it to return an error and not crash if the file is too big. The main purpose of populating the character array with the contents of a file, is to compare it to another character array and also (if needed and configured to do so) to log both of these to a log file (or multiple log files if necessary).
You may use fstat(3) from sys/stat.h. Here is a little function that gets the size of the file, allocates memory if the file is smaller than 4 GB, and returns NULL otherwise. It returns a char * containing the contents of the whole file, which should be free'd after use.
#include <stdio.h>
#include <sys/stat.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <fcntl.h>
char *loadlfile(const char *path)
{
    int file_descr, c;
    FILE *fp;
    struct stat buf;
    char *p, *buffer;
    if ((file_descr = open(path, O_RDONLY)) < 0)
        return NULL;
    if (fstat(file_descr, &buf) != 0) {
        close(file_descr);
        return NULL;
    }
    // This check is resolved at preprocessing time and costs nothing at runtime.
    // It basically means "if this machine is not one of the popular 64-bit
    // architectures, it may have limits on its maximum memory size", so an
    // unnecessary call to malloc(3) is skipped for files that are too large.
    // x86-64 (GCC/Clang and MSVC), ARM64, Itanium (GCC and MSVC).
#if !defined(__x86_64__) && !defined(_M_X64) && !defined(__aarch64__) && !defined(__ia64__) && !defined(_M_IA64)
#define FILE_MAX_BYTES (4000000000)
    // buf.st_size is of type off_t, you may need to cast it.
    if (buf.st_size >= FILE_MAX_BYTES - 1) {
        close(file_descr);
        return NULL;
    }
#endif
    if (NULL == (buffer = malloc(buf.st_size + 1))) {
        close(file_descr);
        return NULL;
    }
    if (NULL == (fp = fdopen(file_descr, "rb"))) {
        free(buffer);
        close(file_descr);
        return NULL;
    }
    p = buffer;
    // Keep fgetc's result in an int so EOF is detected reliably, and never
    // write more than st_size bytes into the buffer.
    while (p < buffer + buf.st_size && (c = fgetc(fp)) != EOF)
        *p++ = (char)c;
    *p = '\0';
    fclose(fp); // also closes file_descr
    return buffer;
}
A very broad list of pre-defined macros for various things can be found at http://sourceforge.net/p/predef/wiki/Home/. The reason for the architecture and file size check is that malloc can be expensive at times, and it is best to skip it when it is not needed. Asking for a single block of nearly 4 GB on a machine that can address at most 4 GB is just a waste of those precious cycles.
From that guy's code just do, if I understand your question correctly:
char * buffer = 0;
long length;
FILE * f = fopen (filename, "rb");
if (f)
{
    fseek (f, 0, SEEK_END);
    length = ftell (f);
    if (length < 0 || length > MY_MAX_SIZE) {
        fclose (f); // don't leak the stream on the error path
        return -1;
    }
    fseek (f, 0, SEEK_SET);
    buffer = malloc (length);
    if (buffer && fread (buffer, 1, length, f) != (size_t)length)
    {
        free (buffer); // short read: treat the contents as unusable
        buffer = 0;
    }
    fclose (f);
}
if (buffer)
{
    // start to process your data / extract strings here...
}
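If you'd rather not trust the reported size at all, another option is to read in chunks and give up as soon as a limit is exceeded. This is only a sketch: read_file_capped, its parameters and the 4096-byte starting capacity are made up for illustration.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper: reads the file at path into a malloc'd buffer.
 * On success returns the buffer (caller frees it) and stores the byte
 * count in *len. Returns NULL if the file cannot be read, memory runs
 * out, or the file holds more than max_bytes bytes.
 */
unsigned char *read_file_capped(const char *path, size_t max_bytes, size_t *len)
{
    size_t cap = 4096, used = 0, got;
    unsigned char *buf, *tmp;
    int ok = 1;
    FILE *fp = fopen(path, "rb");

    if (fp == NULL)
        return NULL;
    buf = malloc(cap);
    if (buf == NULL) {
        fclose(fp);
        return NULL;
    }
    while (ok && (got = fread(buf + used, 1, cap - used, fp)) > 0) {
        used += got;
        if (used > max_bytes) {
            ok = 0;                   /* larger than the caller allows */
        } else if (used == cap) {     /* buffer full: try to double it */
            tmp = (cap <= (size_t)-1 / 2) ? realloc(buf, cap * 2) : NULL;
            if (tmp == NULL) {
                ok = 0;
            } else {
                buf = tmp;
                cap *= 2;
            }
        }
    }
    if (ferror(fp))
        ok = 0;
    fclose(fp);
    if (!ok) {
        free(buf);
        return NULL;
    }
    *len = used;
    return buf;
}
```

Since the contents are encrypted they may contain zero bytes, so compare the two arrays with memcmp and the returned length rather than with strcmp.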

Finding the size of a file created by fmemopen

I'm using fmemopen to create a variable FILE* fid to pass it to a function that reads data from an open file.
Somewhere in that function it uses the following code to find out the size of the file:
fseek(fid, 0, SEEK_END);
file_size = ftell(fid);
This works well in case of regular files, but in case of file ids created by fmemopen I always get file_size = 8192.
Any ideas why this happens?
Is there a method to get the correct file size that works for both regular files and files created with fmemopen?
EDIT:
my call to fmemopen:
fid = fmemopen(ptr, memSize, "r");
where memSize != 8192
EDIT2:
I created a minimal example:
#include <cstdlib>
#include <stdio.h>
#include <string.h>
using namespace std;
int main(int argc, char** argv)
{
const long unsigned int memsize = 1000000;
void * ptr = malloc(memsize);
FILE *fid = fmemopen(ptr, memsize, "r");
fseek(fid, 0, SEEK_END);
long int file_size = ftell(fid);
printf("file_size = %ld\n", file_size);
free(ptr);
return 0;
}
btw, I am currently working on another computer, and here I get file_size=0
In case of fmemopen, if you open using the option b then SEEK_END measures the size of the memory buffer. The value you see must be the default buffer size.
OK, I have got this mystery solved by myself. The documentation says:
If the opentype specifies append mode, then the initial file position is set to the first null character in the buffer
and later:
For a stream open for reading, null characters (zero bytes) in the buffer do not count as "end of file". Read operations indicate end of file only when the file position advances past size bytes.
It seems that fseek(fid, 0, SEEK_END) goes to the first zero byte in the buffer, and not to the end of the buffer.
Still looking for a method that will work on both standard and fmemopen files.

32 bit Windows and the 2GB file size limit (C with fseek and ftell)

I am attempting to port a small data analysis program from a 64 bit UNIX to a 32 bit Windows XP system (don't ask :)).
But now I am having problems with the 2GB file size limit (long not being 64 bit on this platform).
I have searched this website and others for possible solutions but cannot find any that are directly translatable to my problem.
The problem is in the use of fseek and ftell.
Does anyone know of a modification to the following two functions to make them work on 32 bit Windows XP for files larger than 2GB (actually on the order of 100GB)?
It is vital that the return type of nsamples is a 64 bit integer (possibly int64_t).
long nsamples(char* filename)
{
FILE *fp;
long n;
/* Open file */
fp = fopen(filename, "rb");
/* Find end of file */
fseek(fp, 0L, SEEK_END);
/* Get number of samples */
n = ftell(fp) / sizeof(short);
/* Close file */
fclose(fp);
/* Return number of samples in file */
return n;
}
and
void readdata(char* filename, short* data, long start, int n)
{
FILE *fp;
/* Open file */
fp = fopen(filename, "rb");
/* Skip to correct position */
fseek(fp, start * sizeof(short), SEEK_SET);
/* Read data */
fread(data, sizeof(short), n, fp);
/* Close file */
fclose(fp);
}
I tried using _fseeki64 and _ftelli64 using the following to replace nsamples:
__int64 nsamples(char* filename)
{
FILE *fp;
__int64 n;
int result;
/* Open file */
fp = fopen(filename, "rb");
if (fp == NULL)
{
perror("Error: could not open file!\n");
return -1;
}
/* Find end of file */
result = _fseeki64(fp, (__int64)0, SEEK_END);
if (result)
{
perror("Error: fseek failed!\n");
return result;
}
/* Get number of samples */
n = _ftelli64(fp) / sizeof(short);
printf("%I64d\n", n);
/* Close file */
fclose(fp);
/* Return number of samples in file */
return n;
}
For a file of 4815060992 bytes I get 260046848 samples (i.e. _ftelli64 gives 520093696 bytes), which is strange.
Curiously when I leave out the (__int64) cast in the call to _fseeki64 I get a runtime error (invalid argument).
Any ideas?
Sorry for not posting sooner, but I have been preoccupied with other projects for a while.
The following solution works:
__int64 nsamples(char* filename)
{
int fh;
__int64 n;
/* Open file */
fh = _open( filename, _O_BINARY );
/* Find end of file */
n = _lseeki64(fh, 0, SEEK_END);
/* Close file */
_close(fh);
return n / sizeof(short);
}
The trick was using _open instead of fopen to open the file.
I still don't understand exactly why this has to be done, but at least this works now.
Thanks to everyone for your suggestions which eventually pointed me in the right direction.
There are two functions called _fseeki64 and _ftelli64 that support longer file offsets even on 32 bit Windows:
int _fseeki64(FILE *stream, __int64 offset, int origin);
__int64 _ftelli64(FILE *stream);
My BC says:
520093696 + 4294967296 => 4815060992
I'm guessing that your print routine is 32-bit. Your offset returned is most likely correct but being chopped off somewhere.
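The arithmetic above is what you get when a 64-bit offset is squeezed through a 32-bit value somewhere along the way. A tiny sketch of just that effect (the helper name low32 is made up, and this says nothing about where in the program the truncation actually happens):

```c
#include <stdint.h>

/* Hypothetical helper: keeps only the low 32 bits of a 64-bit offset. */
uint32_t low32(int64_t offset)
{
    return (uint32_t)offset;   /* conversion to unsigned is modulo 2^32 */
}
```

low32(4815060992) gives 520093696, the byte count reported in the question, and adding 4294967296 (2^32) back recovers the real size.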
And for gcc, see SO question 1035657, where the advice is to compile with the flag -D_FILE_OFFSET_BITS=64 so that the hidden variable(s) of type off_t used by the f-move-around functions are 64 bits wide.
For MinGW: "Large-file support (LFS) has been implemented by redefining the stat and seek functions and types to their 64-bits equivalents. For fseek and ftell, separate LFS versions, fseeko and ftello, based on fsetpos and fgetpos, are provided in LibGw32C." (reference). In recent versions of gcc, fseeko and ftello are built-in and a separate library is not needed.
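Where fseeko and ftello are available, nsamples could be written along these lines. This is a sketch assuming a POSIX-style C library (it is not the _lseeki64 solution from the accepted answer), and the name nsamples64 is made up.

```c
#define _FILE_OFFSET_BITS 64      /* 64-bit off_t on 32-bit POSIX builds */
#define _POSIX_C_SOURCE 200809L   /* exposes fseeko() and ftello() */

#include <stdio.h>
#include <stdint.h>
#include <sys/types.h>

/*
 * Hypothetical rewrite of nsamples(): number of short-sized samples in
 * the file, or -1 on error.
 */
int64_t nsamples64(const char *filename)
{
    off_t bytes;
    FILE *fp = fopen(filename, "rb");

    if (fp == NULL)
        return -1;
    if (fseeko(fp, 0, SEEK_END) != 0) {
        fclose(fp);
        return -1;
    }
    bytes = ftello(fp);               /* -1 on failure */
    fclose(fp);
    if (bytes < 0)
        return -1;
    return (int64_t)(bytes / (off_t)sizeof(short));
}
```

To print the result without running into the 32-bit formatting problem discussed above, use the PRId64 macro from inttypes.h instead of %ld.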

How do you determine the size of a file in C?

How can I figure out the size of a file, in bytes?
#include <stdio.h>
unsigned int fsize(char* file){
//what goes here?
}
On Unix-like systems, you can use POSIX system calls: stat on a path, or fstat on an already-open file descriptor (POSIX man page, Linux man page).
(Get a file descriptor from open(2), or fileno(FILE*) on a stdio stream).
Based on NilObject's code:
#include <sys/stat.h>
#include <sys/types.h>
off_t fsize(const char *filename) {
struct stat st;
if (stat(filename, &st) == 0)
return st.st_size;
return -1;
}
Changes:
Made the filename argument a const char.
Corrected the struct stat definition, which was missing the variable name.
Returns -1 on error instead of 0, which would be ambiguous for an empty file. off_t is a signed type so this is possible.
If you want fsize() to print a message on error, you can use this:
#include <sys/stat.h>
#include <sys/types.h>
#include <string.h>
#include <stdio.h>
#include <errno.h>
off_t fsize(const char *filename) {
struct stat st;
if (stat(filename, &st) == 0)
return st.st_size;
fprintf(stderr, "Cannot determine size of %s: %s\n",
filename, strerror(errno));
return -1;
}
On 32-bit systems you should compile this with the option -D_FILE_OFFSET_BITS=64, otherwise off_t will only hold values up to 2 GB. See the "Using LFS" section of Large File Support in Linux for details.
Don't use int. Files over 2 gigabytes in size are common as dirt these days
Don't use unsigned int. Files over 4 gigabytes in size are common as some slightly-less-common dirt
IIRC the standard library defines off_t as an unsigned 64 bit integer, which is what everyone should be using. We can redefine that to be 128 bits in a few years when we start having 16 exabyte files hanging around.
If you're on windows, you should use GetFileSizeEx - it actually uses a signed 64 bit integer, so they'll start hitting problems with 8 exabyte files. Foolish Microsoft! :-)
Matt's solution should work, except that it's C++ instead of C, and the initial tell shouldn't be necessary.
unsigned long fsize(char* file)
{
FILE * f = fopen(file, "r");
fseek(f, 0, SEEK_END);
unsigned long len = (unsigned long)ftell(f);
fclose(f);
return len;
}
Fixed your brace for you, too. ;)
Update: This isn't really the best solution. It's limited to 4GB files on Windows and it's likely slower than just using a platform-specific call like GetFileSizeEx or stat64.
**Don't do this (why?):
Quoting the C99 standard doc that i found online: "Setting the file position indicator to end-of-file, as with fseek(file, 0, SEEK_END), has undefined behavior for a binary stream (because of possible trailing null characters) or for any stream with state-dependent encoding that does not assuredly end in the initial shift state.**
Change the definition to int so that error messages can be transmitted, and then use fseek() and ftell() to determine the file size.
int fsize(char* file) {
int size;
FILE* fh;
fh = fopen(file, "rb"); //binary mode
if(fh != NULL){
if( fseek(fh, 0, SEEK_END) ){
fclose(fh);
return -1;
}
size = ftell(fh);
fclose(fh);
return size;
}
return -1; //error
}
POSIX
The POSIX standard has its own method to get file size.
Include the sys/stat.h header to use the function.
Synopsis
Get file statistics using stat(3).
Obtain the st_size property.
Examples
Note: It limits the size to 4GB. If not Fat32 filesystem then use the 64bit version!
#include <stdio.h>
#include <sys/stat.h>
int main(int argc, char** argv)
{
struct stat info;
stat(argv[1], &info);
// 'st' is an acronym of 'stat'
printf("%s: size=%ld\n", argv[1], info.st_size);
}
#include <stdio.h>
#include <sys/stat.h>
int main(int argc, char** argv)
{
struct stat64 info;
stat64(argv[1], &info);
// 'st' is an acronym of 'stat'
printf("%s: size=%ld\n", argv[1], info.st_size);
}
ANSI C (standard)
The ANSI C doesn't directly provides the way to determine the length of the file.
We'll have to use our mind. For now, we'll use the seek approach!
Synopsis
Seek the file to the end using fseek(3).
Get the current position using ftell(3).
Example
#include <stdio.h>
int main(int argc, char** argv)
{
FILE* fp = fopen(argv[1]);
int f_size;
fseek(fp, 0, SEEK_END);
f_size = ftell(fp);
rewind(fp); // to back to start again
printf("%s: size=%ld", (unsigned long)f_size);
}
If the file is stdin or a pipe. POSIX, ANSI C won't work.
It will going return 0 if the file is a pipe or stdin.
Opinion:
You should use POSIX standard instead. Because, it has 64bit support.
And if you're building a Windows app, use the GetFileSizeEx API as CRT file I/O is messy, especially for determining file length, due to peculiarities in file representations on different systems ;)
If you're fine with using the std c library:
#include <sys/stat.h>
off_t fsize(char *file) {
struct stat filestat;
if (stat(file, &filestat) == 0) {
return filestat.st_size;
}
return 0;
}
I used this code to find the file length.
//opens the file as a stdio stream
FILE * i_file;
i_file = fopen(source, "r");
//gets the underlying file descriptor (an int) for fstat
int f_d = fileno(i_file);
struct stat buffer;
fstat(f_d, &buffer);
//stores file size
off_t file_length = buffer.st_size;
fclose(i_file);
I found a method using fseek and ftell, and another thread on this question whose answers say that it can't be done any other way in plain C.
You could use a portability library like NSPR (the library that powers Firefox).
In plain ISO C, there is only one way to determine the size of a file which is guaranteed to work: To read the entire file from the start, until you encounter end-of-file.
However, this is highly inefficient. If you want a more efficient solution, then you will have to either
rely on platform-specific behavior, or
resort to platform-specific functions, such as stat on Linux or GetFileSize on Microsoft Windows.
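The read-until-end-of-file approach mentioned above can be sketched as follows. This is only an illustration using plain ISO C stdio; the function name count_bytes and the 4096-byte buffer size are arbitrary choices, not something from the standard:

```c
#include <stdio.h>

/* Counts the bytes in a file by reading it to end-of-file in binary mode.
 * Returns the byte count, or -1 if the file cannot be opened or a read fails. */
long long count_bytes(const char *path)
{
    unsigned char buf[4096];
    long long total = 0;
    size_t n;
    FILE *fp = fopen(path, "rb");

    if (fp == NULL)
        return -1;
    while ((n = fread(buf, 1, sizeof buf, fp)) > 0)
        total += (long long)n;    /* add each chunk actually read */
    if (ferror(fp)) {             /* stopped because of an error, not EOF */
        fclose(fp);
        return -1;
    }
    fclose(fp);
    return total;
}
```

The cost is proportional to the file size, which is exactly the inefficiency described above.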
In contrast to what other answers have suggested, the following code is not guaranteed to work:
fseek( fp, 0, SEEK_END );
long size = ftell( fp );
Even if we assume that the data type long is large enough to represent the file size (which is questionable on some platforms, most notably Microsoft Windows), the posted code has the following problems:
The posted code is not guaranteed to work on text streams, because according to §7.21.9.4 ¶2 of the ISO C11 standard, the value of the file position indicator returned by ftell contains unspecified information. Only for binary streams is this value guaranteed to be the number of characters from the beginning of the file. There is no such guarantee for text streams.
The posted code is also not guaranteed to work on binary streams, because according to §7.21.9.2 ¶3 of the ISO C11 standard, binary streams are not required to meaningfully support SEEK_END.
That being said, on most common platforms, the posted code will work, if we assume that the data type long is large enough to represent the size of the file.
However, on Microsoft Windows, the characters \r\n (carriage return followed by line feed) will be translated to \n for text streams (but not for binary streams), so that the file size you get will count \r\n as two bytes, although you are only reading a single character (\n) in text mode. Therefore, the results you get will not be consistent.
On POSIX-based platforms (e.g. Linux), this is not an issue, because on those platforms, there is no difference between text mode and binary mode.
This is C++ (Win32/MFC) code that takes the size from the file's attribute data on Windows. I'm not sure whether it performs better than seeking, but since the size comes from metadata I think it is faster, because it doesn't need to read the file.
ULONGLONG GetFileSizeAtt(const wchar_t *wFile)
{
WIN32_FILE_ATTRIBUTE_DATA fileInfo;
ULONGLONG FileSize = 0ULL;
//https://learn.microsoft.com/nl-nl/windows/win32/api/fileapi/nf-fileapi-getfileattributesexa?redirectedfrom=MSDN
//https://learn.microsoft.com/nl-nl/windows/win32/api/fileapi/ns-fileapi-win32_file_attribute_data?redirectedfrom=MSDN
if (GetFileAttributesEx(wFile, GetFileExInfoStandard, &fileInfo))
{
ULARGE_INTEGER ul;
ul.HighPart = fileInfo.nFileSizeHigh;
ul.LowPart = fileInfo.nFileSizeLow;
FileSize = ul.QuadPart;
}
return FileSize;
}
Try this --
fseek(fp, 0, SEEK_END);
unsigned long int file_size = ftell(fp);
rewind(fp);
What this does is first, seek to the end of the file; then, report where the file pointer is. Lastly (this is optional) it rewinds back to the beginning of the file. Note that fp should be a binary stream.
file_size contains the number of bytes the file contains. Note that where unsigned long is 32 bits wide, its maximum (ULONG_MAX in limits.h) is 4294967295, i.e. 4 gigabytes, so you'll need to find a different variable type if you're likely to deal with files larger than that. Also note that ftell returns a signed long, so a failed call (-1) shows up in file_size as a very large value.
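On POSIX systems, one candidate for that different type is off_t, used together with fseeko and ftello in place of fseek and ftell. A rough sketch under that assumption; the name stream_size_large is made up for illustration, and on 32-bit targets off_t is only 64 bits wide when compiling with -D_FILE_OFFSET_BITS=64:

```c
#define _POSIX_C_SOURCE 200809L  /* expose fseeko() and ftello() */
#include <stdio.h>
#include <sys/types.h>

/* Returns the size of an open binary stream via fseeko/ftello and puts the
 * file position back where it was. Returns -1 on failure. */
off_t stream_size_large(FILE *fp)
{
    off_t start, end;

    start = ftello(fp);                   /* remember the current position */
    if (start == (off_t)-1)
        return -1;
    if (fseeko(fp, 0, SEEK_END) != 0)
        return -1;
    end = ftello(fp);                     /* offset of the end = size */
    if (fseeko(fp, start, SEEK_SET) != 0) /* restore the position */
        return -1;
    return end;
}
```

Unlike the rewind version, this leaves the stream at its original offset, so it can be called in the middle of reading a file.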
I have a function that works well with only stdio.h. I like it a lot and it works very well and is pretty concise:
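size_t fsize(FILE *File) {
    size_t FSZ;
    fseek(File, 0, SEEK_END); // SEEK_END rather than the magic number 2
    FSZ = (size_t)ftell(File); // a failed ftell (-1) becomes (size_t)-1
    rewind(File);
    return FSZ;
}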
Here's a simple and clean function that returns the file size.
long get_file_size(const char *path)
{
    FILE *fp;
    long size = -1;
    /* Open file for reading in binary mode */
    fp = fopen(path, "rb");
    if (fp == NULL)
        return -1; /* could not open the file */
    if (fseek(fp, 0, SEEK_END) == 0)
        size = ftell(fp);
    fclose(fp);
    return size;
}
You can open the file and seek to offset 0 relative to the end of the file with
fseek(handle, 0, SEEK_END)
fseek itself only returns 0 on success; the value returned by a following ftell(handle) call is the size of the file.
I haven't coded in C for a long time, but I think it should work.
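On POSIX systems the descriptor-level lseek call, unlike fseek, does return the resulting offset, so seeking to the end gives the size directly. A small sketch assuming a POSIX platform; size_by_lseek is a hypothetical name:

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns the size in bytes of the file at `path` using lseek,
 * or -1 if the file cannot be opened or is not seekable. */
off_t size_by_lseek(const char *path)
{
    off_t size;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;
    size = lseek(fd, 0, SEEK_END);  /* new offset from the start = file size */
    close(fd);
    return size;
}
```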

Resources