ReadFile function returns ERROR_INVALID_PARAMETER - c

I'm trying to get the ReadFile function to work. Here's my code:
#define BUFFERSIZE 5
int main(int argc, char* argv[])
{
OVERLAPPED overlapIn = {};
HANDLE tHandle;
char buf[BUFFERSIZE] = {};
DWORD lpNumberOfBytesRead;
tHandle = CreateFile(
L"\\\\.\\D:",
GENERIC_READ,
FILE_SHARE_READ,
NULL,
OPEN_EXISTING,
FILE_ATTRIBUTE_NORMAL,
NULL);
if (tHandle == INVALID_HANDLE_VALUE)
{
DWORD error = GetLastError();
assert(0);
}
if (ReadFile(tHandle, &buf, BUFFERSIZE - 1, &lpNumberOfBytesRead, NULL) == 0)
{
int error = GetLastError();
printf("Terminal failure: Unable to read from disk.\n GetLastError=%d\n", error);
CloseHandle(tHandle);
return 1;
}
The GetLastError function returns code 87, which is ERROR_INVALID_PARAMETER.
It's clear that one of the parameters is wrong, but I have no idea which one, since I tried to do everything like it's written in the documentation.

This is described in the documentation for CreateFile:
Volume handles can be opened as noncached at the discretion of the particular file system, even when the noncached option is not specified in CreateFile. You should assume that all Microsoft file systems open volume handles as noncached.
The MSDN article on File Buffering describes the requirements for noncached handles:
File access sizes, including the optional file offset in the OVERLAPPED structure, if specified, must be for a number of bytes that is an integer multiple of the volume sector size. For example, if the sector size is 512 bytes, an application can request reads and writes of 512, 1,024, 1,536, or 2,048 bytes, but not of 335, 981, or 7,171 bytes.
File access buffer addresses for read and write operations should be physical sector-aligned, which means aligned on addresses in memory that are integer multiples of the volume's physical sector size. Depending on the disk, this requirement may not be enforced.
Rigorous code should check the sector size for the file system in question, then use this approach to allocate the memory. However, in my experience, the sector size has always been less than or equal to the allocation granularity, so you can get away with just using VirtualAlloc() to allocate a memory block.
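The size and alignment rules quoted above reduce to simple integer arithmetic. Here is a minimal sketch in portable C, assuming the sector size is a power of two (as with the common 512- and 4096-byte sectors) and has already been queried from the volume. The names round_up_to_sector and is_sector_aligned are hypothetical helpers for illustration, not Win32 functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round `size` up to the next multiple of `sector`.
 * Assumption: `sector` is a non-zero power of two. */
size_t round_up_to_sector(size_t size, size_t sector)
{
    assert(sector != 0 && (sector & (sector - 1)) == 0);
    return (size + sector - 1) & ~(sector - 1);
}

/* Non-zero when the address `p` is a multiple of `sector`,
 * i.e. usable as a buffer address for a noncached read. */
int is_sector_aligned(const void *p, size_t sector)
{
    assert(sector != 0 && (sector & (sector - 1)) == 0);
    return ((uintptr_t)p & (sector - 1)) == 0;
}
```

With a 512-byte sector, the 4-byte request in the question (BUFFERSIZE - 1) rounds up to 512, which would be the smallest read size that satisfies the size rule.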

The buffer size needs to be aligned to the disk's sector size. For example (ANSI build assumed, error handling kept minimal):
WIN32_FIND_DATA atr = {0};
DWORD BYTES_PER_SECTOR = 0, bytesread = 0;
HANDLE find, handle;
char path[MAX_PATH];
/* get the current dir and turn it into a search pattern for its files */
GetCurrentDirectory(MAX_PATH, path);
strcat_s(path, MAX_PATH, "\\*");
/* windows function to get disk details (NULL = root of the current disk) */
GetDiskFreeSpace(NULL, NULL, &BYTES_PER_SECTOR, NULL, NULL);
/* find first file in dir */
find = FindFirstFile(path, &atr);
while (find != INVALID_HANDLE_VALUE) {
    /* get the file size from its high and low halves */
    ULONGLONG filesize = ((ULONGLONG)atr.nFileSizeHigh << 32) | atr.nFileSizeLow;
    /* sector size aligned file size (rounded up to a whole sector) */
    SIZE_T buffer_size = (SIZE_T)((filesize + BYTES_PER_SECTOR - 1) & ~((ULONGLONG)BYTES_PER_SECTOR - 1));
    /* create buffer; memory from VirtualAlloc is page-aligned */
    void *buffer = VirtualAlloc(NULL, buffer_size, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
    /* open the existing file without buffering (fails for directories) */
    handle = CreateFile(atr.cFileName, GENERIC_READ | GENERIC_WRITE, 0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL | FILE_FLAG_NO_BUFFERING, NULL);
    if (buffer != NULL && handle != INVALID_HANDLE_VALUE) {
        /* read the file in to buffer (files of 4 GB or more would need several reads) */
        ReadFile(handle, buffer, (DWORD)buffer_size, &bytesread, NULL);
    }
    if (handle != INVALID_HANDLE_VALUE) CloseHandle(handle);
    if (buffer != NULL) VirtualFree(buffer, 0, MEM_RELEASE);
    if (FindNextFile(find, &atr) == 0) { printf("last file processed, leaving\n"); break; }
}
if (find != INVALID_HANDLE_VALUE) FindClose(find);

Related

Mapping files into virtual memory in C on Windows

On POSIX systems, I am able to use the mmap function to read the contents of a file faster than getline, getc, etc. This is important in the program that I am developing as it is expected to read very large files into memory; iteratively collecting lines using getline is too costly. Portability is also a requirement of my software, so if I use mmap, I need to find a way to memory map files using the WinApi, as I'd rather not compile through cygwin/msys. From a cursory search I identified this MSDN article which describes very briefly a way to map files into memory, however, from trawling through documentation I can't make head nor tails of how to actually implement it, and I'm stuck on finding example snippets of code, like there are for POSIX mmap.
How do I use the WinApi's memory mapping options to read a file into a char*?
How do I use the WinApi's memory mapping options to read a file into a
char*?
Under Windows, when you map a file in memory, you get a pointer to the memory location where the first byte of the file has been mapped. You can cast that pointer to whatever datatype you like, including char*.
In other words, it is Windows that decides where the mapped data will be in memory. You cannot provide a char* and expect Windows to load the data there.
This means that if you already have a char* and want the data from the file at the location pointed to by that char*, then you have to copy it, which is not a good idea in terms of performance.
Here is a simple program dumping a text file by mapping the file into memory and then displaying all ASCII characters. Tested with MSVC2019.
#include <stdio.h>
#include <Windows.h>
int main(int argc, char *argv[])
{
TCHAR *lpFileName = TEXT("hello.txt");
HANDLE hFile;
HANDLE hMap;
LPVOID lpBasePtr;
LARGE_INTEGER liFileSize;
hFile = CreateFile(lpFileName,
GENERIC_READ, // dwDesiredAccess
0, // dwShareMode
NULL, // lpSecurityAttributes
OPEN_EXISTING, // dwCreationDisposition
FILE_ATTRIBUTE_NORMAL, // dwFlagsAndAttributes
0); // hTemplateFile
if (hFile == INVALID_HANDLE_VALUE) {
fprintf(stderr, "CreateFile failed with error %d\n", GetLastError());
return 1;
}
if (!GetFileSizeEx(hFile, &liFileSize)) {
fprintf(stderr, "GetFileSizeEx failed with error %d\n", GetLastError());
CloseHandle(hFile);
return 1;
}
if (liFileSize.QuadPart == 0) {
fprintf(stderr, "File is empty\n");
CloseHandle(hFile);
return 1;
}
hMap = CreateFileMapping(
hFile,
NULL, // Mapping attributes
PAGE_READONLY, // Protection flags
0, // MaximumSizeHigh
0, // MaximumSizeLow
NULL); // Name
if (hMap == 0) {
fprintf(stderr, "CreateFileMapping failed with error %d\n", GetLastError());
CloseHandle(hFile);
return 1;
}
lpBasePtr = MapViewOfFile(
hMap,
FILE_MAP_READ, // dwDesiredAccess
0, // dwFileOffsetHigh
0, // dwFileOffsetLow
0); // dwNumberOfBytesToMap
if (lpBasePtr == NULL) {
fprintf(stderr, "MapViewOfFile failed with error %d\n", GetLastError());
CloseHandle(hMap);
CloseHandle(hFile);
return 1;
}
// Display file content as ASCII characters
char *ptr = (char *)lpBasePtr;
LONGLONG i = liFileSize.QuadPart;
while (i-- > 0) {
fputc(*ptr++, stdout);
}
UnmapViewOfFile(lpBasePtr);
CloseHandle(hMap);
CloseHandle(hFile);
printf("\nDone\n");
}

Prevent CreateFileMapping from altering source file (Windows)

I'm studying memory mapping on Windows and wrote the following piece of code (I omitted error handling from the copy for the sake of readability):
HANDLE file_h = CreateFile(filename, GENERIC_READ | GENERIC_WRITE, 0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
HANDLE map_h = CreateFileMapping(file_h, NULL, PAGE_READWRITE, 0, lengthOfFile + padding, NULL);
char * map;
map = MapViewOfFile(map_h, FILE_MAP_COPY, 0, 0, lengthOfFile + padding);
Here lengthOfFile is the length of the file, which I previously calculated this way (for reasons that are irrelevant to this case):
FILE * file;
file = fopen(filename,"rb");
fseek(file, 0, SEEK_END);
int lengthOfFile = ftell(file);
fseek(file, 0, SEEK_SET);
int last = (lengthOfFile % 4);
int n_pack = (int)(lengthOfFile / 4);
int padding = 4 - last;
And padding is some additional length I needed to add (you may figure out why by reading the code above).
After that, I perform some operations with the memory mapped file which involve its modification and the dispatch of its new value to another function.
How can I make it so that when I close both the file_h and map_h handles the source file (filename) stays unaltered (right now, as soon as I close its handle it gets modified because of that additional padding which apparently gets "flushed" to the source file right after its handle is closed) ?
I tried combining the PAGE_WRITECOPY flag with PAGE_READWRITE (which is needed to modify the content of the memory mapped file), but the CreateFileMapping function then fails with an ERROR_INVALID_PARAMETER (87) error.
In other words, I need to achieve the same behavior I managed to get in Unix using:
mmap(0, lengthOfFile + padding, PROT_READ | PROT_WRITE, MAP_PRIVATE, fileno(file), 0);
I guess the key point is the MAP_PRIVATE attribute.
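For reference, the Unix behaviour being asked for can be sketched as follows. This is a minimal POSIX illustration, not Windows code, and scribble_on_private_mapping is a hypothetical helper name. It maps a file with MAP_PRIVATE, overwrites the mapped bytes in memory, and unmaps. Because the pages are copy-on-write, the file on disk should be left untouched.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Maps `path` privately, overwrites every mapped byte with 'X' in memory
 * only, then unmaps. Returns the number of bytes touched, or -1 on error.
 * Hypothetical helper for illustration. */
long scribble_on_private_mapping(const char *path)
{
    int fd = open(path, O_RDONLY);      /* read access is enough for MAP_PRIVATE */
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size == 0) {
        close(fd);
        return -1;                      /* cannot map an empty file */
    }

    size_t len = (size_t)st.st_size;
    char *map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    close(fd);                          /* the mapping stays valid after close */
    if (map == MAP_FAILED)
        return -1;

    memset(map, 'X', len);              /* modifies private copies of the pages */
    munmap(map, len);                   /* private changes are discarded here */
    return (long)len;
}
```

On the Windows side, the CreateFileMapping documentation states that requesting a mapping larger than the file with PAGE_READWRITE protection grows the file on disk to that size, which would be consistent with the padding appearing in the source file.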

SetFilePointerEx get file size

I am new to programming, especially Windows system programming, and I'm reading a relevant book. Currently I'm playing around with GetFileSizeEx, SetFilePointer and SetFilePointerEx in order to get the size of a file.
I have created this code that works until line 65 where I can't get SetFilePointerEx to work to get the size.
#include <Windows.h>
#include <tchar.h>
#include <stdio.h>
#define BUFF_SIZE 0x100
// program to test file size
int _tmain(DWORD argc, LPTSTR argv[])
{
HANDLE hIn;
HANDLE hOut;
LARGE_INTEGER liSize;
LONG lSize, lDistance = 0;
TCHAR szMsgGetFile[BUFF_SIZE];
TCHAR szMsgSetFile[BUFF_SIZE];
DWORD nIn;
LARGE_INTEGER liPt;
PLARGE_INTEGER pLi;
pLi = &liPt;
SecureZeroMemory(&liSize, sizeof(LARGE_INTEGER));
SecureZeroMemory(&pLi, sizeof(LARGE_INTEGER));
SecureZeroMemory(szMsgGetFile, _tcslen(szMsgGetFile));
SecureZeroMemory(szMsgSetFile, _tcslen(szMsgSetFile));
//get input and output handles
hIn = CreateFile(argv[1], GENERIC_READ, FILE_SHARE_READ, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (hIn == INVALID_HANDLE_VALUE)
_tprintf(_T("[ERROR] CreateFile to get file input handle failed. Error code %d.\n"), GetLastError());
hOut = CreateFile(_T("CONOUT$"), GENERIC_WRITE, 0, NULL, OPEN_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
if (hOut == INVALID_HANDLE_VALUE)
_tprintf(_T("[ERROR] CreateFile to get file output handle failed. Error code %d.\n"), GetLastError());
//get the size of the file with GetFileSizeEx, acquired from hIn that is argv1
if (!GetFileSizeEx(hIn, &liSize))
_tprintf(_T("[ERROR] GetFileSizeEx failed. Error code %d\n"), GetLastError());
//get the size of the file with SetFilePointer
//You can obtain the file length by specifying a zero-length move from the end of
//file, although the file pointer is changed as a side effect
lSize = SetFilePointer(hIn, lDistance, NULL, FILE_END);
if (lSize == INVALID_SET_FILE_POINTER)
_tprintf(_T("[ERROR] SetFilePointer failed. Error code %d\n"), GetLastError());
//output the size with WriteConsole (and sprintf)
//and with _tprintf. Notice the usage of the liSize LARGE_INTEGER
_stprintf_s(szMsgGetFile, BUFF_SIZE, "[*] GetFileSizeEx (WriteConsole): The size is %I64d Bytes.\n", liSize.QuadPart);
if (!WriteConsole(hOut, szMsgGetFile, _tcslen(szMsgGetFile), &nIn, NULL))
_tprintf(_T("[ERROR] WriteConsole failed. Error code %d\n"), GetLastError());
_tprintf(_T("[*] GetFileSizeEx (tprintf): The size is %I64d Bytes.\n"), liSize.QuadPart);
//output the size with WriteConsole (and sprintf)
//and _tprintf
_stprintf_s(szMsgSetFile, BUFF_SIZE, "[*] SetFilePointer (WriteConsole): The size is %ld Bytes.\n", lSize);
if (!WriteConsole(hOut, szMsgSetFile, _tcslen(szMsgSetFile), &nIn, NULL))
_tprintf(_T("[ERROR] WriteConsole failed. Error code %d\n"), GetLastError());
_tprintf(_T("[*] SetFilePointer (tprintf): The size is %ld Bytes.\n"), lSize);
//get the size of the file with SetFilePointerEx
//Determine a file’s size by positioning 0 bytes from the end and using the file
//pointer value returned by SetFilePointerEx.
SecureZeroMemory(&liPt, sizeof(LARGE_INTEGER));
SetFilePointerEx(hIn, liPt, pLi, FILE_END);
_tprintf(_T("[*] SetFilePointerEx: %lld Bytes.\n"), pLi->QuadPart);
return 0;
}
MSDN says that
You can use SetFilePointerEx to determine the length of a file. To do this, use FILE_END for dwMoveMethod and seek to location zero. The file offset returned is the length of the file.
However, SetFilePointerEx is of type BOOL. The "Windows System Programming" book says: "Determine a file’s size by positioning 0 bytes from the end and using the file pointer value returned by SetFilePointerEx." I am guessing that this value is returned through the _Out_opt_ PLARGE_INTEGER lpNewFilePointer parameter described on MSDN.
I would like help on how to get the file size of the file by using SetFilePointerEx.
You have a number of errors in your code. Here's an example of SetFilePointerEx that works. In general, Win32 functions don't allocate the memory to store their output (some do). It's up to the caller to allocate memory. In this case, the memory for the output of SetFilePointerEx is allocated on the stack by declaring size2 to be a LARGE_INTEGER. A pointer to that LARGE_INTEGER is then provided to SetFilePointerEx.
auto hIn = CreateFile(_T("C:\\foo"), GENERIC_READ, FILE_SHARE_READ, nullptr, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, nullptr);
/* check for errors */
LARGE_INTEGER size;
GetFileSizeEx(hIn, &size);
/* check for errors */
LARGE_INTEGER size2;
LARGE_INTEGER offset;
ZeroMemory(&offset, sizeof offset);
SetFilePointerEx(hIn, offset, &size2, FILE_END);
/* check for errors */
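Optionally, DWORD dwFileSize = GetFileSize(hFile, NULL); can get the size of the file opened by HANDLE hFile = CreateFileA/W(...);. Note that with a NULL second argument only the low-order 32 bits of the size are returned, so this is only reliable for files smaller than 4 GB.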

reading and writing in chunks on linux using c

I have an ASCII file where every line contains a record of variable length. For example
Record-1:15 characters
Record-2:200 characters
Record-3:500 characters
...
...
Record-n: X characters
As the file size is about 10GB, I would like to read the records in chunks. Once read, I need to transform them and write them into another file in binary format.
So, for reading, my first reaction was to create a char array such as
FILE *stream;
char buffer[104857600]; //100 MB char array
fread(buffer, 1, sizeof(buffer), stream);
Is it correct to assume that Linux will issue one system call and fetch the entire 100MB?
As the records are separated by newlines, I search the buffer character by character for a newline character and reconstruct each record.
My question is: is this how I should read in chunks, or is there a better alternative for reading data in chunks and reconstituting each record? Is there an alternative way to read x number of variable-sized lines from an ASCII file in one call?
Next, during write, I do the same. I have a write char buffer, which I pass to fwrite to write a whole set of records in one call.
fwrite(buffer, 1, sizeof(buffer), stream);
UPDATE: If I call setbuf(stream, buffer), where buffer is my 100MB char buffer, would fgets return data from the buffer or cause disk I/O?
Yes, fread will fetch the entire thing at once. (Assuming it's a regular file.) But it won't read 105 MB unless the file itself is 105 MB, and if you don't check the return value you have no way of knowing how much data was actually read, or if there was an error.
Use fgets (see man fgets) instead of fread. This will search for the line breaks for you.
char linebuf[1000];
FILE *file = ...;
while (fgets(linebuf, sizeof(linebuf), file)) {
// decode one line
}
There is a problem with your code.
char buffer[104857600]; // too big
If you try to allocate a large buffer (105 MB is certainly large) on the stack, then it will fail and your program will crash. If you need a buffer that big, you will have to allocate it on the heap with malloc or similar. I'd certainly keep stack usage for a single function in the tens of KB at most, although you could probably get away with a few MB on most stock Linux systems.
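The points above (heap allocation, and checking how much fread actually returned) can be sketched as follows. This is a minimal illustration rather than a full converter: count_records_chunked is a hypothetical name, chunk_size is assumed to be greater than zero, and the per-record work is reduced to counting newline-terminated records. Note that fread takes an element size and a count, so passing 1 as the size makes the return value a byte count.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Reads `stream` in chunks of `chunk_size` bytes using a heap buffer and
 * returns the number of newline-terminated records seen, or -1 on error.
 * Assumption: chunk_size > 0. */
long count_records_chunked(FILE *stream, size_t chunk_size)
{
    char *buffer = malloc(chunk_size);      /* heap, not stack */
    if (buffer == NULL)
        return -1;

    long records = 0;
    size_t got;
    /* element size 1, count chunk_size: the return value is a byte count */
    while ((got = fread(buffer, 1, chunk_size, stream)) > 0) {
        const char *p = buffer;
        const char *end = buffer + got;
        /* scan only the bytes that were actually read */
        while ((p = memchr(p, '\n', (size_t)(end - p))) != NULL) {
            records++;
            p++;                            /* continue after the newline */
        }
    }

    int failed = ferror(stream);            /* tell a read error from EOF */
    free(buffer);
    return failed ? -1 : records;
}
```

A real converter would also need to carry a partial record at the end of one chunk over to the start of the next, since records will not generally end on chunk boundaries.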
As an alternative, you could just mmap the entire file into memory. This will not improve or degrade performance in most cases, but it is easier to work with.
int r, fdes;
struct stat st;
void *ptr;
size_t sz;
fdes = open(filename, O_RDONLY);
if (fdes < 0) abort();
r = fstat(fdes, &st);
if (r) abort();
if (st.st_size > (size_t) -1) abort(); // too big to map
sz = st.st_size;
ptr = mmap(NULL, sz, PROT_READ, MAP_SHARED, fdes, 0);
if (ptr == MAP_FAILED) abort();
close(fdes); // file no longer needed
// now, ptr has the data, sz has the data length
// you can use ordinary string functions
The advantage of using mmap is that your program won't run out of memory. On a 64-bit system, you can put the entire file into your address space at the same time (even a 10 GB file), and the system will automatically read new chunks as your program accesses the memory. The old chunks will be automatically discarded, and re-read if your program needs them again.
It's a very nice way to plow through large files.
If you can, you might find that mmaping the file will be easiest. mmap maps a (portion of a) file into memory so the whole file can be accessed essentially as an array of bytes. In your case, where you might not be able to map the whole file at once, it would look something like:
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>
#include <sys/mman.h>
/* ... */
struct stat stat_buf;
long pagesz = sysconf(_SC_PAGESIZE);
int fd = fileno(stream);
off_t line_start = 0;
char *file_chunk = NULL;
char *input_line;
off_t cur_off = 0;
off_t map_offset = 0;
/* map 16M plus pagesize to ensure any record <= 16M will always fit in the mapped area */
size_t map_size = 16*1024*1024+pagesz;
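/* get the file size before using it */
fstat(fd, &stat_buf);
/* limit mapped region to size of file */
if (map_offset + map_size > stat_buf.st_size) {
map_size = stat_buf.st_size - map_offset;
}
/* map the first chunk of the file */
file_chunk = mmap(NULL, map_size, PROT_READ, MAP_SHARED, fd, map_offset);
/* the first line starts at the beginning of the mapping */
input_line = file_chunk;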
// until we reach the end of the file
while (cur_off < stat_buf.st_size) {
/* check if we're about to read outside the current chunk */
if (!(cur_off-map_offset < map_size)) {
// destroy the previous mapping
munmap(file_chunk, map_size);
// round down to the page before line_start
map_offset = (line_start/pagesz)*pagesz;
// limit mapped region to size of file
if (map_offset + map_size > stat_buf.st_size) {
map_size = stat_buf.st_size - map_offset;
}
// map the next chunk
file_chunk = mmap(NULL, map_size, PROT_READ, MAP_SHARED, fd, map_offset);
// adjust the line start for the new mapping
input_line = &file_chunk[line_start-map_offset];
}
if (file_chunk[cur_off-map_offset] == '\n') {
// found a new line, process the current line
process_line(input_line, cur_off-line_start);
// set up for the next one
line_start = cur_off+1;
input_line = &file_chunk[line_start-map_offset];
}
cur_off++;
}
Most of the complication is to avoid making too huge a mapping. You might be able to map the whole file using
char *file_data = mmap(NULL, stat_buf.st_size, PROT_READ, MAP_SHARED, fd, 0);
My suggestion is to use fgets(buff) to detect the newlines automatically,
and then use strlen(buff) to keep track of how much has been buffered:
if( (total+strlen(buff)) > 104857600 )
then write out the current chunk and start a new one.
Note that a chunk will rarely be exactly 104857600 bytes.
Correct me if I'm wrong.
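The fgets-and-accumulate approach described above can be sketched as follows. This is a minimal illustration under stated assumptions: copy_lines_batched is a hypothetical name, lines are assumed to fit in the 1024-byte line buffer, and the transformation step is omitted so that whole lines are simply batched and written out unchanged.

```c
#include <stdio.h>
#include <string.h>

/* Copies lines from `in` to `out`, batching whole lines into `chunk`
 * (capacity `cap` bytes) and writing each batch with a single fwrite.
 * Returns the number of fwrite calls made, or -1 on error. */
long copy_lines_batched(FILE *in, FILE *out, char *chunk, size_t cap)
{
    char line[1024];
    size_t used = 0;
    long writes = 0;

    while (fgets(line, sizeof line, in) != NULL) {
        size_t len = strlen(line);
        if (len > cap)
            return -1;                      /* line can never fit in a chunk */
        if (used + len > cap) {             /* chunk full: flush it first */
            if (fwrite(chunk, 1, used, out) != used)
                return -1;
            writes++;
            used = 0;
        }
        memcpy(chunk + used, line, len);    /* append the whole line */
        used += len;
    }
    if (ferror(in))
        return -1;

    if (used > 0) {                         /* flush the final partial chunk */
        if (fwrite(chunk, 1, used, out) != used)
            return -1;
        writes++;
    }
    return writes;
}
```

As the answer notes, each written chunk ends on a line boundary, so it will usually be somewhat smaller than the chunk capacity.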

What is the fastest way to overwrite an entire file with zeros in C?

What I need to do is fill the entire file contents with zeros in the fastest way. I know some Linux commands like cp determine the best block size to write at a time, but I wasn't able to figure out whether using this block size is enough to get good performance, and it looks like st_blksize from stat() isn't giving me that block size.
Thank you !
Some answers to the comments:
This needs to be done in C, not using utilities like shred.
There is no error in the usage of stat().
st_blksize is returning a block size greater than the file size; I don't know how to handle that.
Using truncate()/ftruncate(), only the extra space is filled with zeros; I need to overwrite the entire file data.
I'm thinking of something like:
fd = open("file.txt", O_WRONLY);
// check for errors (...)
while(TRUE)
{
ret = write(fd, buffer, sizeof(buffer));
if (ret == -1) break;
}
close(fd);
The problem is how to define the best buffer size "programmatically".
Fastest and simplest:
int fd = open("file", O_WRONLY);
off_t size = lseek(fd, 0, SEEK_END);
ftruncate(fd, 0);
ftruncate(fd, size);
Obviously it would be nice to add some error checking.
This solution is not what you want for secure obliteration of the file though. It will simply mark the old blocks used by the file as unused and leave a sparse file that doesn't occupy any physical space. If you want to clear the old contents of the file from the physical storage medium, you might try something like:
static const char zeros[4096];
int fd = open("file", O_WRONLY);
off_t size = lseek(fd, 0, SEEK_END);
lseek(fd, 0, SEEK_SET);
while (size>sizeof zeros)
size -= write(fd, zeros, sizeof zeros);
while (size)
size -= write(fd, zeros, size);
You could increase the size of zeros up to 32768 or so if testing shows that it improves performance, but beyond a certain point it should not help and will just be a waste.
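The overwrite loop above can be hardened along these lines. This is a minimal sketch rather than a secure-erase tool: zero_fill_file is a hypothetical name, the 4096-byte buffer follows the answer's choice, and the sketch adds error checks, handles short writes and EINTR, and calls fsync to ask the kernel to push the data to the device.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Overwrites every byte of `path` with zeros, keeping its length.
 * Returns 0 on success, -1 on error. */
int zero_fill_file(const char *path)
{
    static const char zeros[4096];          /* static storage is zero-initialised */
    int fd = open(path, O_WRONLY);
    if (fd < 0)
        return -1;

    off_t remaining = lseek(fd, 0, SEEK_END);   /* file length */
    if (remaining < 0 || lseek(fd, 0, SEEK_SET) < 0) {
        close(fd);
        return -1;
    }

    while (remaining > 0) {
        size_t want = remaining > (off_t)sizeof zeros
                          ? sizeof zeros : (size_t)remaining;
        ssize_t done = write(fd, zeros, want);
        if (done < 0) {
            if (errno == EINTR)
                continue;                   /* interrupted: retry the write */
            close(fd);
            return -1;
        }
        remaining -= done;                  /* copes with short writes */
    }

    if (fsync(fd) != 0) {                   /* flush data to the device */
        close(fd);
        return -1;
    }
    return close(fd);
}
```

Whether the old blocks are physically overwritten still depends on the file system and storage device, so this should not be relied on for secure deletion.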
With mmap (and without error checking):
stat(filename,&stat_buf);
len=stat_buf.st_size;
fd=open(filename,O_RDWR);
ptr=mmap(NULL,len,PROT_READ|PROT_WRITE,MAP_SHARED,fd,0);
memset(ptr,0,len);
munmap(ptr,len);
close(fd);
This should use the kernel's idea of block size, so you don't need to worry about it, unless the file is larger than your address space.
This is my idea; note that I removed all error-checking code for clarity.
int f = open("file", O_WRONLY); // open file for writing
off_t len = lseek(f, 0, SEEK_END); // and get its length
lseek(f, 0, SEEK_SET); // then go back to the beginning
char *buff = malloc(len); // create a buffer large enough
memset(buff, 0, len); // fill it with 0s
write(f, buff, len); // write back to file
free(buff); // release the buffer
close(f); // and close

Resources