New to C programming.
The following section of code attempts to read a tab-separated list of MD5 (32 chars) and corresponding description (up to 128 chars) from a text file (utf-8), but is causing the application to crash:
HANDLE hFile = CreateFileW(good_path, GENERIC_READ, FILE_SHARE_READ, NULL,
OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL | FILE_FLAG_OVERLAPPED, NULL);
if (hFile == INVALID_HANDLE_VALUE)
{
return FALSE;
}
LPWSTR md5 = malloc(sizeof(wchar_t) * 32);
LPWSTR desc = malloc(sizeof(wchar_t) * 128);
int i;
while((i = fwscanf(hFile, L"%ls %ls", md5, desc)) != EOF)
{
if (i == 2) // OK
{
}
else // Something went wrong
{
}
}
CloseHandle(hFile);
return TRUE;
Few questions:
Is my use of malloc(...) correct?
What might be causing the crash?
Update 1
I've taken this code and made it into a standalone exe (rather than a DLL). Still crashes.
Update 2
Updated to fwscanf as per Chris's comment. Still crashes. If I comment out the while...fwscanf... line it exits properly.
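CreateFileW() returns a Win32 HANDLE, which is not interchangeable with a C runtime FILE*. fwscanf() expects a FILE*, not a HANDLE, so passing hFile to it is undefined behaviour and the likely cause of the crash. To get a FILE*, open the file with fopen() or _wfopen(), and close it with fclose() rather than CloseHandle().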
%s stores a null terminator after the characters it reads, so allocate room for 33 and 129 characters.
%s stores a nul-terminated string at the address you provide. To store n significant characters without a buffer overflow, you need to provide the address of a buffer that is n+1 characters long.
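Putting both answers together, a minimal sketch might look like the following. It assumes the UTF-8 file is read as narrow chars through a FILE*, so it uses fscanf() rather than fwscanf(); `read_md5_line` is a hypothetical helper name, not something from the question. The field widths in the format string cap what is stored at 32 and 128 characters, so the 33- and 129-byte buffers cannot overflow.

```c
#include <assert.h>
#include <stdio.h>

#define MD5_LEN  32   /* hex digits in an MD5 */
#define DESC_LEN 128  /* longest description accepted */

/* Hypothetical helper: reads one "md5<TAB>description" record from fp.
 * Returns 2 on success, EOF at end of input, and a smaller count if a
 * field is missing. */
int read_md5_line(FILE *fp, char md5[MD5_LEN + 1], char desc[DESC_LEN + 1])
{
    assert(fp != NULL && md5 != NULL && desc != NULL);
    /* %32s skips leading whitespace and stores at most 32 chars plus '\0';
     * %128[^\n] stores the rest of the line (spaces included) plus '\0'. */
    return fscanf(fp, "%32s %128[^\n]", md5, desc);
}
```

The stream passed in would come from fopen() or _wfopen(), as suggested above, and be closed with fclose().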
Let's say I decided to open an existing file with the CreateFile function. The content of the file is Hello world. There is also a buffer (char array with size 11 and filled with zero bytes) that should contain the contents of the file.
And when I try to read the file with the ReadFile function, certain garbage is written to the buffer. The debugger (I've tried GDB and LLDB) says that the contents of the buffer after reading is \377?H\000e\000l\000l\000o\000\000w\000o\000r\000l\000d\000\r\000\n\000\r\000 \n, '\000', and in a human-readable form, it looks like this ■ H.
I've tried not filling the buffer with zeros. I tried to write (with WriteFile) to a file first, then read. I also tried to change the value of how many bytes to read with ReadFile. But it still doesn't change anything.
Also, GetLastError returns ERROR_SUCCESS.
Code:
#define WIN32_LEAN_AND_MEAN
#include <Windows.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void) {
HANDLE file = CreateFile("./test_file.txt", GENERIC_READ, FILE_SHARE_READ,
NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (file == INVALID_HANDLE_VALUE) {
puts("Failed to open");
return EXIT_FAILURE;
}
size_t length = strlen("Hello world"); /* 11 */
char buffer[12];
DWORD count = 0; /* Is always 11 (length of "Hello world") after reading */
memset(buffer, '\0', length + 1);
if (!ReadFile(file, buffer, (DWORD) length, &count, NULL)) {
puts("Failed to read or EOF reached.");
CloseHandle(file);
return EXIT_FAILURE;
}
printf("buffer: '%s'\n", buffer);
printf("count: %lu\n", count);
CloseHandle(file);
return EXIT_SUCCESS;
}
In the console, the output of the program looks like this:
buffer: ' ■ H'
count: 11
The text file itself is not written in a 7bit ASCII or 8bit UTF-8 byte encoding, like you are expecting. It is actually written in a UTF-16 byte encoding, with a BOM at the front of the file (bytes 0xFF 0xFE for UTF-16LE). You are simply reading the file's raw bytes and displaying them as-is without any regard to their encoding.
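To make that concrete, here is a small portable sketch that checks for the UTF-16LE BOM and, for ASCII-only text, narrows each 16-bit code unit to a char. The helper names are hypothetical and the narrowing is a deliberate simplification; a real Windows program would more likely convert with WideCharToMultiByte, or save the file as UTF-8 in the first place.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if the buffer starts with the UTF-16LE byte order mark FF FE. */
int has_utf16le_bom(const unsigned char *buf, size_t len)
{
    assert(buf != NULL);
    return len >= 2 && buf[0] == 0xFF && buf[1] == 0xFE;
}

/* Hypothetical helper: narrows UTF-16LE bytes to a NUL-terminated char string.
 * Code units outside ASCII become '?'. Returns the number of chars written,
 * not counting the terminator. */
size_t utf16le_to_ascii(const unsigned char *src, size_t src_len,
                        char *dst, size_t dst_cap)
{
    assert(src != NULL && dst != NULL && dst_cap > 0);
    size_t i = has_utf16le_bom(src, src_len) ? 2 : 0;  /* skip the BOM */
    size_t out = 0;
    for (; i + 1 < src_len && out + 1 < dst_cap; i += 2) {
        unsigned unit = src[i] | ((unsigned)src[i + 1] << 8);  /* little endian */
        dst[out++] = (unit < 0x80) ? (char)unit : '?';
    }
    dst[out] = '\0';
    return out;
}
```

Note that the 11-byte read in the question covers only the BOM and the first few characters, so the buffer and the byte count passed to ReadFile would also need to be larger.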
I am reading data from an input file and compressing it with the bzip2 library function BZ2_bzCompress in C. I can compress the data successfully, but I cannot write all the compressed data to an output file. Only the first compressed line is written. Am I missing something here?
int main()
{
bz_stream bz;
FILE* f_d;
FILE* f_s;
BZFILE* b;
int bzerror = -10;
unsigned int nbytes_in;
unsigned int nbytes_out;
char buf[3000] = {0};
int result = 0;
char buf_read[500];
char file_name[] = "/path/file_name";
long int save_pos;
f_d = fopen ( "myfile.bz2", "wb+" );
f_s = fopen(file_name, "r");
if ((!f_d) && (!f_s)) {
printf("Cannot open files");
return(-1);
}
bz.opaque = NULL;
bz.bzalloc = NULL;
bz.bzfree = NULL;
result = BZ2_bzCompressInit(&bz, 1, 2, 30);
while (fgets(buf_read, sizeof(buf_read), f_s) != NULL)
{
bz.next_in = buf_read;
bz.avail_in = sizeof(buf_read);
bz.next_out = buf;
bz.avail_out = sizeof(buf);
printf("%s\n", buf_read);
save_pos = ftell(f_d);
fseek(f_d, save_pos, SEEK_SET);
while ((result == BZ_RUN_OK) || (result == 0) || (result == BZ_FINISH_OK))
{
result = BZ2_bzCompress(&bz, (bz.avail_in) ? BZ_RUN : BZ_FINISH);
printf("2 result:%d,in:%d,outhi:%d, outlo:%d \n",result, bz.total_in_lo32, bz.total_out_hi32, bz.total_out_lo32);
fwrite(buf, 1, bz.total_out_lo32, f_d);
}
if (result == BZ_STREAM_END)
{
result = BZ2_bzCompressEnd(&bz);
}
printf("3 result:%d, out:%d\n", result, bz.total_out_lo32);
result = BZ2_bzCompressInit(&bz, 1, 2, 30);
memset(buf, 0, sizeof(buf));
}
fclose(f_d);
fclose(f_s);
return(0);
}
TL;DR: there are multiple problems, but the main one that explains the problem you asked about is likely that you compress each line of the file independently, instead of the whole file as a unit.
According to the docs of BZ2_bzCompressInit, the bz_stream argument should be allocated and initialized before the call. Yours is (automatically) allocated, but not (fully) initialized. It would be clearer and easier to change to
bz_stream bz = { 0 };
and then skip the assignments to bz.opaque, bz.bzalloc, and bz.bzfree.
You store the return value of your BZ2_bzCompressInit call but never really check it. It does eventually get tested in the condition of the inner while loop, but that condition only recognizes success and normal-completion codes; it does not detect errors.
Your handling of the input buffer is significantly flawed.
In the first place, you set the number of available input bytes incorrectly:
bz.avail_in = sizeof(buf_read);
Since you're using fgets() to read data into the buffer, under no circumstances is the full size of the buffer occupied by input data, because fgets() ensures that a string terminator is written into the array. In fact, it could be worse, because fgets() stops after a newline, so it may provide as little as one input byte on a successful read.
If you want to stick with fgets() then you need to use strlen() to determine the number of bytes available from each read, but I would suggest that you instead switch to fread(), which will more reliably fill the buffer, indicate with its return value how many bytes were read, and correctly handle inputs containing null bytes.
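A rough sketch of the fread() approach, using only the standard library: the hypothetical `copy_in_chunks` below reads fixed-size chunks and relies on fread()'s return value, never strlen(), for the byte count. Here each chunk is simply written to another stream; in the compression program, that is the point where next_in and avail_in would be set from the chunk and its length.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: copies everything from in to out in 500-byte chunks.
 * Returns the total number of bytes copied, or (size_t)-1 on an I/O error. */
size_t copy_in_chunks(FILE *in, FILE *out)
{
    assert(in != NULL && out != NULL);
    char chunk[500];
    size_t total = 0, got;

    /* fread reports exactly how many bytes it stored, NUL bytes included. */
    while ((got = fread(chunk, 1, sizeof chunk, in)) > 0) {
        if (fwrite(chunk, 1, got, out) != got)
            return (size_t)-1;
        total += got;
    }
    return ferror(in) ? (size_t)-1 : total;
}
```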
In the second place, you use BZ2_bzCompress() to compress each buffer of input as if it were a complete file. When you come to the end of the buffer, you finish a compression run and reinitialize the bz_stream. This will definitely interfere with decompressing, and it may explain why your program (seems to) compress only the first line of its input. You should be reading the whole content of the file (in suitably-sized chunks) and feeding all of it to BZ2_bzCompress(... BZ_RUN) before you finish up. There should be one sequence of calls to BZ2_bzCompress(... BZ_FINISH) and finally one call to BZ2_bzCompressEnd() for the whole file, not per line.
You do not perform error detection or handling for any of your calls to standard library or bzip functions. You do handle the expected success-case return values for some of these, but you need to be prepared for errors, too.
There are some additional oddities:
you have unused variables nbytes_in, nbytes_out, bzerror, and b.
you open the input file as a text file, though whether that makes any difference is platform-dependent.
the ftell() / fseek() pair has no overall effect other than setting save_pos, which is not otherwise used.
although it is not harmful, it also is not useful to memset() the output buffer to all-zeroes at the end of each line (or initially).
Given that you're compressing the input, it's odd (but again not harmful) that you provide six times as much output buffer as you do input buffer.
In my code I am sending packets, each carrying 128 bytes from a text file, so I need to read the data from the file in pieces (I can't just allocate a buffer and read all of it before sending, because the file will be extremely large). For some reason I am getting an Abort 6 error even though I have allocated memory.
SendIndex starts as 0 and it aborts for the first send so that shouldn't be the problem.
The problem occurs during strcpy; I just don't know why.
Really confused so I would really appreciate the help.
struct packet packingT;
packingT.header = mpHeaderT;
packingT.data = (char*) calloc(512,sizeof(char));
char* sendString = (char*)calloc(128,sizeof(char));
FILE *file = fopen(receivedStruct->fileTitle, "rb");
if(file == NULL) {
printf("Error - Can't Open File\n");
exit(0);
}
fseek(file, 128*sendIndex, SEEK_SET);
fread(sendString, 128, 1,file);
fclose(file);
// sendString[128] = '\0'; <--- Still don't know if this is needed
packingT.header->seq_num = receivedStruct->nextSeqNum;
strcpy(packingT.data, sendString);
I think all you need to do is replace the final strcpy with memcpy instead. That is, the last line should be memcpy(packingT.data, sendString, 128);
(Edit: The reason being that strcpy determines the length of the thing to be copied by scanning for a zero at the end. You're reading arbitrary data, which may have zeros in the middle, and may not always end in a zero)
(Edit2: please be aware that the content of packingT.data is not terminated, so you can't use string functions on it. Depending on what you're doing, you might need to add a terminator, or ensure one gets written to the file)
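As a sketch of that suggestion, the hypothetical `load_chunk` below keeps the byte count that fread() returns and copies exactly that many bytes with memcpy(), so embedded zeros and a short final chunk are both handled. The 128-byte chunk size comes from the question; the function name and signature are assumptions for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define CHUNK_SIZE 128

/* Hypothetical helper: reads chunk number `index` of the file into dst.
 * dst must hold at least CHUNK_SIZE bytes. Returns the bytes actually read
 * (less than CHUNK_SIZE for the last chunk, 0 past the end or on error). */
size_t load_chunk(FILE *file, long index, char *dst)
{
    assert(file != NULL && dst != NULL && index >= 0);
    char tmp[CHUNK_SIZE];

    if (fseek(file, CHUNK_SIZE * index, SEEK_SET) != 0)
        return 0;
    /* Element size 1, so the return value is a byte count. */
    size_t got = fread(tmp, 1, sizeof tmp, file);
    memcpy(dst, tmp, got);  /* binary-safe: no search for a terminator */
    return got;
}
```

The intermediate buffer mirrors sendString in the question; reading straight into the destination would avoid the copy altogether.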
For something I am doing I would like to get the external IP of the PC running the program (written in C). So far, the best way I have found is to connect to a site that simply displays the IP of the visitor, and then parse the webpage for the IP. The first part was easy, but when I display the buffer I read the page into (the page visibly consists of only my IP), I get a few random extra symbols/characters after the IP. Here is the code I am using ATM (simplified to exclude other stuff):
HINTERNET OpenInternet = NULL;
HINTERNET GetIP = NULL;
DWORD BytesRead = 0;
char IPGrabbed[30];
OpenInternet = InternetOpen("Microsoft Internet Explorer", INTERNET_OPEN_TYPE_DIRECT, NULL, NULL, 0);
if (OpenInternet == NULL) {
return 1;
}
GetIP = InternetOpenUrl(OpenInternet, "http://api.externalip.net/ip/", NULL, 0, INTERNET_FLAG_RELOAD, 0);
if (GetIP == NULL)
return 1;
if (!InternetReadFile(GetIP, &IPGrabbed, sizeof(IPGrabbed), &BytesRead))
return 1;
printf("IP: %s", IPGrabbed);
getchar();
I also tried parsing through IPGrabbed stopping at any '\n' or '\r' (because it displays the weird characters on the line below the IP when I printf() it) and then copying everything up till there to another char array, but got the same result. Could anyone help me figure out what is going on here? Thank you.
Initialise the buffer to all 0s and then read one byte fewer than the buffer can hold.
This way the 0-terminator a C-"string" relies on is provided implicitly.
char IPGrabbed[30] = ""; /* Initialise the buffer to all `0`s ... */
[...]
/* ... and then read one byte fewer than the buffer can hold. */
if (!InternetReadFile(GetIP, &IPGrabbed, sizeof(IPGrabbed) - 1, &BytesRead))
return 1;
fprintf(stderr, "IP: %s", IPGrabbed); /* Print to stderr: it is not buffered, so
everything appears on the console immediately. */
The result from InternetReadFile is not null-terminated; you need to add a null character to the end of the string in code after the read succeeds:
IPGrabbed[BytesRead] = 0;
Edit 1
As suggested in the comment by Jonathan Potter, the above code is open to a buffer overflow if the site returns more than an IP string (at most 16 characters, terminator included): once the response fills the whole buffer, BytesRead equals sizeof(IPGrabbed) and the write to IPGrabbed[BytesRead] lands one past the end of the array.
To eliminate the problem, change the InternetReadFile call to read one byte fewer than the buffer length instead of the full buffer length.
InternetReadFile(GetIP, &IPGrabbed, sizeof(IPGrabbed)-1, &BytesRead)
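Both answers come down to the same rule: terminate the buffer at the number of bytes actually read. A portable sketch of that step, with a hypothetical helper name and with the read itself left out, could be:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: turns `bytes_read` raw bytes in buf into a C string and
 * cuts it at the first CR or LF. `cap` is the full size of buf. The caller is
 * assumed to have requested at most cap - 1 bytes from the read function. */
void terminate_response(char *buf, size_t bytes_read, size_t cap)
{
    assert(buf != NULL && cap > 0);
    if (bytes_read > cap - 1)
        bytes_read = cap - 1;          /* never index past the array */
    buf[bytes_read] = '\0';            /* the read does not add this itself */
    buf[strcspn(buf, "\r\n")] = '\0';  /* drop any trailing line ending */
}
```

With the code from the question, this would be called as terminate_response(IPGrabbed, BytesRead, sizeof(IPGrabbed)) after a successful InternetReadFile.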
All,
I'm using MapViewOfFile to hold part of a file in memory. There is a stream that points to this file and writes to it, and then is rewound. I use the pointer to the beginning of the mapped file, and read until I get to the null char I write as the final character.
int fd;
yyout = tmpfile();
fd = fileno(yyout);
#ifdef WIN32
HANDLE fm;
HANDLE h = (HANDLE) _get_osfhandle (fd);
fm = CreateFileMapping(
h,
NULL,
PAGE_READWRITE|SEC_RESERVE,
0,
4096,
NULL);
if (fm == NULL) {
fprintf (stderr, "%s: Couldn't access memory space! %s\n", argv[0], strerror (GetLastError()));
exit(GetLastError());
}
bp = (char*)MapViewOfFile(
fm,
FILE_MAP_ALL_ACCESS,
0,
0,
0);
if (bp == NULL) {
fprintf (stderr, "%s: Couldn't fill memory space! %s\n", argv[0], strerror (GetLastError()));
exit(GetLastError());
}
Data is sent to the yyout stream, until flushData() is called. This writes a null to the stream, flushes, and then rewinds the stream. Then I start from the beginning of the mapped memory, and read chars until I get to the null.
void flushData(void) {
/* write out data in the stream and reset */
fprintf(yyout, "%c%c%c", 13, 10, '\0');
fflush(yyout);
rewind(yyout);
if (faqLine == 1) {
faqLine = 0; /* don't print faq's to the data file */
}
else {
char * ps = bp;
while (*ps != '\0') {
fprintf(outstream, "%c%c", *ps, blank);
ps++;
}
fflush(outfile);
}
fflush(yyout);
rewind(yyout);
}
After flushing, more data is written to the stream, which should be set to the start of the memory area. As near as I can determine with gdb, the stream is not getting rewound, and eventually fills up the allocated space.
Since the stream points to the underlying file, this does not cause a problem initially. But when I attempt to walk the memory, I never find the null. This leads to a SIGSEGV. If you want more details of why I need this, see here.
Why am I not reusing the memory space as expected?
I think this line from the MSDN documentation for CreateFileMapping might be the clue.
A mapped file and a file that is accessed by using the input and output (I/O) functions (ReadFile and WriteFile) are not necessarily coherent.
You are apparently not calling ReadFile/WriteFile directly, but the documentation should be understood in terms of mapped views versus explicit I/O calls. In any case, the C runtime's stdio functions are ultimately implemented on top of the Win32 API, and they add their own buffering as well.
In short, this approach is problematic.
I don't know why changing the view/file size helps; perhaps it just shifts the undefined behaviour in a direction that happens to be beneficial.
Well, after working on this for a while, I have a working solution. I don't know why this succeeds, so if someone comes up with something better, I'll be happy to accept their answer instead.
fm = CreateFileMapping(
h,
NULL,
PAGE_READWRITE|SEC_RESERVE,
0,
16384,
NULL);
As you can see, the only change is to the size declared from 4096 to 16384. Why this works when the total chars input at a time is no more than 1200, I don't know. If someone could provide details on this, I would appreciate it.
When you're done with the map, simply un-map it.
UnmapViewOfFile(bp);