Why is fread() not getting the expected bytes? - c

I'm trying to read a file in binary mode using C, but only the first 43 characters end up in the buffer.
I want to read the file in groups of 245 bytes. It contains multi-byte characters and also null chars.
This is the content of the file in hex:
323031353037303735393036333130343739332032373231333732534e30323033323545533036303130340000000008557c0000000000693c0000000000000c0000000008557c0000000000693c0000000000000c0000000008557c0000000000693c0000000000000c00001c00001c00001c00000c00000c00000c00001c4d4e202020204942202020204f393920202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202039444b524d4144
And this is the code that I have:
char* to_hex(const char* strin) {
char * strout = malloc(2 * strlen(strin) + 1);
int x;
for (x = 0; x < strlen(strin);x++){
sprintf(&strout[x+x],"%02x", (int)(*(unsigned char*)(&strin[x])) );
}
strout[2 * strlen(strin)]='\0';
return strout;
}
int main(int argc, char *argv[]) {
FILE * pfinput = fopen("stack.bin", "rb");
int lrec = 245;
char* sbuff = (char *)malloc((lrec + 1) * sizeof(char));
if (pfinput != NULL) {
while (fread (sbuff, 1, lrec, pfinput) > 0){
sbuff[lrec] = '\0';
printf("len=%d hex=%s\n\n", strlen(sbuff), to_hex(sbuff) );
}
}
return 0;
}
It returns the following:
len=43 hex=323031353037303735393036333130343739332032373231333732534e3032303332354553303630313034
Why does it only read 43 characters instead of 245?
Is there an alternative way to do this?

When your string has embedded null characters, you cannot use strlen to reliably compute the number of characters. You need to capture the number of characters read by fread and use it.
int nread = 0;
while (( nread = fread (sbuff, 1, lrec, pfinput)) > 0)
Instead of
printf("len=%d hex=%s\n\n", strlen(sbuff), to_hex(sbuff) );
You need to use:
printf("len=%d hex=%s\n\n", nread, to_hex(sbuff) );
You'll also need to pass nread to to_hex so that you are able to treat the embedded null characters appropriately in that function.
char* to_hex(const char* strin, int nread) {
char * strout = malloc(2 * nread + 1);
int x;
for (x = 0; x < nread; x++){
sprintf(&strout[x+x],"%02x", (int)(*(unsigned char*)(&strin[x])) );
}
strout[2 * nread]='\0';
return strout;
}
After that, the printf line needs to be:
printf("len=%d hex=%s\n\n", nread, to_hex(sbuff, nread) );
PS Note that you are leaking memory here. Memory allocated by to_hex is used in the call to printf but after that it is not deallocated. You might want to capture that memory in a variable and deallocate it.
char* hexstring = to_hex(sbuff, nread);
printf("len=%d hex=%s\n\n", nread, hexstring);
free(hexstring);
Also, deallocate sbuff before returning from main.
free(sbuff);
PS 2 I would simplify the line
sprintf(&strout[x+x],"%02x", (int)(*(unsigned char*)(&strin[x])) );
to
unsigned char c = (unsigned char)strin[x]; // keep the value in 0..255 even when char is signed
sprintf(&strout[x+x],"%02x", c );
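Putting the pieces of this answer together, a minimal sketch might look like the one below. The names `bytes_to_hex` and `dump_records` are hypothetical, and the lookup table replaces `sprintf` purely as an illustrative choice; the key point from the answer is unchanged: the byte count comes from `fread`, never from `strlen`.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: hex-encode exactly `len` bytes, embedded NULs included.
 * Returns a malloc'd, NUL-terminated string the caller must free, or NULL. */
char *bytes_to_hex(const void *data, size_t len)
{
    static const char digits[] = "0123456789abcdef";
    const unsigned char *bytes = data;

    assert(data != NULL || len == 0);
    char *out = malloc(2 * len + 1);
    if (out == NULL)
        return NULL;
    for (size_t i = 0; i < len; i++) {
        out[2 * i]     = digits[bytes[i] >> 4];    /* high nibble */
        out[2 * i + 1] = digits[bytes[i] & 0x0F];  /* low nibble */
    }
    out[2 * len] = '\0';
    return out;
}

/* Hypothetical driver: read `path` in records of `lrec` bytes and print
 * each record as hex. Returns the number of records, or -1 on failure. */
int dump_records(const char *path, size_t lrec)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;
    unsigned char *buf = malloc(lrec);
    if (buf == NULL) {
        fclose(fp);
        return -1;
    }
    int records = 0;
    size_t nread;
    /* The value returned by fread is the only reliable byte count. */
    while ((nread = fread(buf, 1, lrec, fp)) > 0) {
        char *hex = bytes_to_hex(buf, nread);
        if (hex == NULL)
            break;
        printf("len=%zu hex=%s\n\n", nread, hex);
        free(hex);
        records++;
    }
    free(buf);
    fclose(fp);
    return records;
}
```

Calling `dump_records("stack.bin", 245)` from `main` would correspond to the loop in the question; the last record may be shorter than 245 bytes.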

'Upon successful completion, fread() shall return the number of elements successfully read',
The value returned by fread() is the ONLY way to determine how many bytes have been read into your buffer. At the end of the file, fewer than 'lrec' chars may be read.

Related

C: counting the string's length before using scanf

I am trying to find a way to manipulate strings in C in a more efficient way (maybe like how java does it).
One way I thought of is to count the size of the string up to the end of the line (maybe including spaces), allocate memory of this size using malloc() and then go back to the beginning of the line and scan the string.
Is there a way to do this? I don't know if there is a way to return the "cursor" to the beginning of the line to 're'scan something.
And if you know another/better way to deal with strings in C please tell me.
Thanks
There is no way to do what you're asking directly, but there is a (in my opinion far better) alternative: fgets().
What it does is read the text until the end of the line, including the final line-feed. If the line is longer than the buffer, then it omits that line feed --- you can use that fact to check if the line was completed.
Something like this (UNTESTED CODE):
// WARNING: Example does not include error checking
// (check the return value of `fgets()`, `malloc()` and `realloc()`!)
size_t buflen = 64;
size_t pos = 0;
char* buf = malloc(buflen);
// `for(;;)` is an infinite loop
for(;;)
{
// read data into buf[pos..buflen] (total of `buflen-pos` bytes)
fgets(buf + pos, buflen - pos, file);
pos = pos + strcspn(buf + pos, "\r\n");
if(buf[pos]) // reached end of line; end the loop
break;
buflen += 64;
// alternative (double the size):
// buflen <<= 1;
buf = realloc(buf, buflen); // resize the buffer
}
// `buf` contains our line; `pos` contains the end of it
// optional: remove the trailing newline
// buf[pos] = 0;
Relevant documentation:
fgets()
strcspn()
malloc()
realloc()
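The untested fgets() loop in this answer omits error handling by its own admission. A sketch with the checks filled in might look like this; the function name `read_line`, the starting capacity of 64, and the doubling growth policy are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: read one line of any length from `fp`.
 * Returns a malloc'd string without the trailing newline (caller frees),
 * or NULL at end of file with nothing read, on a read error, or when
 * memory runs out. */
char *read_line(FILE *fp)
{
    size_t cap = 64, len = 0;

    assert(fp != NULL);
    char *buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    /* fgets() writes at most cap - len bytes, terminator included. */
    while (fgets(buf + len, (int)(cap - len), fp) != NULL) {
        len += strlen(buf + len);
        if (len > 0 && buf[len - 1] == '\n') {  /* the whole line is in */
            buf[--len] = '\0';
            return buf;
        }
        /* No newline yet: grow the buffer and keep reading. */
        char *tmp = realloc(buf, cap * 2);
        if (tmp == NULL) {
            free(buf);
            return NULL;
        }
        buf = tmp;
        cap *= 2;
    }

    if (len == 0 || ferror(fp)) {   /* nothing read, or a read error */
        free(buf);
        return NULL;
    }
    return buf;                     /* last line had no trailing newline */
}
```

A caller would typically loop with `while ((line = read_line(stdin)) != NULL) { ...; free(line); }`.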
You could use scanf to read every character and then add that character into your buffer.
Your buffer initial size could be 16. And after you read every character you check if you have space for that new character. If you do not have space for your new character you double buffer size and realloc it.
Check out the code example:
#include <stdio.h>
#include <stdlib.h>
char *str;
int main(void) {
char c = '\0';
int size = 0;
int buffer_size = 16;
str = (char *) calloc(buffer_size, sizeof(char));
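while (c != '\n') {
if (scanf("%c", &c) != 1) // stop at end of input or on a read error
break;
if (size + 1 == buffer_size) {
buffer_size *= 2;
str = (char *) realloc(str, buffer_size);
if (str == NULL) {
fprintf(stderr, "insufficient memory\n");
return EXIT_FAILURE;
}
}
str[size] = c;
size++;
}
str[size] = '\0'; // realloc() does not zero the new bytes, so terminate explicitly
printf("%s\n", str);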
return EXIT_SUCCESS;
}

Using read() system call

For an assignment in class we were tasked with using the read() function to read a file containing numbers. While I was able to read the numbers into a buffer I have been unable to move them from the buffer into a char *array so that they can be easily accessed and sorted. Any advice is appreciated.
int readNumbers(int hexI, int MAX_FILENAME_LEN, int **array, char* fname) {
int numberRead = 0, cap = 2;
*array = (int *)malloc(cap*sizeof(int));
int n;
int filedesc = open(fname, O_RDONLY, 0);
if(filedesc < 0){
printf("%s: %s\n", "COULD NOT OPEN", fname);
return -1;
}
char * buff = malloc(512);
buff[511] = '\0';
while(n = read(filedesc, buff+totaln, 512 - totaln) > 0) //Appears to loop only once
totaln += n;
int len = strlen(buff);
for (int a = 0; a < len; a++) { //Dynamically allocates array according to input size
if ((&buff[a] != " ") && (&buff[a] != '\n'))
numberRead++;
if (numberRead >= cap){
cap = cap*2;
*array = (int*)realloc(*array, cap*sizeof(int));
}
}
int k = 0;
while((int *)&buff[k]){ //attempts to assign contents of buff to array
array[k] = (int *)&buff[k];
k++;
}
}
Your use of read() is wrong. There are at least two serious errors:
You ignore the return value, except to test for end-of-file.
You seem to assume that read() will append a nul byte after the data it reads. Perhaps even that it will pad out the buffer with nul bytes.
If you want to read more data into the same buffer after read() returns, without overwriting what you already read, then you must pass a pointer to the first available position in the buffer. If you want to know how many bytes were read in total, then you need to add the return values. The usual paradigm is something like this:
/*
* Read as many bytes as possible, up to buf_size bytes, from file descriptor fd
* into buffer buf. Return the number of bytes read, or an error code on
* failure.
*/
int read_full(int fd, char buf[], int buf_size) {
int total_read = 0;
int n_read;
while ((n_read = read(fd, buf + total_read, buf_size - total_read)) > 0) {
total_read += n_read;
}
return ((n_read < 0) ? n_read : total_read);
}
Having done something along those lines and not received an error, you can be assured that read() has not modified any element of the buffer beyond buf[total_read - 1]. It certainly has not filled the rest of the buffer with zeroes.
Note that it is not always necessary or desirable to read until the buffer is full; the example function does that for demonstration purposes, since it appears to be what you wanted.
Having done that, be aware that you are trying to extract numbers as if they were recorded in binary form in the file. That may indeed be the case, but if you're reading a text file containing formatted numbers then you need to extract the numbers differently. If that's what you're after then add a string terminator after the last byte read and use sscanf() to extract the numbers.
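For the text-file case, one way to pull the numbers out with sscanf() is sketched below. The function name `parse_numbers` and the doubling growth policy are assumptions; the sketch also assumes the buffer has already been NUL-terminated after the last byte read, as described above, and that the file holds whitespace-separated decimal integers.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: parse whitespace-separated decimal integers from a
 * NUL-terminated text buffer. Stores a malloc'd array in *out (the caller
 * frees it) and returns the count, or -1 if memory runs out. Parsing stops
 * at the first token that is not a number. */
int parse_numbers(const char *text, int **out)
{
    int count = 0, cap = 2;

    assert(text != NULL && out != NULL);
    int *nums = malloc(cap * sizeof *nums);
    if (nums == NULL)
        return -1;

    int value, consumed;
    /* %n records how many characters were consumed so far, which lets
     * the loop advance through the buffer one number at a time. */
    while (sscanf(text, "%d%n", &value, &consumed) == 1) {
        if (count == cap) {
            int *tmp = realloc(nums, 2 * cap * sizeof *nums);
            if (tmp == NULL) {
                free(nums);
                return -1;
            }
            nums = tmp;
            cap *= 2;
        }
        nums[count++] = value;
        text += consumed;
    }
    *out = nums;
    return count;
}
```

With the `read_full()` function from this answer, the buffer would need one spare byte: read at most `buf_size - 1` bytes, set `buf[total_read] = '\0'`, then pass `buf` to the parser.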

Using write() system call to output dec/hex value of char array buffer

int fd = open(argv[argc-1], O_RDONLY, 0);
if (fd >=0) {
char buff[4096]; //should be better sized based on stat
ssize_t readBytes;
int j;
readBytes = read(fd, buff, 4096);
char out[4096];
for (j=0; buff[j] != '\0'; j++) {
out[j] = buff[j];
//printf("%d ", out[j]);
}
write(STDOUT_FILENO, out, j+1);
close(fd);
}
else {
perror("File not opened.\n");
exit(errno);
}
This is code for a file dump program. The goal is to have a file and dump its contents to the command line both as ASCII chars and as hex/dec values. The current code is able to dump the ascii values, but not the hex/dec. We are allowed to use printf (as seen in the commented out section) but we can get extra credit if we don't use any high level (higher than system) functions. I have tried multiple ways to manipulate the char array in the loop, but it seems no matter how I try to add or cast the chars they come out as chars.
This isn't surprising, since I know chars are, at least in C, technically integers. I am at a loss for how to print the hex/dec value of a char using write(), and I have not yet seen any answers on Stack Overflow that don't default to printf() or putchar().
You could make a larger buffer, make the conversion from ASCII to hex/dec (as needed) in that and print the new one. I hope this example illustrates the idea:
#include <stdlib.h>
#include <io.h>
int main (int argc, char** argv)
{
const char* pHexLookup = "0123456789abcdef";
char pBuffer[] = {'a', 'b', 'c'}; // Assume buffer is the contents of the file you have already read in
size_t nInputSize = sizeof(pBuffer); // You will set this according to how much your input read in
char* pOutputBuffer = (char*)malloc(nInputSize * 3); // This should be sufficient for hex, since it takes max 2 symbols, for decimal you should multiply by 4
for (size_t nByte = 0; nByte < nInputSize; ++nByte)
{
unsigned char b = (unsigned char)pBuffer[nByte]; // avoid a negative table index when char is signed
pOutputBuffer[3 * nByte] = pBuffer[nByte];
pOutputBuffer[3 * nByte + 1] = pHexLookup[b / 16];
pOutputBuffer[3 * nByte + 2] = pHexLookup[b % 16];
}
write(1 /*STDOUT_FILENO*/, pOutputBuffer, nInputSize * 3);
free(pOutputBuffer);
return EXIT_SUCCESS;
}
This will print a61b62c63, the ASCII and hex values side by side.
This was done on Windows, so don't try to copy it directly; I tried to stick to POSIX system calls. Basically, for hex you allocate a memory chunk 3 times larger than the original (or more if you need to pad the output with spaces) and put the ASCII symbols that correspond to the hex value of each byte next to it. For decimal you will need more space, since the value can span up to 3 characters. Then just write the new buffer. Hope this is clear enough.
How about:
unsigned char val;
val = *out / 100 + 48;
write(STDOUT_FILENO, &val, 1);
val = (*out - *out / 100 * 100 ) / 10 + 48;
write(STDOUT_FILENO, &val, 1);
val = (*out - *out / 10 * 10) + 48;
write(STDOUT_FILENO, &val, 1);
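The digit arithmetic in this answer can be folded into a small formatter that emits both the decimal and the hex form of a byte using nothing above write(). This is only a sketch: the names `format_byte` and `dump_bytes` and the fixed "DDD HH" layout are assumptions, and it relies on POSIX `<unistd.h>` being available.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: format one byte as three zero-padded decimal digits,
 * a space, and two hex digits ("097 61" for 'a'). Writes exactly 6 chars
 * into `out`, adds no terminator, and returns 6. */
size_t format_byte(unsigned char b, char *out)
{
    static const char hex[] = "0123456789abcdef";

    assert(out != NULL);
    out[0] = (char)('0' + b / 100);        /* hundreds */
    out[1] = (char)('0' + (b / 10) % 10);  /* tens */
    out[2] = (char)('0' + b % 10);         /* ones */
    out[3] = ' ';
    out[4] = hex[b >> 4];                  /* high nibble */
    out[5] = hex[b & 0x0F];                /* low nibble */
    return 6;
}

/* Hypothetical driver: dump `len` bytes to file descriptor `fd`, one
 * formatted byte per line. Returns 0 on success, -1 on a failed or
 * short write. */
int dump_bytes(int fd, const unsigned char *data, size_t len)
{
    char field[7];

    for (size_t i = 0; i < len; i++) {
        size_t n = format_byte(data[i], field);
        field[n++] = '\n';
        if (write(fd, field, n) != (ssize_t)n)
            return -1;
    }
    return 0;
}
```

Taking the byte as `unsigned char` keeps the divisions and table lookups in the range 0..255 even on platforms where plain `char` is signed. In the question's program, `dump_bytes(STDOUT_FILENO, (unsigned char *)buff, readBytes)` would be called after checking that `readBytes` is positive.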

Searching and Reading a text file

This is my first time asking a question on here, so I'll try to do my best. I'm not that great at C; I'm only in Intermediate C programming.
I'm trying to write a program that reads a file, which I got working. But I have to search for a word and then save the word after it into an array. What I have going right now is
for(x=0;x<=256;x++){
fscanf(file,"input %s",insouts[x][0]);
}
In the file there are lines that say "input A0;" and I want it to save "A0" to insouts[x][0]. 256 is just a number I picked because I don't know how many inputs it might have in the text file.
I have insouts declared as:
char * insouts[256][2];
Use fgets() & sscanf(). Separate I/O from format scanning.
#define N (256)
char insouts[N][2+1]; // note: no * and 2nd dimension is 3
for(size_t x = 0; x < N; x++){
char buf[100];
if (fgets(buf, sizeof buf, stdin) == NULL) {
break; // I/O error or EOF
}
int n = 0;
// 2 this is the max length of characters for insouts[x]. A \0 is appended.
// [A-Za-z0-9] this is the set of legitimate characters for insouts
// %n record the offset of the scanning up to that point.
int result = sscanf(buf, "input %2[A-Za-z0-9]; %n", insouts[x], &n);
if ((result != 1) || (buf[n] != '\0')) {
; // format error
}
}
You want to pass the address of the x'th element of the array and not the value stored there. You can use the address-of operator & to do this.
I think
for(x = 0;x < 256; x++){
fscanf(file,"input %s", &insouts[x][0]);
// you could use insouts[x], which should be equivalent to &insouts[x][0]
}
would do the trick :)
Also, you are only allocating 2 bytes for every string. Keep in mind that strings need to be terminated by a null character, and that the elements should be chars rather than pointers, so you should change the array declaration to
char insouts[256][3];
However, I'm pretty sure the %s will match A0; and not just A0, so you might need to account for this as well. You can use %c together with a width to read a given number of characters. However, you have to add the null byte yourself. This should work (not tested):
char insouts[256][3];
for(x = 0; x < 256; x++) {
fscanf(file, " input %2c;", insouts[x]); // the leading space skips the newline left over from the previous line
insouts[x][2] = '\0';
}
Rather than trying to use fscanf why don't you use "getdelim" with ';' as the delimiter.
According to the man page
" getdelim() works like getline(), except that a line delimiter other than newline can be specified as the delimiter argument. As with getline(), a delimiter character is not added if one was not present in the input before end of file was reached."
So you can do something like (untested and uncompiled code)
char *line = NULL;
size_t n = 0;
ssize_t nread; // getdelim() returns ssize_t: -1 on EOF or error
size_t alloc = 100;
size_t lc = 0;
char **buff = calloc(alloc, sizeof(char *)); // since you don't know the file size, start with 100 slots and realloc if you need more
FILE *fp = fopen("FILE TO BE READ ", "r");
int deli = (int)';';
while ((nread = getdelim(&line, &n, deli, fp)) != -1) {
printf("%s", line); // This should have "input A0;"
// you can use either sscanf or strtok here and get A0 out
char out[64];
if (sscanf(line, " input %63[^;];", out) != 1)
continue; // not an "input ...;" record
if (lc >= alloc) {
alloc = alloc + 50;
buff = realloc(buff, sizeof(char *) * alloc);
}
buff[lc++] = strdup(out); // copy it: getdelim() reuses line on the next call
}
free(line);
for (size_t i = 0; i < lc; i++)
printf("%s\n", buff[i]);

Why does this code not output the expected output?

This can be a good question for finding bugs.
No? Okay for beginners at least.
#define SIZE 4
int main(void){
int chars_read = 1;
char buffer[SIZE + 1] = {0};
setvbuf(stdin, (char *)NULL, _IOFBF, sizeof(buffer)-1);
while(chars_read){
chars_read = fread(buffer, sizeof('1'), SIZE, stdin);
printf("%d, %s\n", chars_read, buffer);
}
return 0;
}
Using the above code, I am trying to read from a file using redirection ./a.out < data. Contents of input file:
1line
2line
3line
4line
But I am not getting the expected output, rather some graphical characters are mixed in.
What is wrong?
Hint: (Courtesy Alok)
sizeof('1') == sizeof(int)
sizeof("1") == sizeof(char)*2
So, use 1 instead :-)
Take a look at this post for buffered IO example using fread.
The type of '1' is int in C, not char, so you are reading SIZE*sizeof(int) bytes in each fread. If sizeof(int) is greater than 1 (on most modern computers it is), then you are reading past the storage for buffer. This is one of the places where C and C++ are different: in C, character literals are of type int, in C++, they are of type char.
So, you need chars_read = fread(buffer, 1, SIZE, stdin); because sizeof(char) is 1 by definition.
In fact, I would write your loop as:
while ((chars_read = fread(buffer, 1, sizeof buffer - 1, stdin)) > 0) {
buffer[chars_read] = 0; /* In case chars_read != sizeof buffer - 1.
You may want to do other things in this case,
such as check for errors using ferror. */
printf("%d, %s\n", chars_read, buffer);
}
To answer your other question, '\0' is the int 0, so {'\0'} and {0} are equivalent.
For setvbuf, my documentation says:
The size argument may be given as zero to obtain deferred optimal-size buffer allocation as usual.
Why are you commenting with \\ instead of // or /* */? :-)
Edit: Based upon your edit of the question, sizeof("1") is wrong, sizeof(char) is correct.
sizeof("1") is 2, because "1" is a char array containing two elements: '1' and 0.
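The corrected call can be wrapped in a small helper so that the element size and the terminator are handled in one place. This is a sketch only; the name `read_chunk` is hypothetical and not part of the original code.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: read up to bufsize - 1 bytes from `in` into `buf`
 * and NUL-terminate the result. Returns the number of bytes read, which
 * is 0 at end of file or on error (use ferror() to tell them apart). */
size_t read_chunk(FILE *in, char *buf, size_t bufsize)
{
    assert(in != NULL && buf != NULL && bufsize > 0);
    /* Element size is 1, i.e. sizeof(char). In C, sizeof('1') equals
     * sizeof(int), so using it here would request too many bytes. */
    size_t n = fread(buf, 1, bufsize - 1, in);
    buf[n] = '\0';
    return n;
}
```

With `char buffer[SIZE + 1];`, the loop in the question becomes `while ((n = read_chunk(stdin, buffer, sizeof buffer)) > 0) printf("%zu, %s\n", n, buffer);`. Printing with `%s` still stops at the first NUL byte if the input contains one.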
Here's a byte-by-byte way to fread the lines from a file using redirection ./a.out < data.
Produces the expected output at least ... :-)
/*
Why does this code not output the expected output ?,
http://stackoverflow.com/questions/2378264/why-does-this-code-not-output-the-expected-output
compile with:
gcc -Wall -O3 fread-test.c
create data:
echo $'1line\n2line\n3line\n4line' > data
./a.out < data
*/
#include <stdio.h>
#define SIZE 5
int main(void)
{
int i=0, countNL=0;
char singlechar = 0;
char linebuf[SIZE + 1] = {0};
setvbuf(stdin, (char *)NULL, _IOFBF, sizeof(linebuf)-1);
while(fread(&singlechar, 1, 1, stdin)) // fread stdin byte-by-byte
{
if ( (singlechar == '\n') )
{
countNL++;
linebuf[i] = '\0';
printf("%d: %s\n", countNL, linebuf);
i = 0;
} else if (i < SIZE) { // drop characters that would overflow linebuf
linebuf[i] = singlechar;
i++;
}
}
if ( i > 0 ) // if the last line was not terminated by '\n' ...
{
countNL++;
linebuf[i] = '\0';
printf("%d: %s\n", countNL, linebuf);
}
return 0;
}
char buffer[SIZE + 1] = {0};
This does do what you expect: it declares an array of SIZE + 1 chars, sets the first element to 0, and zero-initializes the rest, so buffer starts out as a valid empty string. It does not make buffer point into the program's constant data segment. The corruption comes from the fread() call, which asks for SIZE * sizeof('1') bytes and therefore writes past the end of the array.
