Why does the buffer contain more data than the file when using fread()? (C programming)

I wrote a program that copies the contents of one file to another, but when I used fread() to read data from the file into a buffer, the buffer turned out to hold more data than the text file.
Here's my code:
char *buffer;
int size;
FILE *fp1;
fp1 = fopen(src, "r");
if (fp1 == NULL) {
err = errno;
fprintf(stderr, "Value of errno: %d\n", errno);
fprintf(stderr, "Error opening file: %s\n", strerror( err ));
return 0;
}else{
fseek(fp1, 0, SEEK_END);
size = ftell(fp1);
buffer = (char *) malloc(size +1 );
printf("data in Buffer : %s\n",buffer);
printf("size : %d\n",size);
fseek(fp1, 0, SEEK_SET);
fread(buffer,size,1,fp1);
strcat(buffer,"\0");
printf("data in Buffer after fread(): %s\n",buffer);
int a = strlen(buffer);
printf("strlen in Buffer : %d\n",a);
fclose(fp1);
}
FILE *fp2;
fp2 = fopen("disk1.img", "a");
if (fp2 == NULL) {
err = errno;
fprintf(stderr, "Value of errno: %d\n", errno);
fprintf(stderr, "Error opening file: %s\n", strerror( err ));
}else{
rewind(fp2);
printf("data in Buffer before write to destination : %s\n",buffer);
fclose(fp2);
}
The source file contains:
test kub test ah hahaha 5
Result
data in Buffer : �
size : 26
data in Buffer after fread(): test kub test
ah hahaha 5
U*
strlen in Buffer : 30
data in Buffer before write to destination : test kub test
ah hahaha 5
U*
The file size is 26 bytes and I specify 26 bytes in fread(), but it turns out the buffer contains 30 characters.
I use fread() because I have to write data at a specific position in the destination file. I also added "\0" after fread() because I thought it could help, but it didn't work.
This is the second time I have faced this problem. The first time, I worked around it by specifying the number of bytes when reading data from the buffer, but now I want to know
why the buffer holds more data than the source file and how to fix it.
--------------------Update----------------------------
I read all the comments, then
followed user2225104's suggestion, and it worked!
I replaced strcat(buffer,"\0"); with buffer[size] = '\0';
Thank you all for your answers; they helped me understand C programming better.
Result
data in Buffer : 0u
size : 26
data in Buffer after fread(): test kub test
ah hahaha 5
strlen in Buffer : 26
data in Buffer before write to destination : test kub test
ah hahaha 5

The problem is your attempt to 0-terminate the block of chars and turn it into a C string.
strcat(buffer,"\0");
only works if the first string is already 0-terminated. If it were, you would not need it. As you say yourself, your supposed string length is larger than your buffer. So strcat() scans past the end of your buffer until it happens to find some random 0 byte, and then writes the terminator of the (empty) string "\0" at that position, which is outside the memory you allocated.
buffer[size] = '\0';
Doing it this way does not assume buffer is a 0-terminated string and will not tamper with memory outside buffer.
On a side note, malloc() can return NULL. Best make it a habit to ALWAYS check the results of heap operation functions, just as checking results on file operations (e.g. fopen()). Basically anything which can go wrong at run-time and is not an invariant should be checked.
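Putting the points above together, a minimal sketch might look like the following. The helper name read_whole_file and its out_len parameter are hypothetical, not from the question, and the code assumes a seekable regular file whose size fits in a long.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads all of `path` into a freshly allocated,
 * 0-terminated buffer. Returns NULL on failure; on success stores the
 * number of bytes actually read in *out_len. The caller frees the result. */
char *read_whole_file(const char *path, size_t *out_len)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return NULL;

    /* Assumption: a seekable regular file small enough for a long. */
    if (fseek(fp, 0, SEEK_END) != 0) { fclose(fp); return NULL; }
    long size = ftell(fp);
    if (size < 0 || fseek(fp, 0, SEEK_SET) != 0) { fclose(fp); return NULL; }

    char *buffer = malloc((size_t)size + 1);   /* +1 for the terminator */
    if (buffer == NULL) { fclose(fp); return NULL; }

    /* Item size 1, so the return value is a byte count. */
    size_t n = fread(buffer, 1, (size_t)size, fp);
    int failed = ferror(fp);
    fclose(fp);
    if (failed) { free(buffer); return NULL; }

    buffer[n] = '\0';   /* terminate right after the bytes actually read */
    *out_len = n;
    return buffer;
}
```

Terminating at buffer[n] rather than buffer[size] also covers the case where fewer bytes arrive than ftell() reported, which can happen in text mode on some platforms.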

There's two kinds of strings in the programming world:
the Pascal kind of string (used by managed languages like C# and Java), where the size of the string is stored as an integer separately
the C kind of strings, where the size is indicated by a terminating "special" character
There's pros and cons for each of them, but the most important thing is that C style strings can't hold binary data -- the terminating character chosen by C is a valid character in a file (obviously).
So instead you emulate Pascal strings and call them "buffers", basically vectors of characters of some kind, with the size stored manually. You can see it in your malloc call, and again in your fread. Then you sort of black out and forget you wrote it and stop using it, but the size is still there, it's not part of the string.
Instead of printing it with printf (which expects null-terminated C strings), you should use a character buffer function like fwrite to write it, and give it the size as an argument. As it stands, you're printing memory past what you allocated (since it doesn't end with 0), overrunning your own buffer. Generally hackers don't need your help; if they put their mind to it, they'll do it themselves :)
As a side note, you don't need size+1 characters -- there's no terminator as explained.
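As a rough illustration of the "buffer plus size" approach described here, the sketch below writes a block of bytes with fwrite and an explicit length. The helper name write_buffer is made up for this example.

```c
#include <stdio.h>

/* Hypothetical helper: writes exactly `len` bytes of `data` to `out`,
 * regardless of any zero bytes inside the data.
 * Returns 0 on success, -1 if fewer bytes were written. */
int write_buffer(FILE *out, const char *data, size_t len)
{
    return fwrite(data, 1, len, out) == len ? 0 : -1;
}
```

Because the length travels alongside the pointer, no terminator is needed and a 0 byte in the middle of the data is written like any other byte.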

It's because your code is invalid.
fread(buffer,size,1,fp1);
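Here you are ignoring the count returned by fread(). Note that with an item size of size and an item count of 1, that return value can only be 0 or 1; to learn how many bytes have just been read into the buffer, swap the arguments: fread(buffer, 1, size, fp1).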
strcat(buffer,"\0");
Here you are pointlessly appending a null character after the first null character in the buffer. Remove it.
printf("data in Buffer after fread(): %s\n",buffer);
Here again you are ignoring the count. Assuming you used int count = fread(buffer, 1, size, fp1), so that the return value is a byte count, this line should be
printf("data in Buffer after fread(): %.*s\n",count,buffer);
Then:
int a = strlen(buffer);
This line is pointless. You shouldn't assume that I/O operations result in null-terminated C strings. There's nothing anywhere that guarantees that. Instead, you should use the count again. So
printf("strlen in Buffer : %d\n",a);
should be
printf("byte count in Buffer : %d\n",count);
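The count-based printing described in this answer could be sketched as follows. The function name echo_chunk and the 255-byte buffer size are arbitrary choices for the illustration.

```c
#include <stdio.h>

/* Hypothetical helper: reads up to 255 bytes from `in` and prints them to
 * `out` without ever 0-terminating the buffer. Returns the byte count. */
size_t echo_chunk(FILE *in, FILE *out)
{
    char buffer[255];

    /* Item size 1, item count sizeof buffer: the result is a byte count. */
    size_t count = fread(buffer, 1, sizeof buffer, in);

    /* The precision caps %s at `count` bytes; it must be passed as an int. */
    fprintf(out, "%.*s", (int)count, buffer);
    return count;
}
```

Note that %.*s still stops early at a 0 byte inside the data, so for binary content fwrite with the same count is the safer choice.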

Related

C write and read single character

I'm currently working with processes, and encountered a problem while reading and writing char to a file.
The idea is that we have a couple of processes which should read an integer from a file, increment it and write it back. Here is my attempt (I won't include error checking):
...
char n;
char buff[5];
int number;
...
read(my_desc, &n, 1);
number = (int)n;
number++;
sprintf(buff, "%4d", number);
write(my_desc, buff, sizeof(buff));
...
The file is just plain
0
But the output seems to be not correct (almost always garbage).
I have already read the write and read manuals, but I'm clueless. I've checked some topics on the read and write functions here on Stack Overflow, but most of them either don't work for me or I struggle with the implementation.
Thanks in advance.
It appears that you are reading a single character, taking the ASCII code of that character and converting that number to a 4-character string, and then writing those 4 characters and the terminating null character back to the file.
According to the information that you provided in the comments section, this is not intended. If I understand you correctly, you rather want to
read the entire file as a string,
convert that string to a number,
increment that number,
convert that number back to a string and
overwrite the entire file with that string.
Step #1 can be accomplished with the function read. However, you should read in the whole file instead of only a single character.
Step #2 can be accomplished by using the function strtol.
Step #3 is trivial.
Step #4 can be accomplished using the function snprintf.
Step #5 can be accomplished by rewinding the file using the function lseek, and then using the function write.
I am assuming that the number represented in the file is in the representable range of a long int, which is -9,223,372,036,854,775,808 to +9,223,372,036,854,775,807 on most 64-bit POSIX platforms. This means that the string can be up to 20 characters long (19 digits plus a minus sign), or 21 including the terminating null character. That is why I am using a buffer size of 21.
char buffer[21], *p;
ssize_t bytes_read;
long num;
bytes_read = read( my_desc, buffer, (sizeof buffer) - 1 );
if ( bytes_read <= 0 )
{
//TODO: handle input error
}
//add null terminating character to string
buffer[bytes_read] = '\0';
//attempt to convert the string to a number
num = strtol( buffer, &p, 10 );
//check for conversion error
if ( p == buffer )
{
//TODO: handle conversion error
}
//increment the number
num++;
//write incremented number to buffer
snprintf( buffer, sizeof buffer, "%ld", num );
//rewind file
lseek( my_desc, 0, SEEK_SET );
//write buffer to file
write( my_desc, buffer, strlen(buffer) );
Note that I have not tested this code.
Also note that this program assumes that the input file does not contain any leading zeros. If the file contains the text string "003", then this program will overwrite the first character with a 4, but leave the remaining characters in the file intact. If this is an issue, then you will have to add a call to ftruncate to truncate the file.

How to read a complete file with scanf maybe something like %[^\EOF] without loop in single statement

I want to know if I can read a complete file with single scanf statement. I read it with below code.
#include<stdio.h>
int main()
{
FILE * fp;
char arr[200],fmt[6]="%[^";
fp = fopen("testPrintf.c","r");
fmt[3] = EOF;
fmt[4] = ']';
fmt[5] = '\0';
fscanf(fp,fmt,arr);
printf("%s",arr);
printf("%d",EOF);
return 0;
}
After everything else had run, it ended with this message:
*** stack smashing detected ***: terminated
Aborted (core dumped)
Interestingly, printf("%s",arr); worked but printf("%d",EOF); is not showing its output.
Can you tell me what happened when I tried to read up to EOF with scanf?
If you really, really must (ab)use fscanf() into reading the file, then this outlines how you could do it:
open the file
use fseek() and ftell() to find the size of the file
rewind() (or fseek(fp, 0, SEEK_SET)) to reset the file to the start
allocate a big buffer
create a format string that reads the correct number of bytes into the buffer and records how many characters are read
use the format with fscanf()
add a null terminating byte in the space reserved for it
print the file contents as a big string.
If there are no null bytes in the file, you'll see the file contents printed. If there are null bytes in the file, you'll see the file contents up to the first null byte.
I chose the anodyne name data for the file to be read — there are endless ways you can make that selectable at runtime.
There are a few assumptions made about the size of the file (primarily that the size isn't bigger than can be fitted into a long without signed overflow, and that it isn't empty). It uses the fact that the %c format can accept a length, just like most of the formats can; it doesn't add a null terminator at the end of the string it reads, and it doesn't fuss about whether the characters read are null bytes or anything else, it just reads them. It also uses the fact that you can specify the size of the variable to hold the offset with the %n (or, in this case, the %ln) conversion specification. And finally, it assumes that the file is not shrinking (it will ignore growth if it is growing), and that it is a seekable file, not a FIFO or some other special file type that does not support seeking.
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
const char filename[] = "data";
FILE *fp = fopen(filename, "r");
if (fp == NULL)
{
fprintf(stderr, "Failed to open file %s for reading\n", filename);
exit(EXIT_FAILURE);
}
fseek(fp, 0, SEEK_END);
long length = ftell(fp);
rewind(fp);
char *buffer = malloc(length + 1);
if (buffer == NULL)
{
fprintf(stderr, "Failed to allocate %ld bytes\n", length + 1);
exit(EXIT_FAILURE);
}
char format[32];
snprintf(format, sizeof(format), "%%%ldc%%ln", length);
long nbytes = 0;
if (fscanf(fp, format, buffer, &nbytes) != 1 || nbytes != length)
{
fprintf(stderr, "Failed to read %ld bytes (got %ld)\n", length, nbytes);
exit(EXIT_FAILURE);
}
buffer[length] = '\0';
printf("<<<SOF>>\n%s\n<<EOF>>\n", buffer);
free(buffer);
return(0);
}
This is still an abuse of fscanf() — it would be better to use fread():
if (fread(buffer, sizeof(char), length, fp) != (size_t)length)
{
fprintf(stderr, "Failed to read %ld bytes\n", length);
exit(EXIT_FAILURE);
}
You can then omit the variable format and the code that sets it, and also nbytes. Or you can keep nbytes (maybe as a size_t instead of long) and assign the result of fread() to it, and use the value in the error report, along the lines of the test in the fscanf() variant.
You might get warnings from GCC about a non-literal format string for fscanf(). It's correct, but this isn't dangerous because the programmer is completely in charge of the content of the format string.

Handle memory while reading long lines from a file in C

First of all, I know this question is very close to this topic, but that question was so poorly worded that I am not even sure this is a duplicate, and no code was shown, so I thought it deserved to be asked properly.
I am trying to read a file line by line and I need to store a line in particular in a variable. I have managed to do so quite easily using fgets, nevertheless the size of the lines to be read and the number of lines in the file remain unknown.
I need a way to properly allocate memory to the variable whatever the size of the line might be, using C and not C++.
So far my code looks like that :
allowedMemory = malloc(sizeof(char[1501])); // Checks if enough memory
if (NULL == allowedMemory)
{
fprintf(stderr, "Not enough memory. \n");
exit(1);
}
else
char* res;
res = allowedMemory;
while(fgets(res, 1500, file)) // Iterate until end of file
{
if (res == theLineIWant) // Using strcmp instead of ==
return res;
}
The problem of this code is that it is not adaptable at all. I am looking for a way to allocate just enough memory to res so that I don't miss any data in line.
I was thinking about something like that :
while ( lineContainingKChar != LineContainingK+1Char) // meaning that the line has not been fully read
// And using strcmp instead of ==
realloc(lineContainingKChar, K + 100) // Adding memory
But I would need to iterate through two FILE object in order to fill these variables which would not be very efficient.
Any hints about how to implement this solution, or advice about how to do it in an easier way, would be appreciated.
EDIT : Seems like using getline() is the best way to do so because this function allocates the memory needed by itself and free it when needed. Nevertheless I don't think that it is 100% portable since I still can't use it though I have included <stdio.h>. To be verified though, since my issues are often situated between keyboard and computer. Until then I am still open to a solution which would not use POSIX-compliant C.
getline() appears to do exactly what you want:
DESCRIPTION
The getdelim() function shall read from stream until it encounters a
character matching the delimiter character.
...
The getline() function shall be equivalent to the getdelim()
function with the delimiter character equal to the <newline>
character.
...
EXAMPLES
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
FILE *fp;
char *line = NULL;
size_t len = 0;
ssize_t read;
fp = fopen("/etc/motd", "r");
if (fp == NULL)
exit(1);
while ((read = getline(&line, &len, fp)) != -1) {
printf("Retrieved line of length %zd :\n", read);
printf("%s", line);
}
if (ferror(fp)) {
/* handle error */
}
free(line);
fclose(fp);
return 0;
}
And per the Linux man page:
DESCRIPTION
getline() reads an entire line from stream, storing the address of
the buffer containing the text into *lineptr. The buffer is null-
terminated and includes the newline character, if one was found.
If *lineptr is set to NULL and *n is set 0 before the call, then
getline() will allocate a buffer for storing the line. This buffer
should be freed by the user program even if getline() failed.
Alternatively, before calling getline(), *lineptr can contain a
pointer to a malloc(3)-allocated buffer *n bytes in size. If the
buffer is not large enough to hold the line, getline() resizes it
with realloc(3), updating *lineptr and *n as necessary.
In either case, on a successful call, *lineptr and *n will be
updated to reflect the buffer address and allocated size respectively.
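Since the question also asks for a solution that does not depend on POSIX, here is a sketch of the same idea built only from standard C: fgets into a heap buffer that is doubled with realloc whenever it fills up before a newline is seen. The function name read_line and the starting capacity of 128 are arbitrary choices for this example, and it assumes text input without embedded 0 bytes.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: reads one line of any length from `fp` into a heap
 * buffer that grows as needed. The trailing newline, if present, is kept.
 * Returns NULL at end of file or on error; the caller frees the result. */
char *read_line(FILE *fp)
{
    size_t cap = 128, len = 0;
    char *line = malloc(cap);
    if (line == NULL)
        return NULL;

    while (fgets(line + len, (int)(cap - len), fp) != NULL) {
        len += strlen(line + len);
        if (len > 0 && line[len - 1] == '\n')
            return line;                 /* got a complete line */
        if (cap - len < 2) {
            /* Buffer full with no newline yet: double it and read on. */
            char *bigger = realloc(line, cap * 2);
            if (bigger == NULL) {
                free(line);
                return NULL;
            }
            line = bigger;
            cap *= 2;
        }
    }
    if (len > 0 && !ferror(fp))
        return line;                     /* last line had no newline */
    free(line);
    return NULL;
}
```

Unlike getline(), this version allocates a fresh buffer for every line, which is simpler to reason about but costs an extra malloc and free per line.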

How to properly print file content to the command line in C?

I want to print the contents of a .txt file to the command line like this:
main() {
int fd;
char buffer[1000];
fd = open("testfile.txt", O_RDONLY);
read(fd, buffer, strlen(buffer));
printf("%s\n", buffer);
close(fd);
}
The file testfile.txt looks like this:
line1
line2
line3
line4
The function prints only the first 4 letters line.
When using sizeof instead of strlen the whole file is printed.
Why is strlen not working?
It is incorrect to use strlen at all in this program. Before the call to read, the buffer is uninitialized and applying strlen to it has undefined behavior. After the call to read, some number of bytes of the buffer are initialized, but the buffer is not necessarily a proper C string; strlen(buffer) may return a number having no relationship to the amount of data you should print out, or may still have UB (if read initialized the full length of the array with non-nul bytes, strlen will walk off the end). For the same reason, printf("%s\n", buffer) is wrong.
Your program also can't handle files larger than the buffer at all.
The right way to do this is by using the return value of read, and write, in a loop. To tell read how big the buffer is, you use sizeof. (Note: if you had allocated the buffer with malloc rather than as a local variable, then you could not use sizeof to get its size; you would have to remember the size yourself.)
#include <unistd.h>
#include <stdio.h>
int main(void)
{
char buf[1024];
ssize_t n;
while ((n = read(0, buf, sizeof buf)) > 0)
write(1, buf, n);
if (n < 0) {
perror("read");
return 1;
}
return 0;
}
Exercise: cope with short writes and write errors.
When using sizeof instead of strlen the whole file is printed. Why is
strlen not working?
Because strlen works by walking through the char array passed in and counting characters until it encounters a 0. In your case, buffer is not initialized, so strlen reads the elements of an uninitialized array while looking for a 0. Reading indeterminate values like this gives undefined behavior in C.
sizeof works differently and returns the number of bytes of the passed object directly without looking for a 0 inside the array as strlen does.
As correctly noted in other answers read will not null terminate the string for you so you have to do it manually or declare buffer as:
char buffer[1000] = {0};
In this case, printing the buffer with printf and %s after reading the file will work, but only if read did not fill the entire array with non-zero bytes, so that at least one terminating 0 remains.
Extra:
Null terminating a string means you append a 0 to it somewhere. This is how most of the string related functions guess where the string ends.
Why is strlen not working?
Because when you call it in read(fd, buffer, strlen(buffer));, you haven't yet assigned a valid string to buffer. It contains some indeterminate data which may or may not have a 0-valued element. Based on the behavior you report, buffer just so happens to have a 0 at element 4, but that's not reliable.
The third parameter tells read how many bytes to read from the file descriptor. If you want to read as many bytes as buffer is sized to hold, use sizeof buffer. read will return the number of bytes read from fd (0 for EOF, -1 for an error). If I'm not mistaken, read will not zero-terminate the input, so using strlen on buffer after calling read would still be an error.

Segfault during a sprintf()

So, I am currently working on system programming for my Unix OS class. All that this program should do is read a binary file and output the lines to a CSV file. I feel like I'm almost done, but for some reason I keep getting a segfault.
To clarify:
fd1 = input file,
fd2 = output file,
numrecs = number of records from input file.
Somewhere in main():
for(i=0;i<numrecs;i++){
if((bin2csv(fd1, fd2)) == -1){
printf("Error converting data.\n");
}
}
int bin2csv(fd1, fd2){
bin_record rec;
char buffer[100];
int buflen;
strncpy(buffer,"\0", 100); /* fill buffer with NULL */
recs = &rec;
/* read in a record */
if((buflen = read(fd1, &recs, sizeof(recs))) < 0){
printf("Fatal Error: Data could not be read.\n");
return -1;
}
sprintf(buffer, "%d, %s, %s, %f, %d\n", recs->id, recs->lname, recs->fname, recs->gpa, recs->iq);
printf("%s\n", buffer);
write(fd2, buffer, sizeof(buffer));
return 0;
}
The segfault is occurring on the line "sprintf(buffer, etc..);" however, I cannot figure out why that is happening.
This is the error gdb spits out:
Program received signal SIGSEGV, Segmentation fault.
0x0000000100000c87 in bin2csv (fd1=3, fd2=4) at bin2csv.c:25
25 sprintf(buffer, "%d, %s, %s, %f, %d\n", recs->id, recs->lname,
recs->fname, recs->gpa, recs->iq);
Hopefully this is enough info. Thanks!
It looks like recs is a pointer. You are reading bytes directly into that pointer, like reading a raw memory address from file:
read(fd1, &recs, sizeof(recs))
And then you start using it in the call to sprintf... BOOM!
There is actually no reason to use it at all (is it a global?)... Even though you initialised it by recs = &rec, and assuming you don't trash it, it still will not contain valid address outside of that function. That's because rec is a local variable.
So, just read directly into rec like this:
read(fd1, &rec, sizeof(rec))
And then on the sprintf line, you use rec.id instead of recs->id (etc).
I see a few issues here:
sprintf does nothing to prevent writing past the end of the string buffer. In fact it has no knowledge of the length of that buffer (100 bytes in your case). Since you have setup the buffer in the stack, if sprintf over-runs your buffer (which it could do with long first or last names or garbage strings as input) your stack will be corrupted and a seg fault is likely. You may want to consider including logic to ensure that sprintf will not exceed the amount of buffer space you have. Or better yet avoid sprintf altogether (more on that below)
You are not handling end-of-file in the code provided. For end of file, read returns 0. If you pass bad pointers to sprintf, it will fail.
The functions that you are using are the UNIX derived ones (part of POSIX but decidedly low level) that use small integers as file descriptors. I would recommend using the FILE * based ones instead. The I/O functions of interest would be fopen, fclose, fprintf, fwrite, etc. This would eliminate the need to use sprintf.
See this previous question for more information.
if((buflen = read(fd1, &recs, sizeof(recs))) < 0){
Use <= 0 rather than < 0, else when the return value is 0, sprintf(buffer ... may seg fault as it tries to de-reference recs->id which has an uninitialized value.
You have some problems:
1) The structure of bin_record: it contains char[] members, which can overflow.
2) With sprintf you cannot set a maximum buffer size. It is better to use snprintf, like this:
snprintf(buffer, 100, "%d, %s, %s, %f, %d\n", recs->id, recs->lname, recs->fname, recs->gpa, recs->iq);
3) To fill the buffer with nulls, use this:
memset (buffer,'\0',100);

Resources