How to overwrite using stdio.h? - c

I wanted to overwrite an existing file, which should be simple:
#include <stdio.h>
int main()
{
FILE *f;
unsigned char ba;
int i;
f = fopen("junk", "wb");
for (i = 1; i <= 10; i++)
fputc(i, f);
fclose(f);
f = fopen("junk", "ab");
fseek(f, 0, SEEK_SET);
for (i = 1; i <= 5; i++) {
printf("Position before: %ld\n", ftell(f));
fputc(99, f);
printf("Position after: %ld\n", ftell(f));
}
fclose(f);
f = fopen("junk", "rb");
for (i = 1; i <= 10; i++) {
ba = fgetc(f);
printf("%d ", ba);
}
printf("\n");
}
The result is:
Position before: 0
Position after: 11
Position before: 11
Position after: 12
Position before: 12
Position after: 13
Position before: 13
Position after: 14
Position before: 14
Position after: 15
Position before: 15
Position after: 16
Position before: 16
Position after: 17
Position before: 17
Position after: 18
Position before: 18
Position after: 19
Position before: 19
Position after: 20
1 2 3 4 5 6 7 8 9 10
The idea is to write a file in write mode, then reopen in write/append mode, overwrite the first 5 bytes, then open in read mode and read the contents of the 10 bytes out. The result should have been:
99 99 99 99 99 6 7 8 9 10
As you see from the trace, instead <stdio.h> ignores the fseek() to zero and appends at the end in any case. Obviously this is distilled from a more complex program, but this behavior makes no sense to me.

You can only write to the end of a file opened for append, regardless of the seeking. That's what it is for. Open in mode "rb+" to overwrite. #Weather Vane
Given the requirement "reopen in write/append mode", I see opening with "rb+" [open binary file for update (reading and writing)] still meeting that goal.
Sure, but that behavior is undocumented.
The behavior is well documented in the C spec.
Opening a file with append mode (’a’ as the first character in the mode argument) causes all subsequent writes to the file to be forced to the then current end-of-file, regardless of intervening calls to the fseek function. ... C17dr § 7.21.5.3 6
Other derived documentation may or may not be so informative. When in doubt about the C language or standard library, check the language specification.

The file is open for appending with "ab": any output is performed at the end of the file. As you can see from the output, the first fputc() performs an implied fseek(f, 0L, SEEK_END).
If you opened it for "ab+" you could seek anywhere into the file and read from there but all output would still first seek to the end of the file.
If you mean to overwrite a part of the file and leave the rest intact, you should open it with "rb+".
Also note that the last call to fopen: fopen("junk", "rb"); does not store the stream pointer to f. Further reading from f, that was closed before, works by coincidence, the behavior is actually undefined.

Related

Trouble understanding fseek offset

I have a text file, where each line is an integer with a newline character. I also have a .bin file with the same thing.
10
20
30
40
50
60
70
Running this code...
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char **argv) {
int input;
FILE *infile_t = fopen("numbers.txt", "r");
FILE *infile_b = fopen("numbers.bin", "rb");
if (infile_t == NULL) {
printf("Error: unable to open file %s\n", "numbers.txt");
exit(1);
}
if (infile_b == NULL) {
printf("Error: unable to open file %s\n", "numbers.bin");
exit(1);
}
printf("Enter an integer index: ");
while(scanf("%d",&input) != EOF){
int ch;
fseek(infile_t, (input*sizeof(int))-1, SEEK_SET);
fscanf(infile_t, "In text file: %d\n", &ch);
printf("In text file: %d\n", ch);
fseek(infile_b, (input*sizeof(int))-1, SEEK_SET);
fscanf(infile_b, "%d\n", &ch);
printf("In binary file: %d\n", ch);
printf("Enter an integer index: ");
}
fclose(infile_t);
fclose(infile_b);
return 0;
}
and entering 0, 1, 2, 3, 4 consecutively, I get the outputs:
10
0
40
50
0
I am trying to read the file by 4 bytes at a time (each int) and print the integer. What am I doing wrong and if this is bad practice, what would be better?
There is a difference between the textual representation of numbers and their binary representation.
Your input is a text file, which is a sequence of characters:
"10lf20lf30lf40lf50lf60lf70lf"
Its size is 21 bytes, which you could check with your file explorer.
As bytes in tabular form it looks like this, assuming that you are using ASCII and a Unix-like system:
Offset  Bytes     Text
0       31 30 0A  "10lf"
3       32 30 0A  "20lf"
6       33 30 0A  "30lf"
9       34 30 0A  "40lf"
12      35 30 0A  "50lf"
15      36 30 0A  "60lf"
18      37 30 0A  "70lf"
There are no integers stored in binary form in your input file.
The function fseek() places the "cursor" into the file at the specified offset.
Then you call fscanf() to scan and interpret(!) the sequence of characters that starts at that offset.
Input  Offset set by fseek()  Text         Resulting value
0      0                      "10lf..."    10
1      4                      "0lf..."     0
2      8                      "lf40lf..."  40
3      12                     "50lf..."    50
4      16                     "0lf..."     0
Since the %d conversion of fscanf() skips leading whitespace, you get "40" in the third case.
You cannot use fseek() in the general case to "jump" to a certain line in a text file, unless you know how long each line is. In your case this is known, and if you use a factor of 3 instead of 4, you will get what you seem to want.
I don't know what is in your 'numbers.bin', and you opened 'numbers.txt' as infile_t but didn't use it.
Assuming that the content in 'numbers.bin' is the text content in your question, and you open it in binary mode for reading, the contents stored in the file are as follows(end with one byte '\n' instead of two bytes '\r\n'):
\x31\x30\x0a\x32\x30\x0a\x33\x30\x0a\x34\x30\x0a\x35\x30\x0a\x36\x30\x0a\x37\x30
At this time, the file pointer is at the head of the file, pointing to the text content '1'(ascii code is 0x31).
\x31\x30\x0a\x32\x30\x0a\x33\x30\x0a\x34\x30\x0a\x35\x30\x0a\x36\x30\x0a\x37\x30
↑
When you use scanf("%d",&input) and enter '0', the integer variable input will be 0. You then set the file pointer via fseek(infile_b, input*4, SEEK_SET), so it points to offset 0 relative to the beginning of the file.
The next line, fscanf(infile_b, "%d\n", &ch), reads an integer value into the variable ch, so ch holds the value 10, which printf then prints to standard output (stdout).
When you enter '1', the file pointer will be set to 4, which will point to the fifth byte position relative to the beginning of the file, as follows:
\x31\x30\x0a\x32\x30\x0a\x33\x30\x0a\x34\x30\x0a\x35\x30\x0a\x36\x30\x0a\x37\x30
                ↑
The ascii code of the text value '0' is 0x30. It will read an integer value 0 and store it in ch.
You can replace fseek(infile_b, input*4, SEEK_SET) with fseek(infile_b, input*3, SEEK_SET), and will get the expected output.

Writing even integers up to n in a file in c, putw() function error

I'm writing a program that will output all even integers up to 100 in a text file.
Here's the whole code:
#include <stdio.h>
#include <stdlib.h>
#define MAX 100
int main() {
FILE *fp;
int i;
if ((fp = fopen("even_up_to_100.txt", "w")) == NULL) {
perror("Write");
exit(1);
}
for (i = 1; i <= MAX; ++i) {
if (!(i % 2))
putw(i, fp);
}
fclose(fp);
if ((fp = fopen("even_up_to_100.txt", "r")) == NULL) {
perror("Read");
exit(2);
}
while (!feof(fp))
printf("%d ", getw(fp));
fclose(fp);
return 0;
}
OUTPUT(from text file):
" $ & ( * , . 0 2 4 6 8 : < > # B D F H J L N P R T V X Z \ ^ ` b d
OUTPUT(from console window):
2 4 6 8 10 12 14 16 18 20 22 24 -1
Please point out the error(if there's any) in the code with the solution.
Inside the text file there are some control characters, which are shown as blank spaces here.
Since getw/putw are binary I/O functions, you should be opening your file in binary mode ("wb" instead of "w" as the mode argument to fopen, and likewise "rb" instead of "r").
Character 26 is ASCII Ctrl-Z, which Windows (and DOS before it) use as an end-of-file marker for text files. So if you're on such a system, when you attempt to read the number 26 from your file, the library sees a Ctrl-Z byte and treats that as the end of the file. That would explain why your program stops reading after 24. Opening in binary mode disables this behavior, and will also avoid various other problems, e.g. the handling of CR characters.
Note that if your goal was, as you said, to "output all even integers up to 100 in a text file", then getw/putw are the wrong tools for the job as they do binary I/O, not text. (Even if you did want binary format, you should not use getw/putw but rather fread/fwrite, as I explain here.) If you want to create a text file, with human-readable contents, you should use fprintf and fscanf.

Shifting Extended ASCII codes

When assigning Extended ASCII codes to an unsigned char, I noticed that the values are shifted upwards when they are written to a file.
I condensed my code into this simple program to briefly present my question:
#include <stdio.h>
#include <stdlib.h>
int main()
{
unsigned char testascii[3];
testascii[0] = 122;
testascii[1] = 150;
testascii[2] = 175;
printf("%d\n", testascii[0]);
printf("%d\n", testascii[1]);
printf("%d\n", testascii[2]);
return 0;
}
If I run this simple program, I get this terminal output:
122
150
175
This is correct.
If I now add the following to the above program:
FILE *f;
f = fopen("/mystuff/testascii", "wb");
if (f == NULL)
{
printf("Error opening file\n");
exit(1);
}
fwrite(testascii, 1, 3, f);
fclose(f);
It runs correctly but if I now go to the O/S and run:
od -c testascii
I get this output:
0000000 z 226 257
0000003
As you can see the Standard ASCII code (below 128) is correctly shown; however the Extended ASCII codes (above 127) are changed. I expect them to be 150 and 175 but they are 226 and 257.
If I remove the binary flag from the file open command, the result is still the same.
As a final check, instead of the binary print (fwrite), I changed the code again and looped through the array and did a fprintf of each item like this:
fprintf (fp, "%d", appendtxt[i]);
Here's the OD display for that:
0000000 1 2 2 1 5 0 1 7 5
0000011
This all tells me that the binary print (fwrite) isn't doing what I expected. It's my understanding the fwrite command writes the binary data to the file. In that case why does it successful write a value less than 128 but it fails with values equal to or greater than 128?
Environment:
Code::Blocks 16.01
Centos 7.1
Note: I did find this similar question: fwrite with non ASCII characters but it didn't seem to help with my situation. I could be wrong. Please let me know if I missed something in that post?
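You are printing in octal (that's what od does by default, and od -c shows non-printable bytes as three-digit octal values): 226 octal is 150 decimal, and 257 octal is 175 decimal. The file contains exactly the bytes you wrote.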

Format file to have 5 numbers per line

I am working on a text file containing integers separated by spaces, for instance:
1 2 57 99 8 14 22 36 98 445 1001 221 332 225 789 1111115 147 0 1 21321564 544 489 654 61266 5456 15 19
I would like to re-format this file to only contain 5 integers in any line but the last, and at most 5 integers in the last line.
My code:
#include <stdio.h>
#include <stdlib.h>
int main()
{
FILE *f; // main file (A.txt)
FILE *g; // file copy (B.txt)
// open A.txt to read data
f = fopen("file/path/here/A.txt", "r");
if (f == NULL) {
printf("Read error.\n");
fclose(f);
return -1;
}
// open B.txt to write data
g = fopen("file/path/here/B.txt", "w");
if (g == NULL) {
printf("Write error.\n");
fclose(g);
return -2;
}
int line = 1; // first line in output file
while (!feof(f)) { // not end-of-file
char number[1000];
int i = 0;
for (i = 0; i <= 4; i++)
if (fscanf(f, "%s", number) == 1) { // one number read
fprintf(g, "%s", line + i, number);
}
line += i;
}
// close files
fclose(f);
fclose(g);
return 0;
}
When I run this in Code::Blocks, I get the 'Segmentation fault (core dumped) Process returned 139' message. I suspect that the problem lies in the 'if' statement and my use of formats. Needless to say, I'm relatively new to C. How might I fix this?
The simple reason for your segmentation fault is the expression fprintf(g, "%s", line + i, number);. The format "%s" tells fprintf to expect a pointer to a string (i.e. char*), but the first argument you actually pass is a number (i.e. line + i). Hence the value of line + i, which is probably 1 or some other small number, is interpreted as a pointer to memory address 1, which you are not allowed to access. It is as if you wrote fprintf(g, "%s", 1), which crashes, too.
So basically change this expression into fprintf(g, "%s", number);, and it should at least not crash (unless you have numbers with more than 999 digits).
There are some other issues in your code, e.g. that you open B.txt for write and assign it to g, but then you test and close the file using variable f.
But maybe the above "crash solution" brings you forward, such that you can work further on your own. Note that if opening B.txt had failed, your code would also have crashed, because of passing NULL as the file stream argument to fprintf.
The issue is with the use of fscanf and then fprintf.
fscanf knows how to parse a string into a number. E.g. fscanf(f, "%d", &var);. This reads a signed integer from the file handle f into the variable var. This can then be printed with fprintf.
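As it stands, each fscanf call with "%s" reads one whitespace-delimited token into number as a string (assuming that 1000 chars is enough), so the input is copied without ever being parsed as integers, and nothing separates the tokens or ends the lines in the output.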

fwrite() does not override text in Windows (C)

I wrote this C code to test whether fwrite could update some values in a text file. I tested it on Linux and it works fine. On Windows (Vista, 32-bit), however, it simply does not work. The file remains unchanged after I write a different byte using: cont = fwrite(&newfield, sizeof(char), 1, fp);
The registers are written on the file using a "#" separator, in the format:
Reg1FirstField#Reg1SecondField#Reg2FirstField#Reg2SecondField...
The final file should be: First#1#Second#9#Third#1#
I also tried putc and fprintf, all with no result. Can someone please help me with this?
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
typedef struct test {
char field1[20];
char field2;
} TEST;
int main(void) {
FILE *fp;
TEST reg, regread;
char regwrite[22];
int i, cont, charwritten;
fp=fopen("testupdate.txt","w+");
strcpy(reg.field1,"First");
reg.field2 = '1';
sprintf(regwrite,"%s#%c#", reg.field1, reg.field2);
cont = (int)strlen(regwrite);
charwritten = fwrite(regwrite,cont,1,fp);
fflush(fp);
strcpy(reg.field1,"Second");
reg.field2 = '1';
sprintf(regwrite,"%s#%c#", reg.field1, reg.field2);
cont = (int)strlen(regwrite);
charwritten = fwrite(regwrite,cont,1,fp);
fflush(fp);
strcpy(reg.field1,"Third");
reg.field2 = '1';
sprintf(regwrite,"%s#%c#", reg.field1, reg.field2);
cont = (int)strlen(regwrite);
charwritten = fwrite(regwrite,cont,1,fp);
fflush(fp);
fclose(fp);
// open file to update
fp=fopen("testupdate.txt","r+");
printf("\nUpdate field 2 on the second register:\n");
char aux[22];
// search for second register and update field 2
for (i = 0; i < 3; i ++) {
fscanf(fp,"%22[^#]#", aux);
printf("%d-1: %s\n", i, aux);
if (strcmp(aux, "Second") == 0) {
char newfield = '9';
cont = fwrite(&newfield, sizeof(char), 1, fp);
printf("written: %d bytes, char: %c\n", cont, newfield);
// goes back one byte in order to read properly
// on the next fscanf
fseek(fp,-1,SEEK_CUR);
}
fscanf(fp,"%22[^#]#", aux);
printf("%d-2: %s\n",i, aux);
aux[0] = '\0';
}
fflush(fp);
fclose(fp);
// open file to see if the update was made
fp=fopen("testupdate.txt","r");
for (i = 0; i < 3; i ++) {
fscanf(fp,"%22[^#]#", aux);
printf("%d-1: %s\n", i, aux);
fscanf(fp,"%22[^#]#",aux);
printf("%d-2: %s\n",i, aux);
aux[0] = '\0';
}
fclose(fp);
getchar();
return 0;
}
You're missing a file positioning function between the read and write. The Standard says:
7.19.5.3/6
When a file is opened with update mode, both input and output may be performed on the associated stream. However, ... input shall not be directly followed by output without an intervening call to a file positioning function, unless the input operation encounters end-of-file. ...
for (i = 0; i < 3; i ++) {
fscanf(fp,"%22[^#]#", aux); /* read */
printf("%d-1: %s\n", i, aux);
if (strcmp(aux, "Second") == 0) {
char newfield = '9';
/* added a file positioning function */
fseek(fp, 0, SEEK_CUR); /* don't move */
cont = fwrite(&newfield, sizeof(char), 1, fp); /* write */
I didn't know it but here they explain it:
why fseek or fflush is always required between reading and writing in the read/write "+" modes
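Conclusion: When you use a "+" mode, you must call a file positioning function (fseek, fsetpos or rewind) when switching from reading to writing, and either fflush or a file positioning function when switching from writing to reading.
fseek(fp, 0, SEEK_CUR); /* required after the read, before the write */
cont = fwrite(&newfield, sizeof(char), 1, fp);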
Fix verified on Cygwin.
You're not checking any return values for errors. I'm guessing the file is read-only and is not even opening properly.
At least here on OSX, your value 9 is being appended to the end of the file, so you're not updating the actual register value for Second at its position in the file. For some reason, after the scan for the appropriate point to modify the values, your stream pointer is actually at the end of the file. For instance, compiling and running your code on OSX produced the following output in the actual text file:
First#1#Second#1#Third#1#9
The reason your initial read-back is working is because the data is being written, but it's at the end of the file. So when you write the value and then back-up the stream and re-read the value, that works, but it's not being written in the location you're assuming.
Update: I've added some calls to ftell to see what's happening to the stream pointer, and it seems that your calls to fscanf are working as you'd assume, but the call to fwrite is jumping to the end of the file. Here's the modified output:
Update field 2 on the second register:
**Stream position: 0
0-1: First
0-2: 1
**Stream position: 8
1-1: Second
**Stream position before write: 15
**Stream position after write: 26
written: 1 bytes, char: 9
1-2: 9
**Stream position after read-back: 26
Update-2: It seems that by simply saving the position of the stream pointer and then setting it again, the call to fwrite worked without skipping to the end of the file. So I added:
fpos_t position;
fgetpos(fp, &position);
fsetpos(fp, &position);
right before the call to fwrite. Again, this is on OSX, you may see something different on Windows.
With this:
fp=fopen("testupdate.txt","w+");
^------ Notice the + sign
The plus sign opens the file in update mode, which allows both reading and writing; it does not select append mode (that is what "a" does). With "w+" the file is created or truncated first, and writes go to the current file position.
Using "r+" for the second fopen() is therefore the right mode for updating in place, provided a file positioning call separates each read from the write that follows it.
This and other issues with fopen() are why I prefer to use the POSIX-defined open().
To fix your particular case, keep the + characters in the fopen() modes, add the positioning call before the write, and consider that you might need to specify binary format on Windows ("wb+" and "rb+" modes).
