Writing line to a file using C - c

I'm currently doing this:
FILE *fOut;
fOut = fopen("fileOut.txt", "w");
char line[255];
...
strcat(line, "\n");
fputs(line, fOut);
but find that when I open the file in a text editor I get
line 1
line 2
If I remove the strcat(line, "\n"); then I get:
line 1line2
How do I get fOut to be
line 1
line 2

The puts() function appends a newline to the string it is given to write to stdout; the fputs() function does not do that.
Since you've not shown us all the code, we can only hypothesize about what you've done. But:
strcpy(line, "line1");
fputs(line, fOut);
putc('\n', fOut);
strcpy(line, "line2\n");
fputs(line, fOut);
would produce the result you require, in two slightly different ways that could each be used twice to achieve consistency (and your code should be consistent — leave 'elegant variation' for your literature writing, not for your programming).
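One way to keep this consistent is to route every line through a single helper. The sketch below is only an illustration: write_line is a hypothetical name, not something from the question's code, and it assumes the text passed in does not already end with a newline.

```c
#include <stdio.h>

/* Hypothetical helper: write `text` followed by exactly one '\n'.
 * Returns 0 on success, -1 if either write fails. */
int write_line(FILE *out, const char *text)
{
    if (fputs(text, out) == EOF)
        return -1;
    if (putc('\n', out) == EOF)
        return -1;
    return 0;
}
```

Calling write_line(fOut, "line 1") and then write_line(fOut, "line 2") writes each string followed by a single newline, so there is no need to strcat a "\n" onto the buffer first.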
In a comment, you say:
I'm actually looping through a file encrypting each line and then writing that line to a new file.
Oh boy! Are you base-64 encoding the encrypted data? If not, then:
You must include b in the fopen() mode (as in fOut = fopen("fileout.bin", "wb");) because encrypted data is binary data, not text data. This (the b) is safe for both Unix and Windows, but is critical on Windows and immaterial on Unix.
You must not use fputs() to write the data; there will be zero bytes ('\0') amongst the encrypted values and fputs() will stop at the first of those that it encounters. You probably need to use fwrite() instead, telling it exactly how many bytes to write each time.
You must not insert newlines anywhere; the encrypted data might contain newlines, but those must be preserved, and no extraneous one can be added.
When you read this file back in, you must open it as a binary file "rb" and read it using fread().
If you are base-64 encoding your encrypted data, then you can go back to treating the output as text; that's the point of base-64 encoding.
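For the raw (not base-64 encoded) case, the rules above might look roughly like this. It is a sketch under assumptions: write_blob and read_blob are hypothetical names, and the caller is assumed to know the exact ciphertext length, since strlen() cannot be used on data that contains zero bytes.

```c
#include <stdio.h>

/* Hypothetical helper: write exactly `len` bytes of ciphertext to `path`.
 * "wb" stops the Windows runtime from turning '\n' into "\r\n". */
int write_blob(const char *path, const unsigned char *data, size_t len)
{
    FILE *out = fopen(path, "wb");
    if (out == NULL)
        return -1;
    size_t written = fwrite(data, 1, len, out);
    if (fclose(out) != 0 || written != len)
        return -1;
    return 0;
}

/* Hypothetical helper: read up to `cap` bytes back from `path`.
 * Returns the number of bytes read, or -1 if the file cannot be opened. */
long read_blob(const char *path, unsigned char *buf, size_t cap)
{
    FILE *in = fopen(path, "rb");
    if (in == NULL)
        return -1;
    size_t got = fread(buf, 1, cap, in);
    fclose(in);
    return (long)got;
}
```

Because the length travels alongside the buffer, zero bytes and newline bytes inside the ciphertext are written and read back unchanged.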

When files are opened with w (or wt) Windows replaces the \n with \r\n.
To avoid this, open the file with wb (instead of w).
...
fOut = fopen("fileOut.txt", "wb");
...
Unlike many other OSs, Windows makes a distinction between binary and text mode, and -- confusingly -- the Windows C runtime handles both modes differently.
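A quick way to see the translation for yourself is to write the same text with different modes and compare the resulting sizes on disk. The helper below is a hypothetical sketch (size_after_write is not a library function); it reopens the file with "rb" so the size it reports is the real byte count.

```c
#include <stdio.h>

/* Hypothetical helper: write `text` to `path` using `mode`, then return the
 * number of bytes actually stored on disk, or -1 on error. */
long size_after_write(const char *path, const char *mode, const char *text)
{
    FILE *out = fopen(path, mode);
    if (out == NULL)
        return -1;
    int failed = (fputs(text, out) == EOF);
    if (fclose(out) != 0 || failed)
        return -1;

    FILE *in = fopen(path, "rb");   /* binary: count the untranslated bytes */
    if (in == NULL)
        return -1;
    long size = -1;
    if (fseek(in, 0, SEEK_END) == 0)
        size = ftell(in);
    fclose(in);
    return size;
}
```

With "wb" the result for "line 1\nline 2\n" should be 14 on any platform. With "w" on Windows you would expect 16, one extra byte per newline, while on Unix-like systems both modes give the same size.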

You can try using \r instead of \n. What platform are you running this on, Windows?

Related

how to make getc() on stdin be binary

I'm using a while loop with getc() to read from a file handle and it works fine. Now I'm adding support for pipes.
My problem is that while 0x0A, 0x0D, and 0x0A0D pass just fine, any occurrence of 0x0D0A gets reduced to just 0x0A. Also, 0x1A seems to stop the process entirely when encountered.
I'm piping in the output of tar, and this is messing the files up.
FILE *FH;
FH=stdin;
int buff;
while((buff=getc(FH))!=EOF) {
//stuff
}
That's simplified, as FH needs to point to either a file or stdin. For testing I'm just writing the buff out to a file to see the changes. The code works perfectly if FH is a file.
I've seen the same behavior on using tar, type, and cat as the pipe source
You will need to fopen with a binary mode. I'm not sure if you can use freopen on stdin, but give it a try:
freopen(NULL, "rb", stdin);
You have to open the file you are writing into in binary mode. The combination 0D 0A is a carriage return followed by a newline and, depending on the system, it will get changed when you write in text mode.

How to properly recognize different line endings in C?

I guess the title speaks for itself.
I am coding a C program on Windows 7, using g++ and Notepad++, which compares content of files.
Content of the file:
simple
file with lines
File has line endings in windows style CRLF.
When I count the length of file using this code:
fseek(file, 0, SEEK_END);
size = ftell(file);
fseek(file, 0, SEEK_SET);
I get 23.
When I change line endings to Unix format LF (using Notepad++) I get 22 length.
This creates kind of a problem, when comparing two files. That's why I ask, if there is a way to determine if given file has LF or CR or CRLF.
I know that I can distinguish between CR and LF, LF has ascii code 10 and CR has ascii code 13. Or LF is '\n' and CR is '\r'.
But when reading file char after char I always get LF (ascii 10), even if there is CRLF.
I hope I made it clear. Thanks.
That is the difference between reading files in text and binary mode.
In text mode (fopen(file, "r"), then getc etc.) each CRLF line ending is read as a single '\n' character. If you read in binary mode, e.g. fopen(file, "rb"), then you get the actual bytes and can tell CRLF and LF apart. fseek and ftell work with the actual number of bytes on disk, which is why the reported size changes with the line endings.
The only way to tell is to read the file both ways and see whether there are CRLF pairs or the sizes differ. In practice, just look for a CR in binary mode, as I don't think any current major OS uses a lone CR as a line ending.
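As a rough illustration of the binary-mode approach, the sketch below counts each kind of line ending in a file. The struct eol_counts type and the count_line_endings function are hypothetical names introduced for this example.

```c
#include <stdio.h>

struct eol_counts {
    long crlf;  /* "\r\n" pairs (Windows style) */
    long lf;    /* '\n' not preceded by '\r' (Unix style) */
    long cr;    /* '\r' not followed by '\n' (classic Mac style) */
};

/* Returns 0 on success, -1 if the file cannot be opened. */
int count_line_endings(const char *path, struct eol_counts *out)
{
    FILE *in = fopen(path, "rb");   /* binary: no CRLF translation */
    if (in == NULL)
        return -1;

    out->crlf = out->lf = out->cr = 0;
    int c;
    int pending_cr = 0;             /* previous byte was '\r' */
    while ((c = getc(in)) != EOF) {
        if (pending_cr) {
            pending_cr = 0;
            if (c == '\n') {
                out->crlf++;
                continue;
            }
            out->cr++;              /* the earlier '\r' stood alone */
        }
        if (c == '\r')
            pending_cr = 1;
        else if (c == '\n')
            out->lf++;
    }
    if (pending_cr)
        out->cr++;                  /* file ended with a lone '\r' */
    fclose(in);
    return 0;
}
```

For the two-line file in the question saved with Windows line endings, this should report one CRLF and no lone LF or CR; after converting to Unix format it should report one LF instead.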
In addition to Mark's answer, if you need to do this for a filehandle that has already been opened (such as stdin or stdout), you can use _setmode():
#include <fcntl.h>
#include <io.h>
...
_setmode(fileno(stdin), _O_BINARY);
This works provided no input or output has already occurred to that filehandle. Incidentally, _setmode() only exists on Windows and DOS; on Unix-like operating systems (including versions of Mac OS since OS X), files are effectively always opened in binary mode, and fopen(file, "...b") there is accepted but has no effect. On these platforms, a line ending is encoded by the single character \n.
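If the same source has to build on both kinds of platform, the call can be hidden behind a small wrapper. This is a sketch only: set_binary_mode is a hypothetical name, and it assumes the _WIN32 macro is what identifies a Windows build in your toolchain.

```c
#include <stdio.h>

#ifdef _WIN32
#include <fcntl.h>
#include <io.h>
#endif

/* Hypothetical wrapper: switch an already-open stream to binary mode.
 * Returns 0 on success, -1 on failure. Call it before any I/O on the stream. */
int set_binary_mode(FILE *stream)
{
#ifdef _WIN32
    /* _setmode returns the previous mode, or -1 on failure */
    return _setmode(_fileno(stream), _O_BINARY) == -1 ? -1 : 0;
#else
    (void)stream;   /* no text/binary distinction to switch off here */
    return 0;
#endif
}
```

Calling set_binary_mode(stdin) at the top of main, before the first read, keeps the platform check in one place.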

Reading and writing a file with special characters in c

Using the C language, I am trying to manipulate some files generated by openssl and containing a lot of (very) special characters. But the end of file seems to be prematurely detected.
For example see an extract of my program, that is supposed to copy a file to another :
(for simplicity reasons I do not show the test of the opening of the file but I do that in my program)
char msgcrypt[FFILE];
FILE* fMsg = fopen(f4Path,"r");
while(fgets(tmp,FFILE,fMsg) != NULL) strcat(msgcrypt,tmp);
fclose(fMsg);
FILE* fMsg2 = fopen(f5Path,"w");
fprintf(fMsg2,"%s",msgcrypt);
fclose(fMsg2);
here is the content of the file located at f4Path :
Salted__X¢~xÁïÈú™xe^„fl¯�˜<åD
now the content of the file located at f5Path :
Salted__X¢~xÁïÈú™xe^„fl¯
Notice that 4 characters are missing.
Does anyone have an idea?
But the end of file seems to be prematurely detected
Sounds familiar.
Use fopen(f4Path, "rb") when opening the file. This has real significance on Windows.
Don't use string functions (fprintf, strcat, fgets, etc.); they will choke on NUL characters. Use fread and fwrite instead.
strcat tries to copy a NUL-terminated char *, which means that if it encounters a 0 byte, as it probably has here, it will stop copying.
You'd be better off using open, read, memcpy and write.
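That character it stops on I copied into a hex editor, and it ends up being EF BF BD. That is the UTF-8 encoding of U+FFFD, the replacement character, rather than a BOM (a UTF-8 BOM would be EF BB BF), which suggests the original byte at that position was not valid text. As a result, reading the file as a text file fails. I don't see any NUL characters (unless copying and pasting got rid of them).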
The answer (as has already been discussed) is to not treat it as a text file, and avoiding the str functions won't do any harm either.
The first thing I'd do, though, is add a check for how many characters are read; that way you'll know where the data is being truncated. Right now it could be in any of: read, strcat, write.
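Putting the advice from these answers together, a byte-for-byte copy that checks every count might look something like the following. It is a sketch, not the asker's code: copy_file is a hypothetical name and the 4096-byte buffer size is an arbitrary choice.

```c
#include <stdio.h>

/* Hypothetical helper: copy `src` to `dst` byte for byte.
 * Returns the number of bytes copied, or -1 on any error. */
long copy_file(const char *src, const char *dst)
{
    FILE *in = fopen(src, "rb");
    if (in == NULL)
        return -1;
    FILE *out = fopen(dst, "wb");
    if (out == NULL) {
        fclose(in);
        return -1;
    }

    unsigned char buf[4096];
    long total = 0;
    size_t got;
    int ok = 1;
    while ((got = fread(buf, 1, sizeof buf, in)) > 0) {
        if (fwrite(buf, 1, got, out) != got) {  /* short write: give up */
            ok = 0;
            break;
        }
        total += (long)got;
    }
    if (ferror(in))                 /* loop ended on a read error, not EOF */
        ok = 0;
    fclose(in);
    if (fclose(out) != 0)           /* buffered data may fail to flush */
        ok = 0;
    return ok ? total : -1;
}
```

Comparing the returned count with the size of the source file shows immediately whether anything was lost, and no str* function ever sees the data.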

How to fix this file related problem

I'm reading from a file line by line, but some garbage characters such as a space or \r are being added to what I read. I don't understand why, because there are no such characters in the file I'm reading from. I have used both fread and fgets and get the same problem with both. Please reply if you have a solution to this problem.
The file was probably edited/created on Windows. Windows uses \r\n as a line delimiter. When you read the file, you must strip the \r manually. Since most editors treat \r\n as a single character (line end), you can't "see" it but it's still in the file. Use a hex editor if you want to see it or a tool like od.
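Stripping the \r by hand after fgets could be done with something like the sketch below. The name chomp is hypothetical (borrowed from Perl's function of that name), and the helper removes every trailing '\r' and '\n', which for a buffer filled by fgets means the single line terminator.

```c
#include <string.h>

/* Hypothetical helper: remove all trailing '\n' and '\r' characters in place.
 * Returns the new length of the string. */
size_t chomp(char *line)
{
    size_t len = strlen(line);
    while (len > 0 && (line[len - 1] == '\n' || line[len - 1] == '\r'))
        line[--len] = '\0';
    return len;
}
```

Calling chomp(buffer) right after each successful fgets leaves the same text whether the file uses \n or \r\n line endings.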
Open the file in text mode.
/* ... */
fopen(filename, "r"); /* notice no 'b' in mode */
/* ... */
Supposing you're on Windows ... on reading operations, the library is responsible for translating the literal "\r\n" present on disk to "\n"; and on writing operation, the library translates "\n" to "\r\n".

file size c is different than the size data string's size

I have a file I'm writing to and then changing the size of it to the size of text written to it something like:
FILE * file...
I get all the data from the file and change the file's size to the data's size, but the two differ. The string's size is smaller than the file length, so the file gets cut and loses data.
What might be the problem?
while(fgets(cLine, sizeof(cLine), file) )
str.append((string)cLine);
fputs(str.c_str(),file);
_chsize( fileno(file), (int)str.size() );
When I checked it always fileLength(fileno(file)) is larger than str.size()!
Perhaps it's CRLF? Beware of:
fopen(filename, "r") vs fopen(filename, "rb"),
and likewise
fopen(filename, "w") vs fopen(filename, "wb").
The reason is that "r" or "w" will translate CRLF, while "rb" or "wb" will treat the data as binary. On most platforms the "b" is ignored. For instance, the fopen man page on OS X says:
The mode string can also include the letter "b" either as a third character or as a character between the characters in any of the two-character strings described above. This is strictly for compatibility with ISO/IEC 9899:1990 ("ISO C90") and has no effect; the "b" is ignored.
The fopen page on MSDN says something different:
b
Open in binary (untranslated) mode; translations involving carriage-return and linefeed characters are suppressed.
If t or b is not given in mode, the default translation mode is defined by the global variable _fmode. If t or b is prefixed to the argument, the function fails and returns NULL.
For more information about using text and binary modes in Unicode and multibyte stream-I/O, see Text and Binary Mode File I/O and Unicode Stream I/O in Text and Binary Modes.
Depending on what you are doing in your code for cr/lf and what OS you are running, there could be some translating happening in the background when you read/write the file if you open it in text mode.
Jonathan has hit the nail on the head.
Ensure that you are reading the file in binary format or if you are certain that the file only contains text (and that is all that you want) then be prepared for file characters to be in unicode or some other format.
You'll also find that extra control characters will be automatically added not least the EOF character.
My question though is why do you read the data from the file, only to write it back in again?

Resources