I have a character array in EBCDIC format and I want to save that array to a file. I'm thinking of using fputs to output the character array without first converting it to another format.
Question) Is the use of fputs legal for writing EBCDIC? If not, should I convert the string to ASCII before outputting?
I've searched online, but couldn't find anything that says fputs should not be used for outputting EBCDIC data.
If your EBCDIC character array is a C-style string, in that it ends with a \0 byte, then there is no problem.
fputs(), in binary mode, is format agnostic other than it does not write a \0.
Assuming your program is written using the ASCII character set, it is important that your output file is opened in binary mode (e.g. "wb"); otherwise C's \n will not have the same value as the EBCDIC newline, and text mode may translate some bytes.
On the other hand, are you going to do something with this file other than write and maybe read back?
Should your "character array that is in EBCDIC format" not end in \0, or have embedded \0 bytes, I suggest you simply use fwrite(). Again, be sure to use binary mode, unless your entire system is EBCDIC.
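As a minimal sketch of the fwrite() suggestion above: the helper below writes a byte array to a file opened with "wb", so nothing is translated and embedded \0 bytes are kept. The name write_raw_bytes and the sample data are assumptions made for illustration; the sample bytes spell "HELLO" assuming EBCDIC code page 037.

```c
#include <stdio.h>
#include <stddef.h>

/* Write exactly `len` bytes from `data` to `path` without any translation.
 * Returns 0 on success, -1 on failure.
 * write_raw_bytes is a hypothetical helper name, not a library function. */
int write_raw_bytes(const char *path, const unsigned char *data, size_t len)
{
    FILE *fp = fopen(path, "wb");   /* "b": binary mode, no newline mapping */
    if (fp == NULL)
        return -1;

    size_t written = fwrite(data, 1, len, fp);

    /* fclose flushes buffered data, so it can report a write error too. */
    if (fclose(fp) != 0 || written != len)
        return -1;
    return 0;
}

/* "HELLO" in EBCDIC (assuming code page 037), then an embedded 0x00 and
 * 0x15 (NL in that code page). fputs would stop at the 0x00 byte. */
static const unsigned char ebcdic_sample[] = {
    0xC8, 0xC5, 0xD3, 0xD3, 0xD6, 0x00, 0x15
};
```

A call such as write_raw_bytes("out.bin", ebcdic_sample, sizeof ebcdic_sample) should store all seven bytes, because fwrite() is given an explicit length instead of relying on a terminating \0.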
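Well, fputs takes a C string, that is, a null-terminated char array. It does not care about the encoding, but it stops at the first \0 byte, so it won't work if the data contains embedded \0 bytes. In that case I think you'll need to write the file using a lower-level function: use fwrite to write the bytes directly without treating them as a string. See the man page on fwrite.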
Related
With the C standard library stdio.h, I read that to output ASCII/text data, one should use mode "w" and to output binary data, one should use "wb". But why the difference?
In either case, I'm just outputting a byte (char) array, right? And if I output a non-ASCII byte in ASCII mode, the program still outputs the correct byte.
Some operating systems, mostly named "Windows", don't guarantee that text is stored in a file exactly the way you pass it in. In text mode on Windows, \n is written as \r\n, and \r\n is mapped back to \n when reading. This is fine and transparent when reading and writing ASCII text, but it would trash a stream of binary data. Basically, always give Windows the 'b' flag if you want it to read and write data exactly the way you passed it in.
There are certain transformations that can take place when outputting in text mode (e.g. outputting carriage-return + line-feed when the outputted character is a newline), depending on your platform. Such transformations will not take place when using binary mode.
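One way to see the difference for yourself is to write the same two characters in each mode and count what ends up on disk. This is only a sketch; bytes_on_disk is a hypothetical helper name.

```c
#include <stdio.h>

/* Write the two characters "a\n" using the given fopen mode ("w" or "wb"),
 * then reopen the file in binary mode and count the bytes actually stored.
 * Returns the byte count, or -1 on error. Hypothetical helper. */
long bytes_on_disk(const char *path, const char *mode)
{
    FILE *fp = fopen(path, mode);
    if (fp == NULL)
        return -1;
    if (fputs("a\n", fp) == EOF) {
        fclose(fp);
        return -1;
    }
    if (fclose(fp) != 0)
        return -1;

    fp = fopen(path, "rb");          /* binary read: see the raw bytes */
    if (fp == NULL)
        return -1;
    long count = 0;
    while (fgetc(fp) != EOF)
        count++;
    fclose(fp);
    return count;
}
```

With "wb" the count should be 2 on any platform. With "w" it would be expected to stay 2 on Linux or macOS, and to become 3 on Windows, where the \n is stored as \r\n.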
This program uses the scanf function with %s as the format specifier. This function is fixed and I cannot change anything in the program. Now I have to enter characters so that a specific ASCII code ends up in memory. I already found out that if I want to write NUL (0x00) into memory I can use 'Ctrl'+'Shift'+'#'.
How can I get all the other special ASCII numbers?
I don't know if this is important, but I use Linux and have an English keyboard.
I don't know whether the line is ended by '\n', '\r' or '\r\n',
and I don't know what encoding the text uses; besides, if the encoding is UTF-8, there may be no BOM.
Is there a function or a library that can do this, or that can at least tell me the line terminator?
Are you by chance using fgets, fread, fputs, fwrite, etc, on a file that is open for reading text? If so, the implementation will automatically transform OS-specific line terminators (eg. "\r\n") into '\n' when reading, and transform '\n' into OS-specific line terminators when writing.
There are two other scenarios, one of which it turns out was OP:
OP was struggling with "\r\n" being carried over from other OS software, and so opening files for reading in his (presumably Unix-like) OS would no longer convert that. My suggestion is to use dos2unix for these one-off conversions, rather than bloating your code with something which will likely never run again.
You're not using one of those functions. This could be because you're using a stream such as a socket, and perhaps the protocol requires "\r\n". In this case, you should use strstr to find the exact sequence "\r\n".
UTF-8 was designed with a degree of compatibility with ASCII in mind, hence you can assume that any system that uses UTF-8 will also use ASCII or some similar character set. Any character encoded as a sequence of more than one byte uses only byte values 0x80 or greater. Since '\n' lies within the 0x00-0x7F range, you're guaranteed that it'll be a single byte and that it won't appear as part of a multi-byte character.
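If you do read the raw bytes yourself (for example from a file opened in binary mode, or from a socket), a byte-wise scan is enough to tell the terminators apart, for the reason given above. The sketch below assumes an ASCII-compatible encoding such as UTF-8; the names eol_kind and detect_eol are made up for this example.

```c
#include <stddef.h>

enum eol_kind { EOL_NONE, EOL_LF, EOL_CR, EOL_CRLF };

/* Report the first line terminator found in buf[0..len-1].
 * Assumes an ASCII-compatible encoding (ASCII, UTF-8, Latin-1, ...),
 * where 0x0A and 0x0D never occur inside a multi-byte character.
 * Note: a '\r' in the very last byte is reported as EOL_CR, although the
 * matching '\n' might simply not have been read into the buffer yet. */
enum eol_kind detect_eol(const char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == '\n')
            return EOL_LF;
        if (buf[i] == '\r') {
            if (i + 1 < len && buf[i + 1] == '\n')
                return EOL_CRLF;
            return EOL_CR;
        }
    }
    return EOL_NONE;
}
```

This only works on data that has not already been through text-mode translation; on a stream opened in text mode the implementation may have replaced the original terminator with '\n' before your code sees it.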
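Use strlen to get the size in bytes of a UTF-8 string. wcslen only applies once the text has been converted to a wide-character (wchar_t) string, and it returns the number of wide characters, not bytes.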
http://linux.die.net/man/3/wcslen
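To illustrate the difference between bytes and characters in UTF-8, here is a small sketch. It relies on the fact that continuation bytes in UTF-8 have the bit pattern 10xxxxxx; utf8_code_points is a hypothetical helper and assumes the input is valid UTF-8.

```c
#include <string.h>
#include <stddef.h>

/* Count code points in a null-terminated UTF-8 string by counting every
 * byte that is NOT a continuation byte (continuation bytes are 10xxxxxx).
 * Assumes valid UTF-8; hypothetical helper, not a library function. */
size_t utf8_code_points(const char *s)
{
    size_t count = 0;
    for (; *s != '\0'; s++) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            count++;
    }
    return count;
}
```

For the string "h\xC3\xA9llo" (the word "héllo" spelled out as UTF-8 bytes), strlen reports 6 bytes while utf8_code_points reports 5 characters.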
Is there a way one can issue non ascii hex characters to a scanf that uses %s ? I'm trying to insert hexadecimal chars like \x08\xDE\xAD and so on (to demonstrate buffer overflow).
The input is not to a command line parameter, but to a scanf inside the program.
I assume you want to feed arbitrary data on stdin (since you read with scanf).
You can use the shell to create the data and pipe it into your program, e.g.
printf '\x08\xDE\xAD' | yourprogram
Note that this will only work as long as there are no white-space characters to be fed (because scanf with a %s format stops at white-space).
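Before piping a payload in, it can help to check that none of its bytes counts as white-space, since that is where %s would stop reading. A rough sketch, using isspace from <ctype.h>; the helper name first_whitespace is an assumption for this example.

```c
#include <ctype.h>
#include <stddef.h>

/* Return the index of the first byte in data[0..len-1] that isspace()
 * classifies as white-space, or len if there is none.
 * scanf("%s") stops at such a byte, so everything after it would not be
 * stored by that conversion. Hypothetical helper. */
size_t first_whitespace(const unsigned char *data, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (isspace(data[i]))   /* unsigned char values are valid arguments */
            return i;
    }
    return len;
}
```

In the default "C" locale the white-space bytes are 0x09 to 0x0D and 0x20, so the bytes 0x08, 0xDE and 0xAD from the command above pass the check.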
When you say 'to a scanf()', presumably there is other data than just this to be supplied. Would it work to have a program, perhaps a Perl or Python script, generate the data and write the non-ASCII characters to the standard input of your program? If you need standard input to appear like a terminal, then you should investigate expect which handles that for you. This is a common way of dealing with the problem.