Shifting Extended ASCII codes - c

When assigning Extended ASCII codes to an unsigned char, I noticed that the values are shifted upwards when they are written to a file.
I condensed my code into this simple program to briefly present my question:
#include <stdio.h>
#include <stdlib.h>
int main()
{
unsigned char testascii[3];
testascii[0] = 122;
testascii[1] = 150;
testascii[2] = 175;
printf("%d\n", testascii[0]);
printf("%d\n", testascii[1]);
printf("%d\n", testascii[2]);
return 0;
}
If I run this simple program, I get this terminal output:
122
150
175
This is correct.
If I now add the following to the above program:
FILE *f;
f = fopen("/mystuff/testascii", "wb");
if (f == NULL)
{
printf("Error opening file\n");
exit(1);
}
fwrite(testascii, 1, 3, f);
fclose(f);
It runs correctly but if I now go to the O/S and run:
od -c testascii
I get this output:
0000000 z 226 257
0000003
As you can see the Standard ASCII code (below 128) is correctly shown; however the Extended ASCII codes (above 127) are changed. I expect them to be 150 and 175 but they are 226 and 257.
If I remove the binary flag from the file open command, the result is still the same.
As a final check, instead of the binary print (fwrite), I changed the code again and looped through the array and did a fprintf of each item like this:
fprintf (fp, "%d", appendtxt[i]);
Here's the OD display for that:
0000000 1 2 2 1 5 0 1 7 5
0000011
This all tells me that the binary print (fwrite) isn't doing what I expected. It's my understanding that fwrite writes the binary data to the file. In that case, why does it successfully write a value less than 128 but fail with values equal to or greater than 128?
Environment:
Code::Blocks 16.01
Centos 7.1
Note: I did find this similar question: fwrite with non ASCII characters but it didn't seem to help with my situation. I could be wrong. Please let me know if I missed something in that post?

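od prints in octal by default, and od -c shows each non-printable byte as a three-digit octal number: 226 octal is 150 decimal, and 257 octal is 175 decimal. The file contains exactly the bytes you wrote.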

Related

Writing even integers up to n in a file in c, putw() function error

I'm writing a program that will output all even integers up to 100 in a text file.
Here's the whole code:
#include <stdio.h>
#include <stdlib.h>
#define MAX 100
int main() {
FILE *fp;
int i;
if ((fp = fopen("even_up_to_100.txt", "w")) == NULL) {
perror("Write");
exit(1);
}
for (i = 1; i <= MAX; ++i) {
if (!(i % 2))
putw(i, fp);
}
fclose(fp);
if ((fp = fopen("even_up_to_100.txt", "r")) == NULL) {
perror("Read");
exit(2);
}
while (!feof(fp))
printf("%d ", getw(fp));
fclose(fp);
return 0;
}
OUTPUT(from text file):
" $ & ( * , . 0 2 4 6 8 : < > @ B D F H J L N P R T V X Z \ ^ ` b d
OUTPUT(from console window):
2 4 6 8 10 12 14 16 18 20 22 24 -1
Please point out the error(if there's any) in the code with the solution.
Inside the text file there are some control characters, which are shown as blank spaces here.
Since getw/putw are binary I/O functions, you should be opening your file in binary mode ("wb" instead of "w" as the mode argument to fopen, and likewise "rb" instead of "r").
Character 26 is ASCII Ctrl-Z, which Windows (and DOS before it) use as an end-of-file marker for text files. So if you're on such a system, when you attempt to read the number 26 from your file, the library sees a Ctrl-Z byte and treats that as the end of the file. That would explain why your program stops reading after 24. Opening in binary mode disables this behavior, and will also avoid various other problems, e.g. the handling of CR characters.
Note that if your goal was, as you said, to "output all even integers up to 100 in a text file", then getw/putw are the wrong tools for the job as they do binary I/O, not text. (Even if you did want binary format, you should not use getw/putw but rather fread/fwrite, as I explain here.) If you want to create a text file, with human-readable contents, you should use fprintf and fscanf.

Why does putw only show correct output with getw, while fprintf and printf only show correct output with scanf?

Can anyone explain what exactly is going on here?
I have a problem with file handling.
Explanation of the code:
It takes integers from the user and stores them in the DATA file.
Then it reads the integers from the DATA file, separates them into odd and even, and stores them in the respective files.
Here is my code.
#include<stdio.h>
int main()
{
FILE *f1, *f2, *f3;
int number;
printf("Enter the content of data file\n");
f1 = fopen("DATA","w");
for(int i=1;i<=30;i++)
{
/*problem is here*/
number=getw(stdin);
//scanf("%d",&number);
if(number==-1 || number==EOF) break;
putw(number,f1);
}
fclose(f1);
f1 = fopen("DATA","r");
f2 = fopen("ODD","w");
f3 = fopen("EVEN","w");
while((number = getw(f1))!=EOF)
{
if(number % 2 == 0)
putw(number, f3);
else
putw(number,f2);
}
fclose(f1);
fclose(f2);
fclose(f3);
f2 = fopen("ODD","r");
f3 = fopen("EVEN","r");
printf("\n\ncontents of ODD file\n");
while((number=getw(f2))!=EOF)
/* problem is here */
putw(number,stdout);
//fprintf(stdout,"%d\n",number);
//printf("%d",number);
printf("\n\ncontents of EVEN file\n");
while((number=getw(f3))!=EOF)
/*problem is here */
//putw(number,stdout);
fprintf(stdout,"%d\n",number);
//printf("%d",number);
fclose(f2);
fclose(f3);
printf("\n");
return 0;
}
Output:
Enter the content of data file
111
222
333
444
555
contents of ODD file
111
333
555
contents of EVEN file
171061810
171193396
So why does putw show the correct output only with getw, while fprintf and printf show the correct output only with scanf? The data stored in all the files is correct, however!
Why is that?
why putw shows correct output with getw, and fprintf and printf only shows correct output with scanf
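Because the getw() function reads the next word from the stream as raw bytes, not as text. The size of a word is the size of an int and may vary from machine to machine.
You can see this in your output: 171061810 is 0x0A323232 in hexadecimal. When you use the ASCII table to convert each byte to a character, you get:
Hex Character
0A ==> \n
32 ==> 2
32 ==> 2
32 ==> 2
so
171061810 holds the four bytes of "222\n" (on a little-endian machine the low-order byte 0x32 comes first in the stream and the 0x0A newline comes last).
It's the same for 444: 171193396 is 0x0A343434, i.e. "444\n".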

Format file to have 5 numbers per line

I am working on a text file containing integers separated by spaces, for instance:
1 2 57 99 8 14 22 36 98 445 1001 221 332 225 789 1111115 147 0 1 21321564 544 489 654 61266 5456 15 19
I would like to re-format this file to only contain 5 integers in any line but the last, and at most 5 integers in the last line.
My code:
#include <stdio.h>
#include <stdlib.h>
int main()
{
FILE *f; // main file (A.txt)
FILE *g; // file copy (B.txt)
// open A.txt to read data
f = fopen("file/path/here/A.txt", "r");
if (f == NULL) {
printf("Read error.\n");
fclose(f);
return -1;
}
// open B.txt to write data
g = fopen("file/path/here/B.txt", "w");
if (g == NULL) {
printf("Write error.\n");
fclose(g);
return -2;
}
int line = 1; // first line in output file
while (!feof(f)) { // not end-of-file
char number[1000];
int i = 0;
for (i = 0; i <= 4; i++)
if (fscanf(f, "%s", number) == 1) { // one number read
fprintf(g, "%s", line + i, number);
}
line += i;
}
// close files
fclose(f);
fclose(g);
return 0;
}
When I run this in Code::Blocks, I get the 'Segmentation fault (core dumped) Process returned 139' message. I suspect that the problem lies in the 'if' statement and my use of formats. Needless to say, I'm relatively new to C. How might I fix this?
The simple reason for your segmentation fault is the expression fprintf(g, "%s", line + i, number);, in which the format promises a pointer to a string (i.e. char*) but the argument you actually pass in that position is a number (i.e. line + i). Hence the value of line + i, which is probably 1, ..., is interpreted as a pointer to memory address 1, which is not allowed to be addressed. It is as if you wrote fprintf(g, "%s", 1), which crashes, too.
So basically change this expression into fprintf(g, "%s", number);, and it should at least not crash (unless you have numbers with more than 999 digits).
There are some other issues in your code, e.g. that you open B.txt for write and assign it to g, but then you test and close the file using variable f.
But maybe the above "crash solution" gets you far enough that you can continue on your own. Note that if opening B.txt failed, your code would also have crashed, because it would pass NULL as the file stream argument to fprintf.
The issue is with the use of fscanf and then fprintf.
fscanf knows how to parse a string into a number. E.g. fscanf(f, "%d", &var);. This reads a signed integer from the file handle f into the variable var. This can then be printed with fprintf.
As it stands, each fscanf call with %s reads one whitespace-delimited token into number as a string (assuming that 1000 chars is enough); no numeric conversion takes place, and no newline is ever written to the output file.

Read and write to binary files in C?

Does anyone have an example of code that can write to a binary file. And also code that can read a binary file and output to screen. Looking at examples I can write to a file ok But when I try to read from a file it is not outputting correctly.
Reading and writing binary files is pretty much the same as any other file, the only difference is how you open it:
unsigned char buffer[10];
FILE *ptr;
ptr = fopen("test.bin","rb"); // r for read, b for binary
fread(buffer,sizeof(buffer),1,ptr); // read 10 bytes to our buffer
You said you can read it, but it's not outputting correctly... keep in mind that when you "output" this data, you're not reading ASCII, so it's not like printing a string to the screen:
for(int i = 0; i<10; i++)
printf("%u ", buffer[i]); // prints a series of bytes
Writing to a file is pretty much the same, with the exception that you're using fwrite() instead of fread():
FILE *write_ptr;
write_ptr = fopen("test.bin","wb"); // w for write, b for binary
fwrite(buffer,sizeof(buffer),1,write_ptr); // write 10 bytes from our buffer
Since we're talking Linux.. there's an easy way to do a sanity check. Install hexdump on your system (if it's not already on there) and dump your file:
mike@mike-VirtualBox:~/C$ hexdump test.bin
0000000 457f 464c 0102 0001 0000 0000 0000 0000
0000010 0001 003e 0001 0000 0000 0000 0000 0000
...
Now compare that to your output:
mike@mike-VirtualBox:~/C$ ./a.out
127 69 76 70 2 1 1 0 0 0
hmm, maybe change the printf to a %x to make this a little clearer:
mike@mike-VirtualBox:~/C$ ./a.out
7F 45 4C 46 2 1 1 0 0 0
Hey, look! The data matches up now*. Awesome, we must be reading the binary file correctly!
*Note that hexdump groups bytes into two-byte words by default, so on a little-endian machine each pair appears swapped (457f is the bytes 7f 45). The data is correct; hexdump -C shows the bytes in file order.
There are a few ways to do it. If I want to read and write binary I usually use open(), read(), write(), close(), which work quite differently from reading a byte at a time. You work with integer file descriptors instead of FILE * variables (fileno() will get the integer descriptor from a FILE *, by the way). You read a buffer full of data, say 32k bytes, at once. The buffer is just an array in memory, so reading from it is fast, and reading or writing many bytes in one call is faster than doing one at a time. I think it's called a blockread in Pascal; read() is the C equivalent.
I looked but I don't have any examples handy. OK, these aren't ideal because they also are doing stuff with JPEG images. Here's a read, you probably only care about the part from open() to close(). fbuf is the array to read into, and sb.st_size is the file size in bytes from a stat() call.
fd = open(MASKFNAME,O_RDONLY);
if (fd != -1) {
read(fd,fbuf,sb.st_size);
close(fd);
splitmask(fbuf,(uint32_t)sb.st_size); // look at lines, etc
have_mask = 1;
}
Here's a write. pix is the byte array, and jwidth and jheight are the JPEG width and height, so for RGB color the number of bytes to write is height * width * 3.
void simpdump(uint8_t *pix, char *nm) { // makes a raw aka .data file
int sdfd;
sdfd = open(nm,O_WRONLY | O_CREAT,0644); // O_CREAT requires a mode argument
if (sdfd == -1) {
printf("bad open\n");
exit(-1);
}
printf("width: %i height: %i\n",jwidth,jheight); // to the console
write(sdfd,pix,(jwidth*jheight*3));
close(sdfd);
}
Look at man 2 open, also read, write, close. Also this old-style jpeg example.c: https://github.com/LuaDist/libjpeg/blob/master/example.c I'm reading and writing an entire image at once here. But they're binary reads and writes of bytes, just a lot at once.
"But when I try to read from a file it is not outputting correctly." Hmmm. If you read a number 65 that's (decimal) ASCII for an A. Maybe you should look at man ascii too. If you want a 1 that's ASCII 0x31. A char variable is a tiny 8-bit integer really, if you do a printf as a %i you get the ASCII value, if you do a %c you get the character. Do %x for hexadecimal. All from the same number between 0 and 255.
I'm quite happy with my "make a weak pin storage program" solution. Maybe it will help people who need a very simple binary file IO example to follow.
$ ls
WeakPin my_pin_code.pin weak_pin.c
$ ./WeakPin
Pin: 45 47 49 32
$ ./WeakPin 8 2
$ Need 4 ints to write a new pin!
$./WeakPin 8 2 99 49
Pin saved.
$ ./WeakPin
Pin: 8 2 99 49
$
$ cat weak_pin.c
// a program to save and read 4-digit pin codes in binary format
#include <stdio.h>
#include <stdlib.h>
#define PIN_FILE "my_pin_code.pin"
typedef struct { unsigned short a, b, c, d; } PinCode;
int main(int argc, const char** argv)
{
if (argc > 1) // create pin
{
if (argc != 5)
{
printf("Need 4 ints to write a new pin!\n");
return -1;
}
unsigned short _a = atoi(argv[1]);
unsigned short _b = atoi(argv[2]);
unsigned short _c = atoi(argv[3]);
unsigned short _d = atoi(argv[4]);
PinCode pc;
pc.a = _a; pc.b = _b; pc.c = _c; pc.d = _d;
FILE *f = fopen(PIN_FILE, "wb"); // create and/or overwrite
if (!f)
{
printf("Error in creating file. Aborting.\n");
return -2;
}
// write one PinCode object pc to the file *f
fwrite(&pc, sizeof(PinCode), 1, f);
fclose(f);
printf("Pin saved.\n");
return 0;
}
// else read existing pin
FILE *f = fopen(PIN_FILE, "rb");
if (!f)
{
printf("Error in reading file. Abort.\n");
return -3;
}
PinCode pc;
fread(&pc, sizeof(PinCode), 1, f);
fclose(f);
printf("Pin: ");
printf("%hu ", pc.a);
printf("%hu ", pc.b);
printf("%hu ", pc.c);
printf("%hu\n", pc.d);
return 0;
}
$
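This is an example that copies a binary file, such as a JPG image or WMV video, byte by byte.
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
    FILE *fin;
    FILE *fout;
    int ch; // int, not char, so EOF can be told apart from the byte 0xFF
    fin=fopen("D:\\pic.jpg","rb");
    if(fin==NULL)
    {
        printf("\n Unable to open the input file ");
        exit(1);
    }
    fout=fopen("D:\\newpic.jpg","wb");
    if(fout==NULL)
    {
        printf("\n Unable to open the output file ");
        fclose(fin);
        exit(1);
    }
    while ((ch=fgetc(fin))!=EOF)
    {
        fputc(ch,fout); // write each byte exactly as it was read
    }
    printf("data read and copied");
    fclose(fin);
    fclose(fout);
    return 0;
}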
I really struggled to find a way to read a binary file into a byte array in C++ that would output the same hex values I see in a hex editor. After much trial and error, this seems to be the fastest way to do so without extra casts. By default it loads the entire file into memory, but only prints the first 800 bytes.
#include <cstdint>
#include <cstdio>
#include <string>
using std::string;
int main()
{
    string Filename = "BinaryFile.bin";
    FILE* pFile = fopen(Filename.c_str(), "rb");
    if (pFile == NULL) // check before using the stream
        return 1;
    fseek(pFile, 0L, SEEK_END);
    size_t size = ftell(pFile);
    fseek(pFile, 0L, SEEK_SET);
    uint8_t* ByteArray = new uint8_t[size];
    for (size_t counter = 0; counter < size; counter++) { // < size, not <=, to stay in bounds
        ByteArray[counter] = fgetc(pFile);
    }
    fclose(pFile);
    for (size_t i = 0; i < 800 && i < size; i++) {
        printf("%02X ", ByteArray[i]);
    }
    delete[] ByteArray;
    return 0;
}
This answer is linked to the question "How to write binary data file on C and plot it using Gnuplot" by CAMILO HG. The real problem has two parts: 1) write the binary data file, 2) plot it using Gnuplot.
The first part has been answered very clearly here, so I have nothing to add.
For the second, the easy way is to send people to the Gnuplot manual, and I am sure someone could find a good answer there, but I could not find one on the web, so I am going to explain one solution (it belongs in the real question, but I am new on Stack Overflow and cannot answer there):
After writing your binary data file using fwrite(), create a very simple program in C, a reader. The reader has the same structure as the writer, but uses fread() instead of fwrite(). So it is very easy to generate this program: copy the writing part of your original code into reader.c and change write to read (and "wb" to "rb"). In addition, you could include some checks on the data, for example that the length of the file is correct. Finally, your program needs to print the data to standard output using printf().
To be clear: your program runs like this
$ ./reader data.dat
X_position Y_position (it must be a comment for Gnuplot)*
1.23 2.45
2.54 3.12
5.98 9.52
OK, with this program, in Gnuplot you only need to pipe the standard output of the reader into Gnuplot, something like this:
plot '< ./reader data.dat'
This line runs the reader program and connects its output to Gnuplot, which plots the data.
*Because Gnuplot is going to read the output of the program, you must know what Gnuplot can and cannot read and plot.
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char **argv)
{
    int wd;
    FILE *in, *out;
    if(argc != 3) {
        printf("Input and output file are to be specified\n");
        exit(1);
    }
    in = fopen(argv[1], "rb"); /* open for read */
    out = fopen(argv[2], "wb"); /* open for write */
    if(in == NULL || out == NULL) {
        printf("Cannot open an input and an output file.\n");
        getchar();
        exit(1);
    }
    /* getw() copies one int at a time, so trailing bytes that do not fill a whole int are not copied */
    while(wd = getw(in), !feof(in)) putw(wd, out);
    fclose(in);
    fclose(out);
    return 0;
}

C: lseek() related question

I want to write some bogus text to a file (the text "helloworld" in a file called helloworld), but not starting from the beginning. I was thinking of using the lseek() function.
If I use the following code (edited):
#include <unistd.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <stdlib.h>
#include <stdio.h>
#define fname "helloworld"
#define buf_size 16
int main(){
char buffer[buf_size];
int fildes,
nbytes;
off_t ret;
fildes = open(fname, O_CREAT | O_TRUNC | O_WRONLY, S_IRUSR | S_IWUSR);
if(fildes < 0){
printf("\nCannot create file + trunc file.\n");
}
//modify offset
if((ret = lseek(fildes, (off_t) 10, SEEK_END)) < (off_t) 0){
fprintf(stdout, "\nCannot modify offset.\n");
}
printf("ret = %d\n", (int)ret);
if(write(fildes, fname, 10) < 0){
fprintf(stdout, "\nWrite failed.\n");
}
close(fildes);
return (0);
}
it compiles fine and runs without any apparent errors.
Still, if I run:
cat helloworld
The output is not what I expected, but:
helloworld
Can
Where is "Can" coming from, and where are my empty spaces?
Should I expect zeros instead of spaces? If I try to open helloworld with gedit, an error occurs, complaining that the file character encoding is unknown.
LATER EDIT:
After I edited my program with the right buffer for writing, and then compiled and ran it again, the "helloworld" file still cannot be opened with gedit.
LATER EDIT
I understand the issue now. I've added to the code the following:
fildes = open(fname, O_RDONLY);
if(fildes < 0){
printf("\nCannot open file.\n");
}
while((nbytes = read(fildes, c, 1)) == 1){
printf("%d ", (int)*c);
}
And now the output is:
0 0 0 0 0 0 0 0 0 0 104 101 108 108 111 119 111 114 108 100
My problem was that I was expecting spaces (32) instead of zeros (0).
In this function call, write(fildes, fname, buf_size), fname has 10 characters (plus a trailing '\0' character), but you're telling the function to write out 16 bytes. Who knows what is in the memory locations after the fname string.
Also, I'm not sure what you mean by "where are my empty spaces?".
Apart from expecting zeros to equal spaces, the original problem was indeed writing more than the length of the "helloworld" string. To avoid such a problem, I suggest letting the compiler calculate the length of your constant strings for you:
write(fildes, fname, sizeof(fname) - 1)
The - 1 is due to the NUL character (zero, \0) that is used to terminate C-style strings, and sizeof simply returning the size of the array that holds the string. Due to this you cannot use sizeof to calculate the actual length of a string at runtime, but it works fine for compile-time constants.
The "Can" you saw in your original test was almost certainly the beginning of one of the "\nCannot" strings in your code; after writing the 11 bytes in "helloworld\0" you continued to write the remaining bytes from whatever was following it in memory, which turned out to be the next string constant. (The question has now been amended to write 10 bytes, but the originally posted version wrote 16.)
The presence of NUL characters (= zero, '\0') in a text file may indeed cause certain (but not all) text editors to consider the file binary data instead of text, and possibly refuse to open it. A text file should contain just text, not control characters.
Your buf_size doesn't match the length of fname. It's reading past the buffer, and therefore getting more or less random bytes that just happened to sit after the string in memory.

Resources