What is the use of the `putw` and `getw` functions in C?

I want to know the use of the putw() and getw() functions. As I understand it, they are used to write to and read from a file, much like putc() and getc(), but they deal only with integers. However, when I use putw() to write an integer, something different shows up in the file: for example, if I write 65 with putw(), the file shows 'A'. Why does it take the ASCII value? I am using Code::Blocks 13.12. Code:
#include <stdio.h>

int main() {
    FILE *fp;
    int num;

    fp = fopen("file.txt", "w");
    printf("Enter any number:\n");
    scanf("%d", &num);
    putw(num, fp);
    fclose(fp);
    printf("%d\n", num);
    return 0;
}

Let's go through a point-by-point explanation of the getw() and putw() functions.
getw() and putw() are FILE-handling functions.
putw() is used to write an integer to a file.
getw() is used to read an integer back from a file.
getw() and putw() are similar to getc() and putc(). The only difference is that getw() and putw() are meant specifically for reading and writing values of type int.
int putw(int w, FILE *stream);
The return type of the function is int.
It takes two arguments: the first, w, is the integer you want to write to the file, and the second, FILE *stream, is the stream the value will be written to.
Now let's see an example.
#include <stdio.h>

int main()
{
    FILE *fp;

    fp = fopen("file1.txt", "w");
    putw(65, fp);    /* writes the int 65 to the file */
    fclose(fp);
    return 0;
}
Here putw() takes an integer argument (65 in this case) and writes it to the file file1.txt, but if we open the file manually we find 'A' written in it. putw() really does write the integer: it writes the raw bytes of the int, and the first of those bytes has the value 65 (0x41).
A text editor displays a byte with the value 65 as the character 'A', which is why the file appears to contain a character rather than a number.
int getw(FILE *stream);
The return type is int.
It takes one argument, FILE *stream, the stream from which the integer is to be read.
In the example below we read back the data we wrote to file1.txt in the example above.
#include <stdio.h>

int main()
{
    FILE *fp;
    int ch;

    fp = fopen("file1.txt", "r");
    ch = getw(fp);    /* reads one int back from the file */
    printf("%d", ch);
    fclose(fp);
    return 0;
}
output
65
Explanation: here we read back the data written to file1.txt by the previous program, and the output is 65.
So getw() reads back the same bytes that putw() wrote and reassembles them into the int 65. The first of those bytes has the value 65, which is the ASCII code of 'A', and that is why the file displays as 'A' in an editor.
We can also write the above program as:
#include <stdio.h>

int main()
{
    FILE *fp;

    fp = fopen("file1.txt", "r");
    printf("%d", getw(fp));
    fclose(fp);
    return 0;
}
output
65

If num is an int, then putw(num, fp) is equivalent to fwrite(&num, sizeof(int), 1, fp), except for having a different return value. It writes an int to the file in binary form. getw is similar, but with fread instead. You can see how glibc implements them (putw, getw).
This means that:
They are not appropriate for writing text. If you want to write a number to a file in human-readable decimal or hexadecimal format, use fprintf instead.
They typically read/write more than one byte (one character) of the file. For instance, on a machine with 32-bit ints, they will read/write four bytes. Attempting to do putw('c', fp) will not simply write the single character 'c'.
They should only be used with files opened in binary mode (if that makes a difference on your system).
You should not expect the contents of the file to be human-readable at all. If you attempt to view the file in an editor, you'll see the representation of whatever bytes are in the file, in your current character set (e.g. ASCII).
You should not expect the file to be successfully read back on another computer that uses a different internal representation for int (e.g. different width, different endianness).
On a typical system with 32-bit little-endian int, putw(65, fp) will result in the four bytes 0x41 0x00 0x00 0x00. The 0x41 (decimal 65) is the ASCII code for the character A, so you'll see that if you view it. The 0x00 bytes may or may not be displayed at all, depending on what you are using to view.
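A quick way to confirm this is to write one int with putw and then dump the file's raw bytes. This is a minimal sketch, assuming your platform still provides putw; the file name demo.bin is arbitrary, and the exact byte count and order depend on your int representation:
#include <stdio.h>

int main(void) {
    FILE *fp = fopen("demo.bin", "wb");    /* binary mode */
    if (fp == NULL)
        return 1;
    putw(65, fp);                          /* writes sizeof(int) bytes */
    fclose(fp);

    fp = fopen("demo.bin", "rb");
    if (fp == NULL)
        return 1;
    int byte;
    while ((byte = fgetc(fp)) != EOF)      /* dump each raw byte */
        printf("%02x ", byte);
    printf("\n");                          /* e.g. "41 00 00 00" on 32-bit little-endian */
    fclose(fp);
    return 0;
}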
These functions are not a good idea to use in new code. Even if you do need to store binary data in files, which has the disadvantages noted above and should usually only be done for a very good reason, you should simply use fwrite and fread. getw/putw are a worse option because:
They will make your code less portable. fwrite/fread are part of the ISO C standard, which is the most widely supported cross-platform standard for the C language. getw/putw were present in the Single UNIX Specification version 2 (SUSv2), which dates to 1997 and is now obsolete. They were not included in the POSIX/SUSv3 specs which superseded SUSv2, and it would be unwise to count on them being available on new systems.
They will make your code less readable. Since fread/fwrite are far more widely used, another programmer reading your code will recognize immediately what they do. getw/putw are more obscure, so people are likely to have to go and look them up, and the names don't make it easy to remember that they operate specifically on the type int. Readers may also confuse them with the similarly named ISO-standard functions getwc/putwc.
They may introduce subtle bugs. getw returns EOF on end-of-file or error, but EOF is a valid integer value (often -1). Therefore, if it returns this value, you cannot easily tell whether the file actually contained the integer -1, or whether the read failed. And since this only happens for one particular value, it may be missed in testing. You can check ferror() and feof() to distinguish the two cases, but this is awkward, easy to forget to do, and negates most of the apparent convenience of the "simpler" interface of getw compared to fread.
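For comparison, here is a hedged sketch of reading one int each way, assuming a platform that still provides getw: with getw you must consult feof()/ferror() to disambiguate a stored value of -1 from a real end-of-file, while fread's return count answers the question directly. The helper names are made up for illustration.
#include <stdio.h>

/* Read one int with getw(): EOF is ambiguous, so check the stream state. */
int read_with_getw(FILE *fp, int *out) {
    int v = getw(fp);
    if (v == EOF && (feof(fp) || ferror(fp)))
        return 0;            /* real end-of-file or error */
    *out = v;                /* otherwise the file genuinely held EOF's value */
    return 1;
}

/* Read one int with fread(): the returned item count is unambiguous. */
int read_with_fread(FILE *fp, int *out) {
    return fread(out, sizeof *out, 1, fp) == 1;
}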
I speculate that the only reason these functions existed in the first place is that, like putc (respectively getc), putw could be implemented as a macro that directly wrote the buffer of fp and would thus be a little more efficient than calling fwrite. Such an implementation is no longer feasible on modern systems, since it wouldn't be thread-safe, so putw needs a function call anyway. In fact, with glibc in particular, putw just calls fwrite after all, with the overhead of an additional function call, so it's now less efficient. So there is no longer any reason at all to use these functions.

From the man page of putw() and getw()
getw() reads a word (that is, an int) from stream. It's provided for compatibility with SVr4.
putw() writes the word w (that is, an int) to stream. It is provided for compatibility with SVr4.
You can use the fread and fwrite functions instead for better portability.
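As a rough sketch of what that replacement looks like with only ISO C calls (the file name numbers.bin is just a placeholder), the fwrite/fread pair below stores and retrieves one int the way putw/getw would:
#include <stdio.h>

int main(void) {
    int out = 65, in = 0;

    FILE *fp = fopen("numbers.bin", "wb");
    if (fp == NULL)
        return 1;
    fwrite(&out, sizeof out, 1, fp);        /* same bytes putw(65, fp) would write */
    fclose(fp);

    fp = fopen("numbers.bin", "rb");
    if (fp == NULL)
        return 1;
    if (fread(&in, sizeof in, 1, fp) == 1)  /* check the item count, not EOF */
        printf("%d\n", in);                 /* prints 65 */
    fclose(fp);
    return 0;
}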
getw() reads an int from the given FILE stream.
putw() writes the int given as its first argument to the stream given as its second argument.
getw:
It reads an int from the file, rather than a single character the way getchar() does. If the file starts with the text "hello" and int is 4 bytes, getw() consumes the first four bytes 'h', 'e', 'l', 'l' and returns the int those bytes form.
putw:
It writes the given int to the file as raw bytes; it is the counterpart of getw() in the same way putchar() is the counterpart of getchar(). With putw(65, fp) the first byte written has the value 65, which an editor displays as 'A'.

Related

Read a number of characters from a given position in a file - C

I have a file called "cache.txt" containing some data, and an unordered_map, cache, whose value type is a pair of ints: the first element of the pair is the number of characters I want to read, and the second is the position I want to read from. The key of the unordered_map, comps, is not relevant here.
....
fp = fopen("cache.txt", "r");
fseek(fp, cache[comps].second, SEEK_SET);
int number_of_chars = cache[comps].first;
char c;
while((c = getc(fp)) != EOF && number_of_chars > 0) {
--number_of_chars;
printf("%c",c);
}
fclose(fp);
I have to use it several times, so that's the reason for opening and closing the file each time.
If you are using text streams (and apparently you are, because you open the file in mode "r", not "rb"), then the only position values you can pass to fseek are 0 or some value returned by ftell. In other words, you cannot count characters read yourself and use that count in a call to fseek.
If you use binary streams, then you can count the bytes read yourself, but you will have to deal with the fact that whatever the OS uses to represent newlines will not be translated to a newline character. Because there is no way to know how a given OS handles newlines, it's impossible to portably read a text stream in binary mode.
In short, take the C standard's requirement seriously: make sure that the offset you store in your map came directly from a call to ftell. ("Directly" means that you cannot even use ftell(file) + 1; only ftell(file).)
Unix/Linux programmers can get away with not dealing with the above, since POSIX mandates that there is no difference between text and binary modes, and only imposes the ftell-value requirement on fseek for wide-oriented streams. But if you are using Windows, you will find that trying to fseek back to a byte position you computed yourself does not work. And even if you are not using Windows, ask yourself whether your program might someday be ported to Windows.
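Here is a minimal sketch of the safe pattern described above: record the offset with ftell() while scanning the file, and later hand exactly that value back to fseek(). The single saved position and the count of 10 characters are simplifications for illustration; note that checking the remaining count before calling getc() also avoids reading one character more than requested.
#include <stdio.h>

int main(void) {
    FILE *fp = fopen("cache.txt", "r");
    if (fp == NULL)
        return 1;

    /* Skip the first line, then remember where the second line starts. */
    int c;
    while ((c = getc(fp)) != EOF && c != '\n')
        ;
    long saved_pos = ftell(fp);   /* the only kind of value fseek may be given later */

    /* ... read elsewhere in the file ... */

    /* Come back to the remembered position and print 10 characters. */
    fseek(fp, saved_pos, SEEK_SET);
    int remaining = 10;
    while (remaining > 0 && (c = getc(fp)) != EOF) {
        putchar(c);
        --remaining;
    }

    fclose(fp);
    return 0;
}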

How EOF is defined for binary and ASCII files

I'm programming C on Windows (the system language is Japanese), and I have a problem with EOF in binary and ASCII files.
I asked this question last week and someone kindly helped me, but I still can't really understand how the program behaves when reading a binary versus an ASCII file.
I did the following test:
Test1:
int oneChar;
iFile = fopen("myFile.tar.gz", "rb");
while ((oneChar = fgetc(iFile)) != EOF) {
    printf("%d ", oneChar);
}
Test2:
int oneChar;
iFile = fopen("myFile.tar.gz", "r");
while ((oneChar = fgetc(iFile)) != EOF) {
    printf("%d ", oneChar);
}
In the test1 case, things worked perfectly for both binary and ASCII files. But in test2, the program stopped reading when it encountered 0x1A in a binary file. (Does this mean that 0x1A == EOF?) The ASCII table tells me that 0x1A is a control character called SUB ("substitute", whatever that means), yet when I printf("%d", EOF) it gives me -1...
I also found this question, which tells me that the OS knows exactly where a file ends, so I don't really need to find EOF in the file, because EOF is outside the range of a byte (so what about 0x1A?).
Can someone clear things up a little for me? Thanks in advance.
This is a Windows-specific quirk of text streams: the SUB character, usually typed as Ctrl+Z, is interpreted as end-of-file by fgetc. You do not have to have a 0x1A byte in your text file to get EOF back from fgetc, though: once you reach the actual end of the file, EOF is returned as well.
The standard does not define 0x1A as a char value representing EOF. The constant EOF has type int, with a negative value outside the range of unsigned char. In fact, the reason fgetc returns an int, not a char, is precisely so that it can return a special value for EOF.
The convention of ending a file with Ctrl-Z originated with CP/M, a very old operating system for 8080/Z80 microcomputers. Its file system did not keep track of file sizes down to the byte level, only to the 128-byte sector level, so there needed to be another way to mark the end-of-file.
Microsoft's DOS was made to be as compatible with CP/M as possible, so it kept the convention when reading text files. By this time the file size was kept by the file system so it wasn't strictly necessary, just retained for backward compatibility.
This convention has persisted to the present day in the C and C++ libraries for Windows; when you open a file in text mode, every character is checked for Ctrl-Z and the end-of-file flag is set if it's detected. You're seeing the effects of backwards compatibility taken to an extreme, back to systems that are almost 40 years old.
Found a terrific article that answers all the questions! https://latedev.wordpress.com/2012/12/04/all-about-eof/
The 0x1A byte (ASCII 26, SUB) only marks end-of-file for text-mode streams on Windows; EOF itself is a negative int constant, not a character stored in the file.
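A small experiment, offered as a sketch, that makes the difference visible: write three bytes including 0x1A, then count how many come back in binary mode versus text mode. On Windows the text-mode count stops at the Ctrl-Z byte; on POSIX systems the b flag is a no-op, so both counts match. The file name ctrlz.bin is arbitrary.
#include <stdio.h>

static int count_bytes(const char *mode) {
    FILE *fp = fopen("ctrlz.bin", mode);
    if (fp == NULL)
        return -1;
    int n = 0, c;
    while ((c = fgetc(fp)) != EOF)
        n++;
    fclose(fp);
    return n;
}

int main(void) {
    FILE *fp = fopen("ctrlz.bin", "wb");
    if (fp == NULL)
        return 1;
    fputc('A', fp);
    fputc(0x1A, fp);    /* the SUB / Ctrl-Z byte */
    fputc('B', fp);
    fclose(fp);

    printf("binary mode: %d bytes\n", count_bytes("rb"));  /* 3 everywhere */
    printf("text mode:   %d bytes\n", count_bytes("r"));   /* 1 on Windows, 3 on POSIX */
    return 0;
}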

fgets not reading the beginning of a line

I am having trouble reading a few lines of text from a file using fgets. The file holds some basic user data that is written to a file within the bundle the first time the plugin is launched. Any subsequent launch of the plugin should result in the user data being read and cross-referenced to check the user's authenticity.
The data is always 3 lines long, is written with fwrite exactly as it should be, and is opened with fopen.
My original plan was to just call fgets 3 times, reading each line into its own char array, which is part of a data struct. The problem is that the first line is read correctly, the second line is read as though the position indicator starts on the next line but offset by the number of characters read from line 1, and the third line is not read at all.
fgets is not returning any errors and is behaving as though it has read the data it should have, so I'm obviously missing something.
Anyway, here's a portion of my code; hopefully someone can shed some light on my mistakes!
int length;

fgets(var.n, 128, regFile);
length = strlen(var.n);
var.n[length - 1] = '\0';    /* strip the trailing newline */

fgets(var.em, 128, regFile);
length = strlen(var.em);
var.em[length - 1] = '\0';

fgets(var.k, 128, regFile);
length = strlen(var.k);
var.k[length - 1] = '\0';

fclose(regFile);
Setting the last character in each string to '\0' is just to remove the trailing \n.
This sequence of code outputs the whole of line 1, the second half of line 2 and none of line 3.
Thanks to @alvits for the answer to this one:
fwrite() is not compatible with fgets(). Files created using fwrite() should be read back with fread(). Both fwrite() and fread() operate on binary streams unless the data is explicitly converted to and from strings. fgets() is compatible with fputs(); both operate on strings.
I used fputs() to write my data instead and it read back in perfectly.
In POSIX systems, including Linux, there is no differentiation between binary and text files. When opening a file stream, the b flag is ignored. This is described in fopen().
You might ask "how would you differentiate text from binary files?". The contents differentiate them. How the contents are written makes them a binary or text file.
Look at the signature size_t fwrite(const void *ptr, size_t size, size_t nmemb, FILE *stream). You'll notice that it writes the contents of *ptr, with size describing the size of each of the nmemb members. The written data is not converted to a string. If you were to write the value 97 this way, it writes the raw byte 97, which in ASCII is 'a'. Binary data does not obey string termination: any \n or \0 in the data is written literally as is.
Now look at the signature int fputs(const char *s, FILE *stream). It writes the string content of *s. To write 97 with it, you have to pass the string "97", which is not the single byte 'a'. String termination is obeyed, and in text mode \n is converted to the O/S-supported newline (CRLF or LF).
You can coerce fwrite() to behave like fputs() but not the other way around. For example, if ptr points to a string and size is exactly the length of the content excluding the terminator, you'll write it out as text instead of binary. You will also need to handle \0 and \n yourself and convert them to the O/S-supported newline. Writing the entire string buffer would write everything, including and past the string terminator.
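A sketch contrasting the two, with arbitrary file names: fwrite() emits the raw bytes of an int object, while fputs() emits the characters of its textual representation, and only the latter is suitable for reading back line by line with fgets().
#include <stdio.h>

int main(void) {
    int value = 97;
    char line[32];

    /* Binary: writes sizeof(int) raw bytes; the first byte, 0x61, displays as 'a'. */
    FILE *fb = fopen("binary.dat", "wb");
    if (fb == NULL)
        return 1;
    fwrite(&value, sizeof value, 1, fb);
    fclose(fb);

    /* Text: writes the characters '9', '7' and a newline; fgets() reads it back. */
    FILE *ft = fopen("text.txt", "w");
    if (ft == NULL)
        return 1;
    snprintf(line, sizeof line, "%d\n", value);
    fputs(line, ft);
    fclose(ft);

    ft = fopen("text.txt", "r");
    if (ft != NULL && fgets(line, sizeof line, ft) != NULL)
        printf("read back: %s", line);   /* prints "read back: 97" */
    if (ft != NULL)
        fclose(ft);
    return 0;
}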

Why does the file size decrease after encryption using an offset cipher?

I encrypted a text file using an offset cipher in C. For this, I simply added 128 to each character, and the file size decreased by 3 bytes. I tried the same on some other files and got the same result: a decrease in file size of 3 bytes. I got the original size back after decryption.
Could you please tell me why this happens?
Code for the main logic is given below:
while ((ch = fgetc(fs)) != EOF) {
    fputc(ch + 128, ft);
}
Could you please tell me why this happens?
Your ch probably has the wrong declaration. The fgetc() function returns an int, not a char, and if you store the result in a char you will lose the distinction between (char) 0xff and EOF.
// WRONG WRONG WRONG
// char ch = fgetc(fs);
The right declaration:
int ch = fgetc(fs);
Otherwise, it shouldn't happen. Is your process exiting cleanly? If you call abort(), there might still be data sitting in the FILE * buffers. Show more code, run it under Valgrind, and check the exit status of your process.
I think the file size should have doubled, as two bytes would be needed for one character after encryption, since something greater than 127 cannot be stored in 1 byte.
No, fputc() does not work that way. From the fputc() man page (run man fputc in a terminal, unless you are on Windows):
fputc() writes the character c, cast to an unsigned char, to stream.
Conversion to unsigned char is done by taking the value modulo 256*, so fputc() always writes exactly one byte of data (unless it fails).
* This is true on all but exceedingly rare systems.
If you talk about Windows, I could imagine that you have opened the file in text mode, not in binary mode.
That leads to the following:
Writing \n leads to a \r\n written to the file.
Reading \r\n from the file gives only \n to the user.
Reading stops at the first \x1A, which is treated as an end-of-file marker.
If you add 128 to each byte, the value to be written wraps around at 256: fputc() converts its argument to unsigned char, so writing ch + 128 effectively writes (ch + 128) % 256 (you can make the wrap explicit with (ch + 128) & 0xFF). As a result, some of the wrapped values may happen to be \n or \x1A, which the text-mode translation then mangles.
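Putting those points together, here is a hedged sketch of the offset cipher that avoids both pitfalls: ch is an int so EOF stays distinguishable from data, both files are opened in binary mode so no byte is treated as a newline or end-of-file marker, and the value handed to fputc() is reduced modulo 256 explicitly. The file names are placeholders.
#include <stdio.h>

int main(void) {
    FILE *fs = fopen("plain.txt", "rb");    /* binary mode: no \r\n or 0x1A translation */
    FILE *ft = fopen("cipher.bin", "wb");
    if (fs == NULL || ft == NULL)
        return 1;

    int ch;                                 /* int, so EOF stays distinguishable */
    while ((ch = fgetc(fs)) != EOF)
        fputc((ch + 128) & 0xFF, ft);       /* wrap explicitly instead of relying on conversion */

    fclose(fs);
    fclose(ft);
    return 0;
}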

To copy files in binary mode, why doesn't it work when we read into and write from a char variable? [duplicate]

This question already has answers here:
copying the contents of a binary file
(4 answers)
Closed 9 years ago.
The following program is intended to make a copy of one .exe application file. But just one little thing determines whether it gives me a proper copy of the intended file RealPlayer.exe or a corrupted file.
What I do is read from the source file in binary mode and write to the new copy in the same mode. For this I use a variable ch. If ch is of type char, I get a corrupted file a few bytes in size, while the original file is 26 MB. But if I change the type of ch to int, the program works fine and gives me an exact copy of RealPlayer.exe sized 26 MB. So let me ask two questions that arise from this premise; I would appreciate it if you could answer both parts:
1) Why does using type char for ch mess things up while int works? What is wrong with the char type? After all, shouldn't it read byte by byte from the original file (as char is one byte itself) and write it byte by byte to the copy? Isn't that what the int type does, i.e. read 4 bytes from the original file and then write that to the copy file? Why the difference between the two?
2) Why is the file so small compared to the original if we use char for ch? Let's forget for a moment that the copied file is corrupt and focus on the size. Why is the size so small when we copy character by character (or byte by byte), but the full original size when we copy "integer by integer" (or 4 bytes by 4 bytes)?
A friend suggested I simply stop asking questions and use int because it works while char doesn't! But I need to understand what's going on here, as I see a serious lapse in my understanding of this matter. Your detailed answers are much appreciated. Thanks.
#include <stdio.h>
#include <stdlib.h>

int main()
{
    char ch;    //This is the cause of problem
    //int ch;   //This solves the problem
    FILE *fp, *tp;

    fp = fopen("D:\\RealPlayer.exe", "rb");
    tp = fopen("D:\\copy.exe", "wb");
    if (fp == NULL || tp == NULL)
    {
        printf("Error opening files");
        exit(-1);
    }

    while ((ch = getc(fp)) != EOF)
        putc(ch, tp);

    fclose(fp);
    fclose(tp);
}
The problem is in the termination condition for the loop. In particular, the type of the variable ch, combined with rules for implicit type conversions.
while((ch=getc(fp))!=EOF)
getc() returns int - either a value from 0-255 (i.e. a char) or -1 (EOF).
You stuff the result into a char, then promote it back to int to do the comparison. Unpleasant things happen, such as sign extension.
Let's assume your compiler treats "char" as "signed char" (the standard gives it a choice).
You read a bit pattern of 0xff (255) from your binary file - that's -1, expressed as a char. That gets promoted to int, giving you 0xffffffff, and compared with EOF (also -1, i.e 0xffffffff). They match, and your program thinks it found the end of file, and obediently stops copying. Oops!
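A tiny demonstration of that promotion, assuming a platform where plain char is signed and EOF is -1: a data byte of 0xFF stored in a char compares equal to EOF after promotion to int, even though it came from perfectly valid file contents.
#include <stdio.h>

int main(void) {
    char c = (char)0xFF;   /* a perfectly valid data byte read from the file */
    int promoted = c;      /* sign-extended on signed-char platforms */

    printf("promoted = %d\n", promoted);           /* typically -1 */
    printf("compares equal to EOF: %s\n",
           promoted == EOF ? "yes" : "no");        /* "yes" where EOF is -1 */
    return 0;
}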
One other note - you wrote:
Isn't that what the int type does, i.e. read 4 bytes from the original file and then write that to the copy file?
That's incorrect. getc(fp) behaves the same regardless of what you do with the value returned - it reads exactly one byte from the file, if there's one available, and returns that value - as an int.
int getc ( FILE * stream );
Returns the character currently pointed to by the internal file position indicator of the specified stream.
On success, the character read is returned (promoted to an int value). If ch is defined as int, everything works fine; but if ch is defined as char, the value returned from getc() is truncated back to char, so a data byte of 0xFF becomes indistinguishable from EOF.
Those are the reasons for the corruption of the data and the loss in size.
