Char value higher than 255? - c

Today I wrote a simple program to encrypt my .txt file. And I saw that I can set a char value higher than 255.
This is the code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(int argc, char** argv)
{
FILE* fp;
FILE* fp2;
char buffor = '\0';
int szyfr = 0;
if(argc < 4)
{
printf("Too few arguments (c/dc, path, cipher)!\n");
exit(0);
}
{
int i;
for(i = 0;i < strlen(argv[3]);++i)
{
szyfr *= 10;
szyfr += argv[3][i]-48;
}
}
if(!strncmp(argv[1], "c", 1))
{
fp = fopen(argv[2], "r");
fp2 = fopen("crypted.data", "w");
if(!fp)
{
printf("Cannot open file: %s!", argv[2]);
exit(0);
}
while(1)
{
buffor = fgetc(fp);
if(feof(fp) != 0) break;
fputc(buffor+szyfr, fp2);
}
fclose(fp);
fclose(fp2);
}
else if(!strncmp(argv[1], "dc", 2))
{
fp = fopen(argv[2], "r");
fp2 = fopen("uncrypted.txt", "w");
if(!fp)
{
printf("Cannot open file: %s!", argv[2]);
exit(0);
}
while(1)
{
buffor = fgetc(fp);
if(feof(fp) != 0) break;
fputc(buffor-szyfr, fp2);
}
fclose(fp);
fclose(fp2);
}
return 0;
}
Whatever value you set szyfr to, this will work, but the chars in the .data file are very strange (for example, for a szyfr of 666 they look like " ×ýû¤").
Why doesn't this give an error about char memory or something like that?
PS: Sorry for the Polish names in the code; I forgot to translate them.

I saw that I can set a char value higher than 255.
I guess you're talking about the first argument to fputc(), and maybe about the return value of fgetc(). These both have type int, but that doesn't mean what you seem to think it means. The behavior of both functions is defined in terms of type unsigned char:
fgetc():
the fgetc function obtains that character as an unsigned char converted to an int [...]
(C2011, 7.21.7.1/2; emphasis added)
fputc():
The fputc function writes the character specified by c (converted to an unsigned char) to the output stream pointed to by stream [...]
(C2011, 7.21.7.3/2; emphasis added)
So yes, inasmuch as the range of type int is, in practice, invariably larger than that of type unsigned char, you can pass a value larger than unsigned char can represent to fputc(). But no, that does not result in writing that value in a manner that can be read back. The conversion to unsigned char will result in the character actually written being in the range of unsigned char, which is almost certainly 0 - 255 for you.
Why doesn't this give an error about char memory or something like that?
There is no error in fputc() because the behavior is perfectly well defined for the arguments you are providing. Even if there were an error, however, your code would not tell you, because such an error would be communicated to your program via the return value of fputc(), which you do not check.
Regarding wide-character I/O
Note that wide-character I/O functions such as fgetwc() and fputwc() operate in larger units, but their underlying behavior is not fundamentally different. It involves casting analogous to that performed by fgetc() and fputc() -- thus affording the same possibility of data corruption -- and you might still see strange characters in your encrypted file, albeit probably different ones.
Regarding strange characters
As far as strange characters appearing in the encrypted file, this is pretty much to be expected, albeit somewhat dependent on what your editor or terminal (depending on how you display the file) supposes is the file's character encoding. Your encryption scheme effectively converts character data to binary data, so it's unreasonable to expect it to look like character data.

C is a low-level language that just does what you tell it, with no help or argument.
You declare a variable buffor to be a char, and then you call a function fgetc() that returns an int, and then you assign it. C says "Fine. You've asked me to put 16 gallons of water into an 8-gallon bucket, so I did." Now you've got a full 8-gallon bucket and a wet floor. C just keeps the low 8 bits and drops the rest, so, for instance, you can no longer reliably tell when fgetc() returns EOF, since EOF has to be distinct from all 256 possible byte values and therefore does not fit in 8 bits alongside them.
If you want to ensure that 8-bit variables only get 8-bit values, you'll have to check them yourself before you assign them.

Related

Why should I put SEEK_SET twice

I want to replace some vowels in a file with "5". The following code works. However, I do not understand why I should call fseek twice.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
void print_file_contents(const char *filename)
{
FILE *fp;
char letter;
if((fp=fopen(filename,"r+"))==NULL)
{
printf("error\n");
exit(1);
}
fseek(fp,0,SEEK_END);
int size=ftell(fp);
rewind(fp);
for(int i=0;i<size;i++)
{
fseek(fp,i,SEEK_SET);
letter=fgetc(fp);
if((letter=='a') || (letter=='e') || (letter=='i'))
{
fseek(fp,i,SEEK_SET); // WHY THIS FSEEK ?
fwrite("5",1,sizeof(char),fp);
}
}
fclose(fp);
}
int main(int argc, char *argv[])
{
print_file_contents("myfile");
return 0;
}
In my opinion, the first fseek(fp, i, SEEK_SET) is used to set the file position indicator to the current character being processed, so that the character can be read using fgetc. Hence, the cursor is updated every time so there is no need to add another fseek(fp, i, SEEK_SET);.
The fgetc advanced the file position; if you want to replace the character you just read, you need to rewind back to the same position you were in when you read the character to replace.
Note that the C standard mandates a seek-like operation when you switch between reading and writing (and between writing and reading).
§7.21.5.3 The fopen function ¶7:
¶7 When a file is opened with update mode ('+' as the second or third character in the above list of mode argument values), both input and output may be performed on the associated stream. However, output shall not be directly followed by input without an intervening call to the fflush function or to a file positioning function (fseek, fsetpos, or rewind), and input shall not be directly followed by output without an intervening call to a file positioning function, unless the input operation encounters end-of-file.
Also, calling fgetc() moves the file position forward one character; if the write worked (it's undefined behaviour if you omit the seek-like operation), you'd overwrite the next character, not the one you just read.
Your intuition is correct: two of the three fseek calls in this program are unnecessary.
The necessary fseek is the one inside the if((letter=='a') || (letter=='e') || (letter=='i')) conditional. That one is needed to back up the file position so you overwrite the character you just read (i.e. the vowel), not the character after the vowel.
The fseek inside the loop (but outside the if) is unnecessary because both fgetc and fwrite advance the file position, so it will always set the file position to the position it already has. And the fseek before the loop is unnecessary because you do not need to know how big the file is to implement this algorithm.
This code can be tightened up considerably. I'd write it like this:
#include <stdio.h>
#include <stdlib.h>
void replace_aei_with_5_in_place(const char *filename)
{
FILE *fp = fopen(filename, "r+"); // (1)
if (!fp) {
perror(filename); // (2)
exit(1);
}
int letter;
while ((letter = fgetc(fp)) != EOF) { // (3)
if (letter == 'a' || letter == 'e' || letter == 'i') { // (4)
fseek(fp, -1, SEEK_CUR); // (5)
fputc('5', fp);
if (fflush(fp)) { // (6)
perror(filename);
exit(1);
}
}
}
if (fclose(fp)) { // (7)
perror(filename);
exit(1);
}
}
int main(int argc, char *argv[])
{
if (argc != 2) {
fprintf(stderr, "usage: %s filename\n", argv[0]);
return 1;
}
replace_aei_with_5_in_place(argv[1]); // (8)
return 0;
}
Notes:
(1) It is often (but not always) better to write operations with side effects, like fopen, separately from conditionals checking whether they succeeded.
(2) When a system-level operation fails, always print both the name of any file involved, and the decoded value of errno. perror(filename) is a convenient way to do this.
(3) You don't need to know the size of the file you're crunching because you can use a loop like this, instead. Also, this is an example of an exception to (1).
(4) Why not 'o' and 'u' also?
(5) Here's the necessary call to fseek, and the other reason you don't need to know the size of the file: you can use SEEK_CUR to back up by one character.
(6) This fflush is necessary because we're switching from writing to reading, as stated in Jonathan Leffler's answer. Inconveniently, it also consumes the notification for some (but not all) I/O errors, so you have to check whether it failed.
(7) Because you are writing to the file, you must also check for delayed I/O errors, reported only on fclose. (This is a design error in the operating system, but one that we are permanently stuck with.)
(8) Best practice is to pass the name of the file to munge on the command line, not to hardcode it into the program.
@Jonathan Leffler states well why the code uses multiple fseek() calls: to cope with changing between reading and writing.
int size=ftell(fp); is weak as the range of returned values from ftell() is long.
Seeking in a text file (as OP has) also risks undefined behavior (UB).
For a text stream, either offset shall be zero, or offset shall be a value returned by an earlier successful call to the ftell function on a stream associated with the same file and whence shall be SEEK_SET. C17dr § 7.21.9.2 4.
Better to use an approach like @zwol's with a small change.
Do not assume a smooth linear mapping. Instead, note the location and then return to it as needed.
int replacement = '5';
for (;;) {
long position = ftell(fp);
if (position == -1) {
perror(filename);
exit(1);
}
int letter = fgetc(fp);
if (letter == EOF) {
break;
}
if (letter == 'a' || letter == 'e' || letter == 'i') {
fseek(fp, position, SEEK_SET);
fputc(replacement, fp);
if (fflush(fp)) {
perror(filename);
exit(1);
}
}
}
Research fgetpos(), fsetpos() for an even better solution that handles all file sizes, even ones longer than LONG_MAX.

How to get the length of a string in c, if it has integers in it

I am familiar with the sizeof operation in C, but when I use it for the string "1234abcd" it only returns 4, which I am assuming is accounting for the last 4 characters.
So how would I get this to be a string of size 8?
specific code is as follows:
FILE *in_file;
in_file = fopen(filename, "r");
if (in_file == NULL) {
printf("File does not exist\n");
return 1;
}
int val_to_inspect = 0;
fscanf(in_file, "%x", &val_to_inspect);
while (val_to_inspect != 0) {
printf("%x", val_to_inspect);
int length = sizeof val_to_inspect;
printf("%d", length);
Again, the string that is being read from the file is "1234abcd", just to clarify.
There're a couple of issues here:
sizeof operator returns the size of the object. In this case it returns the size of val_to_inspect, which is an int.
http://en.cppreference.com/w/cpp/language/sizeof
fscanf reads from a stream and interprets it. You are only scanning an integer ("%x"), not a string.
http://en.cppreference.com/w/cpp/io/c/fscanf
Lastly, if you actually had a null-terminated string, you could use strlen() to get its length.
TL;DR, to get the length of a string, you need to use strlen().
That said, be a little cautious while using sizeof, it operates on the data type. So, if you pass a pointer to it, it will return you the size of the pointer variable, not the length of the string it points to.
In several important ways, only some of which have anything to do with sizeof, you are mistaken about what your code actually does.
FILE *in_file;
in_file = fopen(filename, "r");
if (in_file == NULL)
{
printf("File does not exist\n");
return 1;
}
Kudos for actually checking whether fopen succeeded; lots of people forget to do that when they are starting out in C. However, there are many reasons why fopen might fail; the file not existing is just one of them. Whenever an I/O operation fails, make sure to print strerror(errno) so you know the actual reason. Also, error messages should be sent to stderr, not stdout, and should include the name of the affected file(s) if any. Corrected code looks like
if (in_file == NULL)
{
fprintf(stderr, "Error opening %s: %s\n", filename, strerror(errno));
return 1;
}
(You will need to add includes of string.h and errno.h to the top of the file if they aren't already there.)
int val_to_inspect = 0;
fscanf(in_file,"%x", &val_to_inspect);
This code does not read a string from the file. It skips any leading whitespace and then reads a sequence of hexadecimal digits from the file, stopping as soon as it encounters a non-digit, and immediately converts them to a machine number which is stored in val_to_inspect. With the file containing 1234abcd, it will indeed read eight characters from the file, but with other file contents it might read more or fewer.
(Technically, with the %x conversion specifier you should be using an unsigned int, but most implementations will let you get away with using a signed int.)
(When you get more practice in C you will learn that scanf is broken-as-specified and also very difficult to use robustly, but for right now don't worry about that.)
while (val_to_inspect != 0) {
printf("%x", val_to_inspect);
int length = sizeof val_to_inspect;
printf("%d", length);
}
You are not applying sizeof to a string, you are applying it to an int. The size of an int, on your computer, is 4 chars, and that is true no matter what the value is.
Moreover, sizeof applied to an actual C string (that is, a char * variable pointing to a NUL-terminated sequence of characters) does not compute the length of the string. It will instead tell you the size of the pointer to the string, which will be a constant (usually either 4 or 8, depending on the computer) independent of the length of the string. To compute the length of a string, use the library function strlen (declared in string.h).
You will sometimes see clever code apply sizeof to a string literal, which does return a number related to (but not equal to!) its length. Exercise for you: figure out what that number is, and why sizeof does this for string literals but not for strings in general. (Hint: sizeof s will return a number related to s's string length when s was declared as char s[] = "string";, but not when it was declared as char *s = "string";.)
As a final note, it doesn't matter in the grand scheme of things whether you like your opening braces on their own lines or not, but pick one style and stick to it throughout the entire file. Don't put some if opening braces on their own lines and others at the end of the if line.
It's better to create your own counter and find the length of "1234abcd" by reading it character by character.
FILE *in_file;
int ch; /* int, not char, so that EOF can be detected reliably */
int length=0;
in_file = fopen("filename.txt", "r");
if (in_file == NULL)
{
printf("File does not exist\n");
return 1;
}
while (1) {
ch = fgetc(in_file);
if (ch == EOF) /* check before printing or counting */
break;
printf("%c", ch);
length++;
}
fclose(in_file);
printf ("\n%d",length);
Everyone, thank you for all the feedback. I realize I made a lot of mistakes with the original post, but I'm just switching to C from C++, so a lot of the things I'm used to can't really be applied the same way. This is all tremendously helpful, it's good to have a place to go to.
Len = sizeof(your_string)/sizeof(char) - 1
The -1 accounts for the terminating null character. This only works when your_string is a char array (or a string literal), not a pointer.
If you want the length starting from a specific index, just compute Len - index.

C: simultaneous reading from and writing to file

What I would like to do:
Read bits from one file (input file), and write these bits, inverted with some probability, to another file (output file).
What is the problem:
The probability idea does not seem to be working properly. More importantly, the output file always contains more characters than the original input file, while they should contain an equal number of characters.
In this code sample, instead of inverted bits I have put 'x' and 'y', so that it is more obvious that the output file contains more characters.
INPUT file: 01001
OUTPUT file: xyxxxyx
The code:
void invert_bits(FILE **input, FILE **output, double prob){
srand(clock());
char symbol;
while((symbol = getc(*input)) != EOF){
double result = rand()/RAND_MAX;
if(result < prob){
if(symbol == '0'){
char bit = 'x';
fprintf(*output, &bit);
}
else{
char bit = 'y';
fprintf(*output, &bit);
}
}else{
fprintf(*output, &symbol);
}
}
}
(f)printf expects a format string as its second argument. You are providing it with the address of a char, which is not even a valid string (since it is not NUL-terminated).
Don't do that. It's a bad habit. When you use printf, fprintf or sprintf always use a format string. (Read this for more information.)
You could have used fprintf(*output, "%c", bit); but it would be a lot simpler to just print the character with fputc(bit, *output);
I don't understand why you feel the need to pass the FILE* arguments as pointers, by the way.
You aren't using the fprintf function properly.
The function's signature is:
int fprintf ( FILE * stream, const char * format, ... );
Instead of a null-terminated string, you're providing it with the address of a char, which might be followed by a null character, or might not.
The correct way of printing a character with the *printf functions is:
fprintf(*output, "%c", bit);
P.S. Why are you receiving a pointer to the file handle, i.e. FILE** and not just FILE*?

getc return value stored in a char variable

On this Wikipedia page there is a sample C program reading and printing first 5 bytes from a file:
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
char buffer[5] = {0}; /* initialized to zeroes */
int i;
FILE *fp = fopen("myfile", "rb");
if (fp == NULL) {
perror("Failed to open file \"myfile\"");
return EXIT_FAILURE;
}
for (i = 0; i < 5; i++) {
int rc = getc(fp);
if (rc == EOF) {
fputs("An error occurred while reading the file.\n", stderr);
return EXIT_FAILURE;
}
buffer[i] = rc;
}
fclose(fp);
printf("The bytes read were... %x %x %x %x %x\n", buffer[0], buffer[1], buffer[2], buffer[3], buffer[4]);
return EXIT_SUCCESS;
}
The part I don’t understand is that it uses getc function which returns an int and stores it in an array of chars - how is it possible to store ints in a char array ?
Technically, C allows you to "shorten" a value by assigning it to a variable that is smaller than its type. The specification doesn't say EXACTLY what happens when you do that (because of technicalities in some machines where slightly weird things happen), but in practice, on nearly all machines that you are likely to use unless you work on museum pieces or some very special hardware, it simply acts as if the "upper" bits of the larger number have been "cut off".
And in this particular case, getc is specifically designed to return something that fits in a char, except for the case when it returns EOF, which often has the value -1. Although quite often, char may well support having the value -1 too, but it's not guaranteed to be the case (if char is an unsigned type - something the C and C++ standards support equally with char being a signed type that can be -1).
Check this out:-
If the integer value returned by getc() is stored into a variable of
type char and then compared against the integer constant EOF, the
comparison may never succeed, because sign-extension of a variable of
type char on widening to integer is implementation-defined.
Yes, getc() returns an int. However, except for the special return value EOF, the returned value is always in the range of unsigned char (0 to 255 with 8-bit bytes), so it carries exactly one byte of data.
Therefore, after checking for EOF, you can store the value in a char variable without losing data. If char is signed, values above 127 are converted in an implementation-defined way, which on two's complement machines yields -128 to -1 and preserves the bit pattern.

C Read and replace char

I'm trying to read a file and replace every char with the next char in the ASCII table. It opens the file properly but keeps on reading the first character.
int main(int argc, char * argv[])
{
FILE *input;
input = fopen(argv[2], "r+");
if (!input)
{
fprintf(stderr, "Unable to open file %s", argv[2]);
return -1;
}
char ch;
fpos_t * pos;
while( (ch = fgetc(input)) != EOF)
{
printf("%c\n",ch);
fgetpos (input, pos);
fsetpos(input, pos-1);
fputc(ch+1, input);
}
fclose(input);
return 1;
}
the text file is
abc
def
ghi
I'm pretty sure it's due to the fgetpos and fsetpos calls, but if I remove them then it adds the character at the end of the file and the next fgetc returns EOF and exits.
You have to be careful when dealing with files opened in update mode.
C11 (n1570), § 7.21.5.3 The fopen function
When a file is opened with update mode ('+' as the second or third character in the
above list of mode argument values), both input and output may be performed on the
associated stream.
However, output shall not be directly followed by input without an
intervening call to the fflush function or to a file positioning function (fseek,
fsetpos, or rewind), and input shall not be directly followed by output without an
intervening call to a file positioning function, unless the input operation encounters end-of-file.
So your reading might look something like :
int c;
while ((c = getc(input)) != EOF)
{
fsetpos(/* ... */);
putc(c + 1, input);
fflush(input);
}
By the way, you will have problems with 'z' character.
The procedure for a random-access update like this is:
1. position to the record
2. read the record
3. position to the record again
4. update (write) the record
5. flush (to finalize the update)
The following code is a rewrite that follows this procedure.
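#include <stdio.h>
#include <ctype.h>
int main(int argc, char * argv[]){
FILE *input;
if (argc < 2){
fprintf(stderr, "usage: %s filename\n", argv[0]);
return 1;
}
input = fopen(argv[1], "rb+");
if (!input){
fprintf(stderr, "Unable to open file %s\n", argv[1]);
return 1;
}
int ch;
/* fpos_t is opaque: it cannot portably be compared or incremented, */
/* so track byte offsets as long values from ftell() instead. */
long pos = 0, pos_end;
fseek(input, 0L, SEEK_END);
pos_end = ftell(input);
rewind(input);
while(pos != pos_end){
ch=fgetc(input);
if(EOF==ch)break;
printf("%c",ch);
if(!iscntrl(ch) && !iscntrl(ch+1)){
fseek(input, pos, SEEK_SET); /* back to the character just read */
fputc(ch+1, input);
fflush(input);
}
pos += 1;
fseek(input, pos, SEEK_SET); /* reposition before the next read */
}
fclose(input);
return 0;
}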
I really suspect the problem is here:
fpos_t * pos;
You are declaring a pointer to an fpos_t, which is fine, but then where is the information stored when you retrieve the position?
It should be:
fpos_t pos; // No pointer
...
fgetpos (input, &pos);
fsetpos(input, &pos); // You can only come back where you were!
Reading the (draft) standard, the only requirement for fpos_t is to be able to represent a position and a state for a FILE, it doesn't seem that there is a way to move the position around.
Note that the expression pos-1 moves the pointer; it does not affect the value it points to!
What you probably want is the dear old ftell() and fseek(), which will allow you to move around. Just remember to open the file with "rb+" and to call fflush() after your fputc().
When you'll have solved this basic problem you will note there is another problem with your approach: handling newlines! You most probably should restrict the range of characters you will apply your "increment" and stipulate that a follows z and A follows Z.
That said, is it a requirement to do it in-place?
7.21.9.1p2
The fgetpos function stores the current values of the parse state (if
any) and file position indicator for the stream pointed to by stream
in the object pointed to by pos. The values stored contain unspecified
information usable by the fsetpos function for repositioning the
stream to its position at the time of the call to the fgetpos
function.
The words unspecified information don't inspire confidence in that subtraction. Have you considered calling fgetpos prior to reading the character, so that you don't have to do a non-portable subtraction? Additionally, your call to fgetpos should pass a pointer to an existing fpos_t (e.g. using the & address-of operator). Your code currently passes an uninitialized pointer.
fgetc returns an int, so that it can represent every possible unsigned char value distinct from negative EOF values.
Suppose your char defaults to an unsigned type. (ch = fgetc(input)) converts the (possibly negative, corresponding to errors) return value straight to your unsigned char type. Can (unsigned char) EOF ever compare equal to EOF? When does your loop end?
Suppose your char defaults, instead, to a signed type. (ch = fgetc(input)) is likely to turn the higher range of any returned unsigned char values into negative numbers (though, technically, the result of this conversion is implementation-defined). Wouldn't your loop end prematurely (e.g. before EOF), in some cases?
The answer to both of these questions indicates that you're handling the return value of fgetc incorrectly. Store it in an int!
Perhaps your loop should look something like:
for (;;) {
fpos_t p;
/* Explicit checks rather than assert(): asserts vanish when NDEBUG is defined */
if (fgetpos(input, &p) != 0) { perror("fgetpos"); break; }
int c = fgetc(input);
if (c == EOF) break; /* end of file or read error */
if (fsetpos(input, &p) != 0) { perror("fsetpos"); break; }
if (fputc(c + 1, input) == EOF) { perror("fputc"); break; }
/* fflush is needed before the next read (thank Kirilenko for this one) */
if (fflush(input) != 0) { perror("fflush"); break; }
}
Make sure you check return values...
The update mode ('+') can be a little bit tricky to handle. Maybe you could just change approach: load the whole file into a char array, iterate over it, and then write the whole thing back to an emptied input file? No stream issues.
