Using getchar() to read from file - c

I have an assignment and basically I want to read all the bytes from an audio file using getchar() like this:
while ((ch = getchar()) != EOF)
At some point I have to read 4 consecutive bytes that stand for size of file and I can't understand the following:
If the file my program is reading is for example 150 bytes in size, that is enough to be stored in 1 of the 4 bytes, which means that 3 of the bytes will be 0 and the last one will be 150 in that case. I understand that I need to read all 4 bytes, through 4 repetitions of the while in the above section of code, in order to get all the information I need, but what exactly is getchar() going to return to my variable, as it returns the ASCII code for the character it just read?
Also what happens for larger numbers, that can't be stored in a single byte?

I can't comment since I don't have enough reputation, and I'm not sure I understand what you mean or what you are trying to achieve.
getchar() returns a single byte at a time, as an int. A multi-byte character is therefore delivered one byte per call, not all at once. Here is the simple code I used to check this:
int c;
printf("Enter character: ");
while ((c = getchar()) != EOF && c != '\n')
    printf("%d ", c);
The character I used (it will probably not display correctly here) is the Stack Overflow glyph I use in my polybar, 溜; here it shows as an Asian character.
Also, getchar() returns EOF when it reaches the end of the file (or when an error occurs), as stated in the Linux manual:
https://linux.die.net/man/3/getchar
Also, upon further reading, it depends on how the file stores the data: if it is big-endian, the four bytes read will be 0, 0, 0, 150; if it is little-endian, they will be 150, 0, 0, 0. That assumes you read one byte at a time and not 4 at once.
As for a "solution" to your question, why not use fread() (or a similar function) to read the 4 bytes at once?
EDIT
As asked in the comments, the following "concatenates" the values bit-wise. I used scanf because I was too lazy to manually check every ASCII key. This assumes the file is big-endian, i.e. 0, 0, 0, 150; otherwise invert the order in which the << is done and it should "just werk™".
#include <stdio.h>

unsigned char c[4];

/* Combine the four bytes, most significant byte first (big-endian). */
unsigned int dosomething(void)
{
    return (unsigned int)c[0] << 24 | (unsigned int)c[1] << 16 |
           (unsigned int)c[2] << 8 | (unsigned int)c[3];
}

int main(void)
{
    for (size_t i = 0; i < 4; i++)
    {
        printf("Enter character: ");
        /* %hhu matches unsigned char; plain %u would write past c[i] */
        if (scanf("%hhu", &c[i]) != 1)
            return 1;
        printf("%u\n", c[i]);
    }
    printf("%u\n", dosomething());
    return 0;
}
As for fread, it is used like this: fread(pointer_to_buffer, size_of_each_element, number_of_elements, file_pointer); it returns the number of elements actually read.
For an in-depth look, here is the manual:
https://www.tutorialspoint.com/c_standard_library/c_function_fread.htm
This should probably be asked in a different thread, as I feel I am answering another question.

If the file my program is reading is for example 150 bytes in size, that is enough to be stored in 1 of the 4 bytes, which means that 3 of the bytes will be 0 and the last one will be 150 in that case. I understand that I need to read all 4 bytes in order to get all the information I need, but what exactly is getchar() going to return to my variable, as it returns the ASCII code for the character it just read?
getchar doesn't know anything about ASCII. It returns the numeric value of the byte it reads, or a special code, represented by EOF, if it cannot read a byte. If you treat the byte as an ASCII code then that's a matter of interpretation.
Thus, if your file size is encoded as three zero bytes followed by one byte with value 150, then getchar() will return that as 0, 0, 0, and 150 on four consecutive calls.
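To make the four-call idea concrete, here is a minimal sketch of combining four consecutive bytes into one number. The function names read_u32_be and read_u32_le are made up for illustration, and the byte order of the size field in the asker's audio format is not stated in the question, so both orders are shown. The sketch uses getc on a FILE pointer; getchar() is equivalent to getc(stdin).

```c
#include <stdint.h>
#include <stdio.h>

/* Reads four bytes and combines them most significant byte first
   (big-endian). Returns 1 on success, 0 if EOF or an error occurs
   before four bytes have been read. */
int read_u32_be(FILE *in, uint32_t *out)
{
    uint32_t value = 0;
    for (int i = 0; i < 4; i++) {
        int ch = getc(in); /* int, so EOF can be told apart from byte 255 */
        if (ch == EOF)
            return 0;
        value = (value << 8) | (uint32_t)ch;
    }
    *out = value;
    return 1;
}

/* Same idea, least significant byte first (little-endian). */
int read_u32_le(FILE *in, uint32_t *out)
{
    uint32_t value = 0;
    for (int i = 0; i < 4; i++) {
        int ch = getc(in);
        if (ch == EOF)
            return 0;
        value |= (uint32_t)ch << (8 * i);
    }
    *out = value;
    return 1;
}
```

With the bytes 0, 0, 0, 150 the big-endian version yields 150. Larger sizes simply use more than one non-zero byte: each byte contributes its value times 256 raised to its position.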

Related

C UNIX - read() reads none existing letters

I've got a little problem while experimenting with some C code. I've tried to use read()-Command to read a text out of a file and store the results in a charArray. But when I print the results they're always different from the file.
Here is the code:
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
void main() {
int fd = open("file", 2);
char buf[2];
printf("Read elements: %ld\n", read(fd, buf, 2));
printf("%s\n", buf);
close(fd);
}
The file "file" was created in the same directory using the following UNIX commands:
cat > file
Hi
So it contains just the word "Hi". When I run it, I expect it to read 2 bytes from the file (which are 'H' and 'i') and store them at buf[0] and buf[1]. But when I print the result, it appears that there was an issue, because besides the word "Hi" there are several weird characters printed (indicating a memory reading/writing problem I guess, due to bad buffer size). I've tried to increase the size of the buf array, and it appears that when I change the size, the weird characters printed change. The problem goes away when the size reaches 32 bytes.
Can someone explain to me in detail why this is happening?
I've understood so far that read() does not append '\0' when it reads something, and that the third parameter of read() indicates the maximum number of bytes to read.
Another thing I've noticed while experimenting with the above code is the following: let's assume one changes the third parameter (maximum bytes to read) of read() to 3, and the size of the buf array to 512 (overkill I know, but I really wanted to see what would happen). Now read will actually read a third character (in my case 'e') and store it in the buffer, even though this third character does not exist.
I've searched Stack Overflow for a while now and I found many similar cases, but none of them made me understand my problem. If there is any other thread I missed, it would be a pleasure if you could link me to it.
Lastly, sorry for my bad English; it's not my native language.
Clearly you need to make buf 3 bytes long and use the last byte as the null byte (0 or '\0'). That way, when you print the string, your computer doesn't carry on until it finds another 0!
The way strings (char arrays really) are handled in C is quite straightforward. When dealing with strings, most if not all functions assume that string parameters are null-terminated (puts) or return null-terminated strings (strdup).
The point is that, by default, the computer can't tell where a string ends unless it is given the string's size each time it processes it. The easiest way around this was to append a 0 (namely the null byte) after each string. That way, the computer just needs to iterate over the string's characters and stop when it finds the termination character (another name for the null byte).
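As a rough sketch of that advice, the helper below reserves one byte for the terminator and writes it at the position given by read()'s return value. The name read_as_string is hypothetical, and the sketch assumes a POSIX system and a single read() call, which may return fewer bytes than requested.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Reads at most cap - 1 bytes from the file at `path` into buf and
   appends '\0', so buf can be printed with %s.
   Returns the number of bytes read, or -1 on error. */
ssize_t read_as_string(const char *path, char *buf, size_t cap)
{
    if (cap == 0)
        return -1;
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    ssize_t n = read(fd, buf, cap - 1); /* leave room for the terminator */
    close(fd);
    if (n < 0)
        return -1;
    buf[n] = '\0'; /* read() never adds this itself */
    return n;
}
```

Called with a 3-byte buffer on the file from the question, this stores 'H', 'i' and '\0', and printf("%s\n", buf) stops after "Hi".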

fread() a struct in c

For my assignment, I'm required to use fread/fwrite. I wrote
#include <stdio.h>
#include <string.h>
struct rec{
int account;
char name[100];
double balance;
};
int main()
{
struct rec rec1;
int c;
FILE *fptr;
fptr = fopen("clients.txt", "r");
if (fptr == NULL)
printf("File could not be opened, exiting program.\n");
else
{
printf("%-10s%-13s%s\n", "Account", "Name", "Balance");
while (!feof(fptr))
{
//fscanf(fptr, "%d%s%lf", &rec.account, rec.name, &rec.balance);
fread(&rec1, sizeof(rec1),1, fptr);
printf("%d %s %f\n", rec1.account, rec1.name, rec1.balance);
}
fclose(fptr);
}
return 0;
}
clients.txt file
100 Jones 564.90
200 Rita 54.23
300 Richard -45.00
output
Account Name Balance
540028977 Jones 564.90
200 Rita 54.23
300 Richard -45.00╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠
╠╠ü☻§9x°é -92559631349317831000000000000000000000000000000000000000000000.000000
Press any key to continue . . .
I can do this with fscanf (which I've commented out), but I'm required to use fread/fwrite.
Why does it start with a massive number for Jones's account?
Why is there garbage after? Shouldn't feof stop this?
Are there any drawbacks using this method? or fscanf method?
How can I fix these?
Many thanks in advance
As the comments say, fread reads the bytes in your file without any interpretation. The file clients.txt consists of 50 characters: 16 in the first line plus 14 in the second plus 18 in the third line, plus two newline characters. (Your clients.txt does not contain a newline after the third line, as you will soon see.) The newline character is a single byte \n on UNIX or Mac OS X machines, but (probably) two bytes \r\n on Windows machines, hence either 50 or 52 characters. Here is the sequence of ASCII bytes in hexadecimal:
3130 3020 4a6f 6e65 7320 3536 342e 3930 100 Jones 564.90
0a32 3030 2052 6974 6120 3534 2e32 330a \n200 Rita 54.23\n
3330 3020 5269 6368 6172 6420 2d34 352e 300 Richard -45.
3030 00
Your fread statement copies these bytes without any interpretation directly into your rec1 data structure. That structure begins with int account;, which says to interpret the first four bytes as an int. As one of the comments noted, you are running your program on a little-endian machine (most likely an Intel machine), so the least significant byte is the first and the most significant byte is the fourth. Thus, your fread said to interpret the sequence of four ASCII characters "100 " as the four byte integer 0x20303031, which equals, in decimal, 540028977. The next member of your struct is char name[100];, which means that the next 100 bytes of data in rec1 will be the name. But the fread was told to read sizeof(rec1)=112 bytes (4 byte account, 100 byte name, 8 byte balance). Since your file is only 50 (or 52) characters, fread will have only been able to fill in that many bytes of rec1. The return value of fread, had you not discarded it, would have told you that the read stopped short of the number of bytes you requested. Since you hit EOF, the feof call breaks out of the loop after that first pass, having consumed the entire file in one gulp.
All of your output was produced by the first and only call to printf. The number 540028977 and the following space were produced by the "%d " and the rec1.account argument. The next bit is only partly determinate, and you got lucky: The "%s" specifier and the corresponding rec1.name argument will print the next characters as ASCII until a \0 byte is found. Thus, the output will begin with the 50-4 (or 52-4) remaining characters of your file, including the two newlines, and potentially continue forever, because there are no \0 bytes in your file (or in any text file), which means that after printing the last character of your file, what you are seeing is whatever garbage happened to be in the automatic variable rec1 when your program started. (That kind of unintentional output is similar to the famous heartbleed bug in OpenSSL.) You were lucky the garbage included a \0 byte after only a few dozen more characters. Note that printf has no way to know that rec1.name was declared to be only a 100 byte array; it only got the pointer to the beginning of name. It was your responsibility to guarantee that rec1.name contained a terminating \0 byte, and you never did that.
We can tell a little bit more. The number -9.2559631349317831e61 (which is pretty ugly in "%f" format) is the value of rec1.balance. The 8 bytes for that double value on an IEEE 754 machine (like your Intel and all modern computers) are in hex 0xcccccccccccccccc. Sixty-four of the peculiar ╠ symbol appear in the "%s" output corresponding to rec1.name, while only 100-46 = 54 characters remain of the 100, so your "%s" output has run off the end of rec1.name, and includes rec1.balance into the bargain, and we learn that your terminal program interpreted the non-ASCII character 0xcc as ╠. There are many ways to interpret bytes bigger than 127 (0x7f); in latin-1 it would have been Ì for example. The graphical character ╠ is the representation of the 0xcc (204) byte in the ancient MS-DOS character set, Windows code page 437. Not only are you running on an Intel machine, it is a Windows machine (of course the most likely possibility to begin with).
That answers your first two questions. I'm not sure I understand your third question. The "drawbacks" I hope are obvious.
As for how to fix it, there is no reasonably simple way to read and interpret a text file using fread. To do so, you would need to duplicate much of the code in the libc fscanf function. The only sensible way is to first use fwrite to create a binary file; then fread will work naturally to read it back. So there have to be two programs -- one to write a binary clients.bin file, and a second to read it back. Of course, that does not solve the problem of where the data for that first program should come from in the first place. It could come from reading clients.txt using fscanf. Or it could be included in the source code of the fwrite program, for example by initializing an array of struct rec like this:
struct rec recs[] = {{100, "Jones", 564.90},
{200, "Rita", 54.23},
{300, "Richard", -45.00}};
Or it could come from reading a MySQL database, or... The one place it is unlikely to originate is in a binary file (easily) readable with fread.
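A minimal sketch of that two-step approach follows, assuming the struct rec from the question. The helper names write_recs and read_recs and the file name clients.bin are illustrative only. The reader loops on fread's return value instead of feof, so a short or failed read ends the loop before any partially filled record is used.

```c
#include <stdio.h>

struct rec {
    int account;
    char name[100];
    double balance;
};

/* Writes `count` records to `path` in binary form.
   Returns the number of records written (0 if the file cannot be opened). */
size_t write_recs(const char *path, const struct rec *recs, size_t count)
{
    FILE *f = fopen(path, "wb");
    if (f == NULL)
        return 0;
    size_t written = fwrite(recs, sizeof recs[0], count, f);
    if (fclose(f) != 0)
        return 0;
    return written;
}

/* Reads up to `max` records from `path` into `recs`.
   Returns the number of complete records read. */
size_t read_recs(const char *path, struct rec *recs, size_t max)
{
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return 0;
    size_t n = 0;
    /* fread returns the number of complete elements it read */
    while (n < max && fread(&recs[n], sizeof recs[0], 1, f) == 1)
        n++;
    fclose(f);
    return n;
}
```

A file written this way is only portable between programs that share the same struct layout, int size and byte order, which is one of the drawbacks compared with the text format.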

linux - serial port programming ( ASCII to Byte )

I tried to receive data from a serial port. However, the data is unrecognizable to me. The root cause is that it is in ASCII. To decode the data, it needs to be in byte format.
The buffer I've created is unsigned char [255], and I try to print out the data using
while (STOP==FALSE) {
res = read(fd,buf,255);
buf[res]=0;
printf(":%x\n", buf[0]);
if (buf[0]=='z') STOP=TRUE;
}
Two questions here:
The data might be shorter than 255 bytes in the real case. It might take only 20-30 of the 255 array elements. In this case, how can I print those 20 elements?
The correct output should be 41542b (AT+) as the head of the entire command, since this is an AT command. So I expect buf[0] to be 41 at the beginning. It is; however, I don't know why the second one is e0 when I expect 54 (T).
Thanks
Ascii is a text encoding in bytes. There's no difference in reading them, it's just a matter of how you interpret what you read. This is not your problem.
Your problem is you read up to 255 bytes at once and only ever print the first of them.
It's pointless to set buf[res] to 0 when you expect binary data (that possibly contains 0 bytes). That's just useful for terminating text strings.
Just use a loop over your buffer, e.g.
for (int i = 0; i < res; ++i)
{
printf("%02x", buf[i]); /* two digits per byte, so 0x0a prints as 0a */
}

How to XOR two byte streams in C?

I've been reading through SO for the past couple of days trying to figure this out, I am stumped. I want to read in two 32 bit byte arrays (from stdin, input will be hex) and xor them, then print the result.
So far I've tried using scanf, fgets, and gets. My thought was to read the large hex numbers into a char buffer and perform the xor in a for loop until I hit an EOL (with fgets) or a null terminator. So far my output is not even close. I tried lots of variations, but I will only post my latest fail below. The challenge I've been trying to complete is: http://cryptopals.com/sets/1/challenges/2/
I am trying it in C because I'm really trying to learn C, but I'm really getting frustrated with none of these attempts working.
#include <stdio.h>
#include <math.h>
int main()
{
char buff1[100];
char buff2[100];
char buff3[100];
int size = sizeof(buff1);
puts("Enter value\n");
fgets(buff1, size, stdin);
puts(buff1);
puts("Enter value\n");
fgets(buff2, size, stdin);
puts(buff2);
for (int i = 0; i != '\n'; i++) {
buff3[i] = buff2[i] ^ buff1[i];
printf("%x", buff3[i]);
}
return 0;
}
sizeof works on both types and objects. Because buff1 is declared as an array (not a pointer), sizeof(buff1) gives its size in bytes, which is 100 here since sizeof(char) is 1 by definition. fgets() will work but I prefer to use this
int getchar()
Just stop when the user enters a newline/terminator character. Since you don't know how many characters will come in from stdin, you need to dynamically increase the size of your buffer or it will overflow. For the purposes of this question you can just make it a very big array, check whether it is about to overflow, and then terminate the program. So, to recap the steps:
1.) Create a big array.
2.) While-loop over getchar() and stop when the output is the terminator; take note of how many chars you read.
3.) Since both buffers are guaranteed to have an equal number of chars, make your final array that many chars in size.
4.) For-loop over getchar() and, as the chars come out, xor them with the first array and put the result into the final array. You should try doing this with 1 array afterwards to get some more C practice.
Good luck!
EDIT:
fgets() can be used but depending on the implementation it is useful to know how many chars have been read in.
#include <string.h>
#include <ctype.h>
static inline unsigned char hc2uc(char d){
const char *table = "0123456789abcdef";
return strchr(table, tolower(d)) - table;
}
...
for(int i=0;buff1[i]!='\n';i++){
buff3[i]=hc2uc(buff2[i])^hc2uc(buff1[i]);
printf("%x",buff3[i]);
}
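For comparison, here is a self-contained sketch of the same digit-by-digit XOR that also guards against characters that are not hex digits (strchr would return NULL for those). The names hex_value and xor_hex are hypothetical, and the letter check assumes an ASCII-compatible character set.

```c
#include <ctype.h>
#include <stddef.h>

/* Returns 0..15 for a hex digit, or -1 for any other character. */
static int hex_value(char d)
{
    if (d >= '0' && d <= '9')
        return d - '0';
    d = (char)tolower((unsigned char)d);
    if (d >= 'a' && d <= 'f') /* assumes a..f are contiguous, as in ASCII */
        return d - 'a' + 10;
    return -1;
}

/* XORs two hex strings digit by digit and writes the result as
   lowercase hex into out (capacity cap, including the terminator).
   Stops at the first non-hex character in either input, such as the
   '\n' left by fgets. Returns the number of digits written, or -1 if
   out is too small. */
int xor_hex(const char *a, const char *b, char *out, size_t cap)
{
    static const char digits[] = "0123456789abcdef";
    size_t i = 0;
    if (cap == 0)
        return -1;
    for (;;) {
        int x = hex_value(a[i]);
        int y = hex_value(b[i]);
        if (x < 0 || y < 0)
            break;
        if (i + 1 >= cap)
            return -1;
        out[i] = digits[x ^ y];
        i++;
    }
    out[i] = '\0';
    return (int)i;
}
```

Because XOR acts on each bit independently, working one hex digit (4 bits) at a time gives the same result as decoding whole bytes first. The buffers filled by fgets in the question can be passed in directly.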

reading data from file in c

I have a txt file named prob which contains:
6 2 8 3
4 98652
914
143 789
1
527 146
85
1 74 8
7 6 3
Each line has 9 chars and there are 9 lines. Since I can't make a string array in C, I'm using a two-dimensional array. Be careful running the code: infinite loops are common and it prints weird output. I'm also curious as to where it stops taking in the string. Until a newline?
Expected result for each "save": 6 2 8 3
or whatever the line contained.
#include <stdio.h>
FILE *prob;
main()
{
prob = fopen("prob.txt", "r");
char grid_values[9][9];
char save[9];
int i;
for (i = 0; (fscanf(prob, "%s", save) != EOF); i++)
{
int n;
for (n = 0; n <= 9; n++)
{
grid_values[i][n] = save[n];
printf("%c", grid_values[i][n]);
}
}
fclose(prob);
}
If you use fscanf, it will stop after a space delimiter.
Try fgets instead. It will read line by line.
for (i = 0; fgets(save, sizeof(save), prob) != NULL; i++)
Note that fgets returns NULL (not EOF) at end of file, and that save must be large enough for the line, its newline, and the terminating null (at least 11 bytes here).
The details of fgets usage can be found here:
http://www.cplusplus.com/reference/clibrary/cstdio/fgets/
--edited--
Here's a second way:
while (fgets(s, sizeof(s), file) != NULL)
{
......
}
I think it'll work well.
This looks like a homework problem, so I will try to give you some good advice.
First, read the description of the fscanf function and the description of the "%s" conversion.
Here is a snip from the description I have for "%s":
Matches a sequence of non-white-space characters; the next pointer must be a pointer to a character array that is long enough to hold the input sequence and the terminating null
character (’\0’), which is added automatically. The input string stops at white space or
at the maximum field width, whichever occurs first.
Here are the two important points:
Each of your input lines contains numbers and whitespace characters. So the function will read a number, reach whitespace, and stop. It will not read 9 characters.
If it did read 9 characters, you do not have enough room in your array to store the 10 bytes required. Note that a "terminating null character" will be added. 9 characters read, plus 1 null, equals 10. This is a common mistake in C programming and it is best to learn now to always account for the terminating null in any C string.
Now, to fix this to read characters into a two dimensional array: You need to use a different function. Look through your list of C stdio functions.
See anything useful sounding?
If you haven't, I will give you a hint: fread. It will read a fixed number of bytes from the input stream. In your case you could tell it to always read 9 bytes.
That would only work if each line is guaranteed to be padded out to 9 characters.
Another function is fgets. Again, carefully read the function documentation. fgets is another function that appends a terminating null. However! In this case, if you tell fgets a size of 9, fgets will only read 8 characters and it will write the terminating null as the 9th character.
But there is even another way! Back to fscanf!
If you look at the other conversion specifiers, you could use "%9c" to read 9 characters. If you use this operation, it will not add a terminating null to the string.
With both fread and fscanf "%9c" if you wanted to use those 9 bytes as a string in other functions such as printf, you would need to make your buffers 10 bytes and after every fread or fscanf function you would need to write save[9] = '\0'.
Always read the documentation carefully. C string functions sometimes do it one way. But not always.
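Putting the fgets option together, one possible sketch of reading the puzzle into a two-dimensional array is shown below. The name read_grid, the ROWS and COLS constants, and the choice to pad short lines with spaces are assumptions for illustration. Each row gets one extra byte for the terminating null, as discussed above.

```c
#include <stdio.h>
#include <string.h>

#define ROWS 9
#define COLS 9

/* Reads up to ROWS lines from `in`. Each row is stored as a
   null-terminated string of exactly COLS characters; lines shorter
   than COLS are padded with spaces and longer ones are cut off.
   Returns the number of rows read. */
int read_grid(FILE *in, char grid[ROWS][COLS + 1])
{
    char line[64]; /* roomy enough for 9 characters plus "\r\n" and '\0' */
    int row = 0;
    while (row < ROWS && fgets(line, sizeof line, in) != NULL) {
        size_t len = strcspn(line, "\r\n"); /* length without the line ending */
        if (len > COLS)
            len = COLS;
        memset(grid[row], ' ', COLS);
        memcpy(grid[row], line, len);
        grid[row][COLS] = '\0';
        row++;
    }
    return row;
}
```

A caller would declare char grid_values[ROWS][COLS + 1], open prob.txt with fopen, check the result for NULL, and pass the stream in. Each row can then be printed with "%s".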
