Reading from a text file in C

I am having trouble reading a specific integer from a file and I am not sure why. First I read through the entire file to find out how big it is, and then I reset the pointer to the beginning. I then read three 16-byte blocks of data, then one 20-byte block, and then I would like to read 1 byte at the end as an integer. I had to write it into the file as a character, but I do not think that should be a problem. My issue is that when I read it out of the file, instead of the integer value 15 I get 49. I checked the ASCII table and it is not the hex or octal value of 1 or 5. I am thoroughly confused because my read statement is read(inF, pad, 1), which I believe is right. I do know that an integer variable is 4 bytes; however, there is only one byte of data left in the file, so I read in only the last byte.
My code is reproduced in the function below (it seems like a lot, but I don't think it is):
#include<math.h>
#include<stdio.h>
#include<string.h>
#include <fcntl.h>
int main(int argc, char** argv)
{
char x;
int y;
int bytes = 0;
int num = 0;
int count = 0;
num = open ("a_file", O_RDONLY);
bytes = read(num, y, 1);
printf("y %d\n", y);
return 0;
}
To sum up my question, how come when I read the byte that stores 15 from the text file, I can't view it as 15 from the integer representation?
Any help would be very appreciated.
Thanks!

You're reading one byte into an int (4 bytes) and then printing the whole int. If you want to read a single byte, you also need to use it as a single byte, like this:
char temp; // one-byte signed integer
read(fd, &temp, 1); // read the integer from file
printf("%hhd\n", temp); // print one-byte signed integer
Or, you can use regular int:
int temp; // four byte signed integer
read(fd, &temp, 4); // read it from file
printf("%d\n", temp); // print four-byte signed integer
Note that this will work only on platforms with 32-bit integers, and also depends on platform's byte order.
What you're doing is:
int temp; // four byte signed integer
read(fd, &temp, 1); // read one byte from file into the integer
// now first byte of four is from the file,
// and the other three contain undefined garbage
printf("%d\n", temp); // print contents of mostly uninitialized memory

The read function system call has a declaration like:
ssize_t read(int fd, void* buf, size_t count);
So you should pass the address of the int variable you want to read into, i.e., use
bytes = read(num, &y, 1);


Based on how read works, I believe it stores the byte from the file in the first byte (lowest address) of the 4-byte integer; whether that is the low-order byte depends on the platform's byte order. Whatever was in pad's other 3 bytes stays there (if you initialized pad to zero, those bytes stay zero). I would read one byte into a char and then cast it to an integer (if you need a 4-byte integer for some reason), as shown below:
/* declare at the top of the program */
char temp;
/* Note line to replace read(inF,pad,1) */
read(inF,&temp,1);
/* Cast the value read to an integer; the high-order bit may be sign-extended, giving a negative number */
pad = (int) temp;
/* Mask off the high order bits */
pad &= 0x000000FF;
Otherwise, you could change your declaration to be an unsigned char which would take care of the other 3 bytes.
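
The cast-and-mask idea above can be sketched as a small helper. This is only an illustration: the names byte_to_int and first_byte_as_int are made up, and the byte is assumed to have been read into a char already. It may also be worth checking whether the file holds the text "15" rather than a single byte with the value 15, because 49 is the ASCII code of the character '1'.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: widen one byte read from a file to an int in 0..255.
   Converting through unsigned char prevents sign extension on platforms
   where plain char is signed, so no separate mask is needed. */
int byte_to_int(char raw)
{
    return (int)(unsigned char)raw;
}

/* Hypothetical helper: take the first byte of a buffer filled by read(). */
int first_byte_as_int(const char *buf)
{
    assert(buf != NULL);
    return byte_to_int(buf[0]);
}
```

With this, read(inF, &temp, 1) followed by pad = byte_to_int(temp) yields 15 for a byte holding the value 15, and 49 for a byte holding the character '1'.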

Related

8 Byte Number as Hex in C

I have given a number, for example n = 10, and I want to calculate its length in hex with big endian and save it in a 8 byte char pointer. In this example I would like to get the following string:
"\x00\x00\x00\x00\x00\x00\x00\x50".
How do I do that automatically in C with for example sprintf?
I am not even able to get "\x50" in a char pointer:
char tmp[1];
sprintf(tmp, "\x%x", 50); // version 1
sprintf(tmp, "\\x%x", 50); // version 2
Version 1 and 2 don't work.
I have given a number, for example n = 10, and I want to calculate its length in hex
Repeatedly divide by 16 to count the hexadecimal digits. Using do ... while ensures the result is 1 when n == 0.
int hex_length = 0;
do {
hex_length++;
} while (n /= 16); // note: this consumes n, so work on a copy if n is needed later
save it in a 8 byte char pointer.
C cannot force your system to use 8-byte pointers, so if your system uses 4-byte char pointers, we are out of luck. Let us assume OP's system uses 8-byte pointers. Integers may be converted to pointers with a cast, but the result may or may not be a valid pointer.
assert(sizeof (char*) == 8);
char *char_pointer = (char *) (uintptr_t) n; // uintptr_t is declared in <stdint.h>
printf("%p\n", (void *) char_pointer);
In this example I would like to get the following string: "\x00\x00\x00\x00\x00\x00\x00\x50".
In C, a string includes the various characters up to and including a null character. "\x00\x00\x00\x00\x00\x00\x00\x50" is not a valid C string, yet is a valid string literal. Code cannot construct string literals at run time; they are part of the source code. Further, the relationship between n==10 and "\x00...\x00\x50" is unclear. Instead, perhaps the goal is to store n into an 8-byte array (big endian).
unsigned char buf[8];
for (int i = 7; i >= 0; i--) { // 7 is the last valid index of buf
buf[i] = (unsigned char) (n & 0xFF); // least significant byte goes last; assumes n >= 0
n /= 256;
}
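
Storing a value big-endian in an 8-byte array can be sketched as follows. The names store_be64 and load_be64 are hypothetical, and the sketch assumes the value fits in a uint64_t from <stdint.h>; it is one way to do it, not necessarily what OP needs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: write n into buf[0..7], most significant byte first. */
void store_be64(uint64_t n, unsigned char buf[8])
{
    assert(buf != NULL);
    for (int i = 7; i >= 0; i--) {
        buf[i] = (unsigned char)(n & 0xFF); /* low byte goes to the highest index */
        n >>= 8;
    }
}

/* Hypothetical inverse: rebuild the value from 8 big-endian bytes. */
uint64_t load_be64(const unsigned char buf[8])
{
    uint64_t n = 0;
    assert(buf != NULL);
    for (int i = 0; i < 8; i++)
        n = (n << 8) | buf[i];
    return n;
}
```

Calling store_be64(0x50, buf) leaves seven zero bytes followed by 0x50, which matches the byte sequence shown in the question.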
OP's code certainly will fail as it attempts to store a string into a buffer that is too small. Further, "\x%x" is not valid code, as \x begins an invalid escape sequence.
char tmp[1];
sprintf(tmp, "\x%x", 50); // version 1
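For i > 0 you can also use logarithms:
int i;
...
int length = (int) floor(log(i) / log(16)) + 1;
This will give you (in length) the number of hexadecimal digits needed to represent i (without 0x of course).
log(i) / log(base) is the log-base of i. The log16 of i gives you the exponent.
To make clear what we're doing here: when raising 16 to the power of the found exponent, we get back i: 16^log16(i) = i.
Taking floor() of this exponent and adding 1 gives the number of digits; ceil() alone would be one short for exact powers of 16 such as 1 or 16. Floating-point rounding can still be off by one near exact powers of 16, so the integer-division loop is the safer choice.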
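
For comparison, the digit count can be computed with integer arithmetic only. This is a minimal sketch; the function name hex_digits is made up, and it takes its argument by value so the caller's variable is left untouched.

```c
#include <assert.h>

/* Hypothetical helper: number of hexadecimal digits needed to print n.
   Uses only integer division, so there is no floating-point rounding. */
int hex_digits(unsigned long n)
{
    int digits = 0;
    do {
        digits++;
        n /= 16;
    } while (n != 0);
    assert(digits >= 1); /* even 0 needs one digit */
    return digits;
}
```

Exact powers of 16 are the cases worth testing: 15 needs one digit, 16 needs two, 255 needs two and 256 needs three.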

Reading 2 bytes from a file and converting to an int gives the wrong output

Basically I have a text file that contains a number. I changed the number to 0 to start and then I read 2 bytes from the file (because an int is 2 bytes) and I converted it to an int. I then print the results, however it's printing out weird results.
So when I have 0 it prints out 2608 for some reason.
I'm going off a document that says I need to read through a file where the offset of bytes 0 to 1 represents a number. So this is why I'm reading bytes instead of characters...
I imagine the issue is due to reading bytes instead of reading by characters, so if this is the case can you please explain why it would make a difference?
Here is my code:
void readFile(FILE *file) {
char buf[2];
int numRecords;
fread(buf, 1, 2, file);
numRecords = buf[0] | buf[1] << 8;
printf("numRecords = %d\n", numRecords);
}
I'm not really sure what the buf[0] | buf[1] << 8 does, but I got it from another question... So I suppose that could be the issue as well.
The character 0 in your text file is stored as the single byte 0x30 (the ASCII code for '0'), and that is what gets loaded into buf[0].
buf[1] receives the next byte in the file, which in this case is 0x0a (\n in the ASCII table), presumably the newline that ends the line.
Combining those two by buf[0] | buf[1] << 8 results in 0x0a30 which is 2608 in decimal. Note that << is the bit-wise left shift operator.
(Also, the size of the int type is 4 bytes on many systems. You should check that.)
You can read directly into an integer:
fread(&numRecords, sizeof(numRecords), 1, file);
Check sizeof(int) on your system: if it's four bytes, you need to declare numRecords as short int rather than int. This also only works if the file really holds the number in binary, in your machine's byte order.
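
To see where 2608 comes from, here is a small sketch of the byte-combining step on its own. The helper name le16_from_bytes is hypothetical, and it assumes the two bytes are meant to be little-endian (low byte first), as in the expression from the question. If the file actually contains text, fscanf(file, "%d", &numRecords) is probably what you want instead.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: combine two bytes, low byte first (little-endian),
   into a value in 0..65535. unsigned char avoids sign extension. */
unsigned int le16_from_bytes(const unsigned char buf[2])
{
    assert(buf != NULL);
    return (unsigned int)buf[0] | ((unsigned int)buf[1] << 8);
}
```

Feeding it the bytes 0x30 and 0x0a (the text "0" followed by a newline) gives 0x0a30, that is 2608, while two zero bytes give 0.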

Encode and combine int to 32bit int in C binary file

Let's say I have 2 variables:
int var1 = 1; //1 byte
int var2 = 2; //1 byte
I want to combine these and encode as a 32bit unsigned integer (uint32_t). By combining them, it would be 2 bytes. I'd then fill the remaining space with 2 bytes of 0 padding. This is to write to a file, hence the need for this specific type of encoding.
So by combining the above example variables, the output I need is:
1200 //4 bytes
There's no need to go the roundabout way of "combining" the values into a uint32_t. Binary files are streams of bytes, so writing single bytes is very possible:
FILE * const out = fopen("myfile.bin", "wb");
const int val1 = 1;
const int val2 = 2;
if(out != NULL)
{
fputc(val1, out);
fputc(val2, out);
// Pad the file to four bytes, as originally requested. Not needed, though.
fputc(0, out);
fputc(0, out);
fclose(out);
}
This uses fputc() to write single bytes to the file. It takes an integer argument for the value to write, but treats it as unsigned char internally, which is essentially "a byte".
Reading back would be just as simple, using e.g. fgetc() to read out the two values, and of course checking for failure. You should check these writes too; I omitted the error handling for brevity.
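
A version with the error checks included might look like the sketch below. The function names write_pair and read_pair are made up for illustration, and the file is assumed to have been opened in binary mode by the caller.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: write two values as single bytes plus two zero
   padding bytes. Returns 0 on success, -1 if any write fails. */
int write_pair(FILE *out, int val1, int val2)
{
    assert(out != NULL);
    if (fputc(val1, out) == EOF || fputc(val2, out) == EOF ||
        fputc(0, out) == EOF || fputc(0, out) == EOF)
        return -1;
    return 0;
}

/* Hypothetical helper: read the two values back. Returns 0 on success,
   -1 if the file ends early or a read error occurs. */
int read_pair(FILE *in, int *val1, int *val2)
{
    int a, b;
    assert(in != NULL && val1 != NULL && val2 != NULL);
    a = fgetc(in);
    b = fgetc(in);
    if (a == EOF || b == EOF)
        return -1;
    *val1 = a;
    *val2 = b;
    return 0;
}
```

fgetc() returns an int so that EOF can be told apart from a valid byte, which is why the values are checked before being stored.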

How to convert from integer to unsigned char in C, given integers larger than 256?

As part of my CS course I've been given some functions to use. One of these functions takes a pointer to unsigned chars to write some data to a file (I have to use this function, so I can't just make my own purpose built function that works differently BTW). I need to write an array of integers whose values can be up to 4095 using this function (that only takes unsigned chars).
However am I right in thinking that an unsigned char can only have a max value of 256 because it is 1 byte long? I therefore need to use 4 unsigned chars for every integer? But casting doesn't seem to work with larger values for the integer. Does anyone have any idea how best to convert an array of integers to unsigned chars?
Usually an unsigned char holds 8 bits, with a max value of 255. If you want to know this for your particular compiler, print out CHAR_BIT and UCHAR_MAX from <limits.h>. You could extract the individual bytes of a 32-bit int:
#include <stdint.h>
void
pack32(uint32_t val,uint8_t *dest)
{
dest[0] = (val & 0xff000000) >> 24;
dest[1] = (val & 0x00ff0000) >> 16;
dest[2] = (val & 0x0000ff00) >> 8;
dest[3] = (val & 0x000000ff) ;
}
uint32_t
unpack32(uint8_t *src)
{
uint32_t val;
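val = (uint32_t) src[0] << 24; // cast first: src[0] would otherwise promote to signed int
val |= (uint32_t) src[1] << 16;
val |= (uint32_t) src[2] << 8;
val |= (uint32_t) src[3];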
return val;
}
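
Applied to the question, the same shift-and-mask idea can turn an array of ints into an array of unsigned chars. The sketch below uses two bytes per value, which is enough for values up to 4095. The names ints_to_bytes and int_from_bytes are hypothetical, and the high-byte-first order is an assumption; whatever reads the file back must use the same order.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: store each value (assumed to be in 0..65535) as two
   bytes, high byte first. dest must have room for 2 * count bytes. */
void ints_to_bytes(const int *values, size_t count, unsigned char *dest)
{
    assert(values != NULL && dest != NULL);
    for (size_t i = 0; i < count; i++) {
        unsigned int v = (unsigned int)values[i];
        dest[2 * i]     = (unsigned char)((v >> 8) & 0xFF);
        dest[2 * i + 1] = (unsigned char)(v & 0xFF);
    }
}

/* Hypothetical inverse: rebuild the i-th value from the byte buffer. */
int int_from_bytes(const unsigned char *src, size_t i)
{
    assert(src != NULL);
    return (int)(((unsigned int)src[2 * i] << 8) | src[2 * i + 1]);
}
```

The resulting buffer can then be handed to the function that expects a pointer to unsigned char.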
An unsigned char is 1 byte in size, therefore you can decompose any other type into an array of unsigned chars (e.g. for a 4-byte int you can use an array of 4 unsigned chars). Your exercise is probably about generics. You should write the file as a binary file using the fwrite() function, and just write byte after byte to the file.
The following example should write a number (of any data type) to the file. I am not sure if it works since you are forcing the cast to unsigned char * instead of void *.
int homework(unsigned char *foo, size_t size)
{
size_t written;
// open file for binary writing
FILE *f = fopen("work.txt", "wb");
if(f == NULL)
return 1;
// write all size bytes of the data to the file in one call
written = fwrite(foo, sizeof(char), size, f);
fclose(f);
// report failure if not every byte was written
return written == size ? 0 : 1;
}
I hope the given example at least gives you a starting point.
Yes, you're right; a char/byte typically holds 8 bits, so that is 2^8 distinct values, which is zero to 2^8 - 1, or zero to 255. Do something like this to get the bytes:
int x = 0;
unsigned char* p = (unsigned char*)&x;
for (size_t i = 0; i < sizeof(x); i++)
{
//Do something with p[i]
}
(Declaring i inside the for statement requires C99 or later, but it's more readable. :) )
Do note that this code may not be portable, since it depends on the processor's internal storage of an int.
If you have to write an array of integers then just convert the array into a pointer to char then run through the array.
int main()
{
int data[] = { 1, 2, 3, 4 ,5 };
size_t size = sizeof(data)/sizeof(data[0]); // Number of integers.
unsigned char* out = (unsigned char*)data;
for(size_t loop =0; loop < (size * sizeof(int)); ++loop)
{
MyProfSuperWrite(out + loop); // Write 1 unsigned char
}
}
Now, people have mentioned that 4095 will fit in fewer bits than a normal integer. Probably true. Thus you can save space and not write out the top bits of each integer. Personally I think this is not worth the effort. The extra code to write the value and process the incoming data is not worth the savings you would get (maybe if the data was the size of the Library of Congress). Rule one: do as little work as possible (it's easier to maintain). Rule two: optimize if asked (but ask why first). You may save space, but it will cost in processing time and maintenance.
The part of the assignment that says "integers whose values can be up to 4095 using this function (that only takes unsigned chars)" should be giving you a huge hint. 4095 unsigned is 12 bits.
You can store the 12 bits in a 16-bit short, but that is somewhat wasteful of space: you are only using 12 of the 16 bits of the short. Since you are dealing with more than 1 byte per value, you may need to deal with the endianness of the result. This is the easiest option.
You could also use a bit field or some packed binary structure if you are concerned about space. That is more work.
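
As a rough illustration of the packed option, two 12-bit values fit exactly into three bytes. The helper names pack12_pair and unpack12_pair and the bit layout are assumptions made for this sketch; any fixed layout works as long as the reader and the writer agree on it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: pack two values in 0..4095 into three bytes.
   Assumed layout: the 12 bits of a first, then the 12 bits of b. */
void pack12_pair(unsigned int a, unsigned int b, unsigned char out[3])
{
    assert(out != NULL && a <= 0xFFF && b <= 0xFFF);
    out[0] = (unsigned char)(a >> 4);                      /* top 8 bits of a */
    out[1] = (unsigned char)(((a & 0xF) << 4) | (b >> 8)); /* low 4 of a, top 4 of b */
    out[2] = (unsigned char)(b & 0xFF);                    /* low 8 bits of b */
}

/* Hypothetical inverse of pack12_pair. */
void unpack12_pair(const unsigned char in[3], unsigned int *a, unsigned int *b)
{
    assert(in != NULL && a != NULL && b != NULL);
    *a = ((unsigned int)in[0] << 4) | (in[1] >> 4);
    *b = (((unsigned int)in[1] & 0xF) << 8) | in[2];
}
```

This saves one byte per pair compared with two bytes per value, at the cost of the extra shifting on both the writing and the reading side.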
It sounds like what you really want to do is call sprintf to get a string representation of your integers. This is a standard way to convert from a numeric type to its string representation. Something like the following might get you started:
char num[5]; // Room for 4095
// Array is the array of integers, and arrayLen is its length
for (i = 0; i < arrayLen; i++)
{
sprintf (num, "%d", array[i]);
// Call your function that expects a pointer to chars
printfunc (num);
}
Without information on the function you are directed to use regarding its arguments, return value and semantics (i.e. the definition of its behaviour) it is hard to answer. One possibility is:
Given:
void theFunction(unsigned char* data, int size);
then
int array[SIZE_OF_ARRAY];
theFunction((unsigned char*)array, sizeof(array));
or
theFunction((unsigned char*)array, SIZE_OF_ARRAY * sizeof(*array));
or
theFunction((unsigned char*)array, SIZE_OF_ARRAY * sizeof(int));
All of which will pass all of the data to theFunction(), but whether that makes any sense will depend on what theFunction() does.

Decoding Binary via fget / buffer string (Trying to get mp3 header)

I'm writing some quick code to try and extract data from an mp3 file header.
The objective is to extract information from the header such as the bitrate and other vital information so that I can appropriately stream the file to a mp3decoder with the necessary arguments.
Here is a wikipedia image showing the mp3header information:
http://upload.wikimedia.org/wikipedia/commons/0/01/Mp3filestructure.svg
My question is, am I attacking this correctly? Printing the data received is worthless -- I just get a bunch of random characters. I need to get to the binary so that I can decode it and determine vital information.
Here is my baseline code:
// mp3 Header File IO.cpp : Defines the entry point for the console application.
//
#include "stdafx.h"
#include "stdio.h"
#include "string.h"
#include "stdlib.h"
// Main function
int main (void)
{
// Declare variables
FILE *mp3file;
char *mp3syncword; // we will need to allocate memory to this!!
char requestedFile[255] = "";
unsigned long fileLength;
// Counters
int i;
// Memory allocation with malloc
mp3syncword=(char *)malloc(2000);
// Let's get the name of the requested file (hard-coded for now)
strcpy(requestedFile,"testmp3.mp3");
// Open the file with mode read, binary
mp3file = fopen(requestedFile, "rb");
if (!mp3file){
// If we can't find the file, notify the user of the problem
printf("Not found!");
}
// Let's get some header data from the file
fseek(mp3file,1,SEEK_SET);
fread(mp3syncword,32,1,mp3file);
// For debug purposes, lets print the received data
for(i = 0; i < 32; ++i)
printf("%c", ((char *)mp3syncword)[i]);
return 0;
}
Help appreciated.
You are printing the bytes out using %c as the format specifier. You need to use an unsigned numeric format specifier (e.g. %u for a decimal number or %x or %X for hexadecimal) to print the byte values.
You should also declare your byte arrays as unsigned char, since plain char is signed by default with Visual C++ (and with most other compilers on x86).
You might also want to print out a space (or other separator) after each byte value to make the output clearer.
The standard printf does not provide a binary representation type specifier. Some implementations do have this but the version supplied with Visual Studio does not. In order to output this you will need to perform bit operations on the number to extract the individual bits and print each of them in turn for each byte. For example:
unsigned char byte = 0; // Replace 0 with the byte read from the file
unsigned char mask = 1; // Bit mask
unsigned char bits[8];
// Extract the bits
for (int i = 0; i < 8; i++) {
// Mask each bit in the byte and store it
bits[i] = (byte & (mask << i)) >> i;
}
// The bits array now contains eight 1 or 0 values
// bits[0] contains the least significant bit
// bits[7] contains the most significant bit
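
If the goal is simply to look at the bits, the same extraction can fill a printable string instead of an array of 0/1 values. This is a minimal sketch; the function name byte_to_bit_string is made up, and it puts the most significant bit first because that is how binary numbers are usually written.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: write the 8 bits of byte into out as '0'/'1'
   characters, most significant bit first, followed by a terminating NUL. */
void byte_to_bit_string(unsigned char byte, char out[9])
{
    assert(out != NULL);
    for (int i = 0; i < 8; i++)
        out[i] = ((byte >> (7 - i)) & 1u) ? '1' : '0';
    out[8] = '\0';
}
```

The result can be printed with printf("%s ", out) for each byte read from the file.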
C does not have a printf() specifier to print in binary. Most people print in hex instead, which will give you (typically) eight bits at a time:
printf("the first eight bits are %02x\n", (unsigned char) mp3syncword[0]);
You will need to interpret this manually to figure out the values of individual bits. The cast to unsigned char on the argument is to avoid surprises if it's negative.
To test bits, you can use the & operator together with the bitwise left shift operator, <<:
if(mp3syncword[2] & (1 << 2))
{
/* The third bit from the right of the third byte was set. */
}
If you want to be able to use "big" (larger than 7) indexes for bits, i.e. treat the data as a 32-bit word, it might be good to read it into e.g. an unsigned int, and then inspect that. Be careful with endian-ness when you do this reading, however.
Warning: there are probably errors with memory layout and/or endianness with this approach. It is not guaranteed that the struct members match the same bits from computer to computer.
In short: don't rely on this (I'll leave the answer, it might be useful for something else)
You can define a struct with bit fields:
struct MP3Header {
unsigned SyncWord : 12;
unsigned Version : 1;
unsigned Layer : 2;
unsigned ErrorProtection : 1;
unsigned BitRate : 4;
unsigned Frequency : 2;
unsigned PadBit : 1;
unsigned PrivBit : 1;
unsigned Mode : 2;
unsigned ModeExtension : 2;
unsigned Copy : 1;
unsigned Original : 1;
unsigned Emphasis : 2;
};
and then use each member as an isolated value:
struct MP3Header h;
/* ... */
fread(&h, sizeof h, 1, mp3file); /* error check!! */
printf("Frequency: %u\n", h.Frequency);
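
A more portable alternative to the bit-field struct is to assemble the four header bytes into one 32-bit value and extract each field with shifts and masks. The sketch below takes its field widths from the struct above. The names Mp3HeaderFields and parse_mp3_header are hypothetical, and it assumes the fields are stored most significant bit first in the order the struct lists them, which should be checked against the format description.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decoded fields; widths follow the bit-field struct (12+1+2+1+4+2+1+1+2+2+1+1+2). */
struct Mp3HeaderFields {
    unsigned sync_word, version, layer, error_protection, bit_rate, frequency;
    unsigned pad_bit, priv_bit, mode, mode_extension, copy, original, emphasis;
};

/* Hypothetical helper: decode four header bytes, assuming the first byte in
   the file holds the most significant bits. */
struct Mp3HeaderFields parse_mp3_header(const unsigned char b[4])
{
    struct Mp3HeaderFields f;
    uint32_t h;
    assert(b != NULL);
    h = ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
        ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
    f.sync_word        = (h >> 20) & 0xFFF;
    f.version          = (h >> 19) & 0x1;
    f.layer            = (h >> 17) & 0x3;
    f.error_protection = (h >> 16) & 0x1;
    f.bit_rate         = (h >> 12) & 0xF;
    f.frequency        = (h >> 10) & 0x3;
    f.pad_bit          = (h >> 9)  & 0x1;
    f.priv_bit         = (h >> 8)  & 0x1;
    f.mode             = (h >> 6)  & 0x3;
    f.mode_extension   = (h >> 4)  & 0x3;
    f.copy             = (h >> 3)  & 0x1;
    f.original         = (h >> 2)  & 0x1;
    f.emphasis         =  h        & 0x3;
    return f;
}
```

The four bytes would come from something like fread(buf, 1, 4, mp3file) into an unsigned char buf[4], positioned at the start of a frame. Because the bytes are combined explicitly, the result does not depend on the machine's byte order or on how the compiler lays out bit fields.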
