Let's say I have 2 variables:
int var1 = 1; //1 byte
int var2 = 2; //1 byte
I want to combine these and encode as a 32bit unsigned integer (uint32_t). By combining them, it would be 2 bytes. I'd then fill the remaining space with 2 bytes of 0 padding. This is to write to a file, hence the need for this specific type of encoding.
So by combining the above example variables, the output I need is:
1200 //4 bytes
There's no need to go the roundabout way of "combining" the values into an uint32_t. Binary files are streams of bytes, so writing single bytes is very possible:
FILE * const out = fopen("myfile.bin", "wb");
const int val1 = 1;
const int val2 = 2;
if(out != NULL)
{
fputc(val1, out);
fputc(val2, out);
// Pad the file to four bytes, as originally requested. Not needed, though.
fputc(0, out);
fputc(0, out);
fclose(out);
}
This uses fputc() to write single bytes to the file. It takes an integer argument for the value to write, but treats it as unsigned char internally, which is essentially "a byte".
Reading back would be just as simple, using e.g. fgetc() to read out the two values, and of course checking for failure. You should check these writes too; I omitted the checks to keep the example short.
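To make the error checking concrete, here is a minimal sketch of the write and the read-back with every call checked. The helper names write_pair and read_pair are made up for this illustration and are not part of any library.

```c
#include <stdio.h>

/* Hypothetical helper: writes two byte values followed by two zero pad
 * bytes. Returns 0 on success, -1 on any failure. */
int write_pair(const char *path, int val1, int val2)
{
    FILE *out = fopen(path, "wb");
    if (out == NULL)
        return -1;

    int ok = fputc(val1, out) != EOF
          && fputc(val2, out) != EOF
          && fputc(0, out) != EOF
          && fputc(0, out) != EOF;

    /* fclose() flushes buffered data, so its result matters too. */
    if (fclose(out) != 0)
        ok = 0;
    return ok ? 0 : -1;
}

/* Hypothetical helper: reads the first two bytes back into *val1 and *val2.
 * Returns 0 on success, -1 if the file is missing or too short. */
int read_pair(const char *path, int *val1, int *val2)
{
    FILE *in = fopen(path, "rb");
    if (in == NULL)
        return -1;

    const int a = fgetc(in);
    const int b = fgetc(in);
    fclose(in);

    if (a == EOF || b == EOF)
        return -1;
    *val1 = a;
    *val2 = b;
    return 0;
}
```

fgetc() returns an int so that EOF can be told apart from a valid byte value, which is why the results are kept in int variables until they have been checked.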
So I'm working with system calls in Linux. I'm using "lseek" to navigate through the file and "read" to read. I'm also using Midnight Commander to see the file in hexadecimal. The next 4 bytes I have to read are in little-endian, and look like this: "2A 00 00 00". But of course, the bytes can be something like "2A 5F B3 00". I have to convert those bytes to an integer. How do I approach this? My initial thought was to read them into a vector of 4 chars, and then to build my integer from there, but I don't know how. Any ideas?
Let me give you an example of what I've tried. I have the following bytes in file "44 00". I have to convert that into the value 68 (4 + 4*16):
char value[2];
read(fd, value, 2);
int i = (value[0] << 8) | value[1];
The variable i is 17408 instead of 68.
UPDATE: Never mind, I solved it. I mixed up the indexes when shifting. It should have been value[1] << 8 ... | value[0]
General considerations
There seem to be several pieces to the question -- at least how to read the data, what data type to use to hold the intermediate result, and how to perform the conversion. If indeed you are assuming that the on-file representation consists of the bytes of a 32-bit integer in little-endian order, with all bits significant, then I probably would not use a char[] as the intermediate, but rather a uint32_t or an int32_t. If you know or assume that the endianness of the data is the same as the machine's native endianness, then you don't need any other.
Determining native endianness
If you need to compute the host machine's native endianness, then this will do it:
static const uint32_t test = 1;
_Bool host_is_little_endian = *(char *)&test;
It is worthwhile doing that, because it may well be the case that you don't need to do any conversion at all.
Reading the data
I would read the data into a uint32_t (or possibly an int32_t), not into a char array. Possibly I would read it into an array of uint8_t.
uint32_t data;
size_t num_read = fread(&data, 4, 1, my_file); /* fread returns a size_t item count */
if (num_read != 1) { /* ... handle error ... */ }
Converting the data
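It is worthwhile knowing whether the on-file representation matches the host's endianness, because if it does, you don't need to do any transformation (that is, you're done at this point in that case). If you do need to swap endianness, note that ntohl() and htonl() will not help here: they convert between network (big-endian) order and host order, so on a big-endian host they leave the value unchanged. Reverse the four bytes explicitly instead:
if (!host_is_little_endian) {
    /* Host is big-endian but the file is little-endian: reverse the bytes. */
    data = ((data & 0x000000FFu) << 24) | ((data & 0x0000FF00u) << 8)
         | ((data & 0x00FF0000u) >> 8)  | ((data & 0xFF000000u) >> 24);
}
(This assumes that little- and big-endian are the only host byte orders you need to be concerned with. Historically, there have been others, which is why functions such as ntohl() and htonl() come in pairs, but you are extremely unlikely ever to see one of the others.)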
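An alternative that sidesteps the host-endianness question entirely is to read the four bytes into an unsigned char array and assemble the value with shifts. The sketch below assumes the file stores the least significant byte first, as in the question; le32_decode is a name chosen for this example.

```c
#include <stdint.h>

/* Assemble a 32-bit value from four bytes stored least significant first.
 * The result is the same on little- and big-endian hosts, because shifts
 * operate on values, not on memory layout. */
uint32_t le32_decode(const unsigned char b[4])
{
    return (uint32_t)b[0]
         | ((uint32_t)b[1] << 8)
         | ((uint32_t)b[2] << 16)
         | ((uint32_t)b[3] << 24);
}
```

With the bytes "2A 00 00 00" from the question this yields 42, and with "2A 5F B3 00" it yields 0x00B35F2A. The casts to uint32_t happen before the shifts so that no byte is shifted into the sign bit of a plain int.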
Signed integers
If you need a signed instead of unsigned integer, then you can do the same, but use a union:
union {
    uint32_t u; /* "unsigned" and "signed" are keywords, so they cannot be member names */
    int32_t s;
} data;
In all of the preceding, use data.u in place of plain data, and at the end, read out the signed result from data.s.
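Put together, the signed case could look like the following sketch. It assumes little-endian file data and assembles the unsigned value with shifts before reinterpreting it through a union; the names word32 and le32_decode_signed are invented for the example.

```c
#include <stdint.h>

/* Two views of the same 32 bits. */
union word32 {
    uint32_t u;
    int32_t  s;
};

/* Decode four little-endian bytes as a signed 32-bit integer. */
int32_t le32_decode_signed(const unsigned char b[4])
{
    union word32 w;
    w.u = (uint32_t)b[0]
        | ((uint32_t)b[1] << 8)
        | ((uint32_t)b[2] << 16)
        | ((uint32_t)b[3] << 24);
    /* int32_t is required to be two's complement, so reading the other
     * member reinterprets the same bit pattern as a signed value. */
    return w.s;
}
```

For example, the bytes FF FF FF FF decode to -1 and FE FF FF FF decode to -2.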
Suppose you point into your buffer:
unsigned char *p = &buf[20];
and you want to see the next 4 bytes as an integer and assign them to your integer, then you can cast it:
int i;
i = *(int *)p;
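The cast tells the compiler to treat p as a pointer to an int; dereferencing it reads the next sizeof(int) bytes and assigns them to i. Be aware that this is only well defined if p is suitably aligned for an int, and that it formally violates C's aliasing rules, so copying the bytes with memcpy() is the safer form.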
However, this depends on the endianness of your platform. If your platform has a different endianness, you may first have to reverse-copy the bytes to a small buffer and then use this technique. For example:
unsigned char ibuf[4];
for (i=3; i>=0; i--) ibuf[i]= *p++;
i = *(int *)ibuf;
EDIT
The suggestions and comments of Andrew Henle and Bodo could give:
unsigned char *p = &buf[20];
int i, j;
unsigned char *pi= (unsigned char *)&i; // cast the address; the address of a cast result cannot be taken
for (j=3; j>=0; j--) *pi++= *p++;
// and the other endian:
int i, j;
unsigned char *pi= ((unsigned char *)&i)+3;
for (j=3; j>=0; j--) *pi--= *p++;
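The same two cases can also be written with memcpy(), which avoids any alignment or aliasing concerns when reading from a byte buffer. This is a sketch with made-up helper names; it works for whatever sizeof(int) is on the platform.

```c
#include <string.h>

/* Copy sizeof(int) bytes from p into an int, in the host's byte order.
 * p does not need to be aligned. */
int load_int_native(const unsigned char *p)
{
    int value;
    memcpy(&value, p, sizeof value);
    return value;
}

/* Same, but for data stored in the opposite byte order to the host:
 * reverse the bytes into a temporary buffer first. */
int load_int_swapped(const unsigned char *p)
{
    unsigned char tmp[sizeof(int)];
    for (size_t k = 0; k < sizeof tmp; k++)
        tmp[k] = p[sizeof tmp - 1 - k];
    return load_int_native(tmp);
}
```

Which of the two to call depends on whether the buffer's byte order matches the host's, which has to be known or detected separately.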
I have one encrypted file named encrypt.
I calculated a CRC-16 for this file and stored the result in an unsigned short, which is 2 bytes (16 bits).
Now I want to append the 2 bytes of the CRC value to the end of this file, and later read those last 2 bytes back from the file and compare the CRC. How can I achieve this?
I used this code
fseek(readFile, filesize, SEEK_SET);
fprintf(readFile,"%u",result);
Here filesize is the size of my original encrypted file. After seeking, I write result, which is an unsigned short, but 5 bytes end up in the file.
File content after this:
testsgh
30549
The original file data is testsgh, and the CRC has been written as the five-character decimal string shown above. I want to store this value in 2 bytes instead. How can I do that?
You should open the file in binary append mode:
FILE *out = fopen("myfile.bin", "ab");
This will eliminate the need to seek to the end.
Then, you need to use a direct write, not a print which converts the value to a string and writes the string. You want to write the bits of your unsigned short checksum:
const size_t wrote = fwrite(&checksum, sizeof checksum, 1, out);
This succeeded if and only if the value of wrote is 1.
However, please note that this risks introducing endianness errors, since it writes the value using your machine's local byte order. To be on the safe side, it's cleaner to decide on a byte ordering and implement it directly. For big-endian:
const unsigned char check_bytes[2] = { checksum >> 8, checksum & 255 };
const size_t wrote = fwrite(check_bytes, sizeof check_bytes, 1, out);
Again, we expect wrote to be 1 after the call to indicate that both bytes were successfully written.
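For the reading half of the question, a minimal sketch of appending the checksum and fetching it back from the last two bytes might look like this. It assumes the big-endian layout used above; the function names are invented for the example.

```c
#include <stdio.h>

/* Append a 16-bit checksum to the file at path, high byte first.
 * Returns 0 on success, -1 on failure. */
int append_crc_be(const char *path, unsigned short checksum)
{
    FILE *out = fopen(path, "ab");
    if (out == NULL)
        return -1;
    const unsigned char bytes[2] = {
        (unsigned char)((checksum >> 8) & 0xFF),
        (unsigned char)(checksum & 0xFF)
    };
    const size_t wrote = fwrite(bytes, sizeof bytes, 1, out);
    if (fclose(out) != 0 || wrote != 1)
        return -1;
    return 0;
}

/* Read the last two bytes of the file as a big-endian 16-bit value.
 * Returns 0 on success, -1 on failure (including files under 2 bytes). */
int read_trailing_crc_be(const char *path, unsigned short *checksum)
{
    FILE *in = fopen(path, "rb");
    if (in == NULL)
        return -1;
    unsigned char bytes[2];
    const int ok = fseek(in, -2L, SEEK_END) == 0
                && fread(bytes, sizeof bytes, 1, in) == 1;
    fclose(in);
    if (!ok)
        return -1;
    *checksum = (unsigned short)((bytes[0] << 8) | bytes[1]);
    return 0;
}
```

When verifying, the CRC should be recomputed over the data bytes only (the file size minus the two trailer bytes) and then compared with the value returned here.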
Use fwrite(), not fprintf. I don't have access to a C compiler atm but fwrite(&result, sizeof(result), 1, readFile); should work.
You could do something like this:
unsigned char c1, c2;
c1 = (unsigned char)(result >> 8);
c2 = (unsigned char)(result & 0xFF); /* low byte; shifting left and right again would not clear the high bits after integer promotion */
and then append c1 and c2 at the end of the file. When you read the file back, just do the opposite:
result = ( (unsigned)c1 << 8 ) + (unsigned)c2;
Hope that helps.
You can write single characters with %c formatting (this writes the low byte first), e.g.
fprintf(readFile, "%c%c", result % 256, result / 256);
btw: readfile is misleading, when you write to it :-)
I am having trouble reading a specific integer from a file and I am not sure why. First I read through the entire file to find out how big it is, and then I reset the pointer to the beginning. I then read 3 16-byte blocks of data. Then 1 20-byte block and then I would like to read 1 byte at the end as an integer. However, I had to write into the file as a character but I do not think that should be a problem. My issue is that when I read it out of the file instead of being the integer value of 15 it is 49. I checked in the ASCII table and it is not the hex or octal value of 1 or 5. I am thoroughly confused because my read statement is read(inF, pad, 1) which I believe is right. I do know that an integer variable is 4 bytes however, there is only one byte of data left in the file so I read in only the last byte.
My code, reduced to the relevant part (it seems like a lot, but I don't think it is):
#include<math.h>
#include<stdio.h>
#include<string.h>
#include <fcntl.h>
int main(int argc, char** argv)
{
char x;
int y;
int bytes = 0;
int num = 0;
int count = 0;
num = open ("a_file", O_RDONLY);
bytes = read(num, y, 1);
printf("y %d\n", y);
return 0;
}
To sum up my question, how come when I read the byte that stores 15 from the text file, I can't view it as 15 from the integer representation?
Any help would be very appreciated.
Thanks!
You're reading only the first byte of an int (4 bytes) and then printing the whole int. If you want to read a single byte, you also need to store and print it as a single byte, like this:
signed char temp; // one-byte signed integer (plain char may be signed or unsigned)
read(fd, &temp, 1); // read the integer from file
printf("%hhd\n", temp); // print one-byte signed integer
Or, you can use regular int:
int temp; // four byte signed integer
read(fd, &temp, 4); // read it from file
printf("%d\n", temp); // print four-byte signed integer
Note that this will work only on platforms with 32-bit integers, and also depends on platform's byte order.
What you're doing is:
int temp; // four byte signed integer
read(fd, &temp, 1); // read one byte from file into the integer
// now first byte of four is from the file,
// and the other three contain undefined garbage
printf("%d\n", temp); // print contents of mostly uninitialized memory
The read system call is declared as:
ssize_t read(int fd, void* buf, size_t count);
So you should pass the address of the int variable you want to read into, i.e. use
bytes = read(num, &y, 1);
Based on the read call, I believe it stores the byte from the file in the first byte (in memory order) of the 4-byte integer, which is the lowest-order byte only on a little-endian machine. Whatever was in the other 3 bytes of pad stays there, so the result is only predictable if you initialized pad to zero beforehand. I would read one byte into a char and then convert it to an integer (if you need a 4-byte integer for some reason), as shown below:
/* declare at the top of the program */
char temp;
/* Note line to replace read(inF,pad,1) */
read(inF,&temp,1);
/* Added to cast the value read in to an integer high order bit may be propagated to make a negative number */
pad = (int) temp;
/* Mask off the high order bits */
pad &= 0x000000FF;
Otherwise, you could change your declaration to be an unsigned char which would take care of the other 3 bytes.
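The unsigned char variant mentioned above can be sketched as follows. read_byte_as_int is a hypothetical helper; it uses the same POSIX read() call as the question and widens the byte only after the read has succeeded.

```c
#include <unistd.h>

/* Read one byte from fd and widen it to an int in the range 0..255.
 * Returns 1 on success, 0 at end of file, -1 on error. */
int read_byte_as_int(int fd, int *value)
{
    unsigned char byte;
    const ssize_t n = read(fd, &byte, 1);
    if (n == 1)
        *value = byte;   /* unsigned char, so no sign extension */
    return (int)n;
}
```

As a side note, 49 is the character code of '1' in ASCII, which suggests the file may hold the text "15" rather than a single byte with the value 15. In that case the byte has to be written in binary form, or the text has to be parsed, before 15 can come back out.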
As part of my CS course I've been given some functions to use. One of these functions takes a pointer to unsigned chars to write some data to a file (I have to use this function, so I can't just make my own purpose built function that works differently BTW). I need to write an array of integers whose values can be up to 4095 using this function (that only takes unsigned chars).
However am I right in thinking that an unsigned char can only have a max value of 256 because it is 1 byte long? I therefore need to use 4 unsigned chars for every integer? But casting doesn't seem to work with larger values for the integer. Does anyone have any idea how best to convert an array of integers to unsigned chars?
Usually an unsigned char holds 8 bits, with a max value of 255. If you want to know this for your particular compiler, print out CHAR_BIT and UCHAR_MAX from <limits.h>. You could extract the individual bytes of a 32-bit int:
#include <stdint.h>
void
pack32(uint32_t val,uint8_t *dest)
{
dest[0] = (val & 0xff000000) >> 24;
dest[1] = (val & 0x00ff0000) >> 16;
dest[2] = (val & 0x0000ff00) >> 8;
dest[3] = (val & 0x000000ff) ;
}
uint32_t
unpack32(uint8_t *src)
{
uint32_t val;
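/* Cast before shifting: a uint8_t is promoted to int, and shifting a
   value of 128 or more left by 24 would overflow a signed int. */
val = (uint32_t)src[0] << 24;
val |= (uint32_t)src[1] << 16;
val |= (uint32_t)src[2] << 8;
val |= (uint32_t)src[3] ;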
return val;
}
An unsigned char has a size of 1 byte, so you can decompose any other type into an array of unsigned chars (e.g. for a 4-byte int you can use an array of 4 unsigned chars). Your exercise is probably about generics. You should write the file as a binary file using the fwrite() function, and just write byte after byte to the file.
The following example should write a number (of any data type) to the file. I am not sure if it works since you are forcing the cast to unsigned char * instead of void *.
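int homework(unsigned char *foo, size_t size)
{
    // open file for binary writing
    FILE *f = fopen("work.txt", "wb");
    if(f == NULL)
        return 1;
    // write all size bytes, starting at foo; fwrite returns the number of items written
    const size_t written = fwrite(foo, sizeof(char), size, f);
    fclose(f);
    // report failure if fewer bytes were written than requested
    return written == size ? 0 : 1;
}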
I hope the given example at least gives you a starting point.
Yes, you're right; a char/byte only allows up to 8 distinct bits, so that is 2^8 distinct numbers, which is zero to 2^8 - 1, or zero to 255. Do something like this to get the bytes:
int x = 0;
char* p = (char*)&x;
for (int i = 0; i < sizeof(x); i++)
{
//Do something with p[i]
}
(Declaring i inside the for statement requires C99 or later; in older C, declare it at the top of the block instead.)
Do note that this code may not be portable, since it depends on the processor's internal storage of an int.
If you have to write an array of integers then just convert the array into a pointer to char then run through the array.
int main()
{
int data[] = { 1, 2, 3, 4 ,5 };
size_t size = sizeof(data)/sizeof(data[0]); // Number of integers.
unsigned char* out = (unsigned char*)data;
for(size_t loop =0; loop < (size * sizeof(int)); ++loop)
{
MyProfSuperWrite(out + loop); // Write 1 unsigned char
}
}
Now people have mentioned that values up to 4095 fit in fewer bits than a normal integer. That is true. Thus you can save space and not write out the top bits of each integer. Personally I think this is not worth the effort. The extra code to write the values and process the incoming data is not worth the savings you would get (maybe if the data were the size of the Library of Congress). Rule one: do as little work as possible (it's easier to maintain). Rule two: optimize if asked (but ask why first). You may save space, but it will cost processing time and maintenance effort.
This part of the assignment, "integers whose values can be up to 4095 using this function (that only takes unsigned chars)", should be giving you a huge hint. 4095 unsigned is 12 bits.
You can store the 12 bits in a 16-bit short, but that is somewhat wasteful of space: you are only using 12 of the 16 bits of the short. Since each value then spans more than 1 byte, you may need to deal with the endianness of the result. This is the easiest option.
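A sketch of the 16-bit approach, with the byte order fixed explicitly so that endianness stops being a concern. The function names are made up, and the choice of high byte first is an assumption; any order works as long as reader and writer agree.

```c
#include <stddef.h>

/* Store each value (expected range 0..4095) as two bytes, high byte first.
 * out must have room for 2 * count bytes. */
void ints_to_bytes_be16(const int *values, size_t count, unsigned char *out)
{
    for (size_t k = 0; k < count; k++) {
        const unsigned v = (unsigned)values[k] & 0xFFFFu;
        out[2 * k]     = (unsigned char)(v >> 8);
        out[2 * k + 1] = (unsigned char)(v & 0xFFu);
    }
}

/* Inverse of ints_to_bytes_be16: rebuild count values from 2 * count bytes. */
void bytes_be16_to_ints(const unsigned char *in, size_t count, int *values)
{
    for (size_t k = 0; k < count; k++)
        values[k] = (in[2 * k] << 8) | in[2 * k + 1];
}
```

The resulting unsigned char buffer can then be handed to the function from the assignment.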
You could also use a bit field or some packed binary structure if you are concerned about space. That is more work.
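One possible packed layout, sketched here with shifts and masks rather than bit fields, puts two 12-bit values into three bytes. The layout (first value in the high bits) and the function names are choices made for this example only.

```c
/* Pack two values in the range 0..4095 into three bytes:
 * out[0] = top 8 bits of a, out[1] = low 4 bits of a and top 4 bits of b,
 * out[2] = low 8 bits of b. */
void pack12_pair(unsigned a, unsigned b, unsigned char out[3])
{
    out[0] = (unsigned char)((a >> 4) & 0xFFu);
    out[1] = (unsigned char)(((a & 0x0Fu) << 4) | ((b >> 8) & 0x0Fu));
    out[2] = (unsigned char)(b & 0xFFu);
}

/* Inverse of pack12_pair. */
void unpack12_pair(const unsigned char in[3], unsigned *a, unsigned *b)
{
    *a = ((unsigned)in[0] << 4) | ((unsigned)in[1] >> 4);
    *b = (((unsigned)in[1] & 0x0Fu) << 8) | (unsigned)in[2];
}
```

For instance, packing 0x123 and 0x456 produces the bytes 12 34 56. An array with an odd number of values needs a rule for the last, unpaired one, such as pairing it with zero.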
It sounds like what you really want to do is call sprintf to get a string representation of your integers. This is a standard way to convert from a numeric type to its string representation. Something like the following might get you started:
char num[5]; // Room for 4095
// Array is the array of integers, and arrayLen is its length
for (i = 0; i < arrayLen; i++)
{
sprintf (num, "%d", array[i]);
// Call your function that expects a pointer to chars
printfunc (num);
}
Without information on the function you are directed to use regarding its arguments, return value and semantics (i.e. the definition of its behaviour) it is hard to answer. One possibility is:
Given:
void theFunction(unsigned char* data, int size);
then
int array[SIZE_OF_ARRAY];
theFunction((unsigned char*)array, sizeof(array));
or
theFunction((unsigned char*)array, SIZE_OF_ARRAY * sizeof(*array));
or
theFunction((unsigned char*)array, SIZE_OF_ARRAY * sizeof(int));
All of which will pass all of the data to theFunction(), but whether that makes any sense will depend on what theFunction() does.
I'm writing some quick code to try and extract data from an mp3 file header.
The objective is to extract information from the header such as the bitrate and other vital information so that I can appropriately stream the file to a mp3decoder with the necessary arguments.
Here is a wikipedia image showing the mp3header information:
http://upload.wikimedia.org/wikipedia/commons/0/01/Mp3filestructure.svg
My question is, am I attacking this correctly? Printing the data received is worthless -- I just get a bunch of random characters. I need to get to the binary so that I can decode it and determine vital information.
Here is my baseline code:
// mp3 Header File IO.cpp : Defines the entry point for the console application.
//
#include "stdafx.h"
#include "stdio.h"
#include "string.h"
#include "stdlib.h"
// Main function
int main (void)
{
// Declare variables
FILE *mp3file;
char *mp3syncword; // we will need to allocate memory to this!!
char requestedFile[255] = "";
unsigned long fileLength;
// Counters
int i;
// Memory allocation with malloc
mp3syncword=(char *)malloc(2000);
// Let's get the name of the requested file (hard-coded for now)
strcpy(requestedFile,"testmp3.mp3");
// Open the file with mode read, binary
mp3file = fopen(requestedFile, "rb");
if (!mp3file){
// If we can't find the file, notify the user of the problem
printf("Not found!");
}
// Let's get some header data from the file
fseek(mp3file,1,SEEK_SET);
fread(mp3syncword,32,1,mp3file);
// For debug purposes, lets print the received data
for(i = 0; i < 32; ++i)
printf("%c", ((char *)mp3syncword)[i]);
return 0;
}
Help appreciated.
You are printing the bytes out using %c as the format specifier. You need to use an unsigned numeric format specifier (e.g. %u for a decimal number or %x or %X for hexadecimal) to print the byte values.
You should also declare your byte arrays as unsigned char, because plain char is signed by default on Windows.
You might also want to print out a space (or other separator) after each byte value to make the output clearer.
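As a sketch of that advice, the following helper formats a run of bytes as space-separated two-digit hex values. The name format_hex_bytes and the output layout are choices made for this example.

```c
#include <stdio.h>

/* Format count bytes as two-digit hex values separated by spaces.
 * Returns 0 on success, -1 if out is too small. */
int format_hex_bytes(const unsigned char *data, size_t count,
                     char *out, size_t out_size)
{
    size_t used = 0;
    if (out_size == 0)
        return -1;
    out[0] = '\0';
    for (size_t k = 0; k < count; k++) {
        /* Every value after the first is preceded by a separator. */
        const int n = snprintf(out + used, out_size - used,
                               k == 0 ? "%02x" : " %02x", (unsigned)data[k]);
        if (n < 0 || (size_t)n >= out_size - used)
            return -1;
        used += (size_t)n;
    }
    return 0;
}
```

Called on the four bytes FF FB 90 64 with a large enough buffer, it produces the string "ff fb 90 64", which can then be passed to printf("%s\n", ...).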
Before C23, the standard printf did not provide a binary conversion specifier. Some implementations do have one, but the version supplied with Visual Studio does not. In order to output this you will need to perform bit operations on the number to extract the individual bits and print each of them in turn for each byte. For example:
unsigned char byte = 0; // Replace 0 with the byte read from the file
unsigned char mask = 1; // Bit mask
unsigned char bits[8];
// Extract the bits
for (int i = 0; i < 8; i++) {
// Mask each bit in the byte and store it
bits[i] = (byte & (mask << i)) >> i;
}
// The bits array now contains eight 1 or 0 values
// bits[0] contains the least significant bit
// bits[7] contains the most significant bit
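To turn those bits into something printable, one option is to build a string of '0' and '1' characters, most significant bit first. This is a small sketch; byte_to_binary is a made-up name and it assumes the usual 8-bit byte.

```c
/* Write the eight bits of byte into out as '0'/'1' characters,
 * most significant bit first, followed by a terminating NUL. */
void byte_to_binary(unsigned char byte, char out[9])
{
    for (int i = 0; i < 8; i++)
        out[i] = ((byte >> (7 - i)) & 1u) ? '1' : '0';
    out[8] = '\0';
}
```

For example, the byte 0xA5 becomes "10100101", which can be printed with the %s specifier.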
C does not have a printf() specifier to print in binary (C23 adds %b, but support varies). Most people print in hex instead, which will give you (typically) eight bits at a time:
printf("the first eight bits are %02x\n", (unsigned char) mp3syncword[0]);
You will need to interpret this manually to figure out the values of individual bits. The cast to unsigned char on the argument is to avoid surprises if it's negative.
To test bits, you can use the & operator together with the bitwise left shift operator, <<:
if(mp3syncword[2] & (1 << 2))
{
/* The third bit from the right of the third byte was set. */
}
If you want to be able to use "big" (larger than 7) indexes for bits, i.e. treat the data as a 32-bit word, it might be good to read it into e.g. an unsigned int, and then inspect that. Be careful with endian-ness when you do this reading, however.
Warning: there are probably errors with memory layout and/or endianness with this approach. It is not guaranteed that the struct members match the same bits from computer to computer.
In short: don't rely on this (I'll leave the answer, it might be useful for something else)
You can define a struct with bit fields:
struct MP3Header {
unsigned SyncWord : 12;
unsigned Version : 1;
unsigned Layer : 2;
unsigned ErrorProtection : 1;
unsigned BitRate : 4;
unsigned Frequency : 2;
unsigned PadBit : 1;
unsigned PrivBit : 1;
unsigned Mode : 2;
unsigned ModeExtension : 2;
unsigned Copy : 1;
unsigned Original : 1;
unsigned Emphasis : 2;
};
and then use each member as an isolated value:
struct MP3Header h;
/* ... */
fread(&h, sizeof h, 1, mp3file); /* error check!! */
printf("Frequency: %u\n", h.Frequency);
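A more portable route to the same fields is to read the four header bytes into an unsigned char array and pull each field out with shifts and masks. The sketch below copies the field widths from the bit-field struct above and assumes the header is stored most significant byte first; both should be checked against the format description before relying on them. The type and function names are invented for the example.

```c
#include <stdint.h>

/* Field widths copied from the bit-field struct above; treat them as an
 * assumption about the header layout rather than a verified specification. */
struct mp3_header_fields {
    unsigned sync_word, version, layer, error_protection;
    unsigned bit_rate, frequency, pad_bit, priv_bit;
    unsigned mode, mode_extension, copy, original, emphasis;
};

struct mp3_header_fields decode_header(const unsigned char raw[4])
{
    /* Assemble the 32-bit word with the first byte in the top bits. */
    const uint32_t w = ((uint32_t)raw[0] << 24) | ((uint32_t)raw[1] << 16)
                     | ((uint32_t)raw[2] << 8)  |  (uint32_t)raw[3];
    struct mp3_header_fields h;
    h.sync_word        = (w >> 20) & 0xFFFu; /* 12 bits */
    h.version          = (w >> 19) & 0x1u;
    h.layer            = (w >> 17) & 0x3u;
    h.error_protection = (w >> 16) & 0x1u;
    h.bit_rate         = (w >> 12) & 0xFu;
    h.frequency        = (w >> 10) & 0x3u;
    h.pad_bit          = (w >> 9)  & 0x1u;
    h.priv_bit         = (w >> 8)  & 0x1u;
    h.mode             = (w >> 6)  & 0x3u;
    h.mode_extension   = (w >> 4)  & 0x3u;
    h.copy             = (w >> 3)  & 0x1u;
    h.original         = (w >> 2)  & 0x1u;
    h.emphasis         =  w        & 0x3u;
    return h;
}
```

Because the shifts work on the assembled value rather than on memory layout, the result does not depend on the compiler's bit-field ordering or on the host's endianness.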