I have a programm in C that writes a frequency table to a binary file.
The frequency table is an array filled with structs that contains an int and a char.
So I have to write an unsigned int counter and an unsigned char character to the file (multiple times).
I know that an integer normally uses 4 bytes however I know that the int counter can never be bigger than 2^24-1.
So I could use 4 bytes to write the counter and the character to the file => 3 bytes for counter and 1 byte for the character. This would also be easy to read.
Is there an easy way to do this in C without using special libraries?
Yes, there is a very easy way of doing it in C. You can combine a char, which is one byte on all platforms, with an int of up to 24 bits in size by shifting the char by 24 bits to the left:
uint32_t toWrite = (myChar << 24) | myCount;
When you read the data back, perform the opposite operation:
uint32_t fromFile;
uint32_t myCount = fromFile & 0xFFFFFF;
char myChar = (fromFile >> 24) & 0xFF;
Related
I am trying to convert the input from a device (always integer between 1 and 600000) to four 8-bit integers.
For example,
If the input is 32700, I want 188 127 00 00.
I achieved this by using:
32700 % 256
32700 / 256
The above works till 32700. From 32800 onward, I start getting incorrect conversions.
I am totally new to this and would like some help to understand how this can be done properly.
Major edit following clarifications:
Given that someone has already mentioned the shift-and-mask approach (which is undeniably the right one), I'll give another approach, which, to be pedantic, is not portable, machine-dependent, and possibly exhibits undefined behavior. It is nevertheless a good learning exercise, IMO.
For various reasons, your computer represents integers as groups of 8-bit values (called bytes); note that, although extremely common, this is not always the case (see CHAR_BIT). For this reason, values that are represented using more than 8 bits use multiple bytes (hence those using a number of bits with is a multiple of 8). For a 32-bit value, you use 4 bytes and, in memory, those bytes always follow each other.
We call a pointer a value containing the address in memory of another value. In that context, a byte is defined as the smallest (in terms of bit count) value that can be referred to by a pointer. For example, your 32-bit value, covering 4 bytes, will have 4 "addressable" cells (one per byte) and its address is defined as the first of those addresses:
|==================|
| MEMORY | ADDRESS |
|========|=========|
| ... | x-1 | <== Pointer to byte before
|--------|---------|
| BYTE 0 | x | <== Pointer to first byte (also pointer to 32-bit value)
|--------|---------|
| BYTE 1 | x+1 | <== Pointer to second byte
|--------|---------|
| BYTE 2 | x+2 | <== Pointer to third byte
|--------|---------|
| BYTE 3 | x+3 | <== Pointer to fourth byte
|--------|---------|
| ... | x+4 | <== Pointer to byte after
|===================
So what you want to do (split the 32-bit word into 8-bits word) has already been done by your computer, as it is imposed onto it by its processor and/or memory architecture. To reap the benefits of this almost-coincidence, we are going to find where your 32-bit value is stored and read its memory byte-by-byte (instead of 32 bits at a time).
As all serious SO answers seem to do so, let me cite the Standard (ISO/IEC 9899:2018, 6.2.5-20) to define the last thing I need (emphasis mine):
Any number of derived types can be constructed from the object and function types, as follows:
An array type describes a contiguously allocated nonempty set of objects with a particular member object type, called the element type. [...] Array types are characterized by their element type and by the number of elements in the array. [...]
[...]
So, as elements in an array are defined to be contiguous, a 32-bit value in memory, on a machine with 8-bit bytes, really is nothing more, in its machine representation, than an array of 4 bytes!
Given a 32-bit signed value:
int32_t value;
its address is given by &value. Meanwhile, an array of 4 8-bit bytes may be represented by:
uint8_t arr[4];
notice that I use the unsigned variant because those bytes don't really represent a number per se so interpreting them as "signed" would not make sense. Now, a pointer-to-array-of-4-uint8_t is defined as:
uint8_t (*ptr)[4];
and if I assign the address of our 32-bit value to such an array, I will be able to index each byte individually, which means that I will be reading the byte directly, avoiding any pesky shifting-and-masking operations!
uint8_t (*bytes)[4] = (void *) &value;
I need to cast the pointer ("(void *)") because I can't bear that whining compiler &value's type is "pointer-to-int32_t" while I'm assigning it to a "pointer-to-array-of-4-uint8_t" and this type-mismatch is caught by the compiler and pedantically warned against by the Standard; this is a first warning that what we're doing is not ideal!
Finally, we can access each byte individually by reading it directly from memory through indexing: (*bytes)[n] reads the n-th byte of value!
To put it all together, given a send_can(uint8_t) function:
for (size_t i = 0; i < sizeof(*bytes); i++)
send_can((*bytes)[i]);
and, for testing purpose, we define:
void send_can(uint8_t b)
{
printf("%hhu\n", b);
}
which prints, on my machine, when value is 32700:
188
127
0
0
Lastly, this shows yet another reason why this method is platform-dependent: the order in which the bytes of the 32-bit word is stored isn't always what you would expect from a theoretical discussion of binary representation i.e:
byte 0 contains bits 31-24
byte 1 contains bits 23-16
byte 2 contains bits 15-8
byte 3 contains bits 7-0
actually, AFAIK, the C Language permits any of the 24 possibilities for ordering those 4 bytes (this is called endianness). Meanwhile, shifting and masking will always get you the n-th "logical" byte.
It really depends on how your architecture stores an int. For example
8 or 16 bit system short=16, int=16, long=32
32 bit system, short=16, int=32, long=32
64 bit system, short=16, int=32, long=64
This is not a hard and fast rule - you need to check your architecture first. There is also a long long but some compilers do not recognize it and the size varies according to architecture.
Some compilers have uint8_t etc defined so you can actually specify how many bits your number is instead of worrying about ints and longs.
Having said that you wish to convert a number into 4 8 bit ints. You could have something like
unsigned long x = 600000UL; // you need UL to indicate it is unsigned long
unsigned int b1 = (unsigned int)(x & 0xff);
unsigned int b2 = (unsigned int)(x >> 8) & 0xff;
unsigned int b3 = (unsigned int)(x >> 16) & 0xff;
unsigned int b4 = (unsigned int)(x >> 24);
Using shifts is a lot faster than multiplication, division or mod. This depends on the endianess you wish to achieve. You could reverse the assignments using b1 with the formula for b4 etc.
You could do some bit masking.
600000 is 0x927C0
600000 / (256 * 256) gets you the 9, no masking yet.
((600000 / 256) & (255 * 256)) >> 8 gets you the 0x27 == 39. Using a 8bit-shifted mask of 8 set bits (256 * 255) and a right shift by 8 bits, the >> 8, which would also be possible as another / 256.
600000 % 256 gets you the 0xC0 == 192 as you did it. Masking would be 600000 & 255.
I ended up doing this:
unsigned char bytes[4];
unsigned long n;
n = (unsigned long) sensore1 * 100;
bytes[0] = n & 0xFF;
bytes[1] = (n >> 8) & 0xFF;
bytes[2] = (n >> 16) & 0xFF;
bytes[3] = (n >> 24) & 0xFF;
CAN_WRITE(0x7FD,8,01,sizeof(n),bytes[0],bytes[1],bytes[2],bytes[3],07,255);
I have been in a similar kind of situation while packing and unpacking huge custom packets of data to be transmitted/received, I suggest you try below approach:
typedef union
{
uint32_t u4_input;
uint8_t u1_byte_arr[4];
}UN_COMMON_32BIT_TO_4X8BIT_CONVERTER;
UN_COMMON_32BIT_TO_4X8BIT_CONVERTER un_t_mode_reg;
un_t_mode_reg.u4_input = input;/*your 32 bit input*/
// 1st byte = un_t_mode_reg.u1_byte_arr[0];
// 2nd byte = un_t_mode_reg.u1_byte_arr[1];
// 3rd byte = un_t_mode_reg.u1_byte_arr[2];
// 4th byte = un_t_mode_reg.u1_byte_arr[3];
The largest positive value you can store in a 16-bit signed int is 32767. If you force a number bigger than that, you'll get a negative number as a result, hence unexpected values returned by % and /.
Use either unsigned 16-bit int for a range up to 65535 or a 32-bit integer type.
I'm trying to convert numbers to binary and then fwrite the binary numbers to binary file.
Assuming all numbers are 7 bits numbers (int numbers < 127).
So in the end, the file will contain the numbers as blocks of 7 bits
I know that each BYTE is 8 bits, and i can't to write each number to 1 BYTE, but need to use the whole BYTE (i.e some of the numbers will be in 2 different BYTES)
if : 120 = 1111000 | 7 = 0000111 | 64 = 1000000
so the bit stream is 111100000001111000000 and should be written as
1111000|0 000111|10 00000|00
1BYTE 2BYTE 3BYTE
I thought to use a buffer
shifting 8 bits to buffer and then fwrite to the file, using pointers
But i just can't mange to write it. Thank you for your help
First, when dealing with bits use unsigned integers
unsigned char a = 0;
char bits[] = "1111000";
char *p = bits;
while (*p) {
a <<= 1; // shift left
a |= (*p == '1'); // add bit
p++; // next bit
}
So, you want to put 8 7-bit values in 7 8-bit locations
[-val-][-val-][-val-]...
00000001111111222222233333334444444555555566666667777777
[-loc8-][-loc8-][-loc8-]...
Just map bits and you're done
For example, the 3rd 7-bit value (assuming int val7[8]) can be written into the 8-bit locations (assuming int loc8[7]) with
loc8[1] &= 0xfc; // clear 2 bits
loc8[1] |= (val7[2] & 0x60) >> 5; // set 2 bits
loc8[2] &= 0x7; // clear 5 bits
loc8[2] |= (val7[2] & 0x1f) << 3; // set 5 bits
I'm making a program to receive a bunch of 0's and 1's with a µC and need to take any amount of bits (1 to 16) from any position.
I.E. I have 150 bits and I want to take 6 bits from the 32th bit and copy it to a char (8bits) variable; I know I can do it with strings by saving as ASCII 0's and 1's, but I have not a lot of RAM, so I need to save it as bits.
The bigger variable is a unsigned 32 bits long, but save the data is not my problem, the problem is how to access to a specific bits positions and copy that to a char(8) variable.
You can use bitwise operators:
//bits: your bits (byte array), start: index of the first bit of the char you want
char select(char* bits, int start) {
dec = start%8;
return bits[start/8]>>dec + bits[start/8+1]<<dec;
}
The code above supposed start < (bits.size()-8)
[EDIT]
You can change the char* to any type you want. However you will need to change dec value to the appropriate number of bits (8*SIZE_IN_BYTES) and then apply operator | ("logical or") to get your char back
example:
char select(int* bits, int start) {
nbitsint = 8*4;
dec = start%nbitsint;
if (dec < nbitsint-8) {
// | 0xff creates creates a byte
return (bits[start/nbitsint]>>((3-dec/8)*8+dec%8) | 0xff;
}
// Getting a byte which is astride two values is tricky
return (bits[start/nbitsint]>>(start%8) + bits[start/nbitsint+1]<<(start%8) | 0xff;
}
I need to read 32bit instructions from a binary file.
so what i have right now is:
unsigned char buffer[4];
fread(buffer,sizeof(buffer),1,file);
which will put 4 bytes in an array
how should I approach that to connect those 4 bytes together in order to process 32bit instruction later?
Or should I even start in a different way and not use fread?
my weird method right now is to create an array of ints of size 32 and the fill it with bits from buffer array
The answer depends on how the 32-bit integer is stored in the binary file. (I'll assume that the integer is unsigned, because it really is an id, and use the type uint32_t from <stdint.h>.)
Native byte order The data was written out as integer on this machine. Just read the integer with fread:
uint32_t op;
fread(&op, sizeof(op), 1, file);
Rationale: fread read the raw representation of the integer into memory. The matching fwrite does the reverse: It writes the raw representation to thze file. If you don't need to exchange the file between platforms, this is a good method to store and read data.
Little-endian byte order The data is stored as four bytes, least significant byte first:
uint32_t op = 0u;
op |= getc(file); // 0x000000AA
op |= getc(file) << 8; // 0x0000BBaa
op |= getc(file) << 16; // 0x00CCbbaa
op |= getc(file) << 24; // 0xDDccbbaa
Rationale: getc reads a char and returns an integer between 0 and 255. (The case where the stream runs out and getc returns the negative value EOF is not considered here for brevity, viz laziness.) Build your integer by shifting each byte you read by multiples of 8 and or them with the existing value. The comments sketch how it works. The capital letters are being read, the lower-case letters were already there. Zeros have not yet been assigned.
Big-endian byte order The data is stored as four bytes, least significant byte last:
uint32_t op = 0u;
op |= getc(file) << 24; // 0xAA000000
op |= getc(file) << 16; // 0xaaBB0000
op |= getc(file) << 8; // 0xaabbCC00
op |= getc(file); // 0xaabbccDD
Rationale: Pretty much the same as above, only that you shift the bytes in another order.
You can imagine little-endian and big-endian as writing the number one hundred and twenty tree (CXXIII) as either 321 or 123. The bit-shifting is similar to shifting decimal digtis when dividing by or multiplying with powers of 10, only that you shift my 8 bits to multiply with 2^8 = 256 here.
Add
unsigned int instruction;
memcpy(&instruction,buffer,4);
to your code. This will copy the 4 bytes of buffer to a single 32-bit variable. Hence you will get connected 4 bytes :)
If you know that the int in the file is the same endian as the machine the program's running on, then you can read straight into the int. No need for a char buffer.
unsigned int instruction;
fread(&instruction,sizeof(instruction),1,file);
If you know the endianness of the int in the file, but not the machine the program's running on, then you'll need to add and shift the bytes together.
unsigned char buffer[4];
unsigned int instruction;
fread(buffer,sizeof(buffer),1,file);
//big-endian
instruction = (buffer[0]<<24) + (buffer[1]<<16) + (buffer[2]<<8) + buffer[3];
//little-endian
instruction = (buffer[3]<<24) + (buffer[2]<<16) + (buffer[1]<<8) + buffer[0];
Another way to think of this is that it's a positional number system in base-256. So just like you combine digits in a base-10.
257
= 2*100 + 5*10 + 7
= 2*10^2 + 5*10^1 + 7*10^0
So you can also combine them using Horner's rule.
//big-endian
instruction = ((((buffer[0]*256) + buffer[1]*256) + buffer[2]*256) + buffer[3]);
//little-endian
instruction = ((((buffer[3]*256) + buffer[2]*256) + buffer[1]*256) + buffer[0]);
#luser droog
There are two bugs in your code.
The size of the variable "instruction" must not be 4 bytes: for example, Turbo C assumes sizeof(int) to be 2. Obviously, your program fails in this case. But, what is much more important and not so obvious: your program will also fail in case sizeof(int) be more than 4 bytes! To understand this, consider the following example:
int main()
{ const unsigned char a[4] = {0x21,0x43,0x65,0x87};
const unsigned char* p = &a;
unsigned long x = (((((p[3] << 8) + p[2]) << 8) + p[1]) << 8) + p[0];
printf("%08lX\n", x);
return 0;
}
This program prints "FFFFFFFF87654321" under amd64, because an unsigned char variable becomes SIGNED INT when it is used! So, changing the type of the variable "instruction" from "int" to "long" does not solve the problem.
The only way is to write something like:
unsigned long instruction;
instruction = 0;
for (int i = 0, unsigned char* p = buffer + 3; i < 4; i++, p--) {
instruction <<= 8;
instruction += *p;
}
As part of my CS course I've been given some functions to use. One of these functions takes a pointer to unsigned chars to write some data to a file (I have to use this function, so I can't just make my own purpose built function that works differently BTW). I need to write an array of integers whose values can be up to 4095 using this function (that only takes unsigned chars).
However am I right in thinking that an unsigned char can only have a max value of 256 because it is 1 byte long? I therefore need to use 4 unsigned chars for every integer? But casting doesn't seem to work with larger values for the integer. Does anyone have any idea how best to convert an array of integers to unsigned chars?
Usually an unsigned char holds 8 bits, with a max value of 255. If you want to know this for your particular compiler, print out CHAR_BIT and UCHAR_MAX from <limits.h> You could extract the individual bytes of a 32 bit int,
#include <stdint.h>
void
pack32(uint32_t val,uint8_t *dest)
{
dest[0] = (val & 0xff000000) >> 24;
dest[1] = (val & 0x00ff0000) >> 16;
dest[2] = (val & 0x0000ff00) >> 8;
dest[3] = (val & 0x000000ff) ;
}
uint32_t
unpack32(uint8_t *src)
{
uint32_t val;
val = src[0] << 24;
val |= src[1] << 16;
val |= src[2] << 8;
val |= src[3] ;
return val;
}
Unsigned char generally has a value of 1 byte, therefore you can decompose any other type to an array of unsigned chars (eg. for a 4 byte int you can use an array of 4 unsigned chars). Your exercise is probably about generics. You should write the file as a binary file using the fwrite() function, and just write byte after byte in the file.
The following example should write a number (of any data type) to the file. I am not sure if it works since you are forcing the cast to unsigned char * instead of void *.
int homework(unsigned char *foo, size_t size)
{
int i;
// open file for binary writing
FILE *f = fopen("work.txt", "wb");
if(f == NULL)
return 1;
// should write byte by byte the data to the file
fwrite(foo+i, sizeof(char), size, f);
fclose(f);
return 0;
}
I hope the given example at least gives you a starting point.
Yes, you're right; a char/byte only allows up to 8 distinct bits, so that is 2^8 distinct numbers, which is zero to 2^8 - 1, or zero to 255. Do something like this to get the bytes:
int x = 0;
char* p = (char*)&x;
for (int i = 0; i < sizeof(x); i++)
{
//Do something with p[i]
}
(This isn't officially C because of the order of declaration but whatever... it's more readable. :) )
Do note that this code may not be portable, since it depends on the processor's internal storage of an int.
If you have to write an array of integers then just convert the array into a pointer to char then run through the array.
int main()
{
int data[] = { 1, 2, 3, 4 ,5 };
size_t size = sizeof(data)/sizeof(data[0]); // Number of integers.
unsigned char* out = (unsigned char*)data;
for(size_t loop =0; loop < (size * sizeof(int)); ++loop)
{
MyProfSuperWrite(out + loop); // Write 1 unsigned char
}
}
Now people have mentioned that 4096 will fit in less bits than a normal integer. Probably true. Thus you can save space and not write out the top bits of each integer. Personally I think this is not worth the effort. The extra code to write the value and processes the incoming data is not worth the savings you would get (Maybe if the data was the size of the library of congress). Rule one do as little work as possible (its easier to maintain). Rule two optimize if asked (but ask why first). You may save space but it will cost in processing time and maintenance costs.
The part of the assignment of: integers whose values can be up to 4095 using this function (that only takes unsigned chars should be giving you a huge hint. 4095 unsigned is 12 bits.
You can store the 12 bits in a 16 bit short, but that is somewhat wasteful of space -- you are only using 12 of 16 bits of the short. Since you are dealing with more than 1 byte in the conversion of characters, you may need to deal with endianess of the result. Easiest.
You could also do a bit field or some packed binary structure if you are concerned about space. More work.
It sounds like what you really want to do is call sprintf to get a string representation of your integers. This is a standard way to convert from a numeric type to its string representation. Something like the following might get you started:
char num[5]; // Room for 4095
// Array is the array of integers, and arrayLen is its length
for (i = 0; i < arrayLen; i++)
{
sprintf (num, "%d", array[i]);
// Call your function that expects a pointer to chars
printfunc (num);
}
Without information on the function you are directed to use regarding its arguments, return value and semantics (i.e. the definition of its behaviour) it is hard to answer. One possibility is:
Given:
void theFunction(unsigned char* data, int size);
then
int array[SIZE_OF_ARRAY];
theFunction((insigned char*)array, sizeof(array));
or
theFunction((insigned char*)array, SIZE_OF_ARRAY * sizeof(*array));
or
theFunction((insigned char*)array, SIZE_OF_ARRAY * sizeof(int));
All of which will pass all of the data to theFunction(), but whether than makes any sense will depend on what theFunction() does.