Extract chars from array as UINT32 - c

I have a buffer filled with bytes that I know is at least 16 bytes long.
I don't care about bytes 0 - 11.
I know that the 4 bytes from 12 to 15 represent a 32-bit number.
How can I extract just these bytes and represent them as a 32-bit number?

You can convert each byte to an 8-bit unsigned number, and you can combine these numbers to one 32-bit number using bit operations:
uint32_t result = 0;
for (int i = 12; i < 16; i++) {
result <<= 8;
result |= (uint8_t)bytes[i];
}

I have a fixation with unions{}, I can't help it. This might help:
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>
uint32_t
convert_int(char bytes[4])
{
union {uint32_t n; char bytes[4];} box;
memcpy(box.bytes, bytes, sizeof(box.bytes)); /* copy all 4 bytes, not just one */
return ntohl(box.n);
}
int
main(void)
{
uint32_t number;
number = convert_int("\x0A\x00\x00\x00");
printf("%lu\n", (unsigned long)number);
return 0;
}
convert_int() accepts 4 bytes in network byte order (big endian, most significant byte first) and converts them to a 32-bit integer in host byte order (either big or little endian). Since you have control over the buffer, you can pass a pointer to byte 12 of your buffer as the argument.

Related

bit programing in C [duplicate]

I am new to bit programming in C and find it difficult to understand how ipv4_to_bit_string() in the code below works.
Can anyone explain what happens when I pass the integer 1234 to this function? Why is the integer right-shifted by 24, 16, 8 and 4 places?
#include <stdio.h>
#include <string.h>
#include <stdint.h>
#include <stdlib.h>
typedef struct BIT_STRING_s {
uint8_t *buf; /* BIT STRING body */
size_t size; /* Size of the above buffer */
int bits_unused; /* Unused trailing bits in the last octet (0..7) */
} BIT_STRING_t;
BIT_STRING_t tnlAddress;
void ipv4_to_bit_string(int i, BIT_STRING_t *p)
{
do {
(p)->buf = calloc(4, sizeof(uint8_t));
(p)->buf[0] = (i) >> 24 & 0xFF;
(p)->buf[1] = (i) >> 16 & 0xFF;
(p)->buf[2] = (i) >> 8 & 0xFF;
(p)->buf[3] = (i) >> 4 & 0xFF;
(p)->size = 4;
(p)->bits_unused = 0;
} while(0);
}
int main()
{
BIT_STRING_t *p = (BIT_STRING_t*)calloc(1, sizeof(BIT_STRING_t));
ipv4_to_bit_string(1234, p);
}
An IPv4 address is four eight-bit pieces that have been put together into one 32-bit piece. To take the 32-bit piece apart into the four eight-bit pieces, you extract each eight bits separately. To extract one eight-bit piece, you shift right by 0, 8, 16, or 24 bits, according to which piece you want at the moment, and then mask with 0xFF to take only the low eight bits after the shift.
The shift by 4 instead of 0 appears to be an error.
The use of an int for the 32-bit piece appears to be an error, primarily because the high bit may be set, which indicates the int value is negative, and then the right-shift is not fully defined by the C standard; it is implementation-defined. An unsigned type should be used. Additionally, int is not necessarily 32 bits; it is preferable to use uint32_t, which is defined in the <stdint.h> header.

value in array will be printed as 0 even after changing it in c

I made a custom type using typedef unsigned char byte;, and then declared an array of it with byte mem[255];. I used mem[0] = 0x10100000; to initialize the first value, but when I print it using printf("%d", mem[0]); I get 0. Why?
An unsigned char can typically only hold values between 0 and 255. The hex value 0x10100000 is well out of range for that type, so (essentially) only the low-order byte of that value is used, which is 0.
Presumably you wanted to use a binary constant. Not all compilers support that, but those that do would specify it as 0b10100000. For those that don't, you can use the hex value 0xA0.
You're assigning it the hexadecimal number 0x10100000, which is far larger than a single character and thus can't be stored in a byte. If you want to use a binary number, and your compiler supports this, you might try using 0b10100000 instead.
An unsigned char can only hold values up to ((1 << CHAR_BIT) - 1).
You can check the maximum value yourself:
#include <stdio.h>
#include <limits.h>
int main(void)
{
printf("%u\n", (1 << CHAR_BIT) - 1);
}
On most systems it is 255 or 0xff.
When you assign the unsigned char with 0x10100000 only the lowest two hex digits will be assigned (in your case 0x00).
If you wanted to copy all the bytes of 0x10100000 to the byte array mem you defined, the assignment will not work. You need to copy them instead:
#include <stdio.h>
#include <limits.h>
#include <string.h>
typedef unsigned char byte;
int main(void)
{
byte mem[100];
memcpy(mem, &(unsigned){0x10100000}, sizeof(0x10100000));
for(size_t index = 0; index < sizeof(0x10100000); index++)
{
printf("mem[%zu] = 0x%hhx\n", index, mem[index]);
}
}
Output:
mem[0] = 0x0
mem[1] = 0x0
mem[2] = 0x10
mem[3] = 0x10
https://godbolt.org/z/cGYa8MTef
Why in this order? Because the machine that godbolt runs on is little endian. https://en.wikipedia.org/wiki/Endianness
The 0x prefix means that the number is hexadecimal. If you wanted a binary number, gcc supports the 0b prefix, which is an extension and was not standard before C23.
mem[0] = 0b10100000;
You can also create .h file
#define b00000000 0
#define b00000001 1
#define b00000010 2
#define b00000011 3
/* .... */
#define b11111110 254
#define b11111111 255
and use those definitions in a portable way:
mem[0] = b10100000;
You can't fit a 32 bit value inside an 8 bit variable (mem[0]). Do you perhaps mean to do this?
*(int *)mem = 0x10100000;

How to calculate size of structure with bit field?

#include <stdio.h>
struct test {
unsigned int x;
long int y : 33;
unsigned int z;
};
int main()
{
struct test t;
printf("%zu", sizeof(t));
return 0;
}
I am getting the output as 24. How does it equate to that?
Since your implementation accepts long int y : 33;, a long int has more than 32 bits on your system, so I shall assume 64.
If plain int is also 64 bits, the result of 24 is expected: three 8-byte members.
If int is only 32 bits, you have encountered padding and alignment. For performance reasons, 64-bit types on 64-bit systems are aligned on a 64-bit boundary. So you have:
4 bytes for the first int
4 padding bytes to have a 8 bytes boundary
8 bytes for the container of the bit field
4 bytes for the second int
4 padding bytes to allow proper alignment of arrays
Total: 24 bytes

How can I copy 4 letter ascii word to buffer in C?

I am trying to copy the word: 0x0FF0 to a buffer but unable to do so.
Here is my code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdint.h>
#include <math.h>
#include <time.h>
#include <linux/types.h>
#include <fcntl.h>
#include <unistd.h>
#include <errno.h>
void print_bits(unsigned int x);
int main(int argc, char *argv[])
{
char buffer[512];
unsigned int init = 0x0FF0;
unsigned int * som = &init;
printf("print bits of som now: \n");
print_bits(init);
printf("\n");
memset(&buffer[0], 0, sizeof(buffer)); // reinitialize the buffer
memcpy(buffer, som, 4); // copy word to the buffer
printf("print bits of buffer[0] now: \n");
print_bits(buffer[0]);
printf("\n");
return 0;
}
void print_bits(unsigned int x)
{
int i;
for (i = 8 * sizeof(x)-17; i >= 0; i--) {
(x & (1 << i)) ? putchar('1') : putchar('0');
}
printf("\n");
}
this is the result I get in the console:
Why am I getting different values from the bit printing if I am using memcpy?
I don't know whether it has something to do with big or little endianness, but I am losing four 1 bits here, which shouldn't happen with either method.
When you call
print_bits(buffer[0]);
you're taking just one byte out of the buffer, converting it to unsigned int, and passing that to the function. The other bytes in buffer are ignored.
You are mixing up types and relying on specific properties of your architecture and platform. This already breaks your existing code, and it may cause more harm once you compile with different settings.
Your buffer is of type char[512], while your init is of type unsigned int.
First, whether char is signed or unsigned is implementation-defined. This matters because it determines how a char value is converted to an unsigned int value. The following code demonstrates the difference using explicitly signed and unsigned chars:
signed char c = 0xF0;
unsigned char uc = c;
unsigned int ui_from_c = c;
unsigned int ui_from_uc = uc;
printf("Signed char c:%hhd; Unsigned char uc:%hhu; ui_from_c:%u ui_from_uc:%u\n", c, uc, ui_from_c,ui_from_uc);
// output: Signed char c:-16; Unsigned char uc:240; ui_from_c:4294967280 ui_from_uc:240
Second, int is typically 4 bytes (it may be 2 or 8 on some platforms) and can hold a 16-bit "word", whereas char is always 1 byte, typically 8 bits, and therefore cannot hold a 16-bit word.
Third, architectures can be big endian or little endian, and this determines where the two significant bytes of a constant like 0x0FF0 are located within a 4- or 8-byte integer representation.
So buffer[0] certainly selects only part of what you think it does; that part might be converted to unsigned int in an unexpected way, and it might not even contain any of the bits of 0x0FF0.
I'd suggest using a fixed-width integer type that represents exactly one word throughout:
#include <stdio.h>
#include <stdint.h>
#include <string.h>
void print_bits(uint16_t x);
int main(int argc, char *argv[])
{
uint16_t buffer[512];
uint16_t init = 0x0FF0;
uint16_t * som = &init;
printf("print bits of som now: \n");
print_bits(init);
printf("\n");
memset(buffer, 0, sizeof(buffer)); // reinitialize the buffer
memcpy(buffer, som, sizeof(*som)); // copy word to the buffer
printf("print bits of buffer[0] now: \n");
print_bits(buffer[0]);
printf("\n");
return 0;
}
void print_bits(uint16_t x)
{
int i;
for (i = 8 * sizeof(x) - 1; i >= 0; i--) { /* start at the top bit, bit 15 */
(x & (1 << i)) ? putchar('1') : putchar('0');
}
printf("\n");
}
You are not writing the bytes "0F F0" to the buffer. You are writing whatever bytes your platform uses internally to store the number 0x0FF0. There is no reason these need to be the same.
When you write 0x0FF0 in C, that means, roughly, "whatever my implementation uses to encode the number four thousand eighty". That might be the byte string 0F, F0. But it might not be.
I mean, how weird would it be if unsigned int init = 0x0FF0; and unsigned int init = 4080; would do the same thing on some platforms and different things on others? But surely not all platforms store the number 4,080 using the byte string "0F F0".
For example, I might store the number ten as "10" or "ten" or any number of other ways. It's unreasonable for you to expect "ten", "10", or any other particular byte sequence to appear in memory just because you stored the number ten unless you do happen to specifically know how your platform stores the number ten. Given that you asked this question, you don't know that.
Also, you are only printing the value of buffer[0], which is a single character. So it couldn't possibly hold any version of 0x0FF0.

Bitfields and alignment

Trying to pack data into a packet. This packet should be 64 bits. I have this:
typedef union {
uint64_t raw;
struct {
unsigned int magic : 8;
unsigned int parity : 1;
unsigned int stype : 8;
unsigned int sid : 8;
unsigned int mlength : 31;
unsigned int message : 8;
} spacket;
} packet_t;
But it seems that alignment is not guaranteed. Because when I run this:
#include <strings.h>
#include <stdio.h>
#include <stddef.h>
#include <stdint.h>
const char *number_to_binary(uint64_t x)
{
static char b[65];
b[64] = '\0';
uint64_t z;
int w = 0;
for (z = 1; w < 64; z <<= 1, ++w)
{
b[w] = ((x & z) == z) ? '1' : '0';
}
return b;
}
int main(void)
{
packet_t ipacket;
bzero(&ipacket, sizeof(packet_t));
ipacket.spacket.magic = 255;
printf("%s\n", number_to_binary(ipacket.raw));
ipacket.spacket.parity = 1;
printf("%s\n", number_to_binary(ipacket.raw));
ipacket.spacket.stype = 255;
printf("%s\n", number_to_binary(ipacket.raw));
ipacket.spacket.sid = 255;
printf("%s\n", number_to_binary(ipacket.raw));
ipacket.spacket.mlength = 2147483647;
printf("%s\n", number_to_binary(ipacket.raw));
ipacket.spacket.message = 255;
printf("%s\n", number_to_binary(ipacket.raw));
}
I get (big endian):
1111111100000000000000000000000000000000000000000000000000000000
1111111110000000000000000000000000000000000000000000000000000000
1111111111111111100000000000000000000000000000000000000000000000
1111111111111111111111111000000000000000000000000000000000000000
1111111111111111111111111000000011111111111111111111111111111110
1111111111111111111111111000000011111111111111111111111111111110
My .mlength field is lost somewhere on the right part although it should be right next to the .sid field.
This page confirms it: Alignment of the allocation unit that holds a bit field is unspecified. But if this is the case, how do people pack data into bit fields, which is their purpose in the first place?
24 bits seems to be the maximum size the .mlength field can take before the .message field is pushed out.
Almost everything about the layout of bit-fields is implementation-defined in the standard, as you'd find from numerous other questions on the subject on SO. (Amongst others, you could look at Questions about bitfields and especially Bit field's memory management in C).
If you want your bit fields to be packed into 64 bits, you'll have to trust that your compiler allows you to use 64-bit types for the fields, and then use:
typedef union {
uint64_t raw;
struct {
uint64_t magic : 8;
uint64_t parity : 1;
uint64_t stype : 8;
uint64_t sid : 8;
uint64_t mlength : 31;
uint64_t message : 8;
} spacket;
} packet_t;
As originally written, under one plausible (common) scheme, your bit fields would be split into new 32-bit words when there isn't space enough left in the current one. That is, magic, parity, stype and sid would occupy 25 bits; there isn't enough room left in a 32-bit unsigned int to hold another 31 bits, so mlength is stored in the next unsigned int, and there isn't enough space left over in that unit to store message so that is stored in the third unsigned int unit. That would give you a structure occupying 3 * sizeof(unsigned int) or 12 bytes — and the union would occupy 16 bytes because of the alignment requirements on uint64_t.
Note that the standard does not guarantee that what I show will work. However, under many compilers, it probably will work. (Specifically, it works with GCC 5.3.0 on Mac OS X 10.11.4.)
Depending on your architecture and/or compiler, your data will be aligned to different sizes. From your observations, I would guess that you are seeing the consequences of 32-bit alignment. If sizeof reports that your union is larger than 8 bytes (64 bits), the data has been padded for alignment.
With 32-bit alignment, mlength and message can only stay next to each other if together they take at most 32 bits. This is probably what you see with your 24-bit limit.
If you want your struct to take only 64 bits with 32-bit alignment, you will have to rearrange it a little. The single-bit parity should be next to the 31-bit mlength, and your four 8-bit fields should be grouped together.

Resources