Convert a char array into an integer (big and little endian) - c

I am trying to convert a char array into an integer, and then I have to increment that integer (both in little and big endian).
Example:
char ary[6] = {01, 02, 03, 04, 05, 06};
long int b = 0; // 64 bits
This char array will be stored in memory as:
address  0  1  2  3  4  5
value   01 02 03 04 05 06 (big endian)
Edit: value 01 02 03 04 05 06 (little endian)
memcpy(&b, ary, 6); // will copy left to right (big-endian order)
This is how it can be stored in memory:
01 02 03 04 05 06 00 00 // big endian, increment will be at the MSByte
01 02 03 04 05 06 00 00 // little endian, increment at the LSByte
So if we increment the 64-bit integer, the expected value is 01 02 03 04 05 07. But endianness is a big problem here, since if we directly increment the value of the integer, it will produce a wrong number. For big endian we need to shift the value in b, then do the increment on it.
For little endian we CAN'T increment directly. (Edit: reverse and increment.)
Can we do the copy with respect to endianness, so we don't need to worry about shift operations at all?
Is there any other solution for incrementing the char array's value after copying it into an integer?
Is there any API in the Linux kernel to do the copy with respect to endianness?

Unless you want the byte array to represent a larger integer, which doesn't seem to be the case here, endianness does not matter. Endianness only applies to integer values of 16 bits or larger. If the character array is an array of 8-bit integers, endianness does not apply. So all your assumptions are incorrect; the char array will always be stored as
address  0  1  2  3  4  5
value   01 02 03 04 05 06
no matter the endianness.
However, if you memcpy the array into a uint64_t, endianness does apply. For a big-endian machine, simply memcpy() and you'll get everything in the expected format. For little endian, you'll have to copy the array in reverse (or assemble the value byte by byte with shifts), for example:
#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>

int main (void)
{
    uint8_t array[6] = {1, 2, 3, 4, 5, 6};
    uint64_t x = 0;

    /* Place array[0] in the most significant byte and array[5] in byte 2,
       leaving the two least significant bytes zero. */
    for (size_t i = 0; i < sizeof(array); i++)
    {
        const uint8_t bit_shifts = (sizeof(uint64_t) - 1 - i) * 8;
        x |= (uint64_t)array[i] << bit_shifts;
    }

    printf("%.16" PRIX64 "\n", x);
    return 0;
}

You need to read up on the documentation. The Linux kernel's byte-order API provides, among others, the following:
__u64 le64_to_cpup(const __le64 *);
__le64 cpu_to_le64p(const __u64 *);
__u64 be64_to_cpup(const __be64 *);
__be64 cpu_to_be64p(const __u64 *);
I believe they are sufficient to do what you want to do. Convert the number to CPU format, increment it, then convert back.
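A minimal sketch of that convert-increment-convert-back pattern (my own, assuming kernel code and a little-endian 8-byte buffer; the be64 variants are used the same way for big-endian data):
#include <linux/types.h>
#include <linux/string.h>
#include <asm/byteorder.h>

static void increment_le64_buf(u8 *buf)
{
    __le64 raw;
    __u64 native;

    memcpy(&raw, buf, sizeof(raw));
    native = le64_to_cpup(&raw);    /* wire order -> CPU native order */
    native++;                       /* plain integer arithmetic       */
    raw = cpu_to_le64p(&native);    /* CPU native order -> wire order */
    memcpy(buf, &raw, sizeof(raw));
}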

Related

Bitwise operators: How to check if source port is greater than 32767 (0x7FFF)

I have a program that reads a binary file of 20 bytes, modifies the data and then writes it to another binary file. I am trying to check if the source port is bigger than 32767 (0x7FFF).
If it is bigger, then I must zero out its two most significant bits. I am only allowed to use bitwise operators. Does anybody have any idea how I could do it? Thanks.
Data: 32 27 A9 49 03 8C BD 89 01 87 9B 8D 50 13 00 00 FF FF 00 00
Source port: 32 27 (16 bits)
void modify(const unsigned char oldData[], unsigned char newData[]) {
    /* Accessing the source port: oldData[0] and oldData[1]. */
}
Just pick out the port value from the data. This depends on endianness, little or big.
You will need to know how your data is organized - whether the 32 is the high or the low part.
Assuming your data is little-endian, you can read it like this:
#include <endian.h>
#include <stdint.h>

void modify(const unsigned char oldData[], unsigned char newData[])
{
    uint16_t port = le16toh( *((uint16_t*)oldData) );
    if (port > 32767)
    { /* whatever */ }
}
For the particular case of checking against >= 32768, which is >= 0x8000, you can test the high bit as already commented:
if (oldData[1] & 0x80) // [1] is the high byte if using little-endian data.
{ /* whatever */ }
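Putting it together, a sketch of the whole modify() (my own; it assumes the 20-byte record layout above and little-endian data, with byte [1] holding the high half of the port):
#include <string.h>

void modify(const unsigned char oldData[], unsigned char newData[])
{
    memcpy(newData, oldData, 20);   /* copy the 20-byte record */

    if (newData[1] & 0x80)          /* port > 32767, i.e. high bit set */
        newData[1] &= 0x3F;         /* zero out the two most significant bits */
}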

Breaking down raw byte string with particular structure gives wrong data

I am working with a ZigBee module based on a 32-bit ARM Cortex-M3. My question is not related to the ZigBee protocol itself. I have access to the source code of the application layer only, which should be enough for my purposes. The lower layer (APS) passes data to the application layer within the APSDE-DATA.indication primitive, to the following application function:
void zbpro_dataRcvdHandler(zbpro_dataInd_t *data)
{
    DEBUG_PRINT(DBG_APP,"\n[APSDE-DATA.indication]\r\n");

    /* Output of raw bytes string for further investigation.
     * Real length is unknown, 50 is approximation.
     */
    DEBUG_PRINT(DBG_APP,"Raw data: \n");
    DEBUG_PRINT(DBG_APP,"----------\n");
    for (int i = 0; i < 50; i++){
        DEBUG_PRINT(DBG_APP,"%02x ",*((uint8_t*)data+i));
    }
    DEBUG_PRINT(DBG_APP,"\n");

    /* Output of APSDE-DATA.indication primitive field by field */
    DEBUG_PRINT(DBG_APP,"Field by field: \n");
    DEBUG_PRINT(DBG_APP,"----------------\n");
    DEBUG_PRINT(DBG_APP,"Destination address: ");
    for (int i = 0; i < 8; i++)
        DEBUG_PRINT(DBG_APP,"%02x ",*((uint8_t*)data->dstAddress.ieeeAddr[i]));
    DEBUG_PRINT(DBG_APP,"\n");
    DEBUG_PRINT(DBG_APP,"Destination address mode: 0x%02x\r\n",*((uint8_t*)data->dstAddrMode));
    DEBUG_PRINT(DBG_APP,"Destination endpoint: 0x%02x\r\n",*((uint8_t*)data->dstEndPoint));
    DEBUG_PRINT(DBG_APP,"Source address mode: 0x%02x\r\n",*((uint8_t*)data->dstAddrMode));
    DEBUG_PRINT(DBG_APP,"Source address: ");
    for (int i = 0; i < 8; i++)
        DEBUG_PRINT(DBG_APP,"%02x ",*((uint8_t*)data->srcAddress.ieeeAddr[i]));
    DEBUG_PRINT(DBG_APP,"\n");
    DEBUG_PRINT(DBG_APP,"Source endpoint: 0x%02x\r\n",*((uint8_t*)data->srcEndPoint));
    DEBUG_PRINT(DBG_APP,"Profile Id: 0x%04x\r\n",*((uint16_t*)data->profileId));
    DEBUG_PRINT(DBG_APP,"Cluster Id: 0x%04x\r\n",*((uint16_t*)data->clusterId));
    DEBUG_PRINT(DBG_APP,"Message length: 0x%02x\r\n",*((uint8_t*)data->messageLength));
    DEBUG_PRINT(DBG_APP,"Flags: 0x%02x\r\n",*((uint8_t*)data->flags));
    DEBUG_PRINT(DBG_APP,"Security status: 0x%02x\r\n",*((uint8_t*)data->securityStatus));
    DEBUG_PRINT(DBG_APP,"Link quality: 0x%02x\r\n",*((uint8_t*)data->linkQuality));
    DEBUG_PRINT(DBG_APP,"Source MAC Address: 0x%04x\r\n",*((uint16_t*)data->messageLength));
    DEBUG_PRINT(DBG_APP,"Message: ");
    for (int i = 0; i < 13; i++){
        DEBUG_PRINT(DBG_APP,"%02x ",*((uint8_t*)data->messageContents+i));
    }
    DEBUG_PRINT(DBG_APP,"\n");

    bufm_deallocateBuffer((uint8_t *)data, CORE_MEM);
}
The APSDE-DATA.indication primitive is implemented by the following structures:
/**
 * @brief type definition for address (union of short address and extended address)
 */
typedef union zbpro_address_tag {
    uint16_t shortAddr;
    uint8_t  ieeeAddr[8];
} zbpro_address_t;

/**
 * @brief apsde data indication structure
 */
PACKED struct zbpro_dataInd_tag {
    zbpro_address_t dstAddress;
    uint8_t  dstAddrMode;
    uint8_t  dstEndPoint;
    uint8_t  srcAddrMode;
    zbpro_address_t srcAddress;
    uint8_t  srcEndPoint;
    uint16_t profileId;
    uint16_t clusterId;
    uint8_t  messageLength;
    uint8_t  flags;          /* bit0: broadcast or not; bit1: need aps ack or not; bit2: nwk key used; bit3: aps link key used */
    uint8_t  securityStatus; /* not-used, reserved for future */
    uint8_t  linkQuality;
    uint16_t src_mac_addr;
    uint8_t  messageContents[1];
};
typedef PACKED struct zbpro_dataInd_tag zbpro_dataInd_t;
As a result I receive the following:
[APSDE-DATA.indication]
Raw data:
---------
00 00 00 72 4c 19 40 00 02 e8 03 c2 30 02 fe ff 83 0a 00 e8 05 c1 11 00 11 08 58 40 72 4c ae 53 4d 3f 63 9f d8 51 da ca 87 a9 0b b3 7b 04 68 ca 87 a9
Field by field:
---------------
Destination address: 00 00 00 28 fa 44 34 00
Destination address mode: 0x12
Destination endpoint: 0xc2
Source address mode: 0x12
Source address: 13 01 12 07 02 bd 02 00
Source endpoint: 0xc2
Profile Id: 0xc940
Cluster Id: 0x90a0
Message length: 0x00
Flags: 0x00
Security status: 0x04
Link quality: 0x34
Source MAC Address: 0x90a0
Message: ae 53 4d 3f 63 9f d8 51 da ca 87 a9 0b
From this output I can see that while the raw string has some expected values, the dispatched fields are totally different. What is the reason for this behavior and how can I fix it? Is it somehow related to the ARM architecture or to wrong type casting?
I don't have access to the implementation of DEBUG_PRINT, but we can assume that it works properly.
There's no need to dereference in your DEBUG_PRINT statements, for example
DEBUG_PRINT(DBG_APP,"%02x ",*((uint8_t*)data->dstAddress.ieeeAddr[i]));
should be simply
DEBUG_PRINT(DBG_APP,"%02x ", data->dstAddress.ieeeAddr[i]);
so on and so forth...
Consider this code:
DEBUG_PRINT(DBG_APP,"%02x ",*((uint8_t*)data->dstAddress.ieeeAddr[i]));
Array subscripting and direct and indirect member access have higher precedence than does casting, so the third argument is equivalent to
*( (uint8_t*) (data->dstAddress.ieeeAddr[i]) )
But data->dstAddress.ieeeAddr[i] is not a pointer, it is a uint8_t. C permits you to convert it to a pointer by casting, but the result is not a pointer to the value, but rather a pointer interpretation of the value. Dereferencing it produces undefined behavior.
Similar applies to your other DEBUG_PRINT() calls.
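Following the advice above, a sketch of the corrected scalar prints (my own, not verified against the vendor's headers; note the original also printed dstAddrMode where srcAddrMode was presumably intended, and messageLength where src_mac_addr was presumably intended):
DEBUG_PRINT(DBG_APP, "Destination address mode: 0x%02x\r\n", data->dstAddrMode);
DEBUG_PRINT(DBG_APP, "Source address mode: 0x%02x\r\n",      data->srcAddrMode);
DEBUG_PRINT(DBG_APP, "Profile Id: 0x%04x\r\n",               data->profileId);
DEBUG_PRINT(DBG_APP, "Cluster Id: 0x%04x\r\n",               data->clusterId);
DEBUG_PRINT(DBG_APP, "Message length: 0x%02x\r\n",           data->messageLength);
DEBUG_PRINT(DBG_APP, "Source MAC Address: 0x%04x\r\n",       data->src_mac_addr);
for (int i = 0; i < data->messageLength; i++)
    DEBUG_PRINT(DBG_APP, "%02x ", data->messageContents[i]);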

c get data from BMP

I find myself writing a simple program to extract data from a bmp file. I just got started and I am at one of those WTF moments.
When I run the program and supply this image: http://www.hack4fun.org/h4f/sites/default/files/bindump/lena.bmp
I get the output:
type: 19778
size: 12
res1: 0
res2: 54
offset: 2621440
The actual image size is 786,486 bytes. Why is my code reporting 12 bytes?
The header format specified in http://en.wikipedia.org/wiki/BMP_file_format matches my BMP_FILE_HEADER structure. So why is it getting filled with wrong information?
The image file doesn't appear to be corrupt and other images are giving equally wrong outputs. What am I missing?
#include <stdio.h>
#include <stdlib.h>

typedef struct {
    unsigned short type;
    unsigned int size;
    unsigned short res1;
    unsigned short res2;
    unsigned int offset;
} BMP_FILE_HEADER;

int main (int args, char ** argv) {
    char *file_name = argv[1];
    FILE *fp = fopen(file_name, "rb");

    BMP_FILE_HEADER file_header;
    fread(&file_header, sizeof(BMP_FILE_HEADER), 1, fp);

    if (file_header.type != 'MB') {
        printf("ERROR: not a .bmp");
        return 1;
    }

    printf("type: %i\nsize: %i\nres1: %i\nres2: %i\noffset: %i\n", file_header.type, file_header.size, file_header.res1, file_header.res2, file_header.offset);

    fclose(fp);
    return 0;
}
Here is the header in hex:
0000000 42 4d 36 00 0c 00 00 00 00 00 36 00 00 00 28 00
0000020 00 00 00 02 00 00 00 02 00 00 01 00 18 00 00 00
The length field is the bytes 36 00 0c 00, which are in Intel (little-endian) order; handled as a 32-bit value, that is 0x000c0036, or decimal 786,486 (which matches the saved file size).
Probably your C compiler is aligning each field to a 32-bit boundary. Enable a structure packing option, pragma, or directive.
There are two mistakes I could find in your code.
First mistake: You have to pack the structure to 1, so every member occupies exactly the size it is meant to, and the compiler doesn't align it to, for example, a 4-byte boundary. So in your code the short, instead of occupying 2 bytes, effectively occupied 4 (2 bytes of data plus 2 bytes of padding). The trick for this is using a compiler directive for packing the nearest struct:
#pragma pack(1)
typedef struct {
    unsigned short type;
    unsigned int size;
    unsigned short res1;
    unsigned short res2;
    unsigned int offset;
} BMP_FILE_HEADER;
Now the structure should be packed properly, with no padding between members.
The other mistake is here:
if (file_header.type != 'MB')
You are trying to compare a short type, which is 2 bytes, against a character constant (using ''), which canonically holds just 1 character of 1-byte size (a multi-character constant like 'MB' has an implementation-defined value). The compiler is probably giving you a warning about that.
To get around this, you can split these 2 bytes into two 1-byte characters, which are known (M and B), and put them together into a word. For example:
if (file_header.type != (('M' << 8) | 'B'))
In this expression, 'M' (which is 0x4D in ASCII), shifted 8 bits to the left, results in 0x4D00; now you can just add or OR the next character into the low zero byte: 0x4D00 | 0x42 = 0x4D42 (where 0x42 is 'B' in ASCII). Thinking like this, you could just write:
if (file_header.type != 0x4D42)
Then your code should work.
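As a quick sanity check (my addition, assuming a C11 compiler), you can assert that the packed struct really matches the 14-byte on-disk header:
/* with #pragma pack(1) in effect as above */
_Static_assert(sizeof(BMP_FILE_HEADER) == 14, "BMP_FILE_HEADER must be 14 bytes (packed)");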

Why does fread mess with my byte order?

I'm trying to parse a BMP file with fread(), and when I begin to parse, it reverses the order of my bytes.
typedef struct {
    short magic_number;
    int file_size;
    short reserved_bytes[2];
    int data_offset;
} BMPHeader;
...
BMPHeader header;
...
The hex data is 42 4D 36 00 03 00 00 00 00 00 36 00 00 00;
I am loading the hex data into the struct with fread(&header, 14, 1, fileIn);
My problem is that where the magic number should be 0x424D ('BM'), fread() flips the bytes to 0x4D42 ('MB').
Why does fread() do this and how can I fix it?
EDIT: If I wasn't specific enough, I need to read the whole chunk of hex data into the struct, not just the magic number. I only picked the magic number as an example.
This is not the fault of fread, but of your CPU, which is (apparently) little-endian. That is, your CPU treats the first byte in a short value as the low 8 bits, rather than (as you seem to have expected) the high 8 bits.
Whenever you read a binary file format, you must explicitly convert from the file format's endianness to the CPU's native endianness. You do that with functions like these:
/* CHAR_BIT == 8 assumed */
uint16_t le16_to_cpu(const uint8_t *buf)
{
    return ((uint16_t)buf[0]) | (((uint16_t)buf[1]) << 8);
}

uint16_t be16_to_cpu(const uint8_t *buf)
{
    return ((uint16_t)buf[1]) | (((uint16_t)buf[0]) << 8);
}
You do your fread into a uint8_t buffer of the appropriate size, and then you manually copy all the data bytes over to your BMPHeader struct, converting as necessary. That would look something like this:
/* note adjustments to type definition */
typedef struct BMPHeader
{
    uint8_t  magic_number[2];
    uint32_t file_size;
    uint8_t  reserved[4];
    uint32_t data_offset;
} BMPHeader;

/* in general this is _not_ equal to sizeof(BMPHeader) */
#define BMP_WIRE_HDR_LEN (2 + 4 + 4 + 4)

/* returns 0=success, -1=error */
int read_bmp_header(BMPHeader *hdr, FILE *fp)
{
    uint8_t buf[BMP_WIRE_HDR_LEN];

    if (fread(buf, 1, sizeof buf, fp) != sizeof buf)
        return -1;

    hdr->magic_number[0] = buf[0];
    hdr->magic_number[1] = buf[1];

    hdr->file_size = le32_to_cpu(buf+2);

    hdr->reserved[0] = buf[6];
    hdr->reserved[1] = buf[7];
    hdr->reserved[2] = buf[8];
    hdr->reserved[3] = buf[9];

    hdr->data_offset = le32_to_cpu(buf+10);
    return 0;
}
You do not assume that the CPU's endianness is the same as the file format's even if you know for a fact that right now they are the same; you write the conversions anyway, so that in the future your code will work without modification on a CPU with the opposite endianness.
You can make life easier for yourself by using the fixed-width <stdint.h> types, by using unsigned types unless being able to represent negative numbers is absolutely required, and by not using integers when character arrays will do. I've done all these things in the above example. You can see that you need not bother endian-converting the magic number, because the only thing you need to do with it is test magic_number[0]=='B' && magic_number[1]=='M'.
Conversion in the opposite direction, btw, looks like this:
void cpu_to_le16(uint8_t *buf, uint16_t val)
{
    buf[0] = (val & 0x00FF);
    buf[1] = (val & 0xFF00) >> 8;
}

void cpu_to_be16(uint8_t *buf, uint16_t val)
{
    buf[0] = (val & 0xFF00) >> 8;
    buf[1] = (val & 0x00FF);
}
Conversion of 32-/64-bit quantities left as an exercise.
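For reference, the 32-bit little-endian reader used by read_bmp_header() above follows the same pattern (my sketch of the "exercise", not part of the original answer):
uint32_t le32_to_cpu(const uint8_t *buf)
{
    return  ((uint32_t)buf[0])
          | (((uint32_t)buf[1]) << 8)
          | (((uint32_t)buf[2]) << 16)
          | (((uint32_t)buf[3]) << 24);
}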
I assume this is an endian issue, i.e. you are putting the bytes 42 and 4D into your short value, but your system is little-endian (I could have the name wrong), which treats the first byte in memory as the least significant byte of a multi-byte integer type rather than the most significant.
Demonstrated in this code:
#include <stdio.h>

int main()
{
    union {
        short sval;
        unsigned char bval[2];
    } udata;

    udata.sval = 1;
    printf( "DEC[%5hu] HEX[%04hx] BYTES[%02hhx][%02hhx]\n"
          , udata.sval, udata.sval, udata.bval[0], udata.bval[1] );

    udata.sval = 0x424d;
    printf( "DEC[%5hu] HEX[%04hx] BYTES[%02hhx][%02hhx]\n"
          , udata.sval, udata.sval, udata.bval[0], udata.bval[1] );

    udata.sval = 0x4d42;
    printf( "DEC[%5hu] HEX[%04hx] BYTES[%02hhx][%02hhx]\n"
          , udata.sval, udata.sval, udata.bval[0], udata.bval[1] );

    return 0;
}
Gives the following output
DEC[ 1] HEX[0001] BYTES[01][00]
DEC[16973] HEX[424d] BYTES[4d][42]
DEC[19778] HEX[4d42] BYTES[42][4d]
So if you want to be portable you will need to detect the endian-ness of your system and then do a byte shuffle if required. There will be plenty of examples round the internet of swapping the bytes around.
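For example, a minimal 16-bit byte-swap helper (my sketch, not from the original answer) looks like this:
/* Swap the two bytes of a 16-bit value; call it only when the host and
   file byte orders differ. */
static unsigned short swap16(unsigned short v)
{
    return (unsigned short)((v << 8) | (v >> 8));
}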
Subsequent question:
I ask only because my file size is 3 instead of 196662
This is due to memory alignment issues. 196662 is the bytes 36 00 03 00 and 3 is the bytes 03 00 00 00. Most systems need types like int etc. not to be split over multiple memory words. So intuitively you think your struct is laid out in memory like:
                            Offset
short magic_number;         00 - 01
int file_size;              02 - 05
short reserved_bytes[2];    06 - 09
int data_offset;            0A - 0D
BUT on a 32-bit system that means file_size has 2 bytes in the same word as magic_number and two bytes in the next word. Most compilers will not stand for this, so the way the structure is actually laid out in memory is:
short magic_number;         00 - 01
<<unused padding>>          02 - 03
int file_size;              04 - 07
short reserved_bytes[2];    08 - 0B
int data_offset;            0C - 0F
So when you read your byte stream in, the 36 00 goes into the padding area, which leaves your file_size getting 03 00 00 00. Now if you had used fwrite to create this data it would have been OK, as the padding bytes would have been written out. But since your input is always going to be in the format you have specified, it is not appropriate to read the whole struct in one go with fread. Instead you will need to read each of the elements individually.
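A sketch of reading the header element by element (my addition, reusing the asker's struct and fileIn handle; it sidesteps the padding problem, though it still assumes a little-endian host like the asker's):
BMPHeader header;
fread(&header.magic_number,   sizeof header.magic_number,   1, fileIn);
fread(&header.file_size,      sizeof header.file_size,      1, fileIn);
fread(&header.reserved_bytes, sizeof header.reserved_bytes, 1, fileIn);
fread(&header.data_offset,    sizeof header.data_offset,    1, fileIn);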
Writing a struct to a file is highly non-portable -- it's safest to just not try to do it at all. Using a struct like this is guaranteed to work only if a) the struct is both written and read as a struct (never a sequence of bytes) and b) it's always both written and read on the same (type of) machine. Not only are there "endian" issues with different CPUs (which is what it seems you've run into), there are also "alignment" issues. Different hardware implementations have different rules about placing integers only on even 2-byte or even 4-byte or even 8-byte boundaries. The compiler is fully aware of all this, and inserts hidden padding bytes into your struct so it always works right. But as a result of the hidden padding bytes, it's not at all safe to assume a struct's bytes are laid out in memory like you think they are. If you're very lucky, you work on a computer that uses big-endian byte order and has no alignment restrictions at all, so you can lay structs directly over files and have it work. But you're probably not that lucky -- certainly programs that need to be "portable" to different machines have to avoid trying to lay structs directly over any part of any file.

Passing a 256-bit wire to a C function through the Verilog VPI

I have a 256-bit value in Verilog:
reg [255:0] val;
I want to define a system task $foo that calls out to external C using the VPI, so I can call $foo like this:
$foo(val);
Now, in the C definition for the function 'foo', I cannot simply read the argument as an integer (PLI_INT32), because I have too many bits to fit in one of those. But, I can read the argument as a string, which is the same thing as an array of bytes. Here is what I wrote:
static int foo(char *userdata) {
    vpiHandle systfref, args_iter, argh;
    struct t_vpi_value argval;
    PLI_BYTE8 *value;

    systfref = vpi_handle(vpiSysTfCall, NULL);
    args_iter = vpi_iterate(vpiArgument, systfref);

    argval.format = vpiStringVal;
    argh = vpi_scan(args_iter);
    vpi_get_value(argh, &argval);
    value = argval.value.str;

    int i;
    for (i = 0; i < 32; i++) {
        vpi_printf("%.2x ", value[i]);
    }
    vpi_printf("\n");

    vpi_free_object(args_iter);
    return 0;
}
As you can see, this code reads the argument as a string and then prints out each character (aka byte) in the string. This works almost perfectly. However, the byte 00 always gets read as 20. For example, if I assign the Verilog reg as follows:
val = 256'h000102030405060708090a0b0c0d0e0f101112131415161718191a1b1c1d1e1f;
And call it using $foo(val), then the C function prints this at simulation time:
VPI: 20 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e 1f
I have tested this with many different values and have found that the byte 00 always gets mapped to 20, no matter where or how many times it appears in val.
Also, note that if I read the value in as a vpiHexStrVal, and print the string, it looks fine.
So, two questions:
Is there a better way to read in my 256-bit value from the Verilog?
What's going on with the 20? Is this a bug? Am I missing something?
Note: I am using Aldec for simulation.
vpiStringVal is used when the value is expected to be ASCII text, in order to get the value as a pointer to a C string. This is useful if you want to use it with C functions that expect a C string, such as printf() with the %s format, fopen(), etc. However, C strings cannot contain the null character (since null is used to terminate C strings), and also cannot represent x or z bits, so this is not a format that should be used if you need to distinguish any possible vector value. It looks like the simulator you are using formats the null character as a space (0x20); other simulators just skip them, but that doesn't help you either. To distinguish any possible vector value use either vpiVectorVal (the most compact representation) or vpiBinStrVal (a binary string with one 0/1/x/z character for each bit).
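For reference, a sketch of reading the same argument with vpiVectorVal (my addition, based on the standard VPI structures and reusing argh from the code above; word 0 of the vector holds bits [31:0], aval carries the value bits and bval flags x/z bits):
s_vpi_value vecval;
vecval.format = vpiVectorVal;
vpi_get_value(argh, &vecval);

PLI_INT32 nbits  = vpi_get(vpiSize, argh);   /* 256 for this reg  */
PLI_INT32 nwords = (nbits + 31) / 32;        /* 8 words of 32 bits */

/* Print the most significant word first to match the Verilog hex literal. */
for (PLI_INT32 w = nwords - 1; w >= 0; w--)
    vpi_printf("%08x ", (unsigned)vecval.value.vector[w].aval);
vpi_printf("\n");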

Resources