Parse data using C header files that define the structures - c

I have a C header file like this:
#define NAME_LEN 8
#define DEV_MAX 4
typedef struct __device
{
    int iDevID;
    int iDevSN;
} DEVICE;
typedef struct __person
{
    int iID;
    char acName[NAME_LEN];
    DEVICE aDevices[DEV_MAX];
} PERSON;
and a binary data file that might look like this:
0000000 01 00 08 00 4a 61 63 6b 00 00 00 00 0a 00 00 00
0000020 11 11 11 11 0b 00 00 00 22 22 22 22 0c 00 00 00
0000040 33 33 33 33 0d 00 00 00 44 44 44 44
All I need is a visualized representation of the data with field names, using the C header file above.
Something like this would be ideal:
m--iID : 0x80001
m--acName : Jack
m--aDevices[]
|--aDevices[0]
|--|--iDevID : 0xa
|--|--iDevSN : 0x11111111
|--aDevices[1]
|--|--iDevID : 0xb
|--|--iDevSN : 0x22222222
|--aDevices[2]
|--|--iDevID : 0xc
|--|--iDevSN : 0x33333333
|--aDevices[3]
|--|--iDevID : 0xd
|--|--iDevSN : 0x44444444
or as other structured data: XML, Python pickle, JSON strings, whatever.
Of course, the header file I'm actually facing is far more complicated. There will be a msgtype and a msglenth field in the data, so I can find out which structure is the correct one and how long it is.

How badly do you need that?
A possible solution might be to write a GCC plugin or a MELT extension (MELT is a domain-specific language for extending GCC), but to do that you'll need to understand GCC's internal representations in some detail (notably Tree, and perhaps Gimple), and that will take you some time (days, not hours).
If your declarations are simpler, perhaps consider using SWIG (or maybe the RPCXDR parser), but that supposes that you are able to change or simplify them.

If the binary format were identical to the memory layout of your structure, you could just cast it, no parsing required (with some caveats). However, that evidently isn't what you mean, since your hex dump and sample output don't match that interpretation.
You'll need to actually explain your format though: as described below, it isn't obvious.
You seem to have fixed-length 4-octet integers in little-endian order, OK.
If I assume variable length strings with a nul-terminator, 4a 61 63 6b 00 = acName:"Jack" and 0a 00 00 00 = iDevID:0x0a looks ok, but there is a 3-octet sequence between them I don't know the meaning of.
Or is Jack not nul-terminated, in which case it's fixed at 4 characters long and not the 8 you defined for NAME_LEN? That would make 00 6f 70 65 another 4-byte integer, but I still don't know what it means.
...
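If the format turns out to be fixed-layout, a hand-written decoder is one way to get from the bytes to named fields. The following is only a sketch under stated assumptions: 4-byte little-endian integers, a fixed 8-byte name field, and the 44 bytes of the dump exactly as printed in the question. The field layout is hard-coded rather than derived from the header, and read_le32, parse_person and print_person are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define NAME_LEN 8
#define DEV_MAX 4
/* Assumed wire size: 4 (iID) + NAME_LEN + DEV_MAX * (4 + 4) = 44 bytes */
#define PERSON_WIRE_SIZE ((size_t)(4 + NAME_LEN + DEV_MAX * 8))

typedef struct { int32_t iDevID; int32_t iDevSN; } DEVICE;
typedef struct {
    int32_t iID;
    char acName[NAME_LEN];
    DEVICE aDevices[DEV_MAX];
} PERSON;

/* Assemble a 32-bit value from four little-endian bytes. */
uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decode one record; returns 0 on success, -1 if the buffer is too short. */
int parse_person(const unsigned char *buf, size_t len, PERSON *out)
{
    assert(buf != NULL && out != NULL);
    if (len < PERSON_WIRE_SIZE)
        return -1;
    out->iID = (int32_t)read_le32(buf);
    memcpy(out->acName, buf + 4, NAME_LEN);
    for (int i = 0; i < DEV_MAX; ++i) {
        const unsigned char *d = buf + 4 + NAME_LEN + 8 * i;
        out->aDevices[i].iDevID = (int32_t)read_le32(d);
        out->aDevices[i].iDevSN = (int32_t)read_le32(d + 4);
    }
    return 0;
}

/* Print the record in the tree layout sketched in the question. */
void print_person(const PERSON *p)
{
    printf("m--iID : 0x%x\n", (unsigned)p->iID);
    /* The precision bounds the read in case the name is not NUL-terminated. */
    printf("m--acName : %.*s\n", NAME_LEN, p->acName);
    printf("m--aDevices[]\n");
    for (int i = 0; i < DEV_MAX; ++i) {
        printf("|--aDevices[%d]\n", i);
        printf("|--|--iDevID : 0x%x\n", (unsigned)p->aDevices[i].iDevID);
        printf("|--|--iDevSN : 0x%x\n", (unsigned)p->aDevices[i].iDevSN);
    }
}
```

Calling parse_person on the bytes of the dump and passing the result to print_person should reproduce the tree from the question. Decoding byte by byte keeps the result independent of the host's endianness and struct padding, at the cost of repeating the layout by hand.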

Related

Converting AMF Number from char to double an back

In the context of a protocol, I receive messages in AMF format.
The AMF Object Type "Number" is defined as
number-type = number-marker DOUBLE
The data following a Number type marker is always an 8 byte IEEE-754 double [...] in network byte order.
The following Examples are captured using Wireshark:
Hex: 40 00 00 00 00 00 00 00
Number: 2
Hex: 40 08 00 00 00 00 00 00
Number: 3
Hex: 3f f0 00 00 00 00 00 00
Number: 1
I tried to treat these as double, long long and int64_t, but none of these types seems to use the correct order/format.
The implementation needs to be in C, so I can't use any libraries (there are none, as it seems).
What would be the correct approach?
Likely your platform supports 8-byte IEEE-754 doubles but requires them to be in little-endian format. Your examples are in big-endian format. If you store them in an aligned array of unsigned characters from last to first and cast the pointer to a double *, you should get the right value.
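As a variation on the cast described above, the eight bytes can be assembled into a 64-bit integer with shifts and then copied into a double with memcpy, which sidesteps alignment and aliasing concerns. This is a minimal sketch, assuming the host's double is a 64-bit IEEE-754 value stored with the same byte order as its 64-bit integers; amf_number_decode and amf_number_encode are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Decode an 8-byte big-endian (network order) IEEE-754 double. */
double amf_number_decode(const unsigned char bytes[8])
{
    uint64_t bits = 0;
    double value;
    assert(sizeof value == sizeof bits);  /* assumption: 64-bit double */
    for (int i = 0; i < 8; ++i)
        bits = (bits << 8) | bytes[i];    /* most significant byte first */
    memcpy(&value, &bits, sizeof value);  /* reinterpret the bit pattern */
    return value;
}

/* Encode a double as 8 bytes in big-endian (network) order. */
void amf_number_encode(double value, unsigned char bytes[8])
{
    uint64_t bits;
    assert(sizeof value == sizeof bits);
    memcpy(&bits, &value, sizeof bits);
    for (int i = 7; i >= 0; --i) {
        bytes[i] = (unsigned char)(bits & 0xff);  /* least significant byte last */
        bits >>= 8;
    }
}
```

With the Wireshark captures from the question, the bytes 40 00 00 00 00 00 00 00 decode to 2, 40 08 00 00 00 00 00 00 to 3, and 3f f0 00 00 00 00 00 00 to 1.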

How to print data in a buffer irrespective of zeros in between till the given length

I wrote a simple program to read a packet up to layer 3 and print it in hex format.
I gave the input in hex format. My output should be the same as this.
Input:
45 00 00 44 ad 0b 00 00 40 11 72 72 ac 14 02 fd ac 14
00 06 e5 87 00 35 00 30 5b 6d ab c9 01 00 00 01
00 00 00 00 00 00 09 6d 63 63 6c 65 6c 6c 61 6e
02 63 73 05 6d 69 61 6d 69 03 65 64 75 00 00 01
00 01
I am able to read the packet. Here is the hex dump in gdb:
(gdb) p packet
$1 = 0x603240 "E"
(gdb) x/32x 0x603240
0x603240: 0x00440045 0x00000000 0x00400b0e 0x00000000
0x603250: 0x00603010 0x0035e587 0xe3200030 0x63206261
0x603260: 0x31302039 0x20303020 0x30203030 0x30302031
0x603270: 0x20303020 0x30203030 0x30302030 0x20303020
0x603280: 0x36203930 0x33362064 0x20333620 0x36206336
0x603290: 0x63362035 0x20633620 0x36203136 0x32302065
0x6032a0: 0x20333620 0x30203337 0x64362035 0x20393620
0x6032b0: 0x36203136 0x39362064 0x20333020 0x36203536
But when I try to print the packet to the console using %s, I can't see the whole packet because of the zeros in between. I want to print it up to the length of the packet (which I pass as an input to the print function).
output on console is:
packet: E
My print function is something like this.
void print(char *packet, int len) {
    printf("packet: ");
    printf("%s\n\n", packet);
}
Can you tell me another way to print the packet up to len (the input to the print function)?
PS: I haven't finished reading the L3 information, so the L3 information in the gdb dump of the packet differs from my input.
A string in C is defined as a sequence of characters ending with '\0' (a 0-byte), and the %s conversion specifier of printf() is for strings. So the solution to your problem is doing something else for printing the binary bytes. If you want for example to print their hexadecimal values, you could change your print function like this:
void print(unsigned char *packet, size_t len)
{
    for (size_t i = 0; i < len; ++i)
        printf("%02hhx ", packet[i]);
    puts("");
}
Note I also changed the types here:
char can be signed. If you want to handle raw bytes, it's better to always use unsigned char.
int might be too small for a size, always use size_t which is guaranteed to hold any size possible on your platform.
If you really want to print encoded characters (which is unlikely with binary data), you can use %c in printf(), or use the putchar() function, or fwrite() the whole chunk to stdout.
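If the hex text is needed as a string (for logging, say) rather than written straight to stdout, the same length-driven loop can fill a caller-supplied buffer. A minimal sketch; hex_format and its return convention are hypothetical and not part of the answer above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Format len bytes as two-digit hex values separated by single spaces.
   Returns the number of characters written (excluding the terminating
   NUL), or -1 if out_size is smaller than 3 * len + 1. */
int hex_format(const unsigned char *data, size_t len, char *out, size_t out_size)
{
    size_t pos = 0;
    assert(out != NULL && (data != NULL || len == 0));
    if (out_size < 3 * len + 1)
        return -1;
    out[0] = '\0';
    for (size_t i = 0; i < len; ++i)
        /* The loop is driven by len, so zero bytes are formatted like any other. */
        pos += (size_t)snprintf(out + pos, out_size - pos, "%s%02x",
                                i ? " " : "", (unsigned)data[i]);
    return (int)pos;
}
```

For the first four bytes of the packet in the question (45 00 00 44), this fills the buffer with "45 00 00 44". To emit the raw bytes unchanged instead, fwrite(packet, 1, len, stdout) writes exactly len bytes regardless of their values.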

Examine arguments on stack with lldb

I am using lldb to trace through some plain C or C++ code (32 bit) that calls CoreFoundation functions such as CFRunLoopTimerCreate.
I've set a breakpoint on CFRunLoopTimerCreate and would like to examine the passed arguments.
How do I do that? frame variable is not working here (it prints nothing) as it's not in an ObjC context.
I guess I'll have to use the x command somehow to look at the memory above sp but whatever syntax I try, I keep getting error messages.
So, basically, what's the syntax for examining memory at an address a register points to? Also, is there a better way to look at arguments on the stack?
x is actually shorthand for the memory read command. You can choose the word size, e.g. this:
memory read --format x --size 4 --count 8 `$esp - 32`
This shows 32 bytes (eight 4-byte words) starting at $esp - 32, formatted as hexadecimal numbers. Note that the i386 stack grows downward, so the values currently on the stack, including the arguments, are at $esp and above; adjust the address expression accordingly. This format might make it easier if you're looking for pointer values, etc. The argument to --format can also be d for decimal output. --outfile lets you specify a file path to which to write the memory contents, which may be more useful for large amounts. Surround expressions to evaluate with backticks (`).
Is this what you are looking for?
(lldb) x $sp-10
0x7fff5cd3eda6: 00 00 86 0a ec 02 01 00 00 00 00 00 00 00 00 00 ................
0x7fff5cd3edb6: 00 00 00 00 00 00 00 00 00 00 90 94 33 75 ff 7f ............3u..
Registers are generally addressed as $rax etc.
You might also wish to check out this earlier question for some hints on shortening lldb memory read commands: Dump memory in lldb

Get specific byte from M68k ram address with C language

Through the IDA disassembler I've reached this address:
0010FD74 00 00 00 00 00 00 03 00 00 00 00 00 82 03 80 02
Now I need, given the address to get particular bytes; for example the 7th position where there is "03".
I've tried using C language to do this:
char *dummycharacter;
*dummycharacter = *(char*)0x10FD74;
Now if I try to access 7th value with this:
dummycharacter[6]
I don't get 0x03…where am I going wrong?
You're assigning to the location dummycharacter points to, which is pretty much nowhere, since the pointer is not initialized. Assign the address to the pointer itself instead: dummycharacter = (char*)0x10FD74;
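A small helper makes the intent explicit. This is only a sketch: peek_byte is a hypothetical name, and a local array stands in for the M68k memory when trying it on a host machine. Using unsigned char avoids sign extension of bytes such as 0x82, and volatile asks the compiler to perform the read each time.

```c
#include <assert.h>
#include <stddef.h>

/* Read the byte located offset bytes past base. */
unsigned char peek_byte(const volatile unsigned char *base, size_t offset)
{
    assert(base != NULL);
    return base[offset];
}
```

On the target, the call for the 7th byte would be peek_byte((const volatile unsigned char *)0x10FD74, 6), assuming that address is readable from the running program.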

How can I access members of a struct when it's not aligned properly?

I'm afraid that I'm not very good at low level C stuff, I'm more used to using objects in Obj-c, so please excuse me if this is an obvious question, or if I've completely misunderstood something...
I am attempting to write an application in Cocoa/Obj-C which communicates with an external bit of hardware (a cash till.) I have the format of the data the device sends and receives - and have successfully got some chunks of data from the device.
For example: the till exchanges PLU (price data) in chunks of data in the following format: (from the documentation)
Name              Bytes  Type
PLU code h        4      long
PLU code L        4      long
desc              19     char
Group             3      char
Status            5      char
PLU link code h   4      long
PLU link code l   4      long
M&M Link          1      char
Min. Stock.       2      int
Price 1           4      long
Price 2           4      long
Total             54 Bytes
So I have a struct in the following form in which to hold the data from the till:
typedef struct MFPLUStructure {
    UInt32 pluCodeH;
    UInt32 pluCodeL;
    unsigned char description[19];
    unsigned char group[3];
    unsigned char status[5];
    UInt32 linkCodeH;
    UInt32 linkCodeL;
    unsigned char mixMatchLink;
    UInt16 minStock;
    UInt32 price[2];
} MFPLUStructure;
I have some known sample data from the till (below) which I have checked by hand and is valid
00 00 00 00 4E 61 BC 00 54 65 73 74 20 50 4C 55 00 00 00 00 00 00 00 00 00 00 00 09 08 07 17 13 7C 14 04 00 00 00 00 09 03 00 00 05 BC 01 7B 00 00 00 00 00 00 00
i.e.
bytes 46 to 49 are <7B 00 00 00> == 123, as I would expect, as the price is set to '123' on the till.
byte 43 is <05> == 5, as I would expect, as the 'mix and match link' is set to 5 on the till.
bytes 39 to 42 are <09 03 00 00> == 777, as I would expect, as the 'link code' is set to '777' on the till.
bytes 27, 28 and 29 are <09 08 07>, which are the three groups (7, 8 & 9) that I would expect.
The problem comes when I try to get some of the data out of the structure programmatically: The early members work correctly right up to, and including the five 'status' bytes. However, members after that don't come out properly. (see debugger screenshot below.)
Image 1 - http://i.stack.imgur.com/nOdER.png
I assume that the reason for this is because the five status bytes push later members out of alignment - i.e. they are over machine word boundaries. Is this right?
Image 2 - i.imgur.com/ZhbXU.png
Am I right in making that assumption?
And if so, how can I get the members in and out correctly?
Thanks for any help.
Either access the data a byte at a time and assemble it into larger types, or memcpy it into an aligned variable. The latter is better if the data is known to be in a format specific to the host's endianness, etc. The former is better if the data follows an external specification that might not match the host.
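The byte-at-a-time approach could look like the sketch below. It assumes the till sends multi-byte integers in little-endian order, which is what the sample data suggests (7B 00 00 00 == 123), and uses the offsets from the documented 54-byte layout. PluRecord, plu_le16, plu_le32 and plu_decode are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PLU_WIRE_SIZE ((size_t)54)

/* Host-side record; the compiler may pad it freely because it is
   never overlaid on the wire bytes. */
typedef struct {
    uint32_t pluCodeH, pluCodeL;
    unsigned char description[19];
    unsigned char group[3];
    unsigned char status[5];
    uint32_t linkCodeH, linkCodeL;
    unsigned char mixMatchLink;
    uint16_t minStock;
    uint32_t price[2];
} PluRecord;

/* Assemble little-endian integers one byte at a time. */
uint16_t plu_le16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

uint32_t plu_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decode one 54-byte PLU chunk; returns 0 on success, -1 if too short. */
int plu_decode(const unsigned char *buf, size_t len, PluRecord *out)
{
    assert(buf != NULL && out != NULL);
    if (len < PLU_WIRE_SIZE)
        return -1;
    out->pluCodeH = plu_le32(buf + 0);
    out->pluCodeL = plu_le32(buf + 4);
    memcpy(out->description, buf + 8, 19);
    memcpy(out->group, buf + 27, 3);
    memcpy(out->status, buf + 30, 5);
    out->linkCodeH = plu_le32(buf + 35);
    out->linkCodeL = plu_le32(buf + 39);
    out->mixMatchLink = buf[43];
    out->minStock = plu_le16(buf + 44);
    out->price[0] = plu_le32(buf + 46);
    out->price[1] = plu_le32(buf + 50);
    return 0;
}
```

Because every field is read from an explicit offset, the result does not depend on how the compiler pads PluRecord or on the byte order of the host.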
If you're sure that endianness of host and wire agree, you could also use a packed structure to read the data in a single pass. However, this is compiler-specific and will most likely impact performance of member access.
Assuming gcc, you'd use the following declarations:
struct __attribute__ ((__packed__)) MFPLUStructure { ... };
typedef struct MFPLUStructure MFPLUStructure;
If you decide to use a packed structure, you should also verify that it is of correct size:
assert(sizeof (MFPLUStructure) == 54);
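The size check can also be moved to compile time, together with checks on the offsets of the members that follow the five status bytes. This is a sketch assuming GCC or Clang and a C11 compiler; PackedPLU is a hypothetical name, and the fixed-width types from <stdint.h> stand in for UInt32 and UInt16.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* GCC/Clang-specific: no padding is inserted between members. */
struct __attribute__((__packed__)) PackedPLU {
    uint32_t pluCodeH;
    uint32_t pluCodeL;
    unsigned char description[19];
    unsigned char group[3];
    unsigned char status[5];
    uint32_t linkCodeH;
    uint32_t linkCodeL;
    unsigned char mixMatchLink;
    uint16_t minStock;
    uint32_t price[2];
};

/* Compile-time checks against the documented 54-byte layout. */
static_assert(sizeof(struct PackedPLU) == 54, "PLU record must be 54 bytes");
static_assert(offsetof(struct PackedPLU, linkCodeH) == 35, "linkCodeH offset");
static_assert(offsetof(struct PackedPLU, minStock) == 44, "minStock offset");
static_assert(offsetof(struct PackedPLU, price) == 46, "price offset");
```

If the attribute is removed, the checks should fail to compile on typical targets, which makes the dependence on packing visible early.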

Resources