Printing actual bit representation of integers in C

I wanted to print the actual bit representation of integers in C. These are the two approaches that I found.
First:
union int_char {
    int val;
    unsigned char c[sizeof(int)];
} data;

data.val = n1;
// printf("Integer: %p\nFirst char: %p\nLast char: %p\n", &data.val, &data.c[0], &data.c[sizeof(int)-1]);
for (int i = 0; i < sizeof(int); i++)
    printf("%.2x", data.c[i]);
printf("\n");
Second:
for (int i = 0; i < 8 * sizeof(int); i++) {
    int j = 8 * sizeof(int) - 1 - i;
    printf("%d", (val >> j) & 1);
}
printf("\n");
Running both, the outputs are 00000002 and 02000000. I also tried other numbers, and it seems that the bytes are swapped between the two. Which one is correct?

Welcome to the exotic world of endian-ness.
Because we write numbers most significant digit first, you might imagine the most significant byte is stored at the lower address.
The electrical engineers who build computers are more imaginative.
Sometimes they store the most significant byte first, but on your platform it's the least significant byte that comes first.
There are even platforms where it's all a bit mixed up - but you'll rarely encounter those in practice.
So we talk about big-endian and little-endian for the most part. It's a joke about Gulliver's Travels, where there's a pointless war over which end of a boiled egg to start at, which is itself a satire of some disputes in the Christian Church. But I digress.
Because your first snippet looks at the value as a series of bytes, it encounters them in the platform's byte order.
But because >> is defined as operating on the bits of the value, it works 'logically', without regard to how those bits are laid out in memory.
C is right not to define the byte order: hardware that didn't match whichever model C chose would be burdened with the overhead of shuffling bytes around endlessly and pointlessly.
Sadly there isn't a built-in identifier telling you which model your platform uses, though code that detects it can be found.
It will become relevant to you if (a), as above, you want to break integer types down into bytes and manipulate them, or (b) you receive files from other platforms containing multi-byte structures.
Unicode offers something called a BOM (Byte Order Marker) in UTF-16 and UTF-32.
In fact, a good reason (among many) for using UTF-8 is that the problem goes away, because each code unit is a single byte.
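As noted above, there's no built-in identifier for the byte order, but detecting it at run time takes only a few lines. A minimal sketch (it assumes int is wider than one byte and ignores exotic mixed-endian layouts):
#include <stdio.h>

int main(void)
{
    unsigned int probe = 1;                          /* only the least significant byte is non-zero */
    unsigned char first = *(unsigned char *)&probe;  /* the byte stored at the lowest address */

    if (first == 1)
        puts("little-endian");
    else
        puts("big-endian (or something more exotic)");
    return 0;
}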
Footnote:
It's been pointed out quite fairly in the comments that I haven't told the whole story.
The C language specification admits more than one representation of integers, and particularly of signed integers: specifically sign-and-magnitude, two's complement and ones' complement.
It also permits 'padding bits' that don't represent part of the value.
So in principle along with tackling endian-ness we need to consider representation.
In principle, that is. All modern computers use two's complement, and extant machines that use anything else are very rare; unless you have a genuine requirement to support such platforms, I recommend assuming you're on a two's-complement system.

The correct hex representation as a string is 00000002, as if you had declared the integer with a hex literal:
int n = 0x00000002; // n = 2
or as you would get when printing the integer as hex, as in:
printf("%08x", n);
But when printing the integer's bytes one after the other, you must also consider endianness, which is the byte order in multi-byte integers:
On a big-endian system (some UNIX systems use it) the 4 bytes will be ordered in memory as:
00 00 00 02
While on a little-endian system (most common platforms) the bytes will be ordered in memory as:
02 00 00 00
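A short way to see both views at once (a sketch; the memory line comes out as 02 00 00 00 on a little-endian machine and 00 00 00 02 on a big-endian one):
#include <stdio.h>

int main(void)
{
    int n = 0x00000002;
    unsigned char *p = (unsigned char *)&n;

    printf("value as hex:    %08x\n", n);   /* always 00000002 */
    printf("bytes in memory:");
    for (size_t i = 0; i < sizeof n; i++)
        printf(" %02x", p[i]);
    printf("\n");
    return 0;
}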

The first prints the bytes that represent the integer in the order they appear in memory. Platforms with different endianness will print different results because they store integers in different ways.
The second prints the bits that make up the integer value, most significant bit first. This result is independent of endianness. It is also independent of how the >> operator is implemented for signed ints, as it does not look at the bits that may be influenced by that implementation-defined behavior.
The second is a better match to the question "Printing actual bit representation of integers in C", although there is still a fair amount of ambiguity in what "actual bit representation" means.

It depends on your definition of "correct".
The first one will print the data exactly as it's laid out in memory, so I bet that's the one you're getting the maybe unexpected 02000000 for. *) IMHO, that's the correct one. It could be done more simply by aliasing with an unsigned char * directly (char pointers are always allowed to alias any other pointers; in fact, accessing representations is a use case for char pointers mentioned in the standard):
int x = 2;
unsigned char *rep = (unsigned char *)&x;
for (int i = 0; i < sizeof x; ++i) printf("0x%hhx ", rep[i]);
The second one will print only the value bits **) and take them in the order from the most significant byte to the least significant one. I wouldn't call it correct because it also assumes that bytes have 8 bits, and because the shifting used is implementation-defined for negative numbers. ***) Furthermore, just ignoring padding bits doesn't seem correct either if you really want to see the representation.
edit: As commented by Gerhardh meanwhile, this second code doesn't print byte by byte but bit by bit. So, the output you claim to see isn't possible. Still, it's the same principle, it only prints value bits and starts at the most significant one.
*) You're on a "little endian" machine. On these machines, the least significant byte is stored first in memory. Read more about Endianness on wikipedia.
**) Representations of types in C may also have padding bits. Some types aren't allowed to include padding (like char), but int is allowed to have them. This second option doesn't alias to char, so the padding bits remain invisible.
***) A correct version of this code (for printing all the value bits) must a) correctly determine the number of value bits (8 * sizeof(int) is wrong because bytes (char) can have more than 8 bits, and even CHAR_BIT * sizeof(int) is wrong, because this would also count padding bits if present) and b) avoid the implementation-defined shifting behavior by first converting to unsigned. It could look for example like this:
#include <stdio.h>

#define IMAX_BITS(m) ((m) /((m)%0x3fffffffL+1) /0x3fffffffL %0x3fffffffL *30 \
                      + (m)%0x3fffffffL /((m)%31+1)/31%31*5 + 4-12/((m)%31+3))

int main(void)
{
    int x = 2;
    for (unsigned mask = 1U << (IMAX_BITS((unsigned)-1) - 1); mask; mask >>= 1)
    {
        putchar((unsigned) x & mask ? '1' : '0');
    }
    puts("");
}
See this answer for an explanation of this strange macro.

Related

"Bit-fields are assigned left to right on some machines and right to left on others"- unable to get the concept from "The C Programming Language" book

I was going through the text "The C Programming Language" by Kernighan and Ritchie. While discussing bit-fields at the end of that section, the authors say:
"Fields are assigned left to right on some machines and right to left on others. This means that although fields are useful for maintaining internally-defined data structures, the question of which end comes first has to be carefully considered when picking apart externally-defined data; programs that depend on such things are not portable."
- The C Programming Language [2e] by Kernighan & Ritchie [Section 6.9, p.150]
Strictly speaking, I do not get the meaning of these lines. Can anyone please explain them to me, possibly with a diagram?
PS: Well, I have taken a computer organization and architecture course. I know how computers deal with bits and bytes. In a computer system, the smallest unit of information is a single bit, which can be either 0 or 1. 8 such bits form a byte. Memories are byte-addressable, which means that each byte in the memory has an address associated with it. But usually, processors have word lengths of 2 bytes (very old systems), 4 bytes, 8 bytes... This means in one memory cycle the CPU can take a word-length number of bytes from the main memory and put it inside its registers. Now how these bytes are placed in registers depends on the endianness of the system.
But I do not get what the authors mean by "left to right" or "right to left". The words seem like they are related to the endianness but endianness depends on the CPU and C compilers have nothing to do with it... The question which comes to my mind is "left to right" of "what"? What object are the authors referring to?
When a structure contains bit-fields, the C implementation uses some storage unit to hold them (or multiple storage units if needed). The storage unit might be one eight-bit byte or it might be four bytes, or it might be other sizes—this is a determination made by each C implementation. The C standard only requires that it be addressable, which effectively means it has to be a whole number of bytes.
Once we have a storage unit, it is some number of bits. Say it is 32 bits, and number the bits from 31 to 0, where, if we consider the bits to represent a binary numeral, bit 0 represents 2^0 and bit 31 represents 2^31. Note that Kernighan and Ritchie are imprecise to use “left” and “right” here. There is no inherent left or right. We usually write numerals with the most significant digits on the left, so we might consider bit 31 to be the leftmost and bit 0 to be the rightmost.
Now we have a storage unit with some number of bits and some labeling for those bits (31 to 0 or left to right). Say you want to put two bit-fields in them, say fields of width 7 and 5.
Which 7 of the bits from bit 31 to bit 0 are used for the first field? Which 5 of the bits are used for the second field?
We could use bits 31-25 for the first field and bits 24-20 for the second field. Or we could use bits 6-0 for the first field and bits 11-7 for the second field.
In theory, we could also use bits 27-21 for the first field and bits 15-11 for the second field. However, the C standard does say that “If enough space remains, a bit-field that immediately follows another bit-field in a structure shall be packed into adjacent bits of the same unit” (C 2018 6.7.2.1 11). “Adjacent” is not formally defined, but we can assume it means consecutively numbered bits. So, if the C implementation puts the first field in bits 31-25, it is required to put the second field in bits 24-20. Conversely, if it puts the first field in bits 6-0, it must put the second field in bits 11-7.
Thus, the C standard requires an implementation to arrange successive bit-fields in a storage unit from left-to-right or from right-to-left, but it does not say which.
(I do not see anything in the standard that says the first field must start at one end of the storage unit or the other, rather than somewhere in the middle. That would lead to wasting some bits.)
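To see which allocation order your own compiler picks, a small experiment along these lines can help. This is only a sketch: it assumes the two fields share one 32-bit storage unit that an unsigned int can overlay, and it reads the raw bits back by type punning through a union.
#include <stdio.h>

struct two_fields {
    unsigned a : 7;   /* first declared field, width 7  */
    unsigned b : 5;   /* second declared field, width 5 */
};

int main(void)
{
    union {
        struct two_fields f;
        unsigned raw;
    } u = { .raw = 0 };

    u.f.a = 0x7F;              /* set every bit of the first field */
    printf("%08x\n", u.raw);   /* 0000007f if 'a' got the low bits, fe000000 if it got the high bits */
    return 0;
}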
When you write:
struct {
    unsigned int version: 4;
    unsigned int length: 4;
    unsigned char dcsn;
};
you end up with a big headache you weren't expecting because your code is non-portable.
When you set version to 4 and length to 5, some systems may set the first byte of the structure to 0x45 and other systems may set the first byte of the structure to 0x54.
When I went to college this thing was #ifdef'd as follows (incorrect):
struct {
#if BIG_ENDIAN
    unsigned int version: 4;
    unsigned int length: 4;
#else
    unsigned int length: 4;
    unsigned int version: 4;
#endif
    unsigned char dcsn;
};
but this is still rolling the dice, as there's no rule that the order of the bits within the bytes of a bit-field corresponds to the order of bytes in the machine's word. I would not be surprised if, when you cross-compile, the bit order in the struct came from the host machine's rules while the bit order of integers came from the target machine's rules (as it must). In theory the code could be corrected by having a separate #ifdef for BIG_ENDIAN_BITFIELD, but I've never seen it done.
Here is some demonstration code. The only goal is to demonstrate what you are asking about. Clean coding etc. is neglected.
#include <stdio.h>
#include <stdint.h>

union
{
    uint32_t Everything;
    struct
    {
        uint32_t FirstMentionedBit : 1;
        uint32_t FewOtherBits      : 30;
        uint32_t LastMentionedBit  : 1;
    } bitfield;
} Demonstration;

int main(void)
{
    Demonstration.Everything = 0;
    Demonstration.bitfield.LastMentionedBit = 1;
    printf("%x\n", Demonstration.Everything);

    Demonstration.Everything = 0;
    Demonstration.bitfield.FirstMentionedBit = 1;
    printf("%x\n", Demonstration.Everything);

    return 0;
}
If you use this here https://www.tutorialspoint.com/compile_c_online.php
the output is
80000000
1
But in other environments it might easily be
1
80000000
This is because compilers are free to consider the first mentioned bit the MSB or the LSB and correspondingly the last mentioned bit to be the LSB or MSB.
And that is what your quote describes.

What does casting char* do to a reference of an int? (Using C)

In my course for intro to operating systems, our task is to determine if a system is big- or little-endian. I've found plenty of results on how to do it, and I've done my best to reconstruct my own version of the code. I suspect it's not the best way of doing it, but it seems to work:
#include <stdio.h>

int main() {
    int a = 0x1234;
    unsigned char *start = (unsigned char*) &a;
    int len = sizeof( int );
    if( start[0] > start[ len - 1 ] ) {
        //biggest in front (Little Endian)
        printf("1");
    } else if( start[0] < start[ len - 1 ] ) {
        //smallest in front (Big Endian)
        printf("0");
    } else {
        //unable to determine with set value
        printf( "Please try a different integer (non-zero). " );
    }
}
I've seen this line of code (or some version of) in almost all answers I've seen:
unsigned char *start = (unsigned char*) &a;
What is happening here? I understand casting in general, but what happens if you cast an int to a char pointer? I know:
unsigned int *p = &a;
assigns the memory address of a to p, and that you can affect the value of a by dereferencing p. But I'm totally lost as to what's happening with the char, and more importantly, I'm not sure why my code works.
Thanks for helping me with my first SO post. :)
When you cast between pointers of different types, the result is generally implementation-defined (it depends on the system and the compiler). There are no guarantees that you can access the object through the resulting pointer, that it is correctly aligned, etc.
But for the special case when you cast to a pointer to character, the standard actually guarantees that you get a pointer to the lowest addressed byte of the object (C11 6.3.2.3 §7).
So the compiler will implement the code you have posted in such a way that you get a pointer to the least significant byte of the int. As we can tell from your code, that byte may contain different values depending on endianess.
If you have a 16-bit CPU, the char pointer will point at memory containing 0x12 in case of big endian, or 0x34 in case of little endian.
For a 32-bit CPU, the int would contain 0x00001234, so you would get 0x00 in case of big endian and 0x34 in case of little endian.
If you dereference an integer pointer, you read sizeof(int) bytes of data (4 bytes on typical gcc targets, but it depends on the compiler). If you want only one byte, cast that pointer to a character pointer and dereference it; you will get one byte of data. The cast tells the compiler how many bytes to read, and how to interpret them, instead of using the original data type's size.
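A minimal sketch of that difference (the byte values in the comments assume a 4-byte int; which single byte *cp reads depends on the machine's endianness):
#include <stdio.h>

int main(void)
{
    int a = 0x11223344;
    int *ip = &a;
    unsigned char *cp = (unsigned char *)&a;

    printf("%x\n", *ip);   /* reads all sizeof(int) bytes: 11223344 */
    printf("%x\n", *cp);   /* reads one byte: 44 on little-endian, 11 on big-endian */
    return 0;
}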
Values stored in memory are a set of '1's and '0's which by themselves do not mean anything. Datatypes are used for recognizing and interpreting what the values mean. So lets say, at a particular memory location, the data stored is the following set of bits ad infinitum: 01001010 ..... By itself this data is meaningless.
A pointer (other than a void pointer) contains 2 pieces of information. It contains the starting position of a set of bytes, and the way in which the set of bits are to be interpreted. For details, you can see: http://en.wikipedia.org/wiki/C_data_types and references therein.
So if you have
a char *c,
a short int *i,
and a float *f
which look at the bits mentioned above, c, i, and f are the same, but *c takes the first 8 bits and interprets them in a certain way. So you can do things like printf("The character is %c", *c). On the other hand, *i takes the first 16 bits and interprets them in a certain way. In this case, it will be meaningful to say printf("The value is %d", *i). Again, for *f, printf("The value is %f", *f) is meaningful.
The real differences come when you do math with these. For example,
c++ advances the pointer by 1 byte,
i++ advances it by 2 bytes (the size of a short int),
and f++ advances it by 4 bytes (the size of a float), on typical platforms.
More importantly, for
(*c)++, (*i)++, and (*f)++ the algorithm used for doing the addition is totally different.
In your question, when you do a cast from one pointer type to another, you already know that the algorithm you are going to use for manipulating the bits at that location will be easier if you interpret those bits as an unsigned char rather than an unsigned int. The same operators +, -, etc. will act differently depending upon what datatype the operators are looking at. If you have worked on physics problems where a coordinate transformation makes the solution very simple, then this is the closest analog to that operation. You are transforming one problem into another that is easier to solve.

Most efficient way to store an unsigned 16-bit Integer to a file

I'm making a dictionary compressor in C with dictionary max size 64000. Because of this, I'm storing my entries as 16-bit integers.
What I'm currently doing:
To encode 'a', I get its ASCII value, 97, and then convert this number into a string representation of the 16-bit integer of 97. So I end up encoding '0000000001100001' for 'a', which obviously isn't saving much space in the short run.
I'm aware that more efficient versions of this algorithm would start with smaller integer sizes (less bits of storage until we need more), but I'm wondering if there's a better way to either
Convert my integer '97' into an ASCII string of fixed length that can store 16 bits of data (97 would be x digits, 46347 would also be x digits)
Write to a file that can ONLY store 1s and 0s? Because as it is, it seems like I'm writing 16 ASCII characters to a text file, each of which is 8 bits... so that's not really helping the cause much, is it?
Please let me know if I can be more clear in any way. I'm pretty new to this site. Thank you!
EDIT: How I store my dictionary is entirely up to me as far as I know. I just know that I need to be able to easily read the encoded file back and get the integers from it.
Also, I can only include stdio.h, stdlib.h, string.h, and header files I wrote for the program.
Please do ignore the people suggesting that you just "write the integers directly to the file" with fwrite or what-not; there are a number of issues with that, which ultimately fall into the category of "integer representation". There are some solid facts in play here.
The bottleneck is the external storage controller. Either that, or the network, if you're writing a network application. Thus, writing two bytes as a single fwrite, or as two distinct fputcs, should be roughly the same speed, providing your memory profile is adequate for your platform. You can adjust the amount of buffering your FILE *s use to a degree with setvbuf, so you can always fine-tune per platform based on what your profilers tell you, though such tuning hints should probably be fed gently upstream to the standard library so other projects can benefit too.
Underlying integer representations are inconsistent between today's computers. Suppose you write unsigned ints directly to a file on system X, which uses 32-bit ints and big-endian representation; you'll end up with issues reading that file on system Y, which uses 16-bit ints and little-endian representation, or on system Z, which uses 64-bit ints with mixed-endian representation and 32 padding bits. Nowadays we have everything from 15-year-old computers that people still torture themselves with, to ARM big.LITTLE SoCs, smartphones and smart TVs, gaming consoles and PCs, all of which have their own quirks that fall outside the realm of standard C, especially with regard to integer representation, padding and so on.
C was developed with abstractions in mind that allow you to express your algorithm portably, so that you don't have to write different code for each OS! Here's an example of reading and converting four hex digits to an unsigned int value, portably:
unsigned int value;
int value_is_valid = fscanf(fd, "%04x", &value) == 1;
assert(value_is_valid); // #include <assert.h>
/* NOTE: Actual error correction should occur in place of that
* assertion
*/
I should point out the reason why I chose %04x and not %08x or something more contemporary... judging by the questions still being asked today, there are unfortunately students using textbooks and compilers that are over 20 years old... Their int is 16-bit and, technically, their compilers are compliant in that respect (though gcc and llvm really ought to be pushed throughout academia). With portability in mind, here's how I'd write that value:
value &= 0xFFFF;
fprintf(fd, "%04x", value);
// side-note: We often don't check the return value of fprintf, but it can
// become very important, particularly when dealing with streams and large files...
Supposing your unsigned int values occupy two bytes, here's how I'd read those two bytes, portably, using big endian representation:
int hi = fgetc(fd);
int lo = fgetc(fd);
unsigned int value = 0;
assert(hi >= 0 && lo >= 0); // again, proper error detection & handling logic should be here
value += hi & 0xFF; value <<= 8;
value += lo & 0xFF;
... and here's how I'd write those two bytes, in their big endian order:
fputc((value >> 8) & 0xFF, fd);
fputc(value & 0xFF, fd);
// and you might also want to check this return value (perhaps in a finely tuned end product)
Perhaps you're more interested in little endian. The neat thing is, the code really isn't that different. Here's input:
int lo = fgetc(fd);
int hi = fgetc(fd);
unsigned int value = 0;
assert(hi >= 0 && lo >= 0);
value += hi & 0xFF; value <<= 8;
value += lo & 0xFF;
... and here's output:
fputc(value & 0xFF, fd);
fputc((value >> 8) & 0xFF, fd);
For anything larger than two bytes (i.e. a long unsigned or long signed), you might want to fwrite((char unsigned[]){ value >> 24, value >> 16, value >> 8, value }, 1, 4, fd); or something for example, to reduce boilerplate. With that in mind, it doesn't seem abusive to form a preprocessor macro:
#define write(fd, ...) fwrite((char unsigned[]){ __VA_ARGS__ }, 1, sizeof ((char unsigned[]) { __VA_ARGS__ }), fd)
I suppose one might look at this like choosing the better of two evils: preprocessor abuse or the magic number 4 in the code above, because now we can write(fd, value >> 24, value >> 16, value >> 8, value); without the 4 being hard-coded... but a word for the uninitiated: side-effects might cause headaches, so don't go causing modifications, writes or global state changes of any kind in arguments of write.
Well, that's my update to this post for the day... Socially delayed geek person signing out for now.
What you are contemplating is using ASCII characters to save your numbers; this is completely unnecessary and quite inefficient.
The most space-efficient way to do this (without utilizing complex algorithms) would be to just dump the bytes of the numbers into the file (the number of bits per entry would depend on the largest number you intend to save), or to have multiple files for 8-bit, 16-bit values, etc.
Then, when you read the file, you know that your numbers each occupy a fixed number of bits, so you just read them out one by one, or in big chunks which you then treat as an array of a type that matches that width.
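For example, here is one way to "just dump the bytes" while still pinning down the byte order yourself. This is a sketch that sticks to stdio.h and stores each entry as exactly two bytes, low byte first; the file name, the helper names and the byte order are arbitrary choices, and error handling is omitted:
#include <stdio.h>

/* Write one dictionary entry (0..65535) as exactly two bytes, low byte first. */
static void put_entry(unsigned value, FILE *f)
{
    unsigned char bytes[2];
    bytes[0] = value & 0xFF;          /* low byte  */
    bytes[1] = (value >> 8) & 0xFF;   /* high byte */
    fwrite(bytes, 1, 2, f);
}

/* Read one entry back, assuming the same two-byte layout. */
static unsigned get_entry(FILE *f)
{
    unsigned char bytes[2] = { 0, 0 };
    fread(bytes, 1, 2, f);
    return bytes[0] | ((unsigned)bytes[1] << 8);
}

int main(void)
{
    FILE *f = fopen("entries.bin", "wb");
    put_entry(97, f);       /* the entry for 'a' */
    put_entry(46347, f);
    fclose(f);

    f = fopen("entries.bin", "rb");
    unsigned first = get_entry(f);
    unsigned second = get_entry(f);
    fclose(f);
    printf("%u %u\n", first, second);   /* prints: 97 46347 */
    return 0;
}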

Please help me with using memcpy() in C

I have the following code:
#include <stdio.h>
#include <string.h>

int main(void)
{
    char tmp[3] = "AB";
    short k;
    memcpy(&k, tmp, 2);
    printf("%x\n", k);
    return 0;
}
In ASCII, the hex value of char 'A' is 41 and the hex value of char 'B' is 42. Why is the result of this program 4241? I think the correct result is 4142.
You are apparently running this on a "little-endian" machine, where the least significant byte comes first. See http://en.wikipedia.org/wiki/Endianness.
Your platform stores less significant bytes of a number at smaller memory addresses, and more significant bytes at higher memory addresses. Such platforms are called little-endian platforms.
However, when you print a number the more significant digits are printed first while the less significant digits are printed later (which is how our everyday numeric notation works). For this reason the result looks "reversed" compared to the way it is stored in memory on a little-endian platform.
If you compile and run the same program on a big-endian platform, the output should be 4142 (assuming a platform with 2-byte short).
P.S. One can argue that the "problem" in this case is the "weirdness" of our everyday numerical notation: we write numbers so that the significance of their digits increases in the right-to-left direction. This appears to be inconsistent in the context of societies that write and read in the left-to-right direction. In other words, it is not the little-endian memory that is reversed. It is the way we write numbers that is reversed.
Your system is little-endian. That means that a short (16-bit integer) is stored with the least significant byte first, followed by the most significant byte.
The same goes for larger integers. The following code would result in "44434241".
#include <stdio.h>
#include <string.h>

int main(void)
{
    char tmp[5] = "ABCD";
    int k;
    memcpy(&k, tmp, 4);
    printf("%x\n", k);
    return 0;
}

How to convert struct to char array in C

I'm trying to convert a struct to a char array to send over the network. However, I get some weird output from the char array when I do.
#include <stdio.h>

struct x
{
    int x;
} __attribute__((packed));

int main()
{
    struct x a;
    a.x = 127;
    char *b = (char *)&a;
    int i;
    for (i = 0; i < 4; i++)
        printf("%02x ", b[i]);
    printf("\n");
    for (i = 0; i < 4; i++)
        printf("%d ", b[i]);
    printf("\n");
    return 0;
}
Here is the output for various values of a.x (on an X86 using gcc):
127:
7f 00 00 00
127 0 0 0
128:
ffffff80 00 00 00
-128 0 0 0
255:
ffffffff 00 00 00
-1 0 0 0
256:
00 01 00 00
0 1 0 0
I understand the values for 127 and 256, but why do the numbers change when going to 128? Why wouldn't it just be:
80 00 00 00
128 0 0 0
Am I forgetting to do something in the conversion process or am I forgetting something about integer representation?
*Note: This is just a small test program. In a real program I have more in the struct, better variable names, and I convert to little-endian.
*Edit: formatting
What you see is the sign-preserving conversion from char to int. This behavior results from the fact that on your system char is signed (note: char is not signed on all systems). That will lead to negative values if a bit pattern yields a negative value for a char. Promoting such a char to an int will preserve the sign, and the int will be negative too. Note that even if you don't write an (int) cast explicitly, the compiler will automatically promote the character to an int when passing it to printf. The solution is to convert your value to unsigned char first:
for (i=0; i<4; i++)
printf("%02x ", (unsigned char)b[i]);
Alternatively, you can use unsigned char* from the start on:
unsigned char *b = (unsigned char *)&a;
And then you don't need any cast at the time you print it with printf.
The x format specifier by itself says that the argument is an int, and since the number is negative, printf requires eight characters to show all four non-zero bytes of the int-sized value. The 0 modifier tells to pad the output with zeros, and the 2 modifier says that the minimum output should be two characters long. As far as I can tell, printf doesn't provide a way to specify a maximum width, except for strings.
Now then, you're only passing a char, so bare x tells the function to use the full int that got passed instead — due to default argument promotion for "..." parameters. Try the hh modifier to tell the function to treat the argument as just a char instead:
printf("%02hhx", b[i]);
char is a signed type on your platform; so with two's complement, 0x80 is -128 for an 8-bit integer (i.e. a byte)
Treating your struct as if it were a char array is undefined behavior. To send it over the network, use proper serialization instead. It's a pain in C++ and even more so in C, but it's the only way your app will work independently of the machines reading and writing.
http://en.wikipedia.org/wiki/Serialization#C
Converting your structure to characters or bytes the way you're doing it is going to lead to issues when you try to make it network-neutral. Why not address that problem now? There are a variety of different techniques you can use, all of which are likely to be more "portable" than what you're trying to do. For instance:
Sending numeric data across the network in a machine-neutral fashion has long been dealt with, in the POSIX/Unix world, via the functions htonl, htons, ntohl and ntohs. See, for example, the byteorder(3) manual page on a FreeBSD or Linux system; a short sketch follows this list.
Converting data to and from a completely neutral representation like JSON is also perfectly acceptable. The amount of time your programs spend converting the data between JSON and native forms is likely to pale in comparison to the network transmission latencies.
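For the first option, here is a minimal sketch of pushing a 16-bit field through network byte order. It assumes a POSIX system for <arpa/inet.h>, and the value is arbitrary:
#include <stdio.h>
#include <string.h>
#include <stdint.h>
#include <arpa/inet.h>   /* htons / ntohs on POSIX systems */

int main(void)
{
    uint16_t length = 0x1234;          /* some field you want to transmit */
    unsigned char wire[2];

    uint16_t net = htons(length);      /* host order -> big-endian network order */
    memcpy(wire, &net, sizeof net);    /* these two bytes are now safe to send as-is */

    uint16_t back;                     /* receiving side */
    memcpy(&back, wire, sizeof back);
    printf("%x\n", ntohs(back));       /* prints 1234 on any host */
    return 0;
}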
char is a signed type on your platform, so what you are seeing is the two's complement representation; casting to (unsigned char *) will fix that (Rowland just beat me).
On a side note you may want to change
for (i=0; i<4; i++) {
//...
}
to
for (i=0; i<sizeof a; i++) {
//...
}
The signedness of the char array is not the root of the problem! (It is -a- problem, but not the only problem.)
Alignment! That's the key word here. That's why you should NEVER try to treat structs like raw memory. Compilers (and various optimization flags), operating systems, and phases of the moon all do strange and exciting things to the actual location in memory of "adjacent" fields in a structure. For example, if you have a struct with a char followed by an int, the whole struct will typically be EIGHT bytes in memory -- the char, 3 blank, useless padding bytes, and then 4 bytes for the int. The machine likes to do things like this so struct members sit on properly aligned boundaries in memory, and the like.
Take an introductory course to machine architecture at your local college. Meanwhile, serialize properly. Never treat structs like char arrays.
When you go to send it, just use:
(char*)&CustomPacket
to convert. Works for me.
You may want to convert to an unsigned char array.
Unless you have very convincing measurements showing that every octet is precious, don't do this. Use a readable ASCII protocol like SMTP, NNTP, or one of the many other fine Internet protocols codified by the IETF.
If you really must have a binary format, it's still not safe just to shove out the bytes in a struct, because the byte order, basic sizes, or alignment constraints may differ from host to host. You must design your wire protocol to use well-defined sizes and a well-defined byte order. For your implementation, either use macros like ntohl(3) or use shifting and masking to put bytes into your stream. Whatever you do, make sure your code produces the same results on both big-endian and little-endian hosts.
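For the shifting-and-masking route, here is a minimal sketch for the struct from the question. It picks big-endian wire order arbitrarily, assumes the int field fits in 32 bits, and pack_x/unpack_x are made-up names:
#include <stdio.h>

struct x { int x; };

/* Serialize the field into a fixed, big-endian wire order. */
static void pack_x(const struct x *in, unsigned char out[4])
{
    unsigned v = (unsigned)in->x;
    out[0] = (v >> 24) & 0xFF;
    out[1] = (v >> 16) & 0xFF;
    out[2] = (v >> 8)  & 0xFF;
    out[3] = v & 0xFF;
}

/* Reverse the process on the receiving side. */
static void unpack_x(const unsigned char in[4], struct x *out)
{
    out->x = (int)(((unsigned)in[0] << 24) | ((unsigned)in[1] << 16) |
                   ((unsigned)in[2] << 8)  | (unsigned)in[3]);
}

int main(void)
{
    struct x a = { 128 }, b;
    unsigned char wire[4];

    pack_x(&a, wire);
    unpack_x(wire, &b);
    printf("%d\n", b.x);   /* prints 128, regardless of host byte order */
    return 0;
}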
