I got confused by the inet_pton() function. According to the man page:
This function converts the character string src into a network address structure in the af address family, then copies the network address structure to dst. The af argument must be either AF_INET or AF_INET6. dst is written in network byte order.
So the function produces a value in network byte order. But I have this code:
struct sockaddr_in a;
char sip[20];
inet_pton(AF_INET, "192.168.0.182", (void *)&a.sin_addr);
inet_ntop(AF_INET, (void *)&a.sin_addr, sip, 20);
printf("htonl:%08x\n", htonl(a.sin_addr.s_addr));
printf("inet_pton:%08x\n", a.sin_addr.s_addr);
printf("inet_ntop:%s\n", sip);
output:
htonl:c0a800b6
inet_pton:b600a8c0
inet_ntop:192.168.0.182
The output of inet_pton is b6.00.a8.c0, which corresponds to 182.0.168.192, and it is also different from the output of htonl.
Since htonl converts host byte order to network byte order, if inet_pton produces network byte order, shouldn't their outputs be the same? Does this mean that inet_pton actually produces host byte order?
And if inet_pton already produces network byte order, why do I need htonl to get the right value?
Yes, inet_pton puts the address bytes into the destination buffer in network order. Let's go through an example to see what happens. Using your address of "192.168.0.182", inet_pton produces these four bytes:
c0 a8 00 b6 (hex)
192 168 0 182 (dec)
That is network byte order. When you then call htonl (which is not actually correct -- you should be calling ntohl to convert from network order to host order, but as @ZanLynx pointed out, the two functions are identical on x86), you re-order the bytes to:
b6 00 a8 c0
But then you pass that form to printf with %x as the format. That tells printf to interpret the four bytes as a single 32-bit integer, but x86 is a little endian machine, so when it loads the four bytes as an integer, the b6 is the lowest order byte and the c0 is the highest, which produces what you saw:
htonl:c0a800b6
So, in general, if you have an IPv4 address in network form, and you want to quickly display it (for debugging or whatever) in an order that "makes sense" to you (as a programmer) you would use:
printf("%x\n", ntohl(a.sin_addr.s_addr));
You could also display the individual bytes as they reside in network order (which is really the exact same thing, but may be easier to wrap your head around); just use an unsigned char * (or, equivalently, uint8_t * from <stdint.h>) to print them:
uint8_t *ipp = (void *)&a.sin_addr.s_addr;
printf("%02x %02x %02x %02x\n", ipp[0], ipp[1], ipp[2], ipp[3]);
(You need to use an unsigned type here to avoid sign extension. In the above statement, each char will be promoted to an int in the call to printf. If you have a signed char containing, say, 0xc0, it will typically be sign-extended into a 32-bit int: 0xffffffc0. As an alternative, you can use the format specification "%02hhx" which explicitly tells printf that you are really passing it a char; then it will only look at the lowest order byte of each promoted int.)
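Putting those pieces together, here is a minimal, self-contained sketch (assuming a POSIX system with <arpa/inet.h>; the address is just the one from your question):
#include <stdio.h>
#include <stdint.h>
#include <arpa/inet.h>
#include <netinet/in.h>

int main(void)
{
    struct sockaddr_in a;
    char sip[INET_ADDRSTRLEN];

    /* inet_pton stores the address in network byte order. */
    inet_pton(AF_INET, "192.168.0.182", &a.sin_addr);

    /* ntohl gives a host-order integer that prints "as expected" with %x. */
    printf("ntohl    : %08x\n", ntohl(a.sin_addr.s_addr));

    /* Printing the raw bytes shows the network-order layout directly. */
    const uint8_t *ipp = (const void *)&a.sin_addr.s_addr;
    printf("bytes    : %02x %02x %02x %02x\n", ipp[0], ipp[1], ipp[2], ipp[3]);

    /* Round-trip back to presentation form. */
    inet_ntop(AF_INET, &a.sin_addr, sip, sizeof sip);
    printf("inet_ntop: %s\n", sip);
    return 0;
}
On a little-endian machine this should print c0a800b6 for the ntohl line and c0 a8 00 b6 for the raw bytes.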
I'm trying to decode a STUN success response based on RFC 5389:
If the IP address family is IPv6, X-Address is computed by taking the mapped IP address
in host byte order, XOR'ing it with the concatenation of the magic
cookie and the 96-bit transaction ID, and converting the result to
network byte order.
The magic cookie is the constant 0x2112A442.
Transaction ID in my case is: 0x6FA22B0D9C5F5AD75B6A4E43.
My X-Address (IPv6) in Host Byte Order is:
0x034A67D82F4B3657B193039A8BA8FDA1
Do I have to XOR the host-byte-order X-Address with the concatenation of the magic cookie and transaction ID in network byte order or in host byte order?
In the first case, the network-byte-order concatenation is:
0x2112A442 6FA22B0D9C5F5AD75B6A4E43
The first byte 0x03 is XORed with 0x21, and the last byte 0xA1 is XORed with 0x43.
But in the second case, the host-byte-order concatenation is:
0x434E6A5BD75A5F9C0D2BA26F 42A41221
The first byte 0x03 is XORed with 0x43, and the last byte 0xA1 is XORed with 0x21.
Another possibility is that the magic cookie and the transaction ID are each converted to host byte order separately but are concatenated in the order they appear in the header:
0x42A41221 434E6A5BD75A5F9C0D2BA26F
The first byte 0x03 is XORed with 0x42, and the last byte 0xA1 is XORed with 0x6F.
Everything is done in Network Byte Order.
But here's the thing: for IPv6 addresses, there is no difference between "host byte order" and "network byte order". IPv6 addresses are always understood to be an array of 16 bytes, and individual bytes don't have a "byte order". In C code we'd just express that IPv6 address as:
unsigned char ipv6address[16];
Or, in terms of the sockaddr_in6 struct:
struct sockaddr_in6 addr;
unsigned char* ipv6address = addr.sin6_addr.s6_addr; // points to a sequence of 16 bytes
Contrast that with IPv4 addresses, which are often passed around in code as 32-bit integers. In the IPv4 case, you often wind up having to invoke the htonl and ntohl functions.
Unless you are doing something like maintaining the IPv6 address as an array of eight 16-bit integers instead of an array of bytes, you shouldn't have to think about endianness and byte order too much. (As a matter of fact, I'd encourage you not to think about byte order with regard to 16-byte IP addresses.)
Example:
My IPv6 address is this:
2001:0000:9d38:6abd:347d:0d08:3f57:fefd
As an array of hex bytes that's logically written out as:
200100009d386abd347d0d083f57fefd
When my STUN server receives a binding request from this IPv6 address, it applies the following XOR operation to send back the XOR-MAPPED-ADDRESS. Let's assume it's the same transaction id as yours and it includes the magic cookie to indicate RFC 5389 support (2112A442 6FA22B0D9C5F5AD75B6A4E43)
XOR:
200100009D386ABD347D0D083F57FEFD
2112A4426FA22B0D9C5F5AD75B6A4E43
The result:
0113A442F29A41B0A82257DF643DB0BE
Similarly, the client receiving the STUN binding response applies the inverse XOR operation on this byte array with the same transaction id.
XOR:
0113A442F29A41B0A82257DF643DB0BE
2112A4426FA22B0D9C5F5AD75B6A4E43
The result:
200100009D386ABD347D0D083F57FEFD
You can reference the source code of Stuntman for an example of how to apply the XOR mapping operation; the relevant code is on GitHub. That code doesn't distinguish between transaction IDs with a magic cookie and those without; it just treats the transaction ID as a logical sequence of 16 bytes.
The above takes care of the IPv6 address. The 16-bit port value, however, does have to be byte-swapped if it is being treated as a short or 16-bit integer. In C code, that's typically handled with a call to ntohs.
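For concreteness, here is a rough sketch in C of that decoding step. The function names are made up for illustration; the only assumptions are that you already have the 16 address bytes of the XOR-MAPPED-ADDRESS and the 16 bytes of the magic cookie followed by the transaction ID, exactly as they appear on the wire:
#include <stdint.h>
#include <arpa/inet.h>   /* for ntohs */

/* Hypothetical helper: undo the XOR on the 16 address bytes in place.
 * xor_bytes is the 4-byte magic cookie immediately followed by the
 * 12-byte transaction ID, in wire order -- no byte swapping anywhere. */
static void unxor_ipv6_address(uint8_t addr[16], const uint8_t xor_bytes[16])
{
    for (int i = 0; i < 16; i++)
        addr[i] ^= xor_bytes[i];
}

/* The 16-bit X-Port, by contrast, is read from the wire as an integer,
 * so it does go through ntohs before being XORed with the most
 * significant 16 bits of the magic cookie (0x2112). */
static uint16_t unxor_port(uint16_t xport_as_received)
{
    return ntohs(xport_as_received) ^ 0x2112;
}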
I would like to confirm that any value in an IP header bigger than one byte (short, int, or their fixed-width equivalents such as int16_t) should be converted to big-endian using ntohs/ntohl etc. to send over the wire.
Does the kernel manage that under the hood when normal sockets are used, or is another technique used?
It is quite a mess, since some functions, like getting the IP address of an interface with ioctl, seem to already put the data in big-endian order when cast to sockaddr_in *. It prints my address as 36.2.168.192 (with printf's %d), but the ifreq gives it as 192.168.2.36.
code
int addr = ((struct sockaddr_in *)(&ifr.ifr_addr))->sin_addr.s_addr;
printf("%d %d %d %d ", (addr >> 24) & 255 , (addr >> 16) & 255,(addr >> 8) & 255, (addr) & 255);
gives me my ip address in the reverse order
whereas using
for (int _x = 0; _x < 14; ++_x) {
    printf("%d ", ifr.ifr_ifru.ifru_addr.sa_data[_x]);
}
will give me some zeros, then the IP address in the right order (192.168.2.36), followed by zeros.
Wow... I am lost.
Quite a jungle, if you ask me.
QUESTION
What do I need to convert to big-endian, and what not?
Best not to think of it as big-endian or little-endian, but rather host order (which may be either) and network order (which is big-endian). You are correct that in the IP standard, every field is in network order. You should use the ntohs and ntohl functions for converting network to host order, and the htons and htonl functions for converting host to network order. That way your code will also run correctly on a big-endian machine.
An IP address is normally stored internally in network order, in which case it can be converted to/from presentation format using inet_pton and inet_ntop. You thus don't normally need to play around with the storage format of these addresses unless you are manually applying netmasks etc. If you are doing this, the octets (bytes to you and me) are stored in the natural order, i.e. 111.222.33.44 is stored in the order 111, 222, 33 and 44. If you think about it, that's a big-endian order.
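As a concrete illustration of "convert the integers, leave the inet_pton output alone", here is a small sketch of filling in a sockaddr_in before sending (the port and address values are arbitrary examples):
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>

/* The port is a plain host-order integer, so it goes through htons.
 * The address comes straight from inet_pton, which already writes it
 * in network byte order, so it needs no further conversion. */
struct sockaddr_in make_peer_addr(void)
{
    struct sockaddr_in sa;
    memset(&sa, 0, sizeof sa);
    sa.sin_family = AF_INET;
    sa.sin_port   = htons(8080);                        /* host -> network */
    inet_pton(AF_INET, "192.168.2.36", &sa.sin_addr);   /* already network order */
    return sa;
}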
I have probably asked this question twice since yesterday, but I have still not gotten a satisfactory answer. My problem is that I have an IP address which is stored in an unsigned char. Now I want to send this IP address via a socket from client to server. People have advised me to use htonl() and ntohl() for network byte order, but I don't understand: the arguments to htonl() and ntohl() are integers, so how can I use them with an unsigned char? And if I can't use them, how can I make sure that if I send 130.191.166.230 in my buffer, the receiver will receive the same address every time? Any inputs or guidance will be appreciated. Thanks in advance.
If you have an unsigned char array string (along the lines of "10.0.0.7") forming the IP address (and I'm assuming you do since there are very few 32-bit char systems around, making it rather difficult to store an IP address into a single character), you can just send that through as it is and let the other end use it (assuming you both encode characters the same way of course, such as with ASCII).
On the other hand, you may have a four byte array of chars (assuming chars are eight bits) containing the binary IP address.
The use of htonl and ntohl is to ensure that this binary data is sent through in an order that both big-endian and little-endian systems can understand.
To that end, network byte order (the order of the bytes "on the wire") is big-endian so these functions basically do nothing on big-endian systems. On little-endian systems, they swap the bytes around.
In other words, you may have the following binary data:
uint32_t ipaddress = 0x0a010203; // for 10.1.2.3
In big endian layout that would be stored as 0x0a,0x01,0x02,0x03, in little endian as 0x03,0x02,0x01,0x0a.
So, if you want to send it in network byte order (that any endian system will be able to understand), you can't just do:
write (fd, &ipaddress, 4);
since sending that from little endian system to a big endian one will end up with the bytes reversed.
What you need to do is:
uint32_t ipaddress = 0x0a010203; // for 10.1.2.3
uint32_t ip_netorder = htonl (ipaddress); // change if necessary.
write (fd, &ip_netorder, 4);
That forces it to be network byte order which any program at the other end can understand (assuming it uses ntohl to ensure it's correct for its purposes).
In fact, this scheme can handle more than just big and little endian. If you have a 32-bit integer coding scheme where ABCD (four bytes) is encoded as A,D,B,C or even where you have a bizarrely wild bit mixture forming your integers (like using even bits first then odd bits), this will still work since your local htonl and ntohl know about those formats and can convert them correctly to network byte order.
An array of chars has a defined ordering and is not endian-dependent: its bytes always go from low to high addresses by convention.
Do you have a string or 4 bytes?
An IPv4 address is 4 bytes (i.e., chars), so you will have 4 unsigned chars in an array somewhere. Cast that array to send it across.
e.g. unsigned char IP[4];
Use ((char *)IP) as the data buffer, and send 4 bytes from it.
I am trying to write server that will communicate with any standard client that can make socket connections (e.g. telnet client)
It started out as an echo server, which of course did not need to worry about network byte ordering.
I am familiar with the ntohs, ntohl, htons, and htonl functions. These would be great by themselves if I were transferring either 16- or 32-bit ints, or if the characters in the string being sent were multiples of 2 or 4 bytes.
I'd like to create a function that operates on strings, such as:
str_ntoh(char* net_str, char* host_str, int len)
{
uint32_t* netp, hostp;
netp = (uint32_t*)&net_str;
for(i=0; i < len/4; i++){
hostp[i] = ntoh(netp[i]);
}
}
Or something similar. The above assumes that the word size is 32 bits. We can't be sure that the word size on the sending machine isn't 16 bits or 64 bits, right?
Client programs such as telnet must be using hton* before they send and ntoh* after they receive data, correct?
EDIT: For the people who think that because 1 char is a byte, endianness doesn't matter:
int main(void)
{
uint32_t a = 0x01020304;
char* c = (char*)&a;
printf("%x %x %x %x\n", c[0], c[1], c[2], c[3]);
}
Run this snippet of code. The output for me is as follows:
$ ./a.out
4 3 2 1
Those on PowerPC chips should get '1 2 3 4', but those of us on Intel chips should, for the most part, see what I got above.
Maybe I'm missing something here, but are you sending strings, that is, sequences of characters? Then you don't need to worry about byte order. That is only for the bit pattern in integers. The characters in a string are always in the "right" order.
EDIT:
Derrick, to address your code example, I've run the following (slightly expanded) version of your program on an Intel i7 (little-endian) and on an old Sun SPARC (big-endian):
#include <stdio.h>
#include <stdint.h>
int main(void)
{
uint32_t a = 0x01020304;
char* c = (char*)&a;
char d[] = { 1, 2, 3, 4 };
printf("The integer: %x %x %x %x\n", c[0], c[1], c[2], c[3]);
printf("The string: %x %x %x %x\n", d[0], d[1], d[2], d[3]);
return 0;
}
As you can see, I've added a real char array to your print-out of an integer.
The output from the little-endian Intel i7:
The integer: 4 3 2 1
The string: 1 2 3 4
And the output from the big-endian Sun:
The integer: 1 2 3 4
The string: 1 2 3 4
Your multi-byte integer is indeed stored in different byte order on the two machines, but the characters in the char array have the same order.
With your function signature as posted you don't have to worry about byte order. It accepts a char *, which can only handle 8-bit characters. With one byte per character, you cannot have a byte order problem.
You'd only run into a byte order problem if you send Unicode in UTF-16 or UTF-32 encoding and the endianness of the sending machine doesn't match that of the receiving machine. The simple solution is to use UTF-8 encoding, which is what most text is sent as across networks; being byte-oriented, it doesn't have a byte order issue either. Or you could send a BOM.
If you'd like to send them as an 8-bit encoding (the fact that you're using char implies this is what you want), there's no need to byte-swap. However, for the unrelated issue of non-ASCII characters, so that the same character > 127 appears the same on both ends of the connection, I would suggest that you send the data in something like UTF-8, which can represent all Unicode characters and can be safely treated like ASCII strings. The way to get UTF-8 text based on the default encoding varies by the platform and set of libraries you're using.
If you're sending 16-bit or 32-bit encoding... You can include one character with the byte order mark which the other end can use to determine the endianness of the character. Or, you can assume network byte order and use htons() or htonl() as you suggest. But if you'd like to use char, please see the previous paragraph. :-)
It seems to me that the function prototype doesn't match its behavior. You're passing in a char *, but you're then casting it to uint32_t *. And, looking more closely, you're casting the address of the pointer, rather than the contents, so I'm concerned that you'll get unexpected results. Perhaps the following would work better:
void arr_ntoh(uint32_t* netp, uint32_t* hostp, int len)
{
    int i;
    for (i = 0; i < len; i++)
        hostp[i] = ntohl(netp[i]);
}
I'm basing this on the assumption that what you've really got is an array of uint32_t and you want to run ntoh() on all of them.
I hope this is helpful.
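For what it's worth, a hypothetical caller of that helper might look like this (the buffer size and function name are just for illustration):
#include <stdint.h>
#include <arpa/inet.h>

void arr_ntoh(uint32_t *netp, uint32_t *hostp, int len);  /* as defined above */

/* buf holds 'count' 32-bit values exactly as received from the wire
 * (network byte order); vals receives them in host byte order. */
void handle_words(uint32_t *buf, int count)
{
    uint32_t vals[64];
    if (count > 64)
        count = 64;
    arr_ntoh(buf, vals, count);
    /* ... use vals[0] .. vals[count - 1] ... */
}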
I'm trying to convert a struct to a char array to send over the network. However, I get some weird output from the char array when I do.
#include <stdio.h>
struct x
{
int x;
} __attribute__((packed));
int main()
{
struct x a;
a.x=127;
char *b = (char *)&a;
int i;
for (i=0; i<4; i++)
printf("%02x ", b[i]);
printf("\n");
for (i=0; i<4; i++)
printf("%d ", b[i]);
printf("\n");
return 0;
}
Here is the output for various values of a.x (on an X86 using gcc):
127:
7f 00 00 00
127 0 0 0
128:
ffffff80 00 00 00
-128 0 0 0
255:
ffffffff 00 00 00
-1 0 0 0
256:
00 01 00 00
0 1 0 0
I understand the values for 127 and 256, but why do the numbers change when going to 128? Why wouldn't it just be:
80 00 00 00
128 0 0 0
Am I forgetting to do something in the conversion process or am I forgetting something about integer representation?
*Note: This is just a small test program. In a real program I have more in the struct, better variable names, and I convert to little-endian.
What you see is the sign-preserving conversion from char to int. The behavior results from the fact that on your system char is signed (note: char is not signed on all systems). That will lead to negative values whenever a bit pattern corresponds to a negative value for a char. Promoting such a char to an int preserves the sign, so the int will be negative too. Note that even without an explicit (int) cast, the compiler automatically promotes the character to an int when passing it to printf. The solution is to convert your value to unsigned char first:
for (i=0; i<4; i++)
printf("%02x ", (unsigned char)b[i]);
Alternatively, you can use unsigned char* from the start on:
unsigned char *b = (unsigned char *)&a;
And then you don't need any cast at the time you print it with printf.
The x format specifier by itself says that the argument is an int, and since the number is negative, printf requires eight characters to show all four non-zero bytes of the int-sized value. The 0 modifier tells printf to pad the output with zeros, and the 2 modifier says that the minimum output should be two characters long. As far as I can tell, printf doesn't provide a way to specify a maximum width, except for strings.
Now then, you're only passing a char, so bare x tells the function to use the full int that got passed instead (due to default argument promotion for "..." parameters). Try the hh modifier to tell the function to treat the argument as just a char instead:
printf("%02hhx", b[i]);
char is a signed type on your platform, so with two's complement, 0x80 is -128 for an 8-bit integer (i.e., a byte).
Reinterpreting your struct as a char array and sending it over the network as-is is not portable; to send it over the network, use proper serialization instead. It's a pain in C++ and even more so in C, but it's the only way your app will work independently of the machines reading and writing.
http://en.wikipedia.org/wiki/Serialization#C
Converting your structure to characters or bytes the way you're doing it is going to lead to issues when you try to make it network-neutral. Why not address that problem now? There are a variety of different techniques you can use, all of which are likely to be more "portable" than what you're trying to do. For instance:
Sending numeric data across the network in a machine-neutral fashion has long been dealt with, in the POSIX/Unix world, via the functions htonl, htons, ntohl and ntohs. See, for example, the byteorder(3) manual page on a FreeBSD or Linux system.
Converting data to and from a completely neutral representation like JSON is also perfectly acceptable. The amount of time your programs spend converting the data between JSON and native forms is likely to pale in comparison to the network transmission latencies.
char is a signed type here, so what you are seeing is the two's-complement representation; casting to (unsigned char *) will fix that (Rowland just beat me to it).
On a side note, you may want to change
for (i=0; i<4; i++) {
//...
}
to
for (i=0; i<sizeof(a); i++) {
//...
}
The signedness of the char array is not the root of the problem! (It is a problem, but not the only one.)
Alignment! That's the key word here. That's why you should NEVER try to treat structs like raw memory. Compilers (and various optimization flags), operating systems, and phases of the moon all do strange and exciting things to the actual location in memory of "adjacent" fields in a structure. For example, if you have a struct with a char followed by an int, the whole struct will typically be EIGHT bytes in memory -- the char, 3 blank, useless padding bytes, and then 4 bytes for the int. The machine likes to do things like this so struct members sit on properly aligned boundaries.
Take an introductory course on machine architecture at your local college. Meanwhile, serialize properly. Never treat structs like char arrays.
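If you want to see the padding described above for yourself, here is a quick sketch (the 8/4 numbers are what a typical x86-64 compiler produces; the C standard does not mandate them):
#include <stdio.h>
#include <stddef.h>

struct mixed {
    char c;   /* 1 byte                            */
    int  i;   /* typically 4 bytes, 4-byte aligned */
};

int main(void)
{
    /* On a typical x86-64 compiler this prints "size=8 offset_of_i=4",
     * i.e. three padding bytes were inserted after the char. */
    printf("size=%zu offset_of_i=%zu\n",
           sizeof(struct mixed), offsetof(struct mixed, i));
    return 0;
}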
When you go to send it, just use:
(char*)&CustomPacket
to convert. Works for me.
You may want to convert to an unsigned char array.
Unless you have very convincing measurements showing that every octet is precious, don't do this. Use a readable ASCII protocol like SMTP, NNTP, or one of the many other fine Internet protocols codified by the IETF.
If you really must have a binary format, it's still not safe just to shove out the bytes in a struct, because the byte order, basic sizes, or alignment constraints may differ from host to host. You must design your wire protocol to use well-defined sizes and a well-defined byte order. For your implementation, either use macros like ntohl(3) or use shifting and masking to put bytes into your stream. Whatever you do, make sure your code produces the same results on both big-endian and little-endian hosts.
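As a sketch of the shifting-and-masking approach mentioned above (one way to do it, not the only one):
#include <stdint.h>

/* Write a 32-bit value into a byte buffer in network (big-endian) order,
 * one byte at a time.  This produces the same bytes on any host,
 * regardless of its native byte order or struct layout. */
static void put_u32_be(uint8_t *out, uint32_t v)
{
    out[0] = (uint8_t)(v >> 24);
    out[1] = (uint8_t)(v >> 16);
    out[2] = (uint8_t)(v >> 8);
    out[3] = (uint8_t)v;
}

/* The matching read on the receiving side. */
static uint32_t get_u32_be(const uint8_t *in)
{
    return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16)
         | ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
}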