I am using existing code that passes a `union ibv_gid` over a TCP connection without converting the endianness. There is a comment inside: "The gid will be transferred byte by byte so no need to do hton". The code is correct and works, but I don't understand why the data is said to be passed byte by byte (they actually pass the entire struct) and why there is no need for endianness conversion. The data type they pass is:
union ibv_gid {
    uint8_t raw[16];
    struct {
        uint64_t subnet_prefix;
        uint64_t interface_id;
    } global;
};
** For other data types (such as int) they do convert the data before sending and after receiving.
// VL_sock_sync_data synchronizes the two sides by exchanging data between them:
// size bytes are copied from out_buf to the other side and saved in in_buf.
rc = VL_sock_sync_data(sock_p, sizeof(union ibv_gid), &local_gid, &tmp_gid);
Can you please explain why there is no need for endianness conversion?
Thank you for any help
The reasoning seems to be that there's no need for endianness conversion because the GID (in its canonical representation) is not two 64-bit integers; it is 16 bytes.
The complication is that two systems with different endianness will see different values in the subnet_prefix and interface_id fields. So if they were to write those values to strings, send the strings back and forth, and compare them, that would be a problem. If they were to compare GIDs based on which one had a greater subnet_prefix, and expected the comparison to be the same between systems, that would be a problem. If one generated only consecutive interface_ids, and the other expected them to be consecutive, that would be a problem. But as long as they're only being used as opaque arrays of bytes, there's no problem.
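To make this concrete, here is a minimal sketch (the union definition is copied from the question; `gid_equal` is a hypothetical helper, not part of the verbs API) showing why treating the GID as an opaque 16-byte array is endian-safe: copying or comparing `raw` byte by byte never reinterprets a multi-byte integer, so both sides see the same canonical bytes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

union ibv_gid {
    uint8_t raw[16];
    struct {
        uint64_t subnet_prefix;
        uint64_t interface_id;
    } global;
};

/* Endian-safe: compares the canonical byte representation,
 * never the host-order view of the 64-bit fields. */
static int gid_equal(const union ibv_gid *a, const union ibv_gid *b)
{
    return memcmp(a->raw, b->raw, sizeof a->raw) == 0;
}
```

By contrast, comparing the numeric values of `global.subnet_prefix` on two hosts of different endianness would give host-dependent results.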
I have been working on software for the Pebble. It is the first time I have worked with C, and I am struggling to get my head around how to manage data within the program.
I am used to being able to have multi-dimensional arrays with thousands of entries. With the Pebble we are very limited.
I can talk to the requirements for my program, but happy to see any sort of discussion on the topic.
The application I am building needs to store a running feed of data with every button press. Ideally I would like to store one binary value and two small integer values with each press. I would like to take advantage of the local storage on the Pebble which is limited to 256 bytes per array which presents a challenge.
I have thought about using a custom struct - and having multiple arrays of those, making sure to check that each array doesn't exceed the 256 byte mark. It just seems really messy and complicated to manage... am I missing something fundamentally simple, or does it need to be this complicated?
At the moment my program only stores the binary value and I haven't bothered with the small integer values at all.
Perhaps you could define structures as follows:
#pragma pack(push, 1)
typedef struct STREAM_RECORD_S
{
    unsigned short uint16;      // The uint16 field will store a number from 0-65535
    unsigned short uint15 : 15; // The uint15 field will store a number from 0-32767
    unsigned short binary : 1;  // The binary field will store a number from 0-1
} STREAM_RECORD_T;

typedef struct STREAM_BLOCK_S
{
    struct STREAM_BLOCK_S *nextBlock; // Store a pointer to the next block.
    STREAM_RECORD_T records[1];       // Array of records for this block.
} STREAM_BLOCK_T;
#pragma pack(pop)
The actual number of records in the array would depend on the size of the nextBlock pointer: 4 bytes with 32-bit addressing, 2 bytes with 16-bit addressing, or 8 bytes with 64-bit addressing. (The ARM Cortex-M3 is a 32-bit core, so its pointers are 4 bytes.)
So, recordsPerArray = (256 - sizeof(nextBlock)) / sizeof(STREAM_RECORD_T);
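As a hedged sketch of how those sizes work out (assuming a compiler that honors `#pragma pack` and packs the two bit-fields into a single `unsigned short`, making each record 4 bytes; `records_per_array` is a name I made up for the calculation above):

```c
#include <assert.h>
#include <stddef.h>

#pragma pack(push, 1)
typedef struct STREAM_RECORD_S {
    unsigned short uint16;      /* 2 bytes */
    unsigned short uint15 : 15; /* 15 bits...            */
    unsigned short binary : 1;  /* ...plus 1 bit = 2 bytes */
} STREAM_RECORD_T;

typedef struct STREAM_BLOCK_S {
    struct STREAM_BLOCK_S *nextBlock;
    STREAM_RECORD_T records[1];
} STREAM_BLOCK_T;
#pragma pack(pop)

/* Records that fit in a 256-byte block alongside the next-block pointer. */
static size_t records_per_array(void)
{
    return (256 - sizeof(void *)) / sizeof(STREAM_RECORD_T);
}
```

On a 32-bit target that gives (256 - 4) / 4 = 63 records per block.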
I want to translate a message from host byte order to network byte order using htonl() and htons(). In this message there are some complex data types like structures, enums, unions, and unions within unions.
Do I have to call htonl()/htons() on every structure member, and on members of members, including any union members that are multi-byte?
For a union, can I just translate the largest member?
For an enum, can I just translate it as a long?
Can I write one function that uses htonl()/htons() for both sending and receiving a message? Or do I have to come up with another one that uses ntohl()/ntohs() for receiving the same message?
Structures
typedef struct {
    unsigned short un1_s;
    unsigned char  un1_c;
    union {
        unsigned short un1_u_s;
        unsigned long  un1_u_l;
    } u;
} UN1;

typedef struct {
    unsigned short un2_s1;
    unsigned short un2_s2;
} UN2;

typedef enum {
    ONE,
    TWO,
    THREE,
    FOUR
} ENUM_ID;

typedef struct {
    unsigned short s_sid;
    unsigned int   i_sid;
    unsigned char  u_char;
    ENUM_ID        i_enum;
    union {
        UN1 un1;
        UN2 un2;
    } u;
} MSG;
Code
void msgTranslate(MSG* in_msg, MSG* out_msg){
    /* ignore the code validating pointers ... */
    *out_msg = *in_msg;
#ifdef LITTLE_ENDIAN
    /* translating the message */
    out_msg->s_sid = htons(in_msg->s_sid); /* short */
    out_msg->i_sid = htonl(in_msg->i_sid); /* int */
    /* Can I simply leave out_msg->u_char unconverted,
     * because it is a single byte? */
    out_msg->i_enum = htonl(in_msg->i_enum);
    /* Can I simply translate an enum this way? */
    /* For a union whose first member is larger than the
     * others, can I just translate the first one,
     * leaving the others unconverted? */
    out_msg->u.un1.un1_s = htons(in_msg->u.un1.un1_s);
    /* For the inner union, whose second member (un1_u_l) is the
     * largest, can I just convert that one, leaving the others
     * unconverted? */
    out_msg->u.un1.u.un1_u_s = htons(in_msg->u.un1.u.un1_u_s); /* short */
    /* As above: can the line above be removed,
     * just because un1_u_s is smaller than un1_u_l? */
    out_msg->u.un1.u.un1_u_l = htonl(in_msg->u.un1.u.un1_u_l); /* long */
    /* Since un1 is larger than un2, can the conversion of un2 be skipped? */
    ...
#endif
    return;
}
You will need to map every multi-byte type appropriately.
For a union, you need to identify which is the 'active' element of the union, and map that according to the normal rules. You may also need to provide a 'discriminator' which tells the receiving code which of the various possibilities was transmitted.
For enum, you could decide that all such values will be treated as a long and encode and decode accordingly. Alternatively, you can deal with each enum separately, handling each type according to its size (where, in theory, different enums could have different sizes).
It depends a bit on what you're really going to do next. If you're packaging data for transmission over the network, then the receive and send operations are rather different. If all you're doing is flipping bytes in a structure in memory, then you will probably find that on most systems, applying htonl() to the result of htonl() gives you back the number you first thought of. If you're planning to do a binary copy of all the bytes in the mapped (flipped) structure, you're probably not doing it right.
Note that your data structures have various padding holes in them on most plausible systems. In structure UN1, you almost certainly have a padding byte between un1_c and the following union u, if it is a 32-bit system; you'd probably have 5 bytes padding there if it is a 64-bit system. Similarly, in the MSG structure, you have probably got 2 padding bytes after s_sid, and 3 more after u_char. Depending on the size of the enum (and whether you're on a 32-bit or 64-bit machine), you might have 1-7 bytes of padding after i_enum.
Note that because you do not have platform independent sizes for the data types, you cannot reliably interwork between 32-bit and 64-bit Unix systems. If the systems are all Windows, then you get away with it since sizeof(long) == 4 on both 32-bit and 64-bit Windows. However, on essentially all 64-bit variants of Unix, sizeof(long) == 8. So, if working cross-platform is an issue, you have to worry about those sizes as well as the padding. Investigate the types in the <inttypes.h> header such as uint16_t and uint32_t.
You should simply do the same packing on all hosts, carefully copying the bytes of the various values into the appropriate place in a character buffer, which is then sent over the wire and unpacked by the inverse coding.
Also check out whether Google's Protocol Buffers would do the job for you sensibly; it might save you a fair amount of pain and grief.
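A minimal sketch of that pack/unpack idea (the helper names are mine, not from any library): each value is written to the buffer most-significant byte first, so the wire format is identical regardless of host endianness and there are no padding holes to worry about.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pack a 16-bit or 32-bit value into a buffer in network
 * (big-endian) byte order, independent of host endianness.
 * Each function returns the number of bytes written. */
static size_t pack_u16(uint8_t *buf, uint16_t v)
{
    buf[0] = (uint8_t)(v >> 8);
    buf[1] = (uint8_t)(v & 0xff);
    return 2;
}

static size_t pack_u32(uint8_t *buf, uint32_t v)
{
    buf[0] = (uint8_t)(v >> 24);
    buf[1] = (uint8_t)(v >> 16);
    buf[2] = (uint8_t)(v >> 8);
    buf[3] = (uint8_t)(v & 0xff);
    return 4;
}

/* The inverse coding, used by the receiver. */
static uint16_t unpack_u16(const uint8_t *buf)
{
    return (uint16_t)((buf[0] << 8) | buf[1]);
}

static uint32_t unpack_u32(const uint8_t *buf)
{
    return ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
           ((uint32_t)buf[2] << 8)  | buf[3];
}
```

A sender would pack `s_sid`, `i_sid`, a discriminator byte, and then the active union member field by field; the receiver unpacks in the same order.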
You have to endian-flip any integer that is longer than 1 byte (short, int, long, long long).
No. See below.
No. enum might be any size, depending on your platform (see What is the size of an enum in C?).
Realistically, you should just use Protocol Buffers or something instead of trying to do all of this conversion...
Unions are hard to handle. Say, for instance, I store the value 0x1234 in the short of a union {short; long;} in big-endian. Then, the union contains the bytes 12 34 00 00, since the short occupies the low two bytes of the union. If you endian-flip the long, you get 00 00 34 12, which produces the short 0x0000. If you endian-flip the short, you get 34 12 00 00. I'm not sure which one you would consider correct, but it's pretty clear that you have a problem.
It's more typical to have two shorts in a union like that, with one short being the low halfword and the other short being the high halfword. Which one is which depends on endianness, but you can do
union {
    struct {
#ifdef LITTLE_ENDIAN
        uint16_t s_lo, s_hi;
#else
        uint16_t s_hi, s_lo;
#endif
    };
    uint32_t l;
};
I have probably asked this question twice since yesterday, but I still have not got a satisfactory answer. My problem is that I have an IP address stored in an unsigned char array. Now I want to send this IP address via a socket from client to server. People have advised me to use htonl() and ntohl() for network byte transfer, but I don't understand how: the arguments of htonl() and ntohl() are integers, so how can I use them with an unsigned char array? And if I can't use them, how can I make sure that if I send 130.191.166.230 in my buffer, the receiver will always receive the same address? Any inputs or guidance will be appreciated. Thanks in advance.
If you have an unsigned char array string (along the lines of "10.0.0.7") forming the IP address (and I'm assuming you do since there are very few 32-bit char systems around, making it rather difficult to store an IP address into a single character), you can just send that through as it is and let the other end use it (assuming you both encode characters the same way of course, such as with ASCII).
On the other hand, you may have a four byte array of chars (assuming chars are eight bits) containing the binary IP address.
The use of htonl and ntohl is to ensure that this binary data is sent through in an order that both big-endian and little-endian systems can understand.
To that end, network byte order (the order of the bytes "on the wire") is big-endian so these functions basically do nothing on big-endian systems. On little-endian systems, they swap the bytes around.
In other words, you may have the following binary data:
uint32_t ipaddress = 0x0a010203; // for 10.1.2.3
In big endian layout that would be stored as 0x0a,0x01,0x02,0x03, in little endian as 0x03,0x02,0x01,0x0a.
So, if you want to send it in network byte order (that any endian system will be able to understand), you can't just do:
write (fd, &ipaddress, 4);
since sending that from little endian system to a big endian one will end up with the bytes reversed.
What you need to do is:
uint32_t ipaddress = 0x0a010203; // for 10.1.2.3
uint32_t ip_netorder = htonl (ipaddress); // change if necessary.
write (fd, &ip_netorder, 4);
That forces it to be network byte order which any program at the other end can understand (assuming it uses ntohl to ensure it's correct for its purposes).
In fact, this scheme can handle more than just big and little endian. If you have a 32-bit integer coding scheme where ABCD (four bytes) is encoded as A,D,B,C or even where you have a bizarrely wild bit mixture forming your integers (like using even bits first then odd bits), this will still work since your local htonl and ntohl know about those formats and can convert them correctly to network byte order.
An array of chars has a defined ordering and is not endian dependent - they always operate from low to high addresses by convention.
Do you have a string or 4 bytes?
An IPv4 address is 4 bytes. So you will have 4 unsigned chars in an array somewhere; cast that array to send it across.
e.g. unsigned char IP[4];
Use ((char *)IP) as the data buffer to send, and send 4 bytes from it.
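A minimal sketch of that idea (the `parse_ipv4` helper is hypothetical; production code should use `inet_pton`): parse the dotted-quad string into four bytes, which already have a defined order in memory and can be sent as-is, no htonl() needed.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Parse "a.b.c.d" into out[0..3]; returns 1 on success, 0 on failure.
 * Illustrative only -- real code should use inet_pton(). */
static int parse_ipv4(const char *s, uint8_t out[4])
{
    unsigned a, b, c, d;
    if (sscanf(s, "%u.%u.%u.%u", &a, &b, &c, &d) != 4)
        return 0;
    if (a > 255 || b > 255 || c > 255 || d > 255)
        return 0;
    out[0] = (uint8_t)a; out[1] = (uint8_t)b;
    out[2] = (uint8_t)c; out[3] = (uint8_t)d;
    return 1;
}
```

Sending `out` with `write(fd, out, 4)` delivers the same four bytes, in the same order, to any receiver.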
I'm fighting with socket programming now and I've encountered a problem, which I don't know how to solve in a portable way.
The task is simple: I need to send an array of 16 bytes over the network, receive it in a client application, and parse it. I know there are functions like htonl, htons and so on to use with uint16 and uint32 values, but what should I do with chunks of data larger than that?
Thank you.
You say an array of 16 bytes. That doesn't really help. Endianness only matters for things larger than a byte.
If it's really raw bytes then just send them, you will receive them just the same
If it's really a struct you want to send, e.g.
struct msg
{
int foo;
int bar;
.....
then you need to work through the buffer, pulling out the values you want.
When you send you must assemble a packet into a standard order
int off = 0;
*(int*)&buff[off] = htonl(foo);
off += sizeof(int);
*(int*)&buff[off] = htonl(bar);
...
when you receive, reset the offset and pull the values back out
int off = 0;
int foo = ntohl(*(int *)&buff[off]);
off += sizeof(int);
int bar = ntohl(*(int *)&buff[off]);
....
EDIT: I see you want to send an IPv6 address; those are always stored in network byte order, so you can just stream it raw.
Endianness is a property of multibyte variables such as 16-bit and 32-bit integers. It has to do with whether the high-order or low-order byte goes first. If the client application is processing the array as individual bytes, it doesn't have to worry about endianness, as the order of the bits within the bytes is the same.
htons, htonl, etc., are for dealing with a single data item (e.g. an int) that's larger than one byte. An array of bytes where each one is used as a single data item itself (e.g., a string) doesn't need to be translated between host and network byte order at all.
Bytes themselves don't have endianness any more in that any single byte transmitted by a computer will have the same value in a different receiving computer. Endianness only has relevance these days to multibyte data types such as ints.
In your particular case it boils down to knowing what the receiver will do with your 16 bytes. If it will treat each of the 16 entries in the array as discrete single-byte values, then you can just send them without worrying about endianness. If, on the other hand, the receiver will treat your 16-byte array as four 32-bit integers, then you'll need to run each integer through htonl() prior to sending.
Does that help?
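The second case can be sketched as follows (a hypothetical example assuming the 16 bytes really are four 32-bit big-endian integers; `memcpy` is used to sidestep alignment problems):

```c
#include <arpa/inet.h> /* ntohl */
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Interpret a 16-byte wire buffer as four 32-bit big-endian
 * integers, converting each one to host byte order. */
static void unpack4_u32(const uint8_t wire[16], uint32_t out[4])
{
    for (int i = 0; i < 4; i++) {
        uint32_t be;
        memcpy(&be, wire + 4 * i, sizeof be); /* no alignment assumptions */
        out[i] = ntohl(be);
    }
}
```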
I am doing some socket programming in C, and trying to wrestle with byte order problems. My request (send) is fine but when I receive data my bytes are all out of order. I start with something like this:
char * aResponse= (char *)malloc(512);
int total = recv(sock, aResponse, 511, 0);
When dealing with this response, each 16-bit word seems to have its bytes reversed (I'm using UDP). I tried to fix that by doing something like this:
unsigned short * _netOrder= (unsigned short *)aResponse;
unsigned short * newhostOrder= (unsigned short *)malloc(total);
for (i = 0; i < total; ++i)
{
newhostOrder[i] = ntohs(_netOrder[i]);
}
This works ok when I am treating the data as a short, however if I cast the pointer to a char again the bytes are reversed. What am I doing wrong?
Ok, there seem to be problems with what you are doing on two different levels. Part of the confusion here seems to stem from your use of pointers, the type of objects they point to, and the interpretation of the encoding of the values in the memory pointed to by those pointers.
The encoding of multi-byte entities in memory is what is referred to as endianess. The two common encodings are referred to as Little Endian (LE) and Big Endian (BE). With LE, a 16-bit quantity like a short is encoded least significant byte (LSB) first. Under BE, the most significant byte (MSB) is encoded first.
By convention, network protocols normally encode things into what we call "network byte order" (NBO) which also happens to be the same as BE. If you are sending and receiving memory buffers on big endian platforms, then you will not run into conversion problems. However, your code would then be platform dependent on the BE convention. If you want to write portable code that works correctly on both LE and BE platforms, you should not assume the platform's endianess.
Achieving endian portability is the purpose of routines like ntohs(), ntohl(), htons(), and htonl(). These functions/macros are defined on a given platform to do the necessary conversions at the sending and receiving ends:
htons() - Convert short value from host order to network order (for sending)
htonl() - Convert long value from host order to network order (for sending)
ntohs() - Convert short value from network order to host order (after receive)
ntohl() - Convert long value from network order to host order (after receive)
Understand that your comment about accessing the memory when cast back to characters has no effect on the actual order of entities in memory. That is, if you access the buffer as a series of bytes, you will see the bytes in whatever order they were actually encoded into memory, whether you have a BE or LE machine. So if you are looking at an NBO-encoded buffer after receive, the MSB is going to be first - always. If you look at the output buffer after you have converted back to host order, then on a BE machine the byte order will be unchanged; conversely, on an LE machine, the bytes in the converted buffer will all be reversed.
Finally, in your conversion loop, the variable total refers to bytes. However, you are accessing the buffer as shorts. Your loop guard should not be total, but should be:
total / sizeof( unsigned short )
to account for the double byte nature of each short.
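Putting the pieces together, the receive-and-convert code from the question could be sketched like this (hedged: the `recv` call is replaced by a pre-filled buffer so the fix to the loop bound is visible in isolation):

```c
#include <arpa/inet.h> /* ntohs */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Convert a buffer of 'total' BYTES holding network-order shorts
 * into host-order shorts. Note the loop bound:
 * total / sizeof(unsigned short), not total. */
static void convert_shorts(const unsigned char *aResponse, int total,
                           unsigned short *hostOrder)
{
    size_t count = (size_t)total / sizeof(unsigned short);
    for (size_t i = 0; i < count; i++) {
        unsigned short netval;
        /* memcpy avoids alignment problems with casting char* to short* */
        memcpy(&netval, aResponse + i * sizeof netval, sizeof netval);
        hostOrder[i] = ntohs(netval);
    }
}
```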
This works ok when I'm treating the data as a short, however if I cast the pointer to a char again the bytes are reversed.
That's what I'd expect.
What am I doing wrong?
You have to know what the sender sent: know whether the data is bytes (which don't need reversing), or shorts or longs (which do).
Google for tutorials on the ntohs, htons, ntohl, and htonl APIs.
It's not clear what aResponse represents (string of characters? struct?). Endianness is relevant only for numerical values, not chars. You also need to make sure that at the sender's side, all numerical values are converted from host to network byte-order (hton*).
Apart from your original question (which I think was already answered), have a look at your malloc statement. malloc allocates bytes, and since total is the number of bytes recv returned, malloc(total) is already the right size; just be careful never to index past total / sizeof(unsigned short) when treating the buffer as unsigned shorts, which are most likely two bytes each.
Network byte order is big endian, so you need to convert it to the host's order if you want to interpret it as multi-byte values; but if it is only an array of bytes it shouldn't matter. How does the sender send its data?
For single bytes we don't need to care about byte ordering.