I checked the git log as https://patchwork.sourceware.org/patch/12453/. This modification seems to fix an issue on a specific platform.
But I don't understand why to swap __ss_align and __ss_padding in struct sockaddr_storage.
The Qualcomm platform I'm now developing on has lots of typecast as follows.
struct sockaddr_storage prefix_addr
(struct sockaddr_in6 *)&(prefix_addr)->sin6_addr.s6_addr
On our Cortex A7 platform, struct alignment are as follows:
Before glibc2.23:
struct sockaddr_in6
{
sin6_family; //0th byte
sin6_port; //2nd byte
sin6_flowinfo; //4th byte
sin6_addr; //8th byte
};
struct sockaddr_storage
{
ss_family; //0th byte
__ss_align; //4th byte
__ss_padding; //8th byte
};
After glibc2.23:
struct sockaddr_storage
{
ss_family; //0th byte
__ss_padding; //2nd byte
__ss_align; //124th byte
};
glibc changed struct sockaddr_storage, but struct sockaddr_in6 is not changed, so this modification would cause many alignment issues on our platform, which lead getting IPV6 addresses errors.
Please refer to the reply from the Florian:
On 09/01/2017 12:07 PM, honan li wrote:
define SASTORAGE_DATA(addr) (addr).__ss_padding
typedef struct qcmap_cm_nl_prefix_info_s {
boolean prefix_info_valid;
unsigned char prefix_len;
unsigned int mtu;
struct sockaddr_storage prefix_addr;
struct ifa_cacheinfo cache_info;
} qcmap_cm_nl_prefix_info_t;
void QCMAP_Backhaul::GetIPV6PrefixInfo(char *devname,
qcmap_cm_nl_prefix_info_t
*ipv6_prefix_info)
{
struct sockaddr_in6 *sin6 = NULL;
...
sin6 = (struct sockaddr_in6 *)&ipv6_prefix_info->prefix_addr;
memcpy(SASTORAGE_DATA(ipv6_prefix_info->prefix_addr),
RTA_DATA(rta),
sizeof(sin6->sin6_addr));
...
}
Currently publicly available here:
https://github.com/Bigcountry907/HTC_a13_vzw_Kernel/blob/master/vendor/qcom/proprietary/data/mobileap_v2/server/src/QCMAP_ConnectionManager.cpp#L3658
I would expect applications do something like this instead:
struct sockaddr_in6 sin6; memset (&sin6, 0, sizeof (sin6));
sin6.sin6_family = AF_INET6; memcpy (&sin6.sin6_addr, RTA_DATA
(rta), sizeof (sin6.sin6_addr)); memcpy
(&ipv6_prefix_info->prefix_addr, &sin6, sizeof (sin6));
This avoids any aliasing issues and a dependency on the internals of
struct sockaddr_storage (which is only intended as a way to allocate a
generic struct sockaddr of sufficient size and alignment). It also
initializes the other components of the socket address (such as the
port number and flow label).
Thanks, Florian
Also refer to:
https://books.google.com.hk/books?id=ptSC4LpwGA0C&pg=PA72&lpg=PA72&dq=alignment+sufficient&source=bl&ots=Kt0BQhjiMt&sig=HTUbm2bzVNSoMxNX98EMzORFc30&hl=zh-CN&sa=X&ved=0ahUKEwiP78iMoovWAhVLipQKHYFdCxcQ6AEIQzAF#v=onepage&q=alignment%20sufficient&f=false
Conclusion is Qualcomm's usage of sockaddr_storage is incorrect.
I'm not sure this is the change in glibc that generates alignment warnings: this change has no effect on the source code you show.
On the contrary, I think Qualcomm is using clang, and beginning with clang 4.x, there is a new alignment warning that I've seen generating false positive warning about alignements, depending on the way you write your code: -Waddress-of-packed-member.
You should check if the clang version you are using has changed, since glibc changed. If this is the case, this could certainly explain your problem.
Using a temporary variable makes those warning disappear.
instead of making a call like:
my_function(((struct sockaddr_in6 *)&prefix_addr)->sin6_addr.s6_addr)
just write:
struct sockaddr_in6 p = *(struct sockaddr_in6 *) &prefix_addr;
my_function(p.sin6_addr.s6_addr);
This may avoid alignment warnings.
Related
I have made a tiny client/server app, but I was wondering if the default value of the field s_addr was INADDR_ANY or not as when I don't specify an address for the socket, for both the client and the server, it works the same as with this flag given in the structure. I haven't find anything on the doc about that
struct sockaddr_in {
short sin_family;
unsigned short sin_port;
struct in_addr sin_addr;
};
struct in_addr {
unsigned long s_addr;
};
The default value is "undefined", i.e. the initial state of the struct depends on the pre-existing state of the bytes it occupies in RAM (which in turn will depend on what those memory-locations were previously used for earlier in your program's execution). Obviously you can't depend on that behavior, so you should always set the fields to some appropriate well-known value before using the struct for anything.
A convenient way to ensure that all fields of the struct are set to all-zeros would be: memset(&myStruct, 0, sizeof(myStruct));
In addition to Jeremy Friesner's answer, you can use
struct sockaddr_in addr ={0};
declaring and initializing the struct in one go so you do not forget to zero it. I believe this been valid since C99; in C23 you will also be able to omit the 0; so struct sockaddr_in addr ={}; would also work.
But my favourite approach is to use designated initializers (valid since C99) like so:
struct sockaddr_in addr ={
.sin_family = AF_INET,
.sin_addr.s_addr = htonl(INADDR_ANY),
.sin_port = htons(PORT)
};
The fields you do not explicitly initialize, will be zeroed.
What is a designated initializer in C?
I'm learning HTTP protocol following a tutorial which gives an understandable piece of code and here's part of it.
struct sockaddr_in address;
...
address.sin_family = AF_INET;
address.sin_addr.s_addr = INADDR_ANY;
address.sin_port = htons( PORT );
memset(address.sin_zero, '\0', sizeof address.sin_zero);
if (bind(server_fd, (struct sockaddr *)&address, sizeof(address))<0)
{
perror("In bind");
exit(EXIT_FAILURE);
}
The example code works well, although I don't understand the some kind of transfer between two structs.
the definition of struct sockaddr_in in <netinet/in.h> is
struct sockaddr_in {
__uint8_t sin_len;
sa_family_t sin_family;
in_port_t sin_port;
struct in_addr sin_addr;
char sin_zero[8];
};
the definition of struct sockaddr in <sys/socket.h> is
struct sockaddr {
__uint8_t sa_len; /* total length */
sa_family_t sa_family; /* [XSI] address family */
char sa_data[14]; /* [XSI] addr value (actually larger) */
};
They have different structures, how the "transfer/casting" works there?
I don't understand the some kind of transfer between two structs.
There is no data transfer between different structs, nor any conversion of structure objects. In bind(server_fd, (struct sockaddr *)&address, sizeof(address)), a pointer to a struct is converted to a different object pointer type. This is explicitly allowed by C.
The C language specification does not define any behavior for accessing the struct via the converted pointer. Any attempt to do so would violate the strict aliasing rule, but that's not your problem. The example you presented demonstrates an utterly standard usage idiom for the bind() function, for which it was designed. Therefore, you can rely on the bind() implementation to do the right thing with it, by whatever magic is required.
Conceptually, though, you can observe that the first two members of struct sockaddr and struct sockaddr_in have the same data types. You could imagine, then, that bind is able to access those two members via the converted pointer, despite it constituting a strict-aliasing violation. Although C does not define behavior for that, POSIX implicitly requires that it work in at least this case. Having then done that, the second of those members indicates the address family, by which bind() can invoke the appropriate behavior for the address's actual type.
That is a variation on C-style polymorphism. It is helped out by the third bind argument, the size of the address object, which enables bind() to copy the address object without knowing its true effective data type.
These structure types and the bind() API could have been defined a bit differently to avoid the implied strict-aliasing violation, but that wasn't necessary in early C, where member names corresponded directly to offsets from the beginning of the structure. And where those names were global, which is why you see the sin_ and sa_ prefixes in those member names, and similar in many other structure types provided by the system. Nowadays, it's best to just accept that that's how bind() is used, and it's up to the system to provide a bind() implementation that accommodates it.
The casting works.
Looking at the two structures:
struct sockaddr_in {
__uint8_t sin_len;
sa_family_t sin_family;
in_port_t sin_port;
struct in_addr sin_addr;
char sin_zero[8];
};
struct sockaddr {
__uint8_t sa_len; /* total length */
sa_family_t sa_family; /* [XSI] address family */
char sa_data[14]; /* [XSI] addr value (actually larger) */
};
First two members, sin_len and sa_len, sin_family and sa_family will not be problematic as those are of the same data type. The padding for sa_family_t works exactly the same on both ends.
Looking at the reference,
in_port_t Equivalent to the type uint16_t as described in <inttypes.h>
in_addr_t Equivalent to the type uint32_t as described in <inttypes.h>
For windows, struct in_addr looks like below:
struct in_addr {
union {
struct {
u_char s_b1;
u_char s_b2;
u_char s_b3;
u_char s_b4;
} S_un_b;
struct {
u_short s_w1;
u_short s_w2;
} S_un_w;
u_long S_addr;
} S_un;
};
and that for a linux is:
struct in_addr {
uint32_t s_addr; /* address in network byte order */
};
The whole confusion you might have is because of how the contents align. However, it is a well-thought historic design. It is intended to accommodate implementation-dependent aspects in the design.
When I Secondly, implementation-dependent -- it refers to the fact that implementation of in_addr_t is not consistent across all systems, as seen above.
In a nutshell, this entire magic works, because of the 2 things: The exact size and padding nature of the first two members and then lastly the data type of sa_data[14] is char, or more precisely an array of a 1-byte data-type. This design trick with union inside a struct has been widely used.
Unix Network Programming Volume 1 states:
The reason the sin_addr member is a structure, and not just an in_addr_t, is historical. Earlier releases (4.2BSD) defined the in_addr structure as a union of various structures, to allow access to each of the 4 bytes and to both of the 16-bit values contained within the 32-bit IPv4 address. This was used with class A, B, and C addresses to fetch the appropriate bytes of the address. But with the advent of subnetting and then the disappearance of the various address classes with classless addressing, the need for the union disappeared. Most systems today have done away with the union and just define in_addr as a structure with a single in_addr_t member.
Not what you asked for, but good to know:
The same header states:
The sockaddr_in structure is used to store addresses for the Internet address family. Values of this type shall be cast by applications to struct sockaddr for use with socket functions.
So, sockaddr_in is a struct specific to IP-based communication and sockaddr is more of a generic structure for socket operations.
Just a try:
#include <stdio.h>
#include <sys/socket.h>
#include <netinet/in.h>
int main(void)
{
printf("sizeof(struct sockaddr_in) = %zu bytes\n", sizeof(struct sockaddr_in));
printf("sizeof(struct sockaddr) = %zu bytes\n", sizeof(struct sockaddr));
return 0;
}
Prints:
sizeof(struct sockaddr_in) = 16 bytes
sizeof(struct sockaddr) = 16 bytes
I think this cast breaks the strict aliasing rule and then is undefined behaviour if the bind function dereferences the pointer.
In practice the code assumes that all fields of struct sockaddr_in are contiguous so you can access a buffer of bytes either as a struct sockaddr_in or as a struct sockaddr equivalently. But the fields of a structure are not guaranteed to be contiguous. If in_port_tis two bytes long for example, there may very well be a hole between sin_portand sin_addr with a 32 bytes machine compiler because it may want to align sin_addr field on 32 bytes address.
This way of coding is frequent when you develop a communication interface driver: you receive a buffer of bytes that need to be interpreted as a data structure (like: first byte is an adress, following bytes are a length, etc...). Casting from a structure to another one avoids to copy data.
Note that usually compilers provide non-standard-C ways to guarantee that all fields of structures are contigiuous. For example with gcc it is __attribute__((packed))
Now, to answer to your question: provided the structures are packed and there is no undefined behaviour, the cast basically does nothing. sa_data will be the array of bytes located after the field sin_family. So this array will consist of sin_port, followed by sin_addr followed by the array sin_zero.
EDIT: I compiled tje following structures on STM32H7 (ARM cortex M7, 32 bits architecture) with arm-none-eabi-gcc:
struct in_addr {
uint32_t s_addr;
};
struct sockaddr_in {
uint8_t sin_len;
uint16_t sin_family;
uint16_t sin_port;
struct in_addr sin_addr;
char sin_zero[8];
};
struct sockaddr {
uint8_t sa_len;
uint16_t sa_family;
char sin_zero[14];
};
The size of sockaddr_in is 20.
The size of sockaddr is 18.
Note that if sa_family_t is of type char and not short, due to alignment, both structures are same size.
The compiler produces this warning when I'm working with some code which looks like -
....
for(p = res; p != NULL; p = p->ai_next) {
void *addr;
std::string ipVer = "IPv0";
if(p->ai_family == AF_INET) {
ipVer = "IPv4";
struct sockaddr_in *ipv4 = (struct sockaddr_in *)p->ai_addr;
addr = &(ipv4->sin_addr);
}
else {
ipVer = "IPv6";
struct sockaddr_in6 *ipv6 = (struct sockaddr_in6 *)p->ai_addr;
addr = &(ipv6->sin6_addr);
}
....
}
where p = res are of type struct addrinfo and the types producing warnings are sockaddr_in and sockaddr_in6. The warning comes from statements :
struct sockaddr_in *ipv4 = (struct sockaddr_in *)p->ai_addr;
struct sockaddr_in6 *ipv6 = (struct sockaddr_in6 *)p->ai_addr;
All I want to know is what is causing this warning and what can I do to correct it if this is not the proper way to do things. Could I use any of static_cast / dynamic_cast / reinterpret_cast here?
The exact warning is - cast from 'struct sockaddr *' to 'struct sockaddr_in *' increases required alignment from 2 to 4.
TLDR: This warning doesn't indicate an error in your code, but you can avoid it by using a poper c++ reinterpret_cast (thanks to #Kurt Stutsman).
Explanation:
Reason for the warning:
sockaddr consists of a unsigned short (usually 16 bit) and a char array, so its alignment requirement is 2.
sockaddr_in contains (among other things) a struct in_addr which has an alignment requirement of 4 which in turn means sockaddr_in also must be aligned to a 4 Byte boundary.
For that reason, casting an arbitrary sockaddr* to an sockaddr_in* changes the alignment requirement, and accessing the object via the new pointer would even violate aliasing rules and result in undefined behavior.
Why you can ignore it:
In your case, the object, p->ai_addr is pointing to, most likely is a sockaddr_in or sockaddr_in6 object anyway (as determined by checking ai_family) and so the operation is safe. However you compiler doesn't know that and produces a warning.
It is essentially the same thing as using a static_cast to cast a pointer to a base class to a pointer to a derived class - it is unsafe in the general case, but if you know the correct dynamic type extrinsically, it is well defined.
Solution:
I don't know a clean way around this (other than suppress the warning), which is not unusual with warnings enabled by -Weverything . You could copy the object pointed to by p->ai_addr byte by byte to an object of the appropriate type, but then you could (most likely) no longer use addr the same way as before, as it would now point to a different (e.g. local) variable.
-Weverything isn't something I would use for my usual builds anyway, because it adds far too much noise, but if you want to keep it, #Kurt Stutsman mentioned a good solution in the comments:
clang++ (g++ doesn't emit a warning in any case) doesn't emit a warning, if you use a reinterpret_cast instead of the c style cast (which you shouldn't use anyway), although both have (in this case) exactly the same functionality. Maybe because reinterpret_cast explicitly tells the compiler: "Trust me, I know, what I'm doing" .
On a side Note: In c++ code you don't need the struct keywords.
Well -Weverything enables quite a lot of warnings some of them are known to throw unwanted warnings.
Here your code fires the cast-align warning, that says explicitely
cast from ... to ... increases required alignment from ... to ...
And it is the case here because the alignement for struct addr is only 2 whereas it is 4 for struct addr_in.
But you (and the programmer for getaddrinfo...) know that the pointer p->ai_addr already points to an actual struct addr_in, so the cast is valid.
You can either:
let the warning fire and ignore it - after all it is just a warning...
silence it with -Wno-cast-align after -Weverything
I must admit that I seldom use -Weverything for that reason, and only use -Wall
Alternatively, if you know that you only use CLang, you can use pragmas to explicetely turn the warning only on those lines:
for(p = res; p != NULL; p = p->ai_next) {
void *addr;
std::string ipVer = "IPv0";
#pragma clang diagnostic push
#pragma clang diagnostic ignored "-Wcast-align"
if(p->ai_family == AF_INET) {
ipVer = "IPv4";
struct sockaddr_in *ipv4 = (struct sockaddr_in *)p->ai_addr;
addr = &(ipv4->sin_addr);
}
else {
ipVer = "IPv6";
struct sockaddr_in6 *ipv6 = (struct sockaddr_in6 *)p->ai_addr;
addr = &(ipv6->sin6_addr);
}
#pragma clang diagnostic pop
....
}
To elaborate on the memcpy version. I thnk this is needed for ARM which cannot have misalligned data.
I created a struct that contains just the first two fields (I only needed port)
struct sockaddr_in_header {
sa_family_t sin_family; /* address family: AF_INET */
in_port_t sin_port; /* port in network byte order */
};
Then to get the port out, I used memcpy to move the data to the stack
struct sockaddr_in_header sinh;
unsigned short sin_port;
memcpy(&sinh, conn->local_sockaddr, sizeof(struct sockaddr_in_header));
And return the port
sin_port = ntohs(sinh.sin_port);
This answer is really related to getting the port on Arm
How do I cast sockaddr pointer to sockaddr_in on Arm
The powers that be think that to be the same question as this one, however I dont want to ignore warnings. Experience has taught me that is a bad idea.
I have a function which takes in "struct sockaddr *" as a parameter (let's call this input_address), and then I need to operate on that address, which may be a sockaddr_in or sockaddr_in6, since I support both IPv4 and IPv6.
I'm getting some memory corruption and trying to track it down to it's source, and in the process found some code that seems suspect, so I would like to validate if this is the right way to do things.
struct sockaddr_storage *input_address_storage = (struct sockaddr_storage *) input_address;
struct sockaddr_storage result = [UtilityClass performSomeOperation: *input_address_storage];
At first I thought the cast in the first line was safe, but then in the second line I need to dereference that pointer, which seems like it may be wrong. The reason I am concerned is that it may end up copying memory that is beyond where the original structure is (since sockaddr_in is shorter than sockaddr_in6). I am not sure if this could cause a memory corruption (my guess is no), but nevertheless this code gives me a bad feeling.
I can't change the fact my function takes a "struct sockaddr *", so it seems like it would be difficult to work around this type of code, and yet I want to avoid copying from a memory location where I shouldn't be.
If anyone can validate whether what I am doing is wrong, and the best way to fix this, I'd appreciate it.
EDIT: An admin had changed my C tag for C# for some reason. The code I gave is primarily C, with one function call from objective C that doesn't really matter. That call could have been C.
The problem with your approach is that you are converting an existing struct sockaddr* into a struct sockaddr_storage*. Imagine what happens if the original was a ``struct sockaddr_in. Sincesizeof(struct sockaddr_in) < sizeof(struct sockaddr_storage)`, the memory-sanitizer complains of unbound memory reference.
struct sockaddr_storage is essentially a container to contain either your struct sockaddr_in or struct sockaddr_in6.
Hence, it is useful when you want to pass in a struct sockaddr* object but want to allocate enough memory for both sockaddr_in and sockaddr_in6.
A good example is the recvfrom(3) call:
ssize_t recvfrom(int socket, void *restrict buffer, size_t length,
int flags, struct sockaddr *restrict address,
socklen_t *restrict address_len);
Since address requires a struct sockaddr* object, we will construct a struct sockaddr_storage first, and pass it in:
struct sockaddr_storage address;
socklen_t address_length = sizeof(struct sockaddr_storage);
ssize_t ret = recvfrom(fd, buffer, buffer_length, 0, (struct sockaddr*)&address, &address_length);
if (address.ss_family == AF_INET) {
DoIpv4Work((struct sockaddr_in*)&address, ...);
} else if (address.ss_family == AF_INET6) {
DoIpv6Work((struct sockaddr_in6*)&address, ...);
}
The difference in your approach and mine is that I allocate a struct sockaddr_storage and then use it as struct sockaddr, but you do the REVERSE, and use a struct sockaddr and then use it as struct sockaddr_storage.
I'm looking at functions such as connect() and bind() in C sockets and notice that they take a pointer to a sockaddr struct. I've been reading and to make your application AF-Independent, it is useful to use the sockaddr_storage struct pointer and cast it to a sockaddr pointer because of all the extra space it has for larger addresses.
What I am wondering is how functions like connect() and bind() that ask for a sockaddr pointer go about accessing the data from a pointer that points at a larger structure than the one it is expecting. Sure, you pass it the size of the structure you are providing it, but what is the actual syntax that the functions use to get the IP Address off the pointers to larger structures that you have cast to struct *sockaddr?
It's probably because I come from OOP languages, but it seems like kind of a hack and a bit messy.
Functions that expect a pointer to struct sockaddr probably typecast the pointer you send them to sockaddr when you send them a pointer to struct sockaddr_storage. In that way, they access it as if it was a struct sockaddr.
struct sockaddr_storage is designed to fit in both a struct sockaddr_in and struct sockaddr_in6
You don't create your own struct sockaddr, you usually create a struct sockaddr_in or a struct sockaddr_in6 depending on what IP version you're using. In order to avoid trying to know what IP version you will be using, you can use a struct sockaddr_storage which can hold either. This will in turn be typecasted to struct sockaddr by the connect(), bind(), etc functions and accessed that way.
You can see all of these structs below (the padding is implementation specific, for alignment purposes):
struct sockaddr {
unsigned short sa_family; // address family, AF_xxx
char sa_data[14]; // 14 bytes of protocol address
};
struct sockaddr_in {
short sin_family; // e.g. AF_INET, AF_INET6
unsigned short sin_port; // e.g. htons(3490)
struct in_addr sin_addr; // see struct in_addr, below
char sin_zero[8]; // zero this if you want to
};
struct sockaddr_in6 {
u_int16_t sin6_family; // address family, AF_INET6
u_int16_t sin6_port; // port number, Network Byte Order
u_int32_t sin6_flowinfo; // IPv6 flow information
struct in6_addr sin6_addr; // IPv6 address
u_int32_t sin6_scope_id; // Scope ID
};
struct sockaddr_storage {
sa_family_t ss_family; // address family
// all this is padding, implementation specific, ignore it:
char __ss_pad1[_SS_PAD1SIZE];
int64_t __ss_align;
char __ss_pad2[_SS_PAD2SIZE];
};
So as you can see, if the function expects an IPv4 address, it will just read the first 4 bytes (because it assumes the struct is of type struct sockaddr. Otherwise it will read the full 16 bytes for IPv6).
In C++ classes with at least one virtual function are given a TAG. That tag allows you to dynamic_cast<>() to any of the classes your class derives from and vice versa. The TAG is what allows dynamic_cast<>() to work. More or less, this can be a number or a string...
In C we are limited to structures. However, structures can also be assigned a TAG. In fact, if you look at all the structures that theprole posted in his answer, you will notice that they all start with 2 bytes (an unsigned short) which represents what we call the family of the address. This defines exactly what the structure is and thus its size, fields, etc.
Therefore you can do something like this:
int bind(int fd, struct sockaddr *in, socklen_t len)
{
switch(in->sa_family)
{
case AF_INET:
if(len < sizeof(struct sockaddr_in))
{
errno = EINVAL; // wrong size
return -1;
}
{
struct sockaddr_in *p = (struct sockaddr_in *) in;
...
}
break;
case AF_INET6:
if(len < sizeof(struct sockaddr_in6))
{
errno = EINVAL; // wrong size
return -1;
}
{
struct sockaddr_in6 *p = (struct sockaddr_in6 *) in;
...
}
break;
[...other cases...]
default:
errno = EINVAL; // family not supported
return -1;
}
}
As you can see, the function can check the len parameter to make sure that the length is enough to fit the expected structure and therefore they can reinterpret_cast<>() (as it would be called in C++) your pointer. Whether the data is correct in the structure is up to the caller. There is not much choice on that end. These functions are expected to verify all sorts of things before it uses the data and return -1 and errno whenever a problem is found.
So in effect, you have a struct sockaddr_in or struct sockaddr_in6 that you (reinterpret) cast to a struct sockaddr and the bind() function (and others) cast that pointer back to a struct sockaddr_in or struct sockaddr_in6 after they checked the sa_family member and verified the size.