How to cast sockaddr_storage and avoid breaking strict-aliasing rules - c

I'm using Beej's Guide to Networking and came across an aliasing issue. He proposes a function to return either the IPv4 or IPv6 address of a particular struct:
1 void *get_in_addr( struct sockaddr *sa )
2 {
3 if (sa->sa_family == AF_INET)
4 return &(((struct sockaddr_in*)sa)->sin_addr);
5 else
6 return &(((struct sockaddr_in6*)sa)->sin6_addr);
7 }
This causes GCC to spit out a strict-aliasing error for sa on line 3. As I understand it, it is because I call this function like so:
struct sockaddr_storage their_addr;
...
inet_ntop(their_addr.ss_family,
get_in_addr((struct sockaddr *)&their_addr),
connection_name,
sizeof connection_name);
I'm guessing the aliasing has to do with the fact that the their_addr variable is of type sockaddr_storage and another pointer of a differing type points to the same memory.
Is the best way to get around this sticking sockaddr_storage, sockaddr_in, and sockaddr_in6 into a union? It seems like this should be well worn territory in networking, I just can't find any good examples with best practices.
Also, if anyone can explain exactly where the aliasing issue takes place, I'd much appreciate it.

I tend to do this to get GCC to do the right thing with type punning, which is explicitly allowed with unions:
/*! Multi-family socket end-point address. */
typedef union address
{
struct sockaddr sa;
struct sockaddr_in sa_in;
struct sockaddr_in6 sa_in6;
struct sockaddr_storage sa_stor;
}
address_t;

"I tend to do this to get GCC to do the right thing with type punning, which is explicitly allowed with unions"
I am pretty sure this (mis)use of union will not work (or only by accident) with GCC:
short type_pun2 (int i, int *pi, short *ps) {
*pi = i;
return *ps;
}
union U {
int i;
short s;
};
short type_pun (int i) {
union U u;
return type_pun2 (i, &u.i, &u.s);
}
The correct way to do that is with memcpy, not union.
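As a rough sketch of how the memcpy approach could be applied to the original question (assuming a POSIX system; the helper name format_addr is hypothetical and not from Beej's guide), the generic address is copied into a local object of the concrete type before any field is read:

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: formats the IPv4/IPv6 address stored in *ss into
 * buf. Returns buf on success, NULL for an unsupported family or on error. */
const char *format_addr(const struct sockaddr_storage *ss,
                        char *buf, socklen_t buflen)
{
    assert(ss != NULL && buf != NULL);

    if (ss->ss_family == AF_INET) {
        struct sockaddr_in sin;
        memcpy(&sin, ss, sizeof sin);      /* copy instead of casting */
        return inet_ntop(AF_INET, &sin.sin_addr, buf, buflen);
    }
    if (ss->ss_family == AF_INET6) {
        struct sockaddr_in6 sin6;
        memcpy(&sin6, ss, sizeof sin6);    /* copy instead of casting */
        return inet_ntop(AF_INET6, &sin6.sin6_addr, buf, buflen);
    }
    return NULL;
}
```

With this shape the call site becomes format_addr(&their_addr, connection_name, sizeof connection_name), and no pointer of a different struct type is ever dereferenced.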

I recently hit a similar aliasing warning on an HP-UX system while writing code to get the MAC address of the machine.
The expression &(((struct sockaddr_in *)addr)->sin_addr) triggers a strict-aliasing warning.
Here is the code in context:
char ip[INET6_ADDRSTRLEN] = {0};
struct sockaddr *addr;
...
/* get addr from an ioctl(socket, SIOCGIFCONF...) call */
...
inet_ntop(AF_INET, &(((struct sockaddr_in *)addr)->sin_addr), ip, sizeof ip);
I got rid of the aliasing warning by doing the following:
struct sockaddr_in sin;
memcpy(&sin, addr, sizeof(struct sockaddr));
inet_ntop(AF_INET, &sin.sin_addr, ip, sizeof ip);
Because this is potentially dangerous, I added the following line before it:
static_assert(sizeof(struct sockaddr) == sizeof(struct sockaddr_in), "sockaddr and sockaddr_in differ in size");
I'm not sure whether this would be considered bad practice, but it worked and was portable to other *nix flavors and compilers.

The issue has nothing to do with the call to the function. Rather, it's with ((struct sockaddr_in*)sa)->sin_addr. The problem is that sa is a pointer of one type, but you're casting it to a pointer of a different type and then dereferencing it. This breaks a rule called "strict aliasing", which says that variables of different types can never alias. In your case, aliasing to a different type is exactly what you want to do.
The simple solution is to turn off this optimization, which allows aliasing in this manner. On GCC, the flag is -fno-strict-aliasing.
The better solution is to use a union, as mentioned by Nikolai.
void *get_in_addr(struct sockaddr *sa)
{
union {
struct sockaddr *sa;
struct sockaddr_in *sa_in;
struct sockaddr_in6 *sa_in6;
} u;
u.sa = sa;
if (sa->sa_family == AF_INET)
return &(u.sa_in->sin_addr);
else
return &(u.sa_in6->sin6_addr);
}
That said, I can't actually get GCC to give me a warning when using your original code, so I'm not sure if this buys you anything.

Related

How do I cast sockaddr pointer to sockaddr_in on Arm [duplicate]

The compiler produces this warning when I'm working with some code which looks like -
....
for(p = res; p != NULL; p = p->ai_next) {
void *addr;
std::string ipVer = "IPv0";
if(p->ai_family == AF_INET) {
ipVer = "IPv4";
struct sockaddr_in *ipv4 = (struct sockaddr_in *)p->ai_addr;
addr = &(ipv4->sin_addr);
}
else {
ipVer = "IPv6";
struct sockaddr_in6 *ipv6 = (struct sockaddr_in6 *)p->ai_addr;
addr = &(ipv6->sin6_addr);
}
....
}
where p and res are of type struct addrinfo * and the types producing warnings are sockaddr_in and sockaddr_in6. The warning comes from these statements:
struct sockaddr_in *ipv4 = (struct sockaddr_in *)p->ai_addr;
struct sockaddr_in6 *ipv6 = (struct sockaddr_in6 *)p->ai_addr;
All I want to know is what is causing this warning and what can I do to correct it if this is not the proper way to do things. Could I use any of static_cast / dynamic_cast / reinterpret_cast here?
The exact warning is - cast from 'struct sockaddr *' to 'struct sockaddr_in *' increases required alignment from 2 to 4.
TLDR: This warning doesn't indicate an error in your code, but you can avoid it by using a proper C++ reinterpret_cast (thanks to @Kurt Stutsman).
Explanation:
Reason for the warning:
sockaddr consists of an unsigned short (usually 16 bits) and a char array, so its alignment requirement is 2.
sockaddr_in contains (among other things) a struct in_addr, which has an alignment requirement of 4, which in turn means sockaddr_in must also be aligned to a 4-byte boundary.
For that reason, casting an arbitrary sockaddr* to a sockaddr_in* increases the alignment requirement, and accessing the object through the new pointer could even violate aliasing rules and result in undefined behavior.
Why you can ignore it:
In your case, the object that p->ai_addr points to is most likely a sockaddr_in or sockaddr_in6 object anyway (as determined by checking ai_family), so the operation is safe. However, your compiler doesn't know that and produces a warning.
It is essentially the same thing as using a static_cast to cast a pointer to a base class to a pointer to a derived class - it is unsafe in the general case, but if you know the correct dynamic type extrinsically, it is well defined.
Solution:
I don't know a clean way around this (other than suppressing the warning), which is not unusual for warnings enabled by -Weverything. You could copy the object pointed to by p->ai_addr byte by byte into an object of the appropriate type, but then you could (most likely) no longer use addr the same way as before, as it would now point to a different (e.g. local) variable.
-Weverything isn't something I would use for my usual builds anyway, because it adds far too much noise, but if you want to keep it, @Kurt Stutsman mentioned a good solution in the comments:
clang++ doesn't emit the warning if you use a reinterpret_cast instead of the C-style cast (which you shouldn't use anyway), although both do exactly the same thing in this case (g++ doesn't emit the warning either way). Perhaps that is because reinterpret_cast explicitly tells the compiler: "Trust me, I know what I'm doing".
On a side note: in C++ code you don't need the struct keywords.
Well, -Weverything enables quite a lot of warnings, and some of them are known to fire when you don't want them.
Here your code triggers the cast-align warning, which says explicitly
cast from ... to ... increases required alignment from ... to ...
And that is the case here, because the alignment of struct sockaddr is only 2 whereas it is 4 for struct sockaddr_in.
But you (and the programmer of getaddrinfo...) know that the pointer p->ai_addr already points to an actual struct sockaddr_in, so the cast is valid.
You can either:
let the warning fire and ignore it - after all it is just a warning...
silence it with -Wno-cast-align after -Weverything
I must admit that I seldom use -Weverything for that reason, and only use -Wall
Alternatively, if you know that you only use Clang, you can use pragmas to explicitly turn the warning off only on those lines:
for(p = res; p != NULL; p = p->ai_next) {
void *addr;
std::string ipVer = "IPv0";
#pragma clang diagnostic push
#pragma clang diagnostic ignored "-Wcast-align"
if(p->ai_family == AF_INET) {
ipVer = "IPv4";
struct sockaddr_in *ipv4 = (struct sockaddr_in *)p->ai_addr;
addr = &(ipv4->sin_addr);
}
else {
ipVer = "IPv6";
struct sockaddr_in6 *ipv6 = (struct sockaddr_in6 *)p->ai_addr;
addr = &(ipv6->sin6_addr);
}
#pragma clang diagnostic pop
....
}
To elaborate on the memcpy version: I think this is needed for ARM, which cannot handle misaligned data.
I created a struct that contains just the first two fields (I only needed port)
struct sockaddr_in_header {
sa_family_t sin_family; /* address family: AF_INET */
in_port_t sin_port; /* port in network byte order */
};
Then to get the port out, I used memcpy to move the data to the stack
struct sockaddr_in_header sinh;
unsigned short sin_port;
memcpy(&sinh, conn->local_sockaddr, sizeof(struct sockaddr_in_header));
And return the port
sin_port = ntohs(sinh.sin_port);
This answer is really about getting the port on ARM:
How do I cast sockaddr pointer to sockaddr_in on Arm
The powers that be consider that to be the same question as this one; however, I don't want to ignore warnings. Experience has taught me that is a bad idea.
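A variant of the same idea, sketched under the assumption of a POSIX system, copies into the real struct sockaddr_in or struct sockaddr_in6 rather than a hand-written header struct, so it does not depend on guessing the field layout. The helper name sockaddr_port is hypothetical. It takes a const void * so that it can be handed a buffer with any alignment:

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: returns the port in host byte order, or -1 if the
 * family is unsupported or len is too short for that family. */
int sockaddr_port(const void *addr, socklen_t len)
{
    sa_family_t family;

    assert(addr != NULL);
    if (len < offsetof(struct sockaddr, sa_family) + sizeof family)
        return -1;
    /* Read the family bytewise so no alignment is assumed. */
    memcpy(&family,
           (const char *)addr + offsetof(struct sockaddr, sa_family),
           sizeof family);

    if (family == AF_INET && len >= sizeof(struct sockaddr_in)) {
        struct sockaddr_in sin;            /* properly aligned local copy */
        memcpy(&sin, addr, sizeof sin);
        return ntohs(sin.sin_port);
    }
    if (family == AF_INET6 && len >= sizeof(struct sockaddr_in6)) {
        struct sockaddr_in6 sin6;          /* properly aligned local copy */
        memcpy(&sin6, addr, sizeof sin6);
        return ntohs(sin6.sin6_port);
    }
    return -1;
}
```

Because every field access happens on a local object of the right type, there is no cast for -Wcast-align to complain about.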

Why glibc2.23 change struct sockaddr_storage?

I checked the git log at https://patchwork.sourceware.org/patch/12453/. This modification seems to fix an issue on a specific platform.
But I don't understand why __ss_align and __ss_padding were swapped in struct sockaddr_storage.
The Qualcomm platform I'm now developing on has lots of typecasts like the following:
struct sockaddr_storage prefix_addr;
((struct sockaddr_in6 *)&prefix_addr)->sin6_addr.s6_addr
On our Cortex A7 platform, struct alignment are as follows:
Before glibc2.23:
struct sockaddr_in6
{
sin6_family; //0th byte
sin6_port; //2nd byte
sin6_flowinfo; //4th byte
sin6_addr; //8th byte
};
struct sockaddr_storage
{
ss_family; //0th byte
__ss_align; //4th byte
__ss_padding; //8th byte
};
After glibc2.23:
struct sockaddr_storage
{
ss_family; //0th byte
__ss_padding; //2nd byte
__ss_align; //124th byte
};
glibc changed struct sockaddr_storage but not struct sockaddr_in6, so this modification causes many alignment issues on our platform, which lead to errors when getting IPv6 addresses.
Please refer to the reply from the Florian:
On 09/01/2017 12:07 PM, honan li wrote:
#define SASTORAGE_DATA(addr) (addr).__ss_padding
typedef struct qcmap_cm_nl_prefix_info_s {
boolean prefix_info_valid;
unsigned char prefix_len;
unsigned int mtu;
struct sockaddr_storage prefix_addr;
struct ifa_cacheinfo cache_info;
} qcmap_cm_nl_prefix_info_t;
void QCMAP_Backhaul::GetIPV6PrefixInfo(char *devname,
qcmap_cm_nl_prefix_info_t
*ipv6_prefix_info)
{
struct sockaddr_in6 *sin6 = NULL;
...
sin6 = (struct sockaddr_in6 *)&ipv6_prefix_info->prefix_addr;
memcpy(SASTORAGE_DATA(ipv6_prefix_info->prefix_addr),
RTA_DATA(rta),
sizeof(sin6->sin6_addr));
...
}
Currently publicly available here:
https://github.com/Bigcountry907/HTC_a13_vzw_Kernel/blob/master/vendor/qcom/proprietary/data/mobileap_v2/server/src/QCMAP_ConnectionManager.cpp#L3658
I would expect applications do something like this instead:
struct sockaddr_in6 sin6;
memset (&sin6, 0, sizeof (sin6));
sin6.sin6_family = AF_INET6;
memcpy (&sin6.sin6_addr, RTA_DATA (rta), sizeof (sin6.sin6_addr));
memcpy (&ipv6_prefix_info->prefix_addr, &sin6, sizeof (sin6));
This avoids any aliasing issues and a dependency on the internals of
struct sockaddr_storage (which is only intended as a way to allocate a
generic struct sockaddr of sufficient size and alignment). It also
initializes the other components of the socket address (such as the
port number and flow label).
Thanks, Florian
Also refer to:
https://books.google.com.hk/books?id=ptSC4LpwGA0C&pg=PA72&lpg=PA72&dq=alignment+sufficient&source=bl&ots=Kt0BQhjiMt&sig=HTUbm2bzVNSoMxNX98EMzORFc30&hl=zh-CN&sa=X&ved=0ahUKEwiP78iMoovWAhVLipQKHYFdCxcQ6AEIQzAF#v=onepage&q=alignment%20sufficient&f=false
The conclusion is that Qualcomm's usage of sockaddr_storage is incorrect.
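The pattern Florian describes can be wrapped in a pair of small functions. This is only a sketch, assuming a POSIX system; the names storage_set_in6 and storage_get_in6 are hypothetical. Neither function names any internal member of struct sockaddr_storage, so the code is unaffected by the order of __ss_padding and __ss_align:

```c
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: stores a 16-byte IPv6 address into *dst as a
 * complete struct sockaddr_in6, without touching __ss_padding. */
void storage_set_in6(struct sockaddr_storage *dst, const void *addr16)
{
    struct sockaddr_in6 sin6;

    assert(dst != NULL && addr16 != NULL);
    memset(&sin6, 0, sizeof sin6);   /* port, flow label, scope id = 0 */
    sin6.sin6_family = AF_INET6;
    memcpy(&sin6.sin6_addr, addr16, sizeof sin6.sin6_addr);

    memset(dst, 0, sizeof *dst);
    memcpy(dst, &sin6, sizeof sin6);
}

/* Hypothetical helper: reads the address back the same way, by copying
 * out into a local struct sockaddr_in6 and using the copy. */
void storage_get_in6(const struct sockaddr_storage *src, struct in6_addr *out)
{
    struct sockaddr_in6 sin6;

    assert(src != NULL && out != NULL);
    memcpy(&sin6, src, sizeof sin6);
    *out = sin6.sin6_addr;
}
```

In the Qualcomm snippet, the memcpy into SASTORAGE_DATA(...) would then become a call such as storage_set_in6(&ipv6_prefix_info->prefix_addr, RTA_DATA(rta)).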
I'm not sure this is the change in glibc that generates alignment warnings: this change has no effect on the source code you show.
On the contrary, I think Qualcomm is using clang, and starting with clang 4.x there is a new alignment warning, -Waddress-of-packed-member, that I've seen generate false positives depending on the way you write your code.
You should check whether the clang version you are using changed when glibc changed. If so, that could certainly explain your problem.
Using a temporary variable makes those warnings disappear.
Instead of making a call like:
my_function(((struct sockaddr_in6 *)&prefix_addr)->sin6_addr.s6_addr)
just write:
struct sockaddr_in6 p = *(struct sockaddr_in6 *) &prefix_addr;
my_function(p.sin6_addr.s6_addr);
This may avoid alignment warnings.

How to legally use type-punning with unions to cast between variations of struct sockaddr without violating the strict aliasing rule?

POSIX intends pointers to variations of struct sockaddr to be castable, however depending on the interpretation of the C standard this may be a violation of the strict aliasing rule and therefore UB. (See this answer with comments below it.) I can, at least, confirm that there may at least be a problem with gcc: this code prints Bug! with optimization enabled, and Yay! with optimization disabled:
#include <sys/types.h>
#include <netinet/in.h>
#include <stdio.h>
sa_family_t test(struct sockaddr *a, struct sockaddr_in *b)
{
a->sa_family = AF_UNSPEC;
b->sin_family = AF_INET;
return a->sa_family; // AF_INET please!
}
int main(void)
{
struct sockaddr addr;
sa_family_t x = test(&addr, (struct sockaddr_in*)&addr);
if(x == AF_INET)
printf("Yay!\n");
else if(x == AF_UNSPEC)
printf("Bug!\n");
return 0;
}
Observe this behavior on an online IDE.
To workaround this problem this answer proposes the use of type punning with unions:
/*! Multi-family socket end-point address. */
typedef union address
{
struct sockaddr sa;
struct sockaddr_in sa_in;
struct sockaddr_in6 sa_in6;
struct sockaddr_storage sa_stor;
}
address_t;
However, apparently things are still not as simple as they look… Quoting this comment by @zwol:
That can work but takes a fair bit of care. More than I can fit into this comment box.
What kind of fair bit of care does it take? What are the pitfalls of the use of type punning with unions to cast between variations of struct sockaddr?
I prefer to ask than to run into UB.
Using a union like this is safe,
from C11 §6.5.2.3:
A postfix expression followed by the . operator and an identifier designates a member of
a structure or union object. The value is that of the named member,95) and is an lvalue if
the first expression is an lvalue. If the first expression has qualified type, the result has
the so-qualified version of the type of the designated member.
95) If the member used to read the contents of a union object is not the same as the member last used to
store a value in the object, the appropriate part of the object representation of the value is reinterpreted
as an object representation in the new type as described in 6.2.6 (a process sometimes called ‘‘type
punning’’). This might be a trap representation.
and
One special guarantee is made in order to simplify the use of unions: if a union contains
several structures that share a common initial sequence (see below), and if the union
object currently contains one of these structures, it is permitted to inspect the common
initial part of any of them anywhere that a declaration of the completed type of the union
is visible. Two structures share a common initial sequence if corresponding members
have compatible types (and, for bit-fields, the same widths) for a sequence of one or more
initial members
(highlighted what I think is most important)
With accessing the struct sockaddr member, you will read from the common initial part.
Note: This will not make it safe to pass pointers to the members around anywhere and expect the compiler knows they refer to the same stored object. So the literal version of your example code might still break because in your test() the union is not known.
Example:
#include <stdio.h>
struct foo
{
int fooid;
char x;
};
struct bar
{
int barid;
double y;
};
union foobar
{
struct foo a;
struct bar b;
};
int test(struct foo *a, struct bar *b)
{
a->fooid = 23;
b->barid = 42;
return a->fooid;
}
int test2(union foobar *a, union foobar *b)
{
a->a.fooid = 23;
b->b.barid = 42;
return a->a.fooid;
}
int main(void)
{
union foobar fb;
int result = test(&fb.a, &fb.b);
printf("%d\n", result);
result = test2(&fb, &fb);
printf("%d\n", result);
return 0;
}
Here, test() might break, but test2() will be correct.
Given the address_t union you propose
typedef union address
{
struct sockaddr sa;
struct sockaddr_in sa_in;
struct sockaddr_in6 sa_in6;
struct sockaddr_storage sa_stor;
}
address_t;
and a variable declared as address_t,
address_t addr;
you can safely initialize addr.sa.sa_family and then read addr.sa_in.sin_family (or any other pair of aliased _family fields). You can also safely use addr in a call to recvfrom, recvmsg, accept, or any other socket primitive that takes a struct sockaddr * out-parameter, e.g.
socklen_t addrlen = sizeof addr;
bytes_read = recvfrom(sockfd, buf, sizeof buf, 0, &addr.sa, &addrlen);
if (bytes_read < 0) goto recv_error;
switch (addr.sa.sa_family) {
case AF_INET:
printf("Datagram from %s:%d, %zu bytes\n",
inet_ntoa(addr.sa_in.sin_addr), ntohs(addr.sa_in.sin_port),
(size_t) bytes_read);
break;
case AF_INET6:
// etc
}
And you can also go in the other direction,
memset(&addr, 0, sizeof addr);
addr.sa_in.sin_family = AF_INET;
addr.sa_in.sin_port = port;
inet_aton(address, &addr.sa_in.sin_addr);
connect(sockfd, &addr.sa, sizeof addr.sa_in);
It is also okay to allocate address_t buffers with malloc, or embed it in a larger structure.
What's not safe is to pass pointers to individual sub-structures of an address_t union to functions that you write. For instance, your test function ...
sa_family_t test(struct sockaddr *a, struct sockaddr_in *b)
{
a->sa_family = AF_UNSPEC;
b->sin_family = AF_INET;
return a->sa_family; // AF_INET please!
}
... may not be called with (void *)a equal to (void *)b, even if this happens because the call site passed &addr.sa and &addr.sa_in as the arguments. Some people used to argue that this should be allowed when a complete declaration of address_t was in scope when test was defined, but that's too much like "spukhafte Fernwirkung" (spooky action at a distance) for the compiler devs; the interpretation of the "common initial subsequence" rule (quoted in Felix's answer) adopted by the current generation of compilers is that it only applies when the union type is statically and locally involved in a particular access. You must instead write
sa_family_t test2(address_t *x)
{
x->sa.sa_family = AF_UNSPEC;
x->sa_in.sin_family = AF_INET;
return x->sa.sa_family;
}
You might be wondering why it's okay to pass &addr.sa to connect then. Very roughly, connect has its own internal address_t union, and it begins with something like
int connect(int sock, struct sockaddr *addr, socklen_t len)
{
address_t xaddr;
memcpy(&xaddr, addr, len);
at which point it can safely inspect xaddr.sa.sa_family and then xaddr.sa_in.sin_addr or whatever.
Whether it would be okay for connect to just cast its addr argument to address_t *, when the caller might not have used such a union itself, is unclear to me; I can imagine arguments both ways from the text of the standard (which is ambiguous on certain key points having to do with the exact meanings of the words "object", "access", and "effective type"), and I don't know what compilers would actually do. In practice connect has to do a copy anyway, because it's a system call and almost all memory blocks passed across the user/kernel boundary have to be copied.
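The same copy-into-a-union idea can be used in your own functions that receive a struct sockaddr *. The following is a sketch only, assuming a POSIX system; address_copy and address_length are hypothetical helper names. After the copy, every access goes through the union object itself:

```c
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

typedef union address {
    struct sockaddr         sa;
    struct sockaddr_in      sa_in;
    struct sockaddr_in6     sa_in6;
    struct sockaddr_storage sa_stor;
} address_t;

/* Hypothetical helper: copies a caller-supplied address into the union.
 * Returns 0 on success, -1 if len bytes do not fit. */
int address_copy(address_t *dst, const struct sockaddr *src, socklen_t len)
{
    assert(dst != NULL && src != NULL);
    if (len > sizeof *dst)
        return -1;
    memset(dst, 0, sizeof *dst);
    memcpy(dst, src, len);
    return 0;
}

/* Hypothetical helper: length to pass to connect() or bind() for the
 * stored family, or 0 if this sketch does not handle the family. */
socklen_t address_length(const address_t *a)
{
    switch (a->sa.sa_family) {
    case AF_INET:  return sizeof a->sa_in;
    case AF_INET6: return sizeof a->sa_in6;
    default:       return 0;
    }
}
```

A function written this way never dereferences its struct sockaddr * argument as another struct type, so it does not matter whether the caller used a union.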

Safely converting from struct sockaddr to struct sockaddr_storage

I have a function which takes in "struct sockaddr *" as a parameter (let's call this input_address), and then I need to operate on that address, which may be a sockaddr_in or sockaddr_in6, since I support both IPv4 and IPv6.
I'm getting some memory corruption and trying to track it down to its source, and in the process found some code that seems suspect, so I would like to validate whether this is the right way to do things.
struct sockaddr_storage *input_address_storage = (struct sockaddr_storage *) input_address;
struct sockaddr_storage result = [UtilityClass performSomeOperation: *input_address_storage];
At first I thought the cast in the first line was safe, but then in the second line I need to dereference that pointer, which seems like it may be wrong. The reason I am concerned is that it may end up copying memory that is beyond where the original structure is (since sockaddr_in is shorter than sockaddr_in6). I am not sure if this could cause a memory corruption (my guess is no), but nevertheless this code gives me a bad feeling.
I can't change the fact my function takes a "struct sockaddr *", so it seems like it would be difficult to work around this type of code, and yet I want to avoid copying from a memory location where I shouldn't be.
If anyone can validate whether what I am doing is wrong, and the best way to fix this, I'd appreciate it.
EDIT: An admin had changed my C tag for C# for some reason. The code I gave is primarily C, with one function call from objective C that doesn't really matter. That call could have been C.
The problem with your approach is that you are converting an existing struct sockaddr* into a struct sockaddr_storage*. Imagine what happens if the original was a struct sockaddr_in. Since sizeof(struct sockaddr_in) < sizeof(struct sockaddr_storage), dereferencing the converted pointer reads past the end of the original object, and a memory sanitizer reports an out-of-bounds read.
struct sockaddr_storage is essentially a container to contain either your struct sockaddr_in or struct sockaddr_in6.
Hence, it is useful when you want to pass in a struct sockaddr* object but want to allocate enough memory for both sockaddr_in and sockaddr_in6.
A good example is the recvfrom(3) call:
ssize_t recvfrom(int socket, void *restrict buffer, size_t length,
int flags, struct sockaddr *restrict address,
socklen_t *restrict address_len);
Since address requires a struct sockaddr* object, we will construct a struct sockaddr_storage first, and pass it in:
struct sockaddr_storage address;
socklen_t address_length = sizeof(struct sockaddr_storage);
ssize_t ret = recvfrom(fd, buffer, buffer_length, 0, (struct sockaddr*)&address, &address_length);
if (address.ss_family == AF_INET) {
DoIpv4Work((struct sockaddr_in*)&address, ...);
} else if (address.ss_family == AF_INET6) {
DoIpv6Work((struct sockaddr_in6*)&address, ...);
}
The difference in your approach and mine is that I allocate a struct sockaddr_storage and then use it as struct sockaddr, but you do the REVERSE, and use a struct sockaddr and then use it as struct sockaddr_storage.
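If the function signature has to stay as struct sockaddr *, one way to get a struct sockaddr_storage without over-reading is to copy only as many bytes as the address family implies. This is a minimal sketch assuming a POSIX system and only the two IP families; the name sockaddr_to_storage is hypothetical:

```c
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: copies *src into *dst, reading only as many bytes
 * as the address family says are there. Returns 0 on success, -1 for a
 * family this sketch does not handle. */
int sockaddr_to_storage(const struct sockaddr *src,
                        struct sockaddr_storage *dst)
{
    size_t len;

    assert(src != NULL && dst != NULL);
    switch (src->sa_family) {
    case AF_INET:  len = sizeof(struct sockaddr_in);  break;
    case AF_INET6: len = sizeof(struct sockaddr_in6); break;
    default:       return -1;
    }
    memset(dst, 0, sizeof *dst);   /* bytes past len stay zero */
    memcpy(dst, src, len);
    return 0;
}
```

The result is a real struct sockaddr_storage object that can be passed by value, which replaces the *input_address_storage dereference from the question.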

socket structure : casting?

Context
I am self-learning how sockets work.
Admittedly, I am not a C guru, but I learn fast.
I read this page :
http://publib.boulder.ibm.com/infocenter/iseries/v5r3/topic/rzab6/rzab6xafunixsrv.htm
Problem
I am stuck at this line :
rc = bind(sd, (struct sockaddr *)&serveraddr, SUN_LEN(&serveraddr));
I just cannot figure what we get from this cast (struct sockaddr *)&serveraddr.
My test so far :
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <sys/types.h>
#include <netinet/in.h>
#include <sys/socket.h>
int main(void)
{
/*JUST TESTING THE CAST THING NOTHING ELSE HERE*/
struct sockaddr_in localaddr ;
struct sockaddr_in * mi;
struct sockaddr * toto;
localaddr.sin_family = AF_INET;
localaddr.sin_addr.s_addr = htonl(INADDR_ANY);
localaddr.sin_port = 38999;
/* DID I DEFINED MI CORRECTLY ? */
mi = (struct sockaddr*)&localaddr;
toto = (struct sockaddr*)&localaddr;
printf("mi %d\n",mi->sin_family);
printf("mi %d\n",mi->sin_port);
printf("toto %d\n",toto->sa_family);
/*ERROR*/
printf("toto %d\n",toto->sa_port);
}
SUM UP
Could someone please tell me what is really passed to the bind function concerning the structure cast ?
What members do we have in that structure ?
How can I check it ?
Thanks
Here's struct sockaddr:
struct sockaddr {
uint8_t sa_len;
sa_family_t sa_family;
char sa_data[14];
};
and, for instance, here's struct sockaddr_in:
struct sockaddr_in {
uint8_t sin_len;
sa_family_t sin_family; /* same position as sa_family */
in_port_t sin_port;
struct in_addr sin_addr;
char sin_zero[8];
};
and struct sockaddr_in6:
struct sockaddr_in6 {
uint8_t sin6_len;
sa_family_t sin6_family; /* same position as sa_family */
in_port_t sin6_port;
uint32_t sin6_flowinfo;
struct in6_addr sin6_addr;
};
You'll note that they all share the first two members in common. Thus, functions like bind() can accept a pointer to a generic struct sockaddr and know that, regardless of what specific struct it actually points to, it'll have sa_len and sa_family in common (and "in common" here means "laid out the same way in memory", so there won't be any weirdness where both structs have an sa_family member, but they're in totally different places in the two different structs. Technically sa_len is optional, but if it's not there, none of the structs will have it, so sa_family will still be aligned in the same way, and often the datatype of sa_family_t will be increased to make up the difference in size). So, it can access sa_family and determine exactly what type of struct it is, and proceed accordingly, e.g. something like:
int bind(int socket, const struct sockaddr *address, socklen_t address_len) {
if ( address->sa_family == AF_INET ) {
struct sockaddr_in * real_struct = (struct sockaddr_in *)address;
/* Do stuff with IPv4 socket */
}
else if ( address->sa_family == AF_INET6 ) {
struct sockaddr_in6 * real_struct = (struct sockaddr_in6 *)address;
/* Do stuff with IPv6 socket */
}
/* etc */
}
(Pedantic note: technically, according to the C standard [section 6.5.2.3.6 of C11], you're only supposed to inspect common initial parts of structs like this if you embed them within a union, but in practice it'll almost always work without one, and for simplicity I haven't used one in the above code).
It's basically a way of getting polymorphism when you don't actually have real OOP constructs. In other words, it means you don't have to have a bunch of functions like bind_in(), bind_in6(), and all the rest of it, one single bind() function can handle them all because it can figure out what type of struct you actually have (provided that you set the sa_family member correctly, of course).
The reason you need the actual cast is because C's type system requires it. You have a generic pointer in void *, but beyond that everything has to match, so if a function accepts a struct sockaddr * it just won't let you pass anything else, including a struct sockaddr_in *. The cast essentially tells the compiler "I know what I'm doing, here, trust me", and it'll relax the rules for you. bind() could have been written to accept a void * instead of a struct sockaddr * and a cast would not have been necessary, but it wasn't written that way, because:
It's semantically more meaningful - bind() isn't written to accept any pointer whatsoever, just one to a struct which is "derived" from struct sockaddr; and
The original sockets API was released in 1983, which was before the ANSI C standard in 1989, and its predecessor - K&R C - just didn't have void *, so you were going to have to cast it to something in any case.
Casts appear in socket code because C doesn't have inheritance.
struct sockaddr is an abstract supertype of struct sockaddr_in and friends. The syscalls take the abstract type, you want to pass an actual instance of a derived type, and C doesn't know that converting a struct sockaddr_in * to a struct sockaddr * is automatically safe because it has no idea of the relationship between them.
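The "same layout at the start" property that makes this work can be checked at compile time. This sketch assumes a C11 compiler and a POSIX system; sockaddr_size is a hypothetical helper that shows the kind of dispatch a function like bind() can do once it has read the family:

```c
#include <assert.h>   /* static_assert (C11) */
#include <netinet/in.h>
#include <stddef.h>   /* offsetof */
#include <sys/socket.h>

/* The family field must sit at the same offset in every variant. */
static_assert(offsetof(struct sockaddr, sa_family) ==
              offsetof(struct sockaddr_in, sin_family),
              "sockaddr and sockaddr_in disagree on the family offset");
static_assert(offsetof(struct sockaddr, sa_family) ==
              offsetof(struct sockaddr_in6, sin6_family),
              "sockaddr and sockaddr_in6 disagree on the family offset");
static_assert(offsetof(struct sockaddr, sa_family) ==
              offsetof(struct sockaddr_storage, ss_family),
              "sockaddr and sockaddr_storage disagree on the family offset");
static_assert(sizeof(struct sockaddr_storage) >= sizeof(struct sockaddr_in6),
              "sockaddr_storage is too small for sockaddr_in6");

/* Hypothetical helper: size of the concrete struct behind a generic
 * pointer, or 0 if the family is not one this sketch knows about. */
size_t sockaddr_size(const struct sockaddr *sa)
{
    switch (sa->sa_family) {
    case AF_INET:  return sizeof(struct sockaddr_in);
    case AF_INET6: return sizeof(struct sockaddr_in6);
    default:       return 0;
    }
}
```

If any of the assertions failed, the build would stop, which is a cheap way to document the layout assumption the sockets API relies on.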

Resources