What does "struct sockaddr_in servaddr, cliaddr;" mean? - c

struct sockaddr_in servaddr, cliaddr;
I am a newbie in socket programming.
What does this statement mean in C socket programming?
Are we creating a struct named sockaddr_in, with servaddr and cliaddr as its members? Why is their data type not mentioned?

What does this statement mean in C socket programming?
It declares two uninitialized variables of the type struct sockaddr_in.
Are we creating a struct named sockaddr_in?
No. The type must already be defined (normally by a header) before you can declare variables of it; otherwise your program is ill-formed.
and are the servaddr and cliaddr the members?
Nope.
Why is their datatype not mentioned?
It is. See the answer to your first question.
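A minimal sketch of what the declaration does, assuming a POSIX system where <netinet/in.h> supplies the type. The helper names init_ipv4 and declare_pair are made up for illustration and are not part of any API:

```c
#include <string.h>
#include <netinet/in.h>

/* Hypothetical helper: zero-fill the struct, then set the address family. */
static void init_ipv4(struct sockaddr_in *addr)
{
    memset(addr, 0, sizeof *addr);
    addr->sin_family = AF_INET;
}

/* Returns 1 if both variables ended up as initialized IPv4 addresses. */
static int declare_pair(void)
{
    /* One declaration, two variables of the already-defined type.
       Equivalent to writing:
           struct sockaddr_in servaddr;
           struct sockaddr_in cliaddr;                                */
    struct sockaddr_in servaddr, cliaddr;

    init_ipv4(&servaddr);
    init_ipv4(&cliaddr);

    return servaddr.sin_family == AF_INET && cliaddr.sin_family == AF_INET;
}
```

Nothing here defines struct sockaddr_in; the header does. The "struct sockaddr_in" in front of the two names is the data type the question was looking for.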

In Unix sockets, header netinet/in.h defines a struct type sockaddr_in, something like this:
struct sockaddr_in {
short sin_family; // e.g. AF_INET, AF_INET6
unsigned short sin_port; // e.g. htons(3490)
struct in_addr sin_addr; // see struct in_addr, below
char sin_zero[8]; // zero this if you want to
};
struct in_addr {
unsigned long s_addr; // load with inet_pton()
};
(ref: http://beej.us/guide/bgnet/output/html/multipage/sockaddr_inman.html)
servaddr and cliaddr are the names of your two variables of type struct sockaddr_in. Each of them has (on Unix, at least) four members (short sin_family and so on). One of those members, sin_addr, is itself a struct of type in_addr, which is defined a couple of lines later and has a single member, unsigned long s_addr. There is a reason for a struct with only one member; you can find explanations for that on Stack Overflow too.
If you want to use sockaddr_in, you must #include <netinet/in.h>, or its Windows equivalent (unless you define the struct yourself).
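As a rough sketch of how those members are usually filled in on a POSIX system (fill_ipv4 is a hypothetical helper name, and the address and port a caller passes are arbitrary):

```c
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>

/* Hypothetical helper: fill *out from a dotted-quad string and a
   host-order port. Returns 0 on success, -1 if the string is not a
   valid IPv4 address. */
static int fill_ipv4(struct sockaddr_in *out, const char *ip, uint16_t port)
{
    memset(out, 0, sizeof *out);      /* also clears sin_zero */
    out->sin_family = AF_INET;
    out->sin_port = htons(port);      /* stored in network byte order */

    /* inet_pton() writes the address into the nested struct in_addr. */
    if (inet_pton(AF_INET, ip, &out->sin_addr) != 1)
        return -1;
    return 0;
}
```

A caller would declare struct sockaddr_in servaddr; and then call something like fill_ipv4(&servaddr, "127.0.0.1", 3490) before passing the struct to bind() or connect().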

Related

What is the difference between struct addrinfo and struct sockaddr

From what I understand, struct addrinfo is used to prepare the socket address structure, and struct sockaddr contains the socket address information. But what does that actually mean? struct addrinfo contains a pointer to a struct sockaddr. Why keep them separate? Why can't we fold everything in sockaddr into addrinfo?
I'm just guessing here, but is the reason for their separation to save space when passing structs? For example, the bind() call only needs the port number and the internet address, and both of these are grouped in a struct sockaddr. So can we pass just this small struct instead of the larger struct addrinfo?
struct addrinfo {
int ai_flags; // AI_PASSIVE, AI_CANONNAME, etc.
int ai_family; // AF_INET, AF_INET6, AF_UNSPEC
int ai_socktype; // SOCK_STREAM, SOCK_DGRAM
int ai_protocol; // use 0 for "any"
size_t ai_addrlen; // size of ai_addr in bytes
struct sockaddr *ai_addr; // struct sockaddr_in or _in6
char *ai_canonname; // full canonical hostname
struct addrinfo *ai_next; // linked list, next node
};
struct sockaddr {
unsigned short sa_family; // address family, AF_xxx
char sa_data[14]; // 14 bytes of protocol address
};
struct addrinfo is returned by getaddrinfo(), and contains, on success, a linked list of such structs for a specified hostname and/or service.
The ai_addr member doesn't actually point to a plain struct sockaddr, because that struct is merely a generic one that contains the members common to all the others, and it is used to determine what type of struct you actually have. Depending upon what you pass to getaddrinfo(), and what that function found out, ai_addr might actually be a pointer to a struct sockaddr_in, or a struct sockaddr_in6, or whatever else is appropriate for that particular address entry. This is one good reason why they're kept "separate": that member might point to one of a bunch of different types of structs, which it couldn't do if you tried to hardcode all the members into struct addrinfo, because those different structs have different members.
This is probably the easiest way to get this information if you have a hostname, but it's not the only way. For an IPv4 connection, you can just populate a struct sockaddr_in structure yourself, if you want to and you have the data to do so, and avoid going through the rigamarole of calling getaddrinfo(), which you might have to wait for if it needs to go out into the internet to collect the information for you. You don't have to use struct addrinfo at all.
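To make the "ai_addr might point to different struct types" part concrete, here is a small sketch that walks the list returned by getaddrinfo() and casts ai_addr according to ai_family. The helper name first_endpoint is hypothetical, and the AI_NUMERICHOST | AI_NUMERICSERV flags are only my choice so that the sketch never performs a DNS lookup:

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* expose getaddrinfo() under strict -std modes */
#endif
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: resolve a numeric host and port, then report the
   address family and host-order port of the first IPv4 or IPv6 entry.
   Returns 0 on success, -1 on failure. */
static int first_endpoint(const char *host, const char *service,
                          int *family, unsigned *port)
{
    struct addrinfo hints, *res, *p;
    int rc = -1;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags = AI_NUMERICHOST | AI_NUMERICSERV;   /* no DNS lookup */

    if (getaddrinfo(host, service, &hints, &res) != 0)
        return -1;

    for (p = res; p != NULL; p = p->ai_next) {
        /* ai_addr is typed struct sockaddr *, but ai_family says
           which concrete struct it really points to. */
        if (p->ai_family == AF_INET) {
            const struct sockaddr_in *v4 = (const struct sockaddr_in *)p->ai_addr;
            *family = AF_INET;
            *port = ntohs(v4->sin_port);
            rc = 0;
            break;
        }
        if (p->ai_family == AF_INET6) {
            const struct sockaddr_in6 *v6 = (const struct sockaddr_in6 *)p->ai_addr;
            *family = AF_INET6;
            *port = ntohs(v6->sin6_port);
            rc = 0;
            break;
        }
    }

    freeaddrinfo(res);   /* the whole linked list is released in one call */
    return rc;
}
```

In real code you would normally not inspect the entry at all and simply pass p->ai_addr and p->ai_addrlen straight to connect() or bind(), which is exactly why the generic pointer is convenient.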

socket structure : casting?

Context
I am self-learning how sockets work.
Admittedly, I am not a C guru, but I learn fast.
I read this page :
http://publib.boulder.ibm.com/infocenter/iseries/v5r3/topic/rzab6/rzab6xafunixsrv.htm
Problem
I am stuck at this line :
rc = bind(sd, (struct sockaddr *)&serveraddr, SUN_LEN(&serveraddr));
I just cannot figure out what we get from this cast: (struct sockaddr *)&serveraddr.
My test so far :
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <sys/types.h>
#include <netinet/in.h>
#include <sys/socket.h>
int main(void)
{
/*JUST TESTING THE CAST THING NOTHING ELSE HERE*/
struct sockaddr_in localaddr ;
struct sockaddr_in * mi;
struct sockaddr * toto;
localaddr.sin_family = AF_INET;
localaddr.sin_addr.s_addr = htonl(INADDR_ANY);
localaddr.sin_port = 38999;
/* DID I DEFINED MI CORRECTLY ? */
mi = (struct sockaddr*)&localaddr;
toto = (struct sockaddr*)&localaddr;
printf("mi %d\n",mi->sin_family);
printf("mi %d\n",mi->sin_port);
printf("toto %d\n",toto->sa_family);
/*ERROR*/
printf("toto %d\n",toto->sa_port);
}
SUM UP
Could someone please tell me what is really passed to the bind function concerning the structure cast ?
What members do we have in that structure ?
How can I check it ?
Thanks
Here's struct sockaddr:
struct sockaddr {
uint8_t sa_len;
sa_family_t sa_family;
char sa_data[14];
};
and, for instance, here's struct sockaddr_in:
struct sockaddr_in {
uint8_t sin_len;
sa_family_t sin_family;
in_port_t sin_port;
struct in_addr sin_addr;
char sin_zero[8];
};
and struct sockaddr_in6:
struct sockaddr_in6 {
uint8_t sin6_len;
sa_family_t sin6_family;
in_port_t sin6_port;
uint32_t sin6_flowinfo;
struct in6_addr sin6_addr;
uint32_t sin6_scope_id;
};
You'll note that they all begin with the same two members: a length byte and an address family, differing only in the name prefix (sa_, sin_, sin6_). Thus, functions like bind() can accept a pointer to a generic struct sockaddr and know that, regardless of which specific struct it actually points to, the length and family members are in common. "In common" here means "laid out the same way in memory", so there won't be any weirdness where both structs have a family member but in totally different places. Technically the length member is optional (Linux, for example, omits it), but if it's not there, none of the structs will have it, so the family member will still be aligned in the same way, and often the datatype of sa_family_t will be increased to make up the difference in size. So, bind() can access sa_family, determine exactly what type of struct it has, and proceed accordingly, e.g. something like:
int bind(int socket, const struct sockaddr *address, socklen_t address_len) {
if ( address->sa_family == AF_INET ) {
struct sockaddr_in * real_struct = (struct sockaddr_in *)address;
/* Do stuff with IPv4 socket */
}
else if ( address->sa_family == AF_INET6 ) {
struct sockaddr_in6 * real_struct = (struct sockaddr_in6 *)address;
/* Do stuff with IPv6 socket */
}
/* etc */
}
(Pedantic note: technically, according to the C standard [section 6.5.2.3, paragraph 6 of C11], you're only supposed to inspect common initial parts of structs like this if you embed them within a union, but in practice it'll almost always work without one, and for simplicity I haven't used one in the above code.)
It's basically a way of getting polymorphism when you don't actually have real OOP constructs. In other words, it means you don't have to have a bunch of functions like bind_in(), bind_in6(), and all the rest of it, one single bind() function can handle them all because it can figure out what type of struct you actually have (provided that you set the sa_family member correctly, of course).
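If you want to convince yourself of the "laid out the same way in memory" claim on your own system, a quick check with offsetof should do it. This is only a sketch; family_offsets_match is a name I made up:

```c
#include <stddef.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical helper: returns 1 if the address-family member sits at
   the same byte offset in the generic struct and in both concrete ones. */
static int family_offsets_match(void)
{
    size_t generic = offsetof(struct sockaddr, sa_family);

    return generic == offsetof(struct sockaddr_in, sin_family)
        && generic == offsetof(struct sockaddr_in6, sin6_family);
}
```

The function should return 1 whether or not your platform has the optional length byte, because that byte is either present in all of the structs or in none of them.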
The reason you need the actual cast is because C's type system requires it. You have a generic pointer in void *, but beyond that everything has to match, so if a function accepts a struct sockaddr * it just won't let you pass anything else, including a struct sockaddr_in *. The cast essentially tells the compiler "I know what I'm doing, here, trust me", and it'll relax the rules for you. bind() could have been written to accept a void * instead of a struct sockaddr * and a cast would not have been necessary, but it wasn't written that way, because:
It's semantically more meaningful - bind() isn't written to accept any pointer whatsoever, just one to a struct which is "derived" from struct sockaddr; and
The original sockets API was released in 1983, which was before the ANSI C standard in 1989, and its predecessor - K&R C - just didn't have void *, so you were going to have to cast it to something in any case.
Casts appear in socket code because C doesn't have inheritance.
struct sockaddr is an abstract supertype of struct sockaddr_in and friends. The syscalls take the abstract type, you want to pass an actual instance of a derived type, and C doesn't know that converting a struct sockaddr_in * to a struct sockaddr * is automatically safe because it has no idea of the relationship between them.
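Applied to the test program in the question, the pattern looks roughly like the sketch below: the callee only sees the generic pointer, reads sa_family, and casts back to the concrete type before touching the port. port_of is a hypothetical helper, not a real API, and it relies on the same common-initial-layout convention described above:

```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: return the host-order port stored behind a
   generic pointer, or -1 for an address family this sketch does not
   handle. */
static int port_of(const struct sockaddr *sa)
{
    switch (sa->sa_family) {
    case AF_INET:
        /* Only after checking the family is it safe to cast back. */
        return ntohs(((const struct sockaddr_in *)sa)->sin_port);
    case AF_INET6:
        return ntohs(((const struct sockaddr_in6 *)sa)->sin6_port);
    default:
        return -1;
    }
}
```

With the question's variables, and assuming the port was stored with htons(38999), port_of((struct sockaddr *)&localaddr) gives back 38999. There is no sa_port member on struct sockaddr, which is why the last printf in the question does not compile, and mi should be assigned &localaddr directly, without the cast to struct sockaddr *.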

What is the need of separate address structure in sockaddr_in?

This is the Internet (IPv4) socket address structure defined in netinet/in.h:
struct sockaddr_in {
uint8_t sin_len;
sa_family_t sin_family;
in_port_t sin_port;
struct in_addr sin_addr;
char sin_zero[8];
};
struct in_addr {
in_addr_t s_addr;
};
What is the need for a separate structure just for the address field?
Why can't we use the following structure instead?
struct sockaddr_in {
uint8_t sin_len;
sa_family_t sin_family;
in_port_t sin_port;
in_addr_t sin_addr;
char sin_zero[8];
};
It's for historical reasons. In the early days of socket programming, struct in_addr contained a union of various structures so you could get to the individual bytes. This union became unnecessary when subnetting and classless addressing came along, but switching out the struct for a simple unsigned long would break a lot of code, so it just stayed that way.
If you're interested in network programming and you haven't yet picked up a copy of UNIX Network Programming then I'd highly recommend doing so, it's a goldmine for little details like this.
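For what it's worth, you can still get at the individual bytes without the old union. A small sketch (ipv4_octets is a hypothetical helper name) that relies only on s_addr being a 32-bit value held in network byte order:

```c
#include <string.h>
#include <netinet/in.h>

/* Hypothetical helper: copy the four address bytes out of an in_addr.
   s_addr is kept in network byte order, so out[0] is the first octet
   of the dotted-quad form. */
static void ipv4_octets(struct in_addr addr, unsigned char out[4])
{
    memcpy(out, &addr.s_addr, 4);
}
```

For 192.168.1.10 this fills out with 192, 168, 1 and 10, regardless of the host's endianness.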

Why is sin_addr inside the structure in_addr?

My question is about the following socket structure on UNIX:
struct sockaddr_in {
short sin_family; // e.g. AF_INET, AF_INET6
unsigned short sin_port; // e.g. htons(3490)
struct in_addr sin_addr; // see struct in_addr, below
char sin_zero[8]; // zero this if you want to
};
Here the member sin_addr is of type struct in_addr.
But I don't get why someone would want to do that, as all struct in_addr has is:
struct in_addr {
unsigned long s_addr; // load with inet_pton()
};
All in_addr has is one member, s_addr. Why can't we have something like this:
struct sockaddr_in {
short sin_family; // e.g. AF_INET, AF_INET6
unsigned short sin_port; // e.g. htons(3490)
unsigned long s_addr ;
char sin_zero[8]; // zero this if you want to
};
struct in_addr is sometimes very different than that, depending on what system you're on. On Windows for example:
typedef struct in_addr {
union {
struct {
u_char s_b1,s_b2,s_b3,s_b4;
} S_un_b;
struct {
u_short s_w1,s_w2;
} S_un_w;
u_long S_addr;
} S_un;
} IN_ADDR, *PIN_ADDR, FAR *LPIN_ADDR;
The only requirement is that it contain a member s_addr.
struct in_addr is more than just an integer because it might contain more than a single in_addr_t. On many systems it holds a union, and the reason for that implementation is class A/B/C addressing, which is no longer used.
Unix Network Programming Volume 1 explains the historical reason in detail:
The reason the sin_addr member is a structure, and not just an in_addr_t, is historical. Earlier releases (4.2BSD) defined the in_addr structure as a union of various structures, to allow access to each of the 4 bytes and to both of the 16-bit values contained within the 32-bit IPv4 address. This was used with class A, B, and C addresses to fetch the appropriate bytes of the address. But with the advent of subnetting and then the disappearance of the various address classes with classless addressing, the need for the union disappeared. Most systems today have done away with the union and just define in_addr as a structure with a single in_addr_t member.
Because the in_addr structure may contain more than one member.
http://pubs.opengroup.org/onlinepubs/009604599/basedefs/netinet/in.h.html
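In practice this means portable code should go through s_addr and nothing else, since that is the one member you can count on. A minimal sketch, with ipv4_host_order being a made-up helper name:

```c
#include <stdint.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: the address as a host-order 32-bit integer,
   using only the one member every implementation has to provide. */
static uint32_t ipv4_host_order(struct in_addr addr)
{
    return ntohl(addr.s_addr);
}
```

Because the struct is passed around as a whole, this keeps working even if the platform's struct in_addr carries extra members or wraps s_addr in a union.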

What does the abbreviation "s_", "ai_", "sin_", "in" (if such) in the IP structures mean?

Pretty simple question. And yes, maybe not (that) important, but I'm really curious what these prefixes mean, and I couldn't find an explanation anywhere.
// ipv4
struct sockaddr_in {
short int sin_family; // Address family, AF_INET
unsigned short int sin_port; // Port number
struct in_addr sin_addr; // Internet address
unsigned char sin_zero[8]; // Same size as struct sockaddr
};
// ipv4
struct in_addr {
uint32_t s_addr; // that's a 32-bit int (4 bytes)
};
// ipv6
struct addrinfo {
int ai_flags; // AI_PASSIVE, AI_CANONNAME, etc.
int ai_family; // AF_INET, AF_INET6, AF_UNSPEC
int ai_socktype; // SOCK_STREAM, SOCK_DGRAM
int ai_protocol; // use 0 for "any"
size_t ai_addrlen; // size of ai_addr in bytes
struct sockaddr *ai_addr; // struct sockaddr_in or _in6
char *ai_canonname; // full canonical hostname
struct addrinfo *ai_next; // linked list, next node
};
// ipv6
struct sockaddr {
unsigned short sa_family; // address family, AF_xxx
char sa_data[14]; // 14 bytes of protocol address
};
sin_ means sockaddr_in, ai_ means addrinfo, sa_ means sockaddr. I'm not sure about the s_ in in_addr. The sockets API was designed with pre-standard early 1980s C compilers in mind, which might have had a single namespace for all struct members.
larsman is mostly right, but it's not merely a matter of legacy single-namespace considerations. All the structs defined in standard headers use names of this form to avoid stepping on the application's namespace for macros. If struct members were not prefixed with ai_, sin_, etc., then whatever member names were included in the struct (including extensions not even specified in the C or POSIX standards) would clash and result in errors if an application defined the same name as a preprocessor macro. By using these "struct-local namespaces" that can be reserved by simple pattern rules in the standards (for instance, netdb.h reserves ai_*) there is a clear distinction between names reserved for use by the implementation and names reserved for use by the application, and new extensions or new revisions of the standard will not result in clashes.
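A tiny sketch of the macro clash being described. Everything in it is hypothetical: the port macro stands in for an application-defined name, and struct demo_addr with its da_ prefix stands in for a header-defined struct:

```c
/* An application is free to define a macro with an ordinary name. */
#define port 8080

/* Hypothetical struct that follows the prefix convention. */
struct demo_addr {
    unsigned short da_port;   /* prefixed: the macro above cannot touch it */
    /* unsigned short port;      would expand to "unsigned short 8080;"
                                 and fail to compile */
};

static unsigned short demo_port(void)
{
    struct demo_addr a = { port };   /* the macro expands to 8080 here, as intended */
    return a.da_port;
}

#undef port   /* keep the macro from leaking past this sketch */
```

With an unprefixed member, the application's macro would silently rewrite the header's struct definition; with the da_ prefix, the two names can never collide.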
The Microsoft definition of in_addr could imply that the "S_" prefix means struct, as per @R..'s answer, with "un" standing for union; however, it uses a capital "S", unlike POSIX land.
typedef struct in_addr {
union {
struct {
u_char s_b1,s_b2,s_b3,s_b4;
} S_un_b;
struct {
u_short s_w1,s_w2;
} S_un_w;
u_long S_addr;
} S_un;
} IN_ADDR, *PIN_ADDR, FAR *LPIN_ADDR;
Across the socket structures, "s" usually means "sock", short for "socket", so there does not appear to be a definitive reason.
As a curiosity, the MSDN page on IPv6 support shows struct in6_addr with a member named s6_addr, implying the IPv6 version of an IPv4 "socket" structure rather than just an "IPv4 structure".
