What's the addrlen field in recvfrom() used for? - c

I'm using recvfrom in my program to get DGRAM data from a server I specify in src_addr. However, I'm not sure why I need to initialize and pass in addrlen.
I read the man page and I didn't really understand what it's getting at.
If src_addr is not NULL, and the underlying protocol provides the source address, this source address is filled in. When src_addr is NULL, nothing is filled in; in this case, addrlen is not used, and should also be NULL. The argument addrlen is a value-result argument, which the caller should initialize before the call to the size of the buffer associated with src_addr, and modified on return to indicate the actual size of the source address. The returned address is truncated if the buffer provided is too small; in this case, addrlen will return a value greater than was supplied to the call.
I'm guessing that it has something to do with src_addr being IPv4 or IPv6. Is this correct?
Thanks!

Maybe there is a misinterpretation on your side. Talking about:
ssize_t recvfrom(int sockfd, void *buf, size_t len, int flags,
struct sockaddr *src_addr, socklen_t *addrlen);
src_addr is not used to hand in the address you would like to listen to; it is a storage location that you provide so that the actual source address can be handed back to you.
Thus, if you set src_addr to NULL because you're not interested in the address at all, you don't have to care about addrlen, as it won't be used anyway.
If, on the other hand, you want to be informed about the source address, you not only have to provide a storage location but also say how big that storage location is.
That's why you should initialize *addrlen to the size of the buffer you allocated.
After the call, the value pointed to by addrlen tells you how much (if any) of the space you allocated for the source address was actually filled with data. If the buffer was too small, the address is truncated and the value is larger than the size you passed in.
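To make the value-result behaviour concrete, here is a minimal sketch that sends one datagram to itself over loopback and records *addrlen before and after recvfrom(). The helper name demo_addrlen and its out-parameters are made up for this illustration, and it assumes a POSIX system with a working IPv4 loopback interface.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper (not part of any API): sends one UDP datagram to
 * ourselves and reports the value of the addrlen argument before and
 * after recvfrom(). Returns 0 on success, -1 on any socket error. */
int demo_addrlen(socklen_t *before, socklen_t *after, int *family)
{
    int rc = -1;
    char buf[16];
    struct sockaddr_in local;
    socklen_t local_len = sizeof local;
    struct sockaddr_storage src;      /* big enough for any address family */
    socklen_t src_len = sizeof src;   /* in: size of the buffer we provide */
    int rx = socket(AF_INET, SOCK_DGRAM, 0);
    int tx = socket(AF_INET, SOCK_DGRAM, 0);

    if (rx < 0 || tx < 0)
        goto done;

    memset(&local, 0, sizeof local);
    local.sin_family = AF_INET;
    local.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    local.sin_port = 0;               /* let the kernel pick a free port */
    if (bind(rx, (struct sockaddr *)&local, sizeof local) < 0 ||
        getsockname(rx, (struct sockaddr *)&local, &local_len) < 0)
        goto done;

    if (sendto(tx, "hi", 2, 0, (struct sockaddr *)&local, sizeof local) != 2)
        goto done;

    *before = src_len;
    if (recvfrom(rx, buf, sizeof buf, 0, (struct sockaddr *)&src, &src_len) < 0)
        goto done;
    *after = src_len;                 /* out: size of the address written */
    *family = src.ss_family;
    rc = 0;
done:
    if (rx >= 0) close(rx);
    if (tx >= 0) close(tx);
    return rc;
}
```

On return, *before holds sizeof(struct sockaddr_storage), while *after holds the size of the address the kernel actually wrote, which for an IPv4 sender is sizeof(struct sockaddr_in).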
About sizes
The whole hassle with struct sockaddr and passing sizes back and forth comes from the fact that, even though sockets are most heavily used in networking, they were intended to be a much more general concept.
Think of Unix domain sockets as an example: because they are implemented via the filesystem, they require an addressing scheme totally different from the one known from IP-based networking. The type of sockaddr used here is:
struct sockaddr_un {
sa_family_t sun_family; /* AF_UNIX */
char sun_path[UNIX_PATH_MAX]; /* pathname */
};
Compare this to the struct used in IP based networking:
struct sockaddr_in {
sa_family_t sin_family; /* address family: AF_INET */
in_port_t sin_port; /* port in network byte order */
struct in_addr sin_addr; /* internet address */
};
It should be clear that the two don't have much in common.
Sockets were designed to accommodate both cases.
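A quick way to see why a length has to travel with the pointer is to compare the sizes of the different address structures. The sketch below is only an illustration (the function names are invented here); the actual numbers are platform-dependent, so none are quoted.

```c
#include <netinet/in.h>
#include <stdio.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Prints the size of each address structure on this platform. */
void print_sockaddr_sizes(void)
{
    printf("struct sockaddr         : %zu bytes\n", sizeof(struct sockaddr));
    printf("struct sockaddr_in      : %zu bytes\n", sizeof(struct sockaddr_in));
    printf("struct sockaddr_in6     : %zu bytes\n", sizeof(struct sockaddr_in6));
    printf("struct sockaddr_un      : %zu bytes\n", sizeof(struct sockaddr_un));
    printf("struct sockaddr_storage : %zu bytes\n", sizeof(struct sockaddr_storage));
}

/* Returns 1 if struct sockaddr_storage is large enough to hold each of
 * the family-specific structures above, 0 otherwise. */
int storage_fits_all(void)
{
    return sizeof(struct sockaddr_storage) >= sizeof(struct sockaddr_in) &&
           sizeof(struct sockaddr_storage) >= sizeof(struct sockaddr_in6) &&
           sizeof(struct sockaddr_storage) >= sizeof(struct sockaddr_un);
}
```

Because the structures differ in size, a function that only receives a struct sockaddr * cannot know how many bytes it may write, which is exactly the information addrlen supplies.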

ssize_t recvfrom(int socket, void *buffer, size_t length, int flags,
struct sockaddr *address, socklen_t *address_len);
The address_len argument points to the length of the address structure, i.e. the number of bytes available starting at the location indicated by address (the start address of the memory location plus the number of bytes from that start that hold the value).
The structure is defined in /usr/include/bits/socket.h
/* Structure describing a generic socket address. */
struct sockaddr
{
__SOCKADDR_COMMON (sa_); /* Common data: address family and length. */
char sa_data[14]; /* Address data. */
};
Thus the sa_data field marks the start of the address data, and address_len gives the length of the whole structure, family field included.
... whenever a function says it takes a struct sockaddr* you can cast your
struct sockaddr_in*, struct sockaddr_in6*, or struct sockaddr_storage*
to that type with ease and safety.
Therefore, as indicated in the man page and by @WhozCraig in a comment on your question, this field is updated with the actual size when the function returns.
More information
recvfrom
Beej's Guide to Network Programming - struct sockaddr and pals

Related

What does struct hostent stand for?

gethostbyname() returns a pointer to a struct hostent.
Exact function signature: struct hostent *gethostbyname(const char *)
I have no idea what the 'ent' part at the end of hostent means.
I quickly forget things when I try to memorize what I don't understand, so please help me out.
A quick search on GitHub points to basedefs/netdb.h (definitions for network database operations)
The <netdb.h> header shall define the hostent structure that includes at least the following members:
char   *h_name        Official name of the host.
char  **h_aliases     A pointer to an array of pointers to alternative host names, terminated by a null pointer.
int     h_addrtype    Address type.
int     h_length      The length, in bytes, of the address.
char  **h_addr_list   A pointer to an array of pointers to network addresses (in network byte order) for the host, terminated by a null pointer.
From there, the official documentation for gethostbyaddr() includes:
Entries shall be returned in hostent structures.
The gethostbyaddr() function shall return an entry containing addresses of address family type for the host with address addr.
The len argument contains the length of the address pointed to by addr.
The gethostbyaddr() function need not be reentrant. A function that is not required to be reentrant is not required to be thread-safe.
Upon successful completion, these functions shall return a pointer to a hostent structure if the requested entry was found, and a null pointer if the end of the database was reached or the requested entry was not found.
So there you have it: ent for entry. Not entity.

Why do network programs store IP addresses in the IP address structure?

I'm a beginner in C. My textbook covers some network programming in C and states that network programs store IP addresses in the IP address structure:
/* Internet address structure */
struct in_addr {
unsigned int s_addr; /* Network byte order (big-endian) */
};
I'm confused; can't we just store a 32-bit integer?
in_addr represents an IPv4 address, which can indeed fit in a 32-bit integer.
But, there are other types of socket addresses that cannot, such as IPv6 addresses.
Each type of socket address uses its own struct type:
in_addr for IPv4
in6_addr for IPv6
char[] for UNIX paths
etc
Usually wrapped inside a corresponding sockaddr struct:
sockaddr_in for IPv4
sockaddr_in6 for IPv6
sockaddr_un for UNIX
etc
Which is what you use with socket APIs like bind(), connect(), accept(), sendto(), recvfrom(), etc.
Very rarely would you ever need to use something like in_addr directly by itself. Typically you would use it in conjunction with an API that requires an IPv4 address to be passed via the in_addr struct.
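As a small illustration of that wrapping, the sketch below parses a dotted-quad string into the in_addr member of a sockaddr_in, which is the form that bind(), connect() and sendto() expect. The helper name make_ipv4_sockaddr is hypothetical, not a library function.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: fills *out with the IPv4 address given as text and
 * the port given in host byte order. Returns 0 on success, -1 if the text
 * is not a valid dotted-quad IPv4 address. */
int make_ipv4_sockaddr(const char *text, unsigned short port,
                       struct sockaddr_in *out)
{
    memset(out, 0, sizeof *out);
    out->sin_family = AF_INET;
    out->sin_port = htons(port);      /* stored in network byte order */

    /* inet_pton() writes the 32-bit address, already in network byte
     * order, into the struct in_addr member; it returns 1 on success. */
    return inet_pton(AF_INET, text, &out->sin_addr) == 1 ? 0 : -1;
}
```

The result would then be passed to a socket call as (struct sockaddr *)&sa together with sizeof sa.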

Why does recvfrom care about who the data comes from

So, I'm creating a server in C which uses UDP, and I want to listen for incoming packets from many sources. Therefore, when I call ssize_t recvfrom(int, void *, size_t, int, struct sockaddr * __restrict, socklen_t * __restrict), the 5th parameter, that which contains the sender's information, may vary.
Is there a way to receive the packets without knowing each individual client's address information? And, is this possible with C's library?
Here's my code:
int file_descriptor;
char data[1024];
int bytes_recved;
sockaddr_in iDontKnow;
socklen_t addr_len = sizeof(iDontKnow);
if ((bytes_recved = recvfrom(file_descriptor, data, strlen(data), 0, (struct sockaddr*)&iDontKnow, &addr_len)) < 0) {
perror("Failed to receive data");
}
I noticed that when receiving data with Java's DatagramSocket and DatagramPacket classes, the DatagramSocket's receive function took in a parameter of type DatagramPacket. This DatagramPacket, however, only held the object in which to place the data. So, why does C's implementation of UDP receiving require that you know the sender's information?
Is there a way to receive the packets without knowing each individual client's address information?
Well, you don't need to know the sender information beforehand, anyway. Once a packet is received, the sender information (if available) will be stored into address.
From the man page,
ssize_t recvfrom(int socket, void *restrict buffer, size_t length,
int flags, struct sockaddr *restrict address,
socklen_t *restrict address_len);
[...] If the address argument is not a null pointer and the protocol provides the source address of messages, the source address of the received message shall be stored in the sockaddr structure pointed to by the address argument, and the length of this address shall be stored in the object pointed to by the address_len argument.
As for the why: with connectionless sockets, you cannot reply to the sender of a packet unless you know the sender's address. That is why the sender information matters in connectionless mode, and why recvfrom() hands you the sender's address along with the received data.
EDIT:
In your code
recvfrom(file_descriptor, data, strlen(data), 0, (struct sockaddr*)&iDontKnow, &addr_len)
is wrong: strlen(data) tries to measure the length of an uninitialized char array, which does not hold a valid string, so the call invokes undefined behavior (UB). You probably want sizeof(data) instead, since data is an array.
In case you're not interested in sender's info, just pass a NULL as the corresponding argument.
To add to that: for connectionless sockets (UDP), you need the sender information whenever you want to reply. For connection-oriented sockets there is a stripped-down alternative, recv(), which only takes care of receiving and storing the data.
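Putting those points together, a minimal sketch of a UDP echo step might look like the following: the address filled in by recvfrom() is handed straight back to sendto(), and the buffer size comes from sizeof rather than strlen. The names echo_once and echo_round_trip are invented for this example, and the round-trip helper assumes a working IPv4 loopback interface.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: receives one datagram on fd and sends it back to
 * whoever sent it. Returns the number of bytes echoed, or -1 on error. */
ssize_t echo_once(int fd)
{
    char data[1024];
    struct sockaddr_storage peer;     /* filled in by recvfrom() */
    socklen_t peer_len = sizeof peer; /* in: buffer size, out: address size */

    ssize_t n = recvfrom(fd, data, sizeof data, 0,
                         (struct sockaddr *)&peer, &peer_len);
    if (n < 0)
        return -1;
    return sendto(fd, data, (size_t)n, 0, (struct sockaddr *)&peer, peer_len);
}

/* Hypothetical loopback round trip: a client sends msg, echo_once()
 * answers it, and the client reads the reply with plain recv().
 * Returns the number of reply bytes, or -1 on error. */
int echo_round_trip(const char *msg, char *reply, size_t reply_size)
{
    int rc = -1;
    struct sockaddr_in srv_addr;
    socklen_t srv_len = sizeof srv_addr;
    int srv = socket(AF_INET, SOCK_DGRAM, 0);
    int cli = socket(AF_INET, SOCK_DGRAM, 0);

    if (srv < 0 || cli < 0)
        goto done;

    memset(&srv_addr, 0, sizeof srv_addr);
    srv_addr.sin_family = AF_INET;
    srv_addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    srv_addr.sin_port = 0;            /* kernel picks a free port */
    if (bind(srv, (struct sockaddr *)&srv_addr, sizeof srv_addr) < 0 ||
        getsockname(srv, (struct sockaddr *)&srv_addr, &srv_len) < 0)
        goto done;

    if (sendto(cli, msg, strlen(msg), 0,
               (struct sockaddr *)&srv_addr, sizeof srv_addr) < 0)
        goto done;
    if (echo_once(srv) < 0)
        goto done;
    rc = (int)recv(cli, reply, reply_size, 0); /* sender not needed here */
done:
    if (srv >= 0) close(srv);
    if (cli >= 0) close(cli);
    return rc;
}
```

Note that the server side never knows the client's address in advance; it only learns it from recvfrom(), while the client, which does not need the sender, gets by with recv().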
This DatagramPacket, however, only held the object in which to place the data.
And the source address and port, and the length of the data.
So, why does C's implementation of UDP receiving require that you know the sender's information?
It doesn't. It has the option to tell you the source address and port. It's a result parameter, not an input.
You compare different functions from Java and C.
In C there is also a recv() function that does not provide any address.
The sole purpose of recvfrom over recv is to get the sender's address.
Normally servers reply to the packets they receive. Without an address, that is not possible.
If you do not care about the sender of your packets, just take recv.
Or to put it the other way around:
If you don't care about the sender, why did you pick the recvfrom version of recv?
I wonder what the server serves if it doesn't care about its clients' addresses... But that is not related to your question.
You could do it like this:
int sockfd_recv = socket(AF_INET, SOCK_DGRAM, 0); /* UDP socket */
struct sockaddr_in recvaddr;
memset(&recvaddr, 0, sizeof(recvaddr));
recvaddr.sin_family = AF_INET;
recvaddr.sin_port = htons(port_recv); /* port_recv: the port to listen on */
recvaddr.sin_addr.s_addr = htonl(INADDR_ANY); /* accept datagrams on any local interface */
int ret = bind(sockfd_recv, (struct sockaddr *)&recvaddr, sizeof(recvaddr));

Why is socklen_t * used in accept() in socket programming?

In C socket programming the accept() declaration looks like:
int accept(int sockfd, struct sockaddr *addr, socklen_t *addrlen);
I understand the uses of sockfd and struct sockaddr *addr.
But why do we have to pass the address of the length? It could have been a plain socklen_t: if accept() needs the length, it can get it from a socklen_t value. Why is the prototype declared that way?
So what is the reason for using the socklen_t * type?
In code that's agnostic to the address/protocol family of the socket it's accepting from, it may be using a generic sockaddr_storage structure to hold the result. The initial value of the pointed-to socklen_t is the size of this storage; the value after accept returns is the actual size of the resulting peer address. Also, some address/protocol families like AF_UNIX have variable length addresses, so even if you know the type you may not know the size.
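A minimal sketch of that pattern, assuming a POSIX system with IPv4 loopback: the listener below accepts into a struct sockaddr_storage and records the length before and after the call. The function name accept_demo and its out-parameters are made up for this illustration.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: opens a loopback TCP listener, connects to it from
 * the same process, and accepts the connection into a sockaddr_storage.
 * Reports the addrlen value before and after accept(). Returns 0 or -1. */
int accept_demo(socklen_t *len_in, socklen_t *len_out, int *family)
{
    int rc = -1, afd = -1;
    struct sockaddr_in addr;
    socklen_t addr_len = sizeof addr;
    struct sockaddr_storage peer;     /* family-agnostic result buffer */
    socklen_t peer_len = sizeof peer; /* in: how much room we provide */
    int lfd = socket(AF_INET, SOCK_STREAM, 0);
    int cfd = socket(AF_INET, SOCK_STREAM, 0);

    if (lfd < 0 || cfd < 0)
        goto done;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;                /* kernel picks a free port */
    if (bind(lfd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(lfd, 1) < 0 ||
        getsockname(lfd, (struct sockaddr *)&addr, &addr_len) < 0 ||
        connect(cfd, (struct sockaddr *)&addr, sizeof addr) < 0)
        goto done;

    *len_in = peer_len;
    afd = accept(lfd, (struct sockaddr *)&peer, &peer_len);
    if (afd < 0)
        goto done;
    *len_out = peer_len;              /* out: actual size of peer address */
    *family = peer.ss_family;
    rc = 0;
done:
    if (afd >= 0) close(afd);
    if (cfd >= 0) close(cfd);
    if (lfd >= 0) close(lfd);
    return rc;
}
```

If accept() took the length by value, there would be no way for it to report the second number back to the caller.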
why addrlen is needed
accept() is designed to deal with many protocol families, whose address structures may have different lengths.
The argument addr is a pointer to a sockaddr structure. This structure is filled in with
the address of the peer socket, as known to the communications layer. The exact format
of the address returned addr is determined by the socket's address family (see socket(2)
and the respective protocol man pages). When addr is NULL, nothing is filled in; in this
case, addrlen is not used, and should also be NULL.
why pointer
The addrlen argument is a value-result argument: the caller must initialize it to contain
the size (in bytes) of the structure pointed to by addr; on return it will contain the
actual size of the peer address.
why socklen_t
The socklen_t type
The third argument of accept() was originally declared as an int * (and is that under libc4 and libc5 and on many other systems like 4.x BSD, SunOS 4, SGI); a POSIX.1g draft standard wanted to change it into a size_t *, and that is what it is for SunOS 5. Later POSIX drafts have socklen_t *, and so do the Single UNIX Specification and glibc2. Quoting Linus Torvalds:
"Any sane library must have "socklen_t" be the same size as int. Anything else breaks any BSD socket layer stuff. POSIX initially did make it a size_t, and I (and hopefully others, but obviously not too many) complained to them very loudly indeed. Making it a size_t is completely broken, exactly because size_t very seldom is the same size as "int" on 64-bit architectures, for example. And it has to be the same size as "int" because that's what the BSD socket interface is. Anyway, the POSIX people eventually got a clue, and created "socklen_t". They shouldn't have touched it in the first place, but once they did they felt it had to have a named type for some unfathomable reason (probably somebody didn't like losing face over having done the original stupid thing, so they silently just renamed their blunder)."
ref: man accept, man socket

Checking the address family in socket programming

What will be the output of the following code?
char peer_ip[16];
inet_pton(AF_INET,"127.0.0.1",peer_ip);
Now I have peer_ip in network form. How can I check what the address family is? I cannot use inet_ntop now. Is there any way? Will getaddrinfo work in this case?
You can't—inet_pton gives you either a struct in_addr (for AF_INET) or a struct in6_addr (for AF_INET6), depending on what address family you pass in. If you consider these structures to be binary blobs of memory, there's no way you can recover the address family from them, you just have to keep track of what type of binary blob you have.
You should really be using a struct in_addr, not a char[16] as the value passed into inet_pton:
struct in_addr peer_ip;
inet_pton(AF_INET, "127.0.0.1", &peer_ip);
You have to go a level higher: use getaddrinfo instead of inet_pton (which doesn't handle IPv6 scopes), and instead of opaque buffers use struct sockaddr_storage and struct sockaddr pointers. Then you can immediately determine the family with ss.ss_family or sa.sa_family as appropriate.
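As a rough sketch of that approach, the function below parses a numeric address string with getaddrinfo and copies the result into a struct sockaddr_storage, from which the family can be read directly. The name family_of is invented for this example; AI_NUMERICHOST is used so that no DNS lookup is attempted.

```c
#define _POSIX_C_SOURCE 200112L /* expose getaddrinfo() under strict -std modes */
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: parses a numeric IPv4 or IPv6 address string into
 * *out and returns its address family (AF_INET or AF_INET6), or -1 if the
 * text is not a numeric address. */
int family_of(const char *text, struct sockaddr_storage *out)
{
    struct addrinfo hints;
    struct addrinfo *res = NULL;
    int family;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;      /* accept either family */
    hints.ai_socktype = SOCK_DGRAM;   /* any type; avoids duplicate results */
    hints.ai_flags = AI_NUMERICHOST;  /* parse only, never resolve names */

    if (getaddrinfo(text, NULL, &hints, &res) != 0)
        return -1;

    memset(out, 0, sizeof *out);
    memcpy(out, res->ai_addr, res->ai_addrlen);
    family = out->ss_family;          /* the family travels with the address */
    freeaddrinfo(res);
    return family;
}
```

Unlike the raw bytes produced by inet_pton, the sockaddr_storage keeps its family tag, so later code can branch on ss.ss_family without any extra bookkeeping.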
