I'm trying to understand recv() at a high level. So recv takes data in "chunks" but I'm still not getting how it is precisely handled. Example:
char buffer[1000];
int received = recv(sock, buffer, sizeof(buffer), 0);
Does this mean that if I'm receiving a massive file, recv might first store 500 bytes in buffer (returning 500 in the received variable), then on the next loop iteration receive another 300 bytes, so that all 800 bytes are stored in buffer by the end of the loop (with the counts lost in the received variable unless accounted for)? Or does buffer need a pointer to keep track of where it last received data, so that the next iteration stores its data after that point?
recv has no context. All it knows is that it got some address (pointer) to write into and some maximum size, and it will try to fill that. It will always start writing at the given address. If, for example, one wishes to add data after some previously received data, one can simply pass a pointer to the location just after the previous data instead of the beginning of the buffer. Of course, one should also reduce the maximum size it is allowed to read, so that it does not overflow the buffer.
You asked "How does recv() work?", so it may be worth briefly studying a simpler function that does essentially the same thing - read().
recv() operates in more or less the same way as the read() function. The main difference is that recv() allows you to pass flags in the last argument - but you are not using these flags anyway.
My suggestion would be - before trying to use recv() to read from a network socket - to practice using read() on a plain text file.
Both functions return the number of bytes read - except in the case of an error, in which case they will return -1. You should always check for this scenario - and handle it appropriately.
Both functions can also return less than the number of bytes requested. In the case of recv() - and reading from a socket - this may be because the other end has simply not sent all the required data yet. In the case of reading from a file - with read() - it may be because you have reached the end of the file.
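To practice the "short read" behaviour on a plain descriptor first, here is a minimal sketch. The helper name read_fully is made up for illustration (it is not a library function); it simply keeps calling read() until the buffer is full, end of file is reached, or an error occurs.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper (not part of any library): read up to `cap` bytes
 * from `fd` into `buf`, looping because read() may return fewer bytes
 * than requested. Returns the total number of bytes read, or -1 on error. */
ssize_t read_fully(int fd, char *buf, size_t cap)
{
    size_t total = 0;

    while (total < cap) {
        ssize_t n = read(fd, buf + total, cap - total);
        if (n == -1) {
            if (errno == EINTR)
                continue;        /* interrupted by a signal: retry */
            return -1;           /* real error, errno is set by read() */
        }
        if (n == 0)
            break;               /* end of file: nothing more to read */
        total += (size_t)n;      /* advance past the bytes just stored */
    }
    return (ssize_t)total;
}
```

You could call it with a descriptor obtained from open() on a text file and compare the return value with the file size.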
Anyway ...
You will need to keep track of the current offset within your buffer - and update it on each read. So declare a file-scope variable offset.
static size_t offset;        /* number of bytes already stored in buffer */
static char buffer[1000];
Then - when your 'loop' is running - increment the offset after each read ...
while (1) {
    size_t max_len = sizeof(buffer) - offset;   /* space left in the buffer */
    if (max_len == 0) {
        puts("Buffer is full.");
        break;
    }
    ssize_t count = recv(sock, buffer + offset, max_len, 0);
    if (count == -1) {
        if (errno == EAGAIN || errno == EWOULDBLOCK) {
            usleep(20000);                      /* no data yet: wait, then retry */
            continue;
        }
        perror("Failed to read from socket");
        close(sock);
        break;                                  /* fatal error: leave the loop */
    }
    if (count == 0) {
        puts("Looks like connection has been closed.");
        break;
    }
    offset += count;                            /* next recv() writes after this data */
    if (offset >= expected_len) {
        puts("Got the expected amount of data. Wrapping up ...");
        break;
    }
}
Notes:
Using this approach, you will either need to know the expected amount of data beforehand - or use a special delimiter to mark the end of the message.
The max_len variable indicates how much space is left in your buffer - and (perhaps needless to say) you should not try to read more bytes than this.
The destination for the recv() call is buffer+offset - not buffer.
If recv() returns zero, AFAIK this indicates that the other end has performed an "orderly shutdown".
If recv() returns -1, you really need to check errno. EAGAIN is non-fatal - and just means you need to try again.
Today I have encountered some weird looking code that at first glance it's not apparent to me what it does.
send(file_desc,"Input \'y\' to continue.\t",0x18,0);
read(file_desc,buffer,100);
iVar1 = strcmp("y",(char *)buffer);
if (iVar1 == 0) {
// some more code
}
It seems that a text string is being written into the file descriptor. Immediately after that, it reads from that file descriptor into a buffer. Then it checks whether the text read into the buffer is "y".
My understanding (please correct me if I am wrong) is that it writes some data, which is a text string, into the file descriptor, and then the file descriptor acts as a temporary storage location for anything you write to it. After that, it reads that data from the file descriptor into the buffer. It is actually the same file descriptor. It seems like a primitive way of using a file descriptor to copy data from the text string into the buffer. Why not just use strcpy() instead?
What would be the use case of writing to a file descriptor and then immediately read from it? It seems like a convoluted way to copy data using file descriptors. Or maybe I don't understand this code well enough, what this sequence of a send() and a read() does?
And assuming that this code is instead using the file descriptor to copy the text string "Input \'y\' to continue.\t" into the buffer, why are they comparing it with the string "y"? It should probably be false every single time.
I am assuming that any data written into a file descriptor stays in that file descriptor until it is read from. So here it seems that send() is being used to write the string into, and read() is used to read it back out.
In man send it says:
The only difference between send() and write(2) is the presence of flags. With a zero
flags argument, send() is equivalent to write(2).
Why would they use send() instead of write()? This code is just so mind-boggling.
Edit: here's the full function where this code is originally from:
void send_read(int file_desc)
{
int are_equal;
undefined2 buffer [8];
char local_28 [32];
/* 0x6e == 110 == 'n' */
buffer[0] = 0x6e;
send(file_desc,"Input \'y\' to continue.\t",0x18,0);
read(file_desc,buffer,100);
are_equal = strcmp("y",(char *)buffer);
if (are_equal == 0) {
FUN_00400a86(file_desc,local_28);
}
else {
close(file_desc);
}
return;
}
The send() and recv() functions are for use with sockets (send: send a message on a socket — recv: receive a message from a connected socket). See also the POSIX description of Sockets in general.
Socket file descriptors are bi-directional — you can read and write on them. You can't read what you wrote, unlike with pipe file descriptors. With pipes, the process writing to the write end of a pipe can read what it wrote from the read end of the pipe — if another process didn't read it first. When a process writes on a socket, that information goes to the peer process and cannot be read by the writer.
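A minimal sketch of that point, assuming a local AF_UNIX pair created with socketpair() rather than a real network connection. The function name writer_cannot_read_back is made up for illustration, and MSG_DONTWAIT (used so the check does not block) is assumed to be available, as it is on Linux and the BSDs.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical demo: write one byte on sv[0] and check where it shows up.
 * Returns 1 if the writer's own end has nothing to read while the peer end
 * receives the byte, 0 if that is not what happened, -1 on setup failure. */
int writer_cannot_read_back(void)
{
    int sv[2];          /* sv[0] and sv[1] are the two connected ends */
    char tmp[8];
    int result = -1;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;

    if (send(sv[0], "y", 1, 0) == 1) {
        /* Non-blocking read on the writer's own end: no data is there. */
        ssize_t own = recv(sv[0], tmp, sizeof tmp, MSG_DONTWAIT);
        int own_empty = (own == -1 &&
                         (errno == EAGAIN || errno == EWOULDBLOCK));

        /* The byte is waiting at the peer's end instead. */
        ssize_t peer = recv(sv[1], tmp, sizeof tmp, 0);
        result = own_empty && peer == 1 && tmp[0] == 'y';
    }

    close(sv[0]);
    close(sv[1]);
    return result;
}
```

So in the decompiled function, the read() after the send() waits for whatever the peer answers; it does not return the prompt that was just sent.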
send(2) is a system call that can only be used with sockets. A socket is a descriptor that lets you send data to, or receive data from, a remote endpoint (a remote socket), which can be on a different computer or on the same one. It works like a phone line: what you send is received by your partner, and what your partner sends is received by you. The read(2) system call can be used with sockets, while send(2) cannot be used with files, so your sample code mixes calls related to files with calls related to sockets (that's not uncommon, as read(2) and write(2) can both be used with sockets).
The code you posted above is erroneous, as it blindly passes the received buffer to strcmp, assuming that it received a null-terminated string. That may or may not be the case.
Even if the sender (on the other side of the connection) agreed to send a full message as a nul-terminated string, the receiver must first check the amount of data received. This is the return value of the read(2) call, which can be:
-1, indicating some error on reception. The connection may have been reset by the other side, or the other side may have rebooted while you were sending the data.
0, indicating no more data or end of data (the other side closed the connection). This can happen if the other side has a timeout and you take too long to respond: it closes the connection without sending anything, and you just receive nothing.
n, some data, less than the buffer size, containing exactly the full packet sent by the peer (and the agreed nul byte it sent with it). This is the only case in which you can safely strcmp the data.
n, some data, less than the buffer size, and less than the data transmitted. This can happen when the data is fragmented into several packets. Then you have to do another read until you have all the data sent by your peer. Packet fragmentation is something natural in TCP, for example.
n, some data, less than the buffer size, and more than the data transmitted. The sender did another transmit after the one you are receiving, and both packets got into the kernel buffer. You have to handle this case, as you have one full packet and must save the rest of the received data in the buffer for later processing, or you'll lose data you have received.
n, some data, with the buffer completely filled and no space left to store the rest of the transmitted data. You have filled the buffer and no \0 char came: the packet is larger than the buffer, you have run out of buffer space, and you have to decide what to do (allocate another buffer to receive the rest, discard the data, or whatever you decide). This will not happen to you, because you expect a packet of 1 or 2 characters and you have a buffer of 100, but who knows...
As a minimal safety net, you can do this:
send(file_desc,"Input \'y\' to continue.\t",0x18,0);
int n = read(file_desc,buffer,sizeof buffer - 1); /* one cell reserved for '\0' */
switch (n) {
case -1: /* error */
    do_error();
    break;
case 0: /* disconnect */
    do_disconnect();
    break;
default: /* some data */
    buffer[n] = '\0'; /* append the null */
    break;
}
if (n > 0) {
    iVar1 = strcmp("y",(char *)buffer);
    if (iVar1 == 0) {
        // some more code
    }
}
Note:
As you didn't post a complete and verifiable example, I couldn't post a complete and verifiable response.
My apologies for that.
I have a problem when I call the read function twice in a row to retrieve all the data sent by the server.
The first read works, then read creates an infinite loop.
Code:
char buff[50] = {0};
nbytes = 1;
while (nbytes > 0)
{
nbytes = read(m_socket, buff, sizeof(buff));
}
Why does read create an infinite loop? Is the "while" not the problem?
Thank you for your answers.
socket(2) gives you a blocking file descriptor, meaning a system call like read(2) blocks until there's some data available to fulfill your request, or end of stream (for TCP) happens (return value 0), or some error happens (return value -1).
This means you never get out of your while loop until you hit an error, or the other side closes the connection.
Edit 0:
@EJP is thankfully here to correct me, as usual - read(2) blocks until any data is available (not the whole thing you requested, as I initially stated above), including an end-of-stream or an error.
I have a problem with a server (call it serverA) that receives a file from another server and sends it to the client. The problem is that the client receives 0 as the file size, and so zero bytes of the file:
/* receive file size from serverB */
recv(s,&bytes,sizeof(bytes),0);
/* send file size to client */
send(file_descriptor,&bytes,sizeof(bytes),0);
bytes = ntohs(bytes);
/* receive (from serverb) and send immediately (to client)*/
while (total != bytes) {
nread = read(s,&c,sizeof(char));
if(nread == 1){
send(file_descriptor,&c,sizeof(c),0);
total += nread;
}
}
What's wrong?
Everything could be wrong.
You must check I/O calls for errors before relying on side-effects, otherwise you will get unpredictable results.
In your case, perhaps the first recv() fails, leaving bytes at its previous value, which may be 0.
Also, the loop reading a single byte at a time is very inefficient, and still fails to check that it manages to send that byte (send() can fail in which case you need to re-try).
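As a rough sketch of what checking every call and forwarding in larger chunks could look like: the helper names send_all and relay, and the 4096-byte chunk size, are my own choices for illustration, not something from your code.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: keep calling send() until all `len` bytes are
 * written. Returns 0 on success, -1 on error (errno is set by send()). */
int send_all(int fd, const void *data, size_t len)
{
    const char *p = data;

    while (len > 0) {
        ssize_t n = send(fd, p, len, 0);
        if (n == -1) {
            if (errno == EINTR)
                continue;            /* interrupted: retry the same bytes */
            return -1;
        }
        p += n;                      /* send() may accept only part of it */
        len -= (size_t)n;
    }
    return 0;
}

/* Hypothetical relay: forward exactly `total` bytes from `from` to `to`
 * in chunks. Returns 0 on success, -1 on error or if `from` hits EOF early. */
int relay(int from, int to, size_t total)
{
    char chunk[4096];                /* assumed chunk size */

    while (total > 0) {
        size_t want = total < sizeof chunk ? total : sizeof chunk;
        ssize_t n = recv(from, chunk, want, 0);
        if (n == -1 && errno == EINTR)
            continue;
        if (n <= 0)
            return -1;               /* error, or peer closed too soon */
        if (send_all(to, chunk, (size_t)n) == -1)
            return -1;
        total -= (size_t)n;
    }
    return 0;
}
```

The file size itself should be received and checked the same way before the relay loop starts.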
I need to read from an AF_UNIX socket to a buffer using the function read from C, but I don't know the buffer size.
I think the best way is to read N bytes until the read returns 0 (no more writers in the socket). Is this correct? Is there a way to guess the size of the buffer being written on the socket?
I was thinking that a socket is a special file. Opening the file in binary mode and getting the size would help me in knowing the correct size to give to the buffer?
I'm very new to C, so please keep that in mind.
One common way is to use ioctl(..) to query FIONREAD on the socket, which will return how much data is available.
int len = 0;
ioctl(sock, FIONREAD, &len);
if (len > 0) {
len = read(sock, buffer, len);
}
One way to read an unknown amount from the socket while avoiding blocking could be to poll() a non-blocking socket for data.
E.g.
char buffer[1024];
int ptr = 0;
ssize_t rc;
struct pollfd fd = {
.fd = sock,
.events = POLLIN
};
poll(&fd, 1, 0); // Doesn't wait for data to arrive.
while ( fd.revents & POLLIN )
{
rc = read(sock, buffer + ptr, sizeof(buffer) - ptr);
if ( rc <= 0 )
break;
ptr += rc;
poll(&fd, 1, 0);
}
printf("Read %d bytes from sock.\n", ptr);
I think the best way is to read N bytes until the read returns 0 (no more writers in the socket). Is this correct?
0 means EOF: the other side has closed the connection. If the other side of the communication closes the connection, then it is correct.
If the connection isn't closed (multiple transfers over the same connection, chatty protocol), then the case is a bit more complicated, and the behavior generally depends on whether you have a SOCK_STREAM or SOCK_DGRAM socket.
Datagram sockets are already delimited for you by the OS.
Stream sockets do not delimit messages (all data is an opaque byte stream), and if delimiting is desired one has to implement it at the application level: for example, by defining a size field in the message header structure or by using a delimiter (e.g. '\n' for single-line text messages). In the first case you would first read the header, extract the length, and use that length to read the rest of the message. In the other case, read the stream into a partial buffer, search for the delimiter, and extract the message, including the delimiter, from the buffer (you might need to keep the partial buffer around, as depending on the protocol several commands can be received with a single recv()/read()).
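A minimal sketch of the size-field variant. The wire format here (a 4-byte big-endian length followed by the payload) and the names recv_exact and recv_message are assumptions for illustration; a real protocol defines its own header.

```c
#include <arpa/inet.h>   /* ntohl */
#include <errno.h>
#include <stdint.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: read exactly `len` bytes from a stream socket.
 * Returns 0 on success, -1 on error or if the peer closes early. */
static int recv_exact(int fd, void *buf, size_t len)
{
    char *p = buf;

    while (len > 0) {
        ssize_t n = recv(fd, p, len, 0);
        if (n == -1 && errno == EINTR)
            continue;
        if (n <= 0)
            return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Assumed wire format: 4-byte big-endian length, then the payload.
 * Returns the payload length, or -1 on error or if the payload
 * would not fit into `buf`. */
ssize_t recv_message(int fd, char *buf, size_t cap)
{
    uint32_t netlen;

    if (recv_exact(fd, &netlen, sizeof netlen) == -1)
        return -1;                       /* could not read the header */

    uint32_t len = ntohl(netlen);        /* convert to host byte order */
    if (len > cap)
        return -1;                       /* message larger than buffer */

    if (recv_exact(fd, buf, len) == -1)
        return -1;                       /* could not read the body */
    return (ssize_t)len;
}
```

Because each call consumes exactly one header and one body, two messages that arrive back to back in the kernel buffer are still returned one at a time.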
Is there a way to guess the size of the buffer being written on the socket?
For stream sockets, there is no reliable way, as the other side of the communication might still be in the process of writing the data. Imagine the quite normal case: the socket buffer is 32K and 128K is being written. The writing application would block inside send()/write(), with the OS waiting for the reading application to read out the data and thus free space for the next chunk of written data.
For datagram sockets, one normally knows the size of the message beforehand. Or one can try (I never did that myself) recvmsg() with MSG_PEEK and, if MSG_TRUNC is set in the returned msghdr.msg_flags, increase the buffer size and try again.
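A sketch of that peek-and-grow idea, under the assumption that the platform reports MSG_TRUNC in msg_flags for the socket type in use (Linux does for datagram sockets). The helper name recv_datagram and the starting size of 64 bytes are made up for illustration.

```c
#include <stdlib.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Hypothetical helper: receive one whole datagram into a malloc'd buffer.
 * It peeks at the datagram and doubles the buffer while MSG_TRUNC is
 * reported, then consumes the datagram. On success stores the buffer in
 * *out (the caller frees it) and returns its length; returns -1 on error. */
ssize_t recv_datagram(int fd, char **out)
{
    size_t cap = 64;                     /* assumed initial size */
    char *buf = malloc(cap);
    if (buf == NULL)
        return -1;

    for (;;) {
        struct iovec iov = { .iov_base = buf, .iov_len = cap };
        struct msghdr msg = { .msg_iov = &iov, .msg_iovlen = 1 };

        /* MSG_PEEK leaves the datagram queued so it can be read again. */
        if (recvmsg(fd, &msg, MSG_PEEK) == -1) {
            free(buf);
            return -1;
        }
        if (!(msg.msg_flags & MSG_TRUNC))
            break;                       /* the datagram fits in buf */

        cap *= 2;
        char *bigger = realloc(buf, cap);
        if (bigger == NULL) {
            free(buf);
            return -1;
        }
        buf = bigger;
    }

    ssize_t n = recv(fd, buf, cap, 0);   /* now actually consume it */
    if (n == -1) {
        free(buf);
        return -1;
    }
    *out = buf;
    return n;
}
```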
You are correct: if you don't know the size of the input, you can just read one byte each time and append it to a larger buffer.
read N bytes until the read returns 0
Yes!
One added detail. If the sender doesn't close the connection, the socket will just block, instead of returning. A nonblocking socket will return -1 (with errno == EAGAIN) when there's nothing to read; that's another case.
Opening the file in binary mode and getting the size would help me in knowing the correct size to give to the buffer?
Nope. Sockets don't have a size. Suppose you sent two messages over the same connection: How long is the file?
I'm writing myself a small server daemon in C, and the basic parts like processing connects, disconnects and receives are already in, but a problem in receiving still persists.
I use "recv" to read 256 bytes at once into a char array, and because it can contain multiple lines of data as one big chunk, I need to be able to split each line separately to process it.
That alone wouldn't be the problem, but because of the possibility that a line could be cut off because it didn't fit into the buffer anymore, I also need to be able to see if a line has been cut off. Not that bad, too, just check the last char for \r or \n, but what if the line was cut off? My code does not allow for easy "just keep reading more data" because I'm using select() to handle multiple requests.
Basically, this is my situation:
//This is the chunk of code ran after select(), when a socket
//has readable data
char buf[256] = { 0 };
int nbytes;
if ((nbytes = recv(i, buf, sizeof(buf) - 1, 0)) <= 0)
{
if (nbytes == 0)
{
struct remote_address addr;
get_peername(i, &addr);
do_log("[Socket #%d] %s:%d disconnected", i, addr.ip, addr.port);
}
else
do_log("recv(): %s", strerror(errno));
close(i);
FD_CLR(i, &clients);
}
else
{
buf[sizeof(buf) - 1] = 0;
struct remote_address addr;
get_peername(i, &addr);
do_log("[Socket #%d] %s:%d (%d bytes): %s", i, addr.ip, addr.port, nbytes, buf);
// split "buf" here, and process each line
// but how to be able to get the rest of a possibly cut off line
// in case it did not fit into the 256 byte buffer?
}
I was thinking about having a higher scoped temporary buffer variable (possibly malloc()'d) to save the current buffer in, if it was too long to fit in at once, but I always feel bad about introducing unnecessarily high scoped variables if there's a better solution :/
I appreciate any pointers (except for the XKCD ones :))!
I guess you need to add another per-stream buffer that holds the incomplete line until the line feed that comes after is received.
I'd use some kind of dynamically expanding buffer like GString to accumulate data.
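Here is a rough sketch of such a per-stream buffer. To keep it short it uses a fixed-size array instead of a growing buffer like GString, and the names line_buf, line_buf_append, line_buf_next and the LINE_CAP limit are all made up for illustration. You would keep one struct line_buf per client socket, append whatever recv() returned after select() fires, and then pull out every complete line.

```c
#include <string.h>

#define LINE_CAP 1024   /* assumed upper bound on buffered bytes */

/* Hypothetical per-connection state: bytes received but not yet processed. */
struct line_buf {
    char   data[LINE_CAP];
    size_t used;
};

/* Append freshly received bytes. Returns 0, or -1 if they would not fit. */
int line_buf_append(struct line_buf *lb, const char *src, size_t n)
{
    if (n > sizeof lb->data - lb->used)
        return -1;
    memcpy(lb->data + lb->used, src, n);
    lb->used += n;
    return 0;
}

/* Copy the next complete line (without "\n" or "\r\n") into `out` as a
 * C string; `out_cap` must be at least 1. Returns 1 if a line was
 * extracted, 0 if only a partial line is buffered so far. */
int line_buf_next(struct line_buf *lb, char *out, size_t out_cap)
{
    char *nl = memchr(lb->data, '\n', lb->used);
    if (nl == NULL)
        return 0;                          /* cut-off line: wait for more */

    size_t len = (size_t)(nl - lb->data);  /* line length without '\n' */
    size_t consumed = len + 1;             /* bytes to drop, incl. '\n' */

    if (len > 0 && lb->data[len - 1] == '\r')
        len--;                             /* strip the '\r' of "\r\n" */
    if (len >= out_cap)
        len = out_cap - 1;                 /* truncate overly long lines */

    memcpy(out, lb->data, len);
    out[len] = '\0';

    /* Shift the unprocessed remainder to the front of the buffer. */
    memmove(lb->data, lb->data + consumed, lb->used - consumed);
    lb->used -= consumed;
    return 1;
}
```

After each recv() you would call line_buf_next() in a loop until it returns 0; whatever is left stays in the buffer until the next select() wakeup.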
The other thing that might help would be putting the socket into nonblocking mode using fcntl(). Then you can recv() in a loop until you get a -1. Check errno, it will be either EAGAIN or EWOULDBLOCK (and those aren't required to have the same value: check for both).
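A minimal sketch of the nonblocking approach just described; the helper names set_nonblocking and drain are made up for illustration.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Put `fd` into non-blocking mode. Returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Hypothetical helper: recv() in a loop until nothing more is available
 * right now, the buffer is full, or the peer has closed. Returns the
 * number of bytes stored (possibly 0), or -1 on a real error. Sets
 * *closed to 1 if recv() reported end of stream. */
ssize_t drain(int fd, char *buf, size_t cap, int *closed)
{
    size_t total = 0;
    *closed = 0;

    while (total < cap) {
        ssize_t n = recv(fd, buf + total, cap - total, 0);
        if (n > 0) {
            total += (size_t)n;
            continue;
        }
        if (n == 0) {
            *closed = 1;                 /* orderly shutdown by the peer */
            break;
        }
        if (errno == EINTR)
            continue;                    /* interrupted: just retry */
        if (errno == EAGAIN || errno == EWOULDBLOCK)
            break;                       /* nothing more for now */
        return -1;                       /* real error */
    }
    return (ssize_t)total;
}
```

With select(), you would call drain() once per readable socket and go back to select() afterwards, rather than waiting inside the loop.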
Final remark: I found that using libev (google it; I can't post multiple links) was more fun than using select().