Code Connected Volume 1, page 47, has an example of how to receive a multipart message:
while (1) {
    zmq_msg_t message;
    zmq_msg_init(&message);
    zmq_msg_recv(&message, socket, 0);
    // Process the message frame
    ...
    zmq_msg_close(&message);
    if (!zmq_msg_more(&message))
        break;
}
Is this correct? Shouldn't we use zmq_msg_close() after zmq_msg_more()?
The API references for zmq_msg_more() and zmq_msg_recv() (ZeroMQ 3.2.2 Stable) both contain examples in which zmq_msg_close() is called after zmq_msg_more(). As far as I know the API docs do not specifically state anything to contradict this, so the example from Code Connected seems wrong. The documentation for zmq_msg_close() states that the actual memory release may be postponed by underlying layers, implying that the zmq_msg_more() call may succeed, but it still looks wrong to call it after closing the message.
Example from zmq_msg_more() API documentation (3.2.2) (edited slightly for readability):
zmq_msg_t part;
while (true)
{
    // Create an empty ØMQ message to hold the message part
    int rc = zmq_msg_init (&part);
    assert (rc == 0);
    // Block until a message is available to be received from socket
    rc = zmq_recvmsg (socket, &part, 0);
    assert (rc != -1);
    if (zmq_msg_more (&part))
        fprintf (stderr, "more\n");
    else
    {
        fprintf (stderr, "end\n");
        break;
    }
    zmq_msg_close (&part);
}
However, looking at the ZeroMQ Guide's section on multipart messages, that example does check for more frames after closing the message, but it does so by querying the socket with zmq_getsockopt(), without referencing the message at all. I suspect the Code Connected example simply took that example and changed zmq_getsockopt() to zmq_msg_more() (probably incorrectly so).
Example from the ZeroMQ Guide (multipart messages):
while (1)
{
    zmq_msg_t message;
    zmq_msg_init (&message);
    zmq_msg_recv (&message, socket, 0);
    // Process the message frame
    zmq_msg_close (&message);
    int more;
    size_t more_size = sizeof (more);
    zmq_getsockopt (socket, ZMQ_RCVMORE, &more, &more_size);
    if (!more)
        break;  // Last message frame
}
Maybe this answer will help someone.
from "libzmq/doc/zmq_msg_more.txt"
The zmq_msg_more() function indicates whether this is part of a multi-part
message, and there are further parts to receive. This method can safely be
called after zmq_msg_close(). This method is identical to zmq_msg_get()
with an argument of ZMQ_MORE.
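Putting it together, here is a minimal sketch of the receive loop (assuming a connected socket and ZeroMQ 3.x) that reads the more flag before closing the frame; per the documentation quoted above, checking it after zmq_msg_close() would be equally valid:

while (1) {
    zmq_msg_t frame;
    if (zmq_msg_init (&frame) != 0)
        break;                              // out of memory, give up
    if (zmq_msg_recv (&frame, socket, 0) == -1) {
        zmq_msg_close (&frame);             // recv failed (e.g. context terminated)
        break;
    }
    // ... process the frame ...
    int more = zmq_msg_more (&frame);       // read the flag while the frame is still open
    zmq_msg_close (&frame);
    if (!more)
        break;                              // that was the last frame of the multipart message
}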
Related
bool done;
done = false;
while (!done) {
    /* read the message */
    bzero(msg, 100);
    printf("[client]Type something: ");
    fflush(stdout);
    read(0, msg, 100);
    if (strcmp(msg, "/done") == 0) {
        done = true;
        /* sending the message to the server */
        if (write(sd, msg, 100) <= 0) {
            perror("[client]Error sending the message to the server.\n");
            return errno;
        }
    } else {
        /* sending the message to the server */
        if (write(sd, msg, 100) <= 0) {
            perror("[client]Error sending the message to the server.\n");
            return errno;
        }
        /* reading the answer given by the server */
        if (read(sd, msg, 100) < 0) {
            perror("[client]read() error from server.\n");
            return errno;
        }
        /* printing the received message */
        printf("[client]The received message is: %s\n", msg);
    }
}
Here's the code that I have a problem with. I want to send messages to the server until I send the message "/done". The code works and I can send messages continuously, but even when I type and send "/done" the process doesn't end.
I think there's a problem with the bzero function that "clears" msg, or maybe I just don't understand it well enough.
I also tried writing my own function to check whether two strings are the same, but that had no effect either.
So how should I write the condition, or "clear" msg, so that I can keep sending messages and the execution ends after I send "/done"?
P.S. msg is declared earlier in the code as char msg[100];
When you read from 0, you're reading from stdin. If that is a terminal that you are typing into (you don't say), you likely have it set up in normal (canonical) mode, so you'll read a line, which probably includes a newline (\n) character. So when you enter /done, the string you get in your msg buffer is "/done\n" which doesn't match "/done"...
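One way to handle that, sketched below under the assumption that msg is the char msg[100] buffer from the question (read_command() is a made-up helper name, not part of the original program), is to terminate the buffer with the count returned by read() and cut it at the first newline before comparing:

#include <string.h>    /* strcspn, strcmp */
#include <unistd.h>    /* read */

/* Sketch: read one line from stdin into msg (at most cap-1 bytes), strip the
   trailing '\n' that a canonical-mode terminal delivers, and report whether
   the user typed "/done". */
static int read_command(char *msg, size_t cap)
{
    ssize_t n = read(0, msg, cap - 1);
    if (n <= 0)
        return -1;                       /* EOF or error */
    msg[n] = '\0';                       /* make it a proper C string */
    msg[strcspn(msg, "\n")] = '\0';      /* cut at the first newline, if any */
    return strcmp(msg, "/done") == 0;    /* 1 when the user wants to stop */
}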
read(2) includes the '\n' at the end of the string; with the low-level read() you get everything. When trying to debug strings, it can be helpful to put quotes in your print statement, like
printf("[client]The received message is: '%s'\n", msg);
as this immediately shows invisible whitespace.
TCP is not a message protocol. It does not glue bytes together into messages. If you want to use TCP to send and receive messages, you'll need to implement functions that send and receive messages.
I must strongly caution you not to fall into the trap of thinking that because your code happened not to work before a change and happens to work after it, you've fixed it. Your code still fails if read() returns 1. You need to implement a sensible function to receive a message, or your code will only be working by luck, and I can tell you from painful experience that one day your luck will run out.
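To make that concrete, here is a sketch of such a function for the fixed 100-byte messages the client above writes (read_full() is a made-up name; it loops so that a short read() is never mistaken for a complete message):

#include <unistd.h>
#include <errno.h>

/* Sketch: receive exactly len bytes, assuming the peer always writes whole
   len-byte messages (as the client does with its 100-byte writes).
   Returns 0 on success, -1 on error or if the peer closed the connection. */
static int read_full(int fd, void *buf, size_t len)
{
    char *p = buf;
    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n == 0)
            return -1;              /* peer closed the connection */
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal, retry */
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}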
Say that I've implemented an epoll-based TCP server where each thread is running something very similar to the code below (taken from the epoll man page, where kdpfd is the epoll file descriptor and listener is a socket that is listening on a port):
struct epoll_event ev, *events;

for (;;) {
    nfds = epoll_wait(kdpfd, events, maxevents, -1);

    for (n = 0; n < nfds; ++n) {
        if (events[n].data.fd == listener) {
            client = accept(listener, (struct sockaddr *) &local,
                            &addrlen);
            if (client < 0) {
                perror("accept");
                continue;
            }
            setnonblocking(client);
            ev.events = EPOLLIN | EPOLLET;
            ev.data.fd = client;
            if (epoll_ctl(kdpfd, EPOLL_CTL_ADD, client, &ev) < 0) {
                fprintf(stderr, "epoll set insertion error: fd=%d\n",
                        client);
                return -1;
            }
        }
        else
            do_use_fd(events[n].data.fd);
    }
}
For the do_use_fd(events[n].data.fd) above, say we want to write everything we receive to stdout:
int do_use_fd(int fd) {
    int err;
    char buf[512];
    while ((err = read(fd, buf, 512)) > 0) {
        write(1, buf, err);
    }
    if (err == -1 && errno != EAGAIN && errno != EWOULDBLOCK) {
        // do some error handling
        return -1;
    }
    return 0;
}
Now, say I have 10k+ connections, all of whom send me a lot of messages over a prolonged period of time. Assume that my clients send me the message hello, my name is {client's name} every few seconds. Assume that (somehow) this message is large enough that it has to be transferred as multiple packets.
As such, read(fd, buf, 512) may occasionally return -1 with an errno indicating it would block, and I think the above solution could end up with something like the following output:
hello, my nam
hello, my name is Pau
e is John Le
hello, my name is Geo
nnon
l McCartney
rge
hello, my name is Ringo
Starr
Harrison
because as soon as a read would block on one connection, a read on a different connection can be serviced. Instead, I'd like the following to be printed:
hello, my name is John Lennon
hello, my name is Paul McCartney
hello, my name is George Harrison
hello, my name is Ringo Starr
Is there a recommended way of dealing with this issue? One option would be to keep a buffer per connection, and check if the message is completed and only print once this happens. But with 10k+ connections, would this be a good idea? On one hand, something tells me this solution does not scale well. On the other hand, if the messages are only 500 bytes, with 10k connections, this solution is only going to take up 5MB.
Thanks in advance.
I think using a buffer per connection would be OK in your case. It may, however, be more elegant to create a buffer per incomplete message. That means you have to know when a message is done, so you would need a small protocol, such as a length field or a terminator (and possibly a timeout to kill incomplete messages after a certain time). This also guarantees that no unused memory stays allocated, as a buffer can be released as soon as its message is complete and passed up. You could, for example, access these buffers through a hashmap keyed by the connection's 5-tuple. If you decide to use a message-bound identifier, which of course incurs extra overhead, you could even demultiplex messages from a single TCP connection used to transmit multiple messages at a time.
If you need to enforce ordering among these messages you will have to detail your situation, because ordering is a tough problem in many situations.
Edit: Sorry, I have a lot to do at the moment, so I could not answer any sooner. You are correct that the connection-based approach is easier. The message-based approach becomes more advantageous the more sparsely the connections are used. If you can expect all connections to be receiving messages at all times, it is just overhead; if connections are sometimes idle for a while, it may reduce memory usage considerably. Also note that your application's memory usage then no longer scales with the number of clients but with the number of in-flight messages, which is usually nice, because message rates typically vary. You are also correct about ordering on a TCP stream: as long as you send only one complete message at a time over the connection, TCP ensures ordering. Some applications, e.g. HTTP/2, reuse the same TCP connection to send multiple messages at the same time. In that case TCP will not help, because message fragments can be interleaved and you need to demultiplex them (e.g. via stream IDs in HTTP/2).
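For illustration, here is a rough sketch of the buffer-per-connection variant applied to the do_use_fd() above, assuming a '\n' terminator marks the end of each message (conn_state and MSG_BUF_SIZE are made-up names, and do_use_fd()'s signature is changed to carry the per-connection state, which you would keep keyed by file descriptor or 5-tuple):

#include <string.h>
#include <unistd.h>
#include <errno.h>

#define MSG_BUF_SIZE 512

/* Per-connection accumulation buffer (a sketch, not the questioner's code). */
struct conn_state {
    char   buf[MSG_BUF_SIZE];
    size_t len;
};

static void do_use_fd(int fd, struct conn_state *st)
{
    for (;;) {
        ssize_t n = read(fd, st->buf + st->len, sizeof(st->buf) - st->len);
        if (n <= 0) {
            if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
                return;                       /* socket drained for now */
            /* error or EOF: close fd, free the state, etc. */
            return;
        }
        st->len += (size_t)n;

        /* Emit every complete '\n'-terminated message currently in the buffer. */
        char *nl;
        while ((nl = memchr(st->buf, '\n', st->len)) != NULL) {
            size_t msg_len = (size_t)(nl - st->buf) + 1;
            write(1, st->buf, msg_len);       /* one whole message, never interleaved */
            memmove(st->buf, st->buf + msg_len, st->len - msg_len);
            st->len -= msg_len;
        }
        if (st->len == sizeof(st->buf))
            st->len = 0;                      /* oversized message: drop it (or grow the buffer) */
    }
}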
I have a socket that is receiving streaming stock tick data. However, I seem to get a lot of truncated messages, or what appear to be truncated messages. Here is how I am receiving data:
if ((numbytes = recv(sockfd, buf, MAXDATASIZE-1, 0)) == -1) {
    perror("recv()");
    exit(1);
}
else {
    buf[numbytes] = '\0';
    // Process data
}
Can recv() receive just a partial message of what was sent?
My feeling is that I need another loop around the recv() call that keeps receiving until a complete message has arrived. I know that a libcurl implementation I have (I don't think I can use libcurl here) has an outer loop:
// Read the response (sum total bytes read in tot_bytes)
for (tot_bytes = 0; ; tot_bytes += iolen)
{
    wait_on_socket(sockfd, 1, 60000L);
    res = curl_easy_recv(curl, buf + tot_bytes, sizeof_buf - tot_bytes, &iolen);
    if (CURLE_OK != res) {
        // printf( "## %d", res );
        break;
    }
}
Do I need an recv() loop similar to the libcurl example (that verifiably works)?
You can also pass a flag to recv() that makes it wait until the full amount of data has arrived. This works when you know the number of bytes to expect. You can call it like this:
numbytes = recv(sockfd, buf, MAXDATASIZE-1, MSG_WAITALL);
You're right, you need a loop. recv only retrieves the data that's currently available; once any data has been read, it doesn't wait for more to appear before it returns.
The manual page says "The receive calls normally return any data available, up to the requested amount, rather than waiting for receipt of the full amount requested."
TCP does not respect message boundaries. That means that recv() is not guaranteed to get the entire message, exactly as you hypothesize. And that is indeed why you need a loop around your recv(). (That's also why upper-layer protocols like HTTP either close the socket, or prepend a length indicator, so the recipient knows exactly when to stop reading from the socket.)
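As a sketch of the length-indicator approach mentioned above (recv_all() and recv_one_tick() are made-up names, and the 4-byte big-endian length prefix is only an assumption; check what framing your tick feed actually uses):

#include <sys/socket.h>
#include <arpa/inet.h>   /* ntohl */
#include <stdint.h>
#include <errno.h>

/* Receive exactly len bytes, looping over short reads.
   Returns 0 on success, -1 on error or if the peer closed mid-message. */
static int recv_all(int fd, void *buf, size_t len)
{
    char *p = buf;
    while (len > 0) {
        ssize_t n = recv(fd, p, len, 0);
        if (n == 0)
            return -1;                  /* connection closed */
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Example use, assuming each tick is prefixed with a 4-byte big-endian length.
   Returns the message length, or -1 on error. */
static int recv_one_tick(int sockfd, char *buf, size_t bufsize)
{
    uint32_t netlen;
    if (recv_all(sockfd, &netlen, sizeof netlen) != 0)
        return -1;
    uint32_t msglen = ntohl(netlen);
    if (msglen >= bufsize || recv_all(sockfd, buf, msglen) != 0)
        return -1;
    buf[msglen] = '\0';
    return (int)msglen;
}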
can recv() receive just a partial message of what was sent?
Yes, indeed, if you use TCP. I think this can help you.
Handling partial return from recv() TCP in C
I am building an HTTP proxy in C.
The proxy is supposed to filter some keywords in the URL and in the HTML content.
The first problem I have is with the send() function. When I load a page for the first time all is fine and dandy, and if I let the page finish loading, the next request is also fine. But if I open www.google.com and start to type, the "instant" feature makes a new request before the last one is complete and I get the following error:
Program received signal SIGPIPE, Broken pipe.
0x00007ffff7b2efc2 in send () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) up
#1 0x0000000000401f1a in main () at net-ninny2.c:232
232 bytes_sent += send(i, buffer+bytes_sent, buffer_size-bytes_sent, 0);
The code-block that generates the error looks like this:
while (bytes_sent < buffer_size) {
    bytes_sent += send(i, buffer+bytes_sent, buffer_size-bytes_sent, 0);
    printf("* Bytes sent to Client: %d/%d\n", bytes_sent, buffer_size);
}
If you think it's relevant, I'll be happy to provide more code.
My second problem is related to HTTP headers. Since I want to filter keywords in the HTML content, I don't want the content to be encoded. Google doesn't seem to agree with that, and no matter what I put in the Accept-Encoding header, I always get the content back encoded as gzip. Any ideas on how to get rid of that?
EDIT:
I am also trying to use fork() to create child processes for the new connections, but that just throws a nasty error:
select: Interrupted system call
I have put it where I create a new file descriptor from an incoming connection:
if (i == listener) {
    // New connection
    remote_addr_len = sizeof remote_addr;
    newfd = accept(listener, (struct sockaddr *)&remote_addr, &remote_addr_len);
    if (newfd == -1) {
        perror("accept");
    }
    else {
        FD_SET(newfd, &master);    // Add new connection to master set
        if (newfd > fdmax) {
            fdmax = newfd;
        }
        printf("* New connection from %s on "
               "socket %d\n",
               inet_ntop(remote_addr.ss_family,
                         get_in_addr((struct sockaddr*)&remote_addr),
                         remoteIP, INET6_ADDRSTRLEN), newfd);
        if (!fork()) {
            fprintf(stderr, "!fork()\n");
            close(newfd);
            exit(5);
        }
    }
}
But I'm guessing I am doing it all wrong.
Cheers!
For your first question, you will want to ignore the SIGPIPE signal:
signal(SIGPIPE, SIG_IGN);
See How to prevent SIGPIPEs (or handle them properly) for more detail. If you ignore the signal and the socket connection is reset, you will also want to handle the -1 error return value from send() appropriately.
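To make that concrete, here is a sketch of what the sending loop might look like once the -1 case is handled (send_all() is a made-up name; MSG_NOSIGNAL is a Linux-specific way to suppress SIGPIPE per call, and with signal(SIGPIPE, SIG_IGN) already in place a flags value of 0 works too):

#include <sys/socket.h>
#include <sys/types.h>
#include <stdio.h>
#include <errno.h>

/* Send the whole buffer, checking send()'s return value instead of adding
   -1 to the byte count. Returns 0 on success, -1 if the client went away. */
static int send_all(int fd, const char *buffer, size_t buffer_size)
{
    size_t bytes_sent = 0;
    while (bytes_sent < buffer_size) {
        ssize_t n = send(fd, buffer + bytes_sent, buffer_size - bytes_sent,
                         MSG_NOSIGNAL);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;                 /* EPIPE/ECONNRESET: client closed the connection */
        }
        bytes_sent += (size_t)n;
        printf("* Bytes sent to Client: %zu/%zu\n", bytes_sent, buffer_size);
    }
    return 0;
}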
For your second question, you may not be able to force Google to send data uncompressed, since Google may assume that all browsers can handle compressed data. You will probably need to embed a gzip decompressor in your proxy. It's certainly not fair to increase the bandwidth requirements of both ends just because you want to filter some keywords.
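If you do end up decompressing the body yourself, one possible sketch uses zlib (assuming linking with -lz is acceptable in your project); a windowBits value of 16 + MAX_WBITS tells inflate() to expect a gzip header. A real proxy would decompress incrementally rather than in one buffer, but this shows the idea:

#include <zlib.h>
#include <string.h>

/* Sketch: decompress a complete gzip-encoded body held in memory.
   Returns the number of bytes written to out, or -1 on error. */
static int gunzip_buf(const unsigned char *in, size_t in_len,
                      unsigned char *out, size_t out_cap)
{
    z_stream zs;
    memset(&zs, 0, sizeof zs);
    if (inflateInit2(&zs, 16 + MAX_WBITS) != Z_OK)   /* 16 + MAX_WBITS: gzip format */
        return -1;

    zs.next_in   = (Bytef *)in;
    zs.avail_in  = (uInt)in_len;
    zs.next_out  = out;
    zs.avail_out = (uInt)out_cap;

    int rc = inflate(&zs, Z_FINISH);
    inflateEnd(&zs);
    return (rc == Z_STREAM_END) ? (int)zs.total_out : -1;
}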
I've made a piece of code that runs on my server as multiple threads.
The problem is that it doesn't send data while I'm receiving on the other socket.
So if I send something from client 1 to client 2, client 2 only receives it when it sends something itself (and jumps out of the recv function). How can I solve this?
/* Thread */
while (!stop_received) {
    nr_bytes_recv = recv(s, buffer, BUFFSIZE, 0);
    if (strncmp(buffer, "SEND", 4) == 0) {
        char *message = "Text asads \n";
        rv = send(users[0].s, message, strlen(message), 0);
        rv = send(users[1].s, message, strlen(message), 0);
        if (rv < 0) {
            perror("Error sending");
            exit(EXIT_FAILURE);
        }
    } else {
        char *message = "Unknown command \n";
        rv = send(s, message, strlen(message), 0);
        if (rv < 0) {
            perror("Error sending");
            exit(EXIT_FAILURE);
        }
    }
}
To be a little more specific, there are a few types of I/O. What you're doing currently is called blocking I/O. In general that means that when you call send() or recv(), the operation will "block" until it has completed.
In contrast, there is what is known as non-blocking I/O. In this I/O model an operation returns immediately if it's unable to complete. Typically the select() function is used with this model.
You can see an example program here at the Select Tutorial. The full source code is at the bottom of the page.
As others have noted, your other option is to use threads.
Your code will block on the recv() call. Either write a multi-threaded application, or investigate the use of the select() function.
Put send and receive in separate threads.
I notice that you are using perror() (the POSIX error function), which leads me to believe you are using a POSIX operating system, which makes me suspect it's GNU/Linux.
select() is portable, poll() is POSIX-centric and epoll() is Linux-centric. If using GNU/Linux, I strongly suggest avoiding select() and using:
poll() if you are polling only a few dozen file descriptors
epoll() if you need to scale to thousands of connections, and it's available.
If your application need not be portable, and no requirement prohibits using extensions, use poll() or epoll(). Once you learn how select() works, you'll be very happy to get rid of it, especially for something that has to scale to serve many clients.
If portability is a requirement, see if either poll() or epoll() exists during your build configuration and use either in favor of select().
Note, epoll() did not appear until Linux 2.5(something), so it's best to get used to using both.
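Since no code accompanies this recommendation, here is a minimal, hedged sketch of a single-threaded poll() loop (serve(), handle_client() and MAX_CLIENTS are made-up names for illustration; the listener socket is assumed to be set up already):

#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>
#include <errno.h>

#define MAX_CLIENTS 64          /* placeholder limit for this sketch */

/* Placeholder for your per-connection logic; returns 0 to keep the
   connection, -1 to drop it. Here it just echoes one chunk back. */
static int handle_client(int fd)
{
    char buf[512];
    ssize_t n = recv(fd, buf, sizeof buf, 0);
    if (n <= 0)
        return -1;
    send(fd, buf, (size_t)n, 0);
    return 0;
}

/* One poll() loop serving the listener plus all accepted clients. */
static void serve(int listener)
{
    struct pollfd fds[MAX_CLIENTS + 1];
    int nfds = 1;

    fds[0].fd = listener;
    fds[0].events = POLLIN;

    for (;;) {
        if (poll(fds, nfds, -1) < 0) {
            if (errno == EINTR)
                continue;                        /* interrupted by a signal */
            break;
        }
        if (fds[0].revents & POLLIN) {           /* new connection */
            int c = accept(listener, NULL, NULL);
            if (c >= 0 && nfds <= MAX_CLIENTS) {
                fds[nfds].fd = c;
                fds[nfds].events = POLLIN;
                nfds++;
            } else if (c >= 0) {
                close(c);                        /* no room left */
            }
        }
        for (int i = 1; i < nfds; i++) {
            if (fds[i].revents & (POLLIN | POLLHUP)) {
                if (handle_client(fds[i].fd) < 0) {
                    close(fds[i].fd);
                    fds[i] = fds[nfds - 1];      /* compact the array */
                    nfds--;
                    i--;
                }
            }
        }
    }
}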
You should separate the code into two threads, one transmitter and one receiver.
Something like this:
/* 1st Thread */
while (!stop_received) {
    nr_bytes_recv = recv(s, buffer, BUFFSIZE, 0);
}

/* 2nd Thread */
while (!stop_received) {
    if (strncmp(buffer, "SEND", 4) == 0) {
        char *message = "Text asads \n";
        rv = send(users[0].s, message, strlen(message), 0);
        rv = send(users[1].s, message, strlen(message), 0);
        if (rv < 0) {
            perror("Error sending");
            exit(EXIT_FAILURE);
        }
    } else {
        char *message = "Unknown command \n";
        rv = send(s, message, strlen(message), 0);
        if (rv < 0) {
            perror("Error sending");
            exit(EXIT_FAILURE);
        }
    }
}
The concurrency will bring some issues, like access to the buffer variable.
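To illustrate one way of dealing with that shared-buffer issue, here is a hedged sketch using a pthread mutex and condition variable to hand a received command from the receiver thread to the sender thread (all names such as hand_over_command() and cmd_pending are placeholders, not taken from the code above):

#include <pthread.h>
#include <string.h>

#define BUFFSIZE 512   /* placeholder; reuse the BUFFSIZE from the rest of the code */

/* Shared hand-off slot protected by a mutex and condition variable. */
static pthread_mutex_t cmd_lock  = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cmd_ready = PTHREAD_COND_INITIALIZER;
static char            cmd[BUFFSIZE];
static size_t          cmd_len;
static int             cmd_pending;

/* Called by the receiver thread right after a successful recv(). */
static void hand_over_command(const char *buffer, size_t n)
{
    pthread_mutex_lock(&cmd_lock);
    cmd_len = n < sizeof cmd ? n : sizeof cmd;
    memcpy(cmd, buffer, cmd_len);
    cmd_pending = 1;
    pthread_cond_signal(&cmd_ready);
    pthread_mutex_unlock(&cmd_lock);
}

/* Called by the sender thread; blocks until a command is available. */
static size_t wait_for_command(char *out, size_t cap)
{
    pthread_mutex_lock(&cmd_lock);
    while (!cmd_pending)
        pthread_cond_wait(&cmd_ready, &cmd_lock);
    size_t n = cmd_len < cap ? cmd_len : cap;
    memcpy(out, cmd, n);
    cmd_pending = 0;
    pthread_mutex_unlock(&cmd_lock);
    return n;
}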
There are two ways of achieving the goal you want:
1.) Implement the sending and receiving code in different threads. There will be some issues, though: a growing number of clients makes the code harder to manage, and there will also be concurrency problems (as mentioned by pcent).
You could go for non-blocking sockets instead, but I suggest you don't do that on its own, as I assume you don't want a CPU hog.
2.) The other way is to use the select() function, which lets you monitor multiple sockets of different types at the same time. For more on select(), you can google it. :)