I have made a multi-client server that uses select() to determine which clients are currently sending. However, I want to send data that is larger than my buffer size (e.g. text from a file) while remaining non-blocking.
So far I have found solutions that put the send/recv calls inside while loops, with the loop condition based on the number of bytes sent, but wouldn't that block the server for a certain amount of time, especially if the contents of the file are large?
I was thinking of sending, say, 1024 bytes in one iteration of my server's main loop, then the next 1024 bytes on the following iteration, and so on. However, this would have consequences on the client side. Perhaps the client could request the next x bytes from the server on each query?
Please let me know if there is a standard way to go about this. Thanks.
You don't need to do anything special for this. Your sockets are presumably already configured as non-blocking, so when you write to them, pass as much data as you have, and check the return value to see how much was actually sent. Then keep the rest of the data in a buffer, and wait until the file descriptor is ready again before attempting to write more.
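For illustration, a minimal sketch of that idea (names and sizes are my own, not from the question or answer): queue the unsent remainder per client and retry when select() reports the socket writable.

```c
/* A minimal sketch: queue the unsent remainder per client and retry when
 * select() reports the socket writable. Names and sizes are illustrative. */
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>

struct out_buf {
    char   data[8192];   /* pending bytes for this client   */
    size_t len;          /* how many bytes are queued       */
    size_t off;          /* how many have already been sent */
};

/* Try to flush whatever is queued; returns 0 on progress/"try later", -1 on error. */
int flush_out(int fd, struct out_buf *b)
{
    while (b->off < b->len) {
        ssize_t n = send(fd, b->data + b->off, b->len - b->off, 0);
        if (n > 0) {
            b->off += (size_t)n;
        } else if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
            return 0;    /* kernel buffer full: wait for select() to say writable */
        } else {
            return -1;   /* real error */
        }
    }
    b->len = b->off = 0; /* everything flushed */
    return 0;
}
```

While len is non-zero, the client's descriptor would also go into the write fd_set passed to select(), and flush_out() would be called again once the socket comes back writable.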
Related
This refers to C sockets. Say I have written some data into a socket, and then I call read. Will the read system call read up to the buffer size (say 4096 bytes) and discard everything in the socket? In other words, does read just move forward past the bytes it has consumed, or does it read and then discard all the information in the socket, so that the next read starts again from index 0?
Or say I write into the socket again without read being called anywhere else: will the new data replace the old data, or be appended to it?
If there is more data available on a socket than the amount that you read(), the extra data will be kept in the socket's buffer until you read it. No data is lost during a short read.
Writing works similarly. If you call write() multiple times, each write will append data to the buffer on the remote host. Again, no data is lost.
(Eventually, the buffer on the remote host will fill up. When this happens, write() will block -- the local host will wait for the buffer to empty before sending more data.)
Conceptually, each direction in a socket pair behaves like a pipe between the two peers. The overall stream of data sent will be received in the same order it was sent, regardless of how much data was read/written at a time.
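To make the pipe analogy concrete, here is a tiny self-contained demonstration (my own example, using a local socketpair() rather than TCP, though the stream semantics are the same): two writes on one end can come back as a single read on the other.

```c
/* Tiny demonstration: byte order is preserved across the pair, but write
 * boundaries are not. Uses socketpair() purely for convenience. */
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return 1;

    write(sv[0], "hello ", 6);                     /* two separate writes...        */
    write(sv[0], "world", 5);

    char buf[64];
    ssize_t n = read(sv[1], buf, sizeof buf - 1);  /* ...may come back in one read  */
    buf[n > 0 ? n : 0] = '\0';
    printf("read %zd bytes: \"%s\"\n", n, buf);

    close(sv[0]);
    close(sv[1]);
    return 0;
}
```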
I'm currently writing a proxy application that reads from one socket and writes to another. Both are set as non-blocking, allowing multiple socket pairs to be handled.
To keep a proper flow between the sockets, the application should NOT read from the source socket if writing to the target socket may block.
The idea is nice; however, I have found no way to detect whether the target socket would block without first writing to it... and that is not what is needed.
I know of an option to use SIOCOUTQ (via ioctl()) and calculate the remaining buffer space, but this seems ugly compared to a simple check of whether the target socket is ready for writing.
I guess I could also use select() for just this one socket, but it seems wasteful to make such a heavy system call for a single check.
select or poll should be able to give you the information.
I assume you're already using one of them to detect which of your reading sockets has data.
When you have a reading socket available for read, replace it with the corresponding writing socket (but put it in the write fds of course), and call select again. Then, if the writing socket is available, you can read and write.
Note that it's possible that the writing socket is ready to get data, but not as much as you want. So you might manage to read 100 bytes, and write only 50.
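For concreteness, here is a simplified sketch of the idea. It is my own variant that checks both sockets in one select() call rather than swapping them in and out as described above; src and dst are assumed to be connected, non-blocking sockets.

```c
/* Simplified variant: only move data once the source is readable AND the
 * destination is writable in the same select() round. */
#include <sys/select.h>
#include <unistd.h>

void forward_once(int src, int dst)
{
    fd_set rfds, wfds;
    FD_ZERO(&rfds);
    FD_ZERO(&wfds);
    FD_SET(src, &rfds);                 /* data arriving on the source...      */
    FD_SET(dst, &wfds);                 /* ...and room to write on the target  */

    int maxfd = (src > dst ? src : dst) + 1;
    if (select(maxfd, &rfds, &wfds, NULL, NULL) <= 0)
        return;

    if (FD_ISSET(src, &rfds) && FD_ISSET(dst, &wfds)) {
        char buf[4096];
        ssize_t n = read(src, buf, sizeof buf);
        if (n > 0)
            write(dst, buf, (size_t)n); /* may still be a short write, as the
                                           answer notes -- track the remainder */
    }
}
```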
Thank you all for the feedback.
Summarizing all comments and answers up until now:
Directly to the question, the only known way to detect that a socket will block on an attempt to write to it is using select/poll/epoll.
The original objective was to build a proxy that reads from one socket and writes to another, keeping a proper balance between them (reading at the same rate as writing, and vice versa) and using no major buffering in the application. For this, the following options were presented:
Use of SIOCOUTQ to find out how much buffer space is left on the destination socket and transmit no more than that. As pointed out by @ugoren, this has the disadvantage of being unreliable, mainly because between reading the value, calculating from it, and attempting to write, the actual value may change. It also introduces some busy-waiting issues if wrongly managed. I guess that if this technique is to be used, it should be combined with a more reliable one for full protection.
Use of select/poll/epoll and adding a small, limited buffer per read socket: initially, all read sockets are added to the poll. When one is ready for read, we read into a limited-size buffer, then remove the read socket from the poll and add the destination socket in its place (registered for writing). When we return to the poll and detect that the destination socket is ready for writing, we write the buffer; once all the data has been accepted by the socket, we remove the destination socket from the poll and put the read socket back. The disadvantage here is that we increase the number of system calls (adding/removing sockets to/from the poll/select) and we need to keep an internal buffer per socket. This seems to be the preferred approach (a sketch of this approach appears below), and some optimization could be added to reduce the number of system calls (for example, trying to write immediately after reading the source socket, and only falling back to the above if something is left over).
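Here is the promised sketch of that preferred approach, with assumed names and no error handling: one small buffer per relayed connection, and the poll interest flips between reading the source (POLLIN) and writing the destination (POLLOUT).

```c
/* Rough sketch under assumed names: the poll interest for each relay flips
 * between its source (POLLIN) and its destination (POLLOUT). */
#include <poll.h>
#include <unistd.h>

struct relay {
    int    src, dst;        /* source and destination sockets          */
    char   buf[4096];       /* small per-connection buffer             */
    size_t len, off;        /* bytes buffered / bytes already written  */
};

/* Fill the two pollfd slots for this relay depending on its state. */
void relay_arm(struct relay *r, struct pollfd *pfd_src, struct pollfd *pfd_dst)
{
    pfd_src->fd = r->src;
    pfd_dst->fd = r->dst;
    if (r->len == 0) {                   /* nothing buffered: poll the source  */
        pfd_src->events = POLLIN;
        pfd_dst->events = 0;
    } else {                             /* data pending: poll the destination */
        pfd_src->events = 0;
        pfd_dst->events = POLLOUT;
    }
}

/* Called when poll() reports the relevant socket ready. */
void relay_step(struct relay *r, short src_revents, short dst_revents)
{
    if (r->len == 0 && (src_revents & POLLIN)) {
        ssize_t n = read(r->src, r->buf, sizeof r->buf);
        if (n > 0) { r->len = (size_t)n; r->off = 0; }
    } else if (r->len > 0 && (dst_revents & POLLOUT)) {
        ssize_t n = write(r->dst, r->buf + r->off, r->len - r->off);
        if (n > 0) {
            r->off += (size_t)n;
            if (r->off == r->len)
                r->len = r->off = 0;     /* drained: go back to polling src */
        }
    }
}
```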
Thank you all again for participating in this discussion, it helped me a lot ordering the ideas.
This isn't a show-stopping programming problem as such, but perhaps more of a design pattern issue. I'd have thought it'd be a common design issue on embedded resource-limited systems, but none of the questions I found so far on SO seem relevant (but please point out anything relevant that I could have missed).
Essentially, I'm trying to work out the best strategy of estimating the largest buffer size required by some writer function, when that writer function's output isn't fixed, particularly because some of the data are text strings of variable length.
This is a C application that runs on a small ARM micro. The application needs to send various message types via TCP socket. When I want to send a TCP packet, the TCP stack (Keil RL) provides me with a buffer (which the library allocates from its own pool) into which I may write the packet data payload. That buffer size depends of course on the MSS; so let's assume it's 1460 at most, but it could be smaller.
Once I have this buffer, I pass this buffer and its length to a writer function, which in turn may call various nested writer functions in order to build the complete message. The reason for this structure is because I'm actually generating a small XML document, where each writer function typically generates a specific XML element. Each writer function wants to write a number of bytes to my allocated TCP packet buffer. I only know exactly how many bytes a given writer function writes at run-time, because some of the encapsulated content depends on user-defined text strings of variable length.
Some messages need to be around (say) 2K in size, meaning they're likely to be split across at least two TCP packet send operations. Those messages will be constructed by calling a series of writer functions that produce, say, a hundred bytes at a time.
Prior to making a call to each writer function, or perhaps within the writer function itself, I initially need to compare the buffer space available with how much that writer function requires; and if there isn't enough space available, then transmit that packet and continue writing into a fresh packet later.
Possible solutions I am considering are:
Use another, much larger buffer to write everything into initially. This isn't preferred because of resource constraints. Furthermore, I would still want a way to work out algorithmically how much space is needed by my message writer functions.
At compile time, produce a 'worst case size' constant for each writer function. Each writer function typically generates an XML element such as <START_TAG>[string]</START_TAG>, so I could have something like: #define SPACE_NEEDED ( START_TAG_LENGTH + START_TAG_LENGTH + MAX_STRING_LENGTH + SOME_MARGIN ). All of my content writer functions are picked out of a table of function pointers anyway, so the worst-case size estimate for each writer function could live as a new column in that table. At run-time, I check the buffer room against that estimate constant. This is probably my favourite solution at the moment. The only downside is that it relies on correct maintenance to keep working.
My writer functions provide a special 'dummy run' mode in which they run through and calculate how many bytes they want to write, but don't actually write anything. This could be achieved by simply passing NULL in place of the buffer pointer, in which case the function's return value (which usually states the amount written to the buffer) just states how much it wants to write (an illustrative sketch follows below). The only thing I don't like about this is that, between the 'dummy' and 'real' call, the underlying data could - at least in theory - change. A possible solution for that could be to statically capture the underlying data.
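To illustrate the 'dummy run' option, a hypothetical sketch (none of these names come from the real code base) of a writer that measures instead of writing when handed a NULL buffer, much like snprintf(NULL, 0, ...):

```c
/* Hypothetical illustration of the 'dummy run' idea: a NULL buffer makes the
 * writer report how many bytes it would need rather than writing them. */
#include <stdio.h>

/* Writes <NAME>value</NAME>, or just measures it when buf is NULL.
 * Returns the number of bytes wanted/written, or 0 if there is no room. */
size_t write_element(char *buf, size_t room, const char *name, const char *value)
{
    size_t needed = (size_t)snprintf(NULL, 0, "<%s>%s</%s>", name, value, name);

    if (buf == NULL)
        return needed;                 /* dummy run: just report the size */

    if (needed + 1 > room)
        return 0;                      /* caller should send the packet first */

    snprintf(buf, room, "<%s>%s</%s>", name, value, name);
    return needed;
}
```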
Thanks in advance for any thoughts and comments.
Solution
Something I had actually already started doing since posting the question was to make each content writer function accept a state, or 'iteration' parameter, which allows the writer to be called many times over by the TCP send function. The writer is called until it flags that it has no more to write. If the TCP send function decides after a certain iteration that the buffer is now nearing full, it sends the packet and then the process continues later with a new packet buffer. This technique is very similar I think to Max's answer, which I've therefore accepted.
A key thing is that on each iteration, a content writer must be designed so that it won't write more than LENGTH bytes to the buffer; and after each call to the writer, the TCP send function will check that it has LENGTH room left in the packet buffer before calling the writer again. If not, it continues in a new packet.
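For concreteness, a minimal sketch of that contract (CHUNK_MAX, the struct, and the function names are all assumptions standing in for LENGTH and the real writer table):

```c
/* Each writer keeps its own resume state and never emits more than CHUNK_MAX
 * bytes per call, so the sender can check remaining room between calls. */
#include <stddef.h>

#define CHUNK_MAX 128   /* the LENGTH bound mentioned above (value assumed) */

struct writer_state {
    size_t pos;          /* how far this writer has got through its content */
    int    done;         /* set when the writer has nothing more to emit    */
};

/* Signature the content writers are assumed to share: write at most CHUNK_MAX
 * bytes into buf, update *st, and return the number of bytes written. */
typedef size_t (*content_writer)(char *buf, struct writer_state *st);

size_t build_packet(char *pkt, size_t pkt_len, content_writer w,
                    struct writer_state *st)
{
    size_t used = 0;
    while (!st->done && pkt_len - used >= CHUNK_MAX) {
        used += w(pkt + used, st);      /* writer promises <= CHUNK_MAX bytes */
    }
    return used;   /* caller transmits this packet; if !st->done, it calls
                      build_packet() again later with a fresh buffer */
}
```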
Another step I took was to think seriously about how I structure my message headers. It became apparent that, as with almost all protocols that use TCP, it is essential to build into the application protocol some means of indicating the total message length. The reason is that TCP is a stream-based protocol, not a packet-based protocol. This is again where it got to be a bit of a headache, because I needed some upfront way of knowing the total message length to insert into the start header. The simple solution was to insert a message header at the start of every sent TCP packet, rather than only at the start of the application protocol message (which may of course span several TCP packets), and essentially implement fragmentation. So, in the header, I implemented two flags: a fragment flag and a last-fragment flag. The length field in each header therefore only needs to state the size of the payload in that particular packet. At the receiving end, individual header+payload chunks are read out of the stream and reassembled into a complete protocol message.
This of course is, no doubt very simplistically, how HTTP and so many other protocols work over TCP. It's just quite interesting that only once I attempted to write a robust protocol over TCP did I start to realise the importance of really thinking through your message structure in terms of headers, framing and so forth, so that it works over a stream protocol.
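A hedged sketch of such a per-packet header (field names, widths, and flag values are illustrative, not the author's actual wire format):

```c
/* Illustrative only: field names, widths, and flag values are assumptions. */
#include <stdint.h>

#define FLAG_FRAGMENT      0x01   /* payload is part of a larger message  */
#define FLAG_LAST_FRAGMENT 0x02   /* this packet completes the message    */

struct msg_header {
    uint8_t  flags;               /* fragment / last-fragment bits            */
    uint16_t payload_len;         /* length of the payload in THIS packet only,
                                     sent in network byte order               */
} __attribute__((packed));        /* GCC/Clang syntax; other compilers have
                                     their own packed qualifier               */
```

The receiver reads one header, then exactly payload_len bytes, appends them to a reassembly buffer, and treats the message as complete when FLAG_LAST_FRAGMENT is set.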
I had a related problem in a much smaller embedded system, running on a PIC 16 micro-controller (and written in assembly language, rather than C). My 'buffer size' was always going to be the two byte UART transmit queue, and I had only one 'writer' function, which was walking a DOM and emitting its XML serialisation.
The solution I came up with was to turn the problem 'inside out'. The writer function becomes a task: each time it is called, it writes as many bytes as it can (which may be more than 2, depending on the serial data transmission rate) until the transmit buffer is full, then it returns. However, it remembers, in a state variable, how far it had got through the DOM. The next time it is called, it carries on from the point previously reached. If there is no free buffer space, it returns immediately without changing its state. The writer task is called repeatedly from an infinite loop, which acts as a round-robin scheduler for this task and the others in the system. Each time round the loop, there is a delay that waits for the TMR0 timer to overflow, so each task gets called exactly once per fixed time slice.
In my implementation, the data is transmitted by a TxEmpty interrupt routine, but it could also be sent by another task.
I guess the 'pattern' here is that one role of the program counter is to hold the current state of the flow of control, and that this role can be abstracted away from the PC to another data structure.
Obviously, this isn't immediately applicable to your larger, higher-level system. But it is a different way of looking at the problem, which may spark your own particular insight.
Good luck!
In .NET there is the DataAvailable property on the network stream and the Available property on the TcpClient.
However, Silverlight lacks those.
Should I send a header with the length of the message? I'd rather not waste network resources.
Is there any other way?
You are micro-optimizing. Why do you think that another 4 bytes would affect the performance?
In other words: Use a length header.
Update
I saw your comment on the other answer. You are using BeginRead in the wrong way. It will never block or wait until the entire buffer has been filled.
You should declare a buffer which can receive your entire message. The return value from EndRead will report the number of bytes received.
You should also know that TCP is stream-based. There is no guarantee that your entire JSON message will be received at once (or that only your first message is received). Therefore you must have some way of knowing when a message is complete.
And I say it again: A length header will hardly affect the performance.
What do you mean by 'waste network resources'? Every network read API I am aware of returns the actual number of bytes read, somehow. What's the actual problem here?
I have a question about a situation that I face quite often. From time to time I have to implement various TCP-based protocols. Most of them define variable-length data packets that begin with a common header ([packet ID, length, payload] or something really similar). Obviously, there can be two approaches to reading these packets:
Read header (since header length is usually fixed), extract the payload length, read the payload
Read all available data and store it in a buffer; parse the buffer afterwards
Obviously, the first approach is simple, but requires two calls to read() (or probably more). The second one is slightly more complicated, but requires fewer calls.
The question is: does the first approach affect the performance badly enough to worry about it?
Yes, system calls are generally expensive compared to memory copies. IMHO this is particularly true on the x86 architecture, and arguable on RISC machines (ARM, MIPS, ...).
To be honest, unless you must handle hundreds or thousands of requests per second, you will hardly notice the difference.
Depending on exactly what the protocol is, a hybrid approach could be best. When the protocol uses a lot of small packets and fewer big ones, you can read the header plus a partial amount of data in one call. When it is a small packet, you win by avoiding a large memcpy; when the packet is big, you win by issuing a second syscall only in that case.
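A sketch of that hybrid, under assumed names and an assumed header layout (2-byte id, 2-byte big-endian length); blocking sockets are assumed and error handling is minimal. Bytes read past the end of the current packet are carried over for the next call.

```c
/* Hybrid read sketch: one recv() fetches the fixed header plus up to
 * SMALL_MAX payload bytes; only larger payloads need a second recv(). */
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

#define HDR_LEN   4                /* assumed: 2-byte id + 2-byte length */
#define SMALL_MAX 256              /* payload size expected to be common */

struct conn {
    unsigned char carry[HDR_LEN + SMALL_MAX];  /* bytes read past this packet */
    size_t        carry_len;
};

/* Returns the payload length on success, -1 on error. 'payload' must hold
 * the protocol's maximum payload (up to 65535 bytes with a 16-bit length). */
ssize_t read_packet(int fd, struct conn *c, unsigned char *payload)
{
    unsigned char head[HDR_LEN + SMALL_MAX];
    size_t have = c->carry_len;
    memcpy(head, c->carry, have);              /* start from any leftover bytes */
    c->carry_len = 0;

    while (have < HDR_LEN) {                   /* make sure the header is in */
        ssize_t n = recv(fd, head + have, sizeof head - have, 0);
        if (n <= 0) return -1;
        have += (size_t)n;
    }

    size_t len = (size_t)(head[2] << 8 | head[3]);   /* length field (assumed) */
    size_t got = have - HDR_LEN;
    if (got >= len) {                          /* small packet: done in one go */
        memcpy(payload, head + HDR_LEN, len);
        c->carry_len = got - len;              /* keep the start of the next one */
        memcpy(c->carry, head + HDR_LEN + len, c->carry_len);
        return (ssize_t)len;
    }

    memcpy(payload, head + HDR_LEN, got);      /* big packet: keep reading */
    while (got < len) {
        ssize_t n = recv(fd, payload + got, len - got, 0);
        if (n <= 0) return -1;
        got += (size_t)n;
    }
    return (ssize_t)len;
}
```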
If your application is a server capable of handling multiple clients simultaneously and non-blocking sockets are used to handle multiple clients in one thread, you have little choice but to only ever issue one recv() syscall when a socket becomes ready for read.
The reason is that if you keep calling recv() in a loop and the client sends a large volume of data, your recv() loop may block the thread from doing anything else for a long time. For example: recv() reads some amount of data from the socket, determines that there is now a complete message in the buffer, and forwards that message to the callback. The callback processes the message somehow and returns. If you then call recv() once more, more messages may have arrived while the callback was processing the previous one. This leads to a busy recv() loop on one socket that prevents the thread from processing any other pending events.
This issue is exacerbated if the socket read buffer in your application is smaller than the kernel socket receive buffer; in other words, the whole contents of the kernel receive buffer cannot be read in one recv() call. Anecdotal evidence: I hit this issue on a busy production system with a 16 KB user-space buffer reading from a 2 MB kernel socket receive buffer. A client sending many messages in succession would block the thread in that recv() loop for minutes, because more messages kept arriving while the just-read messages were being processed, leading to disruption of the service.
In such event-driven architectures it is best to make the user-space read buffer the same size as the kernel socket receive buffer (or the maximum message size, whichever is bigger), so that all the data available in the kernel buffer can be read in one recv() call. The pattern is: do one recv() call, process all complete messages in the user-space read buffer, and then return control to the event loop. This way a connection with a lot of incoming data does not block the thread from processing other events and connections; instead, processing round-robins across all connections with incoming data available.
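A hedged sketch of that pattern, with assumed names and an assumed 2-byte big-endian length-prefix framing: one recv() per readiness notification into a buffer sized like the kernel receive buffer, then every complete message already in hand is processed before control returns to the event loop.

```c
/* Event-loop read handler sketch: exactly one recv() per readiness event,
 * then drain all complete messages already buffered. */
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

#define READ_BUF_SIZE (2 * 1024 * 1024)        /* match SO_RCVBUF (value assumed) */

struct conn {
    int           fd;
    unsigned char buf[READ_BUF_SIZE];
    size_t        len;                         /* bytes currently buffered */
};

/* Assumed framing: returns total size of the first complete message, or 0. */
static size_t message_size(const unsigned char *buf, size_t len)
{
    if (len < 2)
        return 0;
    size_t body = (size_t)(buf[0] << 8 | buf[1]);
    return (len >= 2 + body) ? 2 + body : 0;
}

static void handle_message(const unsigned char *msg, size_t len)
{
    printf("got a %zu-byte message\n", len);   /* placeholder processing */
    (void)msg;
}

/* Called once when the event loop reports 'fd' readable: exactly one recv(). */
void on_readable(struct conn *c)
{
    ssize_t n = recv(c->fd, c->buf + c->len, sizeof c->buf - c->len, 0);
    if (n <= 0)
        return;                                /* EOF/EAGAIN/error handled elsewhere */
    c->len += (size_t)n;

    size_t msg_len;
    while ((msg_len = message_size(c->buf, c->len)) > 0) {
        handle_message(c->buf, msg_len);
        memmove(c->buf, c->buf + msg_len, c->len - msg_len);
        c->len -= msg_len;
    }
    /* no recv() loop here: control goes back to the event loop */
}
```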
The best way to get your answer is to measure. The strace program is decent for the purpose of measuring system call times. Using it adds a lot of overhead in itself, but if you merely compare the cost of one recv for this purpose versus the cost of two, it should be reasonably meaningful. Use the -tt option to get times. Or you can use the -c option to get an overview of time spent separated by which syscall it was spent on.
A better way to measure, albeit with more of a learning curve, is oprofile.
Also note that if you do decide buffering is worthwhile, you may be able to use fdopen and the stdio functions to take care of it for you. This is extremely easy and will work well if you're only dealing with a single connection or if you have a thread/process per connection, but won't work at all if you want to use a select/poll-based model.
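A tiny illustration of the fdopen() route (my own example; blocking I/O, one connection handled at a time): stdio does the buffering, so short reads are hidden from you.

```c
/* Wrap a connected socket in a stdio stream and let fgets() handle buffering. */
#include <stdio.h>

void handle_connection(int sockfd)
{
    FILE *f = fdopen(sockfd, "r+");     /* wrap the connected socket */
    if (f == NULL)
        return;

    char line[512];
    while (fgets(line, sizeof line, f)) /* stdio refills its buffer as needed */
        fputs(line, stdout);

    fclose(f);                          /* also closes the underlying socket */
}
```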
Note that you generally have to "read all the available data into a buffer and process it afterwards" anyway, to account for the (unlikely, but possible) scenario where a recv() call returns only part of your header - so you might as well go the whole hog and use option 2.
Yes, depending on the scenario, read/recv calls can be expensive. For example, if you issue a huge number of recv() calls to read a very small amount of data at short intervals, it would be a performance hit. In such a scenario you could issue a recv() with a reasonably large buffer, let's say 4 KB, and then parse that 4 KB buffer. It may contain multiple header+data combos. By reading the header first you can find the data and its length, and to avoid copying the data into a new buffer, you can just use the offset at which the actual data starts and store that pointer.
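For illustration, a sketch of that offset-based parsing with an assumed header layout (2-byte id, 2-byte big-endian length); this is not code from the answer.

```c
/* One recv() into a 4 KB buffer, then each payload is referenced by an
 * offset into that buffer rather than copied out. */
#include <stdint.h>
#include <stdio.h>
#include <sys/socket.h>

#define HDR_LEN 4   /* assumed: 2-byte id + 2-byte payload length */

void drain_buffer(int fd)
{
    unsigned char buf[4096];
    ssize_t n = recv(fd, buf, sizeof buf, 0);
    if (n <= 0)
        return;

    size_t off = 0;
    while ((size_t)n - off >= HDR_LEN) {
        uint16_t id  = (uint16_t)(buf[off]     << 8 | buf[off + 1]);
        uint16_t len = (uint16_t)(buf[off + 2] << 8 | buf[off + 3]);
        if (off + HDR_LEN + len > (size_t)n)
            break;                      /* partial packet: keep it for the next recv() */

        const unsigned char *payload = buf + off + HDR_LEN;   /* no memcpy */
        printf("packet id=%u, %u payload bytes at offset %zu\n",
               (unsigned)id, (unsigned)len, off + HDR_LEN);
        (void)payload;
        off += HDR_LEN + len;
    }
    /* leftover bytes from 'off' to 'n' would be carried over in a real program */
}
```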