Switching many threads using curl easy to single thread using curl multi - c

I use libcurl easy interface and I create lots of threads in my c++ app to handle these http requests. I would like to convert the code to use libcurl multi instead. Conceptually, the idea is clear: instead of calling blocking curl_easy_perform on each curl easy handle from multiple threads I'll call a blocking curl_multi_perform from a single thread and this call internally will handle all attached curl easy handles.
Things that aren't clear to me:
How do I cancel, from another thread, any of the outstanding HTTP requests that are being handled by the blocking curl_multi_perform call? Similarly, would the same work with the easy interface: can I end/abort an HTTP request from one thread while another thread is calling curl_easy_perform on that handle?
Is it OK to add new easy handles to a multi handle while another thread is calling curl_multi_perform on that multi handle?
If I use curl_multi_remove_handle to abort one of outgoing http requests while it was loading data (let's say it was doing 1GB file download) then I can reuse the same handle right after that. Does curl close that tcp connection that was aborted in the middle? Otherwise, I don't see how that connection could possibly be reused without completely downloading entire 1GB body.
Is there a simple example that used to do multiple easy requests from different threads and same example converted to multi interface?

(This is really several questions disguised as one, which is not a good fit for stackoverflow.)
curl_multi_perform() doesn't block. It does as much as it can do for now, then it returns and expects the program to call it again when it's time or when there's activity on one of its sockets.
Ideally you can mark which transfers to stop in the other threads and as soon as curl_multi_perform() returns you can remove said easy handles from the multi handle and they're no longer in the game. Alternatively, you can use the individual transfer's callbacks (write/read/progress) to return error when you want that transfer to end.
It is not OK to use the same libcurl handle in more than one thread at any given moment. If you really need to use the same handle from more than one thread, then you need to do careful mutexing. See the libcurl threading man page. It is usually better to put things into queues from the other threads and let the single libcurl-using thread read handles or actions from that queue when it can, which then ensures single-thread access to the handles.
If you abort a transfer by removing the handle with curl_multi_remove_handle(), that transfer is aborted. Stopped. You can indeed reuse that handle immediately and if you just put it back in, it will be treated as a brand new transfer and unless you change any options in the easy handle it will simply start off from the beginning again with the same URL. Prematurely aborted transfers will of course be treated correctly, which might include closing the TCP connection if necessary.
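The queue approach described above can be sketched roughly as follows. This is a minimal illustration, not libcurl code: the names cmd_queue, queue_push, queue_pop and drain_commands are hypothetical, the easy handle is carried as a plain void * so the sketch compiles without libcurl, and the places where the real curl_multi_add_handle() and curl_multi_remove_handle() calls would go are only marked in comments.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical command types; these names are illustrative, not libcurl API. */
enum cmd_kind { CMD_ADD, CMD_CANCEL };

struct cmd {
    enum cmd_kind kind;
    void *easy;                 /* would be a CURL * in real code */
};

#define QUEUE_CAP 64

struct cmd_queue {
    pthread_mutex_t lock;
    struct cmd items[QUEUE_CAP];
    size_t head;                /* index of the oldest command */
    size_t count;               /* number of queued commands */
};

static void queue_init(struct cmd_queue *q)
{
    pthread_mutex_init(&q->lock, NULL);
    q->head = 0;
    q->count = 0;
}

/* Called from any thread. Returns 0 on success, -1 if the queue is full. */
static int queue_push(struct cmd_queue *q, enum cmd_kind kind, void *easy)
{
    int rc = -1;
    pthread_mutex_lock(&q->lock);
    if (q->count < QUEUE_CAP) {
        size_t tail = (q->head + q->count) % QUEUE_CAP;
        q->items[tail].kind = kind;
        q->items[tail].easy = easy;
        q->count++;
        rc = 0;
    }
    pthread_mutex_unlock(&q->lock);
    return rc;
}

/* Called only from the thread that owns the multi handle.
 * Returns 1 and fills *out if a command was waiting, 0 otherwise. */
static int queue_pop(struct cmd_queue *q, struct cmd *out)
{
    int got = 0;
    pthread_mutex_lock(&q->lock);
    if (q->count > 0) {
        *out = q->items[q->head];
        q->head = (q->head + 1) % QUEUE_CAP;
        q->count--;
        got = 1;
    }
    pthread_mutex_unlock(&q->lock);
    return got;
}

/* Run between calls to curl_multi_perform(). Returns commands handled. */
static int drain_commands(struct cmd_queue *q)
{
    struct cmd c;
    int handled = 0;
    while (queue_pop(q, &c)) {
        assert(c.easy != NULL);
        if (c.kind == CMD_ADD) {
            /* real code: curl_multi_add_handle(multi, c.easy); */
        } else {
            /* real code: curl_multi_remove_handle(multi, c.easy); */
        }
        handled++;
    }
    return handled;
}
```

Other threads only ever call queue_push(); the thread driving the multi handle calls drain_commands() each time curl_multi_perform() returns, so the easy and multi handles are touched from one thread only.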

Related

How do I implement event driven POSIX threads?

I'm coding for a linux platform using C. Let's say I have 2 threads. A and B.
A is an infinite loop that constantly tries to find out if there is data on the socket localhost:8080, whereas B is a thread that spends most of its time in a blocked state until A calls the mutex unlock function on a mutex that B uses to block itself. A will unlock B when it has received appropriate data on the socket.
So you see here is a problem. B is "event driven" largely whereas A is in a constant running state. My target platform isn't resource rich so I wish A could be "activated" and enter running state only when it received data on socket, instead of constantly looping.
So how can I do that? If it matters - I wish to do this for both UDP and TCP sockets.
There are multiple ways of doing what you want in a clean way. One approach, which you are kind of using already, is an event system. A real event system would be overkill for the kind of problem you are dealing with, but can be found here. This is a (random) better implementation, capable of listening for multiple file descriptors and time-based events, all in a single thread.
If you want to build one yourself, you should take a look at the select or poll function.
But I agree with @Jeremy Friesner, you should definitely use the functions made for socket programming; they are perfect for your kind of problem. Only use the event system approach if you really need it (with multiple sockets/timed events).
You simply call recv (or recvfrom, recvmsg, etc) and it doesn't return until some data has been received. There's no need to "constantly try to find out if there is data" - that's silly.
If you set the socket to non-blocking mode then recv will return even if there's no data. If that's what you're doing, then the solution is simple: don't set the socket to non-blocking mode.
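As a minimal sketch of the blocking approach: thread A can sit in recv() on a socket left in its default blocking mode, and the kernel keeps the thread asleep until data arrives. The helper name wait_for_message is hypothetical; the only addition beyond a plain recv() call is a retry when a signal interrupts it. The same call works for a UDP socket or a connected TCP socket.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Block until something arrives on fd (which must be in blocking mode).
 * Returns the number of bytes received, 0 if a stream peer closed the
 * connection, or -1 on error with errno set. */
static ssize_t wait_for_message(int fd, void *buf, size_t cap)
{
    ssize_t n;

    assert(buf != NULL && cap > 0);
    do {
        /* The thread sleeps inside the kernel here; no CPU is burned. */
        n = recv(fd, buf, cap, 0);
    } while (n < 0 && errno == EINTR);   /* retry if interrupted by a signal */
    return n;
}
```

Thread A would call wait_for_message() in its loop and wake B only when the received bytes are the "appropriate data".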

How to handle hooked WSARecv

I'm working on a project that involves hooking WSARecv. I know how to hook this function, I mean its just the same as hooking another function. Anyway the hard part is when WSARecv is used to perform overlapped operations. The idea is that when an application receives data to intercept that and be possible to modify this, I'm using pipes for this. The native DLL tunnels all data to a managed 'server'. This processes the input etc and returns it back to the native DLL. This works great for WSASend, send and recv. However the hard part is when an application uses overlapped sockets.
So I need the received data first before I can process it, this is the hard part. How would I do something like this? I thought of this, but they both seem like a mess:
When WSARecv is called using the WSAOverlapped:
Create a new thread, use WaitForSingleObject and pass the hEvent of the WSAOverlapped structure. When the event is signaled process the data to the managed server and pass the data to the program.
When WSARecv is called using the completion routine:
Create a new thread, modify the call to the original function with lpOperationCompleted to a new function. Use SleepEx to put the thread in an alertable state. When the OperationCompleted is called process the data and pass data back to the program.
I could post my code but I didn't write because it seems like a bad solution.. So there is not really a point for that.
I cannot think of a better solution and this seems horrible because when an application calls WSARecv a lot (for example a large server using overlapped sockets to handle lots of clients) it creates a new thread for every call and that just seems like a bad idea.
So how can I do such thing?
There's no need to create a thread for each overlapped IO call.
When overlapped operations are used, they either have an associated event (which you can safely ignore), a completion routine, or are associated with an I/O Completion port.
To handle the first two cases you should hook both WSARecv() and WSAGetOverlappedResult().
If you need to handle the last, you'll also need to hook GetQueuedCompletionStatus()
Now, when you get a call to WSARecv(), for the event case, you do nothing special there (except possibly save some information in relation to the lpOverlapped, eg. the buffer), and process the data in WSAGetOverlappedResult() (which the application must call to get the success/error and bytes transferred.)
If a completion routine is present, save the lpOverlapped and lpCompletionRoutine, and pass your own completion routine to the real WSARecv().
Your routine should process the data and call the original completion routine.
To handle the I/O completion port case, have WSARecv() save lpOverlapped and buffers etc., in GetQueuedCompletionStatus(), call the original, and if the returned overlapped structure matches, handle the data.
You should also note that overlapped operations may complete immediately, in which case the event isn't signaled, the completion routine isn't called, and (IIRC) no completion is queued on the IOCP.

The most efficient way to manage multiple socket(maximum 50 sockets.) in a single process?

I'm trying to implement a BitTorrent client. In order to receive pieces from different peers, the client has to manage multiple sockets.
The well-known solutions that I know of are:
1. Each thread has one socket.
2. Using select() call, non-blocking I/O.
3. a mix of 1 and 2.
The first solution requires too many threads. The second solution wastes CPU time since it keeps checking up to 50 sockets. Also, if I go with the third solution, I don't know how many threads a single process should use.
Which solution is the best one, to receive a fairly large file?
Is there any web page that give me a good solution?
Any advice would be awesome.
Some High Level Ideas from my side. : )
Have a main thread in which you will be doing the "select" / "poll" call for all the connections.
Have a thread pool of worker threads
If for a particular connection, select indicates that there is data to read, then pass the socket + additional information to one of the free worker threads for receiving / sending data on that connection.
Upon completion of the work, the worker thread returns to the free worker thread queue, which can be used again for another connection.
Hope this helps
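The dispatching step in the main thread might look roughly like this. It is only a sketch under some assumptions: it uses poll() rather than select(), the name collect_readable is hypothetical, the limit of 50 comes from the question, and handing the ready sockets to a worker pool is left out.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <sys/socket.h>   /* socket types used by callers */

#define MAX_SOCKS 50      /* upper bound taken from the question */

/* Wait up to timeout_ms for any of the n sockets to become readable.
 * Readable (or hung-up/errored) descriptors are written to ready[],
 * which must have room for n entries.
 * Returns how many were written, or -1 if poll() itself failed. */
static int collect_readable(const int *socks, size_t n,
                            int timeout_ms, int *ready)
{
    struct pollfd pfds[MAX_SOCKS];
    size_t i;
    int count = 0;

    assert(n <= MAX_SOCKS);
    for (i = 0; i < n; i++) {
        pfds[i].fd = socks[i];
        pfds[i].events = POLLIN;
        pfds[i].revents = 0;
    }

    /* Sleeps in the kernel; nothing is busy-checked while idle. */
    if (poll(pfds, (nfds_t)n, timeout_ms) < 0)
        return -1;

    for (i = 0; i < n; i++) {
        /* POLLHUP/POLLERR are included so a closed peer is also reported. */
        if (pfds[i].revents & (POLLIN | POLLHUP | POLLERR))
            ready[count++] = pfds[i].fd;
    }
    return count;
}
```

Each descriptor returned in ready[] is one that a worker thread can read from without blocking. Since the call sleeps until something happens, watching 50 sockets this way does not consume CPU while they are idle.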
You're right, the first solution is the worst.
The second one, with select(), can do the job, but there's a problem: the cost of select() grows linearly, O(n), with the number of descriptors it has to scan. You should use /dev/poll, epoll(), kqueue() or whatever, but don't use select().
Don't use one thread per socket! You will lose a lot of time due to context switching.
You should have:
A Listener thread: it just does all the accepts and hands each new socket to a Worker thread.
Multiple Worker threads: they do all the other stuff. Each one checks whether data is available and handles it. A Worker thread manages many sockets.
Take a look at Kegel's C10K page if you want more information.
Check some Open Source BitTorrent client and check the code to get some ideas, it is the best thing you could do.
I recommend you to check BitTorrent in C or Hadouken in C# for example:
https://github.com/bittorrent
https://github.com/hadouken/hdkn

Removing a handle from a I/O completion port and other questions about IOCP

The CreateIoCompletionPort function allows the creation of a new I/O completion port and the registration of file handles to an existing I/O completion port.
Then, I can use any function, like a recv on a socket or a ReadFile on a file with a OVERLAPPED structure to start an asynchronous operation.
I have to check whether the function call returned synchronously although it was called with an OVERLAPPED structure and in this case handle it directly. In the other case, when ERROR_IO_PENDING is returned, I can use the GetQueuedCompletionStatus function to be notified when the operation completes.
The questions that arise are:
How can I remove a handle from the I/O completion port? For example, when I add sockets to the IOCP, how can I remove closed ones? Should I just re-register another socket with the same completion key?
Also, is there a way to make the calls ALWAYS go over the I/O completion port and don't return synchronously?
And finally, is it possible for example to recv asynchronously but to send synchronously? For example when a simple echo service is implemented: Can I wait with an asynchronous recv for new data but send the response in a synchronous way so that code complexity is reduced? In my case, I wouldn't recv a second time anyways before the first request was processed.
What happens if an asynchronous ReadFile has been requested, but before it completes, a WriteFile to the same file should be processed. Will the ReadFile be cancelled with an error message and I have to restart the read process as soon as the write is complete? Or do I have to cancel the ReadFile manually before writing? This question arises in combination with a communication device; so, the write and read should not do problems if happening concurrently.
How can I remove a handle from the I/O completion port?
In my experience you can't disassociate a handle from a completion port. However, you may disable completion port notification by setting the low-order bit of your OVERLAPPED structure's hEvent field: See the documentation for GetQueuedCompletionStatus.
For example, when I add sockets to the IOCP, how can I remove closed ones? Should I just re-register another socket with the same completion key?
It is not necessary to explicitly disassociate a handle from an I/O completion port; closing the handle is sufficient. You may associate multiple handles with the same completion key; the best way to figure out which request is associated with the I/O completion is by using the OVERLAPPED structure. In fact, you may even extend OVERLAPPED to store additional data.
Also, is there a way to make the calls ALWAYS go over the I/O completion port and don't return synchronously?
That is the default behavior, even when ReadFile/WriteFile returns TRUE. You must explicitly call SetFileCompletionNotificationModes to tell Windows to not enqueue a completion packet when TRUE and ERROR_SUCCESS are returned.
is it possible for example to recv asynchronously but to send synchronously?
Not by using recv and send; you need to use functions that accept OVERLAPPED structures, such as WSARecv and WSASend, or alternatively ReadFile and WriteFile. It might be handier to use the latter if your code is meant to work with multiple types of I/O handles, such as both sockets and named pipes. Those functions also provide a synchronous mode, so if you use them you can mix asynchronous and synchronous calls.
What happens if an asynchronous ReadFile has been requested, but before it completes, a WriteFile to the same file should be processed?
There is no implicit cancellation. As long as you're using separate OVERLAPPED structures for each read/write to a full-duplex device, I see no reason why you can't do concurrent I/O operations.
As I’ve already pointed out there, the commonly held belief that it is impossible to remove handles from completion ports is wrong, probably caused by the absence of any hint whatsoever on how to do this from nearly all documentation I could find. Actually, it’s pretty easy:
Call NtSetInformationFile with the FileReplaceCompletionInformation enumerator value for FileInformationClass and a pointer to a FILE_COMPLETION_INFORMATION structure for the FileInformation parameter. In this structure, set the Port member to NULL (or nullptr, in C++) to disassociate the file from the port it’s currently attached to (I guess if it isn’t attached to any port, nothing would happen), or set Port to a valid HANDLE to another completion port to associate the file with that one instead.
First some important corrections.
In case the overlapped I/O operation completes immediately (ReadFile or similar I/O function returns success) - the I/O completion is already scheduled to the IOCP.
Also, according to your questions I think you confuse between the file/socket handles, and the specific I/O operations issued on them.
Now, regarding your questions:
AFAIK there is no conventional way to remove a file/socket handle from the IOCP (usually you just don't have to do this). You talk about removing closed handles from the IOCP, which is absolutely incorrect. You can't remove a closed handle, because it does not reference a valid kernel object anymore!
A more correct question would be how the file/socket should be properly closed. The answer is: just close your handle. All the outstanding I/O operations issued on this handle will return soon with an error code (aborted). Then your completion routine (the one that calls GetQueuedCompletionStatus in a loop) should perform the needed per-I/O cleanup.
As I've already said, all the I/O completion arrives at IOCP in both synchronous and asynchronous cases. The only situation where it does not arrive at IOCP is when an I/O completes synchronously with an error. Anyway, if you want a unified processing - in such a case you may post an artificial completion data to IOCP (use PostQueuedCompletionStatus).
You should use WSASend and WSARecv (not recv and send) for overlapped I/O. Nevertheless, even if the socket was opened with the WSA_FLAG_OVERLAPPED flag, you are allowed to call the I/O functions without specifying the OVERLAPPED structure. In such a case those functions work synchronously, so you may decide between synchronous and asynchronous mode for every function call.
There is no problem to mix overlapped read/write requests. The only delicate point here is what happens if you try to read the data from the file position where you're currently writing to. The result may depend on subtle things, such as order of completion of I/Os by the hardware, some PC timing parameters and etc. Such a situation should be avoided.
How can I remove a handle from the I/O completion port? For example, when I add sockets to the IOCP, how can I remove closed ones? Should I just re-register another socket with the same completion key?
You've got it the wrong way around. You set the I/O completion port to be used by a file object - when the file object is deleted, you have nothing to worry about. The reason you're getting confused is because of the way Win32 exposes the underlying native API functionality (CreateIoCompletionPort does two very different things in one function).
Also, is there a way to make the calls ALWAYS go over the I/O completion port and don't return synchronously?
This is how it's always been. Only starting with Windows Vista can you customize how the completion notifications are handled.
What happens if an asynchronous ReadFile has been requested, but before it completes, a WriteFile to the same file should be processed. Will the ReadFile be cancelled with an error message and I have to restart the read process as soon as the write is complete?
I/O operations in Windows are asynchronous inherently, and requests are always queued. You may not think this is so because you have to specify FILE_FLAG_OVERLAPPED in CreateFile to turn on asynchronous I/O. However, at the native layer, synchronous I/O is really an add-on, convenience thing where the kernel keeps track of the file position for you and waits for the I/O to complete before returning.

How to signal select() to return immediately?

I have a worker thread that is listening to a TCP socket for incoming traffic, and buffering the received data for the main thread to access (let's call this socket A). However, the worker thread also has to do some regular operations (say, once per second), even if there is no data coming in. Therefore, I use select() with a timeout, so that I don't need to keep polling. (Note that calling receive() on a non-blocking socket and then sleeping for a second is not good: the incoming data should be immediately available for the main thread, even though the main thread might not always be able to process it right away, hence the need for buffering.)
Now, I also need to be able to signal the worker thread to do some other stuff immediately; from the main thread, I need to make the worker thread's select() return right away. For now, I have solved this as follows (approach basically adopted from here and here):
At program startup, the worker thread creates for this purpose an additional socket of the datagram (UDP) type, and binds it to some random port (let's call this socket B). Likewise, the main thread creates a datagram socket for sending. In its call to select(), the worker thread now lists both A and B in the fd_set. When the main thread needs to signal, it sendto()'s a couple of bytes to the corresponding port on localhost. Back in the worker thread, if B remains in the fd_set after select() returns, then recvfrom() is called and the bytes received are simply ignored.
This seems to work very well, but I can't say I like the solution, mainly as it requires binding an extra port for B, and also because it adds several additional socket API calls which may fail I guess – and I don't really feel like figuring out the appropriate action for each of the cases.
I think ideally, I would like to call some function which takes A as input, and does nothing except makes select() return right away. However, I don't know such a function. (I guess I could for example shutdown() the socket, but the side effects are not really acceptable :)
If this is not possible, the second best option would be creating a B which is much dummier than a real UDP socket, and doesn't really require allocating any limited resources (beyond a reasonable amount of memory). I guess Unix domain sockets would do exactly this, but: the solution should not be much less cross-platform than what I currently have, though some moderate amount of #ifdef stuff is fine. (I am targeting mainly for Windows and Linux – and writing C++ by the way.)
Please don't suggest refactoring to get rid of the two separate threads. This design is necessary because the main thread may be blocked for extended periods (e.g., doing some intensive computation – and I can't start periodically calling receive() from the innermost loop of calculation), and in the meanwhile, someone needs to buffer the incoming data (and due to reasons beyond what I can control, it cannot be the sender).
Now that I was writing this, I realized that someone is definitely going to reply simply "Boost.Asio", so I just had my first look at it... Couldn't find an obvious solution, though. Do note that I also cannot (easily) affect how socket A is created, but I should be able to let other objects wrap it, if necessary.
You are almost there. Use the "self-pipe" trick. Open a pipe, add its read end to the read fd_set of your select() call, and write to its write end from the main thread to unblock the worker thread. It is portable across POSIX systems.
I have seen a variant of similar technique for Windows in one system (in fact used together with the method above, separated by #ifdef WIN32). Unblocking can be achieved by adding a dummy (unbound) datagram socket to fd_set and then closing it. The downside is that, of course, you have to re-open it every time.
However, in the aforementioned system, both of these methods are used rather sparingly, and for unexpected events (e.g., signals, termination requests). Preferred method is still a variable timeout to select(), depending on how soon something is scheduled for a worker thread.
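The self-pipe trick can be sketched as follows for POSIX systems. The names waker, waker_init, waker_signal and wait_data_or_wake are hypothetical, and the sketch assumes that a wake-up takes priority when both the pipe and the socket are readable. On Windows, select() only accepts sockets, so this variant does not apply there.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

enum wait_result { WAIT_ERROR = -1, WAIT_TIMEOUT = 0, WAIT_DATA = 1, WAIT_WOKEN = 2 };

struct waker { int rd; int wr; };   /* the two ends of the pipe */

static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    return flags < 0 ? -1 : fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Create the pipe. Both ends are non-blocking so that neither draining
 * nor signalling can ever stall a thread. Returns 0 on success. */
static int waker_init(struct waker *w)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;
    if (set_nonblocking(fds[0]) != 0 || set_nonblocking(fds[1]) != 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    w->rd = fds[0];
    w->wr = fds[1];
    return 0;
}

/* Main thread: make the worker's select() return right away. */
static void waker_signal(struct waker *w)
{
    char byte = 1;
    ssize_t n;
    do {
        n = write(w->wr, &byte, 1);
    } while (n < 0 && errno == EINTR);
    /* EAGAIN means the pipe is full, so a wake-up is already pending. */
}

/* Worker thread: wait for socket data, a wake-up, or the timeout. */
static enum wait_result wait_data_or_wake(int sock, struct waker *w, long timeout_ms)
{
    fd_set rfds;
    struct timeval tv;
    int maxfd = sock > w->rd ? sock : w->rd;
    int rc;

    assert(sock < FD_SETSIZE && w->rd < FD_SETSIZE);
    FD_ZERO(&rfds);
    FD_SET(sock, &rfds);
    FD_SET(w->rd, &rfds);          /* only the READ end goes in the read set */
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    rc = select(maxfd + 1, &rfds, NULL, NULL, &tv);
    if (rc < 0)
        return WAIT_ERROR;
    if (rc == 0)
        return WAIT_TIMEOUT;
    if (FD_ISSET(w->rd, &rfds)) {
        char buf[64];
        while (read(w->rd, buf, sizeof buf) > 0)
            ;                      /* drain every pending wake-up byte */
        return WAIT_WOKEN;
    }
    return WAIT_DATA;
}
```

The worker calls wait_data_or_wake() with socket A and its one-second timeout; the main thread calls waker_signal() whenever it needs the worker's attention. No port is bound, and no other process can write to the pipe.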
Using a pipe rather than socket is a bit cleaner, as there is no possibility for another process to get hold of it and mess things up.
Using a UDP socket definitely creates the potential for stray packets to come in and interfere.
An anonymous pipe will never be available to any other process (unless you give it to it).
You could also use signals, but in a multithreaded program you'll want to make sure that all threads except for the one you want have that signal masked.
On Unix it is straightforward using a pipe. If you are on Windows and want to keep using select() so that your code stays compatible with Unix, the trick of creating an unbound UDP socket and closing it works well and is easy. But you have to make it thread-safe.
The only way I found to make this thread-safe is to close and recreate the socket in the same thread in which the select() is running. Of course this is difficult if the thread is blocked in select(). This is where the Windows call QueueUserAPC comes in. When Windows is blocking in select(), the thread can handle Asynchronous Procedure Calls. You can schedule one from a different thread using QueueUserAPC. Windows interrupts the select(), executes your function in the same thread, and continues with the select(). In your APC function you can now close the socket and recreate it. This is guaranteed to be thread-safe, and you will never lose a signal.
To be simple:
Keep the socket handle in a global variable; when you close that global socket, select() returns immediately: closesocket(g_socket);
