Raw socket bypassing TCP/IP headers - C

I have two programs that are communicating via sockets on the same computer.
Currently 1.6 million bytes is taking about 7 seconds to transfer using TCP/IP.
I need to make it fast.
If I use a raw socket instead and skip the TCP/IP headers, should that increase the speed? Is there anything else I can do to increase speed? Is the SOCK_RAW option a straight copy, or does it do anything else?

1.6MB shouldn't take 7 seconds using "normal" TCP/IP - certainly not on the same machine! That suggests you've got inefficient code somewhere. I'd address that before trying to do anything "special" in terms of the networking.
EDIT: I've just written a short C# program on a netbook, and that transfers 2MB (generating random data as it goes) in 279ms. That's with no optimization. Unless you're running on a machine from the 1980s, you should definitely be getting better performance than that...

Try using Unix Domain Sockets instead.
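For what it's worth, here is a minimal sketch of such a transfer in C (the socket path /tmp/xfer.sock and the 64 KB chunk size are made-up illustration values; most error handling is trimmed):

    /* Unix domain socket transfer sketch, assuming Linux/POSIX.
       Build: cc -o uds uds.c ; run "./uds server", then "./uds client". */
    #include <stdio.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <sys/un.h>
    #include <unistd.h>

    #define SOCK_PATH "/tmp/xfer.sock"
    #define CHUNK     65536               /* large writes, not byte-at-a-time */

    int main(int argc, char **argv)
    {
        struct sockaddr_un addr = { .sun_family = AF_UNIX };
        strncpy(addr.sun_path, SOCK_PATH, sizeof(addr.sun_path) - 1);

        int fd = socket(AF_UNIX, SOCK_STREAM, 0);

        if (argc > 1 && strcmp(argv[1], "server") == 0) {
            unlink(SOCK_PATH);                      /* remove a stale socket file */
            bind(fd, (struct sockaddr *)&addr, sizeof(addr));
            listen(fd, 1);
            int conn = accept(fd, NULL, NULL);
            char buf[CHUNK];
            ssize_t n, total = 0;
            while ((n = read(conn, buf, sizeof(buf))) > 0)
                total += n;                         /* just count the bytes */
            printf("received %zd bytes\n", total);
            close(conn);
        } else {                                    /* client: push ~1.6 MB */
            connect(fd, (struct sockaddr *)&addr, sizeof(addr));
            char buf[CHUNK];
            memset(buf, 'x', sizeof(buf));
            for (size_t sent = 0; sent < 1600000; sent += CHUNK)
                write(fd, buf, CHUNK);
        }
        close(fd);
        return 0;
    }

Note that the 64 KB writes are half the point: even plain loopback TCP should move 1.6 MB in milliseconds, so byte-at-a-time I/O is a far more likely culprit than header overhead.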

To get performance that poor, you must be doing something very inefficient. Perhaps the I/O operations are single-byte?
Changing to raw sockets is a bad idea. To get reliable communication, you'd then have to add some sort of data checking, sequencing, etc., etc.: everything that TCP does for reliability.
If the purpose is to transfer data from one process to another on the same machine, use shared memory and a mutex to synchronize access. Of course this is not a good solution if the programs will eventually have to run on separate machines.
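A bare-bones sketch of that approach, assuming POSIX shared memory and pthreads (the name "/xfer_shm" and the 1.6 MB size are illustrative; link with -lpthread, and -lrt on older systems; error handling trimmed):

    /* One process creates the region and initialises a process-shared
       mutex in it; the other just shm_open()s the same name and maps it. */
    #include <fcntl.h>
    #include <pthread.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <unistd.h>

    #define SHM_NAME "/xfer_shm"
    #define DATA_SZ  1600000

    struct region {
        pthread_mutex_t lock;   /* must be initialised process-shared */
        size_t          len;    /* bytes currently valid in buf */
        char            buf[DATA_SZ];
    };

    int main(void)
    {
        int fd = shm_open(SHM_NAME, O_CREAT | O_RDWR, 0600);
        ftruncate(fd, sizeof(struct region));

        struct region *r = mmap(NULL, sizeof(*r), PROT_READ | PROT_WRITE,
                                MAP_SHARED, fd, 0);

        /* Only the creating process should run this initialisation. */
        pthread_mutexattr_t attr;
        pthread_mutexattr_init(&attr);
        pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
        pthread_mutex_init(&r->lock, &attr);

        /* Producer side: copy the payload in under the lock. */
        pthread_mutex_lock(&r->lock);
        memset(r->buf, 'x', DATA_SZ);   /* stand-in for the real payload */
        r->len = DATA_SZ;
        pthread_mutex_unlock(&r->lock);

        munmap(r, sizeof(*r));
        close(fd);
        return 0;
    }

The consumer maps the same name and takes the lock before reading; in practice you would add a process-shared condition variable so the reader can wait for r->len to become non-zero instead of polling.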

No, using raw IP sockets is definitely not a good idea. Using a unix-domain socket might be marginally more efficient, but I doubt it's going to solve your problem. You clearly have another problem. Perhaps it is your application-level protocol which is inefficient?

Related

Upload in a restricted country is too slow

I'm a C programmer and write programs on Linux to connect machines over the internet. After noticing that speedtest.net couldn't upload, or had very poor upload speed, I decided to test a plain TCP socket connection to see whether it is really that slow, and found that yes, it really is. I've rented a VPS outside my country. I don't know what the government is doing to the infrastructure, how packets are routed, or how they're restricted. Reproducing what I saw on speedtest.net with a plain socket connection shows me I don't stand a chance: when the traffic is shaped like this, there's no way around it. It also proves the restriction isn't on HTTPS or any other application-layer protocol, since even a bare TCP socket connection can't reach a reasonable speed. The speed is below 10 kilobytes per second! Damn!
In contrast, after that disappointment, I tried some circumvention tools such as the CyberGhost extension for Chrome, and was surprised to see it get past the barrier, raising the upload speed to about 200 kilobytes per second. How?! They can't be using anything closer to the hardware than sockets.
So now I'm here to ask what ideas you have about this, so that I can write a program (or change my existing one) accordingly.
Thank you

Epoll vs Libevent for Bittorrent like application

I am implementing BitTorrent-like P2P file sharing. Let's say a maximum of 100 peers are sharing simultaneously, with TCP connections set up between each peer and every other peer. Initially, one peer has the whole file and starts sharing pieces; subsequently, all peers share the pieces they have.
Typically, the piece size is 50 kB - 1 MB. I am wondering what the best approach is for writing such an application in C: threads with epoll, or libevent?
Can anybody give the positives/negatives of the different possible approaches?
If we're only talking about 100 peer connections at any given moment, the traditional approach of using select or poll on a group of TCP sockets will work out just fine.
epoll helps when you need to scale to thousands of long-running connections. Read up on the C10K problem for more details.
I've heard good things about libevent. I believe it's an abstraction on top of epoll and other socket functions that provides a few nice things. If it makes your programming easier, then by all means use it. But you probably don't need it for performance.
Libevent is essentially a wrapper around epoll, mostly useful for writing portable code. Since it's a wrapper, the drawbacks of epoll carry over, and it does not add much from a performance perspective. If portability is not a concern, epoll will work just fine; and if the connection count is considerably lower still, plain poll will do.
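For reference, the core of an epoll-based loop is small; here is a rough sketch in C (it assumes listen_fd is an already-bound, listening TCP socket; error checking is mostly omitted):

    #include <sys/epoll.h>
    #include <sys/socket.h>
    #include <unistd.h>

    #define MAX_EVENTS 64

    void event_loop(int listen_fd)
    {
        int epfd = epoll_create1(0);
        struct epoll_event ev = { .events = EPOLLIN, .data.fd = listen_fd };
        epoll_ctl(epfd, EPOLL_CTL_ADD, listen_fd, &ev);

        struct epoll_event events[MAX_EVENTS];
        for (;;) {
            int n = epoll_wait(epfd, events, MAX_EVENTS, -1);
            for (int i = 0; i < n; i++) {
                int fd = events[i].data.fd;
                if (fd == listen_fd) {              /* new peer connecting */
                    int peer = accept(listen_fd, NULL, NULL);
                    ev.events = EPOLLIN;
                    ev.data.fd = peer;
                    epoll_ctl(epfd, EPOLL_CTL_ADD, peer, &ev);
                } else {                            /* data from a peer */
                    char buf[4096];
                    ssize_t r = read(fd, buf, sizeof(buf));
                    if (r <= 0) {                   /* peer closed or error */
                        epoll_ctl(epfd, EPOLL_CTL_DEL, fd, NULL);
                        close(fd);
                    }
                    /* else: hand buf/r to the piece-assembly logic */
                }
            }
        }
    }

With only ~100 peers, a poll() version of the same loop would perform essentially identically; the structure is what matters.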

Thousands of IP Addresses/Interfaces vs. slow program performance

I have a CentOS 5.9 machine set up with 5000+ secondary IP addresses on eth2.
My program only uses 2 of them, for 2 UDP sockets (1 RX, 1 TX).
When I run the application, the CPU usage is almost 100% all the time.
When I drop the number of IP addresses down to 10, everything goes back to normal: barely 1% CPU usage.
The program is basically a client-server application. It uses non-blocking reads/writes and epoll_wait() for event waiting.
Can someone please explain why the CPU usage is so high for a binary that only uses a small portion of the configured addresses?
I don't think the question is about the number of sockets, but rather the number of addresses on the interface. It does seem a little strange for your program's CPU usage to climb that high, but in general the number of addresses will affect how much work the IP stack does for incoming and outgoing packets. For example, when you call send and your socket is not bound, the kernel needs to choose a source IP address for the packet based on the destination address, and if that takes time, it will show up in your process's context.
Still, that does not explain much; profiling with gprof would be a good idea.
Handling thousands of sockets takes specialized software. Many network programmers naively use select and expect it to scale to thousands of sockets... which it definitely does not. An event-driven model scales much better, the events being a new connection, data arriving on a socket, and so on.
For Linux and Windows I use libevent. It's a socket wrapper, not very hard to use, and it scales nicely to tens of thousands of sockets.
http://libevent.org/
Look at the benchmarks on the website and you can see the logarithmic graph showing tens of thousands of sockets performing as though they were 100. Of course, if the sockets are all busy you are right back to low performance, but most sockets in the world are mostly quiet, and that is where libevent shines. There are other libraries as well, such as ZeroMQ, libev, and Boost.ASIO.
http://zeromq.org/
http://libev.schmorp.de/bench.html
http://www.boost.org/doc/libs/1_36_0/doc/html/boost_asio.html
Here is my working, super-simple sample. You'll need to add threading protections, but with less than an hour's work you could easily support a few thousand simultaneous connections.
http://pastebin.com/g02S2RTi
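In case the pastebin link goes stale, here is a minimal echo server in the same spirit, assuming libevent 2.x (link with -levent; port 9000 is arbitrary):

    #include <event2/buffer.h>
    #include <event2/bufferevent.h>
    #include <event2/event.h>
    #include <event2/listener.h>
    #include <netinet/in.h>
    #include <string.h>

    static void read_cb(struct bufferevent *bev, void *ctx)
    {
        /* Echo: move everything from the input buffer to the output buffer. */
        bufferevent_write_buffer(bev, bufferevent_get_input(bev));
    }

    static void event_cb(struct bufferevent *bev, short what, void *ctx)
    {
        if (what & (BEV_EVENT_EOF | BEV_EVENT_ERROR))
            bufferevent_free(bev);          /* also closes the socket */
    }

    static void accept_cb(struct evconnlistener *lst, evutil_socket_t fd,
                          struct sockaddr *sa, int len, void *ctx)
    {
        struct event_base *base = evconnlistener_get_base(lst);
        struct bufferevent *bev =
            bufferevent_socket_new(base, fd, BEV_OPT_CLOSE_ON_FREE);
        bufferevent_setcb(bev, read_cb, NULL, event_cb, NULL);
        bufferevent_enable(bev, EV_READ | EV_WRITE);
    }

    int main(void)
    {
        struct event_base *base = event_base_new();
        struct sockaddr_in sin;
        memset(&sin, 0, sizeof(sin));
        sin.sin_family = AF_INET;
        sin.sin_port   = htons(9000);

        evconnlistener_new_bind(base, accept_cb, NULL,
                                LEV_OPT_REUSEABLE | LEV_OPT_CLOSE_ON_FREE,
                                -1, (struct sockaddr *)&sin, sizeof(sin));
        event_base_dispatch(base);          /* run the event loop forever */
        return 0;
    }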

Writing program to run on server, requesting experienced advice

I'm developing a program that will need to run on Internet servers (a back-end component to be used by several cross-platform programs). I'm familiar with the security precautions to take (to prevent buffer overflows and SQL Injection attacks, for instance), but have never written a server program before, or any program that will be used on this scale.
The program needs to be able to serve hundreds or thousands of clients simultaneously. The protocols are designed for processing speed and to minimize the amount of data that must be exchanged, and the server side will be written in C. There will be both a Windows and a Linux version from the same code.
Questions:
How should the program handle communications: multiple threads, a single thread handling all the sockets in turn, or a new process spawned for every so many incoming connections (or for each one)?
Do I need to worry about things like memory fragmentation, since this program will need to run for months at a time?
What other design issues, specific to this kind of programming, might an experienced developer of cross-platform programs for desktop and mobile systems not be aware of?
Please, no suggestions to use a different language. That decision has already been made, for reasons I'm not at liberty to go into.
I'd use libevent or libev with non-blocking I/O; that way the operating system takes care of most of your scheduling problems. I'd also use a thread pool for tasks that are blocking by nature, so they don't block the main loop. And if you ever need to read or write large amounts of data to or from disk, use mmap, again to let the OS handle as much as possible.
The basic advice is: use the OS as much as possible. If you want a good example of a program that does this, look at Varnish; it is very well written and performs fantastically.
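To make the mmap point concrete, here is a sketch of processing a large file through a single mapping instead of a read() loop (POSIX assumed; error checks trimmed):

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    /* Map a file and let the kernel page it in on demand. */
    int count_lines(const char *path)
    {
        int fd = open(path, O_RDONLY);
        struct stat st;
        fstat(fd, &st);

        const char *data = mmap(NULL, st.st_size, PROT_READ,
                                MAP_PRIVATE, fd, 0);

        size_t newlines = 0;                /* scan without explicit reads */
        for (off_t i = 0; i < st.st_size; i++)
            if (data[i] == '\n')
                newlines++;
        printf("%zu lines\n", newlines);

        munmap((void *)data, st.st_size);
        close(fd);
        return 0;
    }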
From my experience running multiple servers with over 3 years of uptime, and programs with a little over a year of uptime, I can still recommend setting things up so that the system gracefully recovers from both a program error and a server reboot.
Even though performance takes a hit when a program is restarted, you need to be able to handle that, since external circumstances can force such a restart.
Don't reinvent the wheel when it's not needed; have a look at ZeroMQ or something like it to handle the distribution of incoming communications. (If you are allowed to, prototype the backend in a more forgiving language than C, such as Python, then reimplement it in C while keeping the communications protocol.)

What are the advantages and disadvantages of using sockets in IPC

I have been asked this question in some recent interviews: what are the advantages and disadvantages of using sockets for IPC when there are other ways to perform it? I have not found an exact answer.
Any help would be much appreciated.
Compared to pipes, IPC sockets differ by being bidirectional, that is, reads and writes can be done on the same descriptor. Pipes, unlike sockets, are unidirectional. You have to keep a pair of descriptors if you want to do both reads and writes.
Pipes, on the other hand, guarantee atomicity when reading or writing under a certain number of bytes: writing less than PIPE_BUF bytes at once is guaranteed to be delivered in one chunk and never observed as partial. Sockets require more care from the programmer in that respect.
Shared memory, when used for IPC, requires explicit synchronisation from the programmer. It may be the most efficient and most flexible mechanism, but that comes at an increased complexity cost.
Another point in favour of sockets: an app using sockets can be easily distributed - ie. it can be run on one host or spread across several hosts with little effort. This depends of course on the nature of the app.
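The bidirectional point is easy to demonstrate with socketpair(), which hands you two connected descriptors, each usable for both reading and writing (POSIX sketch; error handling trimmed):

    #include <stdio.h>
    #include <sys/socket.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void)
    {
        int sv[2];
        socketpair(AF_UNIX, SOCK_STREAM, 0, sv);

        if (fork() == 0) {                  /* child: both directions, one fd */
            close(sv[0]);
            char buf[64];
            ssize_t n = read(sv[1], buf, sizeof(buf) - 1);
            buf[n] = '\0';
            printf("child got: %s\n", buf);
            write(sv[1], "pong", 4);        /* reply on the same descriptor */
            return 0;
        }

        close(sv[1]);                       /* parent: the other end */
        write(sv[0], "ping", 4);
        char buf[64];
        ssize_t n = read(sv[0], buf, sizeof(buf) - 1);
        buf[n] = '\0';
        printf("parent got: %s\n", buf);
        wait(NULL);
        return 0;
    }

Doing the same round trip with pipes would take two pipe() calls and four descriptors.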
Perhaps this is too simplified an answer, yet it is an important detail: sockets are not supported on all OSes. I recently became aware of a project that used sockets for IPC all over the place, only to find that it was forced to move from Linux to a proprietary OS which was POSIX, but did not support sockets the same way Linux does.
Sockets allow you a few benefits...
You can connect a simple client to them for testing (manually enter data, see the response).
This is very useful for debugging, simulating and blackbox testing.
You can run the processes on different machines. This can be useful for scalability and is very helpful in debugging / testing if you work in embedded software.
It becomes very easy to expose your process as a service.
But there are drawbacks as well
Overhead is greater than IPC optimized for a single machine. Shared memory in particular is better if you need the performance, and you know your processes are all on the same machine.
Security - if your client apps can connect so can anyone else, if you're not careful about authentication. Data can also be sniffed if you're not encrypting, and modified if you're not at least signing data sent over the wire.
Using a true message queue tends to leave you with fixed-size messages. If you have a large number of messages of wildly varying sizes, this can become a performance problem. Using a socket can be a way around it, though you're then left wrapping that functionality so it behaves like a queue, which is tricky to get right in the details, particularly aspects like blocking/non-blocking and atomicity.
Shared memory is quick but requires management (you end up writing a version of malloc to manage the SHM), plus you have to synchronise and lock it in some way. Libraries can help with this, but their availability depends on your environment and language.
Queues are easy, but have the downsides listed as pros in my socket discussion; see the message queue sketch below.
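To make the fixed-size point concrete, here is what a POSIX message queue looks like: every message must fit within mq_msgsize, and the receive buffer must be at least that large (a sketch, assuming Linux; link with -lrt; the name "/demo_mq" is made up; error handling trimmed):

    #include <fcntl.h>
    #include <mqueue.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/types.h>

    #define QUEUE_NAME "/demo_mq"

    int main(void)
    {
        struct mq_attr attr;
        memset(&attr, 0, sizeof(attr));
        attr.mq_maxmsg  = 10;               /* queue depth */
        attr.mq_msgsize = 512;              /* hard cap on every message */

        mqd_t q = mq_open(QUEUE_NAME, O_CREAT | O_RDWR, 0600, &attr);

        const char *msg = "hello";
        mq_send(q, msg, strlen(msg), 0);    /* fails if msg > mq_msgsize */

        char buf[512];                      /* must be >= mq_msgsize */
        unsigned prio;
        ssize_t n = mq_receive(q, buf, sizeof(buf), &prio);
        printf("got %zd bytes\n", n);

        mq_close(q);
        mq_unlink(QUEUE_NAME);
        return 0;
    }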
Pipes have been covered by Blagovest's answer to this question.
As is ever the case with this kind of thing, I would suggest reading W. Richard Stevens' books on IPC and sockets. There is no better explanation than his! :-)
